Transformer networks were originally developed for machine translation and have recently seen great success, especially with the rise of the GPT family of models. In this talk, I will discuss how Transformers can be applied to data analysis in particle collider experiments. The key idea behind Transformers is the attention mechanism; I will explain different types of attention, including self-attention, cross-attention, and sparse attention (a minimal illustrative sketch follows below). Finally, I will discuss recent progress in multi-agent AI approaches that apply pretrained large language models to particle-physics tasks.
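For readers unfamiliar with the mechanism, the sketch below illustrates scaled dot-product self-attention in plain NumPy; the function name, shapes, and toy data are illustrative assumptions, not code from the referenced papers. In cross-attention the queries would come from a different sequence than the keys and values, and sparse attention restricts which query-key pairs are scored.

```python
# Minimal sketch of scaled dot-product self-attention (illustrative only;
# names and shapes are assumptions, not code from the referenced papers).
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (n_tokens, d_model); w_q, w_k, w_v: (d_model, d_head) projections."""
    q = x @ w_q                                   # queries
    k = x @ w_k                                   # keys
    v = x @ w_v                                   # values
    scores = q @ k.T / np.sqrt(k.shape[-1])       # scaled pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v                            # attention-weighted mix of values

# Toy usage: 4 "particles" (tokens) with 8 features each, one head of width 4.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)            # shape (4, 4)
```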
References: arXiv:2401.00452, arXiv:2505.03258, arXiv:2601.21015
SOM group meeting organisers