In this talk, an overview of the statistical inference methods used in high-energy physics (HEP) is presented, with a focus on their practical implementation in Large Hadron Collider (LHC) analyses using binned data.
The first part will introduce the conceptual foundations of simulation-based inference, emphasizing the challenges posed by the high-dimensional data produced at the LHC. To address these challenges, summary observables are used to reduce the complexity of the data, with machine learning playing a key role in this task. This enables the construction of histogram-based likelihood functions, which are the key element for extracting physical parameters from data. The implementation of systematic uncertainties in the likelihood will also be discussed, together with techniques for parameter estimation and the evaluation of discovery significance.
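To make the ingredients above concrete, the following is a minimal sketch (not the analysis code itself) of a binned Poisson likelihood with one Gaussian-constrained nuisance parameter, profiled to evaluate a discovery significance via the standard asymptotic test statistic q0 = 2(ln L(mu-hat) - ln L(mu=0)). All yields, the 10% background uncertainty, and the pseudo-data are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical per-bin expected yields for a three-bin histogram.
signal = np.array([5.0, 10.0, 8.0])        # expected signal per bin
background = np.array([50.0, 45.0, 30.0])  # expected background per bin

# Pseudo-data generated under the signal-plus-background hypothesis.
rng = np.random.default_rng(0)
observed = rng.poisson(signal + background)

def nll(params):
    """Negative log-likelihood: Poisson terms per bin, plus a
    standard-normal constraint on a background-normalization
    nuisance parameter theta (an assumed 10% uncertainty)."""
    mu, theta = params
    expected = mu * signal + (1.0 + 0.10 * theta) * background
    expected = np.clip(expected, 1e-9, None)
    poisson_part = np.sum(expected - observed * np.log(expected))
    constraint = 0.5 * theta**2
    return poisson_part + constraint

# Unconditional fit (mu and theta free) ...
free_fit = minimize(nll, x0=[1.0, 0.0], method="Nelder-Mead")
# ... and conditional fit with mu fixed to 0 (background-only).
bkg_fit = minimize(lambda t: nll([0.0, t[0]]), x0=[0.0],
                   method="Nelder-Mead")

# Profile likelihood ratio and asymptotic significance Z = sqrt(q0).
q0 = max(0.0, 2.0 * (bkg_fit.fun - free_fit.fun))
print(f"mu_hat = {free_fit.x[0]:.2f}, Z = {np.sqrt(q0):.2f} sigma")
```

In an actual LHC analysis the same structure is built with many bins, channels, and nuisance parameters, typically through dedicated tools rather than hand-written fits.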
In the second part of the talk, these concepts will be illustrated through a concrete example: the measurement of the Higgs boson production cross-section in association with top quarks (ttH) in multi-lepton final states using the ATLAS Run 2 dataset.