https://doi.org/10.1140/epjp/s13360-024-05910-9
Regular Article
Boosted top tagging and its interpretation using Shapley values
1 Center for High Energy Physics, Indian Institute of Science, Bengaluru, Karnataka, India
2 Department of Physics, School of Engineering and Sciences, SRM University AP, 522240, Amaravati, Mangalagiri, India
3 Bethe Center for Theoretical Physics and Physikalisches Institut der Universität Bonn, Nußallee 12, 53115, Bonn, Germany
Received: 28 July 2024
Accepted: 9 December 2024
Published online: 27 December 2024
Top tagging has become a fast-evolving subject because of the top quark's significant role in probing physics beyond the Standard Model. For the reconstruction of top jets, machine learning models have substantially improved classification performance compared to earlier methods. In this work, we build top taggers that use N-subjettiness ratios and several energy correlation observables as input features to train an eXtreme Gradient BOOSTed decision tree (XGBOOST). We find that tighter parton-level matching leads to more accurate tagging. In real experimental data, however, the parton-level information is unknown and this matching cannot be performed. We therefore also train the XGBOOST models without the matching and show that this difference affects the taggers' effectiveness. We further test the tagger under different simulation conditions, including changes in the centre-of-mass energy, the parton distribution functions (PDFs), and pileup, and find it robust, with performance deviations of less than 1%. We then use the SHapley Additive exPlanation (SHAP) framework to calculate the importance of the features of the trained models. This lets us estimate how much each feature contributed to the model's prediction and which regions of each input variable matter most. Finally, we combine all the tagger variables to form a hybrid tagger and interpret its results using the Shapley values.
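To make the notion of a Shapley value concrete, the following is a minimal sketch that computes exact Shapley values for a single prediction by enumerating all feature subsets. It is not the procedure used in this work: the scoring function `toy_score`, the feature names, the input point, and the baseline are hypothetical stand-ins for a trained tagger, and "absent" features are simply replaced by a fixed baseline value, which is a simplification of what the SHAP framework does for tree models.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    A feature that is 'absent' from a coalition is set to its baseline value
    (an assumption of this sketch, not the SHAP tree algorithm).
    """
    n = len(x)

    def value(subset):
        # Model output when only the features in `subset` take their true values.
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(z):
    # Hypothetical tagger score with one interaction term; not a trained model.
    tau32, mass, ecf = z
    return 1.0 - tau32 + 0.5 * mass + 0.25 * tau32 * ecf


x = [0.4, 1.0, 2.0]          # hypothetical jet: (tau32, scaled mass, ECF ratio)
baseline = [0.8, 0.0, 1.0]   # hypothetical reference point
phi = shapley_values(toy_score, x, baseline)
print(dict(zip(["tau32", "mass", "ecf"], phi)))
```

The attributions sum to the difference between the score at `x` and the score at the baseline, which is the additivity property that makes Shapley values usable as per-feature contributions. The enumeration scales exponentially with the number of features, so it is only practical for a handful of inputs.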
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
© The Author(s), under exclusive licence to Società Italiana di Fisica and Springer-Verlag GmbH Germany, part of Springer Nature 2024