Abstract
PELICAN is a novel permutation equivariant and Lorentz invariant or covariant aggregator network designed to overcome common limitations of architectures applied to particle physics problems. Unlike many approaches that rely on non-specialized architectures, which neglect the underlying physics principles and require very large numbers of parameters, PELICAN employs a fundamentally symmetry group-based architecture that demonstrates benefits in terms of reduced complexity, increased interpretability, and raw performance. We present a comprehensive study of the PELICAN architecture in the context of both tagging (classification) and reconstructing (regression) Lorentz-boosted top quarks, including the difficult task of specifically identifying and measuring the W boson inside the dense environment of the Lorentz-boosted top-quark hadronic final state. We also extend the application of PELICAN to the tasks of identifying quark-initiated vs. gluon-initiated jets, and multi-class identification across five separate target categories of jets. When tested on the standard task of Lorentz-boosted top-quark tagging, PELICAN outperforms existing competitors with much lower model complexity and high sample efficiency. On the less common and more complex task of 4-momentum regression, PELICAN also outperforms hand-crafted, non-machine-learning algorithms. We discuss the implications of symmetry-restricted architectures for the wider field of machine learning for physics.
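To make the symmetry constraints concrete, the following is a minimal illustrative sketch, not the actual PELICAN implementation: it builds the Gram matrix of pairwise Minkowski inner products of particle 4-momenta (which is exactly Lorentz invariant) and reduces it with simple permutation-invariant aggregations. The function names and the particular choice of aggregations are our own illustration.

```python
import numpy as np

def minkowski_gram(p):
    """Pairwise Minkowski inner products d_ij = p_i . p_j
    (metric +,-,-,-) for an (N, 4) array of 4-momenta.
    This matrix is unchanged by any Lorentz transformation."""
    eta = np.diag([1.0, -1.0, -1.0, -1.0])
    return p @ eta @ p.T

def aggregate(d):
    """A simple permutation-invariant summary of the Gram matrix:
    total sum, trace, and sum of squared row sums. Any function of
    these is both Lorentz and permutation invariant."""
    row = d.sum(axis=1)
    return np.array([d.sum(), np.trace(d), (row ** 2).sum()])

def boost_x(rapidity):
    """Lorentz boost along the x axis with the given rapidity."""
    ch, sh = np.cosh(rapidity), np.sinh(rapidity)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = ch
    L[0, 1] = L[1, 0] = sh
    return L

rng = np.random.default_rng(0)
p = rng.normal(size=(5, 4))  # 5 toy "particles"

# Invariance checks: boosting all momenta, or permuting the particle
# order, leaves the aggregated features unchanged.
p_boosted = p @ boost_x(0.7).T
p_permuted = p[rng.permutation(5)]
assert np.allclose(aggregate(minkowski_gram(p)),
                   aggregate(minkowski_gram(p_boosted)))
assert np.allclose(aggregate(minkowski_gram(p)),
                   aggregate(minkowski_gram(p_permuted)))
```

In PELICAN itself the fixed aggregations above are replaced by learnable permutation-equivariant layers acting on the rank-2 array of invariants, but the sketch captures why such features are insensitive to both frame choice and particle ordering.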
Acknowledgments
The authors would like to thank the Data Science Institute at the University of Chicago for its generous support of this research. TH is supported by the Department of Physics at the University of Chicago. DWM and JTO are supported by the National Science Foundation under Grant PHY-2013010. The computations in this work were, in part, run at facilities supported by the Scientific Computing Core at the Flatiron Institute, a division of the Simons Foundation. The Center for Computational Mathematics at the Flatiron Institute is supported by the Simons Foundation. In addition, we are grateful to Andrew Larkoski for insightful comments.
ArXiv ePrint: 2307.16506
Rights and permissions
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.
Cite this article
Bogatskiy, A., Hoffman, T., Miller, D.W. et al. Explainable equivariant neural networks for particle physics: PELICAN. J. High Energ. Phys. 2024, 113 (2024). https://doi.org/10.1007/JHEP03(2024)113