Abstract
PELICAN is a novel permutation equivariant and Lorentz invariant or covariant aggregator network designed to overcome common limitations of architectures applied to particle physics problems. Unlike many approaches that rely on non-specialized architectures, which neglect the underlying physics principles and require very large numbers of parameters, PELICAN employs a fundamentally symmetry group-based architecture that demonstrates benefits in terms of reduced complexity, increased interpretability, and raw performance. We present a comprehensive study of the PELICAN architecture in the context of both tagging (classification) and reconstructing (regression) Lorentz-boosted top quarks, including the difficult task of specifically identifying and measuring the W boson inside the dense environment of the Lorentz-boosted top-quark hadronic final state. We also extend the application of PELICAN to the tasks of identifying quark-initiated vs. gluon-initiated jets, and multi-class identification across five separate target categories of jets. When tested on the standard task of Lorentz-boosted top-quark tagging, PELICAN outperforms existing competitors with much lower model complexity and high sample efficiency. On the less common and more complex task of 4-momentum regression, PELICAN also outperforms hand-crafted, non-machine-learning algorithms. We discuss the implications of symmetry-restricted architectures for the wider field of machine learning for physics.
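To make the symmetry constraints concrete, the following is a minimal illustrative sketch, not the actual PELICAN implementation: it builds the Gram matrix of pairwise Minkowski inner products of particle 4-momenta (which is exactly Lorentz invariant) and reduces it with simple permutation-invariant aggregations. The function names and the particular choice of aggregations are our own illustration.

```python
import numpy as np

def minkowski_gram(p):
    """Pairwise Minkowski inner products d_ij = p_i . p_j
    (metric +,-,-,-) for an (N, 4) array of 4-momenta.
    This matrix is unchanged by any Lorentz transformation."""
    eta = np.diag([1.0, -1.0, -1.0, -1.0])
    return p @ eta @ p.T

def aggregate(d):
    """A simple permutation-invariant summary of the Gram matrix:
    total sum, trace, and sum of squared row sums. Any function of
    these is both Lorentz and permutation invariant."""
    row = d.sum(axis=1)
    return np.array([d.sum(), np.trace(d), (row ** 2).sum()])

def boost_x(rapidity):
    """Lorentz boost along the x axis with the given rapidity."""
    ch, sh = np.cosh(rapidity), np.sinh(rapidity)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = ch
    L[0, 1] = L[1, 0] = sh
    return L

rng = np.random.default_rng(0)
p = rng.normal(size=(5, 4))  # 5 toy "particles"

# Invariance checks: boosting all momenta, or permuting the particle
# order, leaves the aggregated features unchanged.
p_boosted = p @ boost_x(0.7).T
p_permuted = p[rng.permutation(5)]
assert np.allclose(aggregate(minkowski_gram(p)),
                   aggregate(minkowski_gram(p_boosted)))
assert np.allclose(aggregate(minkowski_gram(p)),
                   aggregate(minkowski_gram(p_permuted)))
```

In PELICAN itself the fixed aggregations above are replaced by learnable permutation-equivariant layers acting on the rank-2 array of invariants, but the sketch captures why such features are insensitive to both frame choice and particle ordering.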
Acknowledgments
The authors would like to thank the Data Science Institute at the University of Chicago for its generous support of this research. TH is supported by the Department of Physics at the University of Chicago. DWM and JTO are supported by the National Science Foundation under Grant PHY-2013010. The computations in this work were, in part, run at facilities supported by the Scientific Computing Core at the Flatiron Institute, a division of the Simons Foundation. The Center for Computational Mathematics at the Flatiron Institute is supported by the Simons Foundation. In addition, we are grateful to Andrew Larkoski for insightful comments.
ArXiv ePrint: 2307.16506
Rights and permissions
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.
Cite this article
Bogatskiy, A., Hoffman, T., Miller, D.W. et al. Explainable equivariant neural networks for particle physics: PELICAN. J. High Energ. Phys. 2024, 113 (2024). https://doi.org/10.1007/JHEP03(2024)113