Abstract
Autoencoders have been proposed as a powerful tool for model-independent anomaly detection in high-energy physics. The operating principle is that events which do not belong to the space of training data will be reconstructed poorly, thus flagging them as anomalies. We point out that in a variety of examples of interest, the connection between large reconstruction error and anomalies is not so clear. In particular, for data sets with nontrivial topology, there will always be points that erroneously seem anomalous due to global issues. Conversely, neural networks typically have an inductive bias or prior to locally interpolate such that undersampled or rare events may be reconstructed with small error, despite actually being the desired anomalies. Taken together, these facts are in tension with the simple picture of the autoencoder as an anomaly detector. Using a series of illustrative low-dimensional examples, we show explicitly how the intrinsic and extrinsic topology of the dataset affects the behavior of an autoencoder and how this topology is manifested in the latent space representation during training. We ground this analysis in the discussion of a mock “bump hunt” in which the autoencoder fails to identify an anomalous “signal” for reasons tied to the intrinsic topology of n-particle phase space.
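The simplest instance of such a global obstruction can be checked numerically: data on a circle encoded into a one-dimensional latent space. The sketch below is illustrative and is not taken from the article. The function `encoder` and its coefficients are hypothetical stand-ins for any continuous encoder, and the bisection routine is just one way to locate the collision. By the intermediate value theorem, any continuous map from the circle to the real line assigns the same code to some pair of antipodal points, so no decoder can reconstruct both of them well.

```python
import numpy as np


def encoder(theta):
    """Hypothetical continuous encoder S^1 -> R (any 2*pi-periodic function works)."""
    return np.sin(theta) + 0.5 * np.cos(2.0 * theta) + 0.3 * np.sin(3.0 * theta + 0.7)


def antipodal_gap(theta):
    """g(theta) = f(theta) - f(theta + pi); periodicity gives g(theta + pi) = -g(theta)."""
    return encoder(theta) - encoder(theta + np.pi)


def find_antipodal_collision(tol=1e-12):
    """Bisect g on [0, pi]; a root exists because g(0) and g(pi) have opposite signs."""
    lo, hi = 0.0, np.pi
    g_lo = antipodal_gap(lo)
    if g_lo == 0.0:
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = antipodal_gap(mid)
        if g_lo * g_mid <= 0.0:
            hi = mid              # sign change lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid  # sign change lies in [mid, hi]
    return 0.5 * (lo + hi)


theta_star = find_antipodal_collision()
x = np.array([np.cos(theta_star), np.sin(theta_star)])  # a point on the unit circle
code_gap = abs(antipodal_gap(theta_star))                # ~0: x and -x share a latent code
separation = np.linalg.norm(x - (-x))                    # distance between antipodal points

# Any decoder sends the shared code to a single output y, so by the triangle
# inequality max(|x - y|, |-x - y|) >= separation / 2, the radius of the circle.
error_floor = separation / 2.0

print(f"theta* = {theta_star:.6f}, code gap = {code_gap:.2e}, error floor = {error_floor:.3f}")
```

Replacing `encoder` with any other continuous periodic function leads to the same conclusion: at least one in-distribution point is reconstructed with error no smaller than the radius of the circle, and would therefore look anomalous despite belonging to the training distribution.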
Additional information
ArXiv ePrint: 2102.08380
Rights and permissions
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Batson, J., Haaf, C.G., Kahn, Y. et al. Topological obstructions to autoencoding. J. High Energ. Phys. 2021, 280 (2021). https://doi.org/10.1007/JHEP04(2021)280
DOI: https://doi.org/10.1007/JHEP04(2021)280