Abstract
We apply computer vision with deep learning, in the form of a convolutional neural network (CNN), to build a highly effective boosted top tagger. Previous work (the “DeepTop” tagger of Kasieczka et al.) has shown that a CNN-based top tagger can achieve performance comparable to state-of-the-art conventional top taggers based on high-level inputs. Here, we introduce a number of improvements to the DeepTop tagger, including to its architecture, training procedure, image preprocessing, training sample size, and use of color pixels. Our final CNN top tagger outperforms BDTs based on high-level inputs by a factor of ∼ 2-3 or more in background rejection, over a wide range of tagging efficiencies and fiducial jet selections. As reference points, we achieve a QCD background rejection factor of 500 (60) at 50% top tagging efficiency for fully-merged (non-merged) top jets with pT in the 800-900 GeV (350-450 GeV) range. Our CNN can also be straightforwardly extended to the classification of other types of jets, and the lessons learned here may be useful to others designing their own deep NNs for LHC applications.
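To make the jet-image idea concrete, the sketch below pixelates a jet's constituents, given as (η, φ, pT) triples, into a normalized η-φ intensity image of the kind a CNN tagger takes as input. This is an illustrative single-channel (“grayscale”) version only: the function name, pixel count, and image width are our own choices, and the paper's full preprocessing (e.g. additional color channels and other image-standardization steps) is not reproduced here.

```python
import numpy as np

def jet_image(constituents, npix=37, img_width=3.2):
    """Pixelate jet constituents (eta, phi, pT) into an npix x npix image.

    Illustrative sketch of jet-image preprocessing: center the image on
    the pT-weighted centroid, bin constituent pT into an eta-phi grid,
    and normalize the total intensity to one. Hypothetical helper, not
    the authors' actual pipeline.
    """
    constituents = np.asarray(constituents, dtype=float)
    eta, phi, pt = constituents.T
    # Center on the pT-weighted centroid of the jet
    eta = eta - np.average(eta, weights=pt)
    phi = phi - np.average(phi, weights=pt)
    # Bin constituent pT into an npix x npix eta-phi grid
    edges = np.linspace(-img_width / 2, img_width / 2, npix + 1)
    img, _, _ = np.histogram2d(eta, phi, bins=(edges, edges), weights=pt)
    # Normalize so pixel intensities sum to one
    total = img.sum()
    return img / total if total > 0 else img
```

A color version would simply stack several such grids (e.g. charged-particle pT, neutral pT, and track multiplicity) as channels of a single image before feeding them to the CNN.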
References
G.P. Salam, Towards Jetography, Eur. Phys. J. C 67 (2010) 637 [arXiv:0906.1833] [INSPIRE].
A. Abdesselam et al., Boosted objects: A Probe of beyond the Standard Model physics, Eur. Phys. J. C 71 (2011) 1661 [arXiv:1012.5412] [INSPIRE].
A. Altheimer et al., Jet Substructure at the Tevatron and LHC: New results, new tools, new benchmarks, J. Phys. G 39 (2012) 063001 [arXiv:1201.0008] [INSPIRE].
J. Shelton, Jet Substructure, in Proceedings, Theoretical Advanced Study Institute in Elementary Particle Physics: Searching for New Physics at Small and Large Scales (TASI 2012), Boulder, Colorado, June 4-29, 2012, pp. 303-340 (2013) [DOI:https://doi.org/10.1142/9789814525220_0007] [arXiv:1302.0260] [INSPIRE].
A. Altheimer et al., Boosted objects and jet substructure at the LHC. Report of BOOST2012, held at IFIC Valencia, 23rd-27th of July 2012, Eur. Phys. J. C 74 (2014) 2792 [arXiv:1311.2708] [INSPIRE].
D. Adams et al., Towards an Understanding of the Correlations in Jet Substructure, Eur. Phys. J. C 75 (2015) 409 [arXiv:1504.00679] [INSPIRE].
M. Cacciari, Phenomenological and theoretical developments in jet physics at the LHC, Int. J. Mod. Phys. A 30 (2015) 1546001 [arXiv:1509.02272] [INSPIRE].
A.J. Larkoski, I. Moult and B. Nachman, Jet Substructure at the Large Hadron Collider: A Review of Recent Advances in Theory and Machine Learning, arXiv:1709.04464 [INSPIRE].
J.M. Butterworth, A.R. Davison, M. Rubin and G.P. Salam, Jet substructure as a new Higgs search channel at the LHC, Phys. Rev. Lett. 100 (2008) 242001 [arXiv:0802.2470] [INSPIRE].
D.E. Kaplan, K. Rehermann, M.D. Schwartz and B. Tweedie, Top Tagging: A Method for Identifying Boosted Hadronically Decaying Top Quarks, Phys. Rev. Lett. 101 (2008) 142001 [arXiv:0806.0848] [INSPIRE].
T. Plehn, G.P. Salam and M. Spannowsky, Fat Jets for a Light Higgs, Phys. Rev. Lett. 104 (2010) 111801 [arXiv:0910.5472] [INSPIRE].
T. Plehn, M. Spannowsky, M. Takeuchi and D. Zerwas, Stop Reconstruction with Tagged Tops, JHEP 10 (2010) 078 [arXiv:1006.2833] [INSPIRE].
G. Kasieczka, T. Plehn, T. Schell, T. Strebler and G.P. Salam, Resonance Searches with an Updated Top Tagger, JHEP 06 (2015) 203 [arXiv:1503.05921] [INSPIRE].
D.E. Soper and M. Spannowsky, Finding physics signals with shower deconstruction, Phys. Rev. D 84 (2011) 074002 [arXiv:1102.3480] [INSPIRE].
D.E. Soper and M. Spannowsky, Finding top quarks with shower deconstruction, Phys. Rev. D 87 (2013) 054012 [arXiv:1211.3140] [INSPIRE].
T. Plehn and M. Spannowsky, Top Tagging, J. Phys. G 39 (2012) 083001 [arXiv:1112.4441] [INSPIRE].
G. Kasieczka, Boosted Top Tagging Method Overview, in Proceedings, 10th International Workshop on Top Quark Physics (TOP2017), Braga, Portugal, September 17-22, 2017, arXiv:1801.04180 [INSPIRE].
ATLAS collaboration, Search for W ′ → tb decays in the hadronic final state using pp collisions at \( \sqrt{s}=13 \) TeV with the ATLAS detector, Phys. Lett. B 781 (2018) 327 [arXiv:1801.07893] [INSPIRE].
ATLAS collaboration, Search for resonances decaying into top-quark pairs using fully hadronic decays in pp collisions with ATLAS at \( \sqrt{s}=7 \) TeV, JHEP 01 (2013) 116 [arXiv:1211.2202] [INSPIRE].
CMS collaboration, Search for \( t\overline{t}H \) production in the \( \mathrm{H}\to \mathrm{b}\overline{\mathrm{b}} \) decay channel with \( \sqrt{s}=13 \) TeV pp collisions at the CMS experiment, CMS-PAS-HIG-16-004.
ATLAS collaboration, Measurements of \( t\overline{t} \) differential cross-sections of highly boosted top quarks decaying to all-hadronic final states in pp collisions at \( \sqrt{s}=13 \) TeV using the ATLAS detector, Phys. Rev. D 98 (2018) 012003 [arXiv:1801.02052] [INSPIRE].
CMS collaboration, Search for dark matter in events with energetic, hadronically decaying top quarks and missing transverse momentum at \( \sqrt{s}=13 \) TeV, JHEP 06 (2018) 027 [arXiv:1801.08427] [INSPIRE].
ATLAS collaboration, Search for top squarks in final states with one isolated lepton, jets and missing transverse momentum using 36.1 fb −1 of \( \sqrt{s}=13 \) TeV pp collision data with the ATLAS detector, ATLAS-CONF-2017-037.
CMS collaboration, Search for direct production of supersymmetric partners of the top quark in the all-jets final state in proton-proton collisions at \( \sqrt{s}=13 \) TeV, JHEP 10 (2017) 005 [arXiv:1707.03316] [INSPIRE].
M. Nielsen, Neural Networks and Deep Learning, http://neuralnetworksanddeeplearning.com/.
I. Goodfellow, Y. Bengio and A. Courville, Deep Learning, MIT Press (2016) [http://www.deeplearningbook.org].
Y. LeCun, Y. Bengio and G. Hinton, Deep learning, Nature 521 (2015) 436.
R. Hadsell et al., Learning long-range vision for autonomous off-road driving, J. Field Robot. 26 (2009) 120.
C. Farabet, C. Couprie, L. Najman and Y. LeCun, Scene parsing with multiscale feature learning, purity trees, and optimal covers, CoRR abs/1202.2160 (2012) [arXiv:1202.2160].
O. Vinyals, A. Toshev, S. Bengio and D. Erhan, Show and tell: A neural image caption generator, CoRR abs/1411.4555 (2014) [arXiv:1411.4555].
A. Farhadi et al., Every picture tells a story: Generating sentences from images, in Computer Vision — ECCV 2010, K. Daniilidis, P. Maragos and N. Paragios eds., Springer Berlin Heidelberg, Berlin, Heidelberg (2010), pp. 15-29. [https://www.cs.cmu.edu/~afarhadi/papers/sentence.pdf].
Y. Taigman, M. Yang, M. Ranzato and L. Wolf, Deepface: Closing the gap to human-level performance in face verification, in 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1701-1708. June, 2014 [DOI:https://doi.org/10.1109/CVPR.2014.220].
K. He, X. Zhang, S. Ren and J. Sun, Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, arXiv:1502.01852 [INSPIRE].
J. Cogan, M. Kagan, E. Strauss and A. Schwarztman, Jet-Images: Computer Vision Inspired Techniques for Jet Tagging, JHEP 02 (2015) 118 [arXiv:1407.5675] [INSPIRE].
L.G. Almeida, M. Backović, M. Cliche, S.J. Lee and M. Perelstein, Playing Tag with ANN: Boosted Top Identification with Pattern Recognition, JHEP 07 (2015) 086 [arXiv:1501.05968] [INSPIRE].
L. de Oliveira, M. Kagan, L. Mackey, B. Nachman and A. Schwartzman, Jet-images — deep learning edition, JHEP 07 (2016) 069 [arXiv:1511.05190] [INSPIRE].
P. Baldi, K. Bauer, C. Eng, P. Sadowski and D. Whiteson, Jet Substructure Classification in High-Energy Physics with Deep Neural Networks, Phys. Rev. D 93 (2016) 094034 [arXiv:1603.09349] [INSPIRE].
P.T. Komiske, E.M. Metodiev and M.D. Schwartz, Deep learning in color: towards automated quark/gluon jet discrimination, JHEP 01 (2017) 110 [arXiv:1612.01551] [INSPIRE].
G. Kasieczka, T. Plehn, M. Russell and T. Schell, Deep-learning Top Taggers or The End of QCD?, JHEP 05 (2017) 006 [arXiv:1701.08784] [INSPIRE].
A.J. Larkoski, S. Marzani, G. Soyez and J. Thaler, Soft Drop, JHEP 05 (2014) 146 [arXiv:1402.2657] [INSPIRE].
J. Thaler and K. Van Tilburg, Identifying Boosted Objects with N-subjettiness, JHEP 03 (2011) 015 [arXiv:1011.2268] [INSPIRE].
CMS collaboration, Top Tagging with New Approaches, CMS-PAS-JME-15-002.
ATLAS collaboration, Performance of Top Quark and W Boson Tagging in Run 2 with ATLAS, ATLAS-CONF-2017-064.
ATLAS collaboration, Identification of Hadronically-Decaying W Bosons and Top Quarks Using High-Level Features as Input to Boosted Decision Trees and Deep Neural Networks in ATLAS at \( \sqrt{s}=13 \) TeV, ATL-PHYS-PUB-2017-004.
G. Louppe, K. Cho, C. Becot and K. Cranmer, QCD-Aware Recursive Neural Networks for Jet Physics, arXiv:1702.00748 [INSPIRE].
J. Pearkes, W. Fedorko, A. Lister and C. Gay, Jet Constituents for Deep Neural Network Based Top Quark Tagging, arXiv:1704.02124 [INSPIRE].
A. Butter, G. Kasieczka, T. Plehn and M. Russell, Deep-learned Top Tagging with a Lorentz Layer, SciPost Phys. 5 (2018) 028 [arXiv:1707.08966] [INSPIRE].
S. Egan, W. Fedorko, A. Lister, J. Pearkes and C. Gay, Long Short-Term Memory (LSTM) networks with jet constituents for boosted top tagging at the LHC, arXiv:1711.09059 [INSPIRE].
T. Sjöstrand et al., An Introduction to PYTHIA 8.2, Comput. Phys. Commun. 191 (2015) 159 [arXiv:1410.3012] [INSPIRE].
DELPHES 3 collaboration, J. de Favereau et al., DELPHES 3, A modular framework for fast simulation of a generic collider experiment, JHEP 02 (2014) 057 [arXiv:1307.6346] [INSPIRE].
M. Cacciari, G.P. Salam and G. Soyez, FastJet User Manual, Eur. Phys. J. C 72 (2012) 1896 [arXiv:1111.6097] [INSPIRE].
A. Hocker et al., TMVA — Toolkit for Multivariate Data Analysis, PoS(ACAT)040 [physics/0703039] [INSPIRE].
T. Schaul, S. Zhang and Y. LeCun, No More Pesky Learning Rates, arXiv:1206.1106.
M.D. Zeiler, ADADELTA: an adaptive learning rate method, CoRR abs/1212.5701 (2012) [arXiv:1212.5701].
D.P. Kingma and J. Ba, Adam: A Method for Stochastic Optimization, arXiv:1412.6980 [INSPIRE].
F. Chollet et al., Keras: Deep learning library for Python - convnets, recurrent neural networks, and more, http://github.com/fchollet/keras.
M. Abadi, TensorFlow: Large-scale machine learning on heterogeneous systems, (2015) [http://tensorflow.org/].
C. Cortes, L.D. Jackel, S.A. Solla, V. Vapnik and J.S. Denker, Learning curves: Asymptotic values and rate of convergence, in Advances in Neural Information Processing Systems 6, J.D. Cowan, G. Tesauro and J. Alspector eds., Morgan-Kaufmann (1994), pp. 327-334 [http://papers.nips.cc/paper/803-learning-curves-asymptotic-values-and-rate-of-convergence.pdf].
CMS collaboration, Status of b-tagging and vertexing tools for 2011 data analysis, CMS-PAS-BTV-11-002.
A. Rizzi, F. Palla and G. Segneri, Track impact parameter based b-tagging with CMS, CMS-NOTE-2006-019.
M. Bahr et al., HERWIG++ Physics and Manual, Eur. Phys. J. C 58 (2008) 639 [arXiv:0803.0883] [INSPIRE].
J. Bellm et al., HERWIG 7.0/HERWIG++ 3.0 release note, Eur. Phys. J. C 76 (2016) 196 [arXiv:1512.01178] [INSPIRE].
A. Krizhevsky, I. Sutskever and G.E. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf.
R.B. Girshick, Fast R-CNN, CoRR abs/1504.08083 (2015) [arXiv:1504.08083].
K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, CoRR abs/1409.1556 (2014) [arXiv:1409.1556].
K. He, X. Zhang, S. Ren and J. Sun, Deep residual learning for image recognition, CoRR abs/1512.03385 (2015) [arXiv:1512.03385].
C. Szegedy et al., Going Deeper with Convolutions, arXiv:1409.4842 [INSPIRE].
I.J. Goodfellow et al., Generative Adversarial Networks, arXiv:1406.2661 [INSPIRE].
M. Paganini, L. de Oliveira and B. Nachman, CaloGAN: Simulating 3D high energy particle showers in multilayer electromagnetic calorimeters with generative adversarial networks, Phys. Rev. D 97 (2018) 014021 [arXiv:1712.10321] [INSPIRE].
L. de Oliveira, M. Paganini and B. Nachman, Controlling Physical Attributes in GAN-Accelerated Simulation of Electromagnetic Calorimeters, in 18th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2017), Seattle, WA, U.S.A., August 21-25, 2017 (2017) [arXiv:1711.08813] [INSPIRE].
M. Paganini, L. de Oliveira and B. Nachman, Accelerating Science with Generative Adversarial Networks: An Application to 3D Particle Showers in Multilayer Calorimeters, Phys. Rev. Lett. 120 (2018) 042003 [arXiv:1705.02355] [INSPIRE].
L. de Oliveira, M. Paganini and B. Nachman, Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics Synthesis, Comput. Softw. Big Sci. 1 (2017) 4 [arXiv:1701.05927] [INSPIRE].
GEANT4 collaboration, S. Agostinelli et al., GEANT4: A Simulation toolkit, Nucl. Instrum. Meth. A 506 (2003) 250 [INSPIRE].
T. Cohen, M. Freytsis and B. Ostdiek, (Machine) Learning to Do More with Less, JHEP 02 (2018) 034 [arXiv:1706.09451] [INSPIRE].
L.M. Dery, B. Nachman, F. Rubbo and A. Schwartzman, Weakly Supervised Classification in High Energy Physics, JHEP 05 (2017) 145 [arXiv:1702.00414] [INSPIRE].
E.M. Metodiev, B. Nachman and J. Thaler, Classification without labels: Learning from mixed samples in high energy physics, JHEP 10 (2017) 174 [arXiv:1708.02949] [INSPIRE].
P.T. Komiske, E.M. Metodiev, B. Nachman and M.D. Schwartz, Learning to classify from impure samples with high-dimensional data, Phys. Rev. D 98 (2018) 011502 [arXiv:1801.10158] [INSPIRE].
Open Access
This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.
ArXiv ePrint: 1803.00107
Cite this article
Macaluso, S., Shih, D. Pulling out all the tops with computer vision and deep learning. J. High Energ. Phys. 2018, 121 (2018). https://doi.org/10.1007/JHEP10(2018)121