Abstract
The likelihood ratio is a crucial quantity for statistical inference in science that enables hypothesis testing, construction of confidence intervals, reweighting of distributions, and more. Many modern scientific applications, however, make use of data- or simulation-driven models for which computing the likelihood ratio can be very difficult or even impossible. By applying the so-called “likelihood ratio trick,” approximations of the likelihood ratio may be computed using clever parametrizations of neural network-based classifiers. A number of different neural network setups can be defined to satisfy this procedure, each with varying performance in approximating the likelihood ratio when using finite training data. We present a series of empirical studies detailing the performance of several common loss functionals and parametrizations of the classifier output in approximating the likelihood ratio of two univariate and multivariate Gaussian distributions as well as simulated high-energy particle physics datasets.
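The likelihood-ratio trick mentioned above can be sketched in a few lines. A classifier trained to distinguish samples from two densities p and q converges (with enough data and capacity) to f(x) ≈ p(x)/(p(x)+q(x)), so the ratio is recovered as f/(1−f). The sketch below is a minimal illustration only, not the paper's setup: it uses scikit-learn's `LogisticRegression` in place of the neural networks studied in the paper, and the two unit-width Gaussians (means 0 and 1, for which the log-ratio is exactly linear) are illustrative choices.

```python
# Minimal sketch of the "likelihood ratio trick": a probabilistic classifier
# f(x) ~ p(x) / (p(x) + q(x)) trained on samples from p and q yields the
# likelihood ratio via r(x) = f(x) / (1 - f(x)).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 100_000
x_p = rng.normal(0.0, 1.0, n)  # samples from p = N(0, 1), labeled 1
x_q = rng.normal(1.0, 1.0, n)  # samples from q = N(1, 1), labeled 0

X = np.concatenate([x_p, x_q]).reshape(-1, 1)
y = np.concatenate([np.ones(n), np.zeros(n)])

# For these two Gaussians the log-odds are linear in x, so logistic
# regression is a well-specified "classifier"; the paper uses neural networks.
clf = LogisticRegression().fit(X, y)

def lr_estimate(x):
    f = clf.predict_proba(np.atleast_2d(x).T)[:, 1]  # ~ p / (p + q)
    return f / (1.0 - f)                             # ~ p / q

# Analytic ratio for this pair: p(x)/q(x) = exp(-x + 1/2).
xs = np.array([-1.0, 0.0, 1.0])
print(lr_estimate(xs))
print(np.exp(-xs + 0.5))
```

With finite training data the estimate only approximates the true ratio, and the quality of that approximation under different loss functionals and output parametrizations is exactly what the studies in this paper quantify.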
Acknowledgments
We are grateful to Jesse Thaler for very helpful feedback about figures-of-merit and ways to generalize our loss functions. We thank Dag Gillberg for the suggestion to compare NNs with BDTs and Vinicius Mikuni for the idea of using normalizing flows to model the LR of the physics datasets. We thank Lindsey Gray for his idea of exploring the space of loss functionals via parametrizations with splines. M.P. thanks Shirley Ho and the Flatiron Institute for their hospitality while preparing this paper. S.R., M.P., and B.N. are supported by the Department of Energy, Office of Science under contract number DE-AC02-05CH11231.
ArXiv ePrint: 2305.10500
Rights and permissions
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.
Cite this article
Rizvi, S., Pettee, M. & Nachman, B. Learning likelihood ratios with neural network classifiers. J. High Energ. Phys. 2024, 136 (2024). https://doi.org/10.1007/JHEP02(2024)136