Abstract
We demonstrate that deep neural networks with the ReLU activation function can efficiently approximate the solutions of various types of parametric linear transport equations. For non-smooth initial conditions, the solutions of these PDEs are high-dimensional and non-smooth functions, so their approximation typically suffers from the curse of dimensionality. We demonstrate that, through their inherent compositionality, deep neural networks can resolve the characteristic flow underlying the transport equations and thereby allow approximation rates that are independent of the parameter dimension.
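To make the compositional mechanism concrete, here is a brief sketch in the notation of the appendices (V the parametrized vector field, X the characteristic flow); it is a standard formulation of the method of characteristics, not a verbatim excerpt from the paper. For the parametric linear transport equation
\[
\partial_{t} u(t,x,\eta) + V(t,x,\eta)\cdot\nabla_{x} u(t,x,\eta) = 0, \qquad u(0,x,\eta) = u_{0}(x,\eta),
\]
the method of characteristics yields
\[
u(t,x,\eta) = u_{0}\bigl(X(t,\cdot,\eta)^{-1}(x),\,\eta\bigr),
\]
so the solution is a composition of the regular inverse characteristic flow with the possibly non-smooth initial datum. A deep ReLU network can approximate the two factors separately, at rates governed by the regularity of each, and then compose the approximations.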
Acknowledgments
The authors would like to thank Avi Mayorcas for inspiring discussions in the early stage of this work. The authors thank François Golse for advice and various helpful suggestions on the theory of transport equations. P.P. is grateful for the hospitality and support of the Institute of Mathematics of the University of Oxford during his visit in January 2020.
Funding
Open Access funding provided by University of Vienna.
Additional information
Communicated by: Jan Hesthaven
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1: Bounds for \(\|X\|_{C^{k}}\), k = 0,1
Proposition A.1
Let X be defined as in Theorem 3.3. Then, for every compact set \(K\subset \mathbb {R}^{n}\) there holds
with \(\|V\|_{C^{1}}:= \|V\|_{C^{1}([0,T]\times B_{G_{0}}(0)\times [0,1]^{D})}\).
Proof
We start with the definition of X given by
The fundamental theorem of calculus implies
With the help of the sublinear growth condition (H2), we conclude
Moreover, by Gronwall’s inequality
Hence,
We have by (A.3a)
Furthermore, applying the Leibniz integral rule to (A.4) yields
and therefore
Gronwall’s inequality then implies
The same procedure yields for \(\nabla_{x} X\) and \(\nabla_{\eta} X\)
Thus, assuming without loss of generality that \(T\geq 1\) and \(\|V\|_{C^{1}}\geq 1\), we get
□
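The displayed formulas in the proof above were lost in extraction. As a hedged reconstruction under the stated hypotheses — the flow in integral form and a sublinear growth bound \(|V(s,y,\eta)| \leq C(1+|y|)\), which is the usual content of (H2) — the chain of estimates reads:
\[
X(t,x,\eta) = x + \int_{0}^{t} V\bigl(s, X(s,x,\eta), \eta\bigr)\,\mathrm{d}s,
\]
hence
\[
|X(t,x,\eta)| \leq |x| + \int_{0}^{t} C\bigl(1+|X(s,x,\eta)|\bigr)\,\mathrm{d}s,
\]
and Gronwall's inequality gives \(|X(t,x,\eta)| \leq (|x| + CT)\,e^{CT}\). Differentiating the integral equation in x (or η) and applying Gronwall again bounds \(\|\nabla_{x}X\|\) and \(\|\nabla_{\eta}X\|\) by \(e^{\|V\|_{C^{1}}T}\) up to constants, which is the source of the final \(C^{1}\) bound.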
Appendix 2: Construction of a NN emulating the left Riemann sum
Proposition B.1
Let \(d \in \mathbb {N}_{\geq 2}\), T > 0, \({\Omega } \subset \mathbb {R}^{d-1}\), and let Φ be a NN with d-dimensional input. Then there exists a NN \(\widetilde {I}_{N}({\Phi })\) such that
where \(c_{1}, c_{2} > 0\) are independent of Φ and \(c_{3} := 3 \|\mathrm{R}({\Phi})\|_{L^{\infty}([0,T]\times {\Omega})}\).
Proof
Let, for \(i \in \{ 0, 1, \dots, N\}\), \(t_{i} := iT/N\). We define, for \(i\in\{0, \dots, N-1\}\),
Then \(\mathrm{R}({\Phi}_{i}^{\text{(shift)}})(t,x) = \mathrm{R}({\Phi})(t_{i}, x)\) for all \(t \in [0,T]\), \(x \in {\Omega}\). Moreover, \(W({\Phi}_{i}^{\text{(shift)}}) \leq 2W({\Phi}) + 2d\) and \(L({\Phi}_{i}^{\text{(shift)}}) = L({\Phi}) + 2\) by Proposition 2.2. Next, we define the following indicator networks for \(i \in \{0, \dots, N-1\}\):
We have that \(W({\Phi}^{\text{(ind)}}_{i}) = 7\), \(L({\Phi}^{\text{(ind)}}_{i}) = 3\), and, for \(t \in [0,T]\) and \(x \in {\Omega}\),
Let \(\bar {a} := \|\mathrm {R}({\Phi })\|_{L^{\infty }([0,T]\times {\Omega })}\). Now we set, for \(i \in \{0, \dots , N-1\}\),
We have that
It follows from (B.2) and (B.3) that, for \(t \in [0,T]\) and \(x \in {\Omega}\),
In addition, by Propositions 2.2 and 2.3,
Finally, we set
Now we have that \(t \leq t_{i}\) if \(\lceil tN/T \rceil \leq i\), and \(t \geq t_{i+1}\) if \(i \leq \lceil tN/T \rceil - 1\). Hence, for \(t \in [0,T]\) and \(x \in {\Omega}\),
Since, by (B.6),
we conclude the proof by observing with (B.7) and (B.8) that
□
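To illustrate the mechanism (not the proof's exact networks), the following NumPy sketch builds a one-hidden-layer ReLU "indicator" of an interval, analogous to \({\Phi}^{\text{(ind)}}_{i}\), and uses it to gate the summands of a left Riemann sum. The function names and the width parameter w are ours, chosen for the illustration.

import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def indicator(t, a, b, w):
    """Piecewise-linear bump: 0 for t <= a - w, 1 on [a, b], 0 for t >= b + w.
    Exactly representable by one hidden ReLU layer with four neurons."""
    return (relu(t - (a - w)) - relu(t - a)
            - relu(t - b) + relu(t - (b + w))) / w

def gated_left_riemann(f, t, x, T=1.0, N=100):
    """Left Riemann sum of s -> f(s, x) over [0, t]: the i-th summand is
    switched on (gate ~ 1) only once t has passed t_{i+1}, mirroring how the
    indicator networks select the active summands in the construction above."""
    h = T / N
    total = 0.0
    for i in range(N):
        gate = indicator(t, (i + 1) * h, T, h)  # ~1 iff t >= t_{i+1}
        total += h * f(i * h, x) * gate
    return total

# Sanity check: the integral of s -> x*cos(s) from 0 to t equals x*sin(t).
print(gated_left_riemann(lambda s, x: x * np.cos(s), t=0.8, x=2.0, N=400))
print(2.0 * np.sin(0.8))

In the actual construction the product of gate and summand is itself realized by a multiplication network (Proposition 2.3), which is why the constant \(c_{3}\) involves \(\|\mathrm{R}({\Phi})\|_{L^{\infty}}\).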
Proposition B.2 (Left Riemann sum)
Let \(M > 0\), \(n \in \mathbb{N}\), \(f \in C^{1}([0,T]\times [-M,M]^{n}; \mathbb{R})\), and \(N \in \mathbb{N}\). The approximation by the left Riemann sum of the integral of f with respect to its first argument from 0 to \(t \leq T\), where \(T \geq 1\), is given by
Then
Proof
Let \(N(t) := \max \limits \{i \in \mathbb {N} | t_{i}<t\}\). Then
□
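The displayed sum and error bound in Proposition B.2 were lost in extraction; a standard formulation consistent with the statement (a hedged reconstruction, not a verbatim excerpt) is
\[
I_{N}(f)(t,x) := \frac{T}{N}\sum_{i=0}^{N(t)-1} f(t_{i},x), \qquad t_{i} := \frac{iT}{N},
\]
with error
\[
\Bigl|\int_{0}^{t} f(s,x)\,\mathrm{d}s - I_{N}(f)(t,x)\Bigr| \;\leq\; \frac{T}{N}\Bigl(T\,\|\partial_{t} f\|_{L^{\infty}} + \|f\|_{L^{\infty}}\Bigr),
\]
since each full cell contributes at most \((T/N)^{2}\|\partial_{t} f\|_{L^{\infty}}\) by the mean value theorem, and the partial cell containing t contributes at most \((T/N)\|f\|_{L^{\infty}}\).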
Remark B.3
Equation (B.1) implies that for a NN Φ with \((n+1)\)-dimensional input there holds
with \(c = 3\|\mathrm {R}({\Phi })\|_{L^{\infty }([0,T]\times {\Omega })} > 0\).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Laakmann, F., Petersen, P. Efficient approximation of solutions of parametric linear transport equations by ReLU DNNs. Adv Comput Math 47, 11 (2021). https://doi.org/10.1007/s10444-020-09834-7