Abstract
Nonparametric copula density estimation is a useful tool for analyzing the dependence structure of a random vector from given samples. Usually, kernel estimators or penalized maximum likelihood estimators are considered. We propose solving the Volterra integral equation
to find the copula density \(\mathrm{c}(u_1, \ldots , u_d) = \frac{\partial ^d \mathrm{C}}{\partial u_1 \cdots \partial u_d}\) of the given copula \(\mathrm{C}\). In the statistical framework, the copula \(\mathrm{C}\) is not available and we replace it by the empirical copula of the pseudo samples, which converges to the unobservable copula \(\mathrm{C}\) for large sample sizes. Hence, we can treat copula density estimation from given samples as an inverse problem and take into account the instability of the inverse operator, which has an important impact if the input data of the operator equation are noisy. The well-known curse of dimensionality usually results in huge nonsparse linear systems after discretizing the operator equation. We present a Petrov–Galerkin projection for the numerical computation of the linear integral equation. A special choice of test and ansatz functions leads to a very special structure of the linear systems, such that we are able to estimate the copula density in higher dimensions as well.
1 Copula Density Estimation as an Inverse Problem
A copula is a multivariate distribution function of a \(d\)-dimensional random vector with uniformly distributed margins. Sklar’s theorem ensures that any joint multivariate distribution \(F\) of a \(d\)-dimensional vector \(\mathbf{X}=(X_1, \ldots , X_d)^T\) with margins \(F_j\) (\(j=1, \ldots , d\)) can be expressed as
\[ F(x_1, \ldots , x_d) = \mathrm{C}\left( F_1(x_1), \ldots , F_d(x_d) \right) , \]
where the copula is unique on \(\mathrm{range}(F_1) \times \cdots \times \mathrm{range}(F_d)\); that is, for continuous margins \(F_1, \ldots , F_d\) the copula \(\mathrm{C}\) is unique on the whole domain. Consequently, the copula contains the complete dependence structure of the random vector \(\mathbf{X}\). For a detailed introduction to copulas and their properties see, for example, [8], [9, Chap. 5] or [10]. In risk management, knowledge of the dependence is of paramount importance.
If the copula is sufficiently smooth, the copula density
\[ \mathrm{c}(u_1, \ldots , u_d) = \frac{\partial ^d \mathrm{C}(u_1, \ldots , u_d)}{\partial u_1 \cdots \partial u_d} \qquad (1) \]
exists, and the density conveys the dependence structure in a more accessible way, because the graphs of different copulas usually look very similar, with only small differences in slope. For this reason, the reconstruction of the copula density is a vibrant field of research in finance and many other scientific fields. Particularly in practical tasks, the dependence structure of more than two random variables is of special interest, that is, the dimension \(d\) is large. In nonparametric statistical estimation, usually kernel estimators are used, but they often suffer from boundary bias. There are also spline- or wavelet-based approximation methods, but most of them are only discussed in the two-dimensional case. Likewise, in [12], the authors discuss a penalized nonparametric maximum likelihood method in the two-dimensional case. A detailed survey of the literature on nonparametric copula density estimation can be found in [6]. However, most of the nonparametric methods are faced with the curse of dimensionality, such that the numerical computations are possible only in sufficiently low dimensions. Indeed, many authors discuss only the two-dimensional case in nonparametric copula density estimation.
In this paper we develop an alternative approach based on the theory of inverse problems. The copula density (1) exists only for absolutely continuous copulas. Obviously, the copula is not observable for a sample \(\mathbf{X}_1, \mathbf{X}_2, \ldots , \mathbf{X}_T\) in the statistical framework, but we can approximate it with the empirical copula
\[ \hat{\mathrm{C}}(u_1, \ldots , u_d) = \frac{1}{T} \sum _{j=1}^{T} \prod _{k=1}^{d} \mathbb {1}_{\lbrace \hat{U}_{kj} \le u_k \rbrace } \qquad (2) \]
of the margin-transformed pseudo samples \(\hat{\mathbf{U}}_1, \hat{\mathbf{U}}_2, \ldots , \hat{\mathbf{U}}_T\) with \(\hat{U}_{kj} = \hat{F}_k(X_{kj})\), where
\[ \hat{F}_k(x) = \frac{1}{T} \sum _{j=1}^{T} \mathbb {1}_{\lbrace X_{kj} \le x \rbrace } \]
denotes the empirical margins. It is well known that the empirical copula converges uniformly to the copula (see [2]),
\[ \sup _{\mathbf{u} \in \varOmega } \left| \hat{\mathrm{C}}(\mathbf{u}) - \mathrm{C}(\mathbf{u}) \right| = \mathcal {O}\left( T^{-1/2} \sqrt{\log \log T} \right) \quad \text {a.s.} \qquad (3) \]
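To make the construction of the pseudo samples and the empirical copula concrete, here is a minimal sketch in Python; it is our own illustration, not the authors' implementation, and the array shapes and function names are choices of this sketch:

```python
import numpy as np

def pseudo_samples(X):
    """Rank-transform each margin: the pseudo sample is U_hat[k, j] = F_hat_k(X[k, j]).

    X has shape (d, T): d components, T samples.  The empirical margin
    F_hat_k maps each observation to its rank divided by T.
    """
    # double argsort yields zero-based ranks within each margin
    ranks = np.argsort(np.argsort(X, axis=1), axis=1)
    return (ranks + 1) / X.shape[1]

def empirical_copula(U_hat, u):
    """Evaluate the empirical copula at a point u in [0, 1]^d."""
    return np.mean(np.all(U_hat <= u[:, None], axis=0))
```

For comonotone data every margin produces the same ranks, so the empirical copula evaluated at the largest pseudo observation equals one, as it must for a copula.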
Therefore, we treat the empirical copula as a noisy representation of the unobservable copula, \(\mathrm{C}^\delta = \hat{\mathrm{C}}\). Estimating the density would require differentiating the empirical copula, which is obviously not smooth. However, every density satisfies the integral equation
\[ \int _0^{u_1} \cdots \int _0^{u_d} \mathrm{c}(s_1, \ldots , s_d) \, \mathrm{d}s_d \cdots \mathrm{d}s_1 = \mathrm{C}(u_1, \ldots , u_d), \qquad (4) \]
which can be seen as a weak formulation of Eq. (1). In the following, we therefore consider the linear Volterra integral operator \(\mathrm{A}\in \mathcal {L} \left( L^1(\varOmega ) , L^2(\varOmega ) \right) \) and solve the linear operator equation
\[ \mathrm{A} \mathrm{c} = \mathrm{C} \qquad (5) \]
to find the copula density \(\mathrm{c}\). In the following, we assume attainability, which means \(\mathrm{C}\in \mathcal {R}(\mathrm{A})\); hence, we only consider copulas \(\mathrm{C}\in L^2(\varOmega )\) for which a solution \(\mathrm{c}\in L^1(\varOmega )\) exists.
The injective Volterra integral operator is well studied in the inverse problem literature. Even in the one-dimensional case, the problem is ill-posed because the inverse \(\mathrm{A}^{-1}\), which is the differential operator, is not continuous. Hence, solving Eq. (5) leads to numerical instabilities even if the right-hand side has only a small data error. Because the solution is sensitive to small data errors, regularization methods to overcome the instability are discussed in the inverse problem literature. For a detailed introduction to regularization see, for example, [4, 13].
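A one-dimensional toy computation illustrates this instability: recovering \(\mathrm{c}\) from \(\mathrm{C} = \mathrm{A}\mathrm{c}\) by finite differences amplifies a small data error by a factor of order \(1/h\). This is only an illustrative sketch, not part of the proposed method:

```python
import numpy as np

# Exact data C(u) = u^2/2 for the density c(s) = s, on a grid of size n.
n = 200
h = 1.0 / n
u = np.linspace(h, 1.0, n)
C = u**2 / 2.0
C_noisy = C + 1e-3 * np.random.default_rng(0).standard_normal(n)

# "Inverting" the Volterra operator by a backward difference quotient:
c_from_exact = np.diff(C, prepend=0.0) / h
c_from_noisy = np.diff(C_noisy, prepend=0.0) / h

err_exact = np.max(np.abs(c_from_exact - u))  # O(h) discretization error
err_noisy = np.max(np.abs(c_from_noisy - u))  # noise amplified by O(delta/h)
```

For exact data the error is of order \(h/2\); for data perturbed by \(\delta = 10^{-3}\) the difference quotient amplifies the noise by \(1/h = n\), so the reconstruction error is orders of magnitude larger.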
In Sect. 2 we discuss a discretization of the integral equation (4), and in Sect. 3 we illustrate the numerical instability that appears when the empirical copula is used instead of the exact one and discuss regularization methods for the discretized problem.
The basics of the numerical implementation of the problem, and especially the details of the Kronecker multiplication, are presented in the authors' working paper [14]; that the Petrov–Galerkin projection is not a simple counting algorithm is discussed in [15]. This paper gives a summary of the proposed method for the effective computation of the right-hand side in larger dimensions and discusses in more detail the analytical aspects of the inverse problem and the reasons for the existence of the Kronecker structure.
2 Numerical Approximation
We discuss the numerical computation of the copula density \(\mathrm{c}\in \mathrm {X}= L^1(\varOmega )\) from a given copula \(\mathrm{C}\in \mathrm {Y}= L^2(\varOmega )\), which is in principle a numerical differentiation and, in higher dimensions, a very hard problem (see [1]). Moreover, in practical applications, the measured data \(\mathrm{C}^\delta \) carry some noise \(\delta \) with \(\left\| \mathrm{C}- \mathrm{C}^\delta \right\| _{\mathrm {Y}} \le \delta \), and very often the function is not smooth enough, that is, \(\mathrm{C}^\delta \notin C^1(\varOmega )\) even if \(\mathrm{C}\in C^1(\varOmega )\), which leads to numerical instabilities making a usual numerical differentiation impossible.
For the sake of convenience, we write
\[ \mathrm{A} \mathrm{c} = \mathrm{C} \]
as a short form of Eq. (4). We propose applying a Petrov–Galerkin projection (see [5]) for some discretization size \(h\) and consider the finite dimensional approximation
\[ \mathrm{c}_h(\mathbf{u}) = \sum _{j=1}^{N} c_j \phi _j(\mathbf{u}), \qquad (6) \]
where \(\varPhi = \lbrace \phi _1,\phi _2, \ldots , \phi _N \rbrace \) is a basis of the ansatz space \(V_h\). The vector of coefficients \(\mathbf{c}=(c_1, \ldots , c_N)^T \in \mathbb {R}^N\) is chosen such that
\[ \left\langle \mathrm{A} \mathrm{c}_h - \mathrm{C}, \psi \right\rangle = 0 \quad \text {for all } \psi \in \tilde{V}_h. \qquad (7) \]
It is sufficient to fulfill Eq. (7) for \(N\) linearly independent test functions \(\psi _i \in \tilde{V}_h\). This yields the system of linear equations
\[ K \mathbf{c} = \mathbf{C} \qquad (8) \]
with right-hand side
\[ \mathbf{C}_i = \left\langle \mathrm{C}, \psi _i \right\rangle = \int _\varOmega \mathrm{C}(\mathbf{u}) \, \psi _i(\mathbf{u}) \, \mathrm{d}\mathbf{u}, \quad i = 1, \ldots , N, \qquad (9) \]
and the \(N \times N\) matrix \(K\) with
\[ K_{ij} = \left\langle \mathrm{A} \phi _j, \psi _i \right\rangle = \int _\varOmega (\mathrm{A} \phi _j)(\mathbf{u}) \, \psi _i(\mathbf{u}) \, \mathrm{d}\mathbf{u}. \]
If the exact copula is replaced by the empirical copula, we obtain a noisy representation \(\mathbf{C}^\delta \) with
\[ \mathbf{C}^\delta _i = \left\langle \hat{\mathrm{C}}, \psi _i \right\rangle = \int _\varOmega \hat{\mathrm{C}}(\mathbf{u}) \, \psi _i(\mathbf{u}) \, \mathrm{d}\mathbf{u} \qquad (10) \]
of the exact right-hand side \(\mathbf{C}\). A typical phenomenon of ill-posed inverse problems is that the numerically computed solution based on noisy data (10) will oscillate strongly unless a proper regularization is chosen. This problem is not caused by the numerical approximation, but rather by the discontinuity of the inverse operator. This will be illustrated in Sect. 3. Figure 3 shows the reconstructed density of the Student copula for exact data (9), whereas Fig. 5 shows it for different noise levels.
In principle, we can choose arbitrary ansatz functions \(\phi _j \in V_h\) and test functions \(\psi _i \in \tilde{V}_h\). However, having the curse of dimensionality in mind, we choose very simple ansatz functions such that the matrix \(K\) gets a very special structure allowing us to solve (8) and compute the approximated copula density also for higher dimensional copulas. Obviously, the approximated density (6) is not smooth, and in order to obtain a smoother approximated copula \(\mathrm{C}_h\) with
\[ \mathrm{C}_h = \mathrm{A} \mathrm{c}_h, \]
we choose the test functions as integrated ansatz functions, such that the approximated copula
\[ \mathrm{C}_h(\mathbf{u}) = \sum _{j=1}^{N} c_j \, (\mathrm{A} \phi _j)(\mathbf{u}) = \sum _{j=1}^{N} c_j \, \psi _j(\mathbf{u}) \]
is smoother than the approximated density.
We discretize the domain \(\varOmega \) by splitting each one-dimensional interval \([0,1]\) into \(n\) equal subintervals of length \(h=\frac{1}{n}\). Hence, we obtain \(N=n^d\) equal-sized hypercubes and call these elements \(e_1, \ldots , e_N\). We number the elements in a specific order, illustrated in Fig. 1, such that in the \((d+1)\)-dimensional problem the first \(n^d\) elements have the same number and location as the elements of the \(d\)-dimensional problem.
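One ordering with this nesting property treats the first coordinate as the fastest-running digit of a base-\(n\) representation of the element number. The following sketch is our own illustration of the embedding property; the exact convention of Fig. 1 is not reproduced here:

```python
def element_index(multi_index, n):
    """Map a zero-based multi-index (i_1, ..., i_d) to a linear element number.

    The first coordinate runs fastest, so a d-dimensional grid embedded in a
    (d+1)-dimensional one (new coordinate fixed to 0) keeps the numbers of
    its first n**d elements unchanged.
    """
    number = 0
    for k, i_k in enumerate(multi_index):
        number += i_k * n**k
    return number
```

Appending a zero coordinate does not change the number of an element, which is exactly the nesting property described above.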
We set \(N=n^d\) and choose the ansatz functions
\[ \phi _j(\mathbf{u}) = \begin{cases} 1, & \mathbf{u} \in e_j, \\ 0, & \text {else}, \end{cases} \qquad (11) \]
and the test functions \(\psi _i\) as the integrated ansatz functions
\[ \psi _i(\mathbf{u}) = (\mathrm{A} \phi _i)(\mathbf{u}) = \int _0^{u_1} \cdots \int _0^{u_d} \phi _i(s_1, \ldots , s_d) \, \mathrm{d}s_d \cdots \mathrm{d}s_1. \qquad (12) \]
In contrast to finite element discretizations, the system matrix \(K\) is not sparse, and the system size \(N=n^d\) grows exponentially with the dimension \(d\). A straightforward assembly and solution of the linear system (8) becomes impossible for usual discretizations \(n\). Even in the three-dimensional case, storing the system matrix for \(n=80\) needs approximately one terabyte, even when exploiting symmetry, and the computing times for assembling and solving such systems become enormous.
The choices (11) and (12) yield a structure of the \(N \times N\) system matrix \(K\), illustrated in Fig. 2, allowing us to solve (8) also for \(d>2\). The matrix plot shows that the \(n \times n\) system matrix of the one-dimensional case is equivalent to the upper left \(n \times n\) corner of the two- and three-dimensional matrices. Moreover, the other parts of the system matrices are scaled replications of the one-dimensional \(n \times n\) system matrix. This effect is based on a Kronecker factorization of the \(d\)-dimensional system matrix into \(d\) copies of the system matrix of the one-dimensional problem.
One important reason for this structure is that the chosen ansatz functions decompose into a product of one-dimensional ansatz functions. In order to illustrate this, we consider the lowest corner \(\mathbf{b}^i\) of the \(i\)th element and define the one-dimensional function
\[ \varphi _{b}(s) = \begin{cases} 1, & s \in [b, b+h), \\ 0, & \text {else}. \end{cases} \]
This yields
\[ \phi _i(\mathbf{u}) = \prod _{k=1}^{d} \varphi _{b^i_k}(u_k) \qquad (13) \]
as well as
\[ \psi _i(\mathbf{u}) = \prod _{k=1}^{d} \psi ^{(1)}_{b^i_k}(u_k) \qquad (14) \]
with the one-dimensional test functions
\[ \psi ^{(1)}_{b}(u) = \int _0^{u} \varphi _{b}(s) \, \mathrm{d}s = \min \left( \max (u - b, 0), h \right) . \]
We only formulate the main result allowing us to compute solutions of (8) also for higher dimensions \(d\). Details and proofs can be found in the working paper [14].
Theorem 1
The system matrix for the \((d+1)\)-dimensional case can be extracted from the one- and \(d\)-dimensional system matrices.
Corollary 1
The system matrix \({^{(d)}}K\) is the \(d\)-fold Kronecker product of the \(n \times n\) matrix \({^{(1)}}K\),
\[ {^{(d)}}K = \underbrace{{^{(1)}}K \otimes \cdots \otimes {^{(1)}}K}_{d \text { times}}, \qquad (15) \]
and the inverse system matrix of the \(d\)-dimensional problem is the \(d\)-fold Kronecker product of the one-dimensional inverse system matrix,
\[ {^{(d)}}K^{-1} = \underbrace{{^{(1)}}K^{-1} \otimes \cdots \otimes {^{(1)}}K^{-1}}_{d \text { times}}. \]
Following Corollary 1, we only have to assemble the one-dimensional system matrix \({^{(1)}}K\) of dimension \(n \times n\), compute its inverse \({^{(1)}}K^{-1}\), and perform the Kronecker multiplication to compute the solution \(\mathbf{c} = {^{(d)}}K^{-1} \mathbf{C}\) of (8). Details of the algorithm and an effective Kronecker multiplication are given in [14]. Using effective parallelization methods, the running time can be reduced further. Actually, the computation of the right-hand side (9) is the crucial part and much more expensive than solving the linear system, because we have to evaluate \(N=n^d\) different \(d\)-dimensional integrals over the whole domain \(\varOmega \). Note that for our special choice of ansatz functions (6) we have
\[ \mathbf{C}_i = \left\langle \mathrm{C}, \mathrm{A} \phi _i \right\rangle = \left\langle \mathrm{A}^* \mathrm{C}, \phi _i \right\rangle = \int _{e_i} (\mathrm{A}^* \mathrm{C})(\mathbf{s}) \, \mathrm{d}\mathbf{s}, \qquad (16) \]
where \((\mathrm{A}^* \mathrm{C})(\mathbf{s}) = \int _{s_1}^{1} \cdots \int _{s_d}^{1} \mathrm{C}(\mathbf{u}) \, \mathrm{d}\mathbf{u}\) denotes the adjoint Volterra operator, which also reduces the numerical effort. In higher dimensions, the number of elements \(e_i\) with zero values grows, such that using Eq. (16) instead of (9) improves the running times.
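A small sketch of this procedure, as our own illustration: \({^{(1)}}K\) is assembled by quadrature under the assumption \(K_{ij} = \int_0^1 \psi_i(u)\,\psi_j(u)\,\mathrm{d}u\), which follows from (12) since the test functions equal the integrated ansatz functions, and the Kronecker inverse is applied mode by mode without ever forming \({^{(d)}}K\):

```python
import numpy as np

def assemble_K1(n, quad=400):
    """One-dimensional system matrix K_ij = int_0^1 psi_i(u) psi_j(u) du,
    with psi_i(u) = min(max(u - i*h, 0), h) the integrated indicator."""
    h = 1.0 / n
    u = (np.arange(quad) + 0.5) / quad      # midpoint quadrature nodes
    b = np.arange(n) * h                    # lower corners of the intervals
    Psi = np.clip(u[None, :] - b[:, None], 0.0, h)
    return (Psi @ Psi.T) / quad

def kron_apply(M, b, d):
    """Apply (M ⊗ ... ⊗ M), d factors, to a vector b of length n**d,
    one tensor mode at a time instead of forming the n**d x n**d matrix."""
    n = M.shape[0]
    x = b.reshape((n,) * d)
    for axis in range(d):
        x = np.moveaxis(np.tensordot(M, x, axes=([1], [axis])), 0, axis)
    return x.ravel()

# solving (8): c = kron_apply(np.linalg.inv(assemble_K1(n)), C_vec, d)
```

The mode-wise application costs \(\mathcal{O}(d\,n^{d+1})\) operations and \(\mathcal{O}(n^d)\) memory, instead of \(\mathcal{O}(n^{2d})\) memory for the assembled matrix.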
In the practically most relevant case, where the components of the right-hand side (10) are evaluated for the empirical copula (2), the numerical effort can be reduced radically, because the \(d\)-dimensional integral
\[ \mathbf{C}^\delta _i = \int _\varOmega \hat{\mathrm{C}}(\mathbf{u}) \, \psi _i(\mathbf{u}) \, \mathrm{d}\mathbf{u} \]
degenerates into a product of \(d\) one-dimensional integrals,
\[ \mathbf{C}^\delta _i = \frac{1}{T} \sum _{j=1}^{T} \prod _{k=1}^{d} \int _{\hat{U}_{kj}}^{1} \min \left( \max (u - b^i_k, 0), h \right) \mathrm{d}u, \qquad (17) \]
using Eqs. (13) and (14). In this case, the numerical effort is of order \(\mathcal{O} \left( N T d \right) \), which is a dramatic improvement over \(\mathcal{O} \left( N 3^d T+\frac{N^2+N}{2}3^d \right) \) if the \(d\)-dimensional integrals (10) are computed numerically by a usual \(3^d\)-point Gauss formula. We want to point out that the computation of the right-hand side (10) for the empirical copula based on formula (17) is still possible for \(d=9\), whereas the computational effort for computing (16) for an arbitrary given copula \(\mathrm{C}\) is exorbitant, even if the discretization size \(n\) is moderately chosen. The numerical effort is illustrated in Table 1.
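With the one-dimensional integrated factor \(\psi_b(u) = \min(\max(u-b,0),h)\), the inner integral \(\int_v^1 \psi_b(u)\,\mathrm{d}u\) has a closed form, and the product formula (17) can be sketched as follows (our own illustration; the row-major element ordering matches numpy's reshape convention and is an assumption):

```python
import numpy as np

def rhs_empirical(U_hat, n):
    """Right-hand side entries from pseudo samples U_hat (shape (d, T)) as
    products of d one-dimensional integrals -- O(N*T*d) effort."""
    d, T = U_hat.shape
    h = 1.0 / n
    b = np.arange(n) * h                 # lower corners 0, h, ..., 1-h

    def tail_integral(v, bb):
        # closed form of int_v^1 min(max(u - bb, 0), h) du
        total = h * h / 2 + h * (1.0 - bb - h)       # value for v <= bb
        t = np.clip(v - bb, 0.0, h)
        used = t * t / 2 + h * np.maximum(v - bb - h, 0.0)
        return total - used

    # G[k][i, j] = int_{U_hat[k, j]}^1 psi_{b_i}(u) du
    G = [tail_integral(U_hat[k][None, :], b[:, None]) for k in range(d)]
    P = G[0]
    for k in range(1, d):
        P = P[..., None, :] * G[k]       # outer product over element indices
    return P.mean(axis=-1).ravel()
```

Each entry is the mean over samples of a product of \(d\) closed-form one-dimensional integrals, so no \(d\)-dimensional quadrature is needed at all.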
Note that, contrary to what might be expected, the vector \(\mathbf{c}=(c_1, \ldots , c_N)^T\) does not count the number of samples in the elements, even though the approximated solution \(\mathrm{c}_h\) is a piecewise constant function on the elements; the Petrov–Galerkin projection is not a simple counting algorithm (for more details see [15]).
2.1 Examples
In order to illustrate the computing times and approximation quality, we use the independence copula
\[ \varPi (u_1, \ldots , u_d) = \prod _{k=1}^{d} u_k, \]
which has the exact solution \(\mathrm{c}(u)=1\). Please note that for this example we used the exact copula as the right-hand side, without generating samples. So there is no data noise, hence \(\delta =0\), which allows us to separate the approximation error from the ill-posedness resulting from the discontinuity of the inverse operator \(\mathrm{A}^{-1}\).
Many authors (see, for example, [11]) look at the integrated square error, which is the squared \(L^2\)-norm of the difference between the copula density and its approximation. For the independence copula, the integrated square error can easily be computed as
\[ \left\| \mathrm{c}_h - \mathrm{c} \right\| _{L^2(\varOmega )}^2 = \int _\varOmega \left( \mathrm{c}_h(\mathbf{u}) - 1 \right) ^2 \mathrm{d}\mathbf{u} = h^d \sum _{j=1}^{N} \left( c_j - 1 \right) ^2 . \]
Actually, this error measure is unsuitable, because the natural space for densities is \(L^1\) instead of \(L^2\) (see [3]), and so we measure the difference in the \(L^1\)-norm, which can also be easily computed for the independence copula:
\[ \left\| \mathrm{c}_h - \mathrm{c} \right\| _{L^1(\varOmega )} = \int _\varOmega \left| \mathrm{c}_h(\mathbf{u}) - 1 \right| \mathrm{d}\mathbf{u} = h^d \sum _{j=1}^{N} \left| c_j - 1 \right| . \]
In Table 1, we give the following quantities for different discretization steps \(n\) in dimension \(1\) and dimension \(d\): the system size \(N=n^d\), the computing times \(t_{rhs}\) for assembling the right-hand side and \(t_{solve}\) for solving the system, the number of computing slaves \(s_{rhs}\), and the \(L^1\)-approximation errors. For the computation of the right-hand side, a parallel OpenMPI implementation with \(s_{rhs}\) computing slaves was used. For solving the system with the Kronecker factorization, a sequential C++ implementation is used. The exact computation of an ordinary right-hand side without using the product structure is impossible for \(d \ge 5\), and the corresponding times are estimated computing times. In summary, the example of the independence copula shows that for exact data of the right-hand side the approximation error is acceptable, but it grows with decreasing discretization size \(h = \frac{1}{n}\). We want to point out that this is a typical phenomenon of inverse problems, called “regularization by discretization”.
If we consider the practically more relevant case that the empirical copula, generated by \(T\) independent samples of the independence copula, is used, we are faced with data noise \(\delta > 0\) and ill-posedness. Table 2 shows that the computation based on (17) is still possible for \(d \approx 10\). However, the approximation error increases with the dimension \(d\), which is a direct consequence of the ill-posedness, because the condition number of the system matrix \(K\) is the condition number of the one-dimensional system matrix \({^{(1)}}K\) to the power of \(d\).
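The growth of the condition number follows directly from the Kronecker structure (15): for symmetric positive definite factors the eigenvalues of the Kronecker product are all products of the factors' eigenvalues, so the 2-norm condition number of the \(d\)-fold product is the \(d\)th power of the one-dimensional one. A quick check with a generic symmetric stand-in matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])   # stand-in for the 1-D system matrix
c1 = np.linalg.cond(A)
c3 = np.linalg.cond(np.kron(np.kron(A, A), A))
# for symmetric positive definite A the d-fold Kronecker product
# has condition number c1**d, here c3 == c1**3 up to rounding
```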
Naturally, our proposed method works not only for the rather simple independence copula; it also works quite well for all typical copula families. The approximation error for noise-free right-hand sides can be neglected. Figures 3 and 4 show the reconstructed densities for the Student and Frank copulas, using exact data for the right-hand side. Numerical results for other copula families, like the Gaussian, Gumbel, or Clayton copula, can be found in [14]. However, ill-posedness is expected when empirical copulas are used and we are faced with data noise, which we discuss in the next section.
3 Ill-Posedness and Regularization
Note that in real problems the copula \(\mathrm{C}\) is not known and we only have the noisy data (10) instead of (9). In order to illustrate the expected numerical instabilities, we have simulated \(T\) samples for each two-dimensional copula and present the nonparametrically reconstructed densities using the Petrov–Galerkin projection with grid size \(n=50\). A typical feature of ill-posed inverse problems is that the numerical instability decreases if the grid size \(n\) decreases, which can also be seen in Table 1. Therefore, we fix the grid size \(n=50\) and look at the influence of the sample size \(T\).
Because of (3), the data noise \(\delta \) increases if \(T\) decreases. Figures 5 and 6 show the expected ill-posedness appearing for decreasing sample size \(T\). Of course, these instabilities also occur for the other copula families, but we restrict our illustration here to these two examples. More examples can be found in [14].
To overcome the ill-posedness, an appropriate regularization for the discretized problem (8) is required. Figures 7 and 8 show the reconstructed copula densities for \(T=1000\) and \(T=10000\) samples using the well-known Tikhonov regularization. There is no regularization if the regularization parameter \(\alpha =0\) is chosen; the left-hand sides of the figures show the unregularized solutions. The choice of the regularization parameter \(\alpha = 10^{-8}\) is very naive and arbitrary, and serves only as a demonstration of how the instability can be handled. A better parameter choice should improve the reconstructed densities. Discussing an appropriate parameter choice rule for Tikhonov regularization, as well as other regularization methods, is left for future work.
In order to avoid the complete assembling of the system matrix \(K\), which leads to high-dimensional systems for \(d>2\), we are interested in regularization methods using the special structure (15). In particular, all regularization methods based on the singular value or eigenvalue decomposition of \(K\) can be handled easily, because the eigenvalue decomposition of the one-dimensional matrix \({^{(1)}}K = V \varLambda V^T\) leads to the eigenvalue decomposition of the system matrix
\[ {^{(d)}}K = \left( V \otimes \cdots \otimes V \right) \left( \varLambda \otimes \cdots \otimes \varLambda \right) \left( V \otimes \cdots \otimes V \right) ^T . \]
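Based on this factorized eigenvalue decomposition, Tikhonov regularization can be applied without ever assembling \({^{(d)}}K\). A sketch of this idea (our own illustration, assuming a symmetric \({^{(1)}}K\); \(\lambda/(\lambda^2+\alpha)\) is the usual Tikhonov filter factor):

```python
import numpy as np

def tikhonov_kron(K1, b, d, alpha):
    """Tikhonov solution (K^T K + alpha I)^{-1} K^T b for K = K1 ⊗ ... ⊗ K1,
    using only the n x n eigendecomposition K1 = V Lam V^T."""
    lam, V = np.linalg.eigh(K1)
    n = K1.shape[0]

    def modes(M, x):
        # apply (M ⊗ ... ⊗ M) to a tensor, one mode at a time
        for axis in range(d):
            x = np.moveaxis(np.tensordot(M, x, axes=([1], [axis])), 0, axis)
        return x

    x = modes(V.T, b.reshape((n,) * d))     # transform to the eigenbasis
    L = lam
    for _ in range(d - 1):                  # eigenvalues of the big matrix
        L = np.multiply.outer(L, lam)       # all d-fold products of lam
    x = x * (L / (L * L + alpha))           # apply the Tikhonov filter
    return modes(V, x).ravel()              # transform back
```

For \(\alpha = 0\) the filter reduces to \(1/\lambda\) and the unregularized Kronecker solution is recovered.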
A typical property of Tikhonov regularization is that true peaks in the density are smoothed out. This effect appears in particular for the Student copula density. Hence, the reconstruction quality should improve if other regularization methods are used. In inverse problem theory, it is well known that Tikhonov regularization corresponds to an \(L^2\)-norm penalization of the regularized solutions. Therefore, \(L^1\) penalties or total variation penalties (see [7]) seem more suitable.
Furthermore, the approximated copula
\[ \mathrm{C}_h = \mathrm{A} \mathrm{c}_h = \sum _{j=1}^{N} c_j \psi _j \]
should satisfy the typical properties of copulas. For example, the requirement
\[ \mathrm{C}_h(1, \ldots , 1) = 1 \]
yields the condition \(\sum _{j=1}^N c_j =1\), and the requirements
\[ \mathrm{C}_h(1, \ldots , 1, u_k, 1, \ldots , 1) = u_k , \quad k = 1, \ldots , d, \]
lead to additional conditions on the vector \(\mathbf {c}\), which all together can be used to build problem-specific regularization methods.
References
Anderssen, R.S., Hegland, M.: For numerical differentiation, dimensionality can be a blessing!. Math. Comput. 68(227), 1121–1141 (1999)
Deheuvels, P.: Non parametric tests of independence. In: Raoult J.P. (ed.) Statistique non Paramétrique Asymptotique. Lecture Notes in Mathematics, vol. 821, pp. 95–107. Springer, Berlin Heidelberg (1980). doi:10.1007/BFb0097426
Devroye, L., Györfi, L.: Nonparametric Density Estimation: the L1 View. Wiley Series in Probability and Mathematical Statistics. Wiley, New York (1985)
Engl, H., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. Mathematics and Its Applications. Springer, New York (1996)
Grossmann, C., Roos, H., Stynes, M.: Numerical Treatment of Partial Differential Equations. Universitext. Springer, Berlin (2007)
Kauermann, G., Schellhase, C., Ruppert, D.: Flexible copula density estimation with penalized hierarchical b-splines. Scand. J. Stat. 40(4), 685–705 (2013)
Koenker, R., Mizera, I.: Density estimation by total variation regularization. Adv. Stat. Model. Inference pp. 613–634 (2006)
Mai, J., Scherer, M.: Simulating Copulas: Stochastic Models, Sampling Algorithms and Applications. Series in Quantitative Finance. Imperial College Press, London (2012)
McNeil, A., Frey, R., Embrechts, P.: Quantitative Risk Management: Concepts, Techniques and Tools. Princeton Series in Finance. Princeton University Press, Princeton (2010)
Nelsen, R.B.: An Introduction to Copulas. Springer Series in Statistics. Springer, New York (2006)
Qu, L., Qian, Y., Xie, H.: Copula density estimation by total variation penalized likelihood. Commun. Stat.—Simul. Comput. 38(9), 1891–1908 (2009). doi:10.1080/03610910903168587
Qu, L., Yin, W.: Copula density estimation by total variation penalized likelihood with linear equality constraints. Comput. Stat. Data Anal. 56(2), 384–398 (2012). doi:10.1016/j.csda.2011.07.016
Schuster, T., Kaltenbacher, B., Hofmann, B., Kazimierski, K.: Regularization Methods in Banach Spaces. Radon Series on Computational and Applied Mathematics. Walter de Gruyter, Berlin (2012)
Uhlig, D., Unger, R.: A Petrov-Galerkin projection for copula density estimation. Technical report, TU Chemnitz, Department of Mathematics (2013). http://www.tu-chemnitz.de/mathematik/preprint/2013/PREPRINT.php?year=2013&num=07
Uhlig, D., Unger, R.: The Petrov-Galerkin projection for copula density estimation isn’t counting. Technical report, TU Chemnitz, Department of Mathematics (2014). http://www.tu-chemnitz.de/mathematik/preprint/2014/PREPRINT.php?year=2014&num=03
Open Access This chapter is distributed under the terms of the Creative Commons Attribution Noncommercial License, which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
© 2015 The Author(s)
Uhlig, D., Unger, R. (2015). Nonparametric Copula Density Estimation Using a Petrov–Galerkin Projection. In: Glau, K., Scherer, M., Zagst, R. (eds) Innovations in Quantitative Risk Management. Springer Proceedings in Mathematics & Statistics, vol 99. Springer, Cham. https://doi.org/10.1007/978-3-319-09114-3_25