Abstract
A global measure of observation correlations in a network is proposed, together with auxiliary indices related to the non-diagonal elements of the correlation matrix. Based on this global measure, a specific representation of the correlation matrix is presented, resulting from a rigorously proven theorem formulated within the present research. According to the theorem, each positive definite correlation matrix can be expressed by a scale factor and a so-called internal weight matrix. Such a representation makes it possible to investigate the structure of the basic reliability measures with regard to observation correlations. Numerical examples carried out for two test networks illustrate the structure of those measures that proved to be dependent on the global correlation index. Levels of global correlation are also proposed. It is shown that one can readily find an approximate value of the global correlation index, and hence the correlation level, when the expected values of the auxiliary indices are the only knowledge about the correlation matrix of interest. The paper is an extended continuation of the authors' previous study, which was confined to the elementary case termed uniform correlation. The extension covers arbitrary correlation matrices and the structure of the correlation effect.
1 Introduction
It is commonly known that covariance matrices for observations play an important role in constructing the stochastic models for geodetic networks. Possessing reliable values of the elements of covariance matrices in satellite and ground positioning systems is of considerable significance in estimating the actual accuracy of final coordinates (Wang et al. 2012). Also, the design of such systems based on the knowledge of potential effects of observation correlations may contribute to achieving high reliability of the systems.
The covariance matrices, or more precisely the correlation matrices contained in them, are the weak elements (in terms of accuracy) of stochastic models built for networks. For correlation matrices that are obtained by estimation in data post-processing, the accuracy of non-diagonal elements can undergo steady improvement owing to ongoing research on methods of estimating covariance matrices (e.g., Ananga et al. 1994; Leandro et al. 2005). The most difficult situation, not yet solved, arises at the stage of network design, where knowledge of the non-diagonal terms in correlation matrices is usually limited, or sometimes absent altogether. This calls for methods of constructing substitute matrices on the basis of whatever knowledge of the correlation terms is available.
Seeking the tools for investigating the effect of observation correlations on network reliability, we have then to cover both the case of a specified correlation matrix and the substitute one. So, the tools should be applicable to any arbitrary correlation matrix.
The effect of observation correlations on network internal reliability was addressed in Prószyński and Kwaśniak (2016). The investigations were confined to the elementary case termed there uniform correlation, i.e., where all the non-diagonal elements of the correlation matrix are of equal value (positive or negative). Such a matrix, which played there the role of a substitute matrix, is known in financial mathematics as a constant correlation matrix (e.g., Tiit and Helemae 1997; Dufresne 2005). In those publications, one can find a rigorously derived necessary and sufficient condition for positive definiteness of such a matrix. The set of constant correlation matrices \((n\times n)\) can be immediately ordered according to the increasing values of the constant term. The difficulty lies in ordering the set of possible configurations of an arbitrary \((n \times n)\) correlation matrix. An appropriate measure is needed for this purpose.
The study of the elementary case of constant correlation matrices indicated the way to a more generalized approach and is assumed as reference material in the present investigations.
The objectives of the paper, covering the problems outlined above, can be formulated in the following way:
- to extend the former studies by seeking a global measure of observation correlations covering all possible configurations of correlation matrices; the measure should allow ranking of matrix configurations;
- to create a theoretical basis for operating with any correlation matrices in the analyses of the correlation effect;
- to construct analytical tools for investigating the structure of the correlation effect on internal and external reliability of networks for any arbitrary correlation matrices;
- to propose levels of global correlation in networks.
As in the authors' previous paper, the research is confined to the single-outlier case.
Taking into account all the problems to be solved, reflected in the above objectives, the primary motivation to undertake the research was the fact that at the stage of network design the knowledge on non-diagonal terms in a correlation matrix for observations is usually small or none.
2 A proposed global measure of observation correlations in a network
The determinant of a correlation matrix is a measure of collinearity of explanatory variables in linear regression. High correlation between these variables has a negative influence upon the effectiveness of estimates of the model parameters. The determinant takes values within the interval \(\left\langle {0,1} \right\rangle \). Since 0 corresponds to full correlation (i.e., 1) and 1 corresponds to the lack of correlation (i.e., 0), to get a measure ordered according to the increase of correlation, we introduce for a positive definite matrix \(\mathbf{C}_{\mathrm{s}}\) the following quantity
\[{\uprho }_{\mathrm{G}} =\sqrt{1-\hbox {det}\mathbf{C}_{\mathrm{s}} } \qquad (1)\]
where \({\uprho }_{\mathrm{G}} \) is termed a global measure of stochastic relationships between the observations in a network, or for short—a measure of global correlation. It only takes positive values like a coefficient of determination (e.g., Anderson-Sprecher 1994). For the quality of any measure, it is recommended that it should be a one-to-one mapping of all possible values of the characterized quantity. Looking from the algebraic point of view, we can observe that the function \(f :\mathbf{C}_{\mathrm{s}} \rightarrow \hbox {det}\mathbf{C}_{\mathrm{s}} \) (or \(f :\mathbf{C}_{\mathrm{s}} \rightarrow {\uprho }_{\mathrm{G}} )\) is not injective (i.e., is not a one-to-one function). This means that although a specified \(\mathbf{C}_{\mathrm{s}} \) configuration corresponds to only one value of \(\hbox {det}{} \mathbf{C}_{\mathrm{s}} \), a specified value of \(\hbox {det}{} \mathbf{C}_{\mathrm{s}}\) (or \({\uprho }_{\mathrm{G}} )\) may correspond to a certain number of \(\mathbf{C}_{\mathrm{s}} \) configurations. Such numbers expressed in terms of frequencies are shown in Fig. 1 on exemplary diagrams obtained by numerical simulation of the \(\mathbf{C}_{\mathrm{s}} \) configurations for \(n~=~3, 4, 5\) (3 sets of 50,000 simulations were used for each graph). The simulations were based on the algorithm termed “accept-reject.” It consisted in drawing each non-diagonal element of the matrix \(\mathbf{C}_{\mathrm{s}}\) from the uniform distribution and rejecting those of the resulting matrix configurations that were not positive definite matrices. As stated in Numpacharoen and Atsawarungruangkit (2012) and Budden et al. (2008), such an algorithm is effective only for small values of n. In the present research, it was found empirically that this algorithm, being free of any algebraic operations, does not affect the frequencies of matrix configurations as is the case with some faster algorithms.
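The accept-reject procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the non-diagonal elements are drawn from the uniform distribution on (-1, 1), positive definiteness is tested by attempting a Cholesky factorisation, \({\uprho }_{\mathrm{G}}\) is taken as \(\sqrt{1-\hbox {det}\mathbf{C}_{\mathrm{s}} }\) (in line with the relationship \(q({\uprho }_{\mathrm{G}} )=(1-{\uprho }_{\mathrm{G}}^2 )^{1/n}\) given in Sect. 4), and all function names are hypothetical.

```python
import math
import random


def det_if_positive_definite(c):
    """Return det(c) if the symmetric matrix c is positive definite, else None.

    A Cholesky factorisation is attempted; it breaks down exactly when c is
    not positive definite. The determinant is the product of the pivots.
    """
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    det = 1.0
    for i in range(n):
        for j in range(i + 1):
            s = c[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return None  # non-positive pivot: reject this matrix
                low[i][i] = math.sqrt(s)
                det *= s
            else:
                low[i][j] = s / low[j][j]
    return det


def global_correlation(det_c):
    """Global correlation measure, assumed here to be sqrt(1 - det C_s)."""
    return math.sqrt(max(0.0, 1.0 - det_c))


def sample_correlation_matrix(n, rng):
    """Accept-reject: redraw all non-diagonal elements until the matrix is positive definite."""
    while True:
        c = [[1.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i):
                c[i][j] = c[j][i] = rng.uniform(-1.0, 1.0)
        det = det_if_positive_definite(c)
        if det is not None:
            return c, det


rng = random.Random(0)  # fixed seed, for reproducibility only
matrix, det = sample_correlation_matrix(3, rng)
print(round(global_correlation(det), 4))
```

The share of rejected draws grows quickly with n, which is consistent with the remark that such an algorithm is effective only for small n.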
In Fig. 1 we can observe an advantageous trend: with the increase of n, the percentage of \(\mathbf{C}_{\mathrm{s}} \) configurations increases significantly only for ever greater values of \({\uprho }_{\mathrm{G}}\), where \(\hbox {det}{} \mathbf{C}_{\mathrm{s}} \) starts its asymptotic ascent toward 1. Hence, the quality of the measure in the analyzed aspect of the function injectivity improves with the increase of n.
To approximate the actual numbers of \(\mathbf{C}_{\mathrm{s}}\) configurations corresponding to a specified value of \({\uprho }_{\mathrm{G}} \) more accurately than on the basis of the 0.01-width intervals, these numbers were determined for the 0.0001-width intervals located at \(\hbox {det}{} \mathbf{C}_{\mathrm{s}} \) values 0.1, 0.2,..., 0.9 and 0.9999 (see Table 1). The results of the above-mentioned simulations were used for this purpose. The zeroes in Table 1, more numerous for greater n, merely result from the difficulty of generating the correlation matrices and the very low probabilities of the investigated events.
Initially, we tried to get the above curves (hopefully, also for greater n) by means of the probabilistic approach involving determination of density of \(\hbox {det}{} \mathbf{C}_{\mathrm{s}} \), treated as a random variable (e.g., Hanea and Nane 2016). That would require a rigorous probabilistic definition of the elements of \(\mathbf{C}_{\mathrm{s}}\) and would yield the results dependent on that definition. Finally, we chose the above-mentioned algebraic approach as satisfactory to show the degree of departure from injectivity for the function f :\(\mathbf{C}_{\mathrm{s}} \rightarrow \hbox {det}\mathbf{C}_{\mathrm{s}} \).
Although the function \(\mathbf{C}_{\mathrm{s}} \rightarrow \hbox {det}{} \mathbf{C}_{\mathrm{s}} \) (or \(\mathbf{C}_{\mathrm{s}} \rightarrow {\uprho }_{\mathrm{G}}\)) is not injective, the use of \({\uprho }_{\mathrm{G}} \) makes it possible, albeit with some ambiguity, to rank the configurations of a correlation matrix. As follows from Fig. 1, the ambiguities for n = 5 are relatively small for \({\uprho }_{\mathrm{G}} < 0.85\) (i.e., frequency \({<}\)2%). However, they increase abruptly for higher values of \({\uprho }_{\mathrm{G}} \).
With constant correlation matrices, we get a specific measure of observation correlations denoted in Prószyński and Kwaśniak (2016) by the symbol a. The measure provides ambiguity-free ranking of constant correlation matrices. It is related to \(\hbox {det}{} \mathbf{C}_{\mathrm{s}} \) by the formula (Dufresne 2005)
\[\hbox {det}\mathbf{C}_{\mathrm{s}}^n (a)=(1-a)^{n-1}\left[ {1+(n-1)a} \right] \qquad (2)\]
where \(\mathbf{C}_{\mathrm{s}}^n \) denotes \(\mathbf{C}_{\mathrm{s}} (n\times n)\).
On the basis of (1) and (2), we get
\[{\uprho }_{\mathrm{G}} =\sqrt{1-(1-a)^{n-1}\left[ {1+(n-1)a} \right] } \qquad (3)\]
The relationship between a and \({\uprho }_{\mathrm{G}} \) for \(n~=~2, 3, {\ldots }, 50\) is shown in Fig. 2. All the curves approach \({\uprho }_{\mathrm{G}} = 1\) asymptotically. We can see that with the increase of n, the same values of \({\uprho }_{\mathrm{G}} \) correspond to diminishing values of a.
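The behaviour plotted in Fig. 2 can be reproduced numerically. The sketch below assumes the standard closed form for the determinant of a constant correlation matrix, \((1-a)^{n-1}[1+(n-1)a]\), together with \({\uprho }_{\mathrm{G}} =\sqrt{1-\hbox {det}\mathbf{C}_{\mathrm{s}} }\); the function names are illustrative only.

```python
import math


def det_constant_correlation(a, n):
    """Determinant of the (n x n) matrix with ones on the diagonal and a elsewhere."""
    return (1.0 - a) ** (n - 1) * (1.0 + (n - 1) * a)


def rho_g_constant(a, n):
    """Global correlation measure for a constant correlation matrix."""
    if not -1.0 / (n - 1) < a < 1.0:
        raise ValueError("a must satisfy -1/(n-1) < a < 1 for positive definiteness")
    return math.sqrt(max(0.0, 1.0 - det_constant_correlation(a, n)))


# For a fixed a, rho_G grows with n, so a fixed rho_G is reached at a smaller a.
for n in (2, 5, 20, 50):
    print(n, round(rho_g_constant(0.3, n), 4))
```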
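We can easily connect the global measure \({\uprho }_{\mathrm{G}} \) with the index of multiple correlation \({\uprho }_{(i)} \) for the ith observation \((i~=~1, {\ldots }, n)\), i.e.,
\[{\uprho }_{(i)} =\sqrt{1-\frac{\hbox {det}\mathbf{C}_{\mathrm{s}} }{\hbox {det}\mathbf{C}_{\mathrm{s}(i)} }} =\sqrt{1-\frac{1-{\uprho }_{\mathrm{G}}^2 }{\hbox {det}\mathbf{C}_{\mathrm{s}(i)} }} \qquad (4)\]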
where \(\hbox {det}{} \mathbf{C}_{\mathrm{s}(i)} \) is obtained after deleting the ith row and the ith column in \(\mathbf{C}_{\mathrm{s}} \).
3 Auxiliary indices connecting a global measure \({\uprho }_{\mathrm{G}}\) with the elements of correlation matrix
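The definition of the global measure (1) does not provide information about the magnitudes of non-diagonal elements of \(\mathbf{C}_{\mathrm{s}} \). Hence, the following auxiliary indices that may provide some more detailed description of a global measure \({\uprho }_{\mathrm{G}} \) are proposed and investigated
\[\left| {\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{ij} } \right| _{\mathrm{max}} =\mathop {\max }\limits _{i\ne j} \left| {\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{ij} } \right| ,\qquad \left( {\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{ij} } \right) _{\mathrm{qm}} =\sqrt{\frac{1}{n(n-1)}\sum \limits _{i\ne j} {\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{ij}^2 } } \qquad (5)\]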
where \(\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{ij} \) is a non-diagonal element of \(\mathbf{C}_{\mathrm{s}} \); the subscript “qm” denotes quadratic mean.
The indices are related to each individual \(\mathbf{C}_{\mathrm{s}} \) configuration that can be generated for a specified value of \({\uprho }_{\mathrm{G}} \).
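As a rough illustration, the two indices can be computed for any given matrix as follows. The sketch assumes that the first index is the largest absolute value among the non-diagonal elements and that the second is the quadratic mean (root mean square) of those elements; the function names and the example matrix are hypothetical.

```python
import math


def off_diagonal(c):
    """List the non-diagonal elements of a square matrix given as nested lists."""
    n = len(c)
    return [c[i][j] for i in range(n) for j in range(n) if i != j]


def max_abs_index(c):
    """Largest absolute value among the non-diagonal elements."""
    return max(abs(v) for v in off_diagonal(c))


def quadratic_mean_index(c):
    """Quadratic mean (root mean square) of the non-diagonal elements."""
    values = off_diagonal(c)
    return math.sqrt(sum(v * v for v in values) / len(values))


c_s = [[1.0, 0.5, -0.2],
       [0.5, 1.0, 0.3],
       [-0.2, 0.3, 1.0]]
print(max_abs_index(c_s), round(quadratic_mean_index(c_s), 4))
```

For a constant correlation matrix both functions return the absolute value of the constant term.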
Below are the properties of the above indices. They are proven immediately for the specific case \(n~=~2\); for \(n\ge 3\), owing to the complexity of the task, they are confirmed only on the basis of numerical testing (for \(n~=~3, 5, 9\)).
Property 1
For a positive definite correlation matrix \(\mathbf{C}_{\mathrm{s}} (n\times n),\hbox { }n\ge 2\) and an arbitrarily chosen value of a global correlation measure \({\uprho }_{\mathrm{G}} \), denoted as , the following relationships occur
where \(a^{-}\) is the smaller of the two real roots \(a\) satisfying (3) for the chosen value of \({\uprho }_{\mathrm{G}} \).
The specific case (6) can be proved immediately. We can write \(\hbox {det}\mathbf{C}_{\mathrm{s}} =1-x^{2}\), where x denotes \(\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{12} \), and hence, we get the equality (6).
The numerically based confirmation for a general case (7) is presented in Appendix B.
Property 2
For a positive definite correlation matrix \(\mathbf{C}_{\mathrm{s}} (n\times n),\hbox { }n\ge 2\) and an arbitrarily chosen value of a global correlation measure \({\uprho }_{\mathrm{G}} \), denoted as , the following relationships occur
where \(a^{-}\) (as above) and \(a^{+}\) are the real roots \(a\) satisfying (3) for the chosen value of \({\uprho }_{\mathrm{G}} \).
The specific case (8) can be proved immediately. We can write \(\hbox {det}\mathbf{C}_{\mathrm{s}} =1-x^{2}\), where x denotes \(\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{12} \). Since \(\left( {\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{12} } \right) _{\mathrm{qm}} =\left| {\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{12} } \right| \), we get the equality (8).
The numerically based confirmation for a general case (9) is presented in Appendix B.
Both the properties are illustrated in Fig. 3. The intervals given by (7) and (9) are denoted there as MAV and QM, respectively.
The problem of finding \(a^{-}\) and \(a^{+}\) for a specified value of \({\uprho }_{\mathrm{G}} \) and \(n\ge 3\) amounts to finding the real roots of a polynomial of the nth order, each root satisfying the condition \(-\frac{1}{n-1}<a<1\). This can be done iteratively. For this purpose we used the Solver add-in for Microsoft Excel.
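An alternative to a spreadsheet solver is plain bisection. The sketch below is not the procedure used in the paper; it assumes the closed-form determinant of a constant correlation matrix, \((1-a)^{n-1}[1+(n-1)a]\), and \({\uprho }_{\mathrm{G}} =\sqrt{1-\hbox {det}\mathbf{C}_{\mathrm{s}} }\), and it relies on the determinant falling monotonically from 1 at \(a=0\) to 0 at either end of the admissible interval. The function names are illustrative.

```python
def det_constant_correlation(a, n):
    """Determinant of the (n x n) constant correlation matrix with term a."""
    return (1.0 - a) ** (n - 1) * (1.0 + (n - 1) * a)


def _bisect_from_zero(endpoint, target, n, tol):
    """Find a between 0 (det = 1) and endpoint (det = 0) with det(a) = target."""
    inner, outer = 0.0, endpoint
    while abs(outer - inner) > tol:
        mid = 0.5 * (inner + outer)
        if det_constant_correlation(mid, n) > target:
            inner = mid  # determinant still too large: move away from zero
        else:
            outer = mid
    return 0.5 * (inner + outer)


def constant_terms_for_rho_g(rho_g, n, tol=1e-12):
    """Return (a_minus, a_plus) giving the requested rho_G, for 0 < rho_G < 1."""
    if not 0.0 < rho_g < 1.0:
        raise ValueError("rho_g must lie strictly between 0 and 1")
    target = 1.0 - rho_g ** 2
    a_minus = _bisect_from_zero(-1.0 / (n - 1), target, n, tol)
    a_plus = _bisect_from_zero(1.0, target, n, tol)
    return a_minus, a_plus


a_minus, a_plus = constant_terms_for_rho_g(0.6, 5)
print(round(a_minus, 6), round(a_plus, 6))
```

Bisection is slower than Newton-type iterations but needs no derivative and cannot leave the interval in which the matrix stays positive definite.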
For uniform correlation matrices, i.e., \(\mathbf{C}_{\mathrm{s}} (a)\) as in Sect. 2, we have \(\left| {\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{ij} } \right| _{\mathrm{max}} =\left| a \right| \) and \( \left( {\left\{ {\mathbf{C}_{\mathrm{s}} (a)} \right\} _{ij} } \right) _{\mathrm{qm}} =\left| a \right| .\)
For proper interpretation of Property 1 and Property 2, we show in Fig. 4 a mutual arrangement of the \({\uprho }_{G}\), \(a^{+}\) and \(\left| {a^{-}} \right| \) curves with the increase of n (except for \(n~=~2\)).
For \(n~=~2\) (not shown in Fig. 4) the \(a^{+}\) and \(\left| {a^{-}} \right| \) curves coincide with the \({\uprho }_{G}\) curve. With the increase of n, the \(a^{+}\) and \(\left| {a^{-}} \right| \) curves move away downward from the \({\uprho }_{G}\) curve while diminishing their mutual separation. For \(n~=~100\), they only slightly depart from the \(\hbox {det}{} \mathbf{C}_{\mathrm{s}} \) axis on a long section, but close to \(\hbox {det}{} \mathbf{C}_{\mathrm{s}}= 0\) the \(a^{+}\) curve starts to ascend asymptotically to 1.
4 A representation of correlation matrix based on the proposed global measure
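Using a formula for the inverse of the correlation matrix \(\mathbf{C}_{\mathrm{s}}\) (positive definite), i.e.,
\[\mathbf{C}_{\mathrm{s}}^{-1} =\frac{1}{\hbox {det}\mathbf{C}_{\mathrm{s}} }\mathbf{C}_{\mathrm{s}}^{*} \qquad (10)\]
where \(\mathbf{C}_{\mathrm{s}}^{*} \) is the adjugate (positive definite), we may represent \(\mathbf{C}_{\mathrm{s}} (n\times n),\hbox { }n\ge 2\) in the form
\[\mathbf{C}_{\mathrm{s}} =\hbox {det}\mathbf{C}_{\mathrm{s}} \cdot \left( {\mathbf{C}_{\mathrm{s}}^{*} } \right) ^{-1} \qquad (11)\]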
where \(\hbox {det}{} \mathbf{C}_{\mathrm{s}} \) is a scale factor common for all the n observations in a network.
Since \(\hbox {det}{} \mathbf{C}_{\mathrm{s}}^{*} \ne 1\), \(\hbox {det}\mathbf{C}_{\mathrm{s}} \) as in (11) is not a complete scale factor, and the remaining part of it must be contained in \(\left( {\mathbf{C}_{\mathrm{s}}^{*} } \right) ^{-1}\). A representation of \(\mathbf{C}_{\mathrm{s}} \) showing a complete scale factor is specified by the theorem below. The theorem was formulated and proved within the present research.
Theorem 1
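A positive definite correlation matrix \(\mathbf{C}_{\mathrm{s}} (n\times n),n\ge 2\) can be represented as
\[\mathbf{C}_{\mathrm{s}} =q\cdot \mathbf{R} \qquad (12)\]
where
\[q=\left( {\hbox {det}\mathbf{C}_{\mathrm{s}} } \right) ^{1/n},\qquad \mathbf{R}=\left( {\hbox {det}\mathbf{C}_{\mathrm{s}} } \right) ^{(n-1)/n}\left( {\mathbf{C}_{\mathrm{s}}^{*} } \right) ^{-1},\qquad \hbox {det}\mathbf{R}=1\]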
The proof is immediate.
Assuming \(\mathbf{C}_{\mathrm{s}} =q\cdot \mathbf{R}\), where \(\hbox {det}\mathbf{R}=1\), we get \(\hbox {det}{} \mathbf{C}_{\mathrm{s}} =q^{n}\cdot \hbox {det}\mathbf{R}=q^{n}\) and hence, \(q=\left( {\hbox {det}{} \mathbf{C}_{\mathrm{s}} } \right) ^{1/n}\). On the basis of (11) we write \(\hbox {det}{} \mathbf{C}_{\mathrm{s}} \cdot \left( {\mathbf{C}_{\mathrm{s}}^{*} } \right) ^{-1}=\left( {\hbox {det}{} \mathbf{C}_{\mathrm{s}} } \right) ^{\hbox {1/}n}\cdot \mathbf{R}\), and finally we get \(\mathbf{R}=\left( {\hbox {det}{} \mathbf{C}_{\mathrm{s}} } \right) ^{(n-\hbox {1)/}n}\left( {\mathbf{C}_{\mathrm{s}}^{*} } \right) ^{-1}\), which completes the proof.
We can readily prove that the representation (12) applies to non-singular matrices \((n\times n),\hbox { }n\ge 2\) with a determinant greater than zero, and also to non-singular matrices \((n\times n),\hbox { }n\ge 3\) where n is an odd number. The asymmetry of a matrix does not affect this type of representation.
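Theorem 1 is easy to check numerically. The following sketch uses a hypothetical \(3\times 3\) correlation matrix and illustrative helper names; it builds q and R as in the proof above, forms the inverse of R from the adjugate (which, on our reading, is the matrix referred to in this paper as the internal weight matrix), and lets one verify that \(\hbox {det}\mathbf{R}=1\). Cofactor expansion is used only because the matrix is small.

```python
def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor expansion (adequate for small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def adjugate(m):
    """Transpose of the cofactor matrix (requires n >= 2)."""
    n = len(m)
    return [[(-1) ** (i + j) * det(minor(m, j, i)) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def scale_and_weight(c):
    """Return q, R = C_s / q and the inverse of R built from the adjugate of C_s."""
    n = len(c)
    d = det(c)
    q = d ** (1.0 / n)
    r = [[v / q for v in row] for row in c]
    factor = d ** (-(n - 1.0) / n)
    r_inv = [[factor * v for v in row] for row in adjugate(c)]
    return q, r, r_inv


c_s = [[1.0, 0.5, 0.2],
       [0.5, 1.0, 0.3],
       [0.2, 0.3, 1.0]]
q, r, r_inv = scale_and_weight(c_s)
print(round(q, 4), round(det(r), 6))
```

Multiplying `r` by `r_inv` with `matmul` should return the identity matrix up to rounding, which confirms that the adjugate-based expression is indeed the inverse of R.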
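In analogy to the well-known relationship between the covariance matrix and the weight matrix, i.e., \(\mathbf{C}=\sigma _{\mathrm{o}}^2 \cdot \mathbf{P}^{-1}\) with \(\sigma _{\mathrm{o}}^2 \) being a variance factor, we express (11) in a notation that introduces the proposed global correlation measure \({{\uprho } }_{\mathrm{G}} \) and considers R as a rescaled correlation matrix
\[\mathbf{C}_{\mathrm{s}} =q({\uprho }_{\mathrm{G}} )\cdot \mathbf{P}_{\mathrm{s}}^{-1} \qquad (13)\]
where
\[q({\uprho }_{\mathrm{G}} )=\left( {1-{\uprho }_{\mathrm{G}}^2 } \right) ^{1/n},\qquad \mathbf{P}_{\mathrm{s}} =\mathbf{R}^{-1}\]
and \(\mathbf{P}_{\mathrm{s}} \) is the internal weight matrix.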
We can readily check that for \({\uprho }_{\mathrm{G}} \rightarrow 0\) or \({\uprho }_{\mathrm{G}} \rightarrow 1\) the complete scale factor \(q({\uprho }_{\mathrm{G}} )\) tends asymptotically to 1 or 0, respectively. The relationship \(q({\uprho }_{\mathrm{G}} )=\left( {1-{\uprho }_{\mathrm{G}}^2 } \right) ^{1/n}\) is shown in Fig. 5a, and as a function of a constant correlation term a in Fig. 5b.
With the increase in global correlation, the scale factor decreases. The greater the values of n, the longer are the intervals where the scale factor does not depart from 1 significantly (see Fig. 5a).
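A quick numerical look at \(q({\uprho }_{\mathrm{G}} )=\left( {1-{\uprho }_{\mathrm{G}}^2 } \right) ^{1/n}\) illustrates both remarks. The function name below is illustrative, and the sample values of \({\uprho }_{\mathrm{G}}\) and n are arbitrary.

```python
def scale_factor(rho_g, n):
    """Complete scale factor q(rho_G) = (1 - rho_G**2) ** (1/n)."""
    if not 0.0 <= rho_g < 1.0:
        raise ValueError("rho_g must satisfy 0 <= rho_g < 1")
    return (1.0 - rho_g ** 2) ** (1.0 / n)


# Each row: n, then q for rho_G = 0.3, 0.6, 0.9.
for n in (2, 10, 50):
    print(n, [round(scale_factor(r, n), 3) for r in (0.3, 0.6, 0.9)])
```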
5 Transforming basic reliability expressions to learn about the character of correlation effect
We show that representation of the correlation matrix as in the formula (13) can be useful in analyses of the structure of network reliability measures with respect to observation correlations.
Below, there is a list of basic expressions and quantities used in theory of reliability for correlated observations. They all refer to the modified (i.e., standardized) form of the Gauss–Markov model (GMM), that exposes the correlation matrix \(\mathbf{C}_{\mathrm{s}} \) (Prószyński 2010), i.e.,
where: \(\mathbf{x}(u\times 1)\), \(\mathbf{y}_{\mathrm{s}} (n\times 1)\), \(\mathbf{A}_{\mathrm{s}} (n\times u)\), rank \(\mathbf{A}_{\mathrm{s}} \le u\), \(\mathbf{C}_{\mathrm{s}} (n\times n)\) (positive definite).
Let us recall here that the standardized model such as (14) is obtained by multiplying both sides of the original GMM by \({{\varvec{\upsigma }} }^{-1}\), where \({{\varvec{\upsigma }}}=(\hbox {diag }\mathbf{C})^{1/2}\), and transforming the covariance matrix C (positive definite) accordingly.
Following the essential classification of network reliability (Baarda 1968), the expressions and quantities are listed in two groups, the first pertaining to the internal reliability and the second pertaining to the external reliability
a) internal reliability
\(h_{ii} ,w_{ii}\,\)—the response-based measures of internal reliability for the ith observation
\(r_i\,\)—reliability number for the ith observation; \(\hbox {MDB}_{\mathrm{s},i}\)—minimal detectable bias for the ith observation (standardized), where \(\lambda \) is non-centrality parameter in a global model test (Wang and Chen 1994; Teunissen 1990, 2006)
\({\uprho }_{ij}\)—coefficient of correlation between the outlier w-test statistics for the ith and the jth observation, as a quantity related to internal reliability
b) external reliability
\({\hat{\mathbf{x}}}\)—LS estimator for x as in (14); \(\hbox {MDB}_{\mathrm{s,}i} \) as in (16); \(\Delta {\hat{\mathbf{x}}}_{(i)} \)—the vector of increments in \({\hat{\mathbf{x}}}\) due to \(\hbox {MDB}_{\mathrm{s,}i} \), \({\updelta } _i \)—a measure of external reliability (Wang and Chen 1994), given there with the use of regular inverse of \(\mathbf{C}_{{\hat{\mathbf{x}}}}\).
Additionally, we include in this analysis a correlation-dependent quantity based on a matrix H, but not considered as a measure of network reliability
\(\mathbf{C}_{\mathbf{v} _{\mathrm{s}}}\)—covariance matrix of the vector of standardized LS residuals; \({\upsigma }_{\mathbf{v}_{\mathrm{s,}i} }^2\)—variance of the ith standardized LS residual.
First, we consider the matrices H and K appearing in all the expressions in the groups (a) and (b) as above. Applying (13), we get
We can see that both the matrices H and K do not depend on \(q{(\uprho }_{\mathrm{G}} )\) and, hence, do not depend on a global correlation measure \({\uprho }_{\mathrm{G}} \). They are functions of the structural matrix \(\mathbf{A}_{\mathrm{s}}\) (standardized) and the internal weight matrix \(\mathbf{P}_{\mathrm{s}} \).
Now, based on the above properties of H and K, we analyze the other quantities listed under a) and b).
Ad a) The matrix W, and hence all its diagonal elements \(w_{ii} (i=1,\ldots ,n)\) depend on the structural matrix \(\mathbf{A}_{\mathrm{s}}\) and the internal weight matrix \(\mathbf{P}_{\mathrm{s}} \). Therefore, the measures \(h_{ii} \) and \(w_{ii} \) for a given network can only undergo changes in mutual diversification for different configurations of \(\mathbf{C}_{\mathrm{s}} \) and, obviously, for different accuracies of observations.
where \(\mathbf{H}^{\mathrm{T}}{} \mathbf{P}_s \mathbf{H}\) and \(\left\{ {\mathbf{H}^{\mathrm{T}}{} \mathbf{P}_{\mathrm{s}} \mathbf{H}} \right\} _{ii} \) are functions of \(\mathbf{A}_{\mathrm{s}} \) and \(\mathbf{P}_{\mathrm{s}}\).
We can easily find that the correlation coefficients \({\uprho }_{ij}\) as in (17) do not depend on \({\uprho }_{\mathrm{G}} \), but on \(\mathbf{A}_{\mathrm{s}} \) and \(\mathbf{P}_{\mathrm{s}} \) only, i.e., \({\uprho }_{ij} =\hbox {f}_{ij} (\mathbf{A}_{\mathrm{s}} ,\mathbf{P}_{\mathrm{s}} )\).
Ad b) The vector \({\hat{\mathbf{x}}}\) does not depend on \({\uprho }_{\mathrm{G}} \), but on \(\mathbf{A}_{\mathrm{s}} \) and \(\mathbf{P}_{\mathrm{s}} \) only, i.e., \({\hat{\mathbf{x}}}=\hbox {f}_{\mathrm{x}} (\mathbf{A}_{\mathrm{s}} ,\mathbf{P}_{\mathrm{s}} )\). This result is quite understandable since it is well known that the LS estimator can be obtained equivalently either by using the covariance matrix or the weight matrix.
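The cancellation of the scale factor can be written out explicitly. The short derivation below assumes, for simplicity, that \(\mathbf{A}_{\mathrm{s}}\) has full column rank, so that a regular inverse exists, and uses the representation \(\mathbf{C}_{\mathrm{s}} =q\cdot \mathbf{P}_{\mathrm{s}}^{-1}\) with \(q=q({\uprho }_{\mathrm{G}} )\), i.e., \(\mathbf{C}_{\mathrm{s}}^{-1} =q^{-1}\mathbf{P}_{\mathrm{s}} \):

\[{\hat{\mathbf{x}}}=\left( {\mathbf{A}_{\mathrm{s}}^{\mathrm{T}} \mathbf{C}_{\mathrm{s}}^{-1} \mathbf{A}_{\mathrm{s}} } \right) ^{-1}\mathbf{A}_{\mathrm{s}}^{\mathrm{T}} \mathbf{C}_{\mathrm{s}}^{-1} \mathbf{y}_{\mathrm{s}} =\left( {q^{-1}\mathbf{A}_{\mathrm{s}}^{\mathrm{T}} \mathbf{P}_{\mathrm{s}} \mathbf{A}_{\mathrm{s}} } \right) ^{-1}q^{-1}\mathbf{A}_{\mathrm{s}}^{\mathrm{T}} \mathbf{P}_{\mathrm{s}} \mathbf{y}_{\mathrm{s}} =\left( {\mathbf{A}_{\mathrm{s}}^{\mathrm{T}} \mathbf{P}_{\mathrm{s}} \mathbf{A}_{\mathrm{s}} } \right) ^{-1}\mathbf{A}_{\mathrm{s}}^{\mathrm{T}} \mathbf{P}_{\mathrm{s}} \mathbf{y}_{\mathrm{s}} \]

The factors \(q\) and \(q^{-1}\) cancel, so \({\hat{\mathbf{x}}}\) is unchanged. For a rank-deficient \(\mathbf{A}_{\mathrm{s}}\) the same cancellation holds with the Moore-Penrose inverse, since \((c\mathbf{N})^{+}=c^{-1}\mathbf{N}^{+}\) for any scalar \(c\ne 0\).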
As derived in Appendix A, the measure of external reliability \(\delta _i \) can be represented in the following form
This means that it does not depend on the global correlation measure \({\uprho }_{\mathrm{G}} \), but on \(\mathbf{A}_{\mathrm{s}} \) and \(\mathbf{P}_{\mathrm{s}} \) only. We notice that there is a common factor \(\lambda \), which depends on type I error \(\upalpha \), type II error \(\upbeta \), and redundancy f of a network.
The additional quantity \(\mathbf{HC}_{\mathrm{s}} \mathbf{H}^{\mathrm{T}}\) will have the representation as below
where \(\mathbf{HP}_{\mathrm{s}}^{-1} \mathbf{H}^{\mathrm{T}}\) and \(\left\{ {\mathbf{HP}_{\mathrm{s}}^{-1} \mathbf{H}^{\mathrm{T}}} \right\} _{ii} \) are functions of \(\mathbf{A}_{\mathrm{s}} \) and \(\mathbf{P}_{\mathrm{s}}\).
Before summing up the results of the above analysis, we denote by \(\hbox {z}_i \left( {\mathbf{A}_{\mathrm{s}} ,\mathbf{C}_{\mathrm{s}} } \right) _{\mathrm{dep}} \) the quantity related to the ith observation \((i=1,{\ldots }, n)\) dependent on the level of global correlation \({\uprho }_{\mathrm{G}} \) and by \(\hbox {z}_i \left( {\mathbf{A}_{\mathrm{s}} ,\mathbf{C}_{\mathrm{s}} } \right) _{\mathrm{ind}} \) the quantity related to the ith observation (\(i~=~1,{\ldots }, n)\) independent of that level.
The results show that the representation of the correlation matrix as in (13) makes it possible to express the reliability measure \(\hbox {z}_{i}\) for the ith observation in the following form:
where \({\upmu }({\uprho }_{\mathrm{G}} )\) can be either \(\left( {q({\uprho }_{\mathrm{G}} )} \right) ^{-1}\) or \(\sqrt{{\lambda }(f)}\cdot \sqrt{q({\uprho }_{\mathrm{G}} )}\).
For additional quantity \({\upsigma }_{\mathbf{v}_{\mathrm{s,}i} }^2 \), being \(\hbox {z}_i \left( {\mathbf{A}_{\mathrm{s}} ,\mathbf{C}_{\mathrm{s}} } \right) _{\mathrm{dep}} \), we get in (25) \({\upmu }({\uprho }_{\mathrm{G}} )=q({\uprho }_{\mathrm{G}} )\).
In the factorization of reliability measure as in (25), \({\upmu }({\uprho }_{\mathrm{G}} )\) is a scale factor common for all the n observations in a network, whereas \(\hbox {f}_i \left( {\mathbf{A}_{\mathrm{s}} ,\hbox { }{} \mathbf{P}_{\mathrm{s}} } \right) \) is a factor related only to the ith observation. The factors \(\hbox {f}_i \left( {\mathbf{A}_{\mathrm{s}} ,\hbox { }{} \mathbf{P}_{\mathrm{s}} } \right) \) for \(i~=~1,~2,\ldots ,n\) form together a set of mutually diversified values, as is also the case in relationship (26). Figure 6 shows graphs of the above-mentioned three types of the function \({\upmu }({\uprho }_{\mathrm{G}} )\), i.e., \({\upmu }_1 ({\uprho }_{\mathrm{G}} )=\left( {q({\uprho }_{\mathrm{G}} )} \right) ^{-1}\), \({\upmu }_2 ({\uprho }_{\mathrm{G}} )=\sqrt{{\lambda }(f)}\cdot \sqrt{q({\uprho }_{\mathrm{G}} )}\), \({\upmu }_3 ({\uprho }_{\mathrm{G}} )=q({\uprho }_{\mathrm{G}} )\) for \(n~=~2, 6\) and 10. Since beside the statistical parameters, \(\lambda \) is a function of redundancy f in a network, to get a common basis for different n, we assume for constructing the graphs that \(f~=~0.5n\), which corresponds to a level of internal reliability \(h~=~0.5\) for \(\mathbf{C}_{\mathrm{s}} =\mathbf{I}\).
Figure 6 shows that the scale factor \({\upmu }_1 ({\uprho }_{\mathrm{G}} )\) is increasing with the increase in \({\uprho }_{\mathrm{G}} \). The compound scale factor \({\upmu }_2 ({\uprho }_{\mathrm{G}} )\) displaying the resulting decrease with the increase in \({\uprho }_{\mathrm{G}} \) contains two mutually opposing effects, i.e., the increasing one (\(\sqrt{{\lambda } (f)})\) due to network redundancy and the decreasing one (\(\sqrt{q({\uprho }_{\mathrm{G}} )})\) due to the global correlation. The scale factor \({\upmu }_3 ({\uprho }_{\mathrm{G}} )\) is decreasing with the increase of \({\uprho }_{\mathrm{G}} \).
Additionally, by transforming (25) to a form
we get an idea what part of the total value of the reliability measure for the ith observation represents a diversifying effect of observation correlations.
By simple modification of (27), we may also decompose the reliability measure for the ith observation into a part independent of \({\uprho }_{\mathrm{G}} \) and a part dependent on \({\uprho }_{\mathrm{G}} \), as shown below
For completeness, we quote (26), which corresponds to \({\upmu }({\uprho }_{\mathrm{G}} )~=~1\) in (28)
6 Numerical examples illustrating the use of derived formulas
In numerical examples, we maintain the approach applied in Prószyński and Kwaśniak (2016) of expressing each correlation-dependent quantity as a function of the form \(\hbox {z}(\mathbf{A},{{\varvec{\upsigma }}}=\mathbf{1}, \mathbf{C}_{\mathrm{s}} )\) where the column vector \({\varvec{\upsigma }}\) of ones represents unitary accuracies of observations. For simplicity, the form is denoted as \(\hbox {z}(\mathbf{A}_{\mathrm{s}} ,\mathbf{C}_{\mathrm{s}} )\) where \(\mathbf{A}_{\mathrm{s}} =\mathbf{A}\).
A test network in two options, as shown in Fig. 7, is taken from the above-mentioned publication. It is a horizontal network treated as a free network. The brief characteristics of both the options, placed under each sketch, contain the following features:
-
the range of internal reliability indices for \(\mathbf{C}_{\mathrm{s}} =\mathbf{I}\) (in square brackets);
-
number of observations n, redundancy f.
The arrows for GPS vectors in Fig. 7 indicate the orientation of coordinate differences assumed in the corresponding GMM models. Option (b) is option (a) strengthened with 13 angular observations, which is reflected in a considerable increase in network internal reliability. The coordinates of the network points are given in Table 2.
For generating the correlation matrices, a slightly modified version of the algorithm of Davies and Higham (2000) was used; for greater n it is much faster than the “accept-reject” algorithm described in Sect. 2.
The examples refer only to quantities that depend on a global correlation \({\uprho }_{\mathrm{G}} \), i.e., \(\hbox {MDB}_{\mathrm{s,}i} \), \(r_i \) and \({\upsigma }_{\mathbf{v}_{\mathrm{s,}i} }^2 \). First, we illustrate the effect of reducing \(\hbox {z}_i \left( {\mathbf{A}_{\mathrm{s}} ,\hbox { }{} \mathbf{C}_{\mathrm{s}} } \right) \) to \(\hbox {z}_i \left( {\mathbf{A}_{\mathrm{s}} ,\hbox { }{} \mathbf{P}_{\mathrm{s}} } \right) \), where \(\hbox {z}_i =\hbox {MDB}_{\mathrm{s,}i} \), by using a scale factor \({\upmu }({\uprho }_{\mathrm{G}} )\) as in the formula (27).
In Figures 8 and 9 we can see that due to the reduction, for quantities being the reliability measures (i.e., \(\hbox {MDB}_{\mathrm{s,}i} \) and \(r_i\)) the graphs for different values of \({\uprho }_{\mathrm{G}} \) become considerably compacted, especially for the latter measure. The compacted graphs enable one to learn about the mutual diversification of the \(\hbox {z}_i \) values for different observations, i.e., for \(i~=~1,\ldots ,n\), independent of the global correlation \({\uprho }_{\mathrm{G}} \). In the case of an additional quantity \({\upsigma }_{\mathbf{v}_{\mathrm{s,}i} }^2 \) (Fig. 10), the values of \({\upsigma }_{\mathbf{v}_{\mathrm{s,}i} }^2 (\mathbf{A}_{\mathrm{s}} ,\mathbf{P}_{\mathrm{s}} )\, (i=1,\ldots , n)\) for different values of \({\uprho }_{\mathrm{G}} \) are more extended along the vertical axis and much less mutually diversified for individual observations than in \({\upsigma }_{\mathbf{v}_{\mathrm{s,}i} }^2 (\mathbf{A}_{\mathrm{s}} ,\mathbf{C}_{\mathrm{s}} )\).
For a more detailed analysis, we could determine the degree of mutual diversifications of the \(\hbox {z}_i \left( {\mathbf{A}_{\mathrm{s}} ,\hbox { }{} \mathbf{P}_{\mathrm{s}} } \right) \) (\(i=1,\ldots ,n\)) values and compare them with the corresponding ones for \(\hbox {z}_i \left( {\mathbf{A}_{\mathrm{s}} ,\hbox { }{} \mathbf{C}_{\mathrm{s}} } \right) \) (\(i~=~1,\ldots ,n\)), for some chosen values of \({\uprho }_{\mathrm{G}} \). The degree of mutual diversifications might then be defined as a mean square deviation from the average value.
Now, the additive representation of \(\hbox {z}_i \left( {\mathbf{A}_{\mathrm{s}} ,\hbox { }{} \mathbf{C}_{\mathrm{s}} } \right) \) as in the formula (28) is illustrated in the examples for a network in Fig. 7a).
For \(\hbox {MDB}_{\mathrm{s,}i} \) and \(r_i \), due to \({\upmu }({\uprho }_{\mathrm{G}} )>1\) (see diagram in Fig. 6), the \({\uprho }_{\mathrm{G}} \)-dependent components assume positive values as shown in Figs. 11 and 12. In the case of \({\upsigma }_{\mathbf{v}_{\mathrm{s,}i} }^2 \), where \({\upmu }({\uprho }_{\mathrm{G}} )<1\) (see diagram in Fig. 6), the \({\uprho }_{\mathrm{G}} \)-dependent components are of negative values (see Fig. 13).
7 Proposed global correlation levels in networks
For a pair of random variables, there are established correlation levels. The problem arises in the case of global correlation discussed in this paper. We may express the correlation levels either directly in terms of the global correlation measure \({\uprho }_{\mathrm{G}} \) or in terms of the constant correlation coefficient a. Since the latter approach refers to a specific type of correlation matrices only, preference should be given to the former approach covering a full set of matrix configurations.
The following global correlation levels are proposed:
1. \(0<{\uprho }_{\mathrm{G}} \le 0.3\): weak correlation
2. \(0.3<{\uprho }_{\mathrm{G}} \le 0.6\): moderate correlation
3. \(0.6<{\uprho }_{\mathrm{G}} <1\): significant/strong correlation
Figure 14 shows the relationship between the above levels and the corresponding intervals of the constant correlation parameter \(a\). As one might expect on the basis of the curves in Fig. 2, with increasing n the intervals of \(a\) corresponding to weak and moderate global correlation shift to lower values and become narrower.
This shows that the choice of the correlation levels was largely dictated by the way \({\uprho }_{\mathrm{G}} \) varies with \(a\) for different values of n. The case \(n=2\) is satisfactorily consistent with the commonly used levels of correlation between two random variables.
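The three proposed levels can be encoded as a small lookup, shown here as a minimal sketch. The function name `global_correlation_level` is hypothetical; the thresholds and labels are those listed above, and values outside the open interval (0, 1) are rejected because no level is proposed for them.

```python
def global_correlation_level(rho_g):
    """Map a global correlation index to one of the proposed levels."""
    if not 0.0 < rho_g < 1.0:
        raise ValueError("levels are proposed only for 0 < rho_G < 1")
    if rho_g <= 0.3:
        return "weak"
    if rho_g <= 0.6:
        return "moderate"
    return "significant/strong"


for value in (0.1, 0.3, 0.45, 0.8):
    print(value, global_correlation_level(value))
```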
The proposed global correlation levels provide a concise description of any given correlation matrix and make it possible to compare correlation matrices of different sizes. They may be helpful in the design of systems with correlated observations. At the design stage, we usually do not possess a specified correlation matrix, but may have some general knowledge of its elements, for instance the expected value of \(\left| {\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{ij} } \right| _{\mathrm{max}} \) or \(\left| {\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{ij} } \right| _{\mathrm{qm}} \) (see (5)). With this knowledge alone we can find only an approximate value of the global correlation index and the corresponding correlation level. We can do so by making the simplifying assumption that \(\left| {\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{ij} } \right| _{\mathrm{max}} =a_1 \) or \(\left| {\left\{ {\mathbf{C}_{\mathrm{s}} } \right\} _{ij} } \right| _{\mathrm{qm}} =a_2 \), and using formula (3) for \(a_{1} \) or \(a_{2} \).
We may use the matrices \(\mathbf{C}_{\mathrm{s}} (a_{1} )\) or \(\mathbf{C}_{\mathrm{s}} (a_{2})\) as substitute correlation matrices for determining some approximate reliability characteristics of the designed network.
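A substitute matrix of this kind is easy to build numerically. The sketch below is hedged in one important respect: the definition of the global correlation index is not reproduced in this section, so the code assumes a determinant-based form, \({\uprho }_{\mathrm{G}} =1-(\det \mathbf{C}_{\mathrm{s}} )^{1/n}\). If the definition given earlier in the paper differs, `global_correlation_index` must be replaced accordingly. Both function names and the sample value \(a_1 = 0.4\) are hypothetical.

```python
import numpy as np


def constant_correlation_matrix(n, a):
    """C_s(a): ones on the diagonal, the constant term a elsewhere."""
    # Positive definiteness of this matrix requires -1/(n-1) < a < 1.
    if n < 2 or not -1.0 / (n - 1) < a < 1.0:
        raise ValueError("need n >= 2 and -1/(n-1) < a < 1")
    return (1.0 - a) * np.eye(n) + a * np.ones((n, n))


def global_correlation_index(c):
    """ASSUMPTION: index taken as 1 - det(C)^(1/n); not confirmed here."""
    c = np.asarray(c, dtype=float)
    sign, logdet = np.linalg.slogdet(c)
    if sign <= 0:
        raise ValueError("matrix is not positive definite")
    return 1.0 - float(np.exp(logdet / c.shape[0]))


# Hypothetical design knowledge: expected maximum |correlation| a_1 = 0.4.
c_sub = constant_correlation_matrix(n=6, a=0.4)
print(global_correlation_index(c_sub))
```

The resulting matrix can then stand in for the unknown correlation matrix when approximate reliability characteristics of the designed network are computed.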
8 Concluding remarks
The representation of the correlation matrix, based on the proposed measure of network global correlation, made it possible to investigate the structure of correlation-dependent quantities. Each analyzed quantity can be classified either as dependent on the global correlation measure or as independent of it. Quantities of the first type can be represented by a scale factor common to all the observations in a network and a factor that, besides the network structure, is a function of the internal weight matrix and yields the inter-observational diversification. Quantities of the second type are functions of the network structure and the internal weight matrix only. Another way of representing a quantity of the first type is to decompose it into a part independent of the global correlation measure and a part dependent on that measure.
The above ways of representing the correlation matrices, together with the proposed levels of global correlation, can be applied in the analyses of systems with correlated observations.
For investigating the structure of correlation-dependent quantities at greater values of \({\uprho }_{\mathrm{G}} \) and n, we may use the easily created correlation matrices \(\mathbf{C}_{\mathrm{s}} (a)\) with a constant term \(a\) computed for the required values of \({\uprho }_{\mathrm{G}} \) and n. Obviously, the matrices obtained in this way are only specific substitutes for the correlation matrices sought.
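Computing the constant term \(a\) for required values of \({\uprho }_{\mathrm{G}} \) and n amounts to inverting the relation between the two. The sketch below does this by bisection. It rests on an assumption that should be checked against formula (3), which is not reproduced in this section: the index is taken as \(1-(\det \mathbf{C}_{\mathrm{s}} )^{1/n}\), so that for the constant-term matrix \({\uprho }_{\mathrm{G}} =1-[(1-a)^{n-1}(1+(n-1)a)]^{1/n}\). Under that assumption the index increases monotonically with \(a\) on [0, 1), which is what makes bisection applicable. The function names are hypothetical.

```python
def rho_g_constant(n, a):
    """ASSUMED closed form of the index for C_s(a); check against (3)."""
    det = (1.0 - a) ** (n - 1) * (1.0 + (n - 1) * a)
    return 1.0 - det ** (1.0 / n)


def constant_term_for(rho_g, n, tol=1e-12):
    """Find a in [0, 1) such that rho_g_constant(n, a) equals rho_g."""
    if n < 2 or not 0.0 <= rho_g < 1.0:
        raise ValueError("need n >= 2 and 0 <= rho_G < 1")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The index grows with a, so move the bracket toward the target.
        if rho_g_constant(n, mid) < rho_g:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for n in (2, 10, 50):
    print(n, constant_term_for(0.6, n))
```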
In this paper, the proposed analytical tools were applied to basic reliability measures for the single-outlier case. Owing to the generality of their theoretical basis, their potential field of application can be much wider. Analyzing the findings of some publications that contributed to the development of the theory of network reliability and outlier detection methods, we may state that the tools can also be applied in investigating the effect of observation correlations on quantities such as, for instance:
References
Ananga N, Coleman R, Rizos C (1994) Variance-covariance estimation of GPS networks. Bull Geod 68:77–87
Anderson-Sprecher R (1994) Model comparisons and R\(^{2}\). Am Stat 48(2):113–117
Baarda W (1968) A testing procedure for use in geodetic networks. Publications on Geodesy, New Series, vol. 2, No 5. Netherlands Geodetic Commission, Delft
Budden M, Hadavas P, Hoffman L (2008) On the generation of correlation matrices. Appl Math E-Notes 8:279–282. ISSN 1607-2510
Davies PI, Higham NJ (2000) Numerically stable generation of correlation matrices and their factors. BIT Numer Math 40:640. doi:10.1023/A:1022384216930
Dufresne D (2005) Two notes on financial mathematics. Actuarial Research Clearing House, No. 2
Hanea AM, Nane GF (2016) The asymptotic distribution of the determinant of a random correlation matrix. arXiv:1309.7268v4 [math.PR]
Knight NL, Wang J, Rizos C (2010) Generalised measures of reliability for multiple outliers. J Geod 84(10):625–635
Leandro R, Santos M, Cove K (2005) An empirical approach for the estimation of GPS covariance of observations, ION 61st annual meeting, the MITRE Corporation & Draper Laboratory, Cambridge, MA
Numpacharoen K, Atsawarungruangkit A (2012) Generating correlation matrices based on the boundaries of their coefficients. PLoS ONE 7(11):e48902. doi:10.1371/journal.pone.0048902
Prószyński W (2010) Another approach to reliability measures for systems with correlated observations. J Geod 84:547–556
Prószyński W, Kwaśniak M (2016) An attempt to determine the effect of increase of observation correlations on detectability and identifiability of a single gross error. Geod Cartogr 65(2):313–333. doi:10.1515/geocart-2016-0018
Teunissen PJG (1990) Quality control in integrated navigation systems. IEEE Aerosp Electron Syst Mag 5(7):35–41
Teunissen PJG (2006) Testing theory, an introduction, 2nd edn. Delft University Press, Delft
Tiit EM, Helemäe HL (1997) Boundary distributions with fixed marginals. In: Beneš V, Štěpán J (eds) Distributions with given marginals and moment problems. Springer, Dordrecht. doi:10.1007/978-94-011-5532-8_12
Wang J, Chen Y (1994) On the reliability measure of observations. Acta Geodaet. et Cartograph. Sinica, English Edition, pp 42–51
Wang J, Knight NL (2012) New outlier separability test and its application in GNSS positioning. J Glob Position Syst 11(1):46–57
Wang J, Almagbile Y, Wu T, Tsujii T (2012) Correlation analysis for fault detection statistics in integrated GNSS/INS systems. J Glob Position Syst 11(2):89–99
Appendices
Appendix A
Proof for the formula (24).
Taking into account the formulas (18), (21), (22) and adding that
we can write
We can easily check that when \(\mathbf{H}\) is as in (20), \(\mathbf{P}_{\mathrm{s}} \mathbf{H}=\mathbf{H}^{\mathrm{T}}{} \mathbf{P}_{\mathrm{s}} \mathbf{H}\), and so
Coming back to (29), we get
which completes the proof.
Using a regular inverse in \(\mathbf{K}\), we would obtain the same formula for \({\updelta }_i \).
Appendix B
1.1 Numerically based confirmation of Property 1 and Property 2 (for \(n\ge 3\))
For generating the correlation matrices, the same algorithm was used as that quoted in Sect. 6. For each diagram, 50,000 simulations were used.
Figure 15a–c shows the results of checking Property 1 as in formula (7).
Figure 16a–c shows the results of checking Property 2 as in formula (9).
All the results confirm the properties. The empty spaces in the “confirmation area” are due to an insufficient number of simulations. This explanation is supported by the empirical observation that increasing the number of simulations yielded several additional points within the “confirmation area.” Checks at greater n would be advisable to obtain stronger empirical support for the Properties.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Prószyński, W., Kwaśniak, M. Analytic tools for investigating the structure of network reliability measures with regard to observation correlations. J Geod 92, 321–332 (2018). https://doi.org/10.1007/s00190-017-1064-4
DOI: https://doi.org/10.1007/s00190-017-1064-4