While it is important to learn about the methods developed by previous generations of scientists, do not let yourselves be silenced by their aura.
— Legendre and Legendre, 2012:xiii.
Abstract
A matrix manipulation new to the quantitative study of develomental stability reveals unexpected morphometric patterns in a classic data set of landmark-based calvarial growth. There are implications for evolutionary studies. Among organismal biology’s fundamental postulates is the assumption that most aspects of any higher animal’s growth trajectories are dynamically stable, resilient against the types of small but functionally pertinent transient perturbations that may have originated in genotype, morphogenesis, or ecophenotypy. We need an operationalization of this axiom for landmark data sets arising from longitudinal data designs. The present paper introduces a multivariate approach toward that goal: a method for identification and interpretation of patterns of dynamical stability in longitudinally collected landmark data. The new method is based in an application of eigenanalysis unfamiliar to most organismal biologists: analysis of a covariance matrix of Boas coordinates (Procrustes coordinates without the size standardization) against their changes over time. These eigenanalyses may yield complex eigenvalues and eigenvectors (terms involving \(i=\sqrt{-1}\)); the paper carefully explains how these are to be scattered, gridded, and interpreted by their real and imaginary canonical vectors. For the Vilmann neurocranial octagons, the classic morphometric data set used as the running example here, there result new empirical findings that offer a pattern analysis of the ways perturbations of growth are attenuated or otherwise modified over the course of developmental time. The main finding, dominance of a generalized version of dynamical stability (negative autoregressions, as announced by the negative real parts of their eigenvalues, often combined with shearing and rotation in a helpful canonical plane), is surprising in its strength and consistency. A closing discussion explores some implications of this novel pattern analysis of growth regulation. It differs in many respects from the usual way covariance matrices are wielded in geometric morphometrics, differences relevant to a variety of study designs for comparisons of development across species.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Introduction
At present the literature of organismal growth studies does not offer an appropriate biomathematical toolkit for studying growth stability more than one variable at a time: the kind of analysis that is necessary if we are to understand the variation of configurations of skeletal landmarks over postnatal development. For example, discussions of canalization in today’s evo-devo literature that interpret it as reduction of variance (e.g., Gonzalez and Barbeito-Andrés, 2021) typically adhere to Conrad Waddington’s original trope of 1942, where the representation of the “constancy of the wild type” is by some hypothetical quantity unambiguously occupying the vertical axis of his famous diagram of developmental valleys even as developmental time marches from the back to the front of the diagram. Even if the original formula is multivariate, such as geometric asymmetry, any quantification it affords is presumed reduced to a single quantity before being investigated further. In a different disciplinary context, until the advent of paleogenomics nearly all of “multivariate palaeobiology” (Reyment, 1991) was based on an intuition about which traits are dynamically stable enough over normal growth to serve as numerical characters in a systematic analysis of forms of indeterminate age. It is time to demand more insight from our methods than that. We need to be able to quantify the patterns of developmental stability, both to render them accessible to individual experimental studies of their modification by genes or environment and to sequester them appropriately in studies of either within-group or between-group variation.
It is not that biometricians lack methods for the study of growth. Analysis of human growth, for instance, has a literature spanning centuries (Boyd, 1980), and the classic pattern of “growth spurts” has been known to pediatricians for most of that time (for one exploration familiar to statisticians, see Tuddenham and Snyder, 1954, or consult Tanner and Davies, 1985). There is also, following the pioneering, much-cited work of Potthoff and Roy (1964), a literature of technical biometrics centered on data series of otherwise unrestricted structure that follow specimens from more than one group over developmental time. But nothing in this tradition, to my knowledge, takes advantage of the possibility that not only the physical dimension of time but also the conceptual domain of the variables might be structured in a useful way. Such a structuring is unfamiliar from most of the human growth literature. Consider an example from my own first postdoctoral research group, the tabulations in Riolo et al. (1974). In this otherwise well-designed longitudinal study, the long list of 188 landmark-based dentofacial measurements tracked over time arises from no theory, nor from any geometric morphometric method (for that toolkit did not exist yet), but instead from the accrued literature of orthodontics over the previous years of active invention of “analyses,” each the favorite set of distances, ratios, or angular measures of the eponymous professor who invented it.
The present paper, one in a series attempting to rebuild geometric morphometrics into a tool useful for evolutionary biology, explores the possibility of an explicitly dynamical analysis for studies based in coordinates of a fixed configuration of landmark points over a closely spaced series of ages of the same developing organisms. The topic at hand is not the average growth curve but the quantitative structure of the sample’s variation around it. For this specific purpose there turns out to be a tie between the matrix algebra of covariance analysis and the geometry of landmark data analysis that is much closer than was previously realized and that aligns closely with some of the core formalisms of dynamical systems analysis. This paper demonstrates the method using one familiar data set, the Vilmann neurocranial series of octagons of landmarks in eighteen animals radiographed at eight ages. While I myself have no example of evolutionary comparisons among such series, it is nevertheless possible to draw some conclusions about how such studies might use this new biometrical method in an appropriate evolutionary context. After all, it is the life cycle that evolves, not merely the adult form or its fossils.
The outline of the rest of this paper is as follows. Section 2 introduces the Vilmann data set via a particularly simplistic and therefore unsatisfactory analysis of one of its growth changes. It is concluded that we need a method of summarizing multiple geometrically structured features that corresponds to the fundamental task of a dynamical analysis of multivariate stability. The missing method would center on (but not be limited to) identification of the perturbations of form that are most and least rapidly attenuated over time. Section 3 turns to a technique on loan from nineteenth-century applied mathematics that offers considerable help in this task, in part by radically redefining what is meant by the notion of “attenuation.” I introduce the method (eigenanalysis of a nonsymmetric square matrix) and sketch the way its praxis might apply to covariances of change against form for series of configurations of Boas coordinates (Procrustes coordinates without the size-standardization step) observed over time. Section 4 enumerates the findings for these Vilmann landmark configurations, findings that exploit the diverse mathematical possibilities of this tool, and suggests biological interpretations for several of the strongest pattern findings. A final Sect. 5 returns to the conceptual level of discussion, suggesting relationships of this newly borrowed matrix method with topics such as knockout study design, paleostudies of selection gradients, and other matters of current evo-devo interest. Following this discussion is an algebraic Appendix that engages with the pedagogical challenge here, the need to introduce our community to the general mathematical strategy of which principal component analysis is, alas, too specialized a special case.
The Investigation of Dynamic Stability as a Biometric Problem
It is convenient at this point to introduce the data set serving as the example for every method described in this paper: the Vilmann data set of eight neurocranial landmarks observed from radiographs of “close-bred” (Moss et al., 1984) male laboratory rats at eight different ages. The data set was published in full in Bookstein (1991) and has been analyzed and re-analyzed in a wide range of venues, of which two of the more recent are Bookstein (2018), my morphometrics textbook, and Bookstein (2021), the paper that introduced and formally justified the Boas coordinates on which the analyses to follow rely. Figure 1 shows how the landmarks lie in a stereotyped midsagittal section. The purpose for which this data set was originally collected was not the investigation of normal rodent skull growth per se but its expression in terms of the actual processes of bony deposition and resorption (Moss and Vilmann, 1978; Moss et al., 1980; compare Petrovic, 1972). Back at the time of original publication, Moss et al. (1983) explicitly claimed [emphasis in original] that each rat “constantly regulated its growth about the group mean,” a sentiment equivalent to the claim of dynamical stability that the present paper is using. But they chose to study only the averages of the growth processes they observed rather than modeling this variability in any way. (Perhaps Waddington’s concept of canalization had not yet come to the attention of this research group.)
Our interest is in the behavior over time of various measurements of this configuration. One particularly simple such measurement is its “diameter”—length of the interlandmark segment of largest average length, SES to Opi. Figure 2 displays this single distance for each experimental animal at the first two ages of observation, 7 and 14 days. At upper left we see the individual segments of the eighteen separate growth curves. This display wastes too much of its white space to be instructive, but a shearing of it (upper right) that places both averages at the same height is quite readable. One sees that most animals whose SES–Opi segments at age 7 are greater than the average see that excess distance decrease over the next week, whereas most animals starting out with a distance less than average experience an increase greater than average over the same period, and those near the middle at age 7 shift their rank in both directions. The correlation between size at age 7 days and growth from age 7 to age 14 days, \(r=-0.63,\) is thus quite negative, as is clearer in a different display, the scatterplot at lower left in this figure. This display style, however, does not convey the longitudinal character of the data the way the panel at upper right does.
A note on “regression to the mean.” While a notion of “regression to the mean” in this context of repeated measurement might sound like a plausible interpretation of small negative correlations between individual distances and their changes over the eighteen animals, it cannot account for the large negative correlations here. For a correlation even as large as \(-0.5\) to appear as the outcome of a process of repeated measurement error would require the amplitude of error in its direction to equal the variance of the true growth signal.Footnote 1 To imagine that a biologist as careful and clever as Melvin Moss would tolerate that magnitude of measurement error in a roentgenographic study would be absurd. There are many other reasons as well to reject any such simplistic hypothesis about these autoregressions. If indeed landmarks are more difficult to locate in the fuzzier images of younger animals, that is a reason to optimize their serial locations there by image-based templates, not to just concede less precision at those ages. And some of the large negative correlations to follow arise from highly integrated patterns, often involving attenuations between perturbations of landmark locations at some distance; to me this argues convincingly against any interpretation in terms of independent measurement error at individual landmarks. In this paper I will use the words “autoregression” or “attenuation” for these negative sequelae of perturbation, or else the phrase “reversion toward the mean,” in order to intercept the fallacious causal inference that the phrase “regression to the mean” usually conveys.
The task of this paper is to extend analyses of this style to the more complicated data structure involving multiple landmark locations for multiple organisms observed consistently over a longer series of ages. We need a way of summarizing multiple geometrically structured features as they are distributed across a sample of landmark configurations changing over time. The toolkit of geometric morphometrics (GMM) has several to suggest, but modifications are needed. Notice, for instance, that we have more possible interlandmark distances here than there are degrees of freedom in the data (28 interlandmark segments, but only 13 independent coordinates in the Boas system I will review shortly). Different interlandmark segments will revert over growth to their initial mean values with different slopes that vary by both the length and the orientation of those vectors of separation; the corresponding maps may be integrated over the organism or instead arbitrarily highly localized (patterns of both types will be encountered). Also, while the desiderata of an analysis of these autoregressions are analogous in some ways to the principal component procedures our community has been accustomed to exploiting for decades, nevertheless the rhetoric of the problem—dynamical stability—requires we incorporate some concepts from that branch of applied mathematics, which offers no role for principal components analysis per se.
This article’s general response to such a complicated challenge is as follows. As the interest is in growth, not shape, we eschew GMM’s now-conventional Procrustes shape coordinates in favor of the Boas coordinates (Bookstein, 2021) that restore the scale of millimeters that the Procrustes method abandoned. As our interest is in perturbations and their reduction measureable at the level of the individual specimen, we turn our attention to the method of eigenanalysis of covariance structures, which in general produces vectors that have a dual role as the coefficients of linear combinations that serve as scores. As our topic is not the growth curve but the variation around it, the matrix to which we apply the general toolkit of eigenanalysis must be the covariance matrix of Boas coordinates against their changes over time, not their forms at any particular time separately; and as one of our particular concerns is the phenomenon of attenuation, explicit fractional reversal of particular aspects of perturbations, we will be especially interested in findings that can be interpreted as relatively highly salient examples of that phenomenon.
My topic here is dynamical stability per se, the self-correction (or not) of perturbations from the average over the intervals between observations, not the associated topic of canalization, which in essence focuses instead on the genetic or epigenetic causes of these autoregressive patterns, nor the concept of compensation, which usually refers to an organism’s response to experimental or ecological interventions (such as temporary starvation) in the course of the growth process. The literature of biometrics seems not to offer any mathematically grounded method for integrating findings about growth patterns from the level of description one measurement at a time to the level of summaries over whole measurement systems, not even the particularly highly structured systems arising from GMM. So it is time for GMM to contribute a suggestion of its own, with a worked example like this one.
A New Borrowing from Nineteenth-Century Mathematics for Analyzing the Dynamics of Serial Stability
A method responding to all of these requisites is the eigenanalysis of the seven age-specific form-vs.-change covariance matrices exercised in the next section, using methods of dynamical analysis developed in the nineteenth century for other applications entirely. Those 150-year-old methods are far from obsolescent—their continuing prominence and centrality for the education of engineers actually heightened late in the last century when the formalism of Lyapunov exponents proved the crucial insight into the geometric structure of deterministic chaos. In that context the methods apply to dynamical analyses of systems ranging from galaxies down through solar systems to ecological systems and, within organisms, through physiological cycling (such as the heartbeat) at the level of whole organs down to the level of individual cells. While we are borrowing from a mathematics that overlaps with the mathematics of deterministic chaos for the rest of this paper, we do not need any of the deeper concepts of chaos theory (for which see, e.g., Strogatz, 2014). In particular, while according to the Poincaré–Bendixson Theorem “strange attractors” can emerge only in systems of three or more dimensions, the applications here to growth perturbations will emphasize two-dimensional concepts instead.
I have invested considerable effort in building a pedagogy of this borrowed method, the eigenanalysis of a general square matrix, that might be suitable for assimilation by evolutionary biologists. The Appendix to this article sets down my best current guess about what such a syllabus might require: an explanation of the mathematics beginning with “what everyone knows” about principal components analysis (PCA) but then proceeding to contradict nearly all of those intuitions by releasing most of the constraints that underlie them, one by one.
The one intuition that we must not delete is the intuition that the first principal component of a data set serves as the best single linear summary of its covariation. For this application to dynamical stability we need to state somewhat more crisply what we mean by “best.” This will be a methodology of developmental perturbation centered on responses that are explicit reversals of the perturbation pattern. The method will mandate that the same linear combination we use to describe perturbations away from the average should apply, with the same coefficients, to describe a predictable (covariance-based) attenuation of that perturbation (not a “regression to the mean”). In other words, the crux of the hoped-for biological interpretation will be the possibility of explicitly reifying the idea of a (composite) trait that shows a tendency to return directly or indirectly toward some unperturbed average.
This means we are considering the computational task of predicting deviations from the average change profile across some list of p variables on n cases, a centered data matrix Y, by the centered deviations X of those same cases from their averages for those same p variables. These predictions will be proportional to the net partial predictors (sums of simple regressions) that express their dependence on the separate variables of the measurement scheme—multiple regressions play no role in a system such as the Boas coordinates that submit to three exact linear constraints a-priori. The prediction “machine” here is then just the matrix multiplication of any pattern of deviation by \(A=X^tY/N\), the cross-covariance matrix of the X’s by the Y’s. (The superscript \(^t\) means “transpose,” so that the expression \(X^tY/N\) is the matrix whose \(ij^{\mathrm{th}}\) entry is the average over all the cases of the product of the \(i^{\mathrm{th}}\) of the p X’s by the \(j^{\mathrm{th}}\) of the Y’s; but this is just the definition of their covariance.) We want to understand the prediction pattern by a decomposition into a series of vectors for each of which the action of A (the actual prediction, anticipation of a growth change from an initial configuration) is merely to multiply it by some factor, regardless of what the prediction is doing to any of the other dimensions. In other words, we want the predicted values Aw, where w is any perturbation whatever of the average, to be written as a sum of specimen-specific scores \(\alpha \) (we will be encountering these scores in the scatterplots of all the examples to follow) times a series of particular perturbation patterns. If we notate these individualized directions of prediction as vectors v that have been set (by the PCA convention) to be unit vectors (sum of squares 1.0), then we are asking for a set of vectors \(v_i\) and constants \(\lambda _i\) for which
where each \(v_i\) is a vector of length p and A is the \(p\times p\) covariance matrix \(X^tY/N\) of the values of some list X of variables (it will be their centered Boas coordinates) against the change scores Y of the same list of variables. The values of \(\lambda \) will then serve as multipliers expressing the saliences of the corresponding v’s in accounting for the changes (the Y’s) in terms of the starting forms (the X’s), and the values of \(\alpha \) will be the salience of each pattern for the profile of p scores case by case..
But this is the equation that defines the series of pairs \(\lambda _i\) and \(v_i,\) the eigenvalues and their eigenvectors, that make up an eigenanalysis. If the N-case data sets X and Y are the same, it is what you are already using to extract your principal components from the matrix \(X^tX/N.\) Even for this different sort of matrix A, the cross-covariance \(X^tY/N,\) you can see how useful such a decomposition would be. If, for instance, the absolute values of \(\lambda \) drop quickly over the series, then, case by case, the scores on the first few \(\alpha \)’s are the characteristics of the specimens that are the only important ones for predicting changes from one measurement session to the next.
Still, in biometrics, as in so many other aspects of academic life, “there is no free lunch,” as they say. We have to accept the possibility that the quantity \(\lambda \) might not be a real number, or, interpreted differently, that what is left invariant by the action of the matrix A might be a plane of scores that are rescaled, sheared, and rotated, not a line of scores that are only rescaled. Whether or not you choose to study the Appendix at this time, you should keep in mind the logic by which these eigendecompositions of a form-by-form-change covariance matrix can be interpreted. The trace of each of the covariance matrices \(X^tY/N\) of Boas coordinates against their changes—for the Vilmann data set there are, in all, seven of these matrices—is negative, meaning that the dominant feature of all these perturbation analyses is an undoing of deviations from the mean form. We will further see that all the meaningful eigenvalues of these covariance matrices have negative real part, meaning that in these earliest weeks the net effect of growth on most perturbations of form is to damp them out.
But some of these constants \(\lambda \) may well be complex numbers, and the evolutionary biologist facing such intermittent findings of complex crosscovariance eigenpairs in the course of a morphometric growth study will want to interpret such findings not only in numbers and vectors but also in words. Those interpretations would benefit from a prior acquaintance with models of crosscovariance that predictably produce complex findings, and also from prototypes for interpreting the range of geometric details that those models can produce in the context of dynamic growth analysis that is our concern here. The first of these topics is explored in the next section, while the second is the subject of the section “Interpreting complex eigenvalues, 2” that follows the review of our actual Vilmann computations.
Interpreting Complex Eigenvalues, 1: Getting to the Complex Case
Reviewers of earlier drafts of this essay unanimously urged me to explain the algebraic origins of the complex case at greater length. Indeed this style of finding is unfamiliar to most of this journal’s readers, who will have learned eigenanalysis only as applied to the symmetric positive-definite matrices that drive its application in PCA. For instance, if the task were to understand the pattern of the covariance matrix \(A=X^tX/N\) summarizing the variation of some centered N-case p-variable data set X, PCA would rewrite that matrix as \(A=\Lambda ^t D \Lambda \) where \(\Lambda \), \(p\times p\), is an orthonormal matrix of principal component “loadings” and D is a diagonal matrix of k positive “explained variances” component by component. But, as the Appendix explains, mathematicians have long had recourse to a much more general form of this analysis. Eigenanalysis can be applied to any square matrix of real numbers, whether or not it could have arisen as a variance-covariance matrix \(X^tX/N.\) The more general definition of an eigenanalysis for any \(p\times p\) square matrix A is the production of p constants \(\lambda _i\) and p unit vectors \(E_i\) such that \(AE_i=\lambda _iE_i\) for \(i=1,\ldots ,p.\) (For the application to PCA, one corollary of this theorem is the familiar fact that the loadings of a principal component are proportional to the covariances of the underlying variables with the resulting PC score.) Applied mathematicians have known for just about 200 years that there will almost always be exactly p of these \((\lambda ,E)\) pairs.
But what makes some of these \(\lambda \)’s complex numbers, sums of real numbers by multiples of \(\sqrt{-1}\), and what is the geometric meaning of those situations and its biometric interpretation?
The pedagogy goes best in the simplest setting, that of a \(2\times 2\) matrix of numbers (here, that setting might be the covariance matrix of exactly two morphometric scores against their growth changes). Write that matrix as \(A=\begin{pmatrix}a&{}b\\ c&{}d\\ \end{pmatrix}\) for arbitrary numbers a, b, c, d.Footnote 2 The eigenproblem to be solved is the computation of two numbers \(\lambda _1,~\lambda _2\) and corresponding 2-vectors \(v_1=(x_1,y_1)\) and \(v_2=(x_2,y_2)\) such that \(Av_1=\lambda _1v_1\) and \(Av_2=\lambda _2v_2.\) Such \(\lambda \)’s and v’s almost always exist algebraically; the next few paragraphs will explore the conditions that result in their turning out to be complex numbers.
How does eigenanalysis of the matrix \(\begin{pmatrix}a&{}b\\ c&{}d\\ \end{pmatrix}\) proceed? We are looking for numbers \(\lambda \) and vectors v such that \(Av = \begin{pmatrix}a&{}b\\ c&{}d\\ \end{pmatrix}~v =\lambda ~v\) or, equivalently,
The next step exactly duplicates the logic of PCA. Equation (1) will have nonzero solutions for the vector v only if the determinant of that matrix is zero: the equation \((a-\lambda )(d-\lambda ) -bc =0\) or, rearranging,
But now, in the absence of any constraints on the elements of A, we detour from PCA logic. From the formula for solving any quadratic equation \(Fx^2+Gx+H=0,\) the roots of equation (2) will be complex if and only if the appropriately named discriminant \(G^2-4FH\) of that expression is negative (because the formulas for \(\lambda \) and the v’s involve the square root of that discriminant). For the eigenequation (1), that discriminant is \((a+d)^2-4(ad-bc) = (a-d)^2+4bc \): if this quantity turns out to be negative, the \(\lambda \)’s and the vectors v of our notation will have to be complex numbers, with implications for biological interpretation of analyses in which they arise. If \(a=d,\) this is the simple condition \(bc<0\)—elements (1,2) and (2,1) opposite in sign. If \(a\ne d,\) the condition on bc is more stringent—the product must be less than \(-(a-d)^2/4.\)Footnote 3
The switch of the eigenstructure from real to complex parameters is a simple example of what mathematicians have come to call a catastrophe, an abrupt change in the topological or algebraic properties of the solutions of some equation in the vicinity of some crucial parameter setting. We can familiarize ourselves with the behavior of those complex solutions by carrying out the analysis of a series of \(2\times 2\) matrices of which one entry is varying in a strictly patterned way. For instance, Model A here deals with a pencil of matrices for which the only variation is in the (1,1) element, upper left. Fixing the other elements, write
In one sense the onset of the complex case here is the equivalent for these crosscovariance matrices of the circular case for PCA, where (see e.g. Bookstein, 2018, Fig. 4.14) two eigenvalues differing by less than their standard error ought to be treated as equal rather than ranked. But whereas in PCA the resolution consists of treating a full circle, sphere, or hypersphere of directions as equivariant, in our more general context of growth dynamics the resolution involves somewhat more detailed modifications of those directions over the course of form-growth prediction.
I have set the diagonals of the matrix A at negative values to match the situation for our Vilmann form-growth analyses to follow. For values of x less than \(-5\)—for instance, \(x= -5.1\)—the eigenanalysis yields two real eigenvalues -3.5, -2.6 (notice their sum is \(-5.1+(-1)=-6.1\)) and corresponding real eigenvectors (\(-0.781,0.625\)) and (\(0.625,-0.781\)). (These are not perpendicular—the matrix A in this model is never symmetric.)
At \(x=-5\) (one of those exceptional cases referred to before), the eigenvalues of A both equal \(-3,\) and there is only one eigenvector, along the direction (1, 1). Panel (i) of Fig. 3 displays the immediate vicinity of this Model A catastrophe. In PCA, say of the matrix \(\begin{pmatrix}5&{}0\\ 0&{}5\\ \end{pmatrix}\), while the discriminant is indeed zero, it cannot go lower, so for variation of either diagonal entry or conjoint variation of that offdiagonal zero, the eigenvalues can only diverge from the case of equality, one increasing, the other decreasing. But for the eigenanalysis of a nonsymmetric matrix like \(\begin{pmatrix}-5&{}2\\ -2&{}-1\\ \end{pmatrix},\) likewise of discriminant zero, there is more room for variation: not one but two dimensions for eigenvalues to explore. The filled dots in panel (i) show the effect of modifying A by increasing the absolute difference \(\vert a-d \vert \) of its diagonal elements by 0.01, thus driving the discriminant positive; while the open dots show the effect of decreasing \(\vert a-d \vert \) by the same amount, driving the discriminant negative and hence replacing the pair of real eigenvalues \(-3.005\pm 0.14\) by the complex pair \(-2.995\pm 0.14i.\) (The real component changes because the sum of the eigenvalues must still equal the trace of the matrix A, which has changed by \(\pm 0.01\) between these two scenarios.)
Real and imaginary canonical vectors. For various mathematical manipulations (see, e.g., Bookstein, 2018), the theoretical literature of geometric morphometrics has often found it useful to notate landmark positions in two dimensions using the formalism of complex numbers, combinations of two Cartesian coordinates (x, y) into one complex number \(x+iy\), where \(i=\sqrt{-1}\), that represents the same point of the plane. Nevertheless the individual form coordinates, the Boas variables that go into the multivariate analysis of unscaled form, cannot themselves be complex numbers. To link GMM maneuvers to the novelty of these complex eigenvectors it is useful to introduce a new pair of algebraic constructs, the real and imaginary canonical vectors that can represent any complex eigenvector as a pair of real vectors.Footnote 4 Like the eigenvectors themselves, these canonical vectors have as many elements as there were originally shape coordinates: for the Vilmann data set, that is a length of \(2\times 8 = 16.\) The real canonical vector for an eigenvector \((u_1+iv_1,u_2+iv_2,\ldots ,u_{16}+iv_{16}),\) where the \(u+iv\) are the elements of the complex eigenvector, is the 16-vector of real parts \(\mathbf{r}=(u_1,u_2,\ldots ,u_{16})\), where r is the notation this paper will use in the sequel. Similarly, the imaginary canonical vector is the 16-vector i of imaginary parts \((v_1,v_2,\ldots ,v_{16}).\) Multiplication of r by its eigenvalue yields the vector in the r,i plane to which the action of the matrix under analysis shears it; this vector will be called r’. The same matrix multiplication turns the imaginary canonical vector into its predictand i’.
If \(Av=\lambda v\) for some complex vector v, then also \(A(cv)=\lambda (cv)\) for any complex number c. All the figures in this paper exploit this possibility to normalize each eigenvector the same way that PCA routines usually normalize: by scaling so that the sum of squares of each eigenvector’s elements is 1.0. For eigenvectors that are complex, there is a convenient side-effect to this transformation: the resulting canonical vectors have been made orthogonal. (If \(\Sigma (u+vi)^2 = \Sigma \bigl ((u^2-v^2)+i(2uv)\bigr ) = 1,\) where \(i=\sqrt{-1},\) \(\Sigma uv\) must be zero.) This will greatly simplify Figs. 15 and 16 in Sect. 4. Note that although r and i must be orthogonal in view of the way these eigenvectors were normalized, their images r’ and i’ need not be and usually won’t be.
Panel (ii) of the figure displays the first eigenvalue of these Model A matrices for 16 values of x beginning at \(-4.999\) and continuing between \(-4.75\) and \(-1.25\) at intervals of 0.25. Their real part increases steadily with distance from the catastrophe for this half-line of values of x, whereas the imaginary part decreases over the same range of x. The plot for the second eigenvalue would be the mirror image of this plot in the horizontal line at the top of the graph.
The remaining two panels of Fig. 3 display the other aspect of the eigenanalysis, the canonical vectors, for two instances of Model A. In every case the vectors r and i are orthogonal, but not of equal length. Under the action of A, if the eigenvalue \(a+bi\) is paired with eigenvector \(u+vi\), where \(u=\mathbf{r}\) is the real canonical vector of this eigenpair and \(v=\mathbf{i}\) is the imaginary canonical vector, then the image of this component under the action of A is the vector
The real canonical vector u thus is transformed into the vector \(au-bv,\) which, because u and v are perpendicular, is always oblique to u. Likewise the imaginary canonical vector v is sent by A to \(av-bu,\) which is necessarily oblique to v. In other words, the canonical vectors u and v are never eigenvectors themselves; it is only the plane they span that is invariant under the action of A.
The interpretation of the identity (3) is to decompose the effect of any complex eigenvalue \(a+ib\) of one of our form-change matrices \(X^tY/18\) into the sum of two distinct processes: \((u.v)\rightarrow a(u,v)\) and \((u,v)\rightarrow ib(u,v)\equiv b(-v,u).\) When a is negative, the first of these corresponds to the biologist’s notion of “stabilization,” proportional reversion of both canonical vectors in the same ratio back toward their mean over the growth interval in question. The effect of the term in b, however, is a sheared representation of a rotation of the canonical pair. (For a detailed diagram of this decomposition, see Fig. 18 in the Appendix.) This second term does not align with the biologist’s notion of “compensation.” I will argue near the end of Sect. 5 below that that notion does not extend usefully to this context of multivariate growth prediction.
As panel (iii) of the figure exemplifies, for a value of the discriminant near 0—a value of x near \(-5\) in the model \(A=\begin{pmatrix}x&{}2\\ -2&{}-1\\ \end{pmatrix}\)—the vector r’ or i’ for “growth” lies nearly opposite the vector r or i for “form,” which in our application will mean instances of nearly pure autoregression in the two distinct directions. The situation in Fig. 5 will resemble this instance. By comparison, in panel (iv), for a value of x much closer to 0, when the discriminant of A is substantially more negative, the primed canonical vectors (“growth”) have been rotated by almost \(90^\circ \) with respect to the unprimed (autoregressive) direction, the direction of pure proportional reduction of perturbations of form away from the mean. By virtue of this nearly \(90^\circ \) rotation, r’ can end up almost aligned with i and i’ with -r, the opposite of the original real canonical. Beyond the limits of Model A this will be the case, for instance, whenever the model matrix takes the form of a pure reversion rotation \(\begin{pmatrix}0&{} -b\\ b&{}0\\ \end{pmatrix}\)—call this Model B. Here the real and imaginary canonical vectors are along (1, 0) and (0, 1), respectively, which are transformed into (0, 1) and \((-1,0)\) by the action of the matrix B. B’s eigenvalues are \(\pm i\), meaning, pure \(90^\circ \) rotations of the canonical vectors in their plane. We will encounter this case in the Vilmann example for a late stage of growth (Fig. 15D).
The Method of Serial Eigenanalysis as Applied to a Growth Study via Its Boas Coordinates
The remaining analyses of this paper are based, as I already mentioned, in the Boas coordinates for the Vilmann octagons instead of the interlandmark distance measures the calculations have referred to in Fig. 2. Boas coordinates (which Joe Felsenstein and I named after the American anthropologist Franz Boas, who published the basic idea way back in 1905) are a modification of Procrustes shape coordinates that reverses the division by Centroid Size encoded in Gower’s “generalized Procrustes analysis” so as to be much more useful for studies of organismal growth. I first mentioned these coordinates on pages 412–414 of Bookstein (2018), and there is a lengthy derivation and justification in Bookstein (2021), a paper that demonstrates why our familiar Procrustes analysis basically makes no sense in applications to growth studies. In brief, the reason for this proscription is the inappropriateness of the measure called Centroid Size (root-mean-square of the distance of the landmarks from their specimen-by-specimen average) that the Procrustes method uses to “adjust for size” by an arithmetical division. Division of Cartesian coordinates by Centroid Size not only has no relation to any allometric phenomenon that might actually govern the changes of form in such data sets, but also destroys the interpretability of the “allometric regressions” on size that typically follow after Centroid Size has been divided out already.
Boas coordinates can be understood as the coordinates of a data set of landmark configurations that result when every specimen’s Procrustes coordinates are multiplied by that specimen’s Centroid Size, the quantity that was divided out in the course of producing the Procrustes coordinates in the first place. Some helpful facts about Procrustes coordinates are inherited, with modifications, by these Boas coordinates. They still average exactly zero in both dimensions specimen by specimen and over all specimens, and they still satisfy an additional constraint, corresponding to their having been rotated to a least-squares position with respect to their average. Because the size-standardization step has been omitted, this totals three annihilated dimensions instead of the four that characterize the Procrustes coordinates, so that, for example, the full set of 16 coordinates in Fig. 4 involves 13 “degrees of freedom”—thirteen dimensions of nonzero variance—instead of the 12 that characterize the Procrustes shape coordinate space of the same data set. Also, when interpreted geometrically in terms of mutual displacements of the landmarks, the eigenanalyses of the covariance matrices considered here, covariances of forms against their growth changes, are empirically invariant (or, more precisely, contravariant) against rotation of all these configurations in common. This property is essential because the horizontal has been set arbitrarily in all these geometric displays. (The particular orientation is chosen to simplify a certain formula characterizing the uniform term of the bending-energy spectrum of amplitudes. See the discussion of the J-matrix in Bookstein (2018), pp. 408–411. Intriguingly, the algebra of this orientation is the same as the algebra of the normalizations this paper is applying to the complex eigenvectors.)
Figure 4 displays these coordinates for all eighteen of Vilmann’s animals at all eight ages. At upper left is the pooled distribution of all 144 configurations. The dominant pattern, for which Bookstein (2021) suggested the name “centric allometry,” is the common tendency of all these points to recede more or less directly away from the center over the developmental process. The remaining panels of this figure show the sequence of eight octagons for each of the eighteen animals separately. As this is the same data set on which Fig. 2 drew, we can see, for instance, the animal-to-animal variation in the change of position of the landmark SES (lower right corner of the octagon) between ages 7 and 14 days. The display does not, however, conduce to confirming the finding in Fig. 2, the strong negative correlation between the starting position of SES with respect to Opi and its change of relative position over these seven days: one simply cannot perceive that correlation from the graphical design here.
As sketched in Sect. 3 and elaborated in the Appendix, I carried out an eigenanalysis of each growth interval (age 7 days to age 14 days, age 14 to 21, \(\ldots ,\) age 90 days to age 150 days) as follows. For each of the seven starting ages of these intervals, the eight Boas coordinate pairs of our specimens’ eight landmarks were vectorized (all the \(x-\)coordinates, followed by all the y’s) and the entries mean-centered into one data matrix X of sixteen variables over eighteen specimens. Their change scores from each measurement wave to the next were similarly vectorized and mean-centered into a similar data matrix Y of sixteen age-to-age change scores of the same Boas coordinates over the specimens. The core of the analytic calculus is the eigenanalysis of the crosscovariance matrix \(X^tY/18\), \(16\times 16\), of these two data sets.
Many of the eigenpairs of the full eight-landmark (sixteen-coordinate) data set to be reviewed here are examples of the type of growth regulation simplest to interpret: perturbations that are followed by changes exactly aligned with their reversals. These correspond to the real eigenvalues of the interval-by-interval form-growth eigenanalyses. The strongest instance of these is the pattern of change of form of these animals from age 14 days to age 21 days, the second growth interval, as shown in Fig. 5.
Figure 5 pertains to the first eigenvalue \(\lambda _1\) of the second growth interval and the corresponding eigenvector \(v_1\). The eigenvalue is a real number that is printed twice in the figure: once above the leftmost panel and again as the covariance between the two scores \(X~v_1\) and \(Y~v_1\). (The Appendix shows why these must come out the same.) It may be easier to interpret this covariance via the correlation \(r=-0.59\) plotted at the right for the score for the age-14 forms against the corresponding score for the 14-to-21 changes. The associated eigenvector \(v_1\), representing the displacements of Boas coordinates (or changes) whose linear combination produced those scores, is plotted (at an arbitrary multiple) landmark by landmark by the eight separate vectors in the left panel (two coordinate loadings each, an x and a y, encoded in the displacement of the open circles from the solid dots). These same displacements are interpreted at the same multiple by the thin-plate spline grid at center. Evidently this deformation is primarily an extension of the posterior two-thirds of the cranial base, from ISS to Basion, with smaller adjustments at the other landmarks. That correlation is a substantial one (though we will encounter a value of greater magnitude in just a moment). The eigenanalysis protocol aims not at maximizing this correlation (or even the corresponding covariance) but instead at ensuring the alignment of the linear combination for change with the sum of all of its predictions by the Boas coordinates for the form immediately preceding this growth change.
The complexity of the general task here, the disentangling of form-changes from form-variation, is apparent in the contrast of Fig. 5 with Fig. 6, the same representation for the second eigenvalue-eigenvector pair of this same analysis, age 14 days to 21. From either the displacement vectors (left) or the grid (center) we see a phenomenon centered on the region of the foramen magnum—two landmarks on the posterior cranial base, two on the parietal bone. The eigenvalue corresponding to this somewhat restricted perturbation response is only \(-15.9,\) less than half that in Fig. 5, but the correlation of the corresponding scores is considerably greater, a full \(-0.78\) (Fisher’s \(z = -4.05\)). Geometrically, this eigenvector signifies a regulation of the angle between the cranial base and the parietal bone across the foramen, together with the scale of this subregion. According to the eigenvector, restoration of relatively small foramina goes with reduction of unusually large angles, and obversely. (As this dimension is composite, neither of the traits separately is likely to have been regulated to this extent.) The biomechanics of this tradeoff at this stage of the pup’s life might well be interesting if confirmed in a second, more detailed study of what one hopes would be a larger sample of animals.
It is useful to compare the preceding two analyses to the single analysis conveyed by the first eigenvalue/eigenvector pair for the preceding growth interval, from age 7 days to 14. This analysis exemplifies the complex-eigenpair case described second in the Appendix. A single complex eigenvalue \(-27.3-6.4i\) is associated with a deformation showing striking features at both ends of the cranial base. It incorporates a reversion toward the average, covariance somewhat under \(-30\), with a sheared rotation to be characterized by its effects on the two canonical vectors. For the real canonical vector, which has the larger element of the eigenvalue, the four landmarks nearest the foramen magnum show a local feature of reversion echoing the change of angle between parietal bone and cranial base across the foramen magnum in Figure 6, the second eigenvector of the second growth interval, whereas the anterior cranial base shows a separate response to perturbations of the proportions along this structure. The grid for the second canonical vector is a more integrated gradient aligned across that foramen rather like the opening legs of a nutcracker.
The algebra of these complex eigenvalues, the Appendix informs us, regenerates the real part \(-27.3\) of that eigenvalue as the difference of the two covariances of the canonical scores, \(-31.9\) (the real parts) minus \(-4.5\) (the imaginary parts)—the minus sign arises because the product of two multiples of \(\sqrt{-1}\) is, indeed, \(-1\)—and the imaginary component \(-6.4i\) of that same eigenvalue is the sum of the covariances \(-7.5i\) and 1.1i of the remaining two scatters in this figure. Age-7 scores on both of the canonical linear combinations show strong negative correlations with 7-to-14-day change (\(-0.56\) for the real part, \(-0.67\) for the imaginary), suggesting that both of the grids in the second column of the diagram may be potentially meaningful composite characters of form-regulation in this growing system. The off-diagonal correlations are less than the on-diagonal ones, in keeping with the dominance of the real part of the eigenvalue. Had we taken the conjugate eigenvalue, \(-27.3+6.4i,\) and the conjugate eigenvector with which it is associated, the main effect on this aspect of the analysis would be the change from 1.1i and \(-7.5i\) in the plots at right to \(-1.1i\) and 7.5i, along with the replacement of the lower vector diagram and its thin-plate spline grid by their opposites. The terms \(-31.9\) and \(-4.5i^2\) would not change.
The composite displays in Figs. 5, 6, and 7 are three of the most informative of the entire series. One may demonstrate that dominance by a pair of summary figures that span the entire range of growth changes, 7 intervals by 13 dimensions of Boas coordinate variation, totalling 91 separate eigenvalue/eigenvector pairs (33 real cases, like Fig. 5 or Fig. 6, and 29 complex pairs as in Fig. 7). The first of these summary diagrams, Fig. 8, lays out the eigenvalues separately along a complex line. As the figure shows, the analysis in Fig. 5, the first eigenvector of the second growth interval, encodes the most powerful reversion signal of any interval in this data set, while the first (complex) eigenvector of the first growth interval (Fig. 7), of magnitude nearly as high, is dominated by its real part. Two other large negative real eigenvalues will be reviewed in due course, but the two atypically positive eigenvalues at the right in Fig. 8, which stand for perturbations with positive feedback on subsequent forms, generate correlations too small to be worth trying to interpret.
Figure 9, a different display style for the same series of eigenvalues, takes advantage of one of the theorems that copies over from the PCA context to the context of these generalized eigenanalyses: the sum of the eigenvalues of any analysis is still the trace of the underlying covariance matrix—the sum of all its diagonal entries, each of which is the covariance of one of the Boas coordinates separately with its own change score. (This is the same theorem as the decomposition of total variance of any data set by the full list of its eigenvalues according to a conventional PCA.) At far right, reduced 80% in vertical scale, is the total of these eigenvalues for the seven growth intervals separately. Clearly the first growth interval, 7 to 14 days, is going to be the most interesting, with the next five, while closely clustered, showing a trend of decreasing total attenuation through age 90 days. The figure informs us that attention need be paid mainly to the first two or three eigenvalues of any of these analyses. Except for the first growth interval, all of these traces hew closely to the heavy horizontal line (zero real part, no apparent autoregression in either sense) from about the third eigenvalue onward. The two apparently outlying positive eigenvalues, one for change interval 7 and the other for change interval 2, correspond to deformations not diagrammed here that bend the cranial base into an S-shape. They bear correlations (0.26 and 0.36, respectively) that, although positive, are too small to be viewed as meaningful in a data set of only 18 animals.
Concealed in Fig. 9 is a surprise: the dominant eigenvalue in three out of our seven form-vs.-change analyses is a complex number, the situation illustrated in Fig. 7. This is an unexpectedly high frequency for a scenario so distant from the familiar PCA and partial least squares (PLS) metaphors. The steepness with which these curves rise toward zero following the initial signal at the left margin is likewise unexpected. In words: where there is evidence for dimensions of strong attenuation of perturbations, except for the very first growth interval here (7 to 14 days) the count of those dimensions, allowing for sheared rotation when appropriate (the complex case), seems limited to just one. Section 3 has already noted that the probability of complex eigenvalues in a scenario somewhat similar to these form-growth models can be computed to arbitrary accuracy by a careful numerical integration. The resulting probability of 29.4%—nearly one chance in three—that that first eigenvalue will be complex rather than real is high indeed for a situation lying outside the domain of all the current textbook discussions. That percentage is consistent with the emergence of the complex case in three out of the seven form-growth analyses here.
Some of the other analyses with real eigenvalues toward the left of the chart in Fig. 8 can be reported more briefly. Figure 10, displaying the (real) first eigenvalue of the third growth comparison, shows a persistence from Fig. 5 of the negative autoregression of net cranial base length, SES to Basion, now diffused throughout the height of this octagon so as to apply to the calvarial roof as well. Geometrically it has the appearance of the uniform transformation that dominates the large-scale aspects of this growth trajectory of this octagon over the full interval of observation, from 7 to 150 days (Bookstein, 2018, Fig. 5.84). Another eigenvector (Fig. 11), associated with the first eigenvalue of the final growth comparison (90 to 150 days), continues the attenuation of perturbations of cranial base length combined at this period with a localized reversion toward the mean of its deviation from straightness at SOS. At a correlation of \(-0.47,\) however, this feature may well be too dependent on the pair of possible outliers (upper left in the right-hand panel) to justify a serious attempt at interpretation. And the third eigenvector of the first growth interval, Fig. 12, also shows a strikingly large negative autoregression, this one a sort of mirroring of the highly integrated “nutcracker” in Fig. 5 so as to apply at the other end of the octagon. Only for two of the eigenanalyses (the first eigenpair for change from 21 to 30 days, Fig. 10, and the third eigenpair for change from age 7 to age 14, Fig. 12) does a term suggesting reversion of uniform shear appear as part of the dynamic analysis, even though it dominates the conventional principal component analysis of this octagon except at the very end of development (Bookstein, 2018, Figs. 5.77, 5.84).
Here are two more instances of a complex eigenpair selected from the upper border of the range of such eigenvalues as displayed in Fig. 8. These will underlie the more detailed analyses of their dynamics that concern the corresponding rows in Fig. 16 to follow. Figure 13 displays the dominant feature of the relation between form at 30 days and change from 30 to 40 days. While it shows quite a high negative autoregression (the correlation is \(-0.7\)) in its real part, it continues (as Fig. 6 was) to be concerned with localized relationships involving the angulation of the foramen magnum, combined with a change in the proportional position of Brg between IPP and Lam. The actions of the canonical vectors on the relative position of Lam between Brg and IPP seem to be opposite. Figure 14 is an example of an eigenvalue for which the real part is not so dominant. While the real canonical vector (top row, second column) appears to be an interesting grid, it shows relatively little covariance with either canonical change score. The interesting panel of this figure is the scatter of the two imaginary canonical scores, lower right: a correlation of \(-0.51\) (in spite of the obvious positive outlier) for a feature combining remodeling around the foramen magnum with an anterior shearing across the whole height of the calva.
In this way, a thorough review of the full Vilmann data set by a seemingly more demanding biomathematical method has ended up at a plausible one-sentence verbal summary of the form-growth relationships (“mainly a matter of a small number of age-specific dimensions of substantial autoregression, often involving the foramen magnum, primarily at younger ages”) together with a panoply of age- and region-specific deformational details. The ubiquity of those negative autoregressions, as surveyed in Figs. 8 and 9, is a surprise. One might have expected a pattern of stable individual differences to appear partway through this growth sequence (see, for instance, the analysis of trends in net area of this octagon toward later ages, Fig. 3.28 of Bookstein, 2018). There might also have been dimensions of this octagonal form that showed meaningful positive autoregressions—the divergence of some individuals’ growth trends owing to amplification of some dimensions of meaningful difference—but that is not, in fact, the message the eigenanalyses are sending us. Instead, the growth patterns among these eighteen animals seem overwhelmingly to be dynamically stable, characterized by eigenvalues that are nearly all negative or zero. But the steepness of rise of the traces in Fig. 9 is also a surprise: no growth interval except the first appears to justify a characterization by more than two dimensions of autoregression.
Interpreting Complex Eigenvectors, 2: A Diversity of Vilmann Growth Geometries
Whenever the eigenvalues corresponding to a form-change eigenvector are real, as in Fig. 5 or 6, the biological interpretation likely will be relatively easy to phrase: a linear autoregression of change upon starting score in the direction of morphospace given by that eigenvector as interpreted by its grid. In our Vilmann data set these regressions almost always have negative slope, meaning that differences from the sample mean in the direction specified by the eigenvector tend to be reduced over time according to the covariance given by the eigenvalue. In terms of the aspects of growth-form relations characterized by these real eigenvalues, the formula for the change score as a sum of multiples of changes in Boas coordinates is, by the explicit algebra of its calculation, identical with the formula for the starting score as the same sum of multiples of the initial coordinates. There is only one dimension of form-space involved in the description, the dimension displayed as both displacements and grid in diagrams like Fig. 5, and the succession from state k to state \(k+1\) is represented, as in any other scalar Ornstein–Uhlenbeck dynamic process, by a shift back toward the mean following a linear trend proportional to the initial deviation.
In the case of a pair of complex eigenvalues, however, the interpretation involves two dimensions of form space, not one: a phenomenon taking place within a plane in morphospace, not merely along a line. Such configurations are not rare—they comprise 58 (29 conjugate pairs) of the 91 eigenvalue-eigenvector combinations characterizing the seven form-change covariance matrices of the Vilmann growth data under analysis here. But this alternate diagrammatic rhetoric is much less familiar to the intuition of the organismal biologist trained only in classical multivariate techniques such as PCA. Then I suspect it is worth the reader’s time to survey the specific two-dimensional geometry of some of these canonical (u, v) planes, where prediction of growth from form involves a change of direction of the real canonical vector u to its image \(au-bv\) under the action of A, and likewise a change of the imaginary canonical vector v to its image \(bu+av\) under the action of A. From the 29 complex eigenvalue-eigenvector pairs over our seven intervals of change, Figs. 15 and 16 extract examples that cover most of the alternatives, with a suggested rhetoric for reporting each. Each panel of Fig. 15 is drawn in the plane of the canonical vectors. For the first three panels, the corresponding row of Fig. 16 supplies the grids that name the directions of interest in this plane.
Each panel of this figure is to be interpreted as a presentation in the style of panels (iii) and (iv) of Fig. 3 that matches the eigenvalue and the canonical vectors of the eigenpair it is illustrating. Each figure lies in the plane of those two canonical components, in a configuration positioned with the real canonical vector (labelled “r”) pointing to the right and the imaginary canonical vector (“i”) pointing vertically upward. The lengths of these two vectors in each plot, actually in 16 dimensions, are scaled so that their squares sum arbitrarily to 1.0—the distance between the ends of these two vectors is always 1.0. The heavy vectors labelled r’ and i’ are proportional to the real and imaginary canonical vectors in this same plane (for that is what the fact of their together comprising an eigenvector guarantees) after multiplication by the crosscovariance matrix \(X^tY/18\) where X is form and Y is form-change. The directions r and i were already interpreted as deformations in the discussions of Figs. 7, 13, and 14. Via the additional grid images in the next figure, the growth dynamics of the corresponding eigenpair can be interpreted via the relationships among these four vectors.
Consider first the configuration in panel (A), the first eigenvalue and eigenvector for the first growth interval. This is the eigenpair that was detailed earlier, in Fig. 7. The imaginary part of this eigenvalue is considerably lower in absolute value than the real part. In this example the canonical vectors are quite disparate in length, r longer. The prediction r’ for a perturbation r is not far from a simple attenuation, a predicted change in the direction opposite the perturbation. The prediction i’ for a perturbation i is mildly confounded by a term in -r owing to the dominance of r’s length over i’s. Had we chosen the second eigenpair of this analysis, the complex conjugate of the eigenvalue and the eigenvector, we would get only a reflection of this same diagram. Complex eigenpairs like these have only double weight, not quadruple weight, with respect to the other kind of eigenvalue, the real number that was already diagrammed completely in figures like Fig. 5. In this way the total number of eigendimensions reported continues to equal the total number of nonnull dimensions of the Boas coordinates, in this example, 13.
Panel (B) shows another “almost real” eigenvalue, \(-13.6-5.4i,\) that dominates the analysis of the fourth change period. The canonical vectors for this form-change summary are nearly equal in length. We can now see more clearly how the predictands r’, i’ deviate from the reflections -r, -i of the canonical vectors by a modest joint shift, almost a rotation, that combines each with a small multiple of the other.
An intermediate scenario is characterized by eigenvalues whose real and imaginary parts are not too unequal in magnitude. In this scenario, the predicted perturbation of a form deviating from the initial average by a change driven by the real canonical vector is blended with a commensurate contribution from the imaginary component of the same eigenvector, so that beyond the attenuation given by the real part of the eigenvalue, the prediction more or less rotates the real component of the eigenvector’s effect by about \(45^\circ \) in this r, i plane. And, similarly, the predicted perturbation of a form driven by the imaginary component of that eigenvector blends with a real component of the perturbation prediction, the two predictions roughly sharing the same reversal and rotation in this plane. Panel (C) shows an example of this situation. The reversal of direction of r and i has the intuitive meaning of a stabilization, but the adjustment of both components in the same rotational sense, each by the same multiple of the other, is novel to this context of an eigenvalue \(a+ib\) with a and b roughly equal in magnitude. (Evidently there is also some shearing of that pair of vectors.) In panels (A) and (B) the rotation was by nearly \(180^\circ \); that is why it could be reported as “almost a reversion.” But when an eigenvalue’s real and imaginary components are nearly equal, the angle of rotation is closer to \(135^\circ \) than \(180^\circ \); the deviations from reversion are now substantial.
Finally, consider an eigenvalue that is almost pure imaginary, panel (D). As a fifth eigenpair in a sample of 18 animals only, it almost surely lacks biological meaning. I include it only as an example of one possible geometrical extreme, where the rotation of r’ and i’ in their morphospace plane will approach the value of \(90^\circ \) (Model B in the earlier text). (The example at right on page 133 of Bookstein (2018) is such a configuration, subsampled along the time dimension and then indefinitely extrapolated.) As we follow a complex eigenvalue as it passes from nearly-real to nearly-imaginary that way, panel (A) to panel (D), the prediction of variations along the real axis of the initial eigenvector score rotates steadily into a position of overlap with the positive direction of the imaginary axis of that starting score. (According to the formula, this final position is close to \(-bv,\) but b is negative for eigenvectors like this one.) At the same time, unsymmetrically, the prediction of changes along the positive imaginary axis of the starting score rotates steadily into a position of overlap with the negative real axis of that starting score.
Thus in this scenario, a perturbation of the initial form along the real part of the eigenvector score, the horizontal in panel (D), would be predicted to change in a pattern aligned with its positive imaginary part; whereas a perturbation of a form deviating from the average along the positive imaginary part of this same eigenvector score, vertical here in panel (D), is predicted to change by the negative of that horizontal. This countermands one’s intuition of an “inverse regression”—if the perturbation prediction takes variation on the positive real axis r here to the positive imaginary axis i, shouldn’t it also take the positive imaginary axis i back to that positive real axis r? No, that expectation could apply only in a setting lacking the dynamic context of these growth analyses. Even in the case of panel (A) of Fig. 15, the dominant eigenpair for prediction of growth to 14 days from form at 7 days, the prediction of change in the real component of the eigenscore conflated a bit of the variation in the imaginary component, and conversely. As the imaginary component of the eigenvalue increases, up through panel (C) of Fig. 15, the prediction formulas for a perturbation appear less and less like the only lightly linked stabilizations of panels (A) or (B) of the figure and more and more like the coherent joint rotation seen most clearly, however unrealistically, in this last panel.
Figures 3 and 15 summarize the vector geometry of the complex case, but in order to pass to the rhetoric of biological explanation for analyses of morphometric data, we need to invoke the iconics of the corresponding displacement diagrams or thin-plate spline grids also. Figure 16 pursues this purpose using the thin-plate spline. To understand the meaning of the first three panels in Fig. 15, examine the grid representations in Fig. 16 as they match panels A, B, C of the preceding figure. Keep in mind the two main principles of these eigenanalyses:
just as the forecast growth change of a stabilized growth pattern represented by a real eigenvalue lies along the line encoding the Boas vector for that pattern, but in the opposite direction from the initial state of the form, so in the complex case the forecast growth change of the growth pattern lies in the same plane as the canonical vectors of the eigenvector itself, the real (r) and the imaginary (i);
and
for the complex eigenvalue \(a+ib\) to be an eigenvalue, the action of the crosscovariance matrix \(X^tY/18\) on their eigenvector \(u+iv\), represented by the pair (u.v) of canonical vectors, needs to result in the vector \((au-bv,av+bu)\) in the same (u.v) plane. The terms in a here correspond to a process of stabilization; those in b, to a rotation within this same (r,i) plane.
So we can intuit the meaning of these complex eigenpairs by inspecting the corresponding grids of their real and imaginary canonical vectors.Footnote 5 In the first row of Fig. 16, for instance, is the full set of relevant deformation grids for growth changes from age 7 days to age 14 days along that first (complex) eigenvalue \(-27.3-6.4i.\) You have seen the grids for r and i before, in Fig. 7; here they are joined by their inverses -r and -i. Then this first form-growth eigenpair takes variation along the real part r of the eigenvector to variation along the grid r’, which is almost exactly the same as the inverse transformation -r; whereas the grid for variation along the imaginary component i’ lies mainly along -i with a detectable modification by -r. Apart from that small adjustment, these are almost a pair of real eigenvectors, with nearly the same slopes of reversion along those two directions considered separately.
Biological interpretation. The features of greatest explanatory power for perturbation-aligned growth between ages 7 days and 14 days of Vilmann’s rats are approximately the patterns shown by the grids for r and for i in the first row of Fig. 16. Each predicts a trend of reversion to the mean form (stabilization) that is closely aligned with the inverses -r, -i of these same directions. Thus in row 1 of Fig. 16, the grid for r’ looks like the grid for -r, correction of the overextension of the interval between SES and ISS, together with a proportional correction of the decrease in rear calvarial height-to-width ratio and a stabilization of changes in aperture and orientation of the foramen magnum. The grid for i’ looks like the grid for -i, mainly an increase in anterior calvarial height, but is slightly more extended vertically and less extended horizontally owing to its contribution from -r. Otherwise, both r’ and i’ closely match the canonical grids -r and -i, respectively.
A second example of a strong first eigenpair that is complex but “almost real” was anticipated as panel (B) of Fig. 15 and is elaborated in the second row of Fig. 16. Unlike the situation above in panel (A), the canonical vectors r and i here are now of more nearly equal length. But nevertheless, because the real part of the eigenvalue is about double the imaginary part, only the image i’ under the action of A is observably rotated from a position of simple stabilization.
Biological Interpretation. The canonical vectors here evidently differ a great deal in their degree of integration (Bookstein, 2015). The deformation given by i somewhat resembles a vertical multiple of the first partial warp of the mean form (Bookstein, 2018), a displacement field across the mean form that is highly integrated by virtue of being close to a quadratic in the x-coordinate. The suggestion of stabilization of this direction perhaps suggests a functional interpretation in terms of stiffness of the cranial base axis. The apparent stabilization r’ for the bilocal imaginary canonical vector r, in contrast, suggests a dis-integrated combination of two processes at some distance, one regulating the foramen magnum and the other reproportioning the landmark Lam between IPP and Brg. Recall from the discussion of Fig. 13 that the effects of i and r on this proportion were opposite; then the grid for i’, which combines -i with -r, goes some way toward stabilizing this proportion. The combination remains a highly integrated pattern indeed, except for a small local rearrangement along the parietal bone.
The last row of Fig. 16 diagrams an example for which the two components of the eigenvalue (\(-9\) and \(-7.4\)) are nearly equal and the lengths of the two canonical vectors not too disparate. (Recall from the discussion at Fig. 14 that the imaginary part of this eigenvalue is attenuated by one serious outlying pair of scores inconsistent with the trend shown by the other 16.) Fig. 3 showed us, and panel (C) of Fig. 15 confirms, that this case entails a rotation of both canonical vectors in their plane from the attenuations consequent to the real part \(-9\) of this eigenvalue. Because real and imaginary parts are commensurate, both of the predicted growth change vectors r’,i’ are hybrids of both the canonical vectors, -r and -i. As the second column from the right in this row shows, the real part r’ of the predicted growth change in this plane combines roughly equal fractions of -r and i, reflected real canonical component of the initial form with unreflected imaginary component. At the same time, the imaginary part of this same predicted growth combines the inverse -i of the starting imaginary canonical deviation with an aliquot of that same inverse -r of the original real canonical deviation. Note again that the treatment of the real and imaginary parts of this change is not symmetrical. This is perhaps the most counterintuitive aspect of the whole complex eigenanalysis: these dimensions do not behave as complementary, or, if you will, “compensatory” aspects relating the form and the form-change. The prediction of the real canonical response includes a component that is positive in the imaginary canonical, but the prediction of the imaginary canonical includes a component that is negative in the real canonical part.
Biological interpretation. The grid for r shows its main deformations in the anterior calva, displacing both landmarks there downward and backward; the grid for i emphasizes what is by now a familiar remodeling of the foramen magnum, with a bit of the opposite anterior calval effect. The predicted response r’ to the perturbation r is closely aligned with -r, displacement of the segment SES-Brg upward and forward. The prediction i’ of patterns of variation along the imaginary canonical vector i combines its opposite -i, displacement of Brg posteriorly, with some of the grid -r, an adjustment of the upper calva that shifts Brg relatively anteriorly. The combination of these two terms in the construction of i’ (Fig. 15C) has the effect of partially cancelling out the changes in the anterior calva encoded in the two canonical vectors separately. Rather, the region is left alone, with a predicted reversion only in the neighborhood of Lam. In this way, aspects of orthogonality of a feature to growth prediction (that is, irrelevance of the perturbation to the growth forecast) can emerge from combinations of the two canonical vectors that are not apparent in either one separately.
Panel (D) of Fig. 15 was included only for geometrical completeness—there appear to be no meaningful examples of this possibility in the analysis of the 18 Vilmann animals. The three preceding interpretations are here mainly as examples of how to decode the canonical pairs of a complex eigenvalue as coplanar substrates for the multiplication by their shared eigenvalue. In a sample as small as Vilmann’s it would be inappropriately optimistic to claim that any are stable against sampling except possibly for the first of any interval’s eigenpairs, which could plausibly be suggestive of regulatory processes for the first growth interval (age 7 days to 14) and the fourth (age 30 days to 40). Their credibility would also depend on the magnitude of the correlations in the scatterplots of these figures, which express the explanatory power of the hypothesis of form-growth alignment. The same caveat would likely apply to any real eigenvectors that proved of interest. This caution strongly supports the recommendation in the next section that designers of longitudinal studies make every effort to work with much larger samples than they do at present.
Implications Theoretical and Empirical
I believe the Vilmann data set here is the most diversely analyzed data set that the GMM community has ever encountered. As just my own personal contribution, after beginning with tensor analyses (Bookstein, 1983, 1987), I have previously examined it by partial warp analysis (Bookstein, 1991, 2021, 2018), integration analysis (Bookstein, 2015), factor analysis (Bookstein, 2017), and allometric analysis (Bookstein, 2021), and at every return engagement new patterns have emerged. In part this persistent yield of information is testimony to the crucial role that sound experimental design plays in the biomathematical literature. The centrality of growth designs, in particular, is a principle that has been with us for just about 100 years—it was first emphasized as the core of theoretical biology in publications of the Vienna Vivarium around the year 1920 (see my review in Bookstein, 2019). One of those Vienna publications wrote of the crucial role in developmental biology to be played by serial analysis of locally registered measurements of form:
[D’Arcy] Thompson’s holistic deformations can be made comprehensible if we can visualize a space lattice upon the living form, so as to assess how each little piece changes its shape under conditions that vary by species. Here lies open a rich, nearly undeveloped field that invites a mathematization, one whose erection we hope will begin very soon. At the conclusion, measurement of the physical space grid will have been unified with stereochemical organometry. (Przibram, 1923:14)
In this sentiment Przibram was boldly anticipating a method, the detailed study of development as spatial transformation, that was to become practical first in botany (e.g. Schüepp, 1966) while otherwise awaiting the advent of cinematic recording of embryos by pioneers such as Jacobsen and Gordon (1976). But absent from the corresponding biomathematical literature over this entire century has been a method for the variational analysis of these deformation-based formalisms within species. Findings along this line are worth the investment of a graduate assistant’s time only if they can be processed through some analytic method that exploits both of the criteria of a good longitudinal design: not only the persistence of the same specimens over multiple ages of observation, but also the careful coverage of the form by coordinate data permitting multiscale summaries like these grids, rather than any predetermined list of scalar measurements that afford only tabular summaries measure by measure.
This paper’s method of form-vs.-change eigenanalysis may well turn out to be a crucial part of a new methodological toolkit for empirical studies of postnatal development that might fulfill the Vivarium’s vision of so long ago. The algebra here would indeed conduce to investigations of how those changes of form of “little pieces” depend on the configuration that is observed to be changing, whether grid cell by grid cell or holistically. As one research setting worthy of increased attention, we might highlight the phenomenon of accretionary growth, such as in shelly creatures and perhaps many botanical settings, where developmental information is retrievable in continuous or cyclic longitudinal time from microimaging of adult forms, thereby bringing their biometrics closer to the central themes of evolutionary GMM.
Over the history of multivariate analysis there have been many analogues to this injection into empirical biometrics of mathematics borrowed from disciplines at some distance. In the remarkably early year of 1901 Karl Pearson introduced the method of principal components analysis for the purpose of determining the least variable dimension from a data set of three measured lengths. His paper is clearly modeled after the much earlier work of the British algebraist school on the decomposition of “quadrics” for applications in physics and mechanics, and his demonstration that principal components are eigenvectors of a covariance matrix is essentially identical with the latest contemporary versions of this same derivation. And Harold Hotelling’s rediscovery of this method (1933), which was published in the Journal of Educational Psychology rather than any biometrical outlet, acknowledges that the method was “first studied in connection with the perturbations of the planets,” for which precedence he cites the 19th-century French mathematician Augustin-Louis Cauchy [whom Pearson does not cite, neither does Hotelling cite Pearson]. Ironically, Procrustes analysis itself was a borrowing from psychometrics, the domain of Gower’s original paper of 1975 on “generalized Procrustes analysis.” There is no good reason to eschew mathematical browsing in the search for analytic methods responding to new biometrical queries—the same science of dynamics from which Hotelling says he has borrowed the method of PCA is the source of the modified (unsymmetric) version of the calculus that this paper relies on.
The details of these new eigenanalyses shed some light on the limitations of the procedure suggested by Burnaby (1966) for minimizing unwanted effects of size variation in multivariate systematic investigations. Burnaby’s suggestion is that the systematist search for “growth-invariant discriminant functions,” those that are “orthogonal to vectors representing variation which is extraneous to the desired comparisons.” But it remains to be clarified what is meant by “extraneous to the desired comparisons” in this ransacking (and also in exactly which geometry the fact of orthogonality is to be assessed). Figure 17 reminds us that the description of “growth” for these rodents involves two uncorrelated components of morphometric change from infancy to adulthood (one roughly linear in time, one not at all linear), and that the variation of this sample of organisms within the final “adult” stage of their observation, age 150 days, appears to align with neither of these. Nor does it match the deformation in Fig. 11, which represents the dominant age-90-to-150 stabilizing response to perturbations of the growth trajectory at 90 days. In my judgment, potential discrepancies like these are enough to disqualify any PCA-derived score from serving as a suitable covariate to be regressed out prior to systematic comparisons that might involve growth dynamics.
The language I’m suggesting for reporting these complex eigenpairs does not align well with the suggestion by Mitteroecker and Stansfield (2021) that PLS be exploited to draw a contrast between “stabilization” and “compensation” as broad categories of findings in studies of canalization. That antinomy is not adequate for discussions this detailed of growth dynamics when any of the larger form-growth eigenvalues are complex. A complex eigenvector pair, such as the first pair for the first Vilmann growth interval (Figs. 7 and 16), presents us with two different dimensions with the same optimally simple descriptive properties—a pair of real and imaginary canonical components that preserve the restriction to planes in morphospace. PLS latent vectors do not respect that coplanarity condition. Also, like principal components, PLS latent variables (LV’s) must be orthogonal within their blocks separately, but the form-growth eigenvectors of this method, even those corresponding to the real eigenvalues, need not be: it is only the matched pair of real and imaginary canonical vectors that are guaranteed orthogonal by the eigenvector scaling adopted here. Nor does PLS allow a description of the phenomenon for eigenvalues that are far from the real line, such as those in panel (C) of Fig. 15, where a pair of block-2 LV’s are linear combinations of the corresponding pair of block-1 LV’s without being aligned with them separately. Methods like PLS and PCA that derive from the singular-value theorem (Bookstein, 2018, Chap. 4) are governed by an intuition about how dimensions of form ought to assort for a different explanatory purpose (namely, least-squares explanation) that fails to cope with the mathematical structure and biomathematical purpose of these crosscovariance problems. In short, the geometry of PLS is inadequate for the morphometric analysis of growth dynamics. The task requires a more demanding mathematics than that.
The implications of this initial exploration of the extended eigenanalysis method for growth studies are broad. One recommendation is immediate: in knockout studies of genes suspected of regulating growth, more effort should be devoted to assembling explicit growth records, in order to make visible the corresponding aspects of dynamic stability. The investigator should prefer a longitudinal imaging design such as Vilmann’s over any alternative requiring sacrifice of the experimental animals at any fixed age or sequence of ages. Also, the role of the graduate student digitizing landmarks by tedious scrutiny of magnified radiographs should be superseded by state-of-the-art multiscale image analysis techniques instead. (In particular, the software will be more effective than the graduate student at tracking homologous points over time by the information in cues from nearby tissue inhomogeneities that do not rise to the level of landmarks themselves.) Such a change would also increase the experimentalist’s willingness to oversee larger samples than would otherwise be feasible, in order to reduce the standard errors of these eigenvectors and of the form-growth correlations among their scores. Collecting minimally intrusive longitudinal data is much easier now that automated processing of CT scans can retrieve most of the necessary information without relying on human-computer interaction except for intermittent visual confirmation.
A second recommendation would be the restoration of growth dynamics to the broader context of evolutionary applications of GMM. One core strategy would be replacement of MANOVA (multivariate analysis of variance) of form coordinates by the corresponding MANCOVA (multivariate analysis of covariance). Among those covariates should be the most meaningful dimension(s) of stabilization, real or complex, offered by eigenanalyses based in ancillary studies of near-adult growth in other samples. This might be particularly helpful in studies of extinct paleospecies that have extant descendants. Such analyses should not use Centroid Size as a covariate for the reasons set out in Bookstein (2021)—CS is not aligned with any plausible allometric process except possibly in fishes. Furthermore, analysis should not be of shape coordinates, which are form coordinates out of which Centroid Size has been divided. What grows is form, not shape; to understand physical development correctly, analysis needs to be in form space, not shape space—Boas coordinates, not Procrustes. The inappropriateness of Procrustes methods for studies of growth dynamics is particularly consequential in applications of GMM to paleobiology, where (except when the histology is accretionary) the principal components of growth and its regulation can only be proxied by principal components of form at whatever ages the organisms had died. So another application context for the dynamical method of this paper would be studies of orthogeneses and other selection gradients. There, findings could change substantially when multivariate analyses are calibrated against any dominant form-change eigenvectors, real or complex, that emerge from relevant longitudinal studies; or those eigenvectors might themselves be phenotypic aspects of selection.
I believe the generalized eigenanalysis offered in this paper may help fill a major gap in the methodology of multivariate developmental biology. I am aware of no canonical method prior to what this paper is offering for the quantification of organismal growth dynamics, whether genetic or ecophenotypic, that extends over multiple measurements jointly governed by some common metrological framework. The Boas coordinate system of geometric morphometrics offers an ideal testbed for the exploration of this potentially higher-dimensional version of developmental stability both because the mathematics has been familiar now for more than 150 years in a field closely related to biomechanics and bioengineering and because GMM affords so rich a range of suggestive diagrams. I eagerly anticipate the broadening of today’s developmental landmark-based research designs to embrace the multidimensional analysis of dynamic stability against perturbation that this paper has demonstrated for one data source fully half a century old, data that were gathered for other purposes entirely.
Data Availability Statement
The data analyzed here were published in full in Bookstein (1991). They are also available somewhere on Jim Rohlf’s morphometric archive.
Notes
Just in this footnote, write \(X=Z_k+\varepsilon _1,\) \(Y=Z_k+\varepsilon _2\) where \(Z_k\) is some true value for the \(k^{\mathrm{th}}\) animal, a value having a distribution over animals, and X and Y are two consecutive measurements of it with the same error distribution \(\varepsilon \) common to all the animals. Assume Z, \(\varepsilon _1\), and \(\varepsilon _2\) are all independent. On this independence assumption the variances of X and Y are both \(\mathrm{var}(Z)+\mathrm{var}(\varepsilon ),\) the covariance of X and Y is the same as \(\mathrm{var}(Z)\), and the variance of \(\varepsilon _2-\varepsilon _1\) is \(2~\mathrm{var}(\varepsilon ).\) Then the covariance of X and \(Y-X\) is just \(-\mathrm{var}(\varepsilon )\) and the correlation of X and \(Y-X\) must be \(-\mathrm{var}(\varepsilon )/\sqrt{\bigl (\mathrm{var}(Z)+\mathrm{var}(\varepsilon )\bigr )\cdot \bigl (2~\mathrm{var}(\varepsilon )\bigr )} = -\sqrt{\mathrm{var}(\varepsilon )/ 2~\bigl (\mathrm{var}(Z)+\mathrm{var}(\varepsilon )\bigr )}.\) For this to equal \(-0.5,\) we must have \(\mathrm{var}(\varepsilon )=\mathrm{var}(Z)\).
If this were truly a possible covariance matrix \(X^tX/N\) for some bivariate data set X, we would have to have \(a>0\), \(d>0,\) \(b=c,\) and \(b^2<ad.\) But none of those identities have to obtain in any of our form-growth analyses. For instance, the covariance of any form coordinate against its growth changes—a number like a or d in this toy example—is almost always negative; also, the covariance of any form coordinate—say, the fifth one—against the change in another coordinate, say the sixth one, is hardly ever equal to the covariance of that sixth form coordinate against changes in the fifth one, contravening the symmetry \(b=c\) that applies to covariance matrices \(X^tX/N.\)
In the application to growth series, the elements of matrix A will all be covariances. In a different setting, elements a through d that are independent standard Gaussians, it can be shown both by theorem and by simulation that the probability of this complex case—the probability that \((a-d)^2+4bc<0\)—is about 29.4%, which is far from rare. That the complex eigenanalyses appearing in dynamical analyses of ecological data (e.g. Strogatz, 2014; Edelstein-Keshet, 1988) ought to extend to multivariate contexts like morphometrics perhaps should have been noticed before.
Of course this is not the same meaning of the word “canonical” as in the notions of canonical correlations or canonical variates that are likely more familiar to my readership. Those concepts arise in studies of covariance structures \(X^tX/N\), not the matrices \(X^tY/N\) here. The word “canonical” has many different usages in mathematics; the usage here is borrowed from engineering dynamics.
In these examples, the two canonical components often display distinct degrees of integration. A corollary theme worth pursuing would explore possible rotations of the pairs of grid columns here to accentuate the cogency of that particular interpretation. For a simple algorithm for rotation against the spectrum of partial warps in this way, see the Appendix of Bookstein (2017).
As long as those that are that “multiple roots” are counted the appropriate number of extra times—a circumstance that cannot occur if the elements of A arise from real data, subject as they always are to measurement noise. According to Wikipedia, the first rigorous proof of this fundamental theorem dates from 1806, though Gauss (he of the “Gaussian distribution”) almost had it in 1799. It first appeared in a textbook by Cauchy in 1821, precisely two hundred years ago. The mathematics we apply can be remarkably older than our applications!
The command is simply eigen(matrix(c(-5,1.25,-1.25,-2),2,2)), followed in the complex case by a normalization of the sum of squares of each eigenvector to 1.
In this \(2\times 2\) setting, A’s eigenvalues are complex if and only if Aw is never parallel or antiparallel to w for any direction w in the plane. Rather, the action of A rotates every vector w, usually to a variable extent, in the same clockwise or counterclockwise sense.
References
Abarbanel, H. D. I., Brown, R., & Kennel, M. B. (1992). Local Lyapunov exponents computed from observed data. Journal of Nonlinear Science, 2, 343–365.
Boas, F. (1905). The horizontal plane of the skull and the general problem of the comparison of variable forms. Science, 21, 862–863.
Bookstein, F. L. (1983). The geometry of craniofacial growth invariants. American Journal of Orthodontics, 83, 221–234.
Bookstein, F. L. (1987). Describing a craniofacial anomaly: Finite elements and the biometrics of landmark locations. American Journal of Physical Anthropology, 74, 495–509.
Bookstein, F. L. (1991). Morphometric tools for landmark data: Geometry and biology. New York: Cambridge University Press.
Bookstein, F. L. (2014). Measuring and reasoning: Numerical inference in the sciences. New York: Cambridge University Press.
Bookstein, F. L. (2015). Integration, disintegration, and self-similarity: Characterizing the scales of shape variation in landmark data. Evolutionary Biology, 42, 395–426.
Bookstein, F. L. (2017). A method for factor analysis of shape coordinates. American Journal of Physical Anthropology, 64, 221–245.
Bookstein, F. L. (2018). A course of morphometrics for biologists. New York: Cambridge University Press.
Bookstein, F. L. (2019). Reflections on the biometrics of organismal form. Biological Theory, 14, 177–211.
Bookstein, F. L. (2021). Centric allometry: Studying growth using landmark data. Evolutionary Biology, 48, 129–159. https://doi.org/10.1007/s11692-020-09530-w.
Boyd, E. (1980). Origins of the study of human growth. School of Dentistry, University of Oregon Health Sciences Center, Child Study Center.
Burnaby, T. P. (1966). Growth-invariant discriminant functions and generalized distances. Biometrics, 22, 96–110.
Coddington, E. A., & Levinson, N. (1955). Theory of ordinary differential equations. New York: McGraw-Hill.
Edelstein-Keshet, L. (1988). Mathematical models in biology. New York: McGraw-Hill.
Gonzalez, P. N., & Barbeito-Andrés, J. (2021). Canalization: A central but controversial concept in evo-devo (pp. 1062–1072). In: L. Nuño de la Rosa and G. B. Müller, eds., Evolutionary developmental biology: A reference guide. Springer, Berlin.
Gold, H. J. (1977). Mathematical modeling of biological systems: An introductory guidebook. New York: Wiley.
Gower, J. C. (1975). Generalized Procrustes analysis. Psychometrika, 40, 33–51.
Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24, 417–441, 498–520.
Jackson, J. E. (1991). A user’s guide to principal components. New York: Wiley.
Jacobson, A. G., & Gordon, R. (1976). Changes in the shape of the developing vertebrate nervous system analyzed experimentally, mathematically and by computer simulation. Journal of Experimental Zoology, 197, 191–246.
Jolliffe, I. T. (2002). Principal component analysis (2nd ed.). New York: Springer.
Lande, R. (1979). Quantitative genetic analysis of multivariate evolution, applied to brain-body size allometry. Evolution, 33, 402–416.
Legendre, P., & Legendre, L. (2012). Numerical ecology (3rd English edn.). Amsterdam: Elsevier.
Margalit, D., Rabinoff, J., & Rolen, L. (2017). Interactive Linear Algebra. Georgia Institute of Technology.
Mitteroecker, P. M., & Stansfield, E. (2021). A model of developmental canalization, applied to human cranial form. PLoS Computational Biology, 17(2), e1008381. https://doi.org/10.1371/journal.pcbl.1008381.
Moss, M. L., & Vilmann, H. (1978). Studies on orthocephalization of the rat head. I. A model system for the study of adjustive cranial growth processes. Gegenbaurs Morphologisches Jahrbuch, 124, 559–579.
Moss, M. L., Skalak, R., Dasgupta, G., & Vilmann, H. (1980). Space, time, and space-time in craniofacial growth. American Journal of Orthodontics, 77, 591–612.
Moss, M. L., Skalak, R., Patel, H., Shinozuka, M., Moss-Salentijn, L., & Vilmann, H. (1984). An allometric network moel of craniofacial growth. American Journal of Orthodontics, 85, 316–332.
Moss, M. L., Skalak, R., Patel, H., Shinozuka, M., Moss-Salentijn, L., & Vilmann, H. (1983). An allometric network model of craniofacial growth. American Journal of Orthodontics, 85, 316–332.
Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2, 559–572.
Petrovic, A. G. (1972). Mechanisms and regulation of mandibular condylar growth. Acta Morphologica Neerlando-Scandinavica, 10, 25–34.
Pontryagin, L. S. (1962). Ordinary differential equations., Tr. L. Kacinskas and W. B. Counts. Boston: Addison-Wesley.
Potthoff, R. F., & Roy, S. N. (1964). A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika, 51, 313–326.
Przibram, H. (1923). Aufbau mathematischer Biologie. Abhandlungen zur theoretischen Biologie, volume 18. Berlin: Verlag von Gebrüder Borntraeger.
Reyment, R. A. (1991). Multivariate palaeobiology. Oxford: Pergamon Press.
Riolo, M. L., Moyers, R. E., McNamara, J. A., & Hunter, W. S. (1974). An Atlas of Craniofacial Growth. Craniofacial Monograph 2, Center for Human Growth and Development, University of Michigan.
Schüepp, O. (1966). Meristeme. Basel: Birkhäuser.
Strogatz, S. H. (2014). Nonlinear dynamics and chaos: With applications to physics, biology, chemistry, and engineering (2nd ed.). New York: CRC Press.
Tanner, J. M., & Davies, P. S. W. (1985). Clinical longitudinal standards for height and height velocity for North American children. Journal of Pediatrics, 107, 317–329.
Tuddenham, R. D., & Snyder, M. M. (1954). Physical growth of California boys and girls from birth to eighteen years. University of California Publications in Child Development, 1, 183–228.
Vilmann, H. (1969). The growth of the cranial base in the albino rat studied by roentgencephalometry. Journal of Zoology London, 159, 583–591.
Waddington, C. H. (1942). Canalization of development and the inheritance of acquired characters. Nature, 150, 563–565.
Acknowledgements
Thanks are due to Philipp Mitteroecker and Ekaterina Stansfield for prepublication access to their paper on canalization that stimulated my search for a truly dynamical method, and to Jim Rohlf for many helpful suggestions at the time the argument here was being designed. I am particularly grateful to Julien Claude and (again) Philipp Mitteroecker for the detailed and constructive comments in their signed reviews of a rather different earlier version of this paper. There is no external support to be reported in connection with the results here.
Funding
Open access funding provided by University of Vienna.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The author has no conflicts of interest to disclose.
Appendix: The Missing Mathematics
Appendix: The Missing Mathematics
There is a standard pedagogy for biometricians that tries to optimize the way we teach principal components analysis (PCA), the multivariate procedure most commonly encountered across the range of geometric morphometric applications. Beginning with a data set of morphometric variables, PCA is the interpretation of a certain matrix manipulation in terms of the variances of a series of derived principal component scores. The matrix being manipulated is the covariance matrix of that roster of morphometric variables, and the matrix analysis borrowed from the mathematicians (who call it eigenanalysis) is interpreted in this PCA context as proffering scores that are uncorrelated among themselves and that maximize the “explained variance” of their successive subsequences (the first one, the first two, and so on). There are several good textbooks about the applications of this technique throughout the quantitative sciences, for instance, Jackson (1991) or Jolliffe (2002), and a graduate textbook of multivariate biometric or morphometric method usually has a chapter on the topic (for instance, Bookstein, 2018, Sect. 4.2).
But this is not the mathematician’s actual construal of eigenanalysis, which emerged more than 150 years ago in the scientific context of dynamical system analysis that is the prototype of this paper’s application. Indeed, the applied mathematician will typically have encountered these notions, as I did, in the context of an undergraduate course on ordinary differential equations, e.g., Coddington and Levinson’s (1955) fifteenth chapter, on “linear real autonomous two-dimensional systems.” If A is any square matrix of numbers, p rows by p columns, eigenanalysis is the exploration of solutions of the equation
where by “solution” we mean a specific value of the constant \(\lambda \) paired with the set of all multiples of any vector v satisfying the equation. The constant \(\lambda \) is called the eigenvalue of the pair, and any of the v’s can be its eigenvector (or eigenfunction, in other contexts). In words, the action of the matrix A on one of these eigenvectors is to leave its direction unchanged but to multiply its magnitude by a factor \(\lambda .\) (Hence the particle “eigen-” in the name of the technique: it is the German prefix that corresponds roughly to the English concept of “own” or ‘individual.”) You may not even recognize equation (A1) as equivalent to the definition of the principal components (PC’s) of a covariance matrix A as you were taught them, but it is the same—the equation states the fact that the coefficients of the PC as a linear combination are exactly proportional to the covariances of the PC score with the list of the contributory variables. The mathematician’s preference for equation (A1) originates deep in the foundations of matrix algebra, where it is shown that the eigenvalues \(\lambda \) satisfying equation (A1) turn out to be the solutions of one single polynomial equation of degree p. And, according to the celebrated Fundamental Theorem of Algebra, there are always exactly p of these values.Footnote 6
But where this general mathematical approach deviates from the simplistic version we use in PCA is that those eigenvalues do not have to be positive numbers. They must be permitted to be negative numbers or even complex numbers as well, quantities like \(3+2i\) where i is “the square root of \(-1\).” If the matrix A is \(\begin{pmatrix}0&{}-1\\ 1&{}0 \end{pmatrix},\) for example, the values of \(\lambda \) entangled in the matrix equality (A1) satisfy the equation \(x^2+1=0\) whose roots are the two square roots \(+i,~-i\) of \(-1\). This is not as occult as it sounds (or it would not have the dominant role it plays in the physical applications of ordinary differential equations). The extension of the notion of a vector “multiple” to these complex values is the price we pay for extending the domain of the theorem from matrices A that are covariance matrices of data sets against themselves—the mathematician calls the members of this class the “symmetric positive-definite” matrices (in particular, their diagonal entries all have to be positive)—to the set of all p-by-p matrices whatever the numerical values of their entries. In fact, the entries of the matrix A can themselves be complex numbers and the theorem will still hold, but the applications in this paper do not require this version of the theorem. (Note, too, that the two-dimensional aspect of complex numbers is not what makes them suit the two-dimensional world of the Vilmann data. The same theorem and the same approach via canonical coordinates would apply to guarantee the possibility of eigenanalysis for form coordinates in three dimensions as well.)
The essence of the procedure encoded in equation (A1) can be made clear enough just from its simplest applications, to a matrix A that has only two rows and two columns. Consider the contrast of two examples:
While neither of these matrices can itself be a covariance matrix, owing to those negative numbers down the diagonal and the lack of symmetry off the diagonal, nevertheless \(M_1\) results in an eigenanalysis that is easily explained. From your favorite statistical computing package (I am using Splus,Footnote 7 but you may prefer R, Matlab, Mathematica, or any other matrix-savvy computational tool) we learn that the two solutions of equation (A1) for this matrix \(M_1\) can be ordered as
as in the left-hand panel of Fig. 18. Here \(v_1\) and \(v_2\) are those two eigenvectors, and \(w_1\) and \(w_2\) are the results of the action of \(M_1\) on them. One can explicitly check that they are eigenvalue-eigenvector pairs: for instance, using the \(v_1\) from equation (A3),
and similarly, by explicit arithmetic, \(w_2=M_1v_2=-2.671~v_2.\) Notice that although these eigenvectors are of unit length, they are not perpendicular, and notice too that the sum of the eigenvalues, \(-7\). is equal to the trace of the matrix \(M_1\). When \(M_1\) is replaced by a crosscovariance matrix \(X^tY/18\), the v’s will be the coefficients for the score of the initial form, and the w’s those for the growth autoregression (which applies to a list of coordinate changes having somewhat smaller variances).
This analysis is to be interpreted as follows. The action of matrix \(M_1\) on the eigenvector \(v_1\) is to multiply it by \(-4.329\), that is, to reverse its direction and magnify it a bit more than fourfold. This multiplication by \(-4.329\) can be considered the combination of a multiplication by \(-3.5\) (average of the entries on the diagonal of \(M_1\)) followed by an extension by the factor 0.829 (on the original scale) in this same direction. Similarly, the action of \(M_1\) on \(v_2\) results in a vector \(w_2\) reversed in direction along which it is scaled by a factor \(-3.5\) that is then reduced in length by this adjustment factor of 0.829. That multiplication by \(-3.5\) would take the initial v’s to the positions in the open circles; the two adjustments follow the dashed lines outward or inward, respectively, to their final positions at the w’s.
Now consider the matrix \(M_2\), which differs from \(M_1\) only in the magnitude of those opposed off-diagonal entries. Qualitatively, \(M_2\) looks like its properties ought to be the same as those of \(M_1\). But if you check those entries in detail, you notice that the discriminant \((a-d)^2 + 4bc\) of equation (A3) is now a negative number, \(-16\), so that the eigenanalysis must instead involve complex numbers. Because 16 is a perfect square, the Splus eigen command returns the simple values
where i is that “imaginary” unit, the square root of \(-1,\) and I have replaced the letter v by z for clarity in the sequel. The two \(\lambda \)’s are complex conjugates—they have the same real part but opposite imaginary parts—and their sum is still \(-7,\) the trace of the matrix \(M_2\). Likewise the two eigenvectors are complex conjugates entry by entry. (If you have not seen complex numbers before, there is a very terse overview in Bookstein (2018), pp. 370–371.) And again, following the rules of complex arithmetic, one can check that these are eigenvalue-eigenvector pairs: e.g., for the \(z_1\) in equation (A4),
where the multiplications and additions are all carried out according to the rules of complex arithmetic, including the identity \(i^2=-1.\)
What could this mean? That question is answered in the right-hand panel of Fig. 18, which lays out for the \(M_2\) case the same sort of diagram that applied for the visualization of the effect of \(M_1\) to its left. The vectors \(v_1\) and \(v_2\) are now not the eigenvectors \(z_1,~z_2\) explicitly reported by the eigenanalysis, but instead the very useful pair of auxiliary vectors corresponding to what the applied mathematicians call the real canonical form of this complex eigenvalue pair. In general (and in particular in the analogous examples of our Vilmann finding) these are just the real and imaginary parts of either eigenvector considered separately. As it happens, in this example the entries derive from remarkably simple fractions. After the normalization to sum of squares 1.0, the eigenvector \(z_1\) rotates and rescales to \(z_1'=(-0.8165-0.4082i,0.8165-0.4082i)\) (the symmetry owing to the skew-symmetry of \(M_2\)) and hence canonical vectors \(v_1=-0.8165\times (1,-1)\) and \(v_2=-0.4082\times (1,1)\): vectors with elements in a ratio of \(\pm 1\) and lengths in a ratio of 2:1 pointing southwest and northwest after the negative scaling. (One can check that \(z_1'\) continued to be an eigenvector of
As shown in the right-hand panel of Fig. 18, the effect of multiplication by \(M_2\) is once again the composition of two different components of which the first, a multiplication by \(-3.5,\) is the same as it was at left. But the second component is no longer a revision of this common factor by equal and opposite adjustments. Instead, you see that the vector \(-3.5v_1\) is adjusted by \(2v_2\), and, at the same time, the vector \(-3.5v_2\) is adjusted by \(-2v_1\). These final positions are \(w_1=\begin{pmatrix}2.04\\ -3.674\\ \end{pmatrix}\) and \(w_2=\begin{pmatrix}3.062\\ -0.204\\ \end{pmatrix}\), respectively, the vector of real parts and the vector of imaginary parts of the product \(M_2~z_1'\). These w’s are no longer aligned with the intermediate quantities \(-3.5v\) plotted at the open circles;Footnote 8 and the adjustments, while perpendicular, are not of equal length or equal leverage around the origin. Had the vectors \(v_1\) and \(v_2\) been of equal length, the adjustments (the dotted vectors in Fig. 18) would likewise have been of equal length—a rotation of the vectors w from positions \(-3.5v\) to their final values. When the lengths of the v’s are not equal, as in this example, the adjustments are to be interpreted as displacements that represent a “sheared rotation” as in the main text. Notwithstanding that disproportion, the vectors to the open circles are displaced from their starting positions in the same sense (here, clockwise). Yet the effect on their interpretation is disproportionate. While the effective stabilization along \(v_1\) is almost unaffected in slope, that along \(v_2\) is nearly doubled—a situation resembling panel (C) of Fig. 15.
When the diagonal elements of a matrix like these M’s are unequal, the eigenvalues and eigenvectors are complex if the squared difference of the diagonal elements is less than \(-4\) times the product of the offdiagonal terms; otherwise the analysis will look like the left-hand version in the figure. This will always be the case when the offdiagonal entries have the same sign (since a positive quantity cannot be less than a negative quantity), and is sometimes true even when their signs are opposite, as in the case of \(M_1\). A range of other examples may be found in Margalit et al. (2017). For a complete survey of the topologically distinct possibilities, including all the special cases, see pp. 371–375 of Coddington and Levinson (1955). There is an accessible introduction to the range of realistic possibilities in Edelstein-Keshet, 1988, Secs. 5.6–5.8, and another survey, including the very interesting limiting case of only a single eigenvector, in Strogatz (2014), Secs. 5.1–5.2.
The analysis for matrices A that are larger than \(2\times 2\) is, in general, a combination of these two configurations, together with an assortment of special cases of probability zero, across one-dimensional or two-dimensional subspaces of the full vector space of linear combinations. For matrices A that are \(3\times 3,\) or any higher count of rows and columns that is odd, since the complex roots must come in conjugate pairs, there has to be at least one real eigenvalue paired to a correspondingly real eigenvector. When a data space has dimensions of no variance, such as the dimensions for centering and for specimen rotation that are annihilated by the Boas registration procedure, the corresponding eigenvalues are zero and their eigenvectors any set of perpendicular “directions” in this null space. This is why the graph of eigenvalues for the eight-landmark Vilmann data, Fig. 9, stops at eigenvalue 13, and also, because 13 is an odd number, why each of these analyses must have at least one real eigenvector (an exact autoregression), even if it is not the one of largest eigenvalue magnitude.
Two properties of the familiar PCA are shared with this more general construal of eigenanalysis. First, when the matrix A is indeed a covariance matrix \(X^tY/N\) where X and Y are centered data sets arising from the same Boas registration, the analysis is contravariant against rotation of both data sets by the same matrix. (“Contravariant” here means that the eigenvalues don’t change, whereas the eigenvectors rotate by exactly the inverse of the rotation assigned to the geometric data.) This is an important guarantee, since the actual assignment of a “horizontal” to a Boas analysis or a Procrustes analysis is completely arbitrary. The second useful property is that, just as for the familiar PCA, the sum of the eigenvalues, real or complex, is exactly equal to the sum of the diagonal elements of the matrix A. (For instance, in the analyses of both \(M_1\) and \(M_2,\) the sum of the eigenvalues had to be \(-7.\)) So when the signs of the diagonal of A are homogeneous, or nearly, we can partition the total over the full range of eigenvalues, with complex conjugates, of course, counting twice. But one important property of principal components does not persist into this more general context. In general, real eigenvectors of a matrix \(X^tY/N\) are not orthogonal (perpendicular), nor are canonical vectors deriving from different eigenvalues. This constitutes a substantial contrast with PCA or PLS. (This nonorthogonality also characterizes the canonical variates deriving from the matrix \(BW^{-1}\) mentioned two paragraphs below.)
Although the growth data to which this article applies the novel eigenanalysis are open trajectories, from infancy through adulthood, the mathematics of these analyses arose in the context of the much deeper problem of analyzing quasiperiodic systems, systems like those of the planets orbiting the Sun, along with the chaotic systems beloved of young applied-mathematical explorers since the advent of personal computer graphics. It was the Russian mathematician Aleksandr Lyapunov (), 1855–1918, who first noticed that the points in which an almost-periodic orbit would pierce a plane normal to their orbit would in many cases obey a discrete process closely analogous to the matrix model here. In the modern rephrasing (cf. Pontryagin, 1962), an orbit would be stable if all of the associated eigenvalues (“Lyapunov exponents”), both real and complex, had absolute values less than 1; otherwise the orbits would be unbounded or, if bounded, chaotic. In organismal biology, that concept of stable orbits proves a useful model for physiological cycles such as heartbeats. In that literature, the eigenvalues of analyses like ours, which are observed as Jacobian derivatives of change over time rather than cycle-to-cycle perturbations, are called “local Lyapunov exponents.” See, for instance, Abarbanel et al., 1992.
It is useful to contrast the general eigenproblem \(Av=\lambda v\) here with three much more strongly patterned special cases familiar from the usual graduate textbooks of multivariate biometrics. First, of course, is principal components analysis, the application to covariance matrices \(A=X^tX/N\) where X is a centered data matrix on N specimens. Second would be a different generalization, the singular-value decomposition (SVD) \(A=UDV^t\) where A, \(n \times p,\) is any matrix, D is an \(n\times p\) diagonal matrix of nonnegative singular values, and U and V are two different orthonormal matrices, the first one \(n\times n\) and the second \(p \times p.\) The technique of Partial Least Squares is the analysis of a general crosscovariance matrix \(A=X^tY/N\) by this method, where the data matrices X and Y involve the same specimens but are not otherwise restricted. Though in PLS the columns u, v of the matrices U and V are paired, there is no machinery for assessing one as the “opposite” of the other, and hence no equivalent of the discussion of stability as in the present Vilmann example. Finally, the technique of canonical variates analysis is the analysis of a nonsymmetric square matrix \(A=BW^{-1}\) where W is the matrix of “between-group covariances” (covariances of group averages) and W the pooled within-group covariances of the same variables. Analysis proceeds by converting the problem to a different covariance structure, \(A'=W^{-1/2}BW^{-1/2},\) extracting its principal components by the usual approach, and then interpreting the resulting eigenvectors by a postmultiplication by the remaining postfactor of \(W^{-1/2}.\) There is no role for complex numbers in any of these three conventional maneuvers.
When the matrix A is the covariance matrix of a set of Boas coordinates against their growth changes, the biological interpretation of a finding like that in the left panel of Fig. 18, one negative eigenvalue at a time, is similar to what we are accustomed to saying about a principal component. One reports that the effect of growth is to attenuate the variability around the average form by a reversion of covariance \(\vert \lambda \vert \) toward the mean in the specific direction of form space (linear combination of the Boas coordinates) given by the accompanying eigenvector v. This phenomenon can be reported as an ordinary stabilizing autoregression after the fashion of the Ornstein–Uhlenbeck models known to the field from their evolutionary applications (cf. Lande, 1979). One easy visualization is as in Fig. 5: the negative dependence of the change score in the direction given by coefficients v against its original value in this same direction. Surveying the entire Vilmann growth history over its full age range, one also encounters two autoregressions of this type that have positive slopes instead, but the corresponding covariances are too weak to be worth reporting.
Another echo of a property from PCA is the role of an eigenvalue as a covariance. Specifically, if \(\lambda \) is an eigenvalue of a matrix \(A=X^tY/N\) associated with the eigenvector v, scaled to have sum of squared loadings 1.0, then the covariance of the scores on mean-centered blocks X and Y with coefficients given by the vector v must be equal to \(\lambda .\) (This is easy to show algebraically. If \(Av=\lambda v,\) then \(v^tAv\) must be equal to \(v^t \lambda v = (v^tv)\lambda =\lambda ,\) since the eigenvector v was normalized to sum of squares \(v^tv=1.\) But \(v^tAv = v^t(X^tY/N)v=(Xv)^t(Yv)/N,\) which is just the formula for the covariance of the pair of linear combinations of the variables of the blocks X and Y by the same vector v of loadings. This is a variant of the same verification applying to the role of the SVD in the method of Partial Least Squares: cf. Bookstein, 2014, p. 371.) One can confirm this identity in the rightmost panels of Fig. 5 and later figures of the same layout.
When the eigenvalues are complex, the verbal interpretation needs to be a bit more complex as well. As in the right-hand panel of Fig. 18, the reportage must refer to both relevant dimensions of the canonical form, one for the real canonical vector (the real part of either eigenvector) and the other for the imaginary canonical. Assume (as is often the case for these Vilmann data) that the real component of the eigenvalue is a large negative number while the imaginary component is smaller (Fig. 7 or 15A). Then we can interpret the situation as if it pertains to two eigenvalues that are “almost the same”—the stabilizing of perturbations along both the \(v_1\) direction and the \(v_2\) direction in the right-hand panel of Fig. 18 by the same substantial attenuation factor—except for the same small sheared rotation along the dashed vectors that applies to both of the transformations there.
It is still the case that the covariance of the two scores generated by one of these complex eigenvectors, one for the X-block of forms and the other for the Y-block of form changes, must equal the eigenvalue. But now that eigenvalue consists of two terms that arise in two different ways. The real part of the eigenvalue is the covariance of the form’s score with the change’s score on the real part of the eigenvector minus the covariance of the two scores on the imaginary part of the eigenvector (because \(i^2=-1\)). At the same time, the imaginary part of the eigenvector is the sum of the covariance of the real part of the form score against the imaginary part of the change score and the imaginary part of the form score against the real part of the change score. This arithmetic was verified in Figs. 7, 13, and 14.
When the dominance of the real component of \(\lambda _1\) over the imaginary component is great enough (as it is for growth between ages 7 and 14, 30 and 40, and 60 and 90 days), the eigenanalysis corresponding just to the covariances reported in the panels corresponding to the right side of Fig. 18 is mainly real and is dominated, as one would expect, by the one-dimensional attenuation (stabilization) of the approximating single real canonical score. By comparison, Fig. 2.51 of Bookstein (2018) visualizes one striking special case of the complex alternative, a spiral process (still of Ornstein–Uhlenbeck form) in which the resetting of the relation to the mean is not any sort of “regression” but instead adjustment by a matrix \(\begin{pmatrix}\cos \theta &{}\sin \theta \\ -\sin \theta &{}\cos \theta \\ \end{pmatrix}\) representing rotation by some angle \(\theta \) without either increase or decrease of expected distance from the origin. A process governed by a matrix like that is often pertinent to studies in ecology (see, for instance, the detailed discussion of competition in Gold (1977), Sec. 7.7 or Strogatz (2014), Sec. 6.4), but would not be so plausible a finding if encountered in auxology.
Programming Suggestions (The following typed text, in the notation of Splus or R, is specific to the Vilmann data set; for your project, all counts are to be adjusted appropriately.) Begin with two arrays each 18 cases by 16 columns. The first eight columns of the matrix boas.fits.16 are the horizontal coordinates of the eight Boas coordinate pairs, followed by all their vertical coordinates. Coordinates of the matrix boas.changes.16 should be the changes in these coordinates from one wave of observation to the next. After centering, these play the roles of the matrices X, Y of the text.
To get the covariance matrix \(X^tY/N\), just ask for it:
\(\mathtt{covar.fit.chg<{-}var(boas.fits.16,boas.changes.16,unbiased=F)~}\)
(In statistical packages such as R or Splus, the var function will ordinarily return \({n\over n-1}\) times this expression, the version with \(n-1\) in the denominator instead of n. This adjustment—division of the total sum of crossproducts by 17 instead of 18, for our Vilmann example—makes little difference for any of the interpretations yielded by the method here, as it applies equivalently to the form-vs.-change covariances, the eigenvalues, and the covariances of scores. The option unbiased=F in the var call circumvents this problem neatly.) Likewise, to get the eigenanalysis of this matrix, again just ask:
\(\mathtt{eigen.16<{-}eigen(covar.fit.chg)}\)
This single routine is doing all the real work. In Splus, at least, it returns a sequence \(\mathtt{evals<{-}eigen.16}\) $ \(\mathtt{values}\) of 16 variously real and complex numbers, of which the last three are within a few quadrillionths of zero, and also a matrix \(\mathtt{evecs<{-}eigen.16}\) $ \(\mathtt{vectors}\), \(16\times 16,\) of the associated eigenvectors. (If any of the eigenvectors are complex, none of them are normalized to length 1.0; but they should all immediately be normalized both for the interpretation as covariances of the paired scores and, as the main text explained, to guarantee the orthogonality of the canonical vectors.) The eigen output fills the roles of the \(\lambda \)’s and v’s of the main text. When the eigenvalue is complex, so is the associated eigenvector; these are output in conjugate pairs as consecutive entries of evals and consecutive columns of evecs. (In Splus these eigenpairs are returned in descending order of modulus; in R, I am told, the reverse order.)
For real eigenvalues, the displacements at left and center in figures like Fig. 5 corresponding to the \(j{\mathrm{th}}\) eigenvalue and eigenvector are from
\(\mathtt{boas.mean<{-}apply(boas.fits,2,mean)}\),
the eight mean coordinates as complex numbers, to the same boas.mean shifted by an arbitrary multiple of
\(\mathtt{complex(r=evecs[1:8,j],i=evecs[9:16,j])}.\)
The assignment of the first 8 slots to the Boas x-coordinates and the last 8 to the y’s is, of course, the same as in the Boas vectorization that was the first step above.
For complex eigenvalues, of which you need draw only the first of each conjugate pair, the displacement and grid diagrams in the first two columns of figures like Fig. 7 show displacements from the same boas.mean by the real canonical vectors u computed as
\(\mathtt{complex(r=Re(evecs[1:8,j]),i=Re(evecs[9:16,j]))}\) (top row)
and the imaginary canonicals v computed as
\(\mathtt{complex(r=Im(evecs[1:8,j]),i=Im(evecs[9:16,j]))}\) (bottom row).
For the purposes of Figs. 15 and 16 these are typed as r and i, but in the equations, where i is reserved for \(\sqrt{-1}\), they are written u and v.
For a real eigenvalue evals[j], the corresponding scores are produced as
\(\mathtt{boas.fits.16\%{*}\%evecs[,j]}\) (horizontal in the right panel) and
\(\mathtt{boas.changes.16\%{*}\%evecs[,j]}\) (vertical).
For complex eigenvalues, the two canonical scores are the expressions
\(\mathtt{boas.fits.16\%{*}\%Re(evecs[,j])}\) and \(\mathtt{boas.fits.16\%{*}\%Im(evecs[,j])}\)
and their changes are the same formulas in boas.changes.16.
All the grids of this paper were produced by my twenty-year-old Fdrawtps utility. Every GMM package has its own version of this; there is nothing special about mine.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Bookstein, F.L. A New Method for Landmark-Based Studies of the Dynamic Stability of Growth, with Implications for Evolutionary Analyses. Evol Biol 48, 428–457 (2021). https://doi.org/10.1007/s11692-021-09548-8
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11692-021-09548-8