Abstract
A general theory is developed for the eigenvalue effective size (\(N_{eE}\)) of structured populations in which a gene with two alleles segregates in discrete time. Generalizing results of Ewens (Theor Popul Biol 21:373–378, 1982), we characterize \(N_{eE}\) in terms of the largest non-unit eigenvalue of the transition matrix of a Markov chain of allele frequencies. We use Perron–Frobenius Theorem to prove that the same eigenvalue appears in a linear recursion of predicted gene diversities between all pairs of subpopulations. Coalescence theory is employed in order to characterize this recursion, so that explicit novel expressions for \(N_{eE}\) can be derived. We then study \(N_{eE}\) asymptotically, when either the inverse size and/or the overall migration rate between subpopulations tend to zero. It is demonstrated that several previously known results can be deduced as special cases. In particular when the coalescence effective size \(N_{eC}\) exists, it is an asymptotic version of \(N_{eE}\) in the limit of large populations.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
The effective size \(N_e\) was introduced by Wright (1931, 1938) as the size of an ideal homogeneous population with the same rate of loss of heterozygosity per generation as the studied population. It has become one of the most important parameters in population genetics and conservation biology, as reviewed for instance by Crow and Denniston (1988), Orrive (1993), Caballero (1994), Wang and Caballero (1999), Waples (2002) and Charlesworth (2009).
Several closely related variants of \(N_e\) exist, and Crow (1954) first distinguished between the inbreeding effective size \(N_{eI}\), the quantity originally defined by Wright, and the variance effective size \(N_{eV}\). He also introduced a random extinction parameter that quantifies the long term rate at which genetic variants are lost. It is equivalent to the eigenvalue effective size \(N_{eE}\), defined in terms of the largest non-unit eigenvalue of a Markov chain of allele frequencies (Ewens 1982, 2004). The nucleotide diversity or mutation effective size \(N_{e\pi }\) is essentially the expected coalescence time of a pair of haploid individuals (Ewens 1989; Durrett 2008), whereas the coalescent effective size \(N_{eC}\) is defined for populations such that the ancestral tree of any finite number of individuals converges to a Kingman coalescent in the limit of large populations (Kingman 1982; Nordborg and Krone 2002; Sjödin et al. 2005; Wakeley and Sargsyan 2009; Hössjer 2011).
In this paper we provide a general theory of \(N_{eE}\) for mutation free structured populations, in which a selectively neutral marker (referred to as a gene) with two variants or alleles segregates. The population consists of \(s\) homogeneous subpopulations (geographic sites, age classes, sexes or combinations thereof) and evolves in discrete time, with constant census sizes of all subpopulations.
Ewens (2004) reviewed results on \(N_{eE}\) for homogeneous populations, showing that it agrees with \(N_{eI}\) and \(N_{eV}\) for the Wright-Fisher model (Wright 1931; Fisher 1958), Kimura’s multi-hypergeometric model (Kimura 1957), conditional branching process models (Karlin and McGregor 1965) or more generally models in which offspring numbers are exchangeable (Cannings 1974).
Results on \(N_{eE}\) for structured populations are less complete. Ewens (1982) showed that \(N_{eE}\) may differ from \(N_{eI}\) and \(N_{eV}\) for two-sex models. Cabellero and Hill (1992) and Nagylaki (1995) considered a number of diploid models and derived formulas for an effective size based on the long term decay of heterozygosity. Chesser et al. (1993) and Wang (1997a, b) analyzed the island model with two sexes. They derived linear recursion formulas for the inbreeding coefficient and the coancestry of individuals from the same and different subpopulations, and computed an effective size from the largest eigenvalue of this recursion. Felsenstein (1971) computed the effective size for models with \(s\) age classes, and found the effective size from the largest eigenvalue of a linear recursion of \(s^2\) non-identity by descent probabilities of genes drawn with replacement from all pairs of age groups. Maruyama (1970a) derived a similar effective number for the circular stepping stone model under large population and small migration rate limits. Tufto et al. (1996) and Tufto and Hindar (2003) defined the eigenvalue effective size from a linear recursion of covariances between of all pairs of subpopulations.
All these notions of effective size are derived in terms the largest eigenvalue \(\lambda \) of linear recursions of covariances or probabilities of identity by descent or state. Whitlock and Barton (1997) showed that these linear recursions are closely related. They also argued briefly that the transition matrix of the Markov chain of allele frequencies has its largest non-unit eigenvalue equal to \(\lambda \), and therefore all effective sizes of the previous paragraph agree with \(N_{eE}\).
Motivated by the argument of Whitlock and Barton (1997), our main purpose in this paper is to provide a general framework for exact and asymptotic computation of \(N_{eE}\) for a large class of structured populations with stochastic backward migration and exchangeable reproduction within subpopulations. In Sect. 2 we introduce the population genetic model and Ewens’ definition of \(N_{eE}\) in terms of the rate at which the Markov process of allele frequencies in all subpopulations reaches an absorbing state, quantified by the largest non-unit eigenvalue \(\lambda \) of its transition matrix. In Sect. 3 we focus on gene diversities, i.e. probabilities of genes not being identical by state. We introduce an \(s^2\)-dimensional deterministic process of predicted gene diversities and prove that whenever it is a linear recursion, \(\lambda \) also equals the largest eigenvalue of the matrix \({\varvec{A}}\) of this recursion. In Sect. 4 we show how the elements of \({\varvec{A}}\) are obtained from coalescence theory in new settings that generalize previous work, as illustrated with several examples in Sect. 5. Asymptotic approximations of \(\lambda \) and \(N_{eE}\) are obtained from perturbation theory of eigenvalues of matrices in Sect. 6, when either the population gets large and/or the migration rate gets small. This gives novel asymptotic expressions for \(N_{eE}\) that, for instance, in the limit of large populations agrees with the coalescence effective size \(N_{eC}\) when the latter exists. A discussion follows in Sect. 7 and proofs are collected in the “Appendix”.
2 Model of reproduction, migration and allele frequency change
Consider a population of \(N\) individuals, divided into \(s\) subpopulations
of constant sizes \(N_1=Nu_1,\ldots ,N_s=Nu_s\), with \(u_i\ge 0\) and \(\sum _{i\in \mathcal{I}} u_i = 1\). Each individual carries two copies of a selectively neutral gene so that subpopulation \(i\) has a total of \(2N_i\) genes.
The population evolves in discrete time (not necessarily generations) \(t=0,1,\ldots \), with the genes of each subpopulation \(k\in \mathcal{I}\) at time \(t-1\) numbered \(g=1,\ldots ,2N_k\), and \(\nu _{tkig}\) referring to the number of offspring of gene \(g\) that migrate to subpopulation \(i\) at time \(t\). The total gene flow from \(k\) to \(i\) between \(t-1\) and \(t\) is summarized by the backward migration rate
i.e. the fraction of genes a time \(t\) and subpopulation \(i\) that originate from \(k\) at time \(t-1\). The matrix \({\varvec{\mathcal {B}}}_t=(\mathcal{B}_{tij})\) is referred to as the observed backward migration matrix at time \(t\). Since its row sums are one, it is the transition matrix of a Markov chain with state space \(\mathcal{I}\).
Let \({\varvec{\nu }}_{tkg}=(\nu _{tk1g},\ldots ,\nu _{tksg})\) summarize the frequency distribution of the offspring of gene \(g\) of subpopulation \(k\) at time \(t-1\) in all subpopulations. Assume that \(\{{\varvec{\nu }}_{tkg}\}_{g=1}^{2N_k}\) are exchangeable random vectors, conditionally on \(\varvec{\mathcal {B}}_t\), and that \(\{\varvec{\mathcal {B}}_t\}\) are independent and identically distributed random matrices with
where \({\varvec{B}}=(B_{ik})\), the expected backward migration matrix, is the transition matrix of a Markov chain with state space \(\mathcal{I}\). We assume that the \({\varvec{B}}\) is irreducible and aperiodic, with a unique equilibrium distribution \({\varvec{\gamma }}= (\gamma _1,\ldots ,\gamma _s)\) satisfying \(\gamma _i\ge 0\), \(\sum _{i=1}^s \gamma _i = 1\) and
It follows from (2) that the observed forward migration rate between \(k\) and \(i\), i.e. the expected number of offspring of the genes in subpopulation \(k\) at time \(t-1\) that at time \(t\) end up in subpopulation \(i\), is
In order to keep subpopulation sizes constant over time, the total contribution in (5) from all parental populations \(k\) must be constant, i.e.
Let \({\varvec{\mathcal {M}}}_t = (\mathcal{M}_{tki})\) be the observed forward migration matrix of time \(t\). It follows from (3) and (5) that the corresponding expected forward migration matrix \({\varvec{M}}=E(\varvec{\mathcal {M}}_t)\) has elements \(M_{ki}=E(\nu _{tkig})\) related to those of \({\varvec{B}}\) as
for \(1\le k,i \le s\). Taking expectations on both sides of (6), we find that the vector \({\varvec{u}}=(u_1,\ldots ,u_s)\) of relative subpopulation proportions satisfies
a left eigenvector of \({\varvec{M}}\) with eigenvalue 1. The two vectors \({\varvec{u}}\) and \({\varvec{\gamma }}\) are identical for conservative migration (Nagylaki 1980), but in general they differ.
Consider a biallelic genetic marker, and let \({\varvec{X}}_t = (X_{t1},\ldots ,X_{ts})^\prime \) be a column vector of (relative) frequencies of one of the two alleles in all subpopulations at time \(t\), where prime denotes transposition. Since \(\{{\varvec{\nu }}_{tkg}\}_{g=1}^{2N_k}\) are exchangeable, we may number the genes of subpopulation \(k\) and time \(t-1\) so that the first \(2N_kX_{t-1,k}\) have the specified allele. Then the allele frequency drift from one time point to the next can be summarized as
The following result is a simple consequence of (3), (5) and (9):
Proposition 1
Assume that \(\mathcal{B}_t\) is independent of \(\{{\varvec{X}}_{s}\}_{s\le t-1}\). Then the sequence \(\{{\varvec{X}}_t\}\) of allele frequencies satisfies
where \({\varvec{x}}\) is a column vector of allele frequencies of length \(s\).
We can rephrase Proposition 1 as \(\{{\varvec{X}}_t\}\) being a vector-valued autoregressive process of order 1 (Brockwell and Davis 1991). This process will be heteroscedastic, since the covariance matrix \(\text{ Var }({\varvec{X}}_{t}|{\varvec{X}}_{t-1}={\varvec{x}})\) varies with \({\varvec{x}}\). The dynamics of \(\{{\varvec{X}}_t;\, t\ge 0\}\) is described more generally by means of a time homogeneous Markov chain with a state space
of size \(|\mathcal{X}|= \prod _{i=1}^s (2N_i +1)\), and a transition kernel \({\varvec{P}}= \left( P({\varvec{x}},{\varvec{y}})\right) \), with elements \(P({\varvec{x}},{\varvec{y}}) = P({\varvec{X}}_{t}={\varvec{y}}|{\varvec{X}}_{t-1}={\varvec{x}})\) for all \({\varvec{x}},{\varvec{y}}\in \mathcal{X}\). Since our model is free of mutations and no subpopulation is isolated, sooner or later one of the two alleles will be fixed in all subpopulations. This can be phrased as \(\{{\varvec{X}}_t\}\) having two absorbing states \({\varvec{0}}=(0,\ldots ,0)\) and \({\varvec{1}}=(1,\ldots ,1)\), so that \({\varvec{P}}\) is reducible with two stationary distributions \(\pi _1({\varvec{x}}) = 1_{\{{\varvec{0}}\}}({\varvec{x}})\) and \(\pi _2({\varvec{x}}) = 1_{\{{\varvec{1}}\}}({\varvec{x}})\), one for each of the absorbing states, with \(1_{\mathcal{Y}}({\varvec{x}})\) the indicator function of \(\mathcal{Y}\subset \mathcal{X}\). Write \({\varvec{\pi }}_i = (\pi _i({\varvec{x}});\, {\varvec{x}}\in \mathcal{X})\) for the corresponding two row vectors of length \(|\mathcal{X}|\). Since \({\varvec{\pi }}_i = {\varvec{\pi }}_i{\varvec{P}}\) for \(i=1,2\), they are left eigenvectors of \({\varvec{P}}\) with eigenvalue 1. We divide \(\mathcal{X}= \cup _{i=1}^n \mathcal{X}_i\) into components \(\mathcal{X}_1=\{{\varvec{0}}\},\mathcal{X}_2=\{{\varvec{1}}\},\mathcal{X}_3,\ldots ,\mathcal{X}_n\) that induce a block form
of the transition matrix, with zero blocks above the diagonal. For any function \(\phi : \mathcal{X}\rightarrow {\mathbb {R}}\), let \(\phi = (\phi ({\varvec{x}});{\varvec{x}}\in \mathcal{X})^\prime \) be a column vector of function values. Then \({\varvec{P}}\) acts as an operator \(\phi \rightarrow {\varvec{P}}\phi \) on \({\mathbb {R}}^\mathcal{X}\), as
where \(E_{{\varvec{x}}}\) denotes expectation conditionally on \({\varvec{X}}_0={\varvec{x}}\), cf. e.g. Norris (2008). In particular, the column vectors generated from
are both right eigenvectors of \({\varvec{P}}\) with eigenvalue 1, i.e. \(\phi _i = {\varvec{P}}\phi _i\) for \(i=1,2\). Indeed, this follows from (4) and (10), since
and similarly for \({\varvec{\phi }}_1\).
In order to find the rate of fixation of one of the two alleles, we need to look at \({\varvec{P}}^t\) for large \(t\). We apply Markov chain theory and find this rate among all possibly complex-valued eigenvalue of \({\varvec{P}}\), as the largest non-unit one. More specifically, in the “Appendix” we use Perron–Frobenius Theorem (see for instance Cox and Miller 1965) as a main ingredient for establishing the following:
Theorem 1
Suppose the square submatrices \({\varvec{P}}_{ii}\) in (11) along the diagonal are irreducible and aperiodic with at least one row sum less than one, for \(i=3,\ldots ,n\). Then the eigenvalues \(\lambda _i=\lambda _i({\varvec{P}})\) of \({\varvec{P}}\) (including multiplicity), can be ordered as
with
Moreover, if the maximum in (14) is attained uniquely for \(i=k\), then
as \(t\rightarrow \infty \), where \({\varvec{R}}_{ii}={\varvec{0}}\) for \(i\ne k\), \({\varvec{R}}_{kk} = \phi _k{\varvec{q}}_k\), \({\varvec{q}}_k=\left( q_k({\varvec{x}});\, {\varvec{x}}\in \mathcal{X}_{k}\right) \) is a row vector and \(\phi _k = \left( \phi _k({\varvec{x}}); \,{\varvec{x}}\in \mathcal{X}_k\right) ^\prime \) a column vector, both with strictly positive elements, \({\varvec{R}}_{ij}\) has non-negative elements for \(j\ge 3\), and the remainder term is a matrix with all its \(|\mathcal{X}|^2\) elements of smaller order than \(\lambda _3^t\).
The requirement on \({\varvec{P}}\) in Theorem 1 is very weak, essentially that \(\mathcal{X}_3,\ldots ,\mathcal{X}_n\) contain transient states, so that no subpopulation is isolated and eventually one of the two alleles will be fixed in all subpopulations. When \(m=3\) there is only one component of transient states, and migration is then possible within a finite number of time steps, back and forth between any pair of subpopulations. A recursive formula is provided in the “Appendix” for all \({\varvec{R}}_{ij}\), and for \({\varvec{R}}_{kk}\) we can normalize the two vectors \({\varvec{\phi }}_k\) and \({\varvec{q}}_k\) so that \(\sum _{{\varvec{x}}\in \mathcal{X}_k} q_k({\varvec{x}}) = \sum _{{\varvec{x}}\in \mathcal{X}_k} \phi _k({\varvec{x}})q_k({\varvec{x}})=1\). Then \({\varvec{q}}_k\) is the quasi equilibrium distribution
of \({\varvec{X}}_t\) conditionally on starting and remaining in \(\mathcal{X}_k\) (Darroch and Seneta 1965; Collet and Martinez 2013). The quasi equilibrium distributions for the other \(\mathcal{X}_3,\ldots ,\mathcal{X}_n\) are part of the remainder term of (15).
The following important corollary of Theorem 1 deals with the asymptotic decay rate of the expected value of \(\phi ({\varvec{X}}_t)\) for a large class of functions:
Corollary 1
Let \(\phi :\mathcal{X}\rightarrow {\mathbb {R}}\) be a function satisfying
where \(\mathcal{X}_k\) is the component for which the maximum in (14) is attained. Then
where \({\varvec{R}}=({\varvec{R}}_{ij})=\left( R({\varvec{x}},{\varvec{y}})\right) \) is the matrix in (15), and \(E_\pi \) denotes expected value conditional on \(P_\pi ({\varvec{X}}_0={\varvec{x}})=\pi ({\varvec{x}})\). In particular, the right hand side of (18) is strictly positive if \(\pi (\mathcal{X}_k)>0\).
Corollary 1 shows that the largest non-unit eigenvalue \(\lambda _3=\lambda _3({\varvec{P}})\) determines the rate of decrease of the expected value \(E_\pi (\phi ({\varvec{X}}_t))\) as \(t\rightarrow \infty \). Putting \(\phi ({\varvec{x}})=1_{\{{\varvec{x}}\notin \mathcal{X}_1\cup \mathcal{X}_2\}}\), we notice that the probability of non-fixation decreases with \(t\) at this rate, so that \(\lambda _3\) is the rate of fixation and the eigenvalue of main interest. We will often simplify notation and write
The Wright-Fisher (WF) model is a homogeneous population (\(s=1\)) of \(N\) diploid individuals, with \(X_{t}|X_{t-1} \sim \text{ Bin }(2N,X_{t-1})/(2N)\). Feller (1951) found all the eigenvalues of the transition matrix for the WF model, and in particular
For an allele frequency process \(\{{\varvec{X}}_t\}\) with transition matrix \({\varvec{P}}\), we define the eigenvalue effective size
as the size of a WF population for which the largest non-unit eigenvalue in (20) is the same as for the studied population.
3 Rate of decay of predicted gene diversities
Following Nei (1973, 1987), we define the gene diversity
between subpopulations \(i\) and \(j\) at time \(t\) as the probability that two randomly chosen genes from \(i\) and \(j\) (picked with replacement if \(i=j\)) are different by state, i.e. have different types of alleles. Regarding \(t=0\) as present and \(t>0\) as future, let
be the predicted gene diversity between \(i\) and \(j\) at time \(t\), given an initial distribution \({\varvec{X}}_0\sim \pi \). We collect the predicted gene diversities between all pairs of subpopulations into a column vector
of length \(s^2\), where vec is the vectorization operator that converts a matrix into a column vector by stacking its columns on top of each other. In order to compute linear combinations of the elements of \({\varvec{H}}_t\), we define weights \(W_{ij}\) for all pairs of subpopulations, and prove the following:
Proposition 2
Suppose \({\varvec{W}}= \text{ vec }\left( (W_{ij})_{i,j=1}^s\right) ^\prime \) is a row vector of length \(s^2\) with non-negative weights \(W_{ij}\ge 0\) satisfying the symmetry condition \(W_{ij}=W_{ji}\) for all \(i,j\), and let \(\phi _{{\varvec{W}}}({\varvec{x}}) = 2\sum _{i,j=1}^s W_{ij}x_i(1-x_j)\) be a quadratic functional of \({\varvec{x}}=(x_1,\ldots ,x_s)^\prime \). Then
To see the importance of Proposition 2, we notice that a sufficient condition for \(\phi _{{\varvec{W}}}\) to satisfy (17) is that all \(W_{ij}>0\). It therefore follows from Corollary 1 that we can find \(\lambda =\lambda _3({\varvec{P}})\) and hence also \(N_{eE}\) from the rate of decrease to \(0\) of linear combinations of \(H_{tij}\). It is therefore of interest to study the time dynamics of \({\varvec{H}}_t\), and we will assume that a non-negative square matrix \({\varvec{A}}\) of order \(s^2\) exists, so that \({\varvec{H}}_t\) satisfies the linear recursion
for \(t=1,2,\ldots \). It will be convenient to introduce
the set of all pairs of subpopulations (cf. (1)), and write the elements of \({\varvec{A}}\) as
where \(ij\) and \(kl\) is short hand notation for the row and column numbers obtained from the stacking procedure of the vec operation. We can always divide (26) into \(m\) irreducible components \(\mathcal{I}_2 = \mathcal{C}_1\cup \cdots \cup \mathcal{C}_m\). After a possible reordering of the elements of \(\mathcal{I}_2\), this gives a corresponding block decomposition
of (27), with \({\varvec{A}}_{aa}=(A_{ij,kl})_{ij,kl\in \mathcal{C}_a}\) irreducible for \(a=1,\ldots ,m\). Typically, the pair of ancestors of two genes, picked from any pair \(\mathcal{I}_2\) of subpopulations, will ultimately belong to \(\mathcal{C}_1\), provided the ancestry is traced sufficiently far back in time. The following fundamental result clarifies the importance of \({\varvec{A}}\):
Theorem 2
Let \(\{{\varvec{X}}_t\}\) satisfy the conditions of Theorem 1, with \(\lambda =\lambda _3({\varvec{P}})\) the largest non-unit eigenvalue of its transition matrix \({\varvec{P}}\), defined in (19). Assume that (25) holds, with \({\varvec{A}}\) having non-negative elements. Then
where \({\varvec{A}}_{aa}\) are the diagonal matrices of (28). If the maximum in (29) is attained for a unique \(1\le c\le m\), then
with \({\varvec{\rho }} =\text{ vec }\left( (\rho _{ij})_{ij\in \mathcal{I}_2}\right) ^\prime \) and \({\varvec{r}}= \text{ vec }\left( (r_{ij})_{ij\in \mathcal{I}_2}\right) \) left and right eigenvectors
of \({\varvec{A}}\) with eigenvalue \(\lambda \). Explicit expressions for \({\varvec{\rho }}\) and \({\varvec{r}}\) are provided in the “Appendix”, and their components can be normalized so that \(\rho _{ij},r_{ij}>0\) for \(ij\in \mathcal{C}_c\), \(\sum _{ij\in \mathcal{C}_c} \rho _{ij} = \sum _{ij\in \mathcal{C}_c} \rho _{ij}r_{ij}=1\), \(\rho _{ij}=0\) for \(ij\in \mathcal{C}_{c+1}\cup \cdots \cup \mathcal{C}_m\) and \(r_{ij}=0\) for \(ij\in \mathcal{C}_{1}\cup \cdots \cup \mathcal{C}_{c-1}\).
The following key result follows from (21) and Theorems 1–2:
Corollary 2
Suppose \(\{{\varvec{X}}_t\}\) satisfies the conditions of Theorem 1, with a linear recursion (25) for predicted gene diversities in terms of a non-negative quadratic matrix \({\varvec{A}}\). Then the eigenvalue effective size is
4 Coalescence probabilities
In this section we derive a linear recursion (25) for the predicted gene diversity vector \({\varvec{H}}_t\), using probabilities that ancestral lineages of two genes coalesce. This enables us to compute \(N_{eE}\) from (32).
Many authors have derived linear recursions for identity by descent probabilities, gene diversities, covariances or coalescence probabilities of subdivided populations with spatial, age, sex or some other structure, with or without mutations. This includes results in Malécot (1951), Hill (1972), Li (1976), Sawyer (1976), Felsenstein (1982), Slatkin (1991), Nagylaki (2000), Ryman and Leimar (2008), Durrett (2008), Hössjer and Ryman (2014), Hössjer et al. (2014) and other papers mentioned in the introduction. They utilize coalescence probabilities more or less explicitly. We will generalize several of these results for constant subpopulation sizes, allowing backward migration rates to be stochastic and reproduction within subpopulations to have a general form.
Consider a structured coalescent (Notohara 1990; Herbots 1997; Wakeley 1998) for two genes, with coalescence events formulated hierarchically in two steps, first for subpopulations and then for genes. We draw two different genes from the population at time \(t\) and consider their joint ancestral subpopulation history \((I_\tau ,J_\tau )_{\tau =0}^T\) at times \(\{t-\tau \}_{\tau =0}^T\) until they find their most recent common ancestor at \(t-T\), where \(T\) is the coalescence time. Let
be the probability that the parents of two genes drawn from subpopulations \(i\) and \(j\) at time \(t\) have parents from subpopulations \(k\) and \(l\), conditionally on \(\varvec{\mathcal {B}}_{t}\). Since there are \(2Nu_i\mathcal{B}_{tik}\) genes of subpopulation \(i\) at time that originate from subpopulation \(k\), it follows that
We gather all these probabilities into an observed backward migration matrix \(\varvec{\mathcal {Q}}_{t} =(\mathcal{Q}_{t,ij,kl})_{ij\in \mathcal{I}_2,kl\in \mathcal{I}_2}\) of order \(s^2\) for pairs of subpopulations. By averaging with respect to \(\varvec{\mathcal {B}}_{t}\), we then define the unconditional probabilities
that the parents of two genes from \(i\) and \(j\) belong to \(k\) and \(l\), and collect them into a square matrix \({\varvec{Q}}=(Q_{ij,kl})_{ij\in \mathcal{I}_2,kl\in \mathcal{I}_2}\) of order \(s^2\). The following result summarizes the role of \(\varvec{\mathcal {Q}}_t\) and \({\varvec{Q}}\) and is stated without proof:
Proposition 3
Suppose segregation is independent between generations and condition on that coalescence events do not take place. Then the joint subpopulation ancestry \(\{(I_\tau ,J_\tau )\}_{\tau =0}^t\) of a pair of different genes from time \(t\) back to time 0 is a Markov chain with state space \(\mathcal{I}_2\) that (a) conditionally on \(\{\varvec{\mathcal {B}}_{t-\tau }; \, \tau \ge 0\}\) has time varying transition matrices \(\{\varvec{\mathcal {Q}}_{t-\tau }\}_{\tau =0}^{t-1}\), (b) unconditionally has a time invariant transition matrix \({\varvec{Q}}\).
The next step is to incorporate coalescence events. To this end, let
be the probability that two genes from subpopulations \(i\) and \(j\) at time \(t\) have the same parent, given that the parent belongs to \(k\), and conditionally on \(\varvec{\mathcal {B}}_{t}\). The corresponding unconditional coalescence probability \(p_{ijk}\) that two genes from \(i\) and \(j\) that both have their parents in \(k\), have the same parent, is
It will be helpful to introduce the quantities
in order to get a more explicit expression for \(p_{ijk}\), where \(\nu _{tkig}\) in (2) is the number of offspring of gene \(g\) in subpopulation \(k\) at time \(t-1\) that end up in \(i\) at time \(t\). The variables \(\{V_{kij}\}_{i,j=1}^s\) quantify the average variability of reproductive success among individuals in subpopulation \(k\), and the coalescence probabilities are closely related to standardized versions of them:
Theorem 3
Suppose the backward subpopulation ancestry of two different genes before coalescence is a Markov chain, with transition matrix \({\varvec{Q}}=(Q_{ij,kl})\) as in (35), and define coalescence probabilities \(p_{ijk}\) as in (37). Then
with \(V_{kij}\) defined in (38), and the components (22) of the gene diversity vectors \({\varvec{\mathcal {H}}}_t = \text{ vec }\left( (\mathcal{H}_{tij})_{i,j=1}^s\right) \) form a multivariate autoregressive process
of order one for \(t=0,1,\ldots \), with \(\epsilon _{tij}\) satisfying \(E(\epsilon _{tij}|{\varvec{X}}_{t-1})=0\), and
The expected gene diversity \({\varvec{H}}_t\) satisfies (25), with \({\varvec{A}}=(A_{ij,kl})\) as in (41).
It turns out that the right eigenvector of \({\varvec{P}}\) corresponding to its third largest eigenvalue \(\lambda _3\) can be found explicitly, whenever (40) holds:
Corollary 3
Suppose the conditions of Theorems 1 and 2 hold, with a gene diversity process \(\varvec{\mathcal {H}}_t\) satisfying (40). Then the transition matrix \({\varvec{P}}\) of the Markov chain \({\varvec{X}}_t\) has a right eigenvector \(\phi _3=(\phi _3({\varvec{x}});\, {\varvec{x}}\in \mathcal{X})^\prime \), corresponding to the third largest eigenvalue \(\lambda =\lambda _3({\varvec{P}})\), with
where \({\varvec{\rho }} = (\rho _{ij})\) is the left eigenvector (14) corresponding to the largest eigenvalue \(\lambda \) of \({\varvec{A}}=(A_{ij,kl})\). Moreover,
where \(\mathcal{X}_k\) is the component of \(\mathcal{X}_1\cup \cdots \mathcal{X}_n\) for which the maximum in (14) is uniquely attained. In particular, the restriction of \(\phi _3\) to \(\mathcal{X}_k\) agrees (up to a multiplicative constant) with the function \(\phi _k\) of Theorem 1.
5 Examples
The key formula (41) provides a general way to find \({\varvec{A}}\) and hence also \(N_{eE}\) through (32). We study its two main ingredients; expected backward migration rates \(Q_{ij,kl}\) for pairs of genes in Subsect. 5.1, and coalescence probabilities \(p_{ijk}\) in Subsect. 5.2. Then we apply these findings to a number of models in Subsect. 5.3 in order to show the generality of Theorem 3, explain how to apply it and as a preparation for the asymptotics of Sect. 6.
5.1 Backward migration
Example 1
(Fixed backward migration). When the observed backward migration rates are non-random, we must have
in order to satisfy (3). It follows from (34), (35) and (44) that
Following the nomenclature of Sved and Latter (Sved and Latter (1977)), we refer to (44) as fixed migration rates. \(\square \)
Example 2
(Dirichlet multinomial backward migration). Denote the \(i\):th rows of \({\varvec{B}}\) and \(\varvec{\mathcal {B}}_t\) by \({\varvec{B}}_i = (B_{i1},\ldots ,B_{is})\) and \(\varvec{\mathcal {B}}_{ti}=(\mathcal{B}_{ti1},\ldots ,\mathcal{B}_{tis})\) respectively. Specify parameters \(0< \alpha _i< \infty \), a random matrix \(\check{\varvec{\mathcal {B}}}_t\) with rows
that are independent for \(i=1,\ldots ,s\), and assume
are conditionally independent and multinomially distributed random vectors, given all \(\check{\varvec{\mathcal {B}}}_{ti}\). Combining (46) and (47), the rows of \(\varvec{\mathcal {B}}_t\) are independent random vectors with Dirichlet multinomial distributions. The parameters \(\alpha _i\) quantify the amount of variability of the rows of \(\check{\varvec{\mathcal {B}}}_t\). We will also extend (46) to \(\alpha _i\in \{0,\infty \}\), which in conjunction with (47) gives two degenerate cases within the Dirichlet multinomial family: When \(\alpha _i\equiv \infty \), \(\check{\varvec{\mathcal {B}}}_t={\varvec{B}}_t\) is fixed, and
whereas if \(\alpha _i\equiv 0\) we have
so that when time proceeds backwards, the ancestral history of all genes within a subpopulation will vary according to the same Markov chain with transition matrix \({\varvec{B}}\). From a forward perspective, the latter system has \(s\) subpopulation, but the backwards scenario will be identical to that of a single population whose size varies between \(2Nu_1,\ldots ,2Nu_s\), according to a Markov chain with transition kernel \({\varvec{B}}\), see for instance Jagers and Sagitov (2004), Sampson (2006), Pollak (2010), Kaj and Krone (2003) and Sano et al. (2004).
For any \(0\le \alpha _1,\ldots ,\alpha _s \le \infty \), we notice that
in accordance with (3). Since two genes of subpopulations \(i\) and \(j\) are drawn independently with multinomial distributions from rows \(i\) and \(j\) of \(\check{\varvec{\mathcal {B}}}_t\), it follows from (34) that \(E\left( \mathcal{Q}_{tij,kl}|{\varvec{\check{\mathcal{B}}}}_t\right) = \check{\mathcal{B}}_{tik}\check{\mathcal{B}}_{tjl}\). Since the rows of \(\check{\varvec{\mathcal {B}}}_t\) have independent Dirichlet distributions (46);
which simplifies to
when \(\alpha _i\equiv \infty \) and the rows of \(\varvec{\mathcal {B}}_t\) have multinomial distributions. \(\square \)
5.2 Coalescence probabilities
Example 3
(Mixed multinomial reproduction). Coalescence probabilities require that a reproduction scheme is specified. A fairly general scheme is defined by dividing the time interval between \(t-1\) and \(t\) into three steps. In a first fertilization step, a gamete pool of infinite size is created for each parental subpopulation \(k\), to which the \(2N_k\) genes contribute in fractions \(\omega _{tk1},\ldots ,\omega _{tk,2N_k}\) that are exchangeable random variables summing to one. In a second migration migration step, the \(s\) gamete pools mix, so that the infinitely sized post-migration gamete pool \(i\) is a mixture of pre-migration pools \(1,\ldots ,s\) in proportions \(\check{\mathcal{B}}_{ti1},\ldots ,\check{\mathcal{B}}_{tis}\). In the final reproduction step, subpopulation \(k\) at time \(t\) is formed by drawing \(2N_k\) genes from post migration gamete pool \(k\). Then (47) holds, and regardless of the distribution of \(\check{\mathcal{B}}_{tik}\),
independently between all pairs of parental and offspring subpopulations \(k\) and \(i\), where \({\varvec{\omega }}_{tk} = (\omega _{tk1},\ldots ,\omega _{tk,2N_k})\).
When (52) holds, we evaluate the expected value in (38) by conditioning on \(\mathcal{B}_{tik}\) and \({\varvec{\omega }}_{tk}\) and then insert into (39). By second moment properties of the multinomial distribution; this yields
and this can also (more easily) be obtained by a direct argument. The coalescence probability \(p_{ijk}=p_k\) in (53) only depends on the parental subpopulation \(k\), since \({\varvec{\omega }}_{tk}\) is the same, regardless of the offsprings’ subpopulations \(i\) and \(j\). In the second step of (53) we used \(E(\omega _{tk1})=1/(2N_k)\), since the components of \({\varvec{\omega }}_{tk}\) are exchangeable. We interpret \(N_{eI,k}\le N_k\) as a local inbreeding effective size of subpopulation \(k\), with equality if and only if \({\varvec{\omega }}_{tk}\) is non-random. \(\square \)
Example 4
(Survival indicators). Consider the genes \(g=1,\ldots ,N_k\) of subpopulation \(k\) at time \(t-1\), and assume that transitions \(k\rightarrow i\) from time \(t-1\) to a specific subpopulation \(i\) at time \(t\) represent their survival, with \(\nu _{tkig}\in \{0,1\}\) an indicator for gene \(g\) to be still alive at time \(t\). Then
follows from (38) to (39). For all other subpopulations \(j\ne i\) we let \(\nu _{tkjg}\) refer to the number of offspring of \(g\). Then \(V_{kij} = P(\nu _{tki1}=1) E(\nu _{tkj1}| \nu _{tki1}=1) =: M_{ki}M_{k,j|i}\), where \(M_{k,j|i}\) is the expected number of offspring in \(j\) for a gene that survives, possibly different from the expected number of offspring \(M_{kj}=E(\nu _{tkj1})\) of any gene in \(k\). Since (34)–(35) imply \(Q_{ij,kk}=B_{ik}B_{jk}\), it follows from (7) and (39) that the coalescence probability
is the inverse of the local census size of \(k\) times a correction factor that quantifies how correlated survival is with number of progeny in \(j\). \(\square \)
5.3 Example models
Example 5
(Cannings models). For a homogeneous population (\(s=1\)), we may drop subpopulation index and write \(\nu _{t11g}=\nu _{tg}\), with \(\nu _{t1},\ldots ,\nu _{t,2N}\) exchangeable random variables. Since subpopulation sizes and backward migration probabilities are trivial (\(u_1=Q_{11,11}=1\)), we only need to specify coalescence probabilities \(p=p_{111}\). It follows from (38)–(39) that
and \(A=A_{11,11}=1-p\) due to (41). Insertion into (32) yields
in agreement with Section 3.7 of Ewens (2004). Notice that (56) implies \(N_{eE}=N\) for a Wright-Fisher model, since \(\nu _{t1}\sim \text{ Bin }(2N,1/(2N))\). \(\square \)
Example 6
(Sink and source populations). This population has a source of size \(N_1\) and a sink of size \(N_2\). It is assumed that on average \(N_{ \text{ emig }}\) individuals emigrate from 1 to 2 per generation. The migration scheme is degenerate in the sense that the expected forward and backward migration matrices are both reducible;
although the equilibrium distribution \({\varvec{\gamma }}=(1,0)\) of \({\varvec{B}}\) is still unique. We assume multinomial backward migration (51), and reproduction where parents are drawn uniformly and independently from the parental subpopulation, corresponding to \(p_{ijk}=1/(2N_k)\) in (53). If subpopulations are ordered as \(\mathcal{I}_2=\{11,12,21,22\}\), we find from (41) that
It is evident that \({\varvec{A}}\) has a block structure (28), with \(\mathcal{C}_1=11\), \(\mathcal{C}_2 = 12\), \(\mathcal{C}_3 = 21\) and \(\mathcal{C}_4 = 22\). Combining the expression for \({\varvec{A}}\) with (29) and (32), we obtain
For large enough migration rates \(N_{ \text{ emig }}\ge N_2/(2N_1)\), the source population will determine the effective size, whereas for small migration rates \(N_{ \text{ emig }}<N_2/(2N_1)\), the two populations get increasingly isolated, and the effective size tends to infinity. It turns out that Theorem 1 applies with \(n=5\) components, of which the three non-absorbing ones are
For large migration rates, the eigenvalue \(\lambda \) of \({\varvec{P}}\) and \({\varvec{A}}\) is found within components \(\mathcal{X}_5\) and \(\mathcal{C}_1\), and (42) simplifies to \(\phi _3({\varvec{x}})=\rho _{11}x_1(1-x_1)\). For small migration rates, we find \(\lambda \) within \(\mathcal{X}_3,\mathcal{X}_4\) and \(\mathcal{C}_2,\mathcal{C}_3\) respectively. \(\square \)
Example 7
(Combined age and spatial structure). Age structured models have been studied by Felsenstein (1971), Hill (1972), Kaj et al. (2001), Sagitov and Jagers (2005) and Hössjer (2011). Here we consider a population with \(s=2z\) that has two demes with \(z\) age classes each. Subpopulations are numbered so that \(i\) (\(z+i\)) corresponds to age class \(i=1,\ldots ,z\) of deme 1 (deme 2). The expected forward and backward migration matrices
have a block structure, with \({\varvec{M}}_{cd}\) and \({\varvec{B}}_{cd}\) describing migration between demes \(c\) and \(d\). All blocks of the forward matrix have the same form, e.g.
with a first column containing expected number of offspring in deme 1 of all age classes in deme 1, and a superdiagonal with probabilities of surviving to the next age class and not migrating to deme 2. The blocks of the backward matrix a similar structure, for instance
with probabilities in the first row that the parent of a newborn in deme 1 originates from the various age classes in deme 1, and probabilities along the subdiagonal that genes of the various adult age classes in deme 1 did not migrate during the last time step. If parental subpopulations are chosen independently for all genes from \({\varvec{B}}\), it follows that \(Q_{ij,kl}\) is given by (51).
Coalescence probabilities
are obtained for all triples \(ijk\) of age classes for which \(Q_{ij,kk}\) is nonzero, with \(a(i)=i \text{ mod } z\) the age class of subpopulation \(i\), assuming in the first row that parents of newborns in a particular deme are chosen from a mixed multinomial distribution, with coalescence probabilities as in (53). For the second row we used that the coalescence probability (54) for survival is zero, and in third row we assumed that survival is independent of number of offspring, cf. (55).
We finally obtain \(N_{eE}\) by inserting (57) and (59) into (51), and then (51) and (60) into (41) and (32). \(\square \)
Example 8
(Extended Moran model). Eldon and Wakeley (2006, 2009) extended the Moran model to allow for more skewed offspring distributions, for a homogeneous population and for the island model. Here generalize their model to populations with conservative migration. The reproduction cycle between time points \(t-1\) and \(t\) is divided into two parts. In the first reproduction step one gene within each subpopulation \(k\) at time \(t-1\) is chosen randomly to have \(Y_{t-1,k}=Y_k\) offspring, including itself, where \(2\le Y_k\le 2N_k\). Then \(Y_k-1\) other, randomly chosen genes from the same subpopulation die. In the second migration step we assume that (44) holds, so that the forward and backward migration rates are non-random and fixed. The conservative migration assumption implies that exactly \(2N_k\) genes from subpopulation \(k\) migrate in (randomly chosen) groups of sizes \(2N_1B_{1k},\ldots 2N_sB_{sk}\) to subpopulations \(1,\ldots ,s\). The coalescence probability for any triple \(i,j,k\) of subpopulations is
as shown either by a direct argument, or from (38), (39) and (45). We finally obtain \(N_{eE}\) by inserting (45) and (61) into (41) and (32). \(\square \)
Example 9
(Dioecious population). Consider a population with \(N_m\) males, \(N_f=N-N_m\) females and sex ratio \(\xi =N_m/N\). Inheritance at an autosomal locus is modeled with \(s=4\) subpopulations; gametes within males inherited from the father (\(i=1\)) and mother (\(i=2\)), and gametes within females inherited from the father (\(i=3\)) and mother (\(i=4\)), so that the relative subpopulation sizes are \({\varvec{u}}= \left( \xi /2,\xi /2,(1-\xi )/2,(1-\xi )/2\right) \). According to Mendelian laws, each gamete is inherited, with equal probability 0.5, either from a grandpaternal or a grandmaternal gamete within the father or mother. This gives an observed backward migration matrix \(\varvec{\mathcal {B}}_t\) with a multinomial distribution (48). In view of (7), the expected backward/forward migration matrices are
and \({\varvec{\gamma }}= (1/4,1/4,1/4,1/4)\) follows from (4). Pairwise backward migration probabilities \(Q_{ij,kl}\) are given by (51), and in order to find the coalescence probabilities \(p_{ijk}\), we follow the notation of Hill (1979) and let \(m\) and \(f\) represent male and female sexes, write \(\tau _{rs}^2\) for the variance of the number of children of sex \(s\) of an individual of sex \(r\), and \(\tau _{rr,rs}\) for the covariance of the number of children of sex \(r\) and \(s\) of an individual of sex \(r\). It is shown in the “Appendix” that the nonzero coalescence probabilities when \(k=1\) are
and analogous formulas hold for \(k=2,3,4\). In particular, if a sperm or ova gamete chooses its parent randomly among all \(N_m\) or \(N_f\) parental genes, the number of male and female offspring of a male are two independent binomial random variables \(\text{ Bin }(2Nu_1,(2Nu_1)^{-1})\) and \(\text{ Bin }(2Nu_3,(2Nu_1)^{-1})\), with variances \(\tau _{mm}^2\) and \(\tau _{mf}^2\) respectively, and covariance \(\tau _{mm,mf}=0\). It is easily seen that in this case, all three probabilities in (63)–(65) equal \(1/(2Nu_1)\). We finally obtain \(N_{eE}\) by inserting (62) into (51), and then (51), (63)–(65) and the analogous coalescence probabilities for \(k=2,3,4\) into (41) and (32). \(\square \)
6 Asymptotics
According to (32), we find \(N_{eE}\) from the largest eigenvalue \(\lambda \) of \({\varvec{A}}\), for which we derived very explicit expressions in Sect. 4. Here will use this approach and give asymptotic formulas for \(N_{eE}\) when
depends on a positive parameter \(\varepsilon \rightarrow 0\), with a Taylor expansion
and \(\dot{{\varvec{A}}}=(\dot{A}_{ij,kl})_{ij\in \mathcal{I}_2,kl\in \mathcal{I}_2}\) a matrix of order \(s^2\). For each fixed \(\varepsilon >0\), \({\varvec{A}}(\varepsilon )\) satisfies the conditions of Theorem 2, so that in particular its unique largest eigenvalue is \(\lambda =\lambda (\varepsilon )\). The limiting matrix \({\varvec{A}}(0)\) is assumed to have a largest, not necessarily unique, eigenvalue \(\lambda (0)=1\). As in Maruyama (1970a) and Nagylaki (1980, 1995), we use perturbation theory of matrices (Horn and Johnson 1985; Friswell 1996), in order to deduce
for some negative constant \(\dot{\lambda }< 0\). Inserting (68) into (21), we find the asymptotic rate
by which the eigenvalue effective size tends to infinity. The following result gives a very explicit formula for \(\dot{\lambda }\), see for instance Aa et al. (2007) and references therein for a proof:
Theorem 4
Suppose the above conditions hold, with \({\varvec{A}}(0)\) having a largest eigenvalue \(\lambda (0)=1\) of multiplicity \(1\le v \le s^2\), with corresponding left and right eigenvectors \({\varvec{\rho }}_1(0),\ldots ,{\varvec{\rho }}_v(0)\) and \({\varvec{r}}_1(0),\ldots ,{\varvec{r}}_v(0)\). If also the perturbed left eigenvectors \({\varvec{\rho }}_\alpha (\varepsilon )\) and eigenvalues \(\lambda _\alpha (\varepsilon )\) are differentiable functions of \(\varepsilon \) at 0 for \(\alpha =1,\ldots ,v\), it holds that \(\lambda (\varepsilon ) = \max _{\alpha =1,\ldots ,v} \lambda _\alpha (\varepsilon )\) for small enough \(\varepsilon >0\), and (68) is satisfied, with
and \(\dot{{\varvec{\Lambda }}}=(\dot{\Lambda }_{\alpha \beta })\) a \(v\times v\) matrix with entries
In particular, suppose \({\varvec{A}}(0)\) is the transition matrix of a Markov chain with a unique equilibrium distribution \({\varvec{\rho }}(0)=(\rho _{ij}(0))\), the left eigenvector corresponding to \(\lambda (0)=1\), and a right eigenvector \({\varvec{r}}(0)={\varvec{1}}\). Then \(v=1\) and
In the following three subsections, the small perturbation parameter \(\varepsilon \) will either correspond to an inverse population size, a migration rate or both. We will use (41) and establish a Taylor expansion (67) for each case based on
when population sizes \(N_i(\varepsilon )=N(\varepsilon )u_i(\varepsilon )\), backward migration rates \(Q_{ij,kl}(\varepsilon )\) and/or coalescence probabilities \(p_{ijk}(\varepsilon )\) depend on the perturbation parameter \(\varepsilon \rightarrow 0\). The asymptotic expression for \(N_{eE}\) is then obtained from (69) and (70).
6.1 Large populations
We assume that the population size \(N\) tends to infinity, while the relative subpopulations sizes \({\varvec{u}}\), forward and backward migration matrices \({\varvec{M}}\) and \({\varvec{B}}\) are kept fixed. Introduce
as a perturbation parameter, with \(0<\beta \le 1\) a fixed constant. In order to verify a Taylor expansion of \({\varvec{A}}(\varepsilon )\) in (73), we first consider the backward migration matrix \({\varvec{Q}}(\varepsilon )=(Q_{ij,kl}(\varepsilon )) = E(\varvec{\mathcal {Q}}_t(\varepsilon ))\) for pairs of genes. It does not depend on \(\varepsilon \) for Dirichlet multinomial backward migration in (50), whereas for fixed backward migration (45) it does. In order to keep generality we assume that \({\varvec{Q}}(\varepsilon )\) may depend on \(\varepsilon \), with a Taylor expansion
for some matrix \(\dot{{\varvec{Q}}}= (\dot{Q}_{ij,kl})\). It will be seen though that \(\dot{{\varvec{Q}}}\) does not influence the asymptotic behavior of \(N_{eE}\). In contrast, the asymptotic behavior of the coalescence probabilities \(p_{ijk}=p_{ijk}(\varepsilon )\) is crucial and depends on how variable reproductive success is between individuals that migrate from one subpopulation (\(k\)) to other pairs of subpopulations (\(i,j\)). The limits
are assumed to exist for all triples \(ijk\), with \(V_{kij}\) defined in (38). It follows from (39), (74) and (76) that the coalescence probabilities admit Taylor expansions
for all \(i,j,k\). We refer to \(\dot{p}_{ijk}\) as the coalescence rate when two lines from \(i\) and \(j\) are merged into \(k\) and time is measured in units of \(\varepsilon ^{-1}=2N^\beta \) generations. The coalescence rate \(\sigma _{ijk}\) takes the size of the parental subpopulation \(k\) into account and measures time in units of \(2(Nu_k)^\beta \) instead. Taking the \(\varepsilon \rightarrow 0\) limit in (73), it follows from (75) and (77) that
We assume that \({\varvec{Q}}(0)\) is the transition kernel of a Markov chain that is not necessarily irreducible (it may contain some transient states), but has a unique equilibrium distribution \({\varvec{\rho }}(0)=(\rho _{ij}(0))\), which is also the left eigenvector of \({\varvec{Q}}(0)\) corresponding to its unique largest eigenvalue \(\lambda (0)=1\). Hence formula (72) of Theorem 4 applies, and we obtain the following:
Theorem 5
Define \(\varepsilon \) and \({\varvec{A}}={\varvec{A}}(\varepsilon )\) as in (73)–(74). Assume the population size \(N\rightarrow \infty \) so that (75) holds and the limits in (76) exist for some \(0<\beta \le 1\). Then \({\varvec{A}}(\varepsilon )\) satisfies Taylor expansion (67), with \({\varvec{A}}(0)\) as in (78) and
If the differentiability conditions on \(\lambda (\cdot )\) and \({\varvec{\rho }}(\cdot )\) in Theorem 4 hold, then
with
Suppose \(I=I_\tau \) and \(J=J_\tau \) are the subpopulations of the ancestors, taken from the same generation \(t-\tau \) of two genes sampled at a fixed time \(t\). Let \(K=I_{\tau +1}\) and \(L=J_{\tau +1}\) refer to the subpopulations of their two parents. Arguing as in Möhle (1998a), it will take many generations before coalescence in a large population, so that \((I,J,K,L)\) will first attain its equilibrium distribution \({\varvec{\rho }}_{ij}(0)Q_{ij,kl}(0)\). Therefore, (81) can be interpreted as a coalescence rate
at equilibrium if time is counted in units of \(2N^\beta \). In view of this, we may expect that for large populations, \(N_{eE}\) is asymptotically equivalent to the coalescence effective size \(N_{eC}\) whenever the latter exists. In a number of examples below, we will indeed verify that
with \(C\) the same constant as in (81). To this end, we first need the following:
Corollary 4
Suppose the conditions of Theorem 5 hold. Then asymptotically as \(N\rightarrow \infty \), \(N_{eE}\) is given by (80)–(81), with
for fixed backward migration (44), and
for Dirichlet multinomial backward migration (46)–(47).
Theorem 2 and (85) imply that \({\varvec{A}}\) has \(m=1\) irreducible component \(\mathcal{C}_1=\mathcal{I}_2\) for Dirichlet multinomial backward migration when \({\varvec{B}}\) is irreducible and at least one \(\alpha _i>0\), whereas \({\varvec{A}}\) has \(m=2\) components \(\mathcal{C}_1 = \{(i,i); \, i=1,\ldots ,s\}\) and \(\mathcal{C}_2 = \{(i,i); \, i\ne j\}\) when \(\alpha _i\equiv 0\). Then the joint ancestry of two genes are confined to lie in the same subpopulation after a few generations, once their ancestral subpopulation lineages merge for the first time, although they may not yet have coalesced at the gene level.
Corollary 5
Assume the conditions of Theorem 5 hold, with Dirichlet multinomial backward migration and \(\alpha _i\equiv 0\). Then the equilibrium distribution \({\varvec{\rho }}(0)=(\rho _{ij}(0))\) of \({\varvec{Q}}(0)\) is supported on the diagonal of \(\mathcal{I}_2\), with elements
Moreover, \(N_{eE}\) is asymptotically given by (80), with
\(\bar{V}_{kii} = E\left( \nu _{tki1}(\nu _{tki1}-1)|K_{ti}=k\right) \) and \(K_{ti}\) the subpopulation to which all parents of the genes in \(i\) at time \(t\) belong.
Jagers and Sagitov (2004) and Pollak (2010) studied populations with rapidly varying sizes, which can be viewed as a special case of the Dirichlet multinomial backward distribution with \(\alpha _i\equiv 0\) (see Example 2). They showed that \(N_{eC}\) satisfies (83), with \(\beta =1\) and \(C\) as in (87).
It is also possible to get explicit expressions for \(C\) when backward migration is fixed or multinomial:
Corollary 6
Assume the conditions of Theorem 5 hold, with fixed or multinomial (\(\alpha _i\equiv \infty \)) backward migration. Then \(Q_{ij,kl}(0)=B_{ik}B_{jl}\) follows from (84) and (85) with \(\alpha _i\equiv \infty \) respectively, and the equilibrium distribution \({\varvec{\rho }}(0)=(\rho _{ij}(0))\) has elements
Moreover, \(N_{eE}\) is asymptotically given by (80)–(81), with
and \(V_{kij}\) as defined in (38).
Felsenstein (1971) seems to have been first to use (89) for weighting pairs of subpopulations. Hössjer (2011) studied models with fixed backward migration, and showed that \(N_{eC}\) satisfies (83) when \(\beta =1\), with \(C\) as in (90).
Example 10
(Local subpopulation sizes). When the coalescence probability \(p_{ijk}(\varepsilon )=p_k(\varepsilon )\) only depends on the parental subpopulation \(k\) for all \(\varepsilon >0\), the size standardized coalescence rate (76) satisfies
as for the mixed multinomial reproduction scheme of Example 3. We deduce from (53) that \(p_{k}\ge 1/(2Nu_k)\), and
with \(C\) larger and \(N_{eE}\) smaller the more variable the gene weights \(\omega _{tkg}\) are. When (92) and Theorem 5 hold, it follows that \(N_{eE}\) is given by (80), with
using \(\sum _{ij} \rho _{ij}(0)Q_{ij,kl}(0) = \rho _{kl}(0)\) in the last step, since \({\varvec{\rho }}(0)\) is the left eigenvector of \({\varvec{Q}}(0)\) with eigenvalue 1. In particular, if all genes within each subpopulation have parents from the same subpopulation, (86) implies
whereas for fixed or multinomial backward migration, we deduce
from (89). In particular, if offspring pick their parents uniformly and independently within the parental subpopulation \(k\), we have \(p_{ijk} = p_k = 1/(2N_k)\), so that the local inbreeding effective size \(N_{eI,k}\) in (53) equals the local census size \(N_k\). Asymptotically, this corresponds to having \(\beta =1\) in (74) and \(\sigma _k=1\) in (92). For fixed backward migration, we can therefore use (80) and (96), and deduce \(N_{eE}=N/C + o(N)\) as \(N\rightarrow \infty \), with
Notohara (1993) and Nordborg and Krone (2002) showed that the coalescence effective size satisfies \(N_{eC}=N/C\), with \(C\) as in (97). Whenever \(N_{eI,k}=N_k\),
The first inequality of (98) shows how much stochastically varying migration lowers \(N_{eE}\) at most. Then Cauchy-Schwarz inequality shows how much a variable long term reproductivity \(\gamma _k/u_k\) between subpopulations lowers \(N_{eE}\), with equality for conservative migration \(\gamma _k=u_k\) (Nagylaki 1980). The circular stepping stone model has conservative migration with uniform population sizes \(u_k=1/s\), and Maruyama (1970a) found that \(N_{eE}/N\rightarrow 1\) as \(N\rightarrow \infty \), in agreement with the right hand side of (98). \(\square \)
Example 11
(Multiple mergers). Other limiting ancestral processes with multiple mergers (Pitman 1999; Sagitov 1999) are possible. Let \(p_{ijh,k}^{(3)}\) be the probability that three genes from subpopulations \(i,j,h\) that all have their parents in subpopulation \(k\), have the same parent. We will only consider models for which the pairwise and triple coalescence probabilities \(p_{ijk}=p_k\) and \(p_{ijh,k}^{(3)}=p_k^{(3)}\) only depend on the source population. Then we must have
in order for the limiting process to be Kingman’s coalescent, see Theorem 3.2 of Möhle (2000). It is not possible to violate (99) in Example 3. Indeed,
as \(N\rightarrow \infty \), since \(\omega _{tk1}\) is bounded by 1 and tends to zero in probability. On the other hand, the Moran model of Example 8 allows for multiple mergers for an appropriate choice of the offspring size \(Y_{t-1,k}=Y_k\) of the reproducing gene of subpopulation \(k\). As in Eldon and Wakeley (2006), we let
for some \(0<\psi \le 1\) and \(\beta > 0\). Then one shows
analogously as in (61). Since
as \(N\rightarrow \infty \), it follows from (61) and the last two displayed equations that we can violate (99) when \(0<\beta \le 1\), with \(\sigma _{ijk} = \sigma _k = \psi ^2 2^{1-\beta }\). Since the extended Moran model has fixed backward migration and conservative migration \(\gamma _k=u_k\), it follows that \(N_{eE}\) satisfies (96) with
Notice that this expression equals 1 when \(\beta =\psi =1\), since then the coalescence probability is asymptotically equivalent to \(1/(2N_k)\). \(\square \)
Other applications of Theorem 5 with \(\beta =1\) includes a single deme with \(s\) age classes. Explicit formulas for the constant \(C\) in \(N_{eE}=N/C + o(N)\) can be found under general assumptions on how reproductivity varies randomly between and within age classes, thereby extending results of Felsenstein (1971), Sagitov and Jagers (2005) and Hössjer (2011).
For dioecious models, Hill (1979) found that \(N_{eE}=N/C + o(N)\), with
which also follows from (51), (62), (63)–(65) and (90). Other effective size expressions of a diploid population can be found in Crow and Denniston (1988), Caballero (1995) and Nagylaki (1995). The latter two authors also treat inheritance at sex-linked loci. The expression for \(C\) is then somewhat different, since males only have one copy of an \(X\)-chromosome, and only \(s=3\) subpopulations are needed. Overlapping generations within a dioecious population (Pollak 2011) requires \(s=4z\) (\(s=3z\)) subpopulations for inheritance at an autosomal (X-linked) locus with \(z\) age classes. See also Möhle (1998b), for coalescence theory of two-sex models.
6.2 Small migration rates
Assume that the subpopulations in (1) divide into \(m\le s\) demes
with deme \(d\) containing the subpopulations of \(\mathcal{I}(d)\). We will introduce a migration parameter \(\varepsilon \rightarrow 0\) that quantifies the amount of migration between the demes (not within them) while the total population size \(N\) is kept fixed. In order to obtain an expression for \(N_{eE}\) as \(\varepsilon \rightarrow 0\), the crucial part is to find how all \(Q_{ij,kl}(\varepsilon )\) in (73) depend on \(\varepsilon \). Although the relative subpopulation sizes \(u_i(\varepsilon )\) and coalescence probabilities \(p_{ijk}(\varepsilon )\) may vary with \(\varepsilon \) to some extent, this will have no asymptotic impact on \(N_{eE}\) as \(\varepsilon \rightarrow 0\).
We will assume that the backward migration matrix
depends on \(0\le \varepsilon \le \varepsilon _{ \text{ max }}\), where \(\varepsilon _{ \text{ max }}\) is chosen to guarantee that \({\varvec{B}}(\varepsilon )\) remains a non-negative matrix. The demes are isolated when \(\varepsilon =0\), so that \({\varvec{B}}(0)\) has a block diagonal structure \({\varvec{B}}(0) = \text{ diag }({\varvec{B}}_{11}(0),\ldots ,{\varvec{B}}_{mm}(0))\), with \({\varvec{B}}_{dd}(0)=(B_{ik}(0))_{i,k\in \mathcal{I}(d)}\) describing backward migration within deme \(d\). Since \({\varvec{B}}(\varepsilon )\) is a transition matrix of a Markov chain for all \(\varepsilon \), the row sums of \(\dot{{\varvec{B}}}\) must be zero, and this holds, for instance, if
for all \(i\in \mathcal{I}(d)\) and \(d=1,\ldots ,m\). If \({\varvec{M}}(\varepsilon )\) and \({\varvec{u}}(\varepsilon )\) are computed for each \(\varepsilon >0\) from (7) and (8), it follows from (102) that
if all \(u_i(\varepsilon )\) are differentiable at \(0\), with \(\dot{{\varvec{M}}}=(\dot{M}_{ki})\) having elements
The migration parameter \(\varepsilon \) is such \(B(\varepsilon )=\dot{B}\varepsilon + o(\varepsilon )\) and \(M(\varepsilon ) = \dot{M}\varepsilon + o(\varepsilon )\) as \(\varepsilon \rightarrow 0\) for some positive constants \(\dot{B}\) and \(\dot{M}\), where
is the backward migration rate between demes, i.e. the average number of parents of ancestors far back in time that originate from another deme than their children, and
is the forward migration rate, i.e. the fraction of all offspring today whose parents reside in another deme. Backward migration \(B(\varepsilon )\) is somewhat easier to analyze theoretically, but often \(\varvec{M}(\varepsilon )\) is of more interest in applications.
In order to find explicit expressions for \(\dot{B}\) and \(\dot{M}\), we introduce \({\varvec{\gamma }}_d = (\gamma _{di})_{i\in \mathcal{I}(d)}\) as the equilibrium distribution of \({\varvec{B}}_{dd}(0)\), and the matrix \({\varvec{G}}=(G_{ab})_{a,b=1}^m\) with elements
It is the infinitesimal generator of a continuous time Markov process with state space \(\{1,\ldots ,m\}\) and an equilibrium distribution \({\varvec{\theta }}= (\theta _1,\ldots ,\theta _m)\) satisfying
In the next lemma we assume \(\varepsilon \) is small, so that migration is faster within than between demes, and subpopulations within a deme form a macro state. The backward ancestry of a gene then attains its equilibrium distribution within a deme before any transitions between demes occur, and then the backward deme ancestry is a continuous time Markov process with generator \({\varvec{G}}\):
Lemma 1
Suppose \(\varepsilon \rightarrow 0\). Then the equilibrium distribution \({\varvec{\gamma }}(\varepsilon )\) corresponding to (102) satisfies
for \(i=1,\ldots ,s\), and the backward migration rate (106)
where \(\gamma _i(0)\) is the limit of the right hand side of (110). If all \(u_i(\varepsilon )\) are differentiable at 0, the forward rate (107) has a similar expansion
In order to find the asymptotic behaviour of \(N_{eE}\) as \(\varepsilon \rightarrow 0\) by means of (69) and Theorem 4, we derive an expression for \({\varvec{A}}(\varepsilon )\) in (73), find \(\dot{{\varvec{A}}}\), show that \({\varvec{A}}(0)\) has a largest eigenvalue \(\lambda (0)=1\), find its multiplicity \(v\) and corresponding left and right eigenvectors. Because all demes are isolated when \(\varepsilon =0\), it is easy to see that the ancestors of \(i\in \mathcal{I}(a)\) and \(j\in \mathcal{I}(b)\) must belong to \(k\in \mathcal{I}(a)\) and \(l\in \mathcal{I}(b)\) respectively. Therefore \({\varvec{Q}}(0)\) has a block diagonal structure
with \({\varvec{Q}}_{ab}(0) = \left( Q_{ij,kl}\right) _{i, k\in \mathcal{I}(a), j,l\in \mathcal{I}(b)}\) a square matrix of order \(|\mathcal{I}(a)||\mathcal{I}(b)|\) containing all backward transitions when one gene and its parent are from deme \(a\) and the other gene and its parent are from deme \(b\). It follows from (73) that \({\varvec{A}}(0)\) has a block diagonal structure
as well, with \({\varvec{A}}_{ab}(\varepsilon ) = \left( A_{ij,kl} (\varepsilon )\right) _{i,k\in \mathcal{I}(a),j,l\in \mathcal{I}(b)}\) having elements
for any \(\varepsilon \) because of (73), for subpopulations \(i\) and \(j\) that reside in different demes. In particular, \({\varvec{A}}_{ab}(0)={\varvec{Q}}_{ab}(0)\) has a unique largest eigenvalue 1 when \(a\ne b\), and associated left and right eigenvectors \({\varvec{\rho }}_{ab}(0) = \text{ vec }\left( (\rho _{ab,ij} (0))_{ij\in \mathcal{I}_2}\right) ^\prime \) and \({\varvec{r}}_{ab}(0) = \text{ vec }\left( (r_{ab,ij}(0))_{ij\in \mathcal{I}_2}\right) \) with components
Since coalescence events are possible within each deme, even when \(\varepsilon =0\), it follows that \({\varvec{A}}_{dd}(0)\) differs from \({\varvec{Q}}_{dd}(0)\), with a largest eigenvalue strictly smaller than one. Therefore, the largest eigenvalue 1 of \({\varvec{A}}(0)\) has multiplicity
In order to apply Theorem 4 we must also find the entries of the matrix \(\dot{{\varvec{\Lambda }}}=(\dot{\Lambda }_{\alpha , \beta })_{\alpha , \beta =1}^v\), where \(\alpha =(a,b)\), \(\beta =(c,d)\), \(a\ne b\) and \(c\ne d\). Suppose \(i\in \mathcal{I}(a)\), \(j\in \mathcal{I}(b)\), \(k\in \mathcal{I}(c)\) and \(l\in \mathcal{I}(d)\), then
follows from (114) and differentiation of (115) with respect to \(\varepsilon \). Invoking the definition of \(\dot{{\varvec{\Lambda }}}\) in (71), we find that
Then we insert (117) into (118), make use of (108) and obtain
In particular, when each subpopulation is a deme, \(m=s\) and \(\mathcal{I}(d) = \{d\}\) for \(d=1,\ldots ,s\), so that (108) implies \({\varvec{G}}=\dot{{\varvec{B}}}\), and \(\dot{{\varvec{\Lambda }}}=(\dot{\Lambda }_{ij,kl})_{1\le i\ne j\le s, 1\le k\ne l\le s}\) is of order \(v=s(s-1)\), with elements
Combining (111), (112) and (119) with Theorem 4, we obtain the following:
Theorem 6
Suppose subpopulations are divided into \(m\) demes, as in (101), whose isolation is quantified by the matrix \(\dot{{\varvec{B}}}=(\dot{B}_{ik})\) in (102). Then
with \(B=B(\varepsilon )\) the backward migration rate between demes in (106), and \(\dot{{\varvec{\Lambda }}}=(\dot{\Lambda }_{ab,cd})\) the matrix in (119), which simplifies to (120) when \(m=s\). If all \(u_k(\varepsilon )\) are differentiable at \(\varepsilon =0\) as well, then
with \(M=M(\varepsilon )\) the forward migration rate in (107) and \(\dot{M}_{ki}\) defined in (105).
Example 12
(Island model). The island model (Wright 1943; Maruyama 1970b) is the most well known example of a population with spatial substructure, having \(m=s\) demes, and a forward migration matrix
where \({\varvec{1}}\) is a column vector of \(s\) ones. Migration is symmetric, so that the migration rate \(M_{ki} = \varepsilon /(s-1)\) from each \(k\) to any other deme \(i\ne k\) is the same. It follows by symmetry from (4), (7) and (8) that \({\varvec{B}}= {\varvec{M}}\), \(M=B=\varepsilon \) and \(u_k(\varepsilon )=\gamma _k(\varepsilon )=1/s\). This implies in particular that
Insertion of (124) into (120) yields
so that by symmetry, the largest eigenvalue of \(\dot{{\varvec{\Lambda }}}\) corresponds to an eigenvector \({\varvec{1}}_v = (1,\ldots ,1)^\prime \) that is a column vector with \(v=s(s-1)\) ones. Hence we find, from any of the row sums of \(\dot{{\varvec{\Lambda }}}\), that
We finally apply (122) and arrive at
as \(M\rightarrow 0\). The accuracy of this formula is illustrated in Fig. 1. \(\square \)
Example 13
(Circular stepping stone model). The circular stepping stone model (Kimura 1953; Kimura and Weiss 1964; Maruyama 1970a) is a spatial model with \(m=s\) demes located along the perimeter of a circle, where migration from any deme is only possible to one of its two nearest neighbors. The elements of the expected forward migration matrix are
where \(\delta (i,j)\) is the shortest distance between demes \(i\) and \(j\) along the circle perimeter, when the distance between two neighboring demes is normalized to 1. It follows from (4), (7) and (8) that \({\varvec{M}}={\varvec{B}}\), \(M=B=\varepsilon \) and \(u_k(\varepsilon )=\gamma _k(\varepsilon )=1/s\). Hence
and from (120) we find that the matrix \(\dot{{\varvec{\Lambda }}}=(\dot{\Lambda }_{ij,kl})\) has elements
Since \(\sum _{k=1}^s u_k(0)\sum _{i;i\ne k}\dot{M}_{ki} = 1\), we finally deduce from (122) that
It seems difficult to obtain an explicit expression for the multiplicative constant in (127), although Maruyama (1970a) derived an approximation
for even \(s\). In Table 1 we compare (127) and (128) for different \(s\) and find a very good agreement. \(\square \)
Example 14
(System with five subpopulations). A system with five subpopulations of varying size is shown in Fig. 2, with number of migrants in each generation depicted next to the arrows. The forward migration matrix is
and the relative subpopulation size vector \({\varvec{u}}=(2,4,0.5,4,1)/11.5\). We let the forward migration rates depend on a perturbation parameter \(\varepsilon \) according to
so that \({\varvec{u}}={\varvec{u}}(\varepsilon )\) does not depend on \(\varepsilon \), whereas the forward migration rate \(M(\varepsilon )=\varepsilon M\) is proportional to \(\varepsilon \). It follows from (105) that
Combining (122) and (129), we find that
with \(C = (1-\sum _k u_k M_{kk})/(-2\lambda _{ \text{ max }}(\dot{{\varvec{\Lambda }}}))\) and \(\dot{{\varvec{\Lambda }}}\) derived from (120) and (130). The numerically computed value \(C=1.419\) is justified in Fig. 2. \(\square \)
Example 15
(Combined spatial and age structure). Continuing Example 7, we assume that the forward migration matrix depends on \(\varepsilon \) as
so that the two demes are isolated when \(\varepsilon =0\). For brevity, write \(M_{ki}=M_{ki}(0)\). The nonzero elements of the two off-diagonal blocks of \(\dot{{\varvec{M}}}\) are
where \(c_1,\ldots ,c_z\) are non-negative constants, of which at least one is strictly positive. The migration rates in (132) are chosen so that \(u_k=u_k(\varepsilon )\) does not depend on \(\varepsilon \). Intuitively, a fraction \(\varepsilon c_1 u_{z+1}\) of all offspring in deme 1 end up in deme 2, and a fraction \(\varepsilon c_{k+1}u_{z+k+1}\) of all genes of age class \(k\) of deme 1 that survive, migrate to deme 2, and similarly for the other two equations of (132). In the “Appendix” we verify that
as \(M\rightarrow 0\). When \(c_i\) increases with \(i\), older individuals will migrate more, and this will increase \(N_{eE}\) if older individuals are less reproductive, and decrease \(N_{eE}\) if they reproduce more. Conservative migration is the intermediate case when all age groups are equally reproductive, with \(\gamma _{1i}=u_i/U(1)\), \(\gamma _{2i}=u_{z+i}/U(2)\) and \(U(d)=\sum _{i\in \mathcal{I}(d)} u_i\) the relative size of deme \(d\). Insertion into (133) gives
as \(M\rightarrow 0\) for conservative migration, independently of the age dependency of the migration pattern. In particular, if both demes are equally large, we get the same multiplicative constant \(C=U(1)U(2)=(1/2)^2 = 1/4\) as for an island model (125) with \(s=2\). \(\square \)
6.3 Large populations and small migration rates
We let the inverse population size and the backward migration rate both tend to zero at the same speed, so that
with \(\varepsilon \rightarrow 0\) and \(c\) a constant. This can be viewed as an asymptotic scenario intermediate between (74) (with \(\beta =1\)) and (102).
The asymptotic expression for \(N_{eE}\) is derived similarly as in the previous subsection, so we only highlight the differences. Since the population size tends to infinity, the coalescence probabilities \(p_{ijk}\) will tend to zero, as described in (77), and this modifies (73) to
where \(\sigma _{ijk}\) is the size standardized coalescence rate in (77). Consequently,
is a block diagonal matrix with blocks \({\varvec{A}}_{ab}(0)\) given by (115) when \(\varepsilon =0\) for all \(1\le a,b \le m\). These blocks have a unique largest eigenvalue 1, and
When \(a\ne b\), the left and right eigenvectors of \({\varvec{A}}_{ab}(0)\) are as in (116). The same is true when \(a=b\) if we add the assumption of fixed or multinomial backward migration proportions. Differentiating (135) with respect to \(\varepsilon \) we find that
with \(\dot{Q}_{ij,kl}\) as in (117), but without the restriction \(a\ne b\). Therefore, inserting (116) and (136) into the definition of \(\dot{{\varvec{\Lambda }}}= (\dot{\Lambda }_{ab,cd})_{1\le a,b\le m, 1\le c,d \le m}\) in (71), we find, after some computations, that this matrix has elements
with \(G_{ab}\) as in (108) and
is a coalescence rate between the lines of deme \(a\) that can be interpreted as a local version of (90). In particular, when each subpopulation \(i\) is a deme, \(G_{ik}=\dot{B}_{ik}\), and (137) reduces to
Equipped with (137) and (138), we apply (69) and Theorem 4, and deduce:
Proposition 4
Suppose the migration rate between demes and the inverse population size tend to zero simultaneously as in (134) when \(\varepsilon \rightarrow 0\), with a backward migration that is either fixed (44) or multinomial (48). The eigenvalue effective size then has an asymptotic expansion
with the elements of \(\dot{{\varvec{\Lambda }}}\) as in (137), or (138) when each deme contains one single subpopulation.
Example 16
(Island model.) We will assume that
where \(\sigma =(N/s)/N_{eI}\) can be interpreted as a ratio between the (constant) local census and effective size of each deme. This local inbreeding effective size \(N_{eI}=N_{eI,i}\) is similar to (53), although we only consider triplets \(i=j=k\) of demes here. It follows from (124), (138) and (140) that
Let \(\dot{\lambda }= \lambda _{ \text{ max }}(\dot{{\varvec{\Lambda }}})\) be the largest eigenvalue of \(\dot{{\varvec{\Lambda }}}\), with \({\varvec{x}}= \text{ vec }\left( (x_{ij})_{i,j=1}^s\right) \) the corresponding right eigenvector satisfying \(\dot{{\varvec{\Lambda }}}{\varvec{x}}= \dot{\lambda }{\varvec{x}}\). By symmetry we must have \(x_{ij} = y\) when \(i=j\) and \(x_{ij}=z\) when \(i\ne j\), and a \(2\times 2\) system
of equations for \(y\) and \(z\). It will be convenient to introduce the parameter \(\kappa = c/(s\sigma ) = 4NM/(s\sigma ) =: 4N_{eI}M\). Then we apply (139) and find that
as \(M\rightarrow 0\), where
is the largest root of the (quadratic) characteristic equation in \(\dot{\lambda }\) obtained from (142). Figure 3 verifies numerically fast convergence in (143), for three combinations of \(s\) and \(\kappa \) (notice the narrow scales of the \(y\)-axes). We have that \(\lim _{\kappa \rightarrow 0} (-2\dot{\lambda }(\kappa )) = 4/(s-1)\), in agreement with (125). On the other hand,
where
and when the migration rate dominates the inverse population size, we get \(\lim _{\kappa \rightarrow \infty } C(\kappa ) = \sigma \) for \(\sigma _k=\sigma \) and \(\gamma _k=u_k=1/s\), in agreement with (96). \(\square \)
7 Discussion
In this paper we developed a general theory which enables computation of the eigenvalue effective size \(N_{eE}\) for a large class of structured populations with stochastic backward migration and exchangeable reproduction within subpopulations, exactly or asymptotically when either the inverse population size and/or migration rates between subpopulations tend to zero.
Our work can be extended in several ways. First, subpopulation sizes could be time varying. Existence of \(\lambda \) then requires extra conditions, e.g. sizes that either vary as a Markov chain or cyclically. Several authors have studied this problem for homogeneous or age structured models, see Karlin (1968), Jagers and Sagitov (2004), Pollak (1980, 2002) and Wang and Pollak (2000a, b). For cyclically varying populations with period \(\tau \), the matrix \({\varvec{A}}={\varvec{A}}_t\) of the predicted gene diversity recursion will depend cyclically on time. Whitlock and Barton (1997) argued that this deterministic process tends to zero at a rate
as formally proved in Hössjer et al. (2014). It is straightforward to extend Theorem 1 by measuring time in units of \(\tau \), so that the allele frequency Markov process has kernel \({\varvec{P}}_1\cdot \ldots \cdot {\varvec{P}}_\tau \). Then (144) equals the rate of fixation \(\lambda = \lambda _{3}\left( {\varvec{P}}_1\cdot \ldots \cdot {\varvec{P}}_\tau \right) ^{1/\tau }\) of alleles in units of time step one.
Second, we have included two-sex models, defining subpopulations in terms of male and female gametes. It would also be of interest to define subpopulations in terms of individuals, as for an island model with diploid monoecious or dioecious individuals (Chesser et al. 1993; Wang 1997a, b). This would require some changes in the way the elements of \({\varvec{A}}\) are characterized in terms of coalescence probabilities, requiring modifications of (34) and Theorem 3.
References
Brockwell PJ, Davis RA (1991) Time series: theory and methods, 2nd edn. Springer, New York
Caballero A (1994) Developments in the prediction of effective population size. Heredity 73:657–679
Caballero A (1995) On the effective size of populations with separate sexes, with particular reference to sex-linked genes. Genetics 139:1007–1011
Cabellero A, Hill WG (1992) A note on the inbreeding effective population size. Evolution 46(6):1969–1972
Cannings C (1974) The latent roots of certain Markov chains arising in genetics: a new approach I. Haploid models. Adv Appl Prob 6:260–290
Caswell H (2001) Matrix population models, 2nd edn. Sinauer, Sunderland
Charlesworth B (2009) Effective population size and patterns of molecular evolution and variation. Nat Rev Genet 10:195–205
Chesser RK, Rhodes OE, Sugg DW, Schnabel A (1993) Effective sizes for subdivided populations. Genetics 135:1221–1232
Collet P, Martinez S (2013) Quasi stationary distributions, Markov chains, diffusions and dynamical systems. Springer, Berlin
Cox DR, Miller HD (1965) The theory of stochastic processes. Methuen & Co Ltd, London
Crow JF (1954) Breeding structure of populations II. Effective population number. In: Statistics and Mathematics in Biology. Iowas State Coll. Press, Ames, Iowa, pp 543–556
Crow JF, Denniston C (1988) Inbreeding and variance effective population sizes. Evolution 42(3):482–495
Darroch JN, Seneta E (1965) On quasi-stationary distributions in absorbing discrete-time finite Markov chains. J Appl Prob 2:88–100
Durrett R (2008) Probability models for DNA sequence evolution, 2nd edn. Springer, New York
Eldon B, Wakeley J (2006) Coalescent process when the distribution of offspring number among individuals is highly skewed. Genetics 172:2621–2633
Eldon B, Wakeley J (2009) Coalescence times and \(F_{ST}\) under a skewed offspring distribution among individuals in a population. Genetics 181:615–629
Ewens WJ (1982) On the concept of effective population size. Theor Popul Biol 21:373–378
Ewens WJ (1989) The effective population size in the presence of catastrophes. In: Feldman M (ed) Mathematical evolutionary theory. Princeton University, Princeton, p 9
Ewens WJ (2004) Mathematical population genetics. I. Theoretical introduction, 2nd edn. Springer, New York
Feller W (1951) Diffusion processes in genetics. In: Neyman J (ed) Proceedings of 2nd Berkeley symposium on Math. Stat. Prob. University of California Press, Berkeley, pp 227–246
Felsenstein J (1971) Inbreeding and variance effective numbers in populations with overlapping generations. Genetics 68:581–597
Felsenstein J (1982) How can we infer geography and history from gene frequencies? J Theor Biol 96:9–20
Fisher RA (1958) The genetical theory of natural selection, 2nd edn. Dover, New York
Friswell MI (1996) The derivatives of repeated eigenvalues and their associated eigenvectors. Trans ASME 118:390–397
Grimmett G, Stirzaker D (2001) Probability and random processes, 3rd edn. Oxford University Press, Oxford
Herbots HM (1997) The structured coalescent. In: Donnelly P, Tavaré S (eds) Progress in population genetics and human evolution. Springer, New York, pp 231–255
Hill WG (1972) Effective size of populations with overlapping generations. Theor Popul Biol 3:278–289
Hill WG (1979) A note of effective population size with overlapping generations. Genetics 92:317–322
Horn RA, Johnson CR (1985) Matrix analysis. Cambridge University Press, Cambridge
Hössjer O (2011) Coalescence theory for a general class of structured populations with fast migration. Adv Appl Prob 43(4):1027–1047
Hössjer O, Olsson F, Laikre L, Ryman N (2014) A general framework for modeling short and long term patterns of genetic differentiation and effective size of subdivided populations. Under revision for Mathematical Biosciences
Hössjer O, Ryman N (2014) Quasi equilibrium, variance effective population size and fixation index for models with spatial structure. J Math Biol
Jagers P, Sagitov S (2004) Convergence to the coalescent in populations of substantially varying size. J Appl Prob 41:368–378
Kaj I, Krone SM, Lascoux M (2001) Coalescent theory for seed bank models. J Appl Prob 38:285–301
Kaj I, Krone SM (2003) The coalescent process in a population of varying size. J Appl Prob 40:33–48
Karlin S (1968) Rares of approach to homozygosity for finite stochastic models with variable population size. Am Nat 102:443–455
Karlin S, McGregor J (1965) Direct product branching processes and related induced Markov chains I. Calculations of rates of approach to homozygosity. Bernoulli, Beys, Laplace Anniversary volume. Springer, Berlin-Heidelberg-New york, pp 11–145
Kimura M (1953) ’Stepping stone’ model of population. Ann Rep Natl Inst Genet Jpn 3:62–63
Kimura M (1957) Some problems of stochastic processes in genetics. Ann Math Statist 28:882–901
Kimura M, Weiss GH (1964) The stepping stone model of population structure and the decrease of genetic correlation with distance. Genetics 61:763–771
Kingman JFC (1982) The coalescent. Stoch Proc Appl 13:235–248
Li W-H (1976) Effect of migration on genetic distance. Am Nat 110:841–847
Malécot G (1951) Un treatment stochastique des problemès linairés (mutation, linkage, migration) en géneétique de populations. Annales de l’Université de Lyon A 14:79–117
Maruyama T (1970a) On the rate of decrease of heterozygosity in circular stepping stone models of populations. Theor Popul Biol 1:101–119
Maruyama T (1970b) Effective number of alleles in subdivided populations. Theor Popul Biol 1:273–306
Möhle M (1998a) A convergence theorem for Markov chains arising in population genetics and the coalescent with selfing. Adv Appl Prob 30:493–512
Möhle M (1998b) Coalescent results for two-sex population models. Adv Appl Prob 30:513–520
Möhle M (2000) Ancestral processes in population genetics—the coalescent. J Theor Biol 204:629–638
Nagylaki T (1980) The strong-migration limit in geographically structured populations. J Math Biol 9:101–114
Nagylaki T (1995) The inbreeding effective population number in dioecious populations. Genetics 139:473–485
Nagylaki T (2000) Geographical invariance and the strong-migration limit in subdivided populations. J Math Biol 41:123–142
Nei M (1973) Analysis of gene diversity in subdivided populations. Proc Natl Acad Sci USA 70:3321–3323
Nei M (1987) Molecular evolutionary genetics. Columbia University Press, Columbia
Nordborg M, Krone S (2002) Separation of time scales and convergence to the coalescent in structured populations. In: Slatkin M, Veuille M (eds) Modern development in theoretical population genetics. Oxford University Press, Oxford, pp 194–232
Norris JR (2008) Markov chains. Cambridge University Press, Cambridge
Notohara M (1990) The coalescent and the genealogical process in geographically structured populations. J Math Biol 29:59–75
Notohara M (1993) The strong-migration limit for the genealogical process of geographically structured populations. J Math Biol 31:115–122
Orrive ME (1993) Effective population size in organisms with complex life-histories. Theor Popul Biol 44:316–340
Pitman J (1999) Coalescents with multiple collisions. Ann Probab 27:1870–1902
Pollak E (1980) Effective population numbers and mean time to extinction in dioecious populations with overlapping generations. Math Biosci 52:1–25
Pollak E (2002) Eigenvalue effective population numbers for populations that vary cyclically in size. Math Biosci 177(178):11–24
Pollak E (2010) Coalescent theory for a monoecious random mating population with a varying size. J Appl Prob 47:41–57
Pollak E (2011) Coalescent theory for age-structured random mating populations with two sexes. Math Biosci 233:126–134
Ryman N, Leimar O (2008) Effect of mutation on genetic differentiation among nonequilibrium populations. Evolution 62(9):2250–2259
Sagitov S (1999) The general coalecsent with asynchronous mergers of ancestral lines. J Appl Probab 36:1116–1125
Sagitov S, Jagers P (2005) The coalescent effective size of age-structured populations. Ann Appl Probab 15(3):1778–1797
Sampson KY (2006) Structured coalescent with nonconservative migration. J Appl Prob 43:351–362
Sano A, Shimizu A, Iizuka M (2004) Coalescent process with fluctuating population size and its effective size. Theor Popul Biol 65:39–48
Sawyer S (1976) Results for the stepping stone model for migration in population genetics. Ann Prob 4:699–728
Sjödin P, Kaj I, Krone S, Lascoux M, Nordborg M (2005) On the meaning and existence of an effective population size. Genetics 169:1061–1070
Slatkin M (1991) Inbreeding coefficients and coalescent times. Genet Res Camb 58:167–175
Sved JA, Latter BDH (1977) Migration and mutation in stochastic models of gene frequency change. J Math Biol 5:61–73
Tufto J, Engen S, Hindar K (1996) Inferring patterns of migration from gene frequencies under equilibrium conditions. Genetics 144:1911–1921
Tufto J, Hindar K (2003) Effective size in management and conservation of subdivided populations. J Theor Popul Biol 222:273–281
van der Aa NP, Ter Morsche HG, Mattheij RRM (2007) Computation of eigenvalue and eigenvector derivatives for a general complex-valued eigensystem. Electron J Linear Algebra 16:300–314
Wakeley J (1998) Segregating sites in Wright’s island model. Theor Popul Biol 53:166–174
Wakeley J, Sargsyan O (2009) Extensions of the coalescent effective population size. Genetics 181:341–345
Wang J (1997a) Effective size and \(F\)-statistics of subdivided populations. I. Monoecious species with partial selfing. Genetics 146:1453–1463
Wang J (1997b) Effective size and \(F\)-statistics of subdivided populations. II. Dioecious species. Genetics 146:1465–1474
Wang J, Caballero A (1999) Developments in predicting the effective size of subdivided populations. Heredity 82:212–226
Wang Y, Pollak E (2000a) The effective number of a population that varies cyclically in size I. Discrete generations. Math Biosci 166:1–21
Wang Y, Pollak E (2000b) The effective number of a population that varies cyclically in size. II. Overlapping generations. Math Biosci 179:161–181
Waples RS (2002) Definition and estimation of effective population size in the conservation of endangered species. In: Beissinger SR, McCullogh DR (eds) Populations viability analysis. The University of Chicago Press, Chicago
Whitlock MC, Barton NH (1997) The effective size of a subdivided population. Genetics 145:427–441
Wright S (1938) Size of population and breeding structure in relation to evolution. Science 87:430–431
Wright S (1931) Evolution in Mendelian populations. Genetics 16:97–159
Wright S (1943) Isolation by distance. Genetics 28:114–138
Acknowledgments
The author was financially supported by the Gustafsson Foundation for Research in Natural Sciences and Medicine and the Swedish Research Council, contracts nr. 621-2008-4946 and 621-2013-4633. He thanks an associate editor and two anonymous reviewers for very helpful comments that considerably improved the manuscript, Nils Ryman for suggesting the source population example of Sect. 5 and for provision of valuable references, and Fredrik Olsson for the plots and numerical values of Sect. 6.
Author information
Authors and Affiliations
Corresponding author
Appendix
Appendix
Proof of Theorem 1
Formula (14) follows from the lower block diagonal decomposition (13) of \({\varvec{P}}\), which implies that the spectrum of eigenvalues of \({\varvec{P}}\) is the union of the spectrum of eigenvalues of all \({\varvec{P}}_{ii}\). Moreover, for each \(i\ge 3\) we have that \(\lambda _{ \text{ max }}({\varvec{P}}_{ii}) <1\), since \({\varvec{P}}_{ii}\) has non-negative elements, row sums less or equal to one, with at least one row sum strictly less than one. In conjunction with (14), this implies \(\lambda _3<1\).
In order to prove (15), we recall from (12) that \(\phi _2({\varvec{x}})=\gamma {\varvec{x}}\). It follows from (10) that \(\{\phi _2({\varvec{X}}_t)\}\) is a martingale. With \(\tau = \min \{t; \, {\varvec{X}}_t\in \mathcal{X}_1\cup \mathcal{X}_2\}\) a stopping time, we can use the Optional Stopping Theorem (cf. Chapter 7 of Grimmett and Stirzaker 2001) to deduce
and similarly \({\varvec{P}}^{t}({\varvec{x}},{\varvec{0}}) \rightarrow 1-\phi _2({\varvec{x}}) = \phi _1({\varvec{x}})\), for any \({\varvec{x}}\in \mathcal{X}\). This establishes the leading two terms on the right hand side of (15). In order motivate the next term of order \(\lambda _3^t\), we notice that \({\varvec{P}}_{kk}\) is non-negative, irreducible and aperiodic, and therefore has a unique largest eigenvalue \(\lambda _3\). But since the maximum in (14) is attained uniquely for \(i=k\), \(\lambda _3\) must also be a simple eigenvalue for \({\varvec{P}}\), so that \(|\lambda _i|<\lambda _3\) for \(i=4,\ldots ,|\mathcal{X}|\). In conjunction with (145), this implies
for some matrix \({\varvec{R}}\) as \(t\rightarrow \infty \). Since \({\varvec{P}}\) has a lower triangular block decomposition by (11), so has \({\varvec{P}}^t=({\varvec{P}}^{(t)}_{ij})\) and \({\varvec{R}}=({\varvec{R}}_{ij})\). Moreover, \({\varvec{P}}^t\) has non-negative elements and \({\varvec{\phi }}_1{\varvec{\pi }}_1 + {\varvec{\phi }}_2{\varvec{\pi }}_2\) has all but its first two columns equal to zero, and therefore each \({\varvec{R}}_{ij}\) with \(j\ge 3\) is non-negative. Apply Perron-Frobenius Theorem to the non-negative, irreducible and aperiodic matrix \({\varvec{P}}_{kk}\) in order to deduce \({\varvec{P}}_{kk}^{(t)} = \lambda _3^t{\varvec{\phi }}_k{\varvec{q}}_k + o(\lambda _3^t)\), with \({\varvec{\phi }}_k\) and \({\varvec{q}}_k\) the right and left eigenvectors for the leading eigenvalue \(\lambda _3\) with strictly positive components, normalized for instance so that \(\sum _{{\varvec{x}}\in \mathcal{X}_k} q_k({\varvec{x}}) = \sum _{{\varvec{x}}\in \mathcal{X}_k} \phi _k({\varvec{x}})q_k({\varvec{x}}) = 1\). This proves that \({\varvec{R}}_k = {\varvec{\phi }}_k{\varvec{q}}_k\), and since all other block diagonal \({\varvec{P}}_{ii}\) have leading eigenvalues smaller than \(\lambda _3\), the corresponding matrices \({\varvec{R}}_{ii}\) must be zero.
The remaining submatrices \({\varvec{R}}_{ij}\) of \({\varvec{R}}\) (\(i>j\ge 1\), \(i\ge 3\)) can be computed as follows. We use (15) and the two recursions \({\varvec{P}}^{(t+1)}_{ij} = \sum _{l=j}^i {\varvec{P}}_{il}{\varvec{P}}_{lj}^{(t)}\) and \({\varvec{P}}^{(t+1)}_{ij} = \sum _{l=j}^i {\varvec{P}}_{il}^{(t)}{\varvec{P}}_{lj}\) and let \(t\rightarrow \infty \). After some computations this leads to
with \({\varvec{I}}_p\) an identity matrix of order \(p\). The upper and lower recursions of (147) is applicable for \(i\ne k\) and \(j\ne k\), and all \({\varvec{R}}_{ij}\) can be found from them. \(\square \)
Proof of Corollary 1
It follows from (15) and (17) that
where the last inequality holds if \(\pi (\mathcal{X}_k)>0\), since \(q_k\), \(\phi _k\) and \(\phi \) are strictly positive functions on \(\mathcal{X}_k\). \(\square \)
Proof of Proposition 2
The proposition follows from
where in the second step we used the symmetry condition \(W_{ij}=W_{ji}\), and in the last step the definition of \(H_{tij}\) in (23). \(\square \)
Proof of Theorem 2
Assume that all elements \(W_{ij}\) of \({\varvec{W}}\) are strictly positive, and that the symmetry condition \(W_{ji}=W_{ij}\) holds. Then \(\phi _{{\varvec{W}}}\) satisfies (17), and it follows from (18) and Proposition 2 that
with \(C > 0\). On the other hand we can use (25) repeatedly \(t\) times to deduce
The block decomposition (28) implies that \(\lambda _{ \text{ max }}({\varvec{A}}) = \max _{a=1,\ldots ,m} \lambda _{ \text{ max }}({\varvec{A}}_{aa})\), and since (148)–(149) hold for any vector \({\varvec{W}}\) with non-negative components and any admissible \({\varvec{H}}_0\), (29) follows. In the sequel, we therefore write \(\lambda =\lambda _{ \text{ max }}({\varvec{A}})\).
If the maximum in (29) is attained for a unique \(1\le c\le m\), we lump \(\mathcal{I}_2 = \bar{\mathcal{C}}_1 \cup \bar{\mathcal{C}}_2 \cup \bar{\mathcal{C}}_3\) into (at most) three components \(\bar{\mathcal{C}}_1 = \cup _{a=1}^{c-1} \mathcal{C}_a\), \(\bar{\mathcal{C}}_2 = \mathcal{C}_c\) and \(\bar{\mathcal{C}}_3 = \cup _{a=c+1}^{m} \mathcal{C}_a\). In particular, we put \(\bar{\mathcal{C}}_1=\emptyset \) if \(c=1\), and \(\bar{\mathcal{C}}_3=\emptyset \) if \(c=m\). The corresponding block decomposition of \({\varvec{A}}\) is
It can be shown that matrices \({\varvec{C}}_{ab}\) exist for \(1\le b \le a \le 3\) such that
as \(t\rightarrow \infty \). We will identify the components of \({\varvec{C}}=({\varvec{C}}_{ab})\), and in this process find the right and left eigenvectors \({\varvec{r}}\) and \({\varvec{\rho }}\) of \({\varvec{A}}\).
Starting with the diagonal submatrices of \({\varvec{C}}\), it follows that \({\varvec{C}}_{11}={\varvec{C}}_{33}={\varvec{0}}\), since \(\lambda _{ \text{ max }}(\bar{{\varvec{A}}}_{aa}) < \lambda \) for \(a=1,3\). On the other hand, since \(\bar{{\varvec{A}}}_{22}={\varvec{A}}_{cc}\) is irreducible with \(\lambda _{ \text{ max }}(\bar{{\varvec{A}}}_{aa}) = \lambda \), Perron–Frobenius Theorem and (149) imply that \(\lambda \) is a simple eigenvalue of \(\bar{{\varvec{A}}}_{22}\) with periodicity \(1\). Thus we can find a right eigenvector \({\varvec{r}}_2 = \left( r_{2ij};\, ij\in \bar{\mathcal{C}}_2\right) ^\prime \), and a left eigenvector \({\varvec{\rho }}_2 = \left( \rho _{2ij};\, ij\in \bar{\mathcal{C}}_2\right) \) of \(\bar{{\varvec{A}}}_{22}\) with strictly positive components, normalized so that \(\sum _{ij\in \bar{\mathcal{C}}_2} \rho _{2ij} = \sum _{ij\in \bar{\mathcal{C}}_2} r_{2ij}\rho _{2ij} = 1\), with
For the non-diagonal elements of \({\varvec{C}}\) we use the three recursions
insert (151) into (153), divide both sides of all three equations in (153) by \(\lambda ^t\) and let \(t\rightarrow \infty \). After some computations, this yields
with \({\varvec{I}}_p\) an identity matrix of order \(p\). Finally, inserting (152) into (154), we find that \({\varvec{C}}= {\varvec{r}}{\varvec{\rho }} = ({\varvec{0}}^\prime ,{\varvec{r}}_2^\prime ,{\varvec{r}}_3^\prime )^\prime ({\varvec{\rho }}_1,{\varvec{\rho }}_2,{\varvec{0}})\), with
Both \({\varvec{r}}_3\) and \({\varvec{\rho }}_1\) must have non-negative elements, since \({\varvec{r}}_2\) and \({\varvec{\rho }}_2\) have strictly positive elements, all \(\bar{{\varvec{A}}}_{ab}\) are non-negative and \(\bar{{\varvec{A}}}_{11},\bar{{\varvec{A}}}_{33}\) have all their eigenvalues less than \(\lambda \). Finally, since \({\varvec{r}}_2\) and \({\varvec{\rho }}_2\) are right and left eigenvectors of \(\bar{{\varvec{A}}}_{22}\) with eigenvalue \(\lambda \), it follows after some computations from (155) and the block decomposition (150) of \({\varvec{A}}\), that \({\varvec{r}}\) and \({\varvec{\rho }}\) are right and left eigenvectors of \({\varvec{A}}\) with eigenvalue \(\lambda \). \(\square \)
Proof of Theorem 3
Recall that \(\mathcal{H}_{tij}\) in (22) is the probability that two genes at time \(t\) have different types of alleles when picked independently from subpopulations \(i\) and \(j\), with replacement if \(i=j\). Conditionally on \(\varvec{\mathcal {B}}_t\) and \({\varvec{X}}_{t-1}\), we compute the expected value of this probability by conditioning on the parental subpopulations \(k\) and \(l\) of \(i\) and \(j\), and then take into account whether the two parental genes are identical or not. We find that
since the probability of picking two different genes at time \(t\) is \(1-1/(2Nu_i)\) when \(i=j\) and 1 when \(i\ne j\). Then the probability that the two parental genes from subpopulations \(k\) and \(l\) are different is 1 when \(k\ne l\) and \(1-\mathcal{P}_{tijk}\) when \(k=l\). In the former \(k\ne l\) case, the probability is \(\mathcal{H}_{t-1,kl}\) that the parental genes are different by state, and when \(k=l\), the probability is \(\mathcal{H}_{t-1,kk}/(1-(2Nu_k)^{-1})\) that the two parental genes are different by state.
We can express the coalescence probability \(p_{ijk}\) in (37) in terms of \(\mathcal{P}_{tijk}\) as
Then average with respect to \(\varvec{\mathcal {B}}_t\) on the left and right hand sides of (156), use independence between \(\varvec{\mathcal {B}}_t\) and \({\varvec{X}}_{t-1}\), invoke (35), (41) and (157), to find that
in accordance with (40). As a next step we average both sides of (158) with respect to \({\varvec{X}}_{t-1}\), using starting distribution \(\pi \) for \({\varvec{X}}_0\), and get
which is equivalent to (25). To verify (39), we first compute
and introduce the variables
which are conditional versions of \(V_{kij}\) in (38). Then average with respect to the exchangeable random vectors \(\{{\varvec{\nu }}_{tkg}\}_{g=1}^{2Nu_k}\) in (160) to deduce that
We can rewrite (34) as
and taking the product of the last two displayed equations, we find that
Formula (39) then follows from (157) by averaging with respect to \(\varvec{\mathcal {B}}_t\) in (163), using that \(E(\mathcal{V}_{tkij})=V_{kij}\), and finally dividing by \(Q_{ij,kk}\). \(\square \)
Proof of Corollary 3
Define \({\varvec{\phi }}_3\) as in (42), and the gene diversity vector \({\varvec{\mathcal{H}}}({\varvec{x}}) = \text{ vec }\left( (x_i(1-x_j)+x_j(1-x_i))_{i,j=1}^s\right) ^\prime \) of length \(s^2\) obtained from an allele frequency vector \({\varvec{x}}\) of length \(s\). Then
In the fifth step we used (40), and in the seventh step that \({\varvec{\rho }}\) is a left eigenvector of \({\varvec{A}}\) with eigenvalue \(\lambda \). Formula (43) is proved in the same way as the corresponding result for the right eigenvector \({\varvec{r}}\) of \({\varvec{A}}\) in Theorem 2. \(\square \)
Verifying ( 63 ), ( 64 ) and ( 65 ) We first insert (7) and the expression for \(Q_{ij,kl}\) in (51) into (39), and find that \(p_{ijk} = \left[ V_{kij}/(M_{ki}M_{kj})\right] / (2Nu_k-1_{\{i=j\}}u_k/u_i)\). Since \(E(\nu _{tki1})=M_{ki}\), it follows from (38) that
In order to compute \(\text{ Var }(\nu _{tki1})\) and \(\text{ Cov }(\nu _{tki1},\nu _{tkj1})\) for \(k=1\), the subpopulation of grandpaternally inherited alleles of the fathers of time \(t-1\), let \(\zeta _{tmmg} = \nu _{t11g} + \nu _{t21g^\prime }\) and \(\zeta _{tmfg} = \nu _{t13g} + \nu _{t23g^\prime }\) be the total number of children that are males and females respectively, of the male in time \(t-1\) whose paternally and maternally inherited genes have been assigned number \(g\) and \(g^\prime =g^\prime (g)\) from \(1,\ldots ,N_m=2Nu_1\). (Due to the convention (9) that the first \(2Nu_kX_{t-1,k}\) genes of subpopulation \(k\) have the specified allele, we cannot assume \(g^\prime =g\).) Due to exchangeability, \(\tau _{mm}^2 = \text{ Var }(\zeta _{tmm1})\), \(\tau _{mf}^2 = \text{ Var }(\zeta _{tmf1})\) and \(\tau _{mm,mf} = \text{ Cov }(\zeta _{tmm1},\zeta _{tmf1})\).
It follows from Mendel’s law of inheritance that either the paternally or maternally inherited gamete is passed on to the offspring, with equal probability 0.5, independently between matings. Hence \(\nu _{t11g}|\zeta _{tmmg} \sim \text{ Bin }(\zeta _{tmmg},0.5)\) and \(\nu _{t13g}|\zeta _{tmfg} \sim \text{ Bin }(\zeta _{tmfg},0.5)\), which implies
and analogously \(\text{ Var }(\nu _{t13g}) = \tau _{mf}^2/4 + M_{13}/2\) and \(\text{ Cov }(\nu _{t11g},\nu _{t13g}) = \tau _{mm,mf}/4\). Inserting the last three formulas and (62) into (164), we see that \(p_{111}\), \(p_{331}\) and \(p_{131}\) simplify to (63), (64) and (65) respectively. \(\square \)
Proof of Theorem 5
Formulas (78) and (79) follow easily from a first order Taylor expansion of (73) with respect to \(\varepsilon \) at 0, making use of (75) and (77). Next we apply Theorem 4. We want to show that (80) follows from (69), with \(\varepsilon \) as in (74) and that (81) is obtained by inserting (79) into (72). Indeed,
where in the last step we employed that \({\varvec{\rho }}(0)\) is a left eigenvector of \({\varvec{Q}}(0)\) with eigenvalue 1, and moreover that \(\sum _{kl} Q_{ij,kl}(\varepsilon )=1\) for all \(\varepsilon \ge 0\), which implies \(\sum _{kl} Q_{ij,kl}(0)=1\) and \(\sum _{kl} \dot{Q}_{ij,kl}=0\). \(\square \)
Proof of Corollary 5 and 6
Corollary 4 implies that \(Q_{ij,kl}(0)=B_{ik}B_{jl}\) in Corollary 6, so that
follows from (4), and \({\varvec{\gamma }}\otimes {\varvec{\gamma }}= (\gamma _i\gamma _j)\) is a left eigenvector of \({\varvec{Q}}(0)\) with eigenvalue 1. This proves (89), and insertion into (81) yields (90). We then use (7) and (76) in order to prove (91), since
In Corollary 5, it follows from Corollary 4 that \(Q_{ii,kl}(0) = 1_{\{k=l\}}B_{ik}\), and
because of (4). This proves (86), which inserted into (81) yields (87). Since
and \(Q_{ii,kk}(0)=B_{ik}\), we get \(\sigma _{iik} = \lim _{N\rightarrow \infty } \left( B_{ik}\bar{V}_{kii}u_k^2\right) /\left( (Nu_k)^{1-\beta }u_i^2B_{ik}\right) \) from (76), thereby proving (88). \(\square \)
Motivation of Lemma 1. We first sketch a proof of (110), using similar calculations as in Möhle (1998a). By the definition of an equilibrium distribution of a Markov chain, \({\varvec{B}}(\varepsilon )^t \rightarrow {\varvec{1}}{\varvec{\gamma }}(\varepsilon )\) as \(t\rightarrow \infty \), where \({\varvec{1}}\) is a column vector of ones of length \(s\). Hence we will study \({\varvec{B}}(\varepsilon )^t\) for large \(t\) and use the approximation
which can be shown to be accurate in the limit \(\varepsilon \rightarrow 0\), with
a block diagonal matrix and \({\varvec{1}}_d=(1,\ldots ,1)^\prime \) a column vector of \(|\mathcal{I}(d)|\) ones. It follows from (108) and \({\varvec{\gamma }}_d={\varvec{\gamma }}_d{\varvec{B}}_{dd}(0)\), that
where \({\varvec{G}}^r=(G_{ab}^{(r)})\). Putting \(t=x/\varepsilon \) in (165), we notice that
where
since \({\varvec{\theta }}=(\theta _1,\ldots ,\theta _m)\) is an equilibrium distribution of the continuous time Markov process with infinitesimal generator \({\varvec{G}}\). Combining the last two displayed equations, we find for large \(x\) and small \(\varepsilon \) that
where \({\varvec{\gamma }}(0)=(\gamma _i(0))\). We use (106), (108) and (110) to prove (111);
since the row sums of \(\dot{B}\) are zero. Finally, (112) follows from (105), since
\(\square \)
Proving ( 133 ). By (105) and (132), the off-diagonal block elements of \(\dot{{\varvec{B}}}\) are
where \(B_{1k}(0)\) is the probability that the parent of a newborn in deme 1 originates from age class \(k\) of that deme when there is no migration, and similarly \(B_{z+1,z+k}(0)\) is the probability that the parent of a newborn in deme 2 has a parent from age class \(k\) of that deme. Invoking (108), the elements of \({\varvec{G}}\) are
Use (119) to deduce that \(\dot{{\varvec{\Lambda }}}=(\dot{\Lambda }_{\alpha ,\beta })_{\alpha ,\beta =12,21}\) is a diagonal matrix of order \(v=2(2-1)=2\), with both diagonal elements equal to \(G_{11}+G_{22}\). Therefore
Because of (8) and (132), the numerator of (122) equals
Inserting (166) and (167) into (122), we arrive at (133). \(\square \)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
About this article
Cite this article
Hössjer, O. On the eigenvalue effective size of structured populations. J. Math. Biol. 71, 595–646 (2015). https://doi.org/10.1007/s00285-014-0832-5
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00285-014-0832-5
Keywords
- Eigenvalue effective size
- Coalescence theory
- Predicted gene diversity
- Migration
- Perron–Frobenius
- Perturbation theory of eigenvalues