Abstract
The epidemic threshold is probably the most studied quantity in the modelling of epidemics on networks. For a large class of networks and dynamics, it is well studied and understood. However, it is less so for clustered networks where theoretical results are mostly limited to idealised networks. In this paper we focus on a class of models known as pairwise models where, to our knowledge, no analytical result for the epidemic threshold exists. We show that by exploiting the presence of fast variables and using some standard techniques from perturbation theory we are able to obtain the epidemic threshold analytically. We validate this new threshold by comparing it to the threshold based on the numerical solution of the full system. The agreement is found to be excellent over a wide range of values of the clustering coefficient, transmission rate and average degree of the network. Interestingly, we find that the analytical form of the threshold depends on the choice of closure, highlighting the importance of model selection when dealing with real-world epidemics. Nevertheless, we expect that our method will extend to other systems in which fast variables are present.
1 Introduction
Epidemic dynamics on networks, being susceptible-infected-susceptible (SIS), susceptible-infected-recovered (SIR) or otherwise, are often modelled as continuous time Markov chains with discrete but extremely large state spaces of order \(m^{N}\), where m denotes the number of different disease statuses (e.g. \(m=2\) for SIS and \(m=3\) for SIR) and N stands for the number of nodes in the network. This makes the analysis of the resulting exact system almost impossible, except for some specific network topologies such as the fully connected network, networks with considerable structural symmetry or networks with few nodes (Kiss et al. 2017; Holme 2017).
Often, this problem is dealt with by focusing on mean-field models where the goal is to derive, often heuristically, a system of ordinary or integro-differential equations that describe (non-Markovian) epidemics for some average quantities, such as the expected number of nodes in various states, the expected number of links in various states or the expected number of star-like structures (focusing on a node and all of its neighbours). These methods usually rely on closures to break the dependency on higher-order moments (e.g. the expected number of nodes in a state depends on the expected number of links in certain states and so on). Such an approach has led to a number of models including heterogeneous or degree-based mean-field (Pastor-Satorras and Vespignani 2001; Pastor-Satorras et al. 2015), pairwise (Rand 1999; Keeling 1999), effective-degree (Lindquist et al. 2011), edge-based compartmental (Miller et al. 2012) and message passing (Karrer and Newman 2010a), to name a few. These models essentially differ in the choice of variables over which the averaging is done. Perhaps the most compact model with the fewest number of equations is the edge-based compartmental model (Miller and Volz 2013) which is valid for heterogeneous networks with Markovian SIR epidemics, although extensions of this model for arbitrary infection and recovery processes are possible (Sherborne et al. 2018).
Pairwise models have been extremely popular and the very first model for regular networks and SIR epidemics (Rand 1999; Keeling 1999) has been generalised to heterogeneous networks (Eames and Keeling 2002), preferentially mixing networks (Eames and Keeling 2002), directed (Sharkey et al. 2006) and weighted networks (Rattana et al. 2013), adaptive networks (Gross et al. 2006; Kiss et al. 2012; Szabó-Solticzky et al. 2016), and structured networks (House et al. 2009) among others. Perhaps this is due to the relative simplicity and transparency of the pairwise model, whereby variables have a straightforward interpretation and a basic understanding of the network and epidemic dynamics coupled with good bookkeeping leads to valid and analytically tractable model equations. Pairwise models have been successfully used to derive analytically the epidemic threshold and final epidemic size, with these results mostly limited to networks without clustering. The propensity of contacts to cluster, i.e. two neighbours of a node being neighbours of one another, is known to lead to many complications, and modelling epidemics on clustered networks using analytically tractable mean-field models is still limited to networks with specific structural features (House et al. 2009; Newman 2009; Miller 2009a, b; Karrer and Newman 2010b; Volz et al. 2011; Ritchie et al. 2016). However, using approaches borrowed from percolation theory (Miller 2009b) and focusing more on the stochastic process itself (Trapman 2007a), some results have been obtained. For example, Miller (2009b) showed that for the SIR epidemic on clustered networks with heterogeneous degree distributions, the basic reproduction number is given by
where \(\langle k^i \rangle \) stands for the ith moment of the degree distribution, T is the probability of infection spreading across a link connecting an infected to a susceptible node and \(\langle n_{\triangle }\rangle \) denotes the average number of triangles that a node belongs to. The first positive term in Eq. (1.1) corresponds to the threshold for configuration-type networks without clustering. The second term in Eq. (1.1), which is negative, shows that clustering reduces the epidemic threshold when compared to the unclustered case, the contribution of the remaining terms being of a smaller order.
For pairwise models, clustering first manifests itself by requiring a different and more complex closure, which makes the analysis of the resulting system, even for regular networks and SIR dynamics, challenging. Furthermore, it turns out that such a closure may in fact fail to conserve pair-level relations and may not accurately reflect the early growth of quantities such as closed loops of three with all nodes being infected (House and Keeling 2010). Such considerations have led to an improved closure being developed in an effort to keep as many true features of the exact epidemic process as possible (House and Keeling 2010). In this paper we focus on the classic pairwise model for regular networks with clustering, using both the simplest closure and a variant of the improved closure. We show that by working with two fast variables, corresponding to correlations between neighbouring nodes during the epidemic, we can analytically determine the epidemic threshold as an asymptotic expansion in terms of the global clustering coefficient \(\phi \), defined in Sect. 2.1.
The use of fast variables is not new (Keeling 1999; Juher et al. 2013; Llensa et al. 2014; Britton et al. 2016; Eames 2008). However, in many cases the epidemic threshold has only been obtained numerically and it was framed in terms of a growth-rate-based threshold, which is equivalent to the basic reproduction number at the critical point. Eames (2008) considered a hybrid pairwise model incorporating random and clustered contacts, with the analysis focusing on the growth-rate-based threshold. Eames (2008) derived a number of results, some analytic (the critical clustering coefficient for which an epidemic can take off) and some semi-analytic, and showed, in agreement with most studies, that clustering inhibits the spread of the epidemic when compared to an equivalent network without clustering but with equivalent parameter values governing the epidemic process. However, no analytic expression for the epidemic threshold was provided.
More recently, Li et al. (2018) calculated the epidemic threshold in a pairwise model for clustered networks with closures based on the number of links in a motif, rather than nodes. This led to
where n is the average number of links per node, \(\phi \) is the global clustering coefficient, and \(\tau \) and \(\gamma \) are the infection and recovery rates, respectively. The expression above can be expanded in terms of the clustering coefficient \(\phi \) to give
which again demonstrates that clustering reduces the epidemic threshold.
Building on these results, and effectively extending the work by Keeling (1999) and Eames (2008), our paper presents a method to determine the epidemic threshold analytically and applies it in the context of pairwise models with two different closures for clustered networks. The paper is structured as follows. In Sect. 2 we outline the model with closures for unclustered and clustered networks discussed in Sect. 3. In Sect. 4 we briefly review existing results and approaches for the pairwise model with the simple closure and then focus on the correlation structure in terms of fast variables, showing that the epidemic threshold can be expressed via the solution of a cubic polynomial. This key solution is determined numerically and analytically as an asymptotic expansion in terms of the clustering coefficient. In Sect. 5 we show that our approach extends to a compact version of the improved closure, thus validating and generalising our approach. Finally, we conclude with a discussion of the results, including comparing the threshold to other known results and touching upon a number of possible extensions.
2 Model formulation
2.1 The network
We begin by considering a population of N individuals with its contact structure described by an undirected network with adjacency matrix \(G=(g_{ij})_{i,j=1,2,\dots , N}\) where \(g_{ij}=1\) if nodes i and j are connected and zero otherwise. Self-loops are excluded, so \(g_{ii}=0\) and \(g_{ij}=g_{ji}\) for all \(i,j=1,2, \dots N\). The network is static and regular, such that each individual has exactly n edges or links. The sum over all elements of G is defined as \(||G||=\sum _{i,j}g_{ij}\). Hence, the number of doubly counted links in the network is \(||G||=nN\). More importantly, using simple matrix operations on G, we can calculate the global clustering coefficient of the network
\[\phi =\frac{trace(G^{3})}{||G^{2}||-trace(G^{2})}, \qquad (2.1)\]
where \(trace(G^{3})\) yields six times the number of closed triples or loops of length three (uniquely counted) and \(||G^{2}||-trace(G^{2})\) is twice the number of triples (open and closed, also uniquely counted).
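As a small illustration of the matrix operations described above, the following sketch computes \(\phi \) for two toy networks, assuming \(\phi \) is the ratio of \(trace(G^{3})\) to \(||G^{2}||-trace(G^{2})\). The helper names (`matmul`, `trace`, `total`, `global_clustering`) and the two example graphs are hypothetical and chosen only for this illustration.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def trace(m):
    """Sum of the diagonal elements."""
    return sum(m[i][i] for i in range(len(m)))


def total(m):
    """||M||: the sum over all matrix elements."""
    return sum(sum(row) for row in m)


def global_clustering(g):
    """phi = trace(G^3) / (||G^2|| - trace(G^2)) for a 0/1 adjacency matrix."""
    g2 = matmul(g, g)
    g3 = matmul(g2, g)
    return trace(g3) / (total(g2) - trace(g2))


# Complete graph on 4 nodes (regular, n = 3): every triple is closed.
k4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
# Cycle on 4 nodes (regular, n = 2): there are no triangles.
c4 = [[1 if abs(i - j) in (1, 3) else 0 for j in range(4)] for i in range(4)]

phi_k4 = global_clustering(k4)
phi_c4 = global_clustering(c4)
```

The two graphs sit at the extremes: the complete graph gives the largest possible value of \(\phi \), while the cycle has no closed triples at all.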
2.2 SIR dynamics
The standard SIR epidemic dynamics on a network is considered. The dynamics are driven by two processes: (a) infection and (b) recovery from infection. Infection can spread from an infected and infectious node to any of its susceptible neighbours and this is modelled as a Poisson point process with per-link infection rate \(\tau \). Infectious nodes recover from infection at constant rate \(\gamma \).
2.3 The unclosed pairwise model
Let \(A_{i}\) equal 1 if the individual at node i is of type A and equal zero otherwise. Then single nodes (singles) of type A can be counted as \([A]=\sum _{i}A_{i}\), pairs of nodes (pairs) of type \(A-B\) can be counted as \([AB]=\sum _{i,j}A_{i}B_{j}g_{ij}\) and triples of nodes (triples) of type \(A-B-C\) can be counted as \([ABC]=\sum _{i,j,k}A_{i}B_{j}C_{k}g_{ij}g_{jk}\). This method of counting means that pairs are counted once in each direction, so \([AB]=[BA]\), and [AA] is even. Using this notation to keep track of singles, pairs and triples leads to the following system of pairwise equations describing the SIR epidemic on networks:
We note that Eqs. (2.4)–(2.6) contain triples but evolution equations for these are not given. To determine solutions of the system, we must find a way to account for these triples in terms of pairs and singles, a method referred to as closing the system. The system above is exact before a closure is applied. This means that it can be derived directly from the exact stochastic epidemic model on the network, given by a continuous time Markov chain, without making any approximations [a precise proof for the SIS epidemic was given by Taylor et al. (2012)]. The flow between compartments and the rates of the SIR pairwise model are illustrated in Fig. 1. The system given above only contains dynamically relevant variables, i.e. those that emerge naturally when following a strict bookkeeping rule, and those that appear when a chosen closure for the triples is considered.
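The counting rules for singles, pairs and triples in Sect. 2.3 can be sketched by brute force on a small labelled network. The graph (an octahedron, regular with \(n=4\)), the labelling and all function names below are hypothetical. The triple count additionally skips the degenerate case \(i=k\); this is an assumption made here so that a triple always involves three distinct nodes.

```python
from itertools import product


def count_single(labels, a):
    """[A]: number of nodes in state a."""
    return sum(1 for state in labels if state == a)


def count_pair(g, labels, a, b):
    """[AB] = sum_{i,j} A_i B_j g_ij, so each link is counted once per direction."""
    size = len(labels)
    return sum(g[i][j] for i, j in product(range(size), repeat=2)
               if labels[i] == a and labels[j] == b)


def count_triple(g, labels, a, b, c):
    """[ABC] = sum_{i,j,k} A_i B_j C_k g_ij g_jk, skipping i == k (assumption)."""
    size = len(labels)
    return sum(g[i][j] * g[j][k]
               for i, j, k in product(range(size), repeat=3)
               if i != k and (labels[i], labels[j], labels[k]) == (a, b, c))


# Hypothetical example: octahedron graph, regular with n = 4 and N = 6.
N, n = 6, 4
g = [[1 if (i - j) % 6 in (1, 2, 4, 5) else 0 for j in range(N)] for i in range(N)]
labels = ["I", "S", "S", "S", "R", "S"]

pairs = {(a, b): count_pair(g, labels, a, b) for a in "SIR" for b in "SIR"}
triples_around_si = sum(count_triple(g, labels, a, "S", "I") for a in "SIR")
```

With this counting, \([AB]=[BA]\), \([AA]\) is even and the pair counts sum to nN for any labelling, which gives a quick sanity check on the bookkeeping.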
3 Closures
A quick inspection of the unclosed pairwise system (2.2)–(2.6) reveals that only triples of type [ASI] need closing, with \(A\in \{S, I\}\). These triples, as well as triples of type [RSI], are illustrated in Fig. 2 for unclustered and clustered networks.
3.1 Closure for unclustered networks
First, we consider the situation depicted in Fig. 2a. We aim to find an approximation for the distribution of the random variables \(X_i\) which take values from the set \(\{S, I, R\}\). Several observations can be made. The expected number of \(A-S\) type links is [AS] and the total number of links emanating from susceptible nodes counted across the whole network is n[S]. Hence, the most straightforward approximation would be to assume that \(X_{i}\), with \(i=1,2,\dots ,n-1\), are independent and identically Bernoulli distributed random variables with probability \(p_{A|S-I}^{uc}=\frac{[AS]}{n[S]}\), where \(p_{A|S-I}^{uc}\) stands for the probability that a neighbour of a susceptible node already connected to an infected node will be in state A, provided that the network is unclustered. Averaging across the whole network leads to the closure
\[[ASI]=(n-1)[SI]\,p_{A|S-I}^{uc}=\frac{n-1}{n}\frac{[AS][SI]}{[S]}. \qquad (3.1)\]
It is important to note that the new closed system, obtained upon using Eq. (3.1) in system (2.2)–(2.6), is effectively an approximation of the exact pairwise model (2.2)–(2.6) and one should question if closure (3.1) conserves the properties of the stochastic process and those of the counting on the network. For example, it is expected that in the closed system the number of nodes is conserved, i.e. \([S]+[I]+[R]=N\). Furthermore, the number of pairs of different types must sum to nN. More subtle conditions refer to the conservation of link types at node level (\([SS]+[SI]+[SR]=n[S]\)) and pair level (\([SSI]+[ISI]+[RSI]=(n-1)[SI]\)), respectively. It turns out that the closure for unclustered networks (3.1) conserves these relations (Kiss et al. 2017). Finally, the validity of closures can be empirically assessed by looking at the initial growth rate of the number of open and closed triples, where the number of open triples comprised of three infectious nodes should grow differently to the number of such closed triples. Of course such subtle tests are usually preceded by direct comparisons between the numerical solution of the closed pairwise system and explicit stochastic network simulations for a range of parameters. Such tests initially focus on prevalence of infection and final epidemic size but may include expected number of pairs.
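The pair-level conservation mentioned above can be checked numerically. The sketch below assumes the unclustered closure takes the form \([ASI]=\frac{n-1}{n}\frac{[AS][SI]}{[S]}\), which follows from \(p_{A|S-I}^{uc}=\frac{[AS]}{n[S]}\) applied to the \(n-1\) remaining neighbours. The function name and all numerical values are hypothetical.

```python
def closure_unclustered(n, pair_as, pair_si, single_s):
    """Assumed form of closure (3.1): [ASI] = (n-1)/n * [AS][SI]/[S]."""
    return (n - 1) / n * pair_as * pair_si / single_s


# Hypothetical expected counts on a regular network with degree n = 4.
n = 4
S = 900.0
SI, SR = 120.0, 80.0
SS = n * S - SI - SR  # node-level conservation: [SS] + [SI] + [SR] = n[S]

# Closed triples [SSI], [ISI] and [RSI]; note that [IS] = [SI] and [RS] = [SR].
closed = {a: closure_unclustered(n, pair_as, SI, S)
          for a, pair_as in (("S", SS), ("I", SI), ("R", SR))}
triple_sum = sum(closed.values())
```

Because the three probabilities \(\frac{[AS]}{n[S]}\) sum to one whenever the node-level relation holds, the closed triples sum to \((n-1)[SI]\) regardless of the particular values chosen.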
3.2 Closures for clustered networks
3.2.1 Simple closure
The presence of closed loops of length three, as illustrated in Fig. 2b, introduces some complications. Namely, a neighbour of the central susceptible node that is itself connected to an infected neighbour of the central node is less likely to be susceptible due to the added pressure from the infected neighbour, when compared to the case when the force of infection is distributed evenly, as it is the case for the closure for unclustered networks (3.1). More precisely, the epidemic process on the network displays clear correlations. Cator and Van Mieghem (2014) have shown that the exact SIS and SIR epidemics on networks are non-negatively correlated in the sense that \({\mathbf {P}}(I_{i}I_{j})\ge {\mathbf {P}}(I_i){\mathbf {P}}(I_j)\). Here, \({\mathbf {P}}(I_iI_j)\) represents the probability that nodes i and j, connected by a link, are both infected, while \({\mathbf {P}}(I_i)\) stands for the probability of node i being infected. For this result to hold, all processes must be Markovian and infection rates across all links and recovery rates of all nodes have to be fixed a priori. Using the pairwise model for an SIS epidemic on an unclustered network with closure (3.1), it has been shown that the same correlation is preserved when averaging at the population level (Kiss et al. 2017). While the proof has not been extended to the pairwise SIR model, intuitively we expect to find the same correlation structure. Based on these observations, we assume that the correlation structure in exact SIS and SIR epidemics on networks averaged at the population level is maintained. Hence, the inequalities
hold, where [AB] and [A] with \(A,B\in \{S,I\}\) represent the expected counts of pairs and singles of the corresponding types taken from the exact model, i.e., the continuous time full Markov chain.
Intuitively, this means that as the epidemic spreads on the network, infected nodes are more likely to have neighbours which are themselves infected (either those that infected the node or were infected by it), and at the ‘front’ of the epidemic we would expect to observe a ‘sea’ of susceptible nodes alongside a ‘front’ of links between susceptible and infected nodes that drives the epidemic. Hence, clustering and correlations need to be accounted for and a new \(p_{A|S-I}^{c}\) for clustered networks needs to be defined. This has been done by Keeling (1999) [see also work by Rand (1999) and Keeling et al. (1997)] and relies on a correlation factor, \(C_{AB}\), that is able to capture the propensity that two nodes connected by a link are in states A and B, respectively. This is given by
\[C_{AB}=\frac{N[AB]}{n[A][B]}, \qquad (3.3)\]
where \(A,B\in \{S,I\}\). This effectively compares the expected number of edges of type [AB] to what its value would be if nodes were labelled at random with [A] nodes of type A and [B] nodes of type B. If \(C_{AB}>1\), then nodes of type A and B are positively correlated, whereas if nodes of type A and B are negatively correlated, \(C_{AB}<1\). As expected, \(C_{AB}=1\) means that nodes are effectively labelled as type A or B at random. Equation (3.2) implies that
We can modify \(p_{A|S-I}^{uc}=\frac{[AS]}{n[S]}\) to reflect these observations, leading to \(p_{A|S-I}^{c}=\frac{[AS]}{n[S]}C_{AI}\). However, before the closure can be written, open and closed loops need to be treated separately. In order to do this, we split the closure based on whether the neighbour whose state is to be determined is part of a closed loop of three nodes and thus in direct contact with an infectious node, or not. This leads to
where \(\phi \) is defined in Eq. (2.1). With this in mind, the closure can be derived by averaging equation (3.1) over the unclustered and clustered parts of the network. This leads to
This same closure has been derived by Keeling et al. (1997) and Keeling (1999). Framing \(p_{A|S-I}^{uc}\) and \(p_{A|S-I}^{c}\) more generally and independently of the network type, i.e. simply considering \(p_A\), the following statement holds:
Proposition 1
Consider a closure of the following form \([ASI]=(n-1)[SI]p_{A}\). If \(\sum _{A}p_{A}=1\), where A is taken over all possible states, then \(\sum _{A}[ASI]=(n-1)[SI]\).
Proof
\(\sum _{A}[ASI]=(n-1)[SI]\sum _{A}p_{A}=(n-1)[SI]\). \(\square \)
3.2.2 Improved closure
We note that while \(p_{A|S-I}^{uc}\) satisfies the above proposition, \(p_{A|S-I}^{c}\) does not. In particular, we find
However, for the clustered part of the network this is not the case. We find that
which does not result in the desired \((n-1)[SI]\). This can be corrected in a straightforward way by defining
Hence we can now write
as required. It is informative to investigate the relationship between the various probability models that lead to different closures. This is summarised in the following proposition.
Proposition 2
For closures applied across the clustered part of the network and assuming that the number of nodes in state R is negligible, it follows that
and
Proof
All three probabilities follow from their definitions and assuming that \(A\in \{S,I\}\). Since \(S-I\) links are negatively correlated (3.2), it follows that \(C_{SI}=\frac{N[SI]}{n[S][I]}\le 1\) and as a result
While \(p_{S|S-I}^{c}\) has a natural interpretation (it is a simple discounted variant of the probability from the unclustered network case and takes into account the observation that if the neighbour of a central susceptible node is connected to one of the infected neighbours of the same node then it is less likely that the node in question is susceptible), the interpretation of \(p_{S|S-I}^{c_{new}}\) is less obvious. A close inspection reveals that \(p_{S|S-I}^{c_{new}}\) can be rewritten as
However, combining \([SI] \le n[S]\frac{[I]}{N}\) with \([I] \le \frac{N}{n}\frac{[II]}{[I]}\), as given in Eq. (3.2), leads to \([SI] \le [II]\frac{[S]}{[I]}\). Finally, using the relation \([SI] \le [II]\frac{[S]}{[I]}\) in Eq. (3.12) yields
Equation (3.13) illustrates that as expected \(p_{S|S-I}^{c_{new}} \le p_{S|S-I}^{uc}\). Again, this simply shows that for clustered networks and for the setup in Fig. 2, it is less likely to find neighbours who are susceptible compared with the unclustered network case. \(\square \)
Taking into account the new way of defining \(p_{A|S-I}^{c_{new}}\), the improved closure yields
We finally note that the closures rely heavily on the assumption of how the states of the neighbours are distributed, and the assumption of independent and identically Bernoulli-distributed variables is a strong one. For clustered networks in particular, we have illustrated different ways of incorporating correlations induced by closed cycles of length three. Despite these seemingly strong assumptions, it is known that the pairwise model for unclustered networks is equivalent to the edge-based compartmental equivalent on configuration networks (Miller and Kiss 2014; Kiss et al. 2017) and the latter has been shown to be the limiting system of the stochastic network epidemic model (Decreusefond et al. 2012; Janson et al. 2014). For clustered networks we are not aware of such results.
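To make the difference between the probability models concrete, the sketch below evaluates them for one set of hypothetical counts with recovered nodes neglected. It uses \(p_{A|S-I}^{uc}=\frac{[AS]}{n[S]}\) and \(p_{A|S-I}^{c}=\frac{[AS]}{n[S]}C_{AI}\) with \(C_{AB}=\frac{N[AB]}{n[A][B]}\) as in the text. The corrected probability is taken to be \(p^{c}\) rescaled so that it sums to one, which is an assumption about the exact form of the improved closure.

```python
def correlation(N, n, pair_ab, single_a, single_b):
    """C_AB = N [AB] / (n [A] [B])."""
    return N * pair_ab / (n * single_a * single_b)


# Hypothetical early-epidemic counts on a regular network (R neglected).
N, n = 10000.0, 5
S, I = 9900.0, 100.0
SI, II = 300.0, 150.0
SS = n * S - SI  # node-level conservation with [SR] = 0

C_SI = correlation(N, n, SI, S, I)
C_II = correlation(N, n, II, I, I)

# Unclustered, simple clustered and rescaled (assumed improved) probabilities.
p_uc = {"S": SS / (n * S), "I": SI / (n * S)}
p_c = {"S": p_uc["S"] * C_SI, "I": p_uc["I"] * C_II}
norm = sum(p_c.values())
p_new = {a: p_c[a] / norm for a in p_c}
```

For these values \(C_{SI}\le 1\le C_{II}\), the simple clustered probabilities do not sum to one, and the rescaled ones do. This is the property that Proposition 1 needs for pair-level conservation.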
4 Results for the pairwise model with the simple closure
4.1 Background
Using the simple closure for clustered networks (3.7), and writing \(\xi =\frac{(n-1)}{n}\), we obtain the following closed pairwise model equations describing an SIR epidemic on a clustered regular network of N individuals with degree n:
For model equations (4.1)–(4.5), the basic reproductive ratio (\(R_{0}\)) is considered by Keeling (1999). Starting from the evolution equation of the expected number of infectious nodes leads to
where \(C_{SI}\) is defined in Eq. (3.3). Taking into account that \(\tau n=\beta \) and that initially \([S]\simeq N\), Keeling (1999) claimed that \(R_{0}=C_{SI}\beta /\gamma \). It is important to note that this \(R_0\) is not the classical \(R_0\) in the sense of describing the expected number of new infections produced by a typical infectious individual when introduced into a fully susceptible population. Rather it can be thought of as a growth-rate-based threshold, and has the same properties as the classical \(R_0\) when both are exactly one. In what follows, we will simply refer to it as R (Eames 2008; Kiss et al. 2012).
In order to determine R explicitly, Keeling (1999) considered the early behaviour of \(C_{SI}\) and found that this variable is given by the ordinary differential equation (ODE)
However, the ODE above depends on the behaviour of \([I]C_{II}/N\) and Keeling (1999) showed that
Considering the quasi-equilibrium of \(C_{SI}\), referred to as \(C_{SI}^{*}\), in Eq. (4.6) together with the expression for \([I]C_{II}/N\) in Eq. (4.7), one finds that \(C_{SI}^{*}\) is given by
Hence, R can be calculated as \(C_{SI}^{*}\beta /\gamma \), at least numerically. Variables such as \(C_{SI}\) and \(C_{II}\) describe the correlations between the states of neighbouring nodes on the network as the epidemic unfolds and these have been studied numerically by Keeling (1999).
For model equations (4.1)–(4.5) and when there is no clustering in the network (\(\phi =0\)), a further simplification of Eq. (4.8) can be achieved (Keeling 1999). To determine \(R=C_{SI}^{*}\beta /\gamma \) in this case, one can simply solve
to find \(C_{SI}^{*}=\frac{n-2}{n}\) and thus \(R=\frac{(n-2)\tau }{\gamma }\).
Unfortunately when \(\phi \ne 0\), according to our knowledge, the quasi-equilibrium values can only be determined numerically via Eq. (4.8). In what follows, we show that by working with two new variables, \(\alpha =[SI]/[I]\) and \(\delta =[II]/[I]\), which are still closely linked to the correlations formed during the spreading process, it is possible to obtain the epidemic threshold as the solution of a cubic equation and, more importantly, we show that this solution can be approximated by an asymptotic expansion in powers of \(\phi \).
4.2 Epidemic threshold
Consider the initial phase of an infection invading an entirely susceptible population in the pairwise model, described by Eqs. (4.1)–(4.5). We find that
We know the quantity \(\gamma [I]\) remains non-negative regardless of time in the epidemic process, and we choose to consider the epidemic threshold in terms of \(\frac{[SI]}{[I]}\). This leads to \(R=\frac{\tau [SI]}{\gamma [I]}\). When \(R>1\) an epidemic will occur, and when \(R<1\) the epidemic will die out. Although we know the values of \(\tau \) and \(\gamma \), to determine if an epidemic will occur a priori, we require further knowledge about the quantity \(\frac{[SI]}{[I]}\) at some initial time close to \(t=0\). At \(t=0\) or at the disease-free steady state, both [SI] and [I] are zero and hence their ratio is ill-defined. Furthermore, gaining knowledge about \(\frac{[SI]}{[I]}\) will involve \(\frac{[II]}{[I]}\) and this term suffers from the same problem, being ill-defined at \(t=0\). While this is similar to the approach taken by Keeling (1999), we focus on the variables \(\frac{[SI]}{[I]}\) and \(\frac{[II]}{[I]}\), and we motivate our choice below. The problem of finding the epidemic threshold can be dealt with in at least two other different but equivalent ways. First, one can carry out a simple linear stability analysis of the disease-free steady state as is shown in Appendices B and C. Second, the threshold can also be computed as the largest eigenvalue of the next generation matrix, see Sect. 6. However, in both cases, the variables [SI] / [I] and [II] / [I] turn out to play a key role and their values for small times need to be determined.
4.3 Fast variables with the simple closure
To circumvent the problem of the ill-defined variables above, we exploit the fact that \(\alpha :=\frac{[SI]}{[I]}\) and \(\delta :=\frac{[II]}{[I]}\) are fast variables when compared to the time course of the epidemic. Figure 3 shows clearly that \(\alpha \) and \(\delta \) are fast compared to the epidemic process and that they quickly converge to a quasi-equilibrium. Hence, at early times, \(\alpha \) and \(\delta \) attain their quasi-equilibrium values, and these are the values that can be used to compute the epidemic threshold.
We continue by deriving differential equations for the variables \(\alpha =\frac{[SI]}{[I]}\) and \(\delta =\frac{[II]}{[I]}\). Differentiating \(\alpha \) and \(\delta \) and using Eqs. (4.1)–(4.5) leads to
We note that this approach has already been exploited by Juher et al. (2013), Llensa et al. (2014) and Britton et al. (2016), with the authors focusing on combinations of SIS, SIR and SEIR models without demography and rewiring of \(S-I\) links to \(S-S\) links. In all these papers, systems of fast variables are derived and analysed in detail to gain information about the epidemic threshold and the stability of the disease-free or endemic steady states.
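The fast convergence of \(\alpha \) and \(\delta \) can be illustrated with a short numerical integration. The right-hand side below is a reconstruction rather than a quotation of Eqs. (4.11)–(4.12): it is obtained by differentiating \(\alpha =[SI]/[I]\) and \(\delta =[II]/[I]\), applying the simple closure and assuming \([S]\simeq N\) and \([SS]\simeq nN\) at early times. It does reproduce the boundary properties stated in Sect. 4.3.1. The parameter values, step size and function names are hypothetical.

```python
def rhs(alpha, delta, tau, gamma, n, phi):
    """Assumed early-time equations for alpha = [SI]/[I] and delta = [II]/[I]."""
    xi = (n - 1) / n
    d_alpha = (tau * alpha * (xi * n * (1 - phi) - 1)
               - tau * (1 - xi * phi) * alpha ** 2
               - tau * xi * phi / n * alpha ** 2 * delta)
    d_delta = (2 * tau * alpha - gamma * delta - tau * alpha * delta
               + 2 * tau * xi * phi / n * alpha ** 2 * delta)
    return d_alpha, d_delta


def integrate(alpha, delta, params, dt, steps):
    """Classical fourth-order Runge-Kutta with a fixed step."""
    for _ in range(steps):
        k1 = rhs(alpha, delta, *params)
        k2 = rhs(alpha + 0.5 * dt * k1[0], delta + 0.5 * dt * k1[1], *params)
        k3 = rhs(alpha + 0.5 * dt * k2[0], delta + 0.5 * dt * k2[1], *params)
        k4 = rhs(alpha + dt * k3[0], delta + dt * k3[1], *params)
        alpha += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        delta += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return alpha, delta


tau, gamma, n, phi = 0.5, 1.0, 5, 0.2
params = (tau, gamma, n, phi)
# Start from a single infected node whose n neighbours are all susceptible.
alpha_star, delta_star = integrate(float(n), 0.0, params, dt=0.01, steps=5000)
R = tau * alpha_star / gamma  # growth-rate-based threshold quantity
residual = rhs(alpha_star, delta_star, *params)
```

After integration the residual of the right-hand side is close to zero, so the end point can be used as the quasi-equilibrium value in \(R=\frac{\tau \alpha }{\gamma }\).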
4.3.1 Global stability of the steady state
The analysis of system (4.11)–(4.12) is carried out in detail by Trapman (2007b) (see Appendix A of this paper). The only caveat there is that the global stability of the unique steady state was not confirmed, leaving the possibility of the existence of a limit cycle. Below, we sketch the main ideas of the proof and provide some extra results by using the Bendixson criterion.
The starting point is to show that system (4.11)–(4.12) admits a unique steady state which is biologically meaningful, i.e. \((\alpha ^{*},\delta ^{*}) \in D=\{(\alpha ,\delta ): 0 \le \alpha \le n, 0\le \delta \le n-\alpha \}\). First we show that the trajectories of the system remain in D for all appropriate initial conditions and all positive times. When \(\delta =0\), then \(d\delta /dt=2\tau \alpha >0\). When \(\alpha =0\), then \(d\alpha /dt=0\). However, by condition (4.15), \(\frac{d(d\alpha /dt)}{d\alpha }=\tau [(n-1)(1-\phi )-1]>0\). Finally, we need to show that if \(\alpha +\delta =n\) then \(d(\alpha +\delta )/dt<0\). By substituting \(\delta =n-\alpha \), and after some algebra we obtain that \(d(\alpha +\delta )/dt=\gamma (\alpha -n)-\tau (n-1)\phi \alpha (1-\alpha /n)^2<0\). The observations prove that D is invariant. A typical picture of the phase diagram is given in Fig. 4.
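The algebra behind the last boundary condition can be written out as follows. The form of the system used here is a reconstruction (simple closure with \([S]\simeq N\) and \([SS]\simeq nN\)) and is an assumption about Eqs. (4.11)–(4.12), although it reproduces the expressions quoted above for \(\delta =0\) and \(\alpha =0\).

```latex
% Assumed early-time system for alpha = [SI]/[I] and delta = [II]/[I]
\[
\begin{aligned}
\frac{d\alpha}{dt} &= \tau\alpha\,[\xi n(1-\phi)-1]-\tau(1-\xi\phi)\alpha^{2}
                      -\frac{\tau\xi\phi}{n}\alpha^{2}\delta ,\\
\frac{d\delta}{dt} &= 2\tau\alpha-\gamma\delta-\tau\alpha\delta
                      +\frac{2\tau\xi\phi}{n}\alpha^{2}\delta .
\end{aligned}
\]
% Add the two equations and use xi n = n - 1
\[
\frac{d(\alpha+\delta)}{dt}
 = \tau\alpha\,[(n-1)(1-\phi)+1]-\tau\alpha^{2}-\tau\alpha\delta
   +\tau\xi\phi\,\alpha^{2}\Bigl(1+\frac{\delta}{n}\Bigr)-\gamma\delta .
\]
% On the boundary delta = n - alpha: alpha^2 + alpha*delta = n*alpha
\[
\begin{aligned}
\frac{d(\alpha+\delta)}{dt}\Big|_{\delta=n-\alpha}
 &= -\tau(n-1)\phi\,\alpha
    +\tau(n-1)\phi\,\alpha\,\frac{\alpha}{n}\Bigl(2-\frac{\alpha}{n}\Bigr)
    +\gamma(\alpha-n)\\
 &= \gamma(\alpha-n)-\tau(n-1)\phi\,\alpha\Bigl(1-\frac{\alpha}{n}\Bigr)^{2}.
\end{aligned}
\]
```

Both terms are non-positive for \(0\le \alpha \le n\), and the sum is strictly negative for \(\alpha <n\), so trajectories cannot leave D through this edge.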
It turns out that both null clines can be written conveniently with \(\alpha \) being the independent and \(\delta \) being the dependent variable. The null clines corresponding to \(d\alpha /dt\) and \(d\delta /dt\) are given by
Several observations can be made. If \(\xi n (1-\phi )-1>0\), then \(\delta _1(\alpha )\) will be a decreasing function in \(\alpha \) and its intersection with the horizontal axis is at \(\alpha _1=\frac{\xi n(1-\phi )-1}{1-\xi \phi }\), which happens to be less than n. Furthermore, it is straightforward to show that \(d\delta _{2}(\alpha )/d\alpha >0\), which means that \(\delta _2(\alpha )\) is an increasing function in \(\alpha \). Given the behaviour of the null clines at \(\alpha =0\), it follows that their intersection gives rise to a unique steady state. Requiring that \(\xi n (1-\phi )-1>0\) is equivalent to
\[\phi <\frac{n-2}{n-1}. \qquad (4.15)\]
This is the same as found by Keeling (1999) in the limit of \(\beta =\tau n\) large and when assuming that at the threshold \(C_{SI}=\gamma /\beta \). This condition can also be derived directly from Eq. (4.21) by setting \(\alpha =\gamma /\tau \) (which corresponds to the threshold \(R=\frac{\tau \alpha }{\gamma }=1\)) and then taking the limit of large \(\tau \). In fact, when \(\phi > (n-2)/(n-1)\) the disease dies out. Hence, the two null clines define a unique point of intersection as long as the condition above, (4.15), is met. The same argument holds even if the singularity of the second null cline happens to be in (0, n). However, we must also exclude the possibility that the intersection point will lie outside D. For example if the \(\delta _2\) null cline lies to the left of \(\delta _1\) then the \(\delta _{2}\) null cline may cross the \(\alpha +\delta =n\) boundary at a smaller value of \(\alpha \) than the \(\delta _{1}\) null cline does. However, this cannot happen because, in such a case, D would not be invariant since the solutions would leave D across the region of this boundary limited by the two null clines, which contradicts the fact that \(d(\alpha +\delta )/dt<0\) on this boundary.
Provided that condition (4.15) holds, Fig. 4 shows that a unique steady state exists. Trapman (2007b) showed that this steady state is locally stable but global stability was not confirmed. Here, in addition to the results by Trapman (2007b) we show that under certain assumptions the existence of a limit cycle can be ruled out by applying the Bendixson criterion. This also ensures the global stability of the unique steady state. Dividing the equations by \(\alpha \), the divergence of the system takes the form:
We separated the above expression into the non-clustered and clustered parts of the network. It is obvious that when \(\phi =0\) then \(B(\alpha ,\delta )<0\) and thus no limit cycle can occur. Now setting \(B(\alpha ,\delta )=0\) and neglecting the \(-\gamma /\alpha \) term defines the following curve
This intersects the horizontal axis at \(\alpha _B=\frac{n}{\xi \phi }-n/2\). Given that the slope of \(\delta _{B}\) is positive, the divergence will remain negative in D as long as the intersection with the horizontal axis is beyond n. This requires that
Rearranging this, we obtain
Hence, if the above holds then the unique steady state is globally stable. It is worth noting that if
then the global stability also holds for all \(\phi <(n-2)/(n-1)\), and as long as \(n< 6\) the above inequality holds.
Numerical tests suggest that global stability holds for all reasonable parameter values. For example, if the point of intersection of \(\delta _{B}\) with the horizontal axis is in \((\alpha _{\beta },n)\), then the non-existence of the limit cycle can be shown as follows. To the left of \(\delta _{B}\) the divergence is negative and the lower right quadrant of D is repellent.
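The sufficient condition for global stability discussed above can be explored numerically. The following is a minimal sketch that takes only the quoted intercept \(\alpha _B=\frac{n}{\xi \phi }-n/2\) and \(\xi =\frac{(n-1)}{n}\) as given. Requiring \(\alpha _B>n\) then rearranges to \(\phi <2/(3\xi )\); this rearrangement is worked out here from the quoted intercept and should be checked against the displayed inequalities. The function names are illustrative and not taken from the paper.

```python
def alpha_B(n, phi):
    """Intercept of the zero-divergence curve with the alpha-axis (as quoted in the text)."""
    xi = (n - 1) / n
    return n / (xi * phi) - n / 2


def phi_bound(n):
    """Value of phi below which alpha_B > n, i.e. phi < 2 / (3 * xi)."""
    xi = (n - 1) / n
    return 2 / (3 * xi)


def bound_covers_existence_range(n):
    """True if every phi < (n-2)/(n-1) automatically satisfies phi < phi_bound(n)."""
    return (n - 2) / (n - 1) < phi_bound(n)


if __name__ == "__main__":
    for n in (3, 4, 5, 7, 8):
        print(n, round(phi_bound(n), 4), bound_covers_existence_range(n))
```

Under these assumptions the check returns True for n = 3, 4, 5 and False for n = 7, 8, in line with the remark that the inequality holds for \(n<6\).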
4.3.2 Fast variables without clustering
When clustering is negligible and hence \(\phi =0\), we find that
where \(\xi =\frac{(n-1)}{n}\). The steady states of the system (4.18)–(4.19) are given by \((\alpha _{1}^{*},\delta _{1}^{*})=(0,0)\) and \((\alpha _{2}^{*}, \delta _{2}^{*})= \left( (n-2),\frac{2\tau (n-2)}{\gamma +\tau (n-2)}\right) \). Based on Eq. (4.10), it follows that \(R=\frac{\tau \alpha _{2}^{*}}{\gamma }=\frac{\tau (n-2)}{\gamma }.\)
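As a quick numerical illustration of the unclustered result, the sketch below evaluates the non-trivial steady state \((\alpha _{2}^{*},\delta _{2}^{*})\) and the threshold \(R=\tau (n-2)/\gamma \) exactly as stated above. The function names are illustrative only.

```python
def unclustered_quasi_equilibrium(tau, gamma, n):
    """Non-trivial steady state (alpha*, delta*) of the fast variables for phi = 0."""
    alpha = n - 2
    delta = 2 * tau * (n - 2) / (gamma + tau * (n - 2))
    return alpha, delta


def unclustered_threshold(tau, gamma, n):
    """Growth-rate-based threshold R = tau * alpha* / gamma for phi = 0."""
    alpha, _ = unclustered_quasi_equilibrium(tau, gamma, n)
    return tau * alpha / gamma


if __name__ == "__main__":
    # Example values chosen so that the system sits exactly at the threshold.
    print(unclustered_quasi_equilibrium(0.5, 1.0, 4))
    print(unclustered_threshold(0.5, 1.0, 4))
```

With \(\tau =0.5\), \(\gamma =1\) and \(n=4\) the threshold evaluates to one, so larger \(\tau \) or n places the system above the threshold.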
4.3.3 Fast variables with clustering
When clustering is present in the network, the differential equations for \(\alpha \) and \(\delta \) are more complex and thus steady states are harder to compute. Firstly, we set Eq. (4.11) equal to zero and rearrange to isolate \(\delta \), finding
Plugging Eq. (4.20) into Eq. (4.12) leads to the following cubic equation in \(\alpha \):
The solution of the cubic equation (4.21) provides the steady state(s) of system (4.11)–(4.12), and allows the computation of the threshold via the formula \(R^{c}=\frac{\tau \alpha ^{*}}{\gamma }\). We note that the steady state in \(\alpha \) has to be biologically plausible: since \(\alpha =\frac{[SI]}{[I]}\), it must be positive and less than n, because the number of susceptible neighbours averaged over all infected nodes needs to be less than the average degree.
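The selection of the biologically plausible root can be sketched as follows. The coefficients of the cubic (4.21) are not reproduced here, so the function takes them as input; the example cubic in the usage section is purely hypothetical and chosen only because its roots are known. The root search (sign changes on a grid followed by bisection) is one simple choice among many and will miss roots of even multiplicity.

```python
def poly_eval(coeffs, x):
    """Horner evaluation; coeffs are ordered from the highest to the lowest power."""
    value = 0.0
    for c in coeffs:
        value = value * x + c
    return value


def feasible_roots(coeffs, n, samples=2000):
    """Real roots in the open interval (0, n), located by sign changes and bisection."""
    roots = []
    step = n / samples
    for i in range(samples):
        lo, hi = i * step, (i + 1) * step
        f_lo, f_hi = poly_eval(coeffs, lo), poly_eval(coeffs, hi)
        if f_lo == 0.0:
            if lo > 0.0:  # a root exactly on a grid point (alpha = 0 is excluded)
                roots.append(lo)
            continue
        if f_lo * f_hi < 0.0:
            a, b = lo, hi
            for _ in range(100):  # fixed number of halvings
                mid = 0.5 * (a + b)
                if poly_eval(coeffs, a) * poly_eval(coeffs, mid) <= 0.0:
                    b = mid
                else:
                    a = mid
            roots.append(0.5 * (a + b))
    return roots


def threshold_from_alpha(tau, gamma, alpha_star):
    """R^c = tau * alpha* / gamma."""
    return tau * alpha_star / gamma


if __name__ == "__main__":
    # Hypothetical cubic (alpha - 1)(alpha + 2)(alpha - 10), NOT Eq. (4.21).
    toy = [1.0, -9.0, -12.0, 20.0]
    for alpha_star in feasible_roots(toy, n=4):
        print(alpha_star, threshold_from_alpha(0.5, 1.0, alpha_star))
```

For the toy cubic with \(n=4\), only the root near \(\alpha =1\) lies in (0, n); the roots at \(-2\) and 10 are discarded.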
4.4 Asymptotic expansion of the epidemic threshold
The case of \(\phi \ne 0\) can be regarded as a perturbation of the case without clustering and we thus set out to find \(\alpha \) using a perturbation method. More precisely, we seek to find the roots of the cubic polynomial, given in Eq. (4.21), in terms of an asymptotic expansion in powers of \(\phi \), that is
\[\alpha =\alpha _{0}+\phi \alpha _{1}+\phi ^{2}\alpha _{2}+\cdots . \qquad (4.22)\]
Plugging (4.22) into Eq. (4.21) leads to
Collecting terms of order \(\phi ^{0}\) in (4.23) and after some algebra we find that \(\alpha _{0}\) satisfies:
Hence, \(\alpha _{0}=(n-2)\). The other solution, \(\alpha _0=-\gamma /\tau \) is not biologically feasible since by definition \(\alpha \) is positive. As expected, \(\alpha _{0}=(n-2)\) corresponds to the unclustered case. Collecting terms of order \(\phi \) in (4.23), we find a polynomial in terms of \(\alpha _{0}\) and \(\alpha _{1}\):
Equation (4.25) leads to
which, after substituting \(\alpha _{0}=(n-2)\) and \(\xi =\frac{(n-1)}{n}\) yields
To summarise, we have determined the first two coefficients \(\alpha _{0}\) and \(\alpha _{1}\) of the asymptotic expansion (4.22) which solves the cubic equation (4.21). Hence, the true solution is approximated by:
We make several remarks. First, the epidemic threshold will be given by \(R^{c}=\tau \alpha /\gamma \). Second, the coefficient of the first order correction of \(\alpha \) can be rearranged in terms of \(R=\frac{\tau (n-2)}{\gamma }\), the threshold for the case of unclustered networks, leading to
where \(a=2(n-1)/n\) and where terms in \(\phi \) of order larger than one have been neglected.
Finally, since the first-order correction is negative, we have that \(R^{c}\le R\).
The goodness of the estimate for \(\alpha \) (4.27) is tested by comparing it to the numerical solution of the cubic equation (4.21). This is done in Fig. 5 for five different values of the clustering coefficient. The asymptotic approximation performs well and only breaks down for values of clustering larger than \(\phi \simeq 0.3\). From the same figure it is clear that higher values of clustering push the critical \(R^{c}=1\) curve to higher values of \(\tau \) and n. Hence, in the presence of clustering a viable epidemic requires either a denser network or a higher transmission rate, noting that the transmission rate and the recovery rate \(\gamma \) are not strictly independent.
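The first-order step of the perturbation method can be illustrated on a toy problem. The sketch below does not use the cubic (4.21); it assumes a hypothetical polynomial \(P(\alpha ,\phi )=(\alpha -a_0)(\alpha +b)+\phi c\alpha \), chosen only so that its \(\phi =0\) roots mirror those found above (\(a_0\) playing the role of \(n-2\) and \(b\) that of \(\gamma /\tau \)). Collecting terms of order \(\phi \) gives the standard result \(\alpha _1=-\left. \frac{\partial P/\partial \phi }{\partial P/\partial \alpha }\right| _{(\alpha _0,0)}\), which here is \(-c a_0/(a_0+b)\).

```python
import math


def toy_exact_root(a0, b, c, phi):
    """Positive root of (alpha - a0)(alpha + b) + phi * c * alpha = 0."""
    B = b - a0 + phi * c
    return 0.5 * (-B + math.sqrt(B * B + 4.0 * a0 * b))


def toy_first_order(a0, b, c, phi):
    """Two-term expansion alpha0 + phi * alpha1 for the same toy polynomial."""
    alpha1 = -c * a0 / (a0 + b)  # -(dP/dphi) / (dP/dalpha) at (a0, 0)
    return a0 + phi * alpha1


if __name__ == "__main__":
    a0, b, c = 4.0, 2.0, 3.0  # hypothetical values
    for phi in (0.05, 0.1, 0.2):
        exact = toy_exact_root(a0, b, c, phi)
        approx = toy_first_order(a0, b, c, phi)
        print(phi, exact, approx, abs(exact - approx))
```

In this toy setting the error of the two-term expansion shrinks roughly by a factor of four when \(\phi \) is halved, as expected for a truncation at first order.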
4.5 Numerical examples
In the previous section we have demonstrated that for the pairwise model with the simplest closure for clustered networks, the determination of the epidemic threshold involves the solution of a cubic equation. While this solution can be obtained numerically, we presented an asymptotic approximation of the solution in terms of powers of the clustering coefficient \(\phi \). In Fig. 5 we present a systematic test of the newly determined threshold by comparing the threshold based on the numerical solution of the cubic equation (4.21) (continuous line in the (\(\tau ,n,0\)) plane), the asymptotic approximation of the solution to the cubic equation (4.27) (dashed line and markers—\(\circ \)) and the numerical solution of the full ODE system corresponding to the closed pairwise model (4.1)–(4.5).
The agreement between the explicit numerical solution of the closed pairwise system and the threshold based on the numerical solution of the cubic equation is excellent for all clustering values and other parameter combinations. Moreover, the agreement of these results with the threshold based on the asymptotic approximation is also excellent for \(0\le \phi \le 0.3\). The initial conditions for the closed pairwise systems were set in the following way: \([I](0)=I_0=1\), \([S](0)=N-I_0=S_0\), \([SI](0)=nI_0\frac{S_0}{N}\), \([SS](0)=nS_0\frac{S_0}{N}\) and \([II](0)=nI_0\frac{I_0}{N}\). The ODEs were run for a sufficiently long time (\(T_{max}=1000\)) to ensure that the epidemic died out. It is worth noting that the correct numerical solution of the cubic equation can be chosen by keeping in mind that \(0\le \alpha =\frac{[SI]}{[I]}\le n\).
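The initial conditions listed above translate directly into code. The sketch below only encodes those formulas; the function name and the dictionary layout are illustrative choices. A useful sanity check, which follows algebraically from the formulas, is that \([SS]+2[SI]+[II]=nN\) at time zero.

```python
def pairwise_initial_conditions(N, n, I0=1):
    """Initial singles and pairs for the closed pairwise model, as stated in the text."""
    S0 = N - I0
    return {
        "S": S0,
        "I": I0,
        "SI": n * I0 * S0 / N,
        "SS": n * S0 * S0 / N,
        "II": n * I0 * I0 / N,
    }


if __name__ == "__main__":
    ic = pairwise_initial_conditions(N=1000, n=5)
    print(ic)
    # Pairs are counted in both directions, so the total equals n * N.
    print(ic["SS"] + 2 * ic["SI"] + ic["II"])
```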
5 Results for the pairwise model with the compact improved closure
Starting from the improved closure (3.14) but in line with Proposition 2, we adapt the closure so that the term approximating the clustered part of the network does not involve any variables, whether singles or pairs, from the R class. This leads to the new closure
which we refer to as the compact improved closure. Plugging Eq. (5.1) into the exact system (2.2)–(2.6) leads to a self-consistent system that is written out in full in Appendix A.
In line with our procedure so far, we aim to find the epidemic threshold of this new pairwise system with the compact improved closure. It turns out that the approach used for the pairwise system with the simple closure is applicable to this case, and the steps and results are summarised below.
5.1 Fast variables with the compact improved closure
As we have shown before, finding the threshold relies on finding the quasi-equilibrium of \(\alpha =\frac{[SI]}{[I]}\). In Appendix A we show that this requires knowledge about the behaviour of the variable \(\delta =\frac{[II]}{[I]}\); indeed, a system of differential equations involving these two variables can be derived. This system is given below
As previously, the steady states of this system are of interest and apart from the trivial \((\alpha ^{*},\delta ^{*})=(0,0)\) steady state, the quasi-equilibrium can be found by first expressing \(\delta \) as a function of \(\alpha \). This can be done by setting Eq. (5.2) equal to zero and rearranging, leading to
Plugging Eq. (5.4) into Eq. (5.3) and collecting powers of \(\delta \) leads to the following cubic equation
where \(A=(n-2)-2\phi (n-1)\) and \(B=\gamma /\tau \). It is worth noting that in this case it is easier to work with \(\delta \), but any results can be converted in terms of \(\alpha \) which is the main variable of interest. However, before we proceed with the asymptotic expansion of the solution, we show that there is a unique biologically feasible steady state.
5.2 Global stability of the steady state
It is worth considering whether the trajectories of the system governed by Eqs. (5.2)–(5.3) remain in \(D=\{(\alpha ,\delta ): 0 \le \alpha \le n, 0\le \delta \le n-\alpha \}\) for all appropriate initial conditions and all positive times. When \(\alpha =0\), then \(d\alpha /dt=0\), so the \(\alpha =0\) line is stationary and solutions remain in D. Moreover, on this line, \(\frac{d(d\alpha /dt)}{d\alpha }=\frac{\tau n(n-2)}{n+\delta }+\frac{\delta \tau ((n-2)-2\phi (n-1))}{n+\delta }\) which is greater than zero when \(2\phi <(n-2)/(n-1)\). This is a condition which will resurface later when the intersection of the null clines is analysed. If \(\delta =0\), then \(d\delta /dt=2\tau \alpha >0\) meaning that the solution cannot leave D along the \(\delta =0\) line. Finally, we need to show that if \(\alpha +\delta =n\) then \(d(\alpha +\delta )/dt<0\). By substituting \(\delta =n-\alpha \), and after some algebra we obtain that \(d(\alpha +\delta )/dt=-\gamma (n-\alpha )=-\gamma \delta <0\). These findings prove that D is invariant.
To continue we focus on showing that (5.2)–(5.3) admits a unique steady state which is biologically meaningful, i.e. \((\alpha ^{*},\delta ^{*}) \in D\). The null cline corresponding to \(d\alpha /dt\) can be rewritten to give
It is straightforward to check that
meaning that the function is decreasing for all \(\alpha \). Setting \(\alpha =0\) in (5.6) leads to \(\delta =n(n-2)/(2\phi (n-1)-(n-2))\), which can be either negative or positive. On the other hand, setting \(\delta =0\) in (5.6) yields \(\alpha =(n-2)\). This null cline has a singularity at \(\alpha ^{*}=(n-2)-2\phi (n-1)\), with \(\alpha ^{*}<(n-2)<n\). If \(\alpha ^{*}<0\) then the branch on the left of the vertical asymptote will not intersect D. This happens exactly when \(2\phi > (n-2)/(n-1)\). In this case the branch of the null cline to the right of the asymptote intersects the \(\alpha \)-axis at \(((n-2),0)\) and the \(\delta \)-axis at \((0,n(n-2)/(2\phi (n-1)-(n-2)))\), where the intersection with the \(\delta \)-axis happens at a positive value, namely \(n(n-2)/(2\phi (n-1)-(n-2))>0\); this inequality holds because \(\alpha ^{*}\) is negative. The intercept with the \(\delta \)-axis may be greater than n, but the branch still reaches the horizontal axis at \((n-2,0)\). This is illustrated in Fig. 6 (left panel). When the singularity point is positive, \(\alpha ^{*}>0\), that is when \(2\phi < (n-2)/(n-1)\), then the intersection with the \(\delta \)-axis happens at a negative value of \(\delta \). This is also illustrated in Fig. 6 (right panel), where the positive singularity is clearly visible with the intersection with the \(\delta \)-axis being out of the range of the plot.
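The geometry of this null cline can be explored with a short sketch. Eq. (5.6) is not reproduced here, so the code assumes the simplest rational form consistent with the properties quoted above (the two axis intercepts, the singularity at \(\alpha ^{*}=(n-2)-2\phi (n-1)\) and monotonic decrease), namely \(\delta =n\left( (n-2)-\alpha \right) /(\alpha -\alpha ^{*})\). This form is an assumption made for illustration and should be checked against the paper's equation.

```python
def singularity(n, phi):
    """Location of the vertical asymptote, alpha* = (n-2) - 2*phi*(n-1)."""
    return (n - 2) - 2 * phi * (n - 1)


def alpha_nullcline_delta(alpha, n, phi):
    """delta on the d(alpha)/dt = 0 null cline (assumed form, see the lead-in)."""
    return n * ((n - 2) - alpha) / (alpha - singularity(n, phi))


if __name__ == "__main__":
    n, phi = 6, 0.5  # here 2*phi > (n-2)/(n-1), so alpha* < 0
    print(singularity(n, phi))
    for alpha in (0.0, 1.0, 2.0, 4.0):
        print(alpha, alpha_nullcline_delta(alpha, n, phi))
```

With these example values the \(\delta \)-axis intercept is 24, which exceeds n, while the curve still reaches the \(\alpha \)-axis at \(n-2=4\).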
The null cline corresponding to \(d\delta /dt\) is given by
This null cline passes through \((\alpha ,\delta )=(0,0)\) and the derivative of \(\alpha _n(\delta )\) is always positive, namely,
The denominator is a quadratic polynomial in \(\delta \) with the discriminant being always positive and thus leading to two distinct real roots. From the equation it follows that the sum of the roots is \((n-2)-2\phi (n-1)\) and their product is \(-2n<0\). Therefore, two singularity points exist, one for negative and the other for positive \(\delta \). \(\alpha _n(\delta )\) is such that
Hence, D happens to lie, at least partly, in the area defined by the two singularity points (i.e. the region between the two vertical asymptotes if considered in the \((\delta ,\alpha )\) plane). In this area the null cline increases with \(\delta \) starting from \((\alpha ,\delta )=(0,0)\), see both panels in Fig. 6. Summarising, we have shown that the null clines will intersect at a unique point, and this point cannot be outside D due to the orientation of the vector fields, see also the argument presented in Sect. 4.3.1.
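The two singularities of the \(\delta \) null cline can be computed from the stated sum and product of the roots alone. The sketch below assumes the monic quadratic \(\delta ^{2}-\left( (n-2)-2\phi (n-1)\right) \delta -2n\), which has exactly that sum and product; the actual denominator may differ from it by a constant factor, which does not change the roots.

```python
import math


def delta_singularities(n, phi):
    """Roots of delta^2 - A*delta - 2n = 0 with A = (n-2) - 2*phi*(n-1)."""
    A = (n - 2) - 2 * phi * (n - 1)
    disc = A * A + 8 * n  # always positive, so the roots are real and distinct
    root = math.sqrt(disc)
    return (A - root) / 2, (A + root) / 2


if __name__ == "__main__":
    negative, positive = delta_singularities(n=6, phi=0.2)
    print(negative, positive)
```

Because the product of the roots is negative, one singularity is always negative and the other positive, as stated above.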
Finally, we show that the existence of a limit cycle can be ruled out by applying the Bendixson criterion. This also ensures the global stability of the unique steady state. Dividing Eqs. (5.2)–(5.3) by \(\alpha \) and computing \(B(\alpha ,\delta )=\frac{d}{d\alpha }\left( \frac{1}{\alpha } \frac{d\alpha }{dt}\right) +\frac{d}{d\delta }\left( \frac{1}{\alpha } \frac{d\delta }{dt}\right) \), we find that the divergence of the rescaled system is
It is easy to show that this is negative. Even if \(-\frac{\gamma }{\alpha }\) is neglected, we have that
since \((n+\delta )\) is greater than both n and \((n-1)\).
5.3 Asymptotic expansion of the epidemic threshold
As in Sect. 4.4, we require the roots of the cubic polynomial given in Eq. (5.5). To do so, we express \(\delta \) as an asymptotic expansion in powers of \(\phi \). We substitute
\[\delta =\delta _{0}+\phi \delta _{1}+\phi ^{2}\delta _{2}+\cdots . \qquad (5.11)\]
Plugging the expansion for \(\delta \) (5.11) into Eq. (5.5) leads to
Alternatively, substituting (5.4) into the differential equation for \(\delta \) (5.3), setting the expression equal to zero and rearranging leads to
Substituting (5.11) into (5.13) and collecting terms of order \(\phi ^{0}\) yields
Following the same process to collect terms of order \(\phi ^{1}\), we find
which can be rearranged to yield
with \(\delta _{0}\) defined in (5.17). In summary, we have determined the first two coefficients \(\delta _{0}\) and \(\delta _{1}\) of the asymptotic expansion for \(\delta \) given in Eq. (5.11). Hence, the true solution is approximated by the following expression:
Finally, we are able to plug (5.20) into the quasi-equilibrium point for \(\alpha \), given in Eq. (5.4), to obtain
which, upon neglecting terms in \(\phi \) of order larger than one, can be rearranged to find
The expression for \(\alpha \) (5.22) can be used to determine the epidemic threshold as follows
It is straightforward to see that again \(R^{cci}\le R\), with clustering making the spread of the epidemic less likely.
5.4 Numerical examples
In Fig. 7 we repeat the systematic test of comparing the epidemic threshold generated via the numerical solution of the cubic equation (5.5), the epidemic threshold generated by the asymptotic expansion (5.23) and the numerical value of the final epidemic size predicted by the pairwise model with the compact improved closure, over a wide range of \((\tau ,n)\) values. Several observations can be made. First, it is clear that higher values of clustering push the location of the threshold to higher \(\tau \) and n values, meaning that the limiting effect of clustering on the epidemic spread can only be overcome if either the value of the transmission rate or average degree increases. Second, the agreement between the threshold based on the numerical solution of the cubic equation (5.5) and the asymptotic expansion (5.20) is excellent over a wide range of \(\phi \) values. In fact, in this case the agreement is excellent for \(0\le \phi \le 0.45\), with only small deviations even for \(\phi =0.6\). The agreement between the numerical solution of the pairwise model and the threshold based on the numerical solution of the cubic equation (5.5) remains excellent across all parameter values.
6 Comparing epidemic thresholds based on different models
Exploiting the presence of fast variables and combining this with elements of perturbation theory allowed us to compute the epidemic threshold for the pairwise model with two different closures that take clustering into account. Our results are in line with the findings by Li et al. (2018) and Miller (2009b). Li et al. (2018) calculated the epidemic threshold in a pairwise model for clustered networks with a closure based on the number of links in a motif, rather than nodes. This led to
Equation (6.1) can be expanded in terms of \(\phi \) to give
which again reflects our finding that clustering reduces the epidemic threshold.
Similarly but for clustered networks with heterogeneous degree distributions, Miller (2009b) found that
where \(\langle k^i \rangle \) stands for the ith moment of the degree distribution, T is the probability of infection spreading across a link connecting an infected to a susceptible node and \(\langle n_{\triangle }\rangle \) denotes the average number of triangles that a node belongs to. The expression above again shows that clustering reduces the epidemic threshold when compared to the unclustered case. Furthermore, if the network is regular and we assume that infections and recoveries are Markovian processes with rates \(\tau \) and \(\gamma \) respectively, giving \(T=\tau /(\tau +\gamma )\), \(R_0\) above reduces to
where we have used the fact that a global clustering coefficient of \(\phi \) translates to a node on average being part of \(\frac{1}{2}n(n-1)\phi \) uniquely counted triangles. This in turn coincides with Eq. (6.2). This is perhaps unexpected since the first expression was obtained based on a new type of closure for pairwise models while the other expression was based on percolation theory type arguments. Trapman (2007a) considered specific networks with household structure to investigate the effects of clustering and infectious period distribution on a modified version of \(R_0\) referred to as \(R_{*}\), and lower and upper bounds for the value of this quantity were found. Similarly Ball et al. (2010) considered a random network incorporating household structure and provided the basic reproduction number which takes into account within household and global contacts.
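The two conversions used in this reduction are simple enough to state in code. The sketch below encodes only what is given above: the Markovian transmissibility \(T=\tau /(\tau +\gamma )\) and the number of uniquely counted triangles per node, \(\frac{1}{2}n(n-1)\phi \), on a regular network. The function names are illustrative.

```python
def transmissibility(tau, gamma):
    """Probability T that infection crosses an I-S link before the I node recovers."""
    return tau / (tau + gamma)


def triangles_per_node(n, phi):
    """Average number of uniquely counted triangles a node of degree n belongs to."""
    return 0.5 * n * (n - 1) * phi


if __name__ == "__main__":
    print(transmissibility(1.0, 1.0))
    print(triangles_per_node(4, 0.5))
```

For \(\phi =1\) the second function returns \(n(n-1)/2\), the number of neighbour pairs of a node, which is the largest number of triangles a node of degree n can belong to.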
However, as elaborated upon in Sect. 4.1, the R threshold that we compute is a growth-rate-based threshold and whilst at the threshold \(R=1 \Longleftrightarrow R_0=1\), R does not have the same biological interpretation as \(R_0\). Despite this, our analysis confirms that clustering starves the spreading epidemic of susceptible neighbours such that the epidemic is less likely to spread if the networks are clustered, all other parameters being equal. More importantly, the epidemic threshold is model-dependent and the pairwise model with the compact improved closure leads more readily to epidemic outbreaks when compared to the pairwise model with the simple closure, see Figs. 5 and 7. While this ordering is true for the parameters used in this paper, we cannot conclude that this ordering holds true for all parameter values. Further research may focus on the ordering of these thresholds and gaining a better understanding of the impact of model choice on the values of the epidemic threshold.
The computation of the true \(R_0\) for pairwise models can be attempted by considering the next generation matrix approach (Van den Driessche and Watmough 2002). Looking at the pairwise model with the simplest closure and ordering the variables involved in the spreading process as: [I],[SI], the generation of new infectious cases at the disease-free steady state is given by
where the lower right term is obtained from Eq. (4.3) by looking at the rate of growth of [SI] in terms of [SI] itself and evaluating it at the disease-free equilibrium, that is
Now all other transfers between compartments are summarised in the V matrix, which is given below
where the lower right term describes the rate at which [SI] pairs are depleted. This is obtained from Eq. (4.3) as follows
where again all expressions were evaluated at the disease-free steady state. Now \(R_0\) is given by the leading eigenvalue of \(FV^{-1}\), which is
This is a rather complicated expression since the quasi-equilibrium values for \(\alpha \) and \(\delta \) are needed, and these are only available as asymptotic expansions in powers of \(\phi \). Nevertheless, for \(\phi =0\), \(R_0=\frac{\tau (n-1)}{\tau +\gamma }\), which agrees perfectly with the two results quoted above. Considering the \(\phi >0\) case, we write \(R_0=r_0+\phi r_1\), \(\alpha =\alpha _0+\phi \alpha _1\) and \(\delta =\delta _0+\phi \delta _1\). Plugging these into Eq. (6.7) leads to
While the first term in the expansion for \(R_0\) agrees with the results quoted above, the second term seems less likely to be equivalent to those shown above. This same approach can be used to compute \(R_0\) when the compact improved closure is used. We believe that comparing these different ways of computing the epidemic threshold can contribute to reconciling different methods and will lead to more clarity and transparency between various modelling approaches.
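The next generation computation can be sketched for the unclustered case only. The matrices below are an assumption made for illustration: they are a plausible linearisation for \(\phi =0\) with variables ordered as [I], [SI], not the paper's matrices, and they are chosen so that the leading eigenvalue reproduces the quoted \(R_0=\frac{\tau (n-1)}{\tau +\gamma }\). The general 2 by 2 routine is standard linear algebra.

```python
import math


def spectral_radius_2x2(K):
    """Largest eigenvalue modulus of a 2x2 matrix via trace and determinant."""
    tr = K[0][0] + K[1][1]
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    disc = tr * tr - 4.0 * det
    if disc >= 0.0:
        root = math.sqrt(disc)
        return max(abs(0.5 * (tr + root)), abs(0.5 * (tr - root)))
    return math.sqrt(det)  # complex-conjugate pair: modulus is sqrt(det)


def next_generation_r0(F, V):
    """Leading eigenvalue of F V^{-1} for 2x2 matrices F and V."""
    (a, b), (c, d) = V
    det_v = a * d - b * c
    V_inv = [[d / det_v, -b / det_v], [-c / det_v, a / det_v]]
    K = [[sum(F[i][k] * V_inv[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    return spectral_radius_2x2(K)


def unclustered_matrices(tau, gamma, n):
    """Assumed F and V for phi = 0 (illustrative, see the lead-in)."""
    F = [[0.0, tau], [0.0, tau * (n - 1)]]      # new infecteds and new [SI] pairs
    V = [[gamma, 0.0], [0.0, tau + gamma]]      # recovery and depletion of [SI]
    return F, V


if __name__ == "__main__":
    F, V = unclustered_matrices(tau=1.0, gamma=1.0, n=4)
    print(next_generation_r0(F, V))
```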
Finally, we report some results concerning networks composed of two layers, local within household and global contacts, where epidemic threshold-like quantities have been proposed (Ball et al. 2010). Taking the infection rates over global/network and local/household edges to be the same means that households in the model can be viewed as a device for introducing clustering into the network. This observation motivates our short analysis below. We consider the simple example of a network with all households of size three with additional global contacts assigned to nodes according to a configuration-like network with a regular degree, say \(\mu _D\). This is to keep in line with our assumption of regular random networks. Based on results by Ball et al. (2010), the clustering in such a network is
which can be inverted to give \(\mu _D\) in terms of clustering
Assuming that both infection and recovery are Markovian with rates \(\lambda _G\) (infection through global links), \(\lambda _L\) (infection within households) and \(\gamma \), and following the calculations by Ball et al. (2010) it is easy to show that the epidemic threshold is
where
Plugging in the expression for \(\mu _D\), as in Eq. (6.9), leads to
It is now obvious that \(R_{*}\) decreases as \(\phi \) increases, but to keep in the spirit of this section we expand the above in terms of \(\phi \). Given that around \(x=0\) the following expansion holds \(\sqrt{1+8/x}=\sqrt{\frac{1}{x}}\left( 2\sqrt{2}+\frac{1}{4\sqrt{2}}x-\cdots \right) \), we can rewrite \(R_{*}\) to give
Two important remarks can be made. First, even though \(R_{*}\) defines an epidemic threshold, it does not have the same interpretation as the basic reproduction number: it is the household reproduction number. However, it is a threshold parameter so it takes a value below/at/above its threshold value (\(=1\)) precisely when any other threshold parameter (such as \(R_0\)) is below/at/above its threshold value. Secondly, the dependency on \(\phi \) for the various epidemic thresholds differs. While for most thresholds considered here this dependency is via a negative term of \({\mathcal {O}}(\phi )\), the threshold from the household model decreases as \({\mathcal {O}}((\phi )^{-1/2})\) as \(\phi \) increases away from zero. This may indicate a clear difference in the underlying models but all models may be correct as long as their individual assumptions are met. Therefore, further exploration may focus on understanding which assumptions lead to this discrepancy and what the implications of the various modelling approaches are when applying such models in reality.
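The small-x expansion used above, \(\sqrt{1+8/x}=\sqrt{\frac{1}{x}}\left( 2\sqrt{2}+\frac{1}{4\sqrt{2}}x-\cdots \right) \), can be checked numerically. The sketch below compares the exact expression with the two-term truncation; the function names are illustrative.

```python
import math


def sqrt_exact(x):
    """Exact value of sqrt(1 + 8/x) for x > 0."""
    return math.sqrt(1.0 + 8.0 / x)


def sqrt_two_term(x):
    """Two-term truncation sqrt(1/x) * (2*sqrt(2) + x / (4*sqrt(2)))."""
    return math.sqrt(1.0 / x) * (2.0 * math.sqrt(2.0) + x / (4.0 * math.sqrt(2.0)))


if __name__ == "__main__":
    for x in (0.01, 0.1, 0.5):
        print(x, sqrt_exact(x), sqrt_two_term(x), abs(sqrt_exact(x) - sqrt_two_term(x)))
```

The leading \(x^{-1/2}\) factor is what produces the \({\mathcal {O}}(\phi ^{-1/2})\) behaviour of the household threshold noted in the second remark.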
7 Discussion
In this paper we derived an analytic epidemic threshold using pairwise models but for clustered networks. For the unclustered case this problem has been solved previously (Keeling 1999). Here, however, by exploiting the presence of fast variables and using elements of perturbation theory, we were able to find the epidemic threshold as an asymptotic expansion in powers of the clustering coefficient.
Our analysis confirms that clustering starves the spreading epidemic of susceptible nodes such that the epidemic is less likely to spread if the networks are clustered, all other parameters being equal. More importantly, the epidemic threshold is model-dependent and the pairwise model with the compact improved closure leads more readily to epidemic outbreaks when compared to the pairwise model with the simple closure, see Figs. 5 and 7. While this ordering is true for the parameters used in this paper, it is easy to show that this relation can change if parameters are tuned accordingly.
We carried out a full analysis of two systems of fast variables (one corresponding to the simplest closure with no clustering, the other corresponding to the compact improved closure for clustered networks). Both systems exhibit similar behaviours but, surprisingly, the more complicated one (that with the compact improved closure) yields results with virtually no constraints on the parameter values.
It is obvious that the complexity of the closure has a bearing on the complexity of the resulting model. As shown in the paper, using the compact improved closure leads to a more complex model whose analysis is slightly more complicated. After submitting the present paper and while waiting for the reviews, we analysed the system with the full improved closure (Kiss et al. 2018). However, our analysis only included the asymptotic expansion of the epidemic threshold without considering the detailed analysis of the system of fast variables (e.g. existence and uniqueness of a biologically feasible steady-state). This system is now four dimensional with not two but four fast variables (the extra variables being [SR] / [R] and [IR] / [I]). In doing so, we were able to confirm the effectiveness and generality of the approach presented in the paper.
It will also be worthwhile to compare different models in order to identify the impact of clustering on epidemics by mapping out regions in the parameter space where its effect is strongest. It is known that when the network is dense the effect of clustering is limited and the same holds when the transmission/recovery rates are high/low, respectively. Moreover, as we have shown in Sect. 6 there is scope for reconciling epidemic thresholds computed from different mean-field or stochastic models where the network is a key ingredient. More importantly, while there is some agreement between the different epidemic threshold expressions, especially in some limits or particular cases, it is clear that the epidemic threshold is model dependent. Hence, the biology of the disease and the contact pattern has to be carefully analysed and taken into account when choosing models that are to be used in relation to actual epidemics.
Of course there remains the issue of accounting for degree heterogeneity in the network and this has been addressed to some extent by using percolation type approaches. The approach that we presented in this paper may be extended to degree-heterogeneous clustered networks, but this will require more sophisticated models such as effective-degree, or compact/super-compact pairwise models (Simon and Kiss 2015). These will no doubt lead to more complex systems which are more challenging to analyse. The simplest starting point could be to consider a network with nodes having either degree \(k_1\) or \(k_2\). For ease of treatment, let \(N_i\) be the number of nodes with degree \(k_i\) with \(i \in \{1,2 \}\). Now one can assume that clustering in the network is introduced at random so it is going to be proportional to the degree and the mixing between the two groups of nodes. One can assume the simplest case of proportional mixing, where the number of links between nodes of degree \(k_i\) and \(k_j\), denoted \(n_{ij}\), is simply \(n_{ij}=\frac{k_i k_j N_i N_j}{\sum _l k_l N_l}\). Then, the closure could be considered as follows
where \(S_{i}\) denotes the class of susceptible nodes of degree \(k_i\). Now appropriately scaled closures for the triples are needed, which will depend on the degree of the nodes and how clustering is apportioned over nodes of different degrees. The viability of such a model will then rely on whether such closures are compact and compatible enough to derive a reasonably simple overall expression for [ASI], ideally one where the closure no longer depends on degree, but rather such information appears as some factor in the closure.
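The proportional-mixing link counts \(n_{ij}=\frac{k_i k_j N_i N_j}{\sum _l k_l N_l}\) introduced above are easy to tabulate. The sketch below implements only that formula for an arbitrary number of degree classes; the function name and the example values are hypothetical.

```python
def proportional_mixing_links(degrees, counts):
    """Matrix n_ij = k_i k_j N_i N_j / sum_l k_l N_l for the given degree classes."""
    total = sum(k * N for k, N in zip(degrees, counts))
    return [
        [ki * kj * Ni * Nj / total for kj, Nj in zip(degrees, counts)]
        for ki, Ni in zip(degrees, counts)
    ]


if __name__ == "__main__":
    # Two degree classes: 100 nodes of degree 2 and 50 nodes of degree 4.
    links = proportional_mixing_links([2, 4], [100, 50])
    for row in links:
        print(row, sum(row))
```

Each row sums to \(k_i N_i\), the total number of link ends attached to nodes of degree \(k_i\), which is a direct consequence of the formula.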
Finally, it would be worthwhile to test our findings against explicit stochastic network simulations. Since our focus was on exploiting the presence of fast variables and the use of perturbation analysis to determine the epidemic threshold analytically, such empirical validation was thought to be beyond the scope of the present work. We hope that the results of this paper may encourage other researchers to consider and tackle the challenges posed by modelling epidemic dynamics on clustered networks with heterogeneous degree distributions.
References
Ball F, Sirl D, Trapman P (2010) Analysis of a stochastic SIR epidemic on a random network incorporating household structure. Math Biosci 224(2):53–73
Britton T, Juher D, Saldaña J (2016) A network epidemic model with preventive rewiring: comparative analysis of the initial phase. Bull Math Biol 78(12):2427–2454
Cator E, Van Mieghem P (2014) Nodal infection in Markovian susceptible-infected-susceptible and susceptible-infected-removed epidemics on networks are non-negatively correlated. Phys Rev E 89(5):052802
Decreusefond L, Dhersin J-S, Moyal P, Tran VC et al (2012) Large graph limit for an SIR process in random network with heterogeneous connectivity. Ann Appl Probab 22(2):541–575
Eames KT (2008) Modelling disease spread through random and regular contacts in clustered populations. Theor Popul Biol 73(1):104–111
Eames KT, Keeling MJ (2002) Modeling dynamic and network heterogeneities in the spread of sexually transmitted diseases. Proc Natl Acad Sci 99(20):13330–13335
Gross T, D’Lima CJD, Blasius B (2006) Epidemic dynamics on an adaptive network. Phys Rev Lett 96(20):208701
Holme P (2017) Three faces of node importance in network epidemiology: exact results for small graphs. Phys Rev E 96(6):062305
House T, Keeling MJ (2010) The impact of contact tracing in clustered populations. PLoS Comput Biol 6(3):e1000721
House T, Davies G, Danon L, Keeling MJ (2009) A motif-based approach to network epidemics. Bull Math Biol 71(7):1693–1706
Janson S, Luczak M, Windridge P (2014) Law of large numbers for the SIR epidemic on a random graph with given degrees. Random Struct Algorithms 45(4):726–763
Juher D, Ripoll J, Saldaña J (2013) Outbreak analysis of an SIS epidemic model with rewiring. J Math Biol 67(2):411–432
Karrer B, Newman ME (2010a) Message passing approach for general epidemic models. Phys Rev E 82(1):016101
Karrer B, Newman ME (2010b) Random graphs containing arbitrary distributions of subgraphs. Phys Rev E 82(6):066118
Keeling MJ (1999) The effects of local spatial structure on epidemiological invasions. Proc R Soc Lond B Biol Sci 266(1421):859–867
Keeling M, Rand D, Morris A (1997) Correlation models for childhood epidemics. Proc R Soc Lond B Biol Sci 264(1385):1149–1156
Kiss IZ, Miller JC, Simon PL (2018) Fast variables determine the epidemic threshold in the pairwise model with an improved closure. In: International workshop on complex networks and their applications. Springer, pp 365–375
Kiss IZ, Berthouze L, Taylor TJ, Simon PL (2012) Modelling approaches for simple dynamic networks and applications to disease transmission models. Proc R Soc A 468(2141):1332–1355
Kiss IZ, Miller JC, Simon PL (2017) Mathematics of epidemics on networks. Springer, Berlin
Li J, Li W, Jin Z (2018) The epidemic model based on the approximation for third-order motifs on networks. Math Biosci 297:12–26
Lindquist J, Ma J, Van den Driessche P, Willeboordse FH (2011) Effective degree network disease models. J Math Biol 62(2):143–164
Llensa C, Juher D, Saldaña J (2014) On the early epidemic dynamics for pairwise models. J Theor Biol 352:71–81
Miller JC (2009a) Percolation and epidemics in random clustered networks. Phys Rev E 80(2):020901
Miller JC (2009b) Spread of infectious disease through clustered populations. J R Soc Interface 6:1121–1134
Miller JC, Kiss IZ (2014) Epidemic spread in networks: existing methods and current challenges. Math Model Nat Phenom 9(2):4–42
Miller JC, Volz EM (2013) Model hierarchies in edge-based compartmental modeling for infectious disease spread. J Math Biol 67(4):869–899
Miller JC, Slim AC, Volz EM (2012) Edge-based compartmental modelling for infectious disease spread. J R Soc Interface 9(70):890–906
Newman ME (2009) Random graphs with clustering. Phys Rev Lett 103(5):058701
Pastor-Satorras R, Vespignani A (2001) Epidemic dynamics and endemic states in complex networks. Phys Rev E 63(6):066117
Pastor-Satorras R, Castellano C, Van Mieghem P, Vespignani A (2015) Epidemic processes in complex networks. Rev Mod Phys 87(3):925
Rand D (1999) Correlation equations and pair approximations for spatial ecologies. In: McGlade J (ed) Advanced ecological theory: principles and applications, vol 100. Wiley, Hoboken
Rattana P, Blyuss KB, Eames KT, Kiss IZ (2013) A class of pairwise models for epidemic dynamics on weighted networks. Bull Math Biol 75(3):466–490
Ritchie M, Berthouze L, Kiss IZ (2016) Beyond clustering: mean-field dynamics on networks with arbitrary subgraph composition. J Math Biol 72(1–2):255–281
Sharkey KJ, Fernandez C, Morgan KL, Peeler E, Thrush M, Turnbull JF, Bowers RG (2006) Pair-level approximations to the spatio-temporal dynamics of epidemics on asymmetric contact networks. J Math Biol 53(1):61–85
Sherborne N, Miller JC, Blyuss KB, Kiss IZ (2018) Mean-field models for non-Markovian epidemics on networks. J Math Biol 76(3):755–778
Simon PL, Kiss IZ (2015) Super compact pairwise model for SIS epidemic on heterogeneous networks. J Complex Netw 4(2):187–200
Szabó-Solticzky A, Berthouze L, Kiss IZ, Simon PL (2016) Oscillating epidemics in a dynamic network model: stochastic and mean-field analysis. J Math Biol 72(5):1153–1176
Taylor M, Simon PL, Green DM, House T, Kiss IZ (2012) From Markovian to pairwise epidemic models and the performance of moment closure approximations. J Math Biol 64(6):1021–1042
Trapman P (2007a) On analytical approaches to epidemics on networks. Theor Popul Biol 71(2):160–173
Trapman P (2007b) Reproduction numbers for epidemics on networks using pair approximation. Math Biosci 210(2):464–489
Van den Driessche P, Watmough J (2002) Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission. Math Biosci 180(1–2):29–48
Volz EM, Miller JC, Galvani A, Meyers LA (2011) Effects of heterogeneous and clustered contact patterns on infectious disease dynamics. PLoS Comput Biol 7(6):e1002042
Acknowledgements
Rosanna C. Barnard acknowledges funding for her Ph.D. studies from the Engineering and Physical Sciences Research Council (EP/M506667/1). István Z. Kiss acknowledges support from the Leverhulme Trust Research Project Grant (RPG-2017-370). Péter L. Simon acknowledges support from Hungarian Scientific Research Fund, OTKA, (Grant No. 115926) and acknowledges that this work was completed within the ELTE Institutional Excellence Program (1783-3/2018/FEKUTSRAT) supported by the Hungarian Ministry of Human Capacities. István Z. Kiss wishes to thank Dr. David Sirl and Prof. Frank Ball for pointing out the potential link between the epidemic threshold computed by Ball et al. (2010) and that obtained in the current paper along with others from the appropriate literature.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Derivation of evolution equations for the fast variables with the compact improved closure
Using the improved closure (3.14) in line with Proposition 2, which we refer to as the reduced improved closure, we find that
Using Eq. (A.2) to close the original pairwise Eqs. (2.2)–(2.6), we obtain the following system of equations:
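As we have shown in the main body of the paper, the computation of the threshold requires a system of differential equations for the fast variables \(\alpha =[SI]/[I]\) and \(\delta =[II]/[I]\). Differentiating \(\alpha \) using the quotient rule, we find \(\frac{d\alpha }{dt}=\frac{[SI]^{\prime }[I]-[SI][I]^{\prime }}{[I]^{2}}\),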
and substituting \([SI]^{\prime }\) from Eq. (A.5) and \([I]^{\prime }\) from Eq. (A.4), we obtain
Replacing all \(\frac{[SI]}{[I]}\) terms by \(\alpha \) and all \(\frac{[II]}{[I]}\) terms by \(\delta \) gives
and evaluating \(\frac{d\alpha }{dt}\) at the disease-free steady state \(([S],[I],[SI],[SS],[II])=(N,0,0,nN,0)\) (B.1) gives
After simplification we find that
Differentiating \(\delta =\frac{[II]}{[I]}\) gives \(\frac{d\delta }{dt}=\frac{[II]^{\prime }[I]-[II][I]^{\prime }}{[I]^{2}}\),
and substituting \([II]^{\prime }\) from Eq. (A.7) and \([I]^{\prime }\) from Eq. (A.4), we obtain
Replacing all \(\frac{[SI]}{[I]}\) terms by \(\alpha \) and all \(\frac{[II]}{[I]}\) terms by \(\delta \) gives
and evaluating \(\frac{d\delta }{dt}\) at the disease-free steady state (B.1) gives
Combining the differential equations for both \(\alpha =\frac{[SI]}{[I]}\) and \(\delta =\frac{[II]}{[I]}\), we have
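The behaviour of the fast variables can be explored numerically. The Python sketch below is an illustration only and rests on stated assumptions: that at the disease-free steady state the closed triples reduce to \([SSI]\approx (n-1)[SI]\big ((1-\phi )+\phi \frac{n}{n+\delta }\big )\) and \([ISI]\approx (n-1)\phi [SI]\frac{\delta }{n+\delta }\), which is the form implied by the Jacobian entries listed for the compact improved closure, and that \([I]^{\prime }=\tau [SI]-\gamma [I]\). The function names, the Runge-Kutta integrator, the initial condition \((\alpha ,\delta )=(n,0)\) and the parameter values are hypothetical choices made for this example.

```python
import numpy as np

def fast_rhs(y, tau, gamma, n, phi):
    """Assumed right-hand side for (alpha, delta) at the disease-free state."""
    alpha, delta = y
    # d(alpha)/dt = [SI]'/[I] - alpha*[I]'/[I], with [I]'/[I] = tau*alpha - gamma
    d_alpha = tau * alpha * (
        (n - 1) * ((1 - phi) + phi * (n - delta) / (n + delta)) - 1 - alpha
    )
    # d(delta)/dt = [II]'/[I] - delta*[I]'/[I]
    d_delta = (
        2 * tau * alpha
        - gamma * delta
        - tau * alpha * delta
        + 2 * tau * (n - 1) * phi * alpha * delta / (n + delta)
    )
    return np.array([d_alpha, d_delta])

def quasi_equilibrium(tau, gamma, n, phi, t_end=100.0, dt=0.01):
    """Integrate the fast variables with classical RK4 until they settle."""
    y = np.array([float(n), 0.0])  # assumed start: alpha = n, delta = 0
    for _ in range(int(t_end / dt)):
        k1 = fast_rhs(y, tau, gamma, n, phi)
        k2 = fast_rhs(y + 0.5 * dt * k1, tau, gamma, n, phi)
        k3 = fast_rhs(y + 0.5 * dt * k2, tau, gamma, n, phi)
        k4 = fast_rhs(y + dt * k3, tau, gamma, n, phi)
        y = y + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

def threshold_ratio(tau, gamma, n, phi):
    """tau*alpha_star/gamma; the value 1 marks [I]'/[I] = tau*alpha - gamma = 0."""
    alpha_star, _ = quasi_equilibrium(tau, gamma, n, phi)
    return tau * alpha_star / gamma

if __name__ == "__main__":
    for phi in (0.0, 0.2, 0.4):
        print(phi, threshold_ratio(tau=0.5, gamma=1.0, n=5, phi=phi))
```

Under these assumptions, setting \(\phi =0\) gives \(\alpha \rightarrow n-2\), so the ratio reduces to \(\tau (n-2)/\gamma \), in agreement with the result for the unclustered case obtained by linear-stability analysis.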
Standard linear-stability analysis for the case of the simple closure
An alternative way to determine the epidemic threshold is to consider the stability of the disease-free steady state \(([S],[I],[SI],[SS],[II])=(N,0,0,nN,0)\) (B.1).
When the disease-free steady state is stable, the system returns to it following the introduction of a small amount of infection, and thus no epidemic occurs. When the disease-free steady state becomes unstable, there exists (at least) a second steady state, an epidemic occurs and [S] is no longer equal to N. To determine a stability condition for the disease-free steady state (B.1), we compute the Jacobian matrix J of the system (4.1)–(4.5), evaluated at the disease-free steady state, and find its eigenvalues.
By computing partial derivatives of each differential equation (4.1)–(4.5) with respect to each model variable [S], [I], [SI], [SS] and [II], and evaluating each expression at the disease-free steady state (B.1), we obtain
with \(\frac{\partial \dot{[SI]}}{\partial [I]}= \tau \xi \phi \left( \frac{2[SI]^{2}[II]}{n[I]^{3}}-\frac{[SI]^{2}}{[I]^{2}}\right) \), \(\frac{\partial \dot{[SI]}}{\partial [SI]}= -(\tau +\gamma )+\tau \xi (1-\phi )n+ 2\tau \xi \phi \left( \frac{[SI]}{[I]}-\frac{[SI][II]}{n[I]^{2}}\right) \), \(\frac{\partial \dot{[SI]}}{\partial [II]}=-\tau \xi \phi \frac{[SI]^{2}}{n[I]^{2}}\), \(\frac{\partial \dot{[SS]}}{\partial [I]}=2\tau \xi \phi \frac{[SI]^{2}}{[I]^{2}}\), \(\frac{\partial \dot{[SS]}}{\partial [SI]}=-2\tau \xi (1-\phi )n-4\tau \xi \phi \frac{[SI]}{[I]}\), \(\frac{\partial \dot{[II]}}{\partial [I]}=-4\tau \xi \phi \frac{[SI]^{2}[II]}{n[I]^{3}}\), \(\frac{\partial \dot{[II]}}{\partial [SI]}=2\tau +4\tau \xi \phi \frac{[SI][II]}{n[I]^{2}}\) and \(\frac{\partial \dot{[II]}}{\partial [II]}= -2\gamma +2\tau \xi \phi \frac{[SI]^{2}}{n[I]^{2}}\) all containing variables \(\frac{[SI]}{[I]}\) and \(\frac{[II]}{[I]}\).
The zero entries in \(J_{df}\) reflect the true values that the respective partial derivatives attain at the disease-free equilibrium. However, the majority of the non-zero matrix entries involve \(\frac{[SI]}{[I]}\) and \(\frac{[II]}{[I]}\). Since \([I]=[SI]=[II]=0\) at the disease-free steady state, both of these quantities are ill-defined. Hence, not all entries of the Jacobian can be evaluated at the equilibrium. This issue prevents the computation of the eigenvalues of \(J_{df}\) and thus the value of the epidemic threshold. In order to progress, we need to determine the correct values for \(\alpha =\frac{[SI]}{[I]}\) and \(\delta =\frac{[II]}{[I]}\). We note that the correct value of \(\alpha =\frac{[SI]}{[I]}\) is also required in Eq. (4.10), and the threshold cannot be computed without it.
In fact, using only \(\phi =0\), the Jacobian at the disease-free steady state (B.1) becomes
It is straightforward to show that the eigenvalues are given by \(\lambda _{1}=0\), \(\lambda _{2}=-\gamma \), \(\lambda _{3}=\tau (n-2)-\gamma \), \(\lambda _{4}=0\) and \(\lambda _{5}=-2\gamma \). The only eigenvalue that can be positive is \(\lambda _{3}=\tau (n-2)-\gamma \). Hence, the disease-free steady state (B.1) is stable when \(\lambda _{3}\le 0\) and becomes unstable when \(\lambda _{3}>0\). Thus, the epidemic threshold is given by \(\lambda _{3}=0\), which can be rearranged to give \(\tau (n-2)/\gamma =1\). This is equivalent to the calculation based on determining the quasi-equilibrium of the fast variables.
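The eigenvalues quoted above can be checked numerically. The following Python sketch is a hedged illustration, not the authors' code: it assumes the variable ordering \(([S],[I],[SI],[SS],[II])\), the unclosed equations \([S]^{\prime }=-\tau [SI]\) and \([I]^{\prime }=\tau [SI]-\gamma [I]\), and \(\xi =(n-1)/n\), which is the value consistent with \(\lambda _{3}=\tau (n-2)-\gamma \). The function names and parameter values are hypothetical.

```python
import numpy as np

def jacobian_dfe_phi0(tau, gamma, n):
    """Assumed Jacobian at the disease-free state for phi = 0.

    Rows and columns are ordered as ([S], [I], [SI], [SS], [II]).
    """
    return np.array([
        [0.0, 0.0, -tau, 0.0, 0.0],                     # [S]'  = -tau [SI]
        [0.0, -gamma, tau, 0.0, 0.0],                   # [I]'  = tau [SI] - gamma [I]
        [0.0, 0.0, tau * (n - 2) - gamma, 0.0, 0.0],    # -(tau+gamma) + tau*xi*n
        [0.0, 0.0, -2.0 * tau * (n - 1), 0.0, 0.0],     # -2*tau*xi*n
        [0.0, 0.0, 2.0 * tau, 0.0, -2.0 * gamma],       # 2*tau and -2*gamma
    ])

def threshold_ratio(tau, gamma, n):
    """tau*(n-2)/gamma; the disease-free state loses stability when this exceeds 1."""
    return tau * (n - 2) / gamma

if __name__ == "__main__":
    J = jacobian_dfe_phi0(tau=0.5, gamma=1.0, n=5)
    print(np.sort(np.linalg.eigvals(J).real))
    print(threshold_ratio(tau=0.5, gamma=1.0, n=5))
```

Because the columns associated with \([S]\) and \([SS]\) vanish, two eigenvalues are zero and the remaining three can be read off from the rows for \([I]\), \([SI]\) and \([II]\).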
Standard linear-stability analysis for the case of the compact improved closure
To determine an epidemic threshold, we consider conditions for stability of the disease-free steady state (B.1). To do so, we compute the Jacobian matrix evaluated at the disease-free steady state as
where \(\frac{\partial \dot{[SI]}}{\partial [I]}= 2\tau (n-1)\phi \alpha \delta \frac{n}{n^{2}+2n\delta +\delta ^{2}}\), \(\frac{\partial \dot{[SI]}}{\partial [SI]}= -(\tau +\gamma )+\tau (n-1)\Big ((1-\phi )+\phi \left( \frac{n-\delta }{n+\delta }\right) \Big )\), \(\frac{\partial \dot{[SI]}}{\partial [II]}= -2\tau (n-1)\left( \phi \alpha \frac{n}{n^{2}+2n\delta +\delta ^{2}}\right) \), \(\frac{\partial \dot{[SS]}}{\partial [I]}= -2\tau (n-1)\left( \phi \alpha \delta \frac{n}{n^{2}+2n\delta +\delta ^{2}}\right) \), \(\frac{\partial \dot{[SS]}}{\partial [SI]}= -2\tau (n-1)\left( (1-\phi )+\phi \frac{n}{n+\delta }\right) \), \(\frac{\partial \dot{[SS]}}{\partial [II]}= 2\tau (n-1)\left( \phi \alpha \frac{n}{n^{2}+2n\delta +\delta ^{2}}\right) \), \(\frac{\partial \dot{[II]}}{\partial [I]}= -2\tau (n-1)\left( \phi \alpha \delta \frac{n}{n^{2}+2n\delta +\delta ^{2}}\right) \), \(\frac{\partial \dot{[II]}}{\partial [SI]}= 2\tau +2\tau (n-1)\left( \phi \delta \frac{1}{n+\delta }\right) \) and \(\frac{\partial \dot{[II]}}{\partial [II]}=-2\gamma +2\tau (n-1)\left( \phi \alpha \frac{n}{n^{2}+2n\delta +\delta ^{2}}\right) \) cannot be fully evaluated as they contain products of the problematic variables \(\alpha =\frac{[SI]}{[I]}\) and \(\delta =\frac{[II]}{[I]}\).
The Jacobian above becomes useful once expressions for \(\alpha \) and \(\delta \) are available, whether analytical, in the form of an asymptotic expansion, or as numerical values. Substituting these into the Jacobian allows the threshold to be computed either numerically or analytically. We note that the linear-stability analysis and the approach based on the initial growth rate should lead to the same threshold value, as was already shown for the system with the simple closure in Sect. B.
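The substitution step can be sketched in Python as follows. This is an illustration under assumptions rather than a definitive implementation: the nine non-trivial entries are transcribed from the expressions listed above, while the rows for \([S]\) and \([I]\) (taken from \([S]^{\prime }=-\tau [SI]\) and \([I]^{\prime }=\tau [SI]-\gamma [I]\)) and the zero columns for \([S]\) and \([SS]\) are assumed, since the matrix itself is not reproduced here. The function names and the sample values of \(\alpha \) and \(\delta \) are hypothetical.

```python
import numpy as np

def jacobian_compact_improved(tau, gamma, n, phi, alpha, delta):
    """Jacobian at the disease-free state for given alpha and delta.

    Rows and columns are ordered as ([S], [I], [SI], [SS], [II]).
    """
    q = n / (n + delta) ** 2   # n / (n^2 + 2*n*delta + delta^2)
    c = tau * (n - 1)
    dSI_dI = 2 * c * phi * alpha * delta * q
    dSI_dSI = -(tau + gamma) + c * ((1 - phi) + phi * (n - delta) / (n + delta))
    dSI_dII = -2 * c * phi * alpha * q
    dSS_dI = -2 * c * phi * alpha * delta * q
    dSS_dSI = -2 * c * ((1 - phi) + phi * n / (n + delta))
    dSS_dII = 2 * c * phi * alpha * q
    dII_dI = -2 * c * phi * alpha * delta * q
    dII_dSI = 2 * tau + 2 * c * phi * delta / (n + delta)
    dII_dII = -2 * gamma + 2 * c * phi * alpha * q
    return np.array([
        [0.0, 0.0, -tau, 0.0, 0.0],          # assumed row for [S]
        [0.0, -gamma, tau, 0.0, 0.0],        # assumed row for [I]
        [0.0, dSI_dI, dSI_dSI, 0.0, dSI_dII],
        [0.0, dSS_dI, dSS_dSI, 0.0, dSS_dII],
        [0.0, dII_dI, dII_dSI, 0.0, dII_dII],
    ])

def leading_eigenvalue(J):
    """Largest real part among the eigenvalues of J."""
    return np.max(np.linalg.eigvals(J).real)

if __name__ == "__main__":
    # alpha and delta below are placeholder values, not results from the paper.
    J = jacobian_compact_improved(tau=0.5, gamma=1.0, n=5, phi=0.2,
                                  alpha=2.5, delta=1.3)
    print(leading_eigenvalue(J))
```

With \(\phi =0\) the entries no longer depend on \(\alpha \) or \(\delta \), and the leading eigenvalue reduces to \(\tau (n-2)-\gamma \), which provides a simple consistency check on the transcription.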
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Barnard, R.C., Berthouze, L., Simon, P.L. et al. Epidemic threshold in pairwise models for clustered networks: closures and fast correlations. J. Math. Biol. 79, 823–860 (2019). https://doi.org/10.1007/s00285-019-01380-1
DOI: https://doi.org/10.1007/s00285-019-01380-1