1 Introduction

Paleoenvironmental and paleoclimate research relies heavily on analysing proxy data to reconstruct past environmental conditions (Juggins and Birks 2012). Proxy indicators are physical, chemical, and biological evidence preserved in naturally occurring geological or biological archives, with characteristics that can be tied to climatic or environmental conditions or phenomena (e.g. Mann et al. 1998; Hegerl et al. 2006). Sources for these data include, but are not limited to, ice, coral, and sediment cores, as well as tree rings, from which measurements such as ring width and density, variations in isotopic composition, or changes in pollen abundance can be analysed and subsequently used as proxies for environmental phenomena (Geological Survey 2023). Examples of the relationships between proxies and environmental parameters include the calibration of tree ring growth against historical patterns of temperature and precipitation (Esper et al. 2002), and the study of variations in \(\delta ^{18}\)O ratios in ice cores (Jouzel et al. 2007) to reconstruct climate conditions on timescales from decades to millions of years. The foundation for such studies lies in the well-understood relationship between proxies and environmental phenomena. Notably, the underlying assumption is that environmental conditions affect the proxy at the surface, and this proxy becomes encapsulated through the natural accumulation of material, allowing researchers to infer past environmental conditions by observing the preserved proxies.

While scientists rely on proxy data to study how past environmental conditions have evolved, the relationship between the data and the environmental condition of interest is complex. For example, each proxy may respond to environmental conditions with a different level of sensitivity, may record local rather than regional signals, and often responds to the cumulative influence of multiple environmental factors (Aykroyd et al. 2001; Battarbee 2000; Power 1993). Consequently, these data are subject to multiple sources of uncertainty. It is therefore critical to quantify this uncertainty (including uncertainty in the foundational data, proxy calibration, and interpretation) to determine the confidence that can be placed in environmental reconstructions derived from these proxy relationships.

One source of uncertainty common to these proxies—excluding tree rings—is that related to formation age. The age at which proxies were deposited in the sediments is unknown and can only be inferred through dating techniques based on radioactive isotopes (e.g. radiocarbon, \(^{210}\)Pb) or on time markers such as tephra layers, which can be attributed to specific volcanic events (Lowe 2011; Reimer et al. 2013; Stuiver and Polach 1977). These techniques provide a means to determine the age of sediment core sections, enabling the reconstruction of proxy records. However, their application is often limited by the scarcity of suitable materials or the high costs involved in the analysis (Appleby 2008). To obtain age estimates for every layer where a proxy was measured, researchers employ age–depth models. The accuracy and precision of these models depend on the sampling density, but the models themselves introduce an element of uncertainty into all environmental studies involving sediment cores (Blaauw and Christen 2011; Blaauw et al. 2018).

Improvements in age–depth modelling have demonstrated clear and objective benefits compared to traditional techniques (Blaauw et al. 2018). However, these advancements have largely remained within the age–depth modelling community and have not yet been fully integrated into proxy analysis. Despite the ability of Bayesian age–depth models to provide comprehensive uncertainty quantification, the full posterior distribution is often discarded in favour of point estimates, such as the mean or median age, when associating ages with proxy values. Such practices potentially omit crucial information and may lead to inaccurate conclusions (see Asena 2021), especially in cases where the age–depth model exhibits significant uncertainty. Recently, there has been a movement towards integrating age uncertainty into proxy analysis; the tool introduced by McKay et al. (2021) provides a way to employ standard statistical techniques without overlooking the uncertainty inherent in age–depth models. The methodology leverages the Markov chain Monte Carlo (MCMC) samples generated by Bayesian age–depth models, offering a direct way to circumvent potential biases or errors associated with assigning a single age to a specific proxy measurement. In this paper, we present a mathematical justification for using this tool. Furthermore, we introduce a potential new approach to utilising the resulting MCMC sample of proxy values at a given age, which will enable modellers to create environmental reconstructions using these tools without being constrained to standard distributions.

2 Role of Age–Depth Models in Paleoecology

Reconstructing past environmental conditions relies heavily on analysing proxy data obtained from sediment cores in marine and terrestrial environments (Bradley 2015). A critical source of uncertainty in such paleoenvironmental studies arises from the chronological framework used to estimate the ages of proxy measurements at discrete depths along the sediment core. This chronological framework is established through age–depth models, which aim to relate the depth of a sediment layer to its corresponding age.

Age–depth model uncertainties can stem from various factors, including the precision of radiometric dating techniques, changes in sedimentation rates over time, and the number of available age determinations relative to the modelled period (Anderson et al. 2022; Blaauw et al. 2018; Franke and Donner 2019). These uncertainties can diminish the temporal precision with which proxy data can be calibrated against historical observations, ultimately affecting the accuracy of any derived paleoenvironmental reconstructions.

The age–depth relationship is a fundamental challenge in paleoenvironmental studies, where researchers aim to determine the age of sediment layers within a core, each corresponding to a different point in time. This relationship is typically modelled as a monotonically increasing function, \(G(d; \Theta )\), where d represents the depth in the sediment core and \(\Theta \) is a vector of parameters characterising the function. The monotonic nature of G ensures that deeper core sections correspond to earlier times, aligning with the depositional sequence of the layers. To address this challenge, robust statistical methods are required to estimate sample ages and their associated uncertainties. Bayesian age–depth modelling, which employs a probabilistic framework, has proved particularly powerful in this regard (e.g. Bacon, Blaauw and Christen 2011; Plum, Aquino-López et al. 2018; Bchron, Haslett and Parnell 2008). The Bayesian approach allows researchers to specify prior distributions for unknown parameters, such as sedimentation rates, and then update these priors based on experimental data, resulting in posterior distributions that quantify the uncertainties in the age estimates. The parameter vector \(\Theta \) captures the dynamical aspects of sedimentation, including sedimentation rates, compaction factors, and potential episodic events that could disrupt steady-state deposition (e.g. turbidites, slumps).

The mathematical justification for using the function G stems from the law of superposition, which states that in any undisturbed sequence of sediments, the oldest layer is at the bottom, and layers become progressively younger towards the top. The function G captures this principle within a mathematical construct, allowing for quantification and modelling.

Bayesian statistical methods are often employed to estimate the parameter vector \(\Theta \) due to their ability to incorporate prior knowledge and quantify uncertainty (Blaauw et al. 2018). Through Markov chain Monte Carlo (MCMC) simulations, posterior distributions for \(\Theta \) can be obtained, providing information about the most probable parameter values given the data and prior beliefs.
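To make this construction concrete, the following is a minimal Python sketch of how a single piecewise-linear age–depth instance of the Bacon/Plum type can be evaluated. The function name, parameter values, and the fixed section width `delta` are illustrative assumptions; a posterior ensemble is obtained by repeating the evaluation over the MCMC draws of the parameters.

```python
import numpy as np

def age_at_depth(d, theta, x, delta):
    """Evaluate G(d; theta, x): theta is the surface age and x holds the
    accumulation rates (yr/cm) of K equal-width sections of width delta (cm)."""
    edges = delta * np.arange(len(x) + 1)                      # section boundaries
    ages = theta + np.concatenate(([0.0], np.cumsum(delta * x)))  # ages at boundaries
    return np.interp(d, edges, ages)           # piecewise linear, monotone since x > 0

# One illustrative instance: K = 5 sections of 10 cm each
theta, x = -64.0, np.array([0.8, 1.1, 0.9, 1.5, 1.2])          # surface age, rates
print(age_at_depth(np.array([0.0, 7.5, 25.0, 50.0]), theta, x, delta=10.0))
```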

By mathematically modelling the age–depth function \(G(d; \Theta )\) and its parameters \(\Theta \), researchers can capture the complex interplay between depth and time in sediment cores, providing a robust foundation for subsequent environmental reconstructions and interpretations. This approach quantifies the direct relationship between depth and age and allows for the inference of secondary environmental variables of interest with quantifiable levels of confidence, which are important for environmental studies and can sometimes serve as indicators of environmental change.

Figure 1 illustrates the resulting age–depth model for a lake sediment core (Santa María del Oro) using the Bayesian age–depth model Plum. The lake’s age model was based on \(^{210}\)Pb and \(^{137}\)Cs isotopes, commonly used dating techniques for recent sediments. Although these dating techniques are not the focus of this research, details about them can be found in Aquino-López et al. (2018) and Sanchez-Cabeza and Ruiz-Fernández (2012).

Fig. 1

Age–depth model for the SAMO141 sediment core. The top left panel shows the log-objective trace of the MCMC process for assessing convergence. The main panel depicts the age–depth relationship derived from the MCMC samples (grey area and dotted lines) along with raw \(^{210}\)Pb measurements (blue boxes) and the 1963 age inferred from caesium data (green globe). The top middle panels show the prior (green) and posterior (grey) distributions for the accumulation rate and memory. The top right panel shows the prior/posterior for the supported \(^{210}\)Pb concentration. The second panel from the left shows the prior/posterior for the \(^{210}\)Pb influx to the site

3 Mathematical Justification

To fully leverage the uncertainty quantification from Bayesian age–depth modelling, it is necessary to propagate the modelled age uncertainties through to the proxy values, resulting in proxy value distributions associated with each given age.

In this justification, we adopt the notation and output format of Blaauw and Christen (2011), although it is crucial to recognise that the underlying concepts are broadly applicable to any Bayesian age–depth modelling approach. The age–depth model can be represented by two sets of parameters: the initial date (the surface date of the core) and the accumulation rates (slopes) of the K sections over which a piecewise linear model is fitted. The output from these age–depth models is a Monte Carlo ensemble, represented as \(G(d; \theta ^{(t)}, x^{(t)})\) for \(t = 1, 2, \ldots , T\), comprising T plausible age–depth model instances.

Additionally, we incorporate data on proxy k extracted from the core, specifically through measurements denoted as \(p_k(d)\)—the function representing proxy k across various depths \(d = d_1, d_2, \ldots , d_m\). This data set provides insight into the variations of proxy k at the specified depths.
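Since \(p_k\) is only measured at the discrete depths \(d_1, \ldots , d_m\), evaluating it at an arbitrary depth requires an interpolation rule. The sketch below uses linear interpolation, which is our assumption here; nearest-measurement lookup would be an equally valid choice, and the measurement values shown are illustrative.

```python
import numpy as np

proxy_depths = np.array([1.0, 3.0, 5.5, 8.0])     # d_1, ..., d_m (cm), illustrative
proxy_values = np.array([12.4, 15.1, 9.8, 11.3])  # measured p_k(d_i), illustrative

def p_k(d):
    """Proxy k at depth d, linearly interpolated between measured depths."""
    return np.interp(d, proxy_depths, proxy_values)
```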

Theorem 3.1

Let \(G(d; \Theta , X)\) be a Bayesian age–depth model, where \(\Theta \) represents the surface date, X represents the accumulation rates, and d represents the depth. Let \(p_k(d)\) be the proxy measurement function for proxy k at depth d. Then, the posterior distribution of the proxy values \(Q_k(g)\) for any calendar age g can be obtained as

$$\begin{aligned} Q_k(g) = p_k(G^{-1}(g; \Theta , X)), \end{aligned}$$
(1)

where \(G^{-1}(g; \Theta , X)\) is the inverse of the age–depth model, mapping a calendar age g to the corresponding depth.

Proof

The age–depth model \(G(d; \Theta , X)\) is a random variable for any depth d since \(\Theta \) and X are random variables with a joint posterior distribution obtained from Bayesian inference using the available data. \(Q_k(g)\) is well defined as a random variable and may be viewed as the predictive random proxy value at the calendar age g (see Fig. 2). \(\square \)

This theorem shows how the uncertainty in the age–depth model, represented by the joint posterior distribution of \((\Theta , X)\), is propagated to the proxy values through the inverse age–depth function \(G^{-1}(g; \Theta , X)\), resulting in a posterior distribution of proxy values \(Q_k(g)\) for any calendar age g. Moreover, envisaging an MC sampling algorithm to produce samples from \(Q_k(g)\) is straightforward.

Let \((\theta ^{(t)}, x^{(t)})\) be a sample from the joint posterior distribution of \((\Theta , X)\), where \(t = 1, 2, \ldots , T\) represents an MCMC iteration. For a given calendar age g, we want to find the corresponding proxy value \(Q_k(g)\). Since the age–depth model is a random function, the depth d corresponding to the age g is also a random variable. We can express the relationship as

$$\begin{aligned} d = G^{-1}(g; \Theta , X). \end{aligned}$$
(2)

Substituting the MCMC sample \((\theta ^{(t)}, x^{(t)})\), we get

$$\begin{aligned} d^{(t)} = G^{-1}(g; \theta ^{(t)}, x^{(t)}). \end{aligned}$$
(3)

Now, we can evaluate the proxy function \(p_k(d)\) at the depth \(d^{(t)}\) to obtain the corresponding proxy value

$$\begin{aligned} q_k^{(t)}(g) = p_k(d^{(t)}) = p_k(G^{-1}(g; \theta ^{(t)}, x^{(t)})). \end{aligned}$$
(4)

Note that \((\theta ^{(t)}, x^{(t)})\) is a sample from the joint posterior distribution of \((\Theta , X)\); therefore, the set of values \(\{q_k^{(t)}(g)\}_{t=1}^T\) represents a Monte Carlo sample from the posterior distribution of \(Q_k(g)\).

Algorithm 1 outlines this procedure for producing samples from \(Q_k\), using the MC sample obtained from the age–depth model. Uncertainty in the proxy measurement can also be incorporated: assuming, for example, that the measurement is an observation from a normal distribution whose standard deviation is the reported measurement error, sub-sampling from \(\mathcal {N}(p_k(d),\sigma _{k,d})\) allows for the incorporation of uncertainty in \(p_k\). An illustration of the algorithm is shown in Fig. 2.

Algorithm 1

Sampling proxy values from an age–depth model
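A minimal Python sketch of Algorithm 1 is given below. It assumes the age–depth output is available as an array `ages_ensemble` of shape (T, len(`d_grid`)) holding \(G(d; \theta ^{(t)}, x^{(t)})\) on a depth grid, together with a proxy function `p_k` such as the interpolation sketch above; the optional `sigma_k` argument implements the normal sub-sampling step for measurement error. All names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_proxy_at_age(g, d_grid, ages_ensemble, p_k, sigma_k=None):
    """Monte Carlo sample {q_k^(t)(g)} from Q_k(g), following Eqs. (3)-(4)."""
    q = np.empty(len(ages_ensemble))
    for t, ages_t in enumerate(ages_ensemble):
        # G is monotone in depth, so its inverse is interpolation with axes swapped
        d_t = np.interp(g, ages_t, d_grid)        # d^(t) = G^{-1}(g; theta^(t), x^(t))
        q[t] = p_k(d_t)                           # q_k^(t)(g) = p_k(d^(t))
        if sigma_k is not None:                   # optional measurement error:
            q[t] = rng.normal(q[t], sigma_k(d_t)) # sub-sample from N(p_k(d), sigma_k,d)
    return q
```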

Fig. 2

Depiction of Algorithm 1. The bottom left panel shows the inferred chronology (age–depth model) using Plum (Aquino-López et al. 2018), with blue and green lines indicating two discrete iterations of the MCMC. Horizontal arrows depict the specific depth–age relationship and the flow of movement in the algorithm. The bottom right panel displays the identity map from depth to depth, with blue and green lines helping the reader follow the depths the algorithm is using. The top right panel shows the proxy measurements on the depth scale, with green and blue lines assisting the reader in relating these depths back to the age–depth model. Finally, the top left panel presents the output, with green and blue lines showing the corresponding proxy value in the output given the specific age–depth iteration and the proxy value at that depth

It is important to note that \(q_k^{(t)}(g_i)\) is a (posterior) Monte Carlo sample of proxy \(k\), from which the posterior distribution of \(Q_k(g_i)\) may be approximated. This is achieved through a kernel density estimate or histogram of \(q_k^{(t)}(g_i)\), where kernel density estimation and histograms, as nonparametric methods, play a pivotal role. These methods can estimate the probability density function (PDF) of the underlying distribution from a sample. In this case, they are well suited for inferring the PDF from an MCMC sample, which would rarely follow a known parametric distribution (Rosenblatt 1956; Scott 1979; Gelman et al. 2013). Both methods infer the distribution’s shape solely from the sample without assuming a parametric form, making them appropriate techniques for estimating the PDF of an unknown distribution such as that of an MCMC sample.
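For instance, with SciPy's Gaussian kernel density estimator (the sample below is a synthetic stand-in for \(q_k^{(t)}(g)\)):

```python
import numpy as np
from scipy.stats import gaussian_kde

q_samples = np.random.default_rng(0).normal(18.0, 2.5, size=3000)  # stand-in sample
kde = gaussian_kde(q_samples)                  # nonparametric density estimate
grid = np.linspace(q_samples.min(), q_samples.max(), 200)
pdf = kde(grid)                                # approximate PDF of Q_k(g)
```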

Furthermore, an evolution of the posterior distribution of the proxy over the timescale can also be plotted. The key difference between this and other approaches is that the evolution of the proxy now captures the uncertainty of both the proxies themselves and the age–depth model. Moreover, the probable proxy values are provided on the calendar scale, \(Q_k(g)\) for \(g\) a calendar year, not merely as a function of depth as initially measured. This absolute scale allows for more informative analyses and comparisons with other cores, climatic events, etc.
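A sketch of such an evolution plot, summarising the per-age Monte Carlo samples by posterior quantiles (synthetic stand-in data; matplotlib assumed):

```python
import numpy as np
import matplotlib.pyplot as plt

ages = np.linspace(1900, 2010, 50)
# samples_by_age[i] is a stand-in for the sample {q_k^(t)(g_i)} at age g_i
samples_by_age = np.random.default_rng(1).normal(
    loc=15 + 0.05 * (ages - 1900)[:, None], scale=2.0, size=(ages.size, 3000))
lo, mid, hi = np.quantile(samples_by_age, [0.025, 0.5, 0.975], axis=1)
plt.fill_between(ages, lo, hi, alpha=0.3, label="95% interval")
plt.plot(ages, mid, label="posterior median")
plt.xlabel("calendar year"); plt.ylabel("proxy value"); plt.legend(); plt.show()
```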

The following provides a general evaluation of this approach (see McKay et al. 2021), using a data set from a single lake sediment core dated with \(^{210}\)Pb and \(^{137}\)Cs. The sediment core is SAMO141 (Santa María del Oro Lake), acquired from a crater lake in north-west Mexico (Ruiz-Fernández et al. 2022). This core was previously used to assess recent anthropogenic activities promoting increased trace element inputs in the region. As the original implementation by McKay et al. (2021) does not allow for \(^{210}\)Pb dating, we developed our own version in Python (Python Software Foundation 2021) to incorporate \(^{210}\)Pb. This is crucial for reconstructing high-resolution environmental records in the most recent period, where direct measurements are available for comparison.

3.1 Example: SAMO141

Santa María del Oro Lake (SAMO) is a small crater lake in north-western Mexico. Tourism-related development in the area has escalated significantly since 1995, leading to haphazard shoreline modifications. In a study by Ruiz-Fernández et al. (2022), temporal trends of trace elements (As, Co, Cr, Cu, Ni, Pb, V, and Zn) were analysed to investigate the hypothesis that recent anthropogenic activities have resulted in elevated trace element inputs. The study focused on three periods for which background levels were estimated: pre-1900, 1900–1950, and post-1950. Here, we use this result to show the potential of our approach for testing a hypothesis that is well supported by research results, now with a solid Bayesian foundation.

Fig. 3

Temporal evolution of sedimentary lead (Pb) concentrations in Santa María del Oro Lake found in Ruiz-Fernández et al. (2022). The density plots illustrate the posterior distributions of Pb levels measured in parts per million (ppm) across distinct time periods: pre-1900 (blue), 1900–1950 (red), and post-1950 (green). A pronounced peak in Pb concentrations is evident prior to the twentieth century, followed by increased variability during 1900–1950, and a subsequent decline in the likelihood of high Pb levels post-1950

Santa María del Oro’s posterior distributions of lead (Pb) concentrations in the different periods are displayed in Fig. 3. The results indicate that the probability of Pb levels exceeding 25 ppm is 76% before 1900, 0% between 1900 and 1950, and only 3% post-1950. In contrast, the probability of Pb levels ranging from 5 ppm to 25 ppm is 22%, 79%, and 23% for the three periods, respectively. Furthermore, the probabilities of Pb levels below 5 ppm are 1%, 20%, and 73% for the same periods. These probabilities demonstrate that the behaviour of lead levels varied considerably between the three time periods and that the elevated levels cannot be considered anthropogenic. The probabilities are calculated from the posterior distributions of the proxies and account for all the uncertainties in the study; thus, they can be used to support the findings presented in Ruiz-Fernández et al. (2022).
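Each reported probability is simply the fraction of the posterior proxy sample for a given period that falls in the interval of interest, for example (with a synthetic stand-in for the pooled pre-1900 sample):

```python
import numpy as np

pb_pre1900 = np.random.default_rng(2).normal(30.0, 6.0, 5000)  # stand-in sample (ppm)
print(np.mean(pb_pre1900 > 25))                        # P(Pb > 25 ppm)
print(np.mean((pb_pre1900 >= 5) & (pb_pre1900 <= 25))) # P(5 ppm <= Pb <= 25 ppm)
print(np.mean(pb_pre1900 < 5))                         # P(Pb < 5 ppm)
```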

Notably, Ruiz-Fernández et al. (2022) achieved comparable conclusions by utilising classical analysis of variance and box plots. Nevertheless, our method allows the calculation of actual probabilities that can be employed to conduct further analysis and facilitate discussions on the findings.

4 Use of the Posterior Distribution

We have demonstrated how researchers can evaluate their hypotheses using the MCMC sample in conjunction with classical statistics. However, we aim to go a step further and suggest an alternative method that exploits the resulting density of the proxies at any given age. To illustrate, we define a linear regression.

Regression analysis has long been an important tool for environmental reconstruction. It offers a quantitative means of inferring past environmental conditions using proxy data (Birks and Birks 1998; Juggins and Birks 2012). This statistical tool has enabled researchers to discern historical climatic and environmental transitions, bridging the gap between contemporary observations and historical records (Simpson 2007). Such insights are invaluable for understanding the persistent dynamics and trends of environmental evolution. McKay et al. (2021) apply linear regression to each iteration of the MCMC, with the resulting parameters serving as a basis for inference and subsequent conclusions.
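As a rough sketch of that per-iteration strategy (with synthetic stand-in arrays, since we do not reproduce the McKay et al. implementation here), one regression is fitted per MCMC iteration and the resulting parameter ensemble is then summarised:

```python
import numpy as np

rng = np.random.default_rng(5)
T, n = 1000, 40
proxy_ens = rng.normal(20.0, 3.0, size=(T, n))   # proxy at n ages, per iteration t
climate = rng.normal(600.0, 50.0, size=n)        # instrumental target series
# one ordinary least-squares fit per age-depth iteration
slopes = np.array([np.polyfit(proxy_ens[t], climate, 1)[0] for t in range(T)])
print(np.quantile(slopes, [0.025, 0.5, 0.975]))  # slope uncertainty across ensemble
```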

Our proposal diverges from this foundational approach: we treat the proxy value at a given age as a random variable drawn from the posterior distribution defined above. We present a novel and robust methodology, centred on the densities of the posterior sample of the proxies, designed to craft Bayesian models that deduce relevant relationships between proxies and past environmental conditions.

Let \(\hat{Q}_{k}^{g_i}(p)\) be the posterior distribution of proxy k at year \(g_i\). If we assume a linear relationship between proxy k and the mean precipitation in the same year,

$$\begin{aligned} Y&= m q_{k} + b + \epsilon _i, \end{aligned}$$
(5)
$$\begin{aligned} q_{k}&\sim \hat{Q}_{k}^{g_i}, \end{aligned}$$
(6)
$$\begin{aligned} m&\sim \mathcal {N}\left( 0, \sigma _m \right) , \end{aligned}$$
(7)
$$\begin{aligned} b&\sim \mathcal {N}\left( 0,\sigma _b \right) , \end{aligned}$$
(8)
$$\begin{aligned} \epsilon _i&\sim \mathcal {N}\left( 0,\sigma \right) . \end{aligned}$$
(9)

Note that a known physical model can be substituted for this linear relationship. In this context, we introduce a classical Bayesian linear regression. Our approach distinguishes itself by inferring the proxy value using the posterior density—obtained from the kernel estimator of the MCMC sample—as the prior distribution for the system we aim to study. Importantly, this method streamlines the inference of all parameters in line with the physical model, adding depth to the learning process and producing refined posterior distributions for both the sample and the presupposed model.

While we provide this primarily as an illustrative example, the framework can readily be applied to any physical model wherein a proxy signifies an environmental condition. By harnessing the full distribution of potential proxy values instead of relying exclusively on point estimates, we gain a richer and more detailed understanding of the inherent environmental relationships. This method captures the central tendencies of proxy–environment correlations and the inherent uncertainties accompanying each proxy measurement. As a result, our density-centric approach bolsters the credibility and robustness of environmental reconstructions, setting the stage for sharper and more precise depictions of historical environmental contexts.

To illustrate the practical application of our proposed approach, we have analysed SAMO141’s clay content, a proxy with established correlations to precipitation levels, as reported in the studies by Webb et al. (1986) and Zhong et al. (2018). We aimed to explore the potential of adopting a fully Bayesian framework for environmental reconstruction and prediction. We divided the data into a reconstruction period (year \(\le \) 2000) and a prediction period (year > 2000). Figure 4 presents the outcome of applying our model to reconstruct historical precipitation levels from the clay content proxy data in SAMO141.

We first obtained the posterior distribution of clay content at each dated age using the MCMC kernel density estimation approach described previously. This provided the prior distribution \(\hat{Q}_{g_i}(p_{\textrm{clay}})\) for the clay content proxy \(p_{\textrm{clay}}\) at each age \(g_i\). We then implemented the Bayesian linear regression model linking clay content to precipitation during the reconstruction period, where \(Y_i\) represents the precipitation level at age \(g_i\), modelled as a linear function of the clay content proxy \(p_{\textrm{clay},i}\) plus a noise term (\(Y = m q_{\textrm{clay}} + b + \epsilon _i \)). The prior distributions for the slope \(m \sim \mathcal {N}\left( 0, 10 \right) \), intercept \(b \sim \mathcal {N}\left( 0, 10\right) \), and noise standard deviation \(\sigma \sim \mathcal {G}(1.5, 2/3)\) were chosen to be weakly informative.
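A minimal sketch of this model for a single age \(g_i\) is given below, using a random-walk Metropolis sampler and a synthetic stand-in for the proxy sample. The interpretation of \(\mathcal {G}(1.5, 2/3)\) as shape/scale, the proposal step sizes, and the observed precipitation value are all assumptions for illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm, gamma

rng = np.random.default_rng(3)
q_post = rng.normal(20.0, 3.0, 2000)      # stand-in MCMC proxy sample at age g_i
kde = gaussian_kde(q_post)                # \hat{Q}: KDE prior for q, Eq. (6)
y_obs = 650.0                             # observed precipitation at g_i (illustrative)

def log_post(m, b, q, sigma):
    if sigma <= 0:
        return -np.inf
    return (norm.logpdf(y_obs, m * q + b, sigma)     # likelihood, Eq. (5)
            + np.log(max(kde(q)[0], 1e-300))         # KDE prior on q, Eq. (6)
            + norm.logpdf(m, 0.0, 10.0)              # slope prior, Eq. (7)
            + norm.logpdf(b, 0.0, 10.0)              # intercept prior, Eq. (8)
            + gamma.logpdf(sigma, 1.5, scale=2/3))   # noise prior (shape/scale assumed)

state, lp = np.array([0.0, 0.0, 20.0, 1.0]), -np.inf # (m, b, q, sigma)
chain = []
for _ in range(20000):
    prop = state + rng.normal(0.0, [0.1, 0.5, 0.5, 0.05])
    lp_prop = log_post(*prop)
    if np.log(rng.uniform()) < lp_prop - lp:         # Metropolis accept/reject
        state, lp = prop, lp_prop
    chain.append(state.copy())
chain = np.array(chain)[5000:]                       # discard burn-in
```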

For comparative purposes, we also constructed a traditional linear regression reconstruction following common practice in paleoclimatology. This involved first interpolating the clay content measurements to a regular age sequence using a classical age–depth model (CRS model; Appleby and Oldfield 1978). Ordinary least-squares linear regression was then applied over the reconstruction period to derive a linear relationship between these interpolated clay values and precipitation estimates from an instrumental climate record covering the same period. The resulting regression coefficients were finally used to reconstruct precipitation from the interpolated clay data. The period from 2001 to 2013 was withheld from fitting both the Bayesian and the classical models in order to observe each method’s performance in predicting precipitation from clay content.
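A sketch of this classical pipeline (entirely synthetic stand-in series; the CRS point chronology is represented here by a fixed age for each sample):

```python
import numpy as np

rng = np.random.default_rng(4)
years = np.arange(1950, 2001)                     # regular age sequence (point ages)
clay = 30 + 5 * np.sin(years / 8) + rng.normal(0, 1, years.size)  # clay content (%)
precip = 20 * (clay - 30) + 600 + rng.normal(0, 40, years.size)   # instrumental (mm)
slope, intercept = np.polyfit(clay, precip, 1)    # OLS calibration
precip_hat = slope * clay + intercept             # reconstruction
resid_sd = np.std(precip - precip_hat)
band = (precip_hat - 2 * resid_sd, precip_hat + 2 * resid_sd)     # +/- 2 SD interval
```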

As shown in Fig. 4, our fully Bayesian approach yields precipitation reconstructions with associated 95% credible intervals (dotted red lines) that exhibit notably improved uncertainty quantification compared to the classical regression method’s two standard deviation confidence intervals (shaded green regions). A significant limitation is observed in the linear regression model’s ability to generate realistic interval estimates for the period spanning 2001 to 2013. During this time period, the model produces excessively narrow interval estimates, failing to appropriately account for the inherent uncertainties associated with the predictions. In contrast, our Bayesian approach furnishes a more conservative interval estimate and a prediction with greater structure, hinting at a potential low-precipitation event around 2006. By seamlessly integrating Bayesian principles and leveraging the full posterior distributions of the proxy data, our model demonstrates an enhanced capability to accommodate noisy proxy measurements and data gaps robustly.

The proposed approach provides a principled and objective framework for crafting reconstructions while simultaneously improving the fidelity of the proxy records. Our methodology accommodates the comprehensive spectrum of proxy value distributions to facilitate a more holistic depiction of past environmental conditions and potential predictions.

Fig. 4

Annual precipitation reconstructions (\(\le \) 2000) from sediment clay content using Bayesian modelling (red/orange lines and shaded regions showing 95% credible intervals) compared to a regular linear regression approach (green line and shaded two standard deviation intervals). The solid blue line represents the observed mean annual precipitation data used for calibration. The Bayesian model offers a more appropriate representation of uncertainty, as evidenced by the wider interval estimates it provides, particularly for the 2001–2013 period

5 Discussion and Conclusions

While previous attempts have been made to visualise proxy uncertainty (Blaauw et al. 2007), the methodology introduced by McKay et al. (2021) represents a significant advancement: it successfully integrates the full spectrum of age uncertainty into the traditional framework. This methodology highlights the potential to derive more robust interpretations by managing age–depth uncertainties while simultaneously enabling classical statistical analyses on the posterior samples generated by a Bayesian approach.

One of our method’s salient advantages is the interpretation of the posterior distribution of the proxy samples. By harnessing the age–depth samples produced through MCMC, we can discern probability distributions for the proxy values corresponding to each age, facilitated by kernel estimators. Such an approach allows for hypothesis testing across varying time frames, or even the formulation of a secondary Bayesian model to estimate environmental conditions, as demonstrated in our analysis of the SAMO141 clay content data.

The methodology introduced in this study extends beyond the SAMO141 example to any paleoclimate study involving an age–depth model and proxy values associated with depth (and their corresponding ages). By systematically integrating proxy data with environmental patterns using a consistent statistical framework, our approach supports a more nuanced view of past environmental conditions and the probabilities associated with the paleoreconstructions.

In summary, our work not only provides a simple and easy-to-follow mathematical justification for using the MCMC samples from age–depth modelling (McKay et al. 2021) but also lays the groundwork for robust statistical inference via kernel estimators. It offers a valuable contribution to paleoenvironmental research and opens up new avenues for advancing our understanding of past climate conditions in a fully Bayesian framework.