Introduction

The 1998 Aznalcóllar mine disaster that occurred in Spain was, and still is, one of the most catastrophic mine disasters in the Iberian Peninsula and Europe (CSIC 2008). The spill of the mine tailings retained in the pond affected more than 85 km of the Agrio and Guadiamar rivers, flooded more than 4600 ha, and caused major environmental damage. It is Europe’s largest mine spill (Nikolic et al. 2011) and remains the fifth largest spill worldwide (WISE 2020).

The event has been studied from different disciplines generating more than 400 scientific publications (Madejón et al. 2018a), which include two special issues (Grimalt and MacPherson 1999; IGME 2001) and three reviews (Ayala-Carcedo 2004; Madejón et al. 2018b; Sanz-Ramos et al. 2022) that cover the geotechnical aspects, polluted and contaminated soils, and the hydraulics of the spill. This large amount of research aided the development of guidelines for a better design, construction, monitoring, and closure of tailing dams aiming to prevent hazardous situations, to assess the potential effects when a dam-break occurs, to improve remediation activities, and to carry out reclamation activities after decommissioning (Dysarz et al. 2024; Kheirkhah Gildeh et al. 2021; Klose 2007; Penman et al. 2001).

Despite that, only two attempts to simulate the spill propagation process can be found in the literature (Sanz-Ramos et al. 2022). Castro-Díaz et al. (2008) presented a numerical scheme to solve the two-dimensional shallow water equations (2D-SWE), and then applied it to simulate the rupturing of the Aznalcóllar dike and the subsequent spill propagation caused by the breach in the laterals of the pond. Padilla et al. (2016) introduced a depth-averaged distributed hydrological model to simulate surface–groundwater interactions, accounting for the propagation of the surface flow through a diffusive wave approach using the finite element method. Both studies reconstructed the Aznalcóllar mine disaster. In addition to the differences in the fluid motion equations and the numerical approaches used to solve them, different assumptions were made in both cases that might condition the reconstruction of the spill. First, the fluid was treated as if it were clear water, even though it was reported as mine tailings with a high concentration of solids (AGE and JA 1999; Ayala-Carcedo 2004; Ayora et al. 2001; Gallart et al. 1999) and different particle sizes (CMA 1998; Gallart et al. 1999; Gens and Alonso 2006; ITGME 1998; López-Pamo et al. 1999; Manzano et al. 2000; Querol et al. 1998; Vidal et al. 1999). In this sense, the fluid was assumed to behave as a Newtonian fluid. Second, the topographical data utilised were either produced immediately after the failure (Consultec Ingenieros 1999), and thus include the deposited muds (Castro-Díaz et al. 2008), or were those available prior to the disaster, which consisted of a 1:10,000 scale map with contour lines every 10 m (Padilla et al. 2016) and are quite limited for this kind of study because the downstream area is extremely flat. Third, the release of the spill was approached in a different way by the two authors: Castro-Díaz et al. (2008) simulated a breach formation process while Padilla et al. (2016) implemented a hydrograph as an inlet discharge in the numerical model.

Nonetheless, it appears that both works, although based on different numerical tools, suitably reproduced the fluid behaviour, reaching good results in terms of the flood wave arrival time and the maximum height/discharge at the EA90 gauge station, located 7.1 km downstream of the breaking point (Fig. 1a). However, in any flood reconstruction process, the utilisation of post-event topography, or of pre-event topography that lacks resolution, might notably condition not only the flood behaviour (Haile and Rientjes 2005; Horritt and Bates 2001; Yan et al. 2015a, 2015b) but also how the calculation domain is discretised and, thus, the reconstruction of the flood event itself. Furthermore, the consideration of a two-phase breach formation or of a hydrograph with a no-discharge period between the two peaks contrasts with the observations (Alonso and Gens 2006a, 2006b; Gens and Alonso 2006; Sanz-Ramos et al. 2022, 2021b).

Fig. 1

a Study area: location of the breaking point (red star) and the EA90 gauge station (location #1) (source: adapted from Sanz-Ramos et al. (2022)). b Limnograph at EA90 gauge station: observed (dashed line, with operator’s corrections), and proposed by Consultec Ingenieros (1999) (dotted line) and by Borja et al. (2001) (continuous line). c Proposed hydrograph spilled at the breaking point: Consultec Ingenieros (1999) (continuous line), and Padilla et al. (2016) (dashed line); and estimated hydrograph at EA90 gauge station (dotted line) according to Consultec Ingenieros (1999)

Therefore, we numerically reconstructed the spill propagation process of the Aznalcóllar disaster using new topographical data that aim to represent the morphology of the study area prior to the flood event. The reconstruction was based on the observed limnograph at the EA90 gauge station and considered the fluid, on the one hand, as water with a suspended sediment load (a Newtonian fluid) and, on the other hand, as mud-like (a non-Newtonian fluid).

Methods and Materials

Spilled Hydrograph

The hydrograph and volume of the fluid spilled during the Aznalcóllar disaster are highly controversial and only partially known. The review by Sanz-Ramos et al. (2022), made from a hydraulic point of view, revealed the uncertainties in these data, from the observations at the EA90 gauge station (limnograph) to the hydrographs inferred from them at this location and even at the breaking point.

The measurements made at the EA90 gauge station (Fig. 1b) lack data from approximately 3:00 to 6:00 AM, when the fluid depth was above the 2.5 m measurement range of the facility. The first peak of discharge probably occurred during this gap. On the original document, the facility operator manually wrote a value for the maximum fluid depth (3.86 m). Despite that, Borja et al. (2001) and Alonso et al. (2010) not only proposed different values for this peak depth, 3.94 and 3.60 m respectively, but also filled this gap with new data (Fig. 1b).

Only two hydrographs of the spill at the breaking point have been derived from those data (Fig. 1c). On the one hand, a continuous discharge with two peaks of approximately 1050 and 300 m3/s, respectively, was presented in an unpublished document of Consultec Ingenieros (1999). These data came from an inverse convolution process supported by the numerical modelling tool HEC-1. On the other hand, Padilla et al. (2016) also presented a spilled hydrograph with two peaks, with the second one greater than the first. The volumes of these proposed hydrographs were 8.1 and 11.1 hm3, respectively, which exceed the values commonly found in the literature, usually ranging from 4.5 to 6 hm3 (Sanz-Ramos et al. 2022).

However, a detailed analysis revealed that the volume of the spilled hydrograph estimated in Consultec Ingenieros (1999) is ≈0.3–0.4 hm3 greater than the one calculated at the EA90 gauge station by the same authors, even though the fluid was considered as if it were water without sediment. Besides, the data used in the inverse convolution process was the 1:2,000 scale post-failure topography that included the deposited muds, which could largely condition the entire process. Although the peak discharges of the hydrograph of Padilla et al. (2016) agree with the observations in time, their magnitude and the no-discharge period between them highly contrast with the observed data. In this sense, the data used to propagate this hydrograph lack resolution. The topography used in this research had a 1:10,000 scale, with contour lines of 10 m, not detailed enough to properly represent the riverbed, riverbanks, and floodplains.

Although the volume of the hydrographs is within the potential storage capacity estimated by Sanz-Ramos et al. (2022), both present potential issues that might condition the numerical reconstruction of the spill if they are used for that purpose. To that end, a new spilled hydrograph was calculated using a calibration process based on a least squares adjustment and two-dimensional numerical modelling. The aim was to achieve a good fit to the observed limnograph – not the estimated hydrograph – at the EA90 gauge station. For that purpose, the 1977 DEM was used as reference.

Fluid Characteristics

Knowing the fluid properties is essential to achieve good results in the numerical reproduction of any flood event. However, there is a lack of agreement on them in the hundreds of scientific publications (Madejón et al. 2018b) and official reports about the hydraulics of the Aznalcóllar disaster (Sanz-Ramos et al. 2022). This is probably one of the reasons that persuaded other authors to carry out the numerical reconstruction of the spill considering the fluid as clear water (Castro-Díaz et al. 2008; Padilla et al. 2016).

In the large number of references to the Aznalcóllar disaster, the spilled fluid is generally classified as tailings. The residuals of the Aznalcóllar mine activities were separated into different lagoons in Los Frailes pond: pyroclastic tailings in the northern one, and pyritic tailings in the southern one. Thus, a priori, two kinds of fluid behaviour are expected in the spill process, one related to the pyroclastic tailings, commonly referred to as “acid waters” in the literature, and the other to the pyritic tailings or “muds”.

Both fluids were mainly composed of fine solids and water with dissolved metals in different concentrations (Ayala-Carcedo 2004; López-Pamo et al. 1999; Madejón et al. 2018a, b; Santofimia et al. 2013). Although a detailed description of the fluid deposited into the pond is not available (Sanz-Ramos et al. 2022), the size of the particles of the mine tailings retained in the pond did not exceed 200 µm (CMA 1998). This value was later readjusted to values between 4.5 and 13 µm (Antón-Pacheco et al. 2001; Gallart et al. 1999; ITGME 1998; López-Pamo et al. 1999; Manzano et al. 2000; Madejón et al. 2018a; Querol et al. 1998; Vidal et al. 1999). According to Gens and Alonso (2006), the d50 of the particles still stored in the pond after the spill was between 10 and 15 µm in the pyritic lagoon (southern), while in the pyroclastic lagoon (northern) the particles had a wider diameter range, from 18 to 250 µm. Accordingly, the particles must have been transported mainly as suspended load.

The dynamic and static behaviour of a fluid is a function of, among other properties and components, its water–sediment ratio (Pierson and Costa 1987; Hungr et al. 2001). A criterion based on the concentration of particles is widely used in the river engineering field in order to decide the fluid characteristics: for sediment concentrations ≤ 100–200 g/L, the fluid is supposed to retain the properties of water (Newtonian fluid); but at concentrations > 500 g/L, the fluid behaves like a sludge or debris (non-Newtonian fluid). Therefore, intermediate states would correspond to hyperconcentrated fluids, this state being a function of the composition of the fluid itself (Beverage and Culbertson 1964; Costa 1998; Nemec 2009).
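The concentration-based criterion above can be sketched as a small classifier. This is only an illustration: the function name is hypothetical, and because the text gives the lower bound as a range (100–200 g/L), the sketch assumes the upper end (200 g/L) as the default Newtonian limit.

```python
def classify_fluid(concentration_g_per_l, newtonian_max=200.0, non_newtonian_min=500.0):
    """Classify a water-sediment mixture by its sediment concentration (g/L).

    Thresholds follow the criterion quoted in the text; newtonian_max=200 g/L
    is an assumption (the text gives a 100-200 g/L range).
    """
    if concentration_g_per_l < 0:
        raise ValueError("concentration must be non-negative")
    if concentration_g_per_l <= newtonian_max:
        return "Newtonian"
    if concentration_g_per_l > non_newtonian_min:
        return "non-Newtonian"
    # Intermediate state between the two thresholds
    return "hyperconcentrated"


# Illustrative concentrations (g/L); 350 is a hypothetical intermediate value
for c in (26.87, 350.0, 660.0):
    print(c, classify_fluid(c))
```

Passing a different `newtonian_max` (e.g. 100 g/L) shows how sensitive the classification is to the chosen lower threshold.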

Some in situ and post-failure data were reported showing sediment concentrations between 26.87 g/L and 660 g/L (AGE and JA 1999; Ayala-Carcedo 2004; Ayora et al. 2001; Gallart et al. 1999; Martín-Peinado 2002). This wide range of values extends from ‘non-clear water’ (Newtonian fluid), a fluid with a certain concentration of suspended particles such as in natural rivers, to ‘non-Newtonian fluids’, referring to fluids that appear to behave like a mud in both dynamic and static situations.

There is no in situ data regarding the bulk density. A first estimation of 3000 kg/m3 was made by CMA (1998). Other estimated values can be found in the literature, such as 2850 kg/m3 (Penman et al. 2001) and 2950 kg/m3 (Martí et al. 2021). In Alonso and Gens (2006a, b) and Gens and Alonso (2006), the density of the pyritic mineral was defined as about 4300 kg/m3, while a lower value of 3100 kg/m3 was proposed for the liquefied tailings. Ayala-Carcedo (2004) also split the fluid density, with a value of 2000–3100 kg/m3 for the acid waters and 3100 kg/m3 for the muds. Lower values could be achieved after the disaster due to liquefaction and sedimentation processes.

When the failure occurred, after a very fast breach formation that affected the two lagoons (Alonso and Gens 2006a, b; Gens and Alonso 2006), both fluids probably spilled and mixed, and their properties changed during the spill (Ayala-Carcedo 2004); thus, the flow probably propagated as a unique hyperconcentrated fluid (Sanz-Ramos et al. 2021b, 2022). Under the assumption of a monophasic fluid, the numerical reproduction of the spill propagation considered the presented values related to the suspended sediment (Newtonian) and mud (non-Newtonian) modelling.

Topographical Data

One of the main issues when dealing with the reconstruction of historical floods is the availability and quality of the topographical data. Using current topographical data and/or land-use maps to simulate the flood propagation process might lead to results that are numerically valueless and unsuitable for comparison with the observed data.

In this regard, Rediam, the acronym in Spanish for the Andalusian Environmental Information Network, collaborated with other public administrations of Spain to develop a methodology to obtain historical orthophotos for the generation of derived cartographical products (Vales-Bravo et al. 2010). The methodology consisted of collecting available historical information (using analogical information from different photogrammetric flights and developing a digital photogrammetric process); photo scanning; collecting camera calibration certificates; defining work areas; obtaining a ground control point (2nd order: XYZ from stereoscopic 3D models); aerial triangulation; backward update of the digital terrain model (DEM); and performing an orthorectification with that data that included a radiometric adjustment (homogenization) and a mosaic.

The reference data were Rediam’s network of support points (NSP) and the DEM obtained from the 2001–02 photogrammetric flight. The NSP consisted of historical information of the terrain points (XYZ) supported by field observations and other acquisition methods (3D stereoscopic models, orthophotography, DEM, etc.). Generated for supporting the 1998–1999 flight orthorectification process, this database is continuously being updated with other NSP coming from more recent flights and field campaigns. The DEM came from a 1:20,000 scale flight, and consists of a raster file with a 10 m-size grid resolution obtained through photogrammetric correlation and subsequent rigorous editing over stereoscopic 3D models. In the orthorectification processes of historical flights, not only the DEM and the NSP were reused, but also aerial triangulation calculations of the 2001–2002 flight, to obtain the necessary ground control points (XYZ measurements based on stereoscopic 3D models).

The selected historical orthophotos previous to the mine disaster were related to the 1977–1978 and 1984–1985 flights (Rediam 2023). The DEM of the 2001–2002 flight was used, jointly with the stereoscopic models, to carry out a “backward” update and, thus, to generate a new DEM that accounts for the particularities of previous orthophotos (Villa 2008). Due to the similarity in the scale and precision of the 1977–1978 (scale 1:18,000) and 2001–2002 (scale 1:20,000) flights, the DEM corresponding to the 1977–1978 flight was only updated “backwards”. However, due to the differences in the precision of the aerial triangulation and the scale of the flight (1:30,000), a “backward” update could not be directly applied to the 1984–1985 flight. To that end, the reconstructed DEM of the 1977–1978 flight was used to update the 1984–1985 flight using a “forward” technique. A detailed description of the methodology, applicability, and limitations can be found in Vales-Bravo et al. (2010).

The results of these operations were new DEMs with a scale and resolution (10 m cell size) similar to those of the original one, both allowing for the representation of sudden changes in the territory (reservoirs, infrastructures, etc.) according to the orthophotos taken in previous flights. Figure 2 depicts the four DEMs used in the present study. In the 1977–1978 DEM (Fig. 2a), the mine pit does not appear (NW corner), while it does in the 1984–1985 DEM (Fig. 2b). After the restoration activities carried out before 2001, the topography of the affected area did not change (Sanz-Ramos et al. 2022) and, thus, the differences between the 2001–2002 (Fig. 2c) and 2019 (Fig. 2d) DEMs are related to new infrastructures (e.g. new bridges downstream of the study area, ponds, etc.).

Fig. 2

Representation of the topographical data close to the mine pit: (a) DEM of 1977–1978 (product from 2001–2002 data); (b) DEM of 1984–1985 (product from 2001–2002 data); (c) DEM of 2001–2002 (original data); (d) DEM of 2019 (original data)

Numerical Modelling

The reconstruction of the spill propagation was made using the numerical tool Iber (Bladé et al. 2014a), a two-dimensional code that solves the depth-averaged shallow water equations (2D-SWE) using the finite volume method (LeVeque 2002; Toro 2009) and the Roe scheme (Roe 1986).

Iber was originally developed as a numerical tool for flood hazard assessment and risk mapping (Bladé et al. 2014b; González-Aguirre et al. 2016; Sopelana et al. 2017, 2018) and sediment transport process (Arbat-Bofill et al. 2014; Bladé et al. 2019; Cea et al. 2014; Uber et al. 2021) in rivers and estuaries. Nowadays, Iber integrates several calculation modules and capabilities for the numerical modelling of environmental flows (Cea and Bladé 2015; Cea et al. 2016; Ruiz-Villanueva et al. 2019, 2020; Sañudo et al. 2020; Sanz-Ramos et al. 2020b, 2023a, c).

Governing Equations

Since there is a lack of agreement on the fluid properties and the observed behaviour during the flood propagation (Sanz-Ramos et al. 2021b, 2022), the reconstruction of the spill propagation of the Aznalcóllar mine disaster was conducted following two different modelling strategies: simulating the spill either as water with suspended sediment (a Newtonian fluid) or as a non-Newtonian fluid flow (mud).

In the numerical solver, Iber couples the hydrodynamics with the sediment transport processes, both bedload and suspended. The suspended sediment transport module is based on the results of the water depth, velocity, and turbulent viscosity fields computed by the hydrodynamic and turbulence modules (Bladé et al. 2019). In this work, neither bedload nor the turbulent stresses have been considered because the high concentration of sediments (hyperconcentrated flow) considered in the simulation was beyond the application range of the mixing process with clear waters.

Iber solves the 2D-SWE, a hyperbolic nonlinear system of three partial differential equations (Eq. 1):

$$\begin{array}{c}\frac{\partial h}{\partial t}+\frac{\partial {q}_{x}}{\partial x}+\frac{\partial {q}_{y}}{\partial y}=0\\ \frac{\partial {q}_{x}}{\partial t}+\frac{\partial }{\partial x}\left(\frac{{q}_{x}^{2}}{h}+g\frac{{h}^{2}}{2}\right)+\frac{\partial }{\partial y}\left(\frac{{q}_{x}{q}_{y}}{h}\right)=gh\left({S}_{o,x}-{S}_{f,x}\right)\\ \frac{\partial {q}_{y}}{\partial t}+\frac{\partial }{\partial x}\left(\frac{{q}_{x}{q}_{y}}{h}\right)+\frac{\partial }{\partial y}\left(\frac{{q}_{y}^{2}}{h}+g\frac{{h}^{2}}{2}\right)=gh\left({S}_{o,y}-{S}_{f,y}\right)\end{array}$$
(1)

where \(h\) is the water depth, \({q}_{x}\) and \({q}_{y}\) are the two components of the specific discharge, \(g\) is the gravitational acceleration, \({S}_{o,x}\) and \({S}_{o,y}\) are the two bottom slope components computed as \({{\varvec{S}}}_{{\varvec{o}}}={\left(\partial {z}_{b}/\partial \text{x},\partial {z}_{b}/\partial \text{y}\right)}^{T}\), where \({z}_{b}\) is the bed elevation, and \({S}_{f,x}\) and \({S}_{f,y}\) are the two friction slope components computed using the Manning formula.
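As an illustration of the friction term, the sketch below evaluates the two friction slope components with Manning's formula in its usual depth-averaged form, \({S}_{f,x}={n}^{2}u\sqrt{{u}^{2}+{v}^{2}}/{h}^{4/3}\), with \(u={q}_{x}/h\) and \(v={q}_{y}/h\). The function name, the dry-cell threshold `h_min`, and the roughness value used in the example are assumptions for illustration; they are not taken from Iber or from this study.

```python
import math


def manning_friction_slope(h, qx, qy, n, h_min=1e-6):
    """Return (S_fx, S_fy) from Manning's formula.

    h      : water depth (m)
    qx, qy : specific discharge components (m^2/s)
    n      : Manning roughness coefficient (s/m^(1/3))
    h_min  : hypothetical dry-cell threshold below which friction is skipped
    """
    if h <= h_min:
        return 0.0, 0.0
    u, v = qx / h, qy / h                 # depth-averaged velocity components
    speed = math.hypot(u, v)              # velocity magnitude
    factor = n ** 2 * speed / h ** (4.0 / 3.0)
    return factor * u, factor * v


# Hypothetical values: 1 m depth, 1 m^2/s along x, n = 0.03 s/m^(1/3)
sfx, sfy = manning_friction_slope(h=1.0, qx=1.0, qy=0.0, n=0.03)
print(sfx, sfy)
```

The hydraulic radius is approximated by the water depth, which is the standard simplification in depth-averaged models.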

The suspended sediment transport module solves the depth-averaged turbulent convection–diffusion equation. Following the generic convection–diffusion equation presented in Cea et al. (2016), and particularised for suspended sediment transport, it can be described as follows:

$$\frac{\partial hC}{\partial t}+\frac{\partial {q}_{x}C}{\partial x}+\frac{\partial {q}_{y}C}{\partial y}=\frac{\partial }{\partial {x}_{j}}\left(h\left(\Gamma +\frac{{\nu }_{t}}{{S}_{c}}\right)\frac{\partial C}{\partial {x}_{j}}\right)+\left(E-D\right)$$
(2)

where \(C\) is the depth-averaged concentration of suspended sediments, \(\Gamma\) is the molecular diffusion coefficient, \({\nu }_{t}\) is the turbulent viscosity, \({S}_{c}\) is the Schmidt number, and the term \(\left(E-D\right)\) relates to the erosion (\(E\)) and deposition rates (\(D\)).

Considering that neither bedload transport nor turbulent stresses are accounted for (the diffusive turbulent coefficient term \({\nu }_{t}/{S}_{c}\) is neglected), the evolution of the bed elevation \({z}_{b}\) due to erosion–deposition processes is calculated with the sediment conservation equation:

$$\left(1-p\right)\frac{\partial {z}_{b}}{\partial t}=D-E$$
(3)

where \(p\) is the material porosity. The term \(E\) is computed using the expression presented by Ariathurai and Arulanandan (1978), which is valid for cohesive soils. The term \(D\) is computed with the expression presented by Einstein and Krone (1962):

$$D=\left(\frac{{\tau }_{cd}-{\tau }_{b}}{{\tau }_{cd}}\right){W}_{s}\alpha C$$
(4)

where \({\tau }_{cd}\) is the deposition critical stress, \({\tau }_{b}\) is the shear stress computed with the Manning’s formula, \({W}_{s}\) is the settling velocity calculated using the van Rijn (1987) formula, and \(\alpha\) is a parameter that relates the near-bed concentration to the depth-averaged concentration calculated from the Rouse (1937) profile. This last parameter was considered equal to 1 due to the nature of the fluid (hyperconcentrated flow).

The erosion term \(E\) is computed as the excess of the shear stress (\({\tau }_{b}\)) over the erosion critical stress (\({\tau }_{ce}\)), normalised by \({\tau }_{ce}\) and multiplied by a factor (\(M\)) that modulates the erosion rate:

$$E=M\left(\frac{{\tau }_{b}-{\tau }_{ce}}{{\tau }_{ce}}\right)$$
(5)

where \(E=M\) when \({\tau }_{b}=2{\tau }_{ce}\).

Equations 4 and 5 are valid when the shear stress \({\tau }_{b}<{\tau }_{cd}\) and \({\tau }_{b}>{\tau }_{ce}\), respectively; otherwise, \(D\) and \(E\) are equal to 0.
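The threshold behaviour of Eqs. 4 and 5 can be sketched as follows. The function names and the numerical values in the example are hypothetical, chosen only to show that deposition vanishes once the shear stress reaches \({\tau }_{cd}\), that erosion vanishes at or below \({\tau }_{ce}\), and that \(E=M\) when \({\tau }_{b}=2{\tau }_{ce}\).

```python
def deposition_rate(tau_b, tau_cd, w_s, conc, alpha=1.0):
    """Eq. 4: D = ((tau_cd - tau_b) / tau_cd) * W_s * alpha * C.

    Active only while tau_b < tau_cd; otherwise D = 0.
    alpha = 1 follows the value adopted in the text.
    """
    if tau_b >= tau_cd:
        return 0.0
    return (tau_cd - tau_b) / tau_cd * w_s * alpha * conc


def erosion_rate(tau_b, tau_ce, m):
    """Eq. 5: E = M * (tau_b - tau_ce) / tau_ce.

    Active only while tau_b > tau_ce; otherwise E = 0.
    """
    if tau_b <= tau_ce:
        return 0.0
    return m * (tau_b - tau_ce) / tau_ce


# Hypothetical values, consistent units assumed (stresses in Pa)
print(deposition_rate(tau_b=0.0, tau_cd=1.0, w_s=0.001, conc=100.0))
print(erosion_rate(tau_b=2.0, tau_ce=1.0, m=0.5))
```

The net source term of Eq. 2 is then `erosion_rate(...) - deposition_rate(...)`, and Eq. 3 uses the opposite sign for the bed evolution.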

Iber has been recently enhanced by including a specific numerical scheme and calculation module for simulating non-Newtonian fluid flows (Sanz-Ramos et al. 2023a). This module integrates particular rheological models of non-Newtonian fluids, e.g. dense snow avalanches (Sanz-Ramos et al. 2021a), allowing for the representation of equilibrium and quiescent states in irregular geometries without numerical instabilities.

The difference in the 2D-SWE when applied to non-Newtonian fluid flows lies in the friction slope term (\({S}_{f}\)): for water it is usually represented by Manning’s formula, whereas for non-Newtonian fluids it is replaced by a term that represents the particular characteristics of the rheological model (\({S}_{rh}\)).

In the current work, the friction model proposed by Bingham (1916) was considered due to the nature of the fluid. This formulation is commonly used to characterise viscous fluids, and considers the shear stress as the sum of the yield stress (\({\tau }_{y}\)), needed to initiate motion (solid phase), and a velocity- and depth-dependent viscous (or turbulent) stress, characterised by the Bingham viscosity (\({\mu }_{B}\)):

$${S}_{rh}=\frac{3}{\rho gh}\left(\frac{{\tau }_{y}}{2}+\frac{{\mu }_{B}v}{h}\right)$$
(6)
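A minimal sketch of Eq. 6 is given below, assuming that \(\rho\) is the bulk density of the fluid and \(v\) the magnitude of the depth-averaged velocity (neither symbol is defined explicitly alongside the equation). The function name, the dry-cell threshold, and the sample parameter values are hypothetical.

```python
def bingham_friction_slope(h, v, tau_y, mu_b, rho=3100.0, g=9.81, h_min=1e-6):
    """Eq. 6: S_rh = 3 / (rho * g * h) * (tau_y / 2 + mu_B * v / h).

    h     : fluid depth (m)
    v     : depth-averaged velocity magnitude (m/s), assumed meaning of v
    tau_y : yield stress (Pa)
    mu_b  : Bingham viscosity (Pa s)
    rho   : bulk density (kg/m^3); 3100 is the value adopted in the simulations
    h_min : hypothetical dry-cell threshold
    """
    if h <= h_min:
        return 0.0
    return 3.0 / (rho * g * h) * (tau_y / 2.0 + mu_b * abs(v) / h)


# Hypothetical rheological parameters, for illustration only
print(bingham_friction_slope(h=1.0, v=0.0, tau_y=50.0, mu_b=10.0))
print(bingham_friction_slope(h=1.0, v=2.0, tau_y=50.0, mu_b=10.0))
```

For a fluid at rest (\(v=0\)) the expression reduces to \(3{\tau }_{y}/\left(2\rho gh\right)\), so the yield stress alone sets the resistance in quiescent states.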

Domain Discretization

The study area was the riverbed and flood plains of the Agrio and Guadiamar rivers, from 500 m upstream of the breaking point to 700 m downstream of the EA90 gauge station (Fig. 3). This represents the first 9 km of the spill extent and includes about 6 km of the Agrio River and approximately 3 km of the Guadiamar River. The domain was discretised with a mesh of triangular elements of 10 m side length, then updated with the DEM. This implies a density of around 230 els./ha, an order of magnitude below that used in other flood studies (Sanz-Ramos et al. 2020a, 2023b, c) but in agreement with the resolution of all of the DEMs used herein (also with a 10 m cell size in raster format).
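The quoted mesh density can be checked with a short calculation, assuming the triangles are equilateral (the text only gives the side length, so the element shape is an assumption).

```python
import math


def elements_per_hectare(side_m):
    """Number of equilateral triangles of side `side_m` (m) per hectare."""
    area_m2 = math.sqrt(3.0) / 4.0 * side_m ** 2   # area of one equilateral triangle
    return 10_000.0 / area_m2                      # 1 ha = 10,000 m^2


print(round(elements_per_hectare(10.0)))  # 231
```

An element of 10 m side covers about 43.3 m2, which gives roughly 231 elements per hectare, consistent with the figure of around 230 els./ha.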

Fig. 3

Calculation domain and location of the inlet condition and the EA90 gauge station and representation of the calculation mesh, updated with the topographical data, at the Agrio–Guadiamar junction

Simulation Process and Scenarios

The numerical reconstruction of the spill propagation was performed in three steps. A first analysis of the available and estimated hydrographs was done to select the most suitable data to carry out the simulations. Then, the different available topographies (1977, 1984, 2001, and 2019; see the Topographical Data section) were tested with the selected hydrograph, aiming to compare the results with the observed data (fluid depth evolution at the EA90 gauge and flood extent). In this case, the fluid was considered as clear water, i.e. without suspended sediment transport or muds. Finally, as a first attempt to simulate the flood propagation and the extent of deposited sediments at the end of the process, new simulations were performed considering the two above-mentioned modelling strategies with the selected hydrograph and topography: Newtonian fluid, water with suspended sediment transport; and non-Newtonian fluid, Bingham plastic flow (mud).

The simulations of water with suspended sediment transport (non-clear water) were conducted by varying the deposition (\({\tau }_{cd}\)) and erosion (\({\tau }_{ce}\)) critical stresses, the sediment concentration (\(C\)), and the settling velocity (\({W}_{s}\)) within the range of values detailed in Table 1. By contrast, in the simulations that considered the fluid as mud (non-Newtonian), only the yield stress (\({\tau }_{y}\)) and the Bingham viscosity (\({\mu }_{B}\)) were varied according to the range of values presented in Table 1. A bulk density of 3100 kg/m3 was considered in both cases, while 10 µm was assumed to be the characteristic diameter of the sediment particles for the water with suspended sediment modelling.

Table 1 Range of values of the involved variables of each numerical strategy considered in the simulations

Results

Spilled Hydrograph

The measurements at the EA90 gauge station, corrected with the facility operator’s data, are plotted in Fig. 4a (dotted line). This figure also presents the results of the fluid depth evolution of the simulations carried out with the 1977 DEM and the spilled hydrograph proposed by Consultec Ingenieros (1999) (blue line), by Padilla et al. (2016) (green line), and with the proposed hydrograph obtained after the calibration process based on a least squares adjustment (red line). Note that the base flow (≈0.3 m) was neglected in the simulations, but this did not affect the propagation process due to the magnitude of the spill, with a peak depth close to 4 m.

Fig. 4

a Evolution of the flow depth at EA90 gauge station: observation (dotted line), Consultec Ingenieros (1999) (blue line), Padilla et al. (2016) (green line), and according to the proposed spilled hydrograph (red line). b Evolution of the flow discharge at the EA90 gauge station: Consultec Ingenieros (1999) (blue line), Padilla et al. (2016) (green line), according to the proposed spilled hydrograph (red continuous line), and the proposed hydrograph at the breaking point (red dotted line)

The simulated limnograph obtained from the spilled hydrograph of Consultec presented two peaks with maximum values of 3.2 and 2.3 m (Fig. 4a, blue line), with the second one being similar to the observations (2.4 m). However, both peaks were produced 1–2 h later and the falling limbs were higher than observed. The arrival time of the flood front was also ≈1 h late. This lack of agreement can be due to the utilisation of the post-failure topography to infer the hydrograph at the breaking point, which included the deposited muds and, hence, notably differs from the existing topography before the pond failure. This could lead to an underestimation of the maximum peak discharge and also of the volume of the spill.

On the other hand, the limnograph obtained with Padilla’s hydrograph (Fig. 4a, green line) shows that the fluid reached the gauge station at the same time as, and with a maximum flow depth (3.8 m) similar to, the observations. Nevertheless, although the second peak was produced almost at the same time as in the observations, its magnitude was higher than the measurements (≈1.4 m above). Both falling limbs reached lower values. These discrepancies are attributed to the consideration of higher peak discharges separated by a no-discharge period.

By contrast, the ad hoc spilled hydrograph, which was calculated using a least squares adjustment with a two-dimensional numerical tool (Iber) and topographical data from before the disaster with a higher spatial resolution (1977 DEM), suitably fitted the measurements at the EA90 (Fig. 4a, red line). The good fit in the arrival time (≈2:20), both peak depths (3.86 and 2.41 m, respectively), and the shape of the rising and falling simulated limbs demonstrate the validity of the calibration process and of using the new data (pre-failure topography).

The fit between the simulated results and the observations was assessed using several indicators, such as the R-squared correlation (R2), the Nash–Sutcliffe model efficiency coefficient (NSE) (Nash and Sutcliffe 1970), the mean absolute error (MAE), and the root mean square error (RMSE). Table 2 summarizes the performance of the model for the available spilled hydrographs in the literature (Consultec and Padilla) and the calculated one based on a least squares adjustment. As observed, this last hydrograph produced a better fit than those obtained with the Consultec and Padilla’s hydrographs, with both R2 and NSE values close to 1 and with the lowest values for MAE and RMSE.
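The four indicators can be computed as in the sketch below. It assumes that R2 is the squared Pearson correlation coefficient between observed and simulated depths, and it uses the standard definitions of NSE, MAE, and RMSE. The function name and the sample series are hypothetical and are not the data of Table 2.

```python
import math


def fit_metrics(observed, simulated):
    """Return R2, NSE, MAE and RMSE for two equally long series."""
    n = len(observed)
    if n != len(simulated) or n < 2:
        raise ValueError("series must have the same length (at least 2 values)")
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    residuals = [s - o for o, s in zip(observed, simulated)]
    ss_res = sum(r * r for r in residuals)                  # sum of squared errors
    ss_obs = sum((o - mean_o) ** 2 for o in observed)       # variance of observations (x n)
    ss_sim = sum((s - mean_s) ** 2 for s in simulated)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    return {
        "R2": cov ** 2 / (ss_obs * ss_sim),                 # squared Pearson correlation
        "NSE": 1.0 - ss_res / ss_obs,                       # Nash-Sutcliffe efficiency
        "MAE": sum(abs(r) for r in residuals) / n,
        "RMSE": math.sqrt(ss_res / n),
    }


# Hypothetical depth series (m): a simulation offset by +1 m from the observation
print(fit_metrics([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

The example illustrates why several indicators are reported together: a simulation with a constant bias keeps R2 equal to 1 while NSE drops to -0.5 and both MAE and RMSE equal 1 m.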

Table 2 Performance between observed and simulated water depths using the available spilled hydrographs in the literature and the proposed one

The proposed spilled hydrograph at the breaking point is presented in Fig. 4b (red dotted line). It is also characterised by two peaks, 1600 and 275 m3/s respectively, separated by ≈5:30 h, and with a starting time at about 01:00 AM. This hydrograph has a volume of 11.2 hm3, a value within the potential capacity of the pond and the spilled volume estimated in Sanz-Ramos et al. (2022). In the same figure, the simulated hydrographs at the EA90 gauge station are plotted, showing, as expected, a flood abatement. The simulated peak discharges of the proposed hydrograph were reduced to ≈960 and 195 m3/s, with the time of these peaks at around 03:10 and 08:45 respectively. By contrast, the results with Padilla’s data (Fig. 4b, green line) show a similar first peak discharge (≈900 m3/s), but a higher second peak discharge (≈930 m3/s) and almost no discharge between them. The results with Consultec’s data (Fig. 4b, blue line) show a lower magnitude simulated hydrograph at EA90 and a later arrival of the flood.
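The volume associated with a hydrograph follows from integrating discharge over time. The sketch below does this with the trapezoidal rule; the function name and the triangular test hydrograph are hypothetical and do not reproduce the hydrograph of Fig. 4b.

```python
def hydrograph_volume_hm3(times_s, discharges_m3s):
    """Integrate a hydrograph with the trapezoidal rule and return the volume in hm3.

    times_s        : increasing time stamps (s)
    discharges_m3s : discharge at each time stamp (m3/s)
    """
    if len(times_s) != len(discharges_m3s) or len(times_s) < 2:
        raise ValueError("need two series of equal length (at least 2 values)")
    volume_m3 = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt < 0:
            raise ValueError("time stamps must be increasing")
        volume_m3 += 0.5 * (discharges_m3s[i] + discharges_m3s[i - 1]) * dt
    return volume_m3 / 1e6   # 1 hm3 = 1e6 m3


# Hypothetical triangular hydrograph: 0 -> 1000 -> 0 m3/s over 2 h
print(hydrograph_volume_hm3([0.0, 3600.0, 7200.0], [0.0, 1000.0, 0.0]))
```

Applied to a digitised version of a spilled hydrograph, the same function would give the volume to compare against the storage capacity of the pond.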

Topography Analysis

The performance of the topographical data (1977, 1984, 2001, and 2019 DEMs) was evaluated using the proposed hydrograph at the breaking point. Figure 5a compares the observed flow depth evolution at the EA90 gauge station with the results obtained by propagating the proposed hydrograph over the different DEMs considered. The DEMs corresponding to the topography of 1977 (red line) and 1984 (blue line) showed a good fit with the observed data, suitably reproducing both peaks and the shape of the limnograph. By contrast, the results corresponding to the DEMs of 2001 (green line) and 2019 (brown line), obtained after the restoration activities, had a good shape but the flow depth was about 1 m below the observed data. This demonstrates that post-event topographies should not be used to reproduce historical floods.

Fig. 5

a Comparison of the flow depth evolution at the EA90 gauge station between the observed (black dotted line) and the results of the numerical model updated with the DEM of 1977 (red line), 1984 (blue line), 2001 (green line), and 2019 (brown line); b Flooded area obtained with the different DEMs compared to the observed one (1998)

The simulated flood extent was compared to that observed in 1998, obtained from an aerial image taken five days after the spill and field campaigns (JA 2003). The total observed flood extent, limited to the study area, was ≈576 ha, while the simulated flood extents were ≈485, 457, 418, and 483 ha for the DEMs of 1977, 1984, 2001, and 2019, respectively. Figure 5b depicts these areas, considering those inside (blue column) or outside (orange column) the observed area. Although the 1977 DEM provided the closest flood extent to the observed one, almost 6% of the simulated flood lay outside the observed extent. The lower value in the flood extent obtained with the 1984 DEM probably came from the application, first, of a “backward” and, then, of a “forward” update process to obtain it.
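The split between flooded area inside and outside the observed extent can be sketched as a cell-by-cell comparison of two wet/dry masks on the same grid. The function name, the toy grids, and the 100 m cell size are hypothetical; the text does not describe how the comparison was actually carried out.

```python
def flood_extent_overlap(simulated, observed, cell_area_m2):
    """Return the simulated wet area (ha) inside and outside the observed extent."""
    inside = outside = 0
    for sim_row, obs_row in zip(simulated, observed):
        for sim_wet, obs_wet in zip(sim_row, obs_row):
            if sim_wet and obs_wet:
                inside += 1   # simulated flood confirmed by the observation
            elif sim_wet:
                outside += 1  # simulated flood not observed
    to_ha = cell_area_m2 / 10_000.0
    return inside * to_ha, outside * to_ha

# Toy masks (1 = wet, 0 = dry) on a grid of 100 m x 100 m cells (1 ha each)
sim_mask = [[1, 1, 0, 0],
            [1, 1, 1, 0],
            [0, 1, 1, 1]]
obs_mask = [[1, 1, 1, 0],
            [1, 1, 1, 0],
            [0, 0, 1, 0]]
area_in, area_out = flood_extent_overlap(sim_mask, obs_mask, 10_000.0)
```

For these toy masks, 6 ha of the simulated flood fall inside the observed extent and 2 ha outside it.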

Considering the previous results, the proposed spilled hydrograph (see Spilled hydrograph) and the DEMs of 1977 and 1984 provide good results in terms of flow behaviour and flood extent. However, it is important to highlight the uncertainties in the measurements at the EA90 gauge station. According to Sanz-Ramos et al. (2022), on the basis of the real measurements, from ≈3:00 to ≈6:00 AM, the fluid depth was above the measurement range of the facility (limited to 2.5 m). This fact might condition the results for both the flood extent and the flow behaviour.

Flood Reconstruction

The following sections present the results of the first attempt to characterise the deposited sediments after the event through the numerical reconstruction of the flood using the 1977 DEM. To that end, two different approaches were considered for the fluid: water with suspended sediment (a Newtonian fluid) and a Bingham plastic (a non-Newtonian fluid). The results of the fluid depth at the EA90 gauge station are explored in the Discussion.

Newtonian Fluid: Water and Sediment Transport

In the simulations of non-clear water, the hydrodynamic and suspended sediment transport modules of Iber were applied by varying the parameters \(C\), \(\tau_{cd}\), \(\tau_{ce}\), and \(W_{s}\). The results are labelled as \(C\) _ \(\tau_{ce}\) _ \(\tau_{cd}\) _ \(W_{s}\).

It is important to highlight that the sediment concentrations (\(C\)) ranged from 26 g/L, which is a higher value than that usually found in the river in natural conditions, to 660 g/L, which can be considered a hyperconcentrated sediment flow. Although a fluid with this last concentration would be considered non-Newtonian, there are no limitations in using even higher values in a 2D-SWE-based numerical model coupled with a sediment transport module.
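To illustrate how the four parameters interact, the sketch below uses a formulation that is common for cohesive sediments: Krone-type deposition below \(\tau_{cd}\) and Partheniades-type erosion above \(\tau_{ce}\). This is an assumption made for illustration. The text does not state which formulation Iber applies, and the erosion constant `m_erosion`, the function name, and the units (\(C\) in kg/m3, \(W_{s}\) in m/s, stresses in N/m2) are hypothetical.

```python
def bed_exchange_rate(tau_b, conc, tau_cd, tau_ce, w_s, m_erosion=1.0e-4):
    """Net bed exchange (kg/m2/s): negative for deposition, positive for erosion."""
    deposition = 0.0
    erosion = 0.0
    if tau_b < tau_cd:
        # Krone-type deposition: particles settle while the bed shear stress is low
        deposition = w_s * conc * (1.0 - tau_b / tau_cd)
    if tau_b > tau_ce:
        # Partheniades-type erosion: the bed is entrained above the erosion threshold
        erosion = m_erosion * (tau_b / tau_ce - 1.0)
    return erosion - deposition

# Parameter set labelled 26_2_1_0.01 (C = 26, tau_ce = 2, tau_cd = 1, W_s = 0.01)
calm = bed_exchange_rate(0.5, 26.0, 1.0, 2.0, 0.01)     # below tau_cd: deposition
neutral = bed_exchange_rate(1.5, 26.0, 1.0, 2.0, 0.01)  # between thresholds: no exchange
strong = bed_exchange_rate(4.0, 26.0, 1.0, 2.0, 0.01)   # above tau_ce: erosion
```

Under this formulation, raising \(\tau_{cd}\) widens the range of bed shear stresses at which particles can settle, and the deposition rate scales linearly with \(C\).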

Figure 6 presents the maps of the height of the deposited sediments at the end of the simulation. Lower values of the sediment concentration (\(C\)) and shear stresses (\({\tau }_{cd}\) and \({\tau }_{ce}\)) provided results (Fig. 6a, 26_2_1_0.01) not in agreement with the observations (Fig. 6, grey polygon). Increasing the shear stresses and keeping \(C\) equal to 26 g/L (Fig. 6b, 26_275_165_0.01), the extent of the deposited sediments expanded to the limits of the flood extent; however, the height of sediments was generally less than 0.1 m. When the concentration of sediment was assumed to be 660 g/L, the height of the deposited sediments increased considerably. Figure 6c, which corresponds to lower shear stress values (660_2_1_0.01), shows maximum heights of 1.2 m a few metres downstream of the breaking point. Maximum heights up to 2 m of deposited sediments were obtained for the higher values of the shear stresses (Fig. 6d, 660_275_165_0.01), with huge amounts of sediments being deposited on the riverbed (> 0.7 m). There were no remarkable differences for different settling velocities (\({W}_{s}\)) due to the nature of the spill, which was characterised by two peak discharges that flooded the riverbed and part of the flood plains.

Fig. 6

Simulated extent of the deposited sediments after the flood according to the deposition (\({\tau }_{cd}\)) and erosion (\({\tau }_{ce}\)) critical stresses, the sediment concentration (\(C\)), and the settling velocity (\({W}_{s}\)): a 26_2_1_0.01, b 26_275_165_0.01, c 660_2_1_0.01, d 660_275_165_0.01. The grey polygon depicts the observed flood extent (source: JA (2003)). Negative values mean ‘deposition’ while positive ‘erosion’

This approach suitably reproduced the backwater effect in the Agrio and Guadiamar rivers. The fluid travelled about 500 m towards the northern part of the reservoir and more than 1.5 km from the Agrio-Guadiamar junction, opposite to the natural slope, appropriately reproducing both the fluid behaviour and the location of the deposited sediments.

Non-Newtonian Fluid: Mud

Next, the new module of Iber oriented to simulate non-Newtonian fluid flows was applied to simulate the spill, considering the fluid as a Bingham plastic. The simulations were carried out by varying the parameters \({\tau }_{y}\) and \({\mu }_{B}\), with the results being labelled as \(\tau_{y}\) _ \(\mu_{B}\).

As anticipated, a different behaviour was observed when the governing equations used in the simulations were those of non-Newtonian fluid flow. The Bingham parameters were also varied, with the yield stress (\({\tau }_{y}\)) ranging between 0 and 50 N/m2 and the viscous (or turbulent) stress (\({\mu }_{B}\)) ranging from 0 to 2000 Ns/m2. Thus, a different front wave velocity and deposited mud height were expected at the end of the simulation.

Figure 7a shows the simulated flood considering a theoretical fluid with \({\tau }_{y}\) = 0 N/m2 and \({\mu }_{B}\) = 0 Ns/m2 (0_0). In such a case, 24 h after the pond failure, the fluid continued flowing and the muds were only deposited in depressed areas. In Fig. 7b (0_2000), the fluid continued flowing but with lower velocities, which implied a greater extent of the flood at the end of the simulation. By contrast, Fig. 7c (50_0) and Fig. 7d (50_2000) depict the extent of the deposited muds; i.e. with the fluid stopped. This demonstrates the role of the yield stress (\({\tau }_{y}\)) in stopping the fluid, which is a non-velocity-dependent term computed with an upwind scheme (Sanz-Ramos et al. 2023a). The depth of fluid deposited during the simulation with the maximum flow resistance considered (Fig. 7d, 50_2000) was greater than 2 m (garnet colour) in several areas, with the extent of the flood adequately adjusted to the observations. Despite the good results in terms of the flood extent, the simulated flood front velocity was the slowest.
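The stopping role of the yield stress can be illustrated with an idealised balance for a uniform layer on a gentle slope: the layer comes to rest once the gravity-driven bed shear stress \(\rho g h S\) no longer exceeds \({\tau }_{y}\). This is a simplified sketch, not the upwind scheme used in Iber. The function name and the fluid density are assumptions, and the slope of 0.15% is the mean value reported later in the text for the reach upstream of the Agrio-Guadiamar junction.

```python
def stopping_depth(tau_y, slope, density=1000.0, gravity=9.81):
    """Depth (m) at which a uniform Bingham layer stops: rho * g * h * S = tau_y."""
    if slope <= 0.0 or density <= 0.0:
        raise ValueError("slope and density must be positive")
    return tau_y / (density * gravity * slope)

# Assumed density of 1000 kg/m3 (hypothetical; the tailings were denser than water)
h_no_yield = stopping_depth(0.0, 0.0015)    # tau_y = 0 N/m2: the layer never stops
h_max_yield = stopping_depth(50.0, 0.0015)  # tau_y = 50 N/m2
```

With no yield stress the stopping depth is zero, which is consistent with the 0_0 and 0_2000 cases still flowing after 24 h. A denser fluid would give a proportionally smaller stopping depth for the same \({\tau }_{y}\).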

Fig. 7

Simulated extent of the deposited muds after the flood according to the yield stress (\({\tau }_{y}\)) and the viscous stress (\({\mu }_{B}\)). The results are labelled as \({\tau }_{y}\) _ \({\mu }_{B}\): a 0_0, b 0_2000, c 50_0, and d 50_2000. The grey polygon depicts the observed flood extent (source: JA (2003))

Discussion

On Using a Hydrograph as the Inlet Condition Instead of a Breach Formation

In the current work, a hydrograph was considered as an inlet condition in the numerical model. This approach was adopted due to the general uncertainties in the hydraulics of the disaster (Sanz-Ramos et al. 2022), especially in the breach formation process because there were no direct observations or measurements.

According to the trilogy of papers of Alonso and Gens (Alonso and Gens 2006a, 2006b; Gens and Alonso 2006), where the causes of the embankment rupture were analysed from a geotechnical point of view, the failure occurred in less than 16 s. A fan-like displacement on the east dike of the southern lagoon affected a length of 600 m and opened a breach of ≈55 m. This sudden movement would have immediately affected the central dike and the east dike of the northern lagoon, generating an opening in the embankment that would have affected the retained fluids in both lagoons.

In the numerical reproduction of the spill presented by Castro-Díaz et al. (2008), a breach formation was simulated. However, the embankment rupture process considered in this simulation contrasted with previous research because the breach was first generated on the northern lagoon, and then in the southern one. Additionally, although the simulated arrival time of the flood wave front at the gauge station suitably fit with the observations (not presented in this document), the fluid depth was ≈1 m below the measurements (Castro-Díaz et al. 2008). This was possibly caused by the utilisation of the post-failure topography, which was demonstrated to be not suitable because it includes the deposited muds.

Related to the breach hydrograph, it is relevant to highlight that there are types of failure that, due to their geometry and evolution, are complex to reproduce with 2D-SWE-based numerical tools. This would be the case, for example, of a failure caused by internal erosion or piping. In these cases, formulas could be used to calculate the hydrograph generated by the break, which could be implemented in the model as an inlet condition. However, there are other processes whose relevance has not been considered to date, such as the formation of a breach in a dike due to its displacement.

Another fact to highlight is that for the present research, a new methodology was implemented into Iber allowing for the consideration of the breach formation by reading topography rasters. This method provides the possibility of generating breaches by the sliding of one of the dikes (Fig. 8), as well as the definition of almost any type of breach geometry that evolves over time (Vahedifard et al. 2017). In these cases, the necessary data are the pre- and post-break topography.

Fig. 8

Example of the breach formation process due to a displacement of the dike (above) and subsequent flooding process (below) due to the formation of a breach

This methodology was demonstrated to be unsuitable for the Aznalcóllar failure. In this case, the fluid was released immediately after the generation of the breach and, thus, the two peaks registered at the EA90 gauge station were not properly reproduced. The causes of the generation of the two peaks could be due to liquefaction of part of the retained fluid in both lagoons (Ayala-Carcedo 2004; Kheirkhah Gildeh et al. 2021; Penman et al. 2001).

Therefore, considering the uncertainties on the hydraulics of the Aznalcóllar disaster, the proposed hydrograph (see Spilled hydrograph) at the breaking point proved to be the most suitable option to reproduce the flood propagation. Novel numerical approaches and new DEM data that better represent the pre-failure topography were used for the propagation of the proposed hydrograph, showing good agreement with the observations and the physics of the problem.

On Using the Hydrograph at EA90 Gauge Station

A handful of researchers presented different hydrographs at the EA90 gauge station, although only a limnograph was registered. A hydrograph can be obtained from a limnograph using the proper rating curve, an unambiguous relationship between the discharge and the flow depth/elevation at a section. However, for the case of the Aznalcóllar disaster, the flow depth registered at the EA90 gauge station exceeded the maximum measurable value of 2.5 m for a long period. Additionally, the gauge station was rebuilt after the event, changing the geometry and, thus, also changing the rating curve. Only a limited pre-accident rating curve (up to 1.1 m of flow depth) could be inferred from the data observed before the event (Sanz-Ramos et al. 2022).
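The conversion from a limnograph to a hydrograph, and the reason a limited rating curve is a problem, can be sketched as a table lookup that refuses to extrapolate. The function name and the rating points are hypothetical; only the 1.1 m upper limit and the 2.5 m gauge limit come from the text.

```python
def discharge_from_depth(depth, rating_curve):
    """Linear interpolation in a sorted (depth, discharge) table; None outside its range."""
    if depth < rating_curve[0][0] or depth > rating_curve[-1][0]:
        return None  # no extrapolation beyond the range of the curve
    for (d0, q0), (d1, q1) in zip(rating_curve, rating_curve[1:]):
        if d0 <= depth <= d1:
            return q0 + (q1 - q0) * (depth - d0) / (d1 - d0)
    return None

# Hypothetical rating curve (depth in m, discharge in m3/s), valid up to 1.1 m
curve = [(0.0, 0.0), (0.5, 5.0), (1.1, 20.0)]
q_low = discharge_from_depth(0.8, curve)   # inside the range: about 12.5 m3/s
q_high = discharge_from_depth(2.5, curve)  # at the gauge limit: None
```

Any depth recorded during the flood peaks lies far outside such a curve, so the published hydrographs at EA90 necessarily rest on additional assumptions.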

Despite that, several authors estimated a potential hydrograph at the EA90 gauge station (Fig. 9a). It was first presented in the unpublished document of Consultec Ingenieros (1999), showing a maximum peak discharge of 600 m3/s. A few years later, using the rating curve of Benito et al. (2001), which was estimated from a one-dimensional hydraulic analysis of a downstream cross-section, Borja et al. (2001) calculated a hydrograph with a maximum peak discharge of 1056 m3/s. Finally, Ayala-Carcedo (2004) estimated a similar hydrograph using data from Palancar (2001). In this case, the fluid was considered as a mixture of water and mud, and the author indicated that the estimated peak discharge of 811 m3/s was probably underestimated due to the higher viscosity of the fluid.

Fig. 9

a The estimated (black lines) and calculated (coloured lines) flow discharge at EA90 gauge station according to Consultec Ingenieros (1999) (black dotted line), Borja et al. (2001) (black dashed line), Ayala-Carcedo (2004) (black continuous line), Consultec Ingenieros (1999) (blue line), Padilla et al. (2016) (green line), and according to the proposed spilled hydrograph (red continuous line). b The simulated discharge – depth relation at EA90 with the proposed hydrograph

The estimated hydrograph presented by Consultec Ingenieros (1999) (Fig. 9a, black dotted line), besides being the lowest in magnitude, was time-displaced by ≈1 h. The peak discharge and the flood front arrival time at EA90 of the other two estimated hydrographs coincided in time. However, according to the specific energy theory in open channel flows (Chow 1959), and Koch’s parabola, which relates the flow depth and the specific discharge, the maximum discharge is produced prior to the maximum flow depth (Muste et al. 2020). This is what happened in the simulated hydrographs at EA90 (Fig. 9a, coloured lines). The results of the simulation with the proposed hydrograph had a time-gap of ≈12 min for the first peak and 1.4 min for the second peak. The hysteresis of flow variables obtained with the numerical model is presented in Fig. 9b.
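The time gap between the discharge peak and the depth peak can be read from a pair of simulated series as follows. This is a minimal sketch for a single flood peak: the function name and the sample series are hypothetical, and for a two-peak event such as this one each peak would need to be analysed in its own time window.

```python
def peak_lag_minutes(times_min, discharge, depth):
    """Time of maximum depth minus time of maximum discharge (min)."""
    i_q = max(range(len(discharge)), key=discharge.__getitem__)
    i_h = max(range(len(depth)), key=depth.__getitem__)
    return times_min[i_h] - times_min[i_q]

# Hypothetical series sampled every 6 min around one peak
t = [0, 6, 12, 18, 24, 30]
q = [100.0, 600.0, 950.0, 900.0, 700.0, 400.0]  # discharge, m3/s
h = [0.5, 1.8, 2.9, 3.3, 3.6, 3.0]              # flow depth, m
lag = peak_lag_minutes(t, q, h)
```

A positive lag means that the maximum discharge precedes the maximum depth, which is the loop-shaped (hysteretic) discharge-depth relation plotted in Fig. 9b.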

On the Topographical Data for Historical Flood Reconstruction

Topography is one of the main factors in the assessment of flood prone areas (Fu et al. 2022; Wang et al. 2015). The use of historical topography, at least from the years preceding the flood event, is mandatory for accurately reconstructing historical floods.

In addition, depending on the magnitude of the flood and the type of river (ephemeral or perennial), the riverbed (bathymetry) must be considered as an integral part of the topographical data used in the simulation (Adnan and Atkinson 2012; Dey et al. 2022; Neal et al. 2021). That is, the elevation data used to update the elevation of the nodes of the calculation mesh must include the bathymetry of the riverbed and the topography of the riverbanks and flood plains before the event. Otherwise, the flood propagation process is underestimated while the flood extent is overestimated. This is a common issue when using free and/or massively distributed DEM data, which considers the free surface of the water layer instead of the bathymetry of water bodies (rivers, lakes, reservoirs, etc.).

According to the mean daily water depth and discharge data during 1998 at the EA90 gauge station (Sanz-Ramos et al. 2022), the river discharge before the disaster was less than 1 m3/s. Since the peak discharge produced during the event was several orders of magnitude greater, consideration of the bathymetry is not relevant in this case for the reconstruction of the fluid propagation.

Several techniques can be used to obtain the topography prior to a flood event, from digitizing historical topographic maps (contour lines) to using local or worldwide freely distributed DEMs generated before the flood (mainly since the beginning of the twenty-first century). However, as previously mentioned, the first DEM available in this case was generated in 2001, after the disaster. The information extracted from the digitization of contour lines is limited by the map scale, which is 1:10,000 for the available 1998 topographic map of the study area and leads to contour lines at 10 m intervals. Given the extremely flat area where the flood propagated, with mean terrain slopes less than 0.15% from the breaking point to the Agrio-Guadiamar junction and less than 0.06% downstream (Ayala-Carcedo 2004; Benito-Calvo et al. 2001), these data are not representative enough for a suitable reproduction of the event.
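The limitation can be quantified with simple arithmetic: on a uniform slope, the horizontal distance between consecutive contour lines equals the contour interval divided by the slope. The helper below is an illustrative sketch with a hypothetical function name; the contour interval and slopes are those quoted in the text.

```python
def contour_spacing_m(contour_interval_m, slope_percent):
    """Horizontal distance (m) between consecutive contour lines on a uniform slope."""
    if slope_percent <= 0.0:
        raise ValueError("slope must be positive")
    return contour_interval_m / (slope_percent / 100.0)

upstream = contour_spacing_m(10.0, 0.15)    # about 6.7 km
downstream = contour_spacing_m(10.0, 0.06)  # about 16.7 km
```

Since the reported slopes are upper bounds, consecutive 10 m contour lines lie more than about 6.7 km apart along the flow path upstream of the junction and more than about 16.7 km apart downstream, leaving the relief between them unresolved.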

The technique used herein for reconstructing the topography is a step forward not only because it combines field data with remote sensing techniques, but especially because historical DEMs were obtained from a backward update process. The methodology followed is similar to that used for flights with analogue photogrammetric cameras (now in disuse). Most of the NSP points (with XYZ coordinates in the field, 1st support order) were not identifiable. For this reason, XYZ points obtained directly on stereoscopic pairs (2nd support order) were needed for 1977–1978 and 1984–1985. Another relevant and novel aspect of the methodology was that the waypoints and connection points from the calculation of the aerotriangulation of the 2001–2002 flight were used as altimetric control points (Z) for the 1977–1978 flight. This could be the reason why the 1977 DEM seems to be more reliable than the 1984 DEM; in addition, the 1984 DEM needed an additional step for its determination (a “backward” update to 1977 and a “forward” update to 1984).

The identifiable differences between the mentioned DEMs are plotted in Fig. 10a. The closest historical orthophotographic images of the Spanish Geographical Institute (IGN 2021; Rediam 2023) to the 1977 (Fig. 10b) and 1984 (Fig. 10c) DEMs are also presented. Besides the changes in the area where the pond was built (Fig. 10a, west side), a notable terrain accretion is observed on the east side, with differences of up to 8 m on the left riverbank of the Agrio River. However, no remarkable differences that would explain this sudden change in the topography were observed when comparing the orthophotos.

Fig. 10

a Difference in elevation between the 1984 and the 1977 DEMs. b Orthophoto taken in June 1977 (source: Rediam, ‘Interministerial’ flight 1973–1986). c Orthophoto taken in November 1981 (source: Rediam, ‘Nacional’ flight, 1980–1986)

This accretion reduced the flood extent by ≈29 ha, with the simulated flood extent 5.3% outside of the observed one (see Fig. 5b). Thus, considering the double step needed to generate the 1984 DEM, and the poorer results in terms of flood extent, the 1977 DEM was demonstrated to be more reliable for the flood reconstruction.

On the Nature of the Fluid and the Simulation of the Propagation Process

During the reconstruction of the Aznalcóllar mine disaster presented in this document, an attempt was made to explore the wide range of uncertainties of the event. Estimation of the spill's hydrograph at the point of rupture relied on contemporary and reliable numerical techniques (specifically, a 2D-SWE-based model). Additionally, the most recent data used aimed to replicate the morphological features of the river and floodplains as they existed before the pond's failure. With all that, two different numerical approaches were used to characterise not only the extent of the flood, but also the amount of sediments deposited after the event.

Consideration of water and suspended sediments (non-clear water) provided similar results to clear water because the resistance forces are computed in the same way, i.e. with the Manning formula. A similar flood extent caused by the first peak discharge was generated in both cases. As Iber couples the hydrodynamics and the sediment transport process, the terrain accretion caused by the sedimentation of particles modified the topography and, thus, the fluid behaviour. The simulated fluid depth at the EA90 gauge station for this case is presented in Fig. 11a. Low concentrations (26 g/L) provided almost the same limnograph as for clear water (Fig. 11a, green and purple lines). Higher concentrations (660 g/L) notably modified the morphology of the river and the flood plains, especially after the first peak discharge. When the second peak discharge was produced, the fluid flowed according to the new topography that included the deposited sediments. The rising and falling limbs of the second peak reflect this behaviour (Fig. 11a, red and orange lines). In such cases, the low values of the deposition (\({\tau }_{cd}\)) and erosion (\({\tau }_{ce}\)) critical stresses provided a good adjustment, while the high values generated greater flow depths due to the considerable topographical changes.

Fig. 11

a Comparison of the flow depth at EA90 gauge station between the observations (dotted line) and the simulations. The results are labelled as \(C\) _ \({\tau }_{ce}\) _ \({\tau }_{cd}\) _ \({W}_{s}\): 26_2_1_0.01 (green), 26_275_165_0.01 (purple), 660_2_1_0.01 (red), and 660_275_165_0.01 (orange). b Volume of deposited sediments after the event estimated by López-Pamo et al. (1999) (black and white) and the simulations (maroon)

The volume of the deposited sediments is plotted in Fig. 11b. The observed volume was extracted from the data of López-Pamo et al. (1999), who provided a 1:50,000 map of the deposited muds indicating their mean height. In this case, the volume ranged from ≈0.65 to ≈1.24 hm3, considering a maximum sediment height of 1 m in the study area (Fig. 11b, black and white bar), equivalent to a total estimated mud volume of 1.98 hm3 in the entire affected area. The volume of deposited sediments resulting from the simulations also ranged widely (Fig. 11b, maroon bars), from a few cubic metres to 1.5 hm3. Although this last value is above the observations, a greater deposited volume could have been produced in the study area (Sanz-Ramos et al. 2022), especially due to the averages made in the original map, which could have hidden maximum values.
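A deposited volume of this kind can be obtained from a gridded map of deposit heights by summing height times cell area, optionally capping the height as was done for the observed map. The sketch below is illustrative: the function name, the toy grid, and the 100 m cell size are hypothetical, and heights are taken here as positive deposit thicknesses.

```python
def deposited_volume_hm3(heights_m, cell_area_m2, max_height_m=None):
    """Sum deposit height times cell area over a grid; result in hm3."""
    total_m3 = 0.0
    for row in heights_m:
        for height in row:
            if height <= 0.0:
                continue  # no deposit in this cell
            if max_height_m is not None:
                height = min(height, max_height_m)  # cap, e.g. at 1 m
            total_m3 += height * cell_area_m2
    return total_m3 / 1.0e6  # 1 hm3 = 1e6 m3

# Toy grid of deposit heights (m) on 100 m x 100 m cells
deposits = [[0.0, 0.2, 0.5],
            [0.1, 1.4, 0.0]]
uncapped = deposited_volume_hm3(deposits, 10_000.0)
capped = deposited_volume_hm3(deposits, 10_000.0, max_height_m=1.0)
```

The capped total is smaller than the uncapped one whenever any cell exceeds the cap, which mirrors the remark that averaging in the original map could have hidden maximum values.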

The attempt to model the spill as a non-Newtonian fluid flow (mud) provided disparate results. The consideration of no resistance forces (\(\tau_{y}\) _ \(\mu_{B}\) as 0_0) led to inaccurate results (Fig. 12a, blue line), although ≈0.2 hm3 of the volume of the spill was deposited in the depressed zones of the study area (Fig. 12b). The extreme values of \({\tau }_{y}\) and \({\mu }_{B}\) (0_2000, 50_0, and 50_2000) also generated results far from the observations. The high values of the resistance forces generated a flood front arrival time to the EA90 gauge station at about 9:00 and 10:30 AM and a volume of deposited mud of 7.1 and 9.7 hm3 for 0_2000 and 50_2000, respectively. Intermediate values of the rheological model (20_20, 25_5, and 25_15) provided suitable results in terms of flood wave evolution (Fig. 12a) and the volume of the deposited muds (Fig. 12b), but in these cases, the maximum flow depth was ≈3.4–3.5 m, ≈0.5 m below the value written by the facility operator, and the falling limb of the second peak did not match the observations.

Fig. 12

a Comparison of the flow depth at EA90 gauge station between the observations (dotted line) and the simulations (non-Newtonian fluid). The results are labelled as \({\tau }_{y}\) _ \({\mu }_{B}\): 0_0 (blue), 0_25 (maroon), 0_2000 (green), 20_20 (purple), 25_5 (cyan), 25_15 (orange), 50_0 (dark blue), and 50_2000 (brown). b Volume of deposited sediments after the event estimated by López-Pamo et al. (1999) (black and white) and the simulations (maroon)

None of the numerical approaches presented, including the classical one that considered the fluid as clear water, provided a perfect match with the observed data. The differences could stem from several sources, ranging from the topographical data used to infer the spilled hydrograph to the kind of fluid considered in the simulations. In terms of flood behaviour, the consideration of a Newtonian fluid as water, with or without suspended sediments, generated the best fit on the limnograph. The combination of parameters \(C\) _ \(\tau_{ce}\) _ \(\tau_{cd}\) _ \(W_{s}\) as 660_2_1_0.01 (Fig. 11a, red) provided the best agreement with the observations; however, the extent of the deposited sediments was smaller in area. By contrast, the simulations considering the fluid as non-Newtonian (muds) showed the poorest results in terms of flood behaviour and flood extent. A good fit with the observed extent of the deposited muds was only obtained with high values of the rheological model (\(\tau_{y}\) _ \(\mu_{B}\) as 50_2000), but huge amounts of muds (> 2 hm3) were deposited in the study area in this scenario.

These facts reinforce the idea that the fluid of the spill behaved more like a highly concentrated or hyperconcentrated sediment-laden flow, as suggested by Sanz-Ramos et al. (2022, 2021b), than the mud-like flow generally denoted in the literature. The fluidification of the retained mine tailings a few hours after the dike failure could have generated the second peak registered in the gauge station (Ayala-Carcedo 2004). One of the main limitations of both approaches is the consideration of a constant sediment concentration, i.e. the same rheological properties during the event. A non-Newtonian behaviour should be expected during the first stages, with the arrival time at the gauge station well captured, while the sedimentation of particles probably changed the bulk properties, generating higher propagation velocities and lower depths. This agrees with the change in colour of the spill observed downstream of Vaqueros ford, where it changed from dark blue to orange (JA 2003; Sanz-Ramos et al. 2022).

Conclusions

The simulation of historical flood events is challenging due to uncertainties in the measurements, the lack of observations, and the available topographical data. Furthermore, adequate characterisation of the fluid (Newtonian or non-Newtonian) produces a better representation of the fluid rheology and, thus, of the resistance forces that define the flow behaviour, both in the static and dynamic phases.

Reconstruction of the spill propagation process of the Aznalcóllar mine tailings that occurred in 1998 was performed using new data and current numerical techniques. To that end, a new DEM predating the mine disaster was used. This was generated or updated “backwards”, allowing for the representation of sudden changes in the territory based on previous flights. Furthermore, the spilled hydrograph at the breaking point was also estimated through a least squares process with a two-dimensional numerical tool and using topographical data prior to the disaster. This resulted in a hydrograph of two peaks (≈1600 and ≈275 m3/s) with a volume of ≈11.2 hm3 that, once propagated over the pre-failure topography, provided the best fit to the observations.

A first attempt at simulating the spill as water with sediment transport (Newtonian fluid flow) and as mud-like fluid (non-Newtonian fluid flow) was performed. Consideration of the Newtonian fluid as water, with or without suspended sediments, generated the best fit between the observed and simulated limnographs, while the simulations that considered the fluid as non-Newtonian (muds) showed the poorest results in terms of flood behaviour and flood extent. These results suggest that the spill probably behaved more like a highly concentrated or hyperconcentrated sediment-laden flow than the mud-like flow generally denoted in the literature. The fluidification of the mine tailings retained in the pond after the failure could have led to a more complex fluid behaviour, changing the rheology of the fluid during the dynamic phase, and the propagation of a multi-phase fluid cannot be ruled out.