Introduction

Grasslands are one of the world’s major biomes, covering more than one-third of the Earth’s terrestrial area and two-thirds of its agricultural area1. Grasslands are a vital source of fodder for livestock production, but also provide a wide range of other ecosystem services related to plant functional traits, such as erosion protection, water purification, carbon storage and sequestration, and habitat for many species2,3,4,5,6. Higher plant species richness is often associated with increased ecosystem services in both intensively and extensively managed grasslands. For example, plant species richness may contribute to increased biomass yield and yield stability, regulating services such as carbon sequestration and cultural ecosystem services7,8,9. Management factors such as fertilization, mowing, and grazing regimes are, however, important determinants of plant species richness10,11,12,13. Adjusting management to increase plant species richness, on the other hand, may imply a loss of yield14,15,16. While available evidence suggests a negative relationship between management intensity and biodiversity, the magnitude of this relationship is heterogeneous and context-dependent17,18. Existing studies are often based on experiments, but there is a lack of large-scale causal studies on the relationships between management intensity, biodiversity and yield, and quantification of the underlying heterogeneity.

Here, we present a new methodological approach to estimate causal, spatially explicit effects of management intensity on biodiversity and illustrate it using the effect of mowing frequency on plant species richness as an example. The approach can be applied to other research contexts and data sets as they become available. We answer the question: what change in species richness would we observe on each specific field if the field had been mowed differently? We demonstrate the scalability of our approach using large, satellite-derived proxy data for all of Germany for the period 2017−2020 (1,313,073 fields observed for up to four years, resulting in n = 5,008,614 total observations). We also quantify the implicit trade-offs of lower mowing frequency in terms of yield losses and the opportunity costs associated with lower mowing frequency and, consequently, higher plant species richness.

We extend previous research by considering the spatial variation in impacts of changing mowing frequency rather than focusing on the static relationship between current (i.e., observed) mowing frequency and different biodiversity indicators. Spatially explicit modeling of causal effects of changes in mowing frequency is important for guiding policy to be effective and cost-efficient19,20. Information on the spatial and temporal variability of a conservation policy’s impact can help to target contexts where expected benefits of changing or maintaining certain management practices are greatest21,22.

The increasing availability of high-resolution remote sensing products enables us to build a unique dataset and evaluate grassland management policies at national scale23. We use the estimated number of mowing events24 per year as a proxy for grassland management intensity (see Method section for details). As a proxy for biodiversity, we use annual estimates of plant species richness25. Both variables are remotely sensed and exemplify our approach at scale, but other ecologically nuanced indicators may be integrated in future studies as they become available. Specifically, the estimates of plant species richness refer to the number of plant species per 16 m2 square plot and are produced using time series of Sentinel-2 imagery and plant inventories across Germany26. Using generalized random forests27 to analyze the resulting dataset enables us to evaluate context-specific, spatially explicit causal impacts at field level, i.e., we estimate the effect of higher frequency of mowing events on species richness in the following year. The core principle of this approach is to predict plant species richness under a different mowing frequency than the current one. As it is not possible to observe and compare a higher and a lower frequency mowing regime for the same field in the same year, we use a comparison group based on similar mowing propensities, but different observed mowing regimes. These mowing propensities are calculated from the observed variables and their non-linear interactions and are used to weight observations for valid inference (see Method section).
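
The adjustment idea behind this comparison can be illustrated with a deliberately simplified sketch. The Python code below is not the generalized random forest used in this study: it replaces the forest with two linear fits on a single, made-up context variable and estimates one average effect by regressing outcome residuals on treatment residuals. All data are simulated, and the names (context, mowing, richness, TRUE_EFFECT) are hypothetical.

```python
import random
import statistics


def slope_and_residuals(x, y):
    """Fit y = a + b*x by least squares; return (b, residuals)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    residuals = [(yi - mean_y) - b * (xi - mean_x) for xi, yi in zip(x, y)]
    return b, residuals


def adjusted_effect(context, treatment, outcome):
    """Regress outcome residuals on treatment residuals (both net of context)."""
    _, w_res = slope_and_residuals(context, treatment)
    _, y_res = slope_and_residuals(context, outcome)
    return sum(w * y for w, y in zip(w_res, y_res)) / sum(w * w for w in w_res)


# Simulated fields (assumption): one context variable raises both the
# mowing frequency and the species richness, which confounds a naive fit.
rng = random.Random(0)
TRUE_EFFECT = -1.6  # simulated effect of one additional mowing event
context = [rng.gauss(0.0, 1.0) for _ in range(5000)]
mowing = [2.0 + 0.5 * c + rng.gauss(0.0, 0.5) for c in context]
richness = [23.0 + 2.0 * c + TRUE_EFFECT * m + rng.gauss(0.0, 1.0)
            for c, m in zip(context, mowing)]

naive, _ = slope_and_residuals(mowing, richness)        # ignores context
adjusted = adjusted_effect(context, mowing, richness)   # accounts for context
print(f"naive slope: {naive:.2f}, adjusted effect: {adjusted:.2f}")
```

In this simulated example the naive slope has the wrong sign because the context variable raises both mowing frequency and richness, whereas the adjusted estimate lies close to the simulated effect. A generalized random forest extends this logic by learning the adjustment non-parametrically and by letting the effect vary with the context.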

Finally, we extend our analysis by assessing the yield implications of lower mowing frequency. We thus provide a spatially explicit quantification of species richness-yield trade-offs, moderated by mowing frequency. This enables policy makers to more effectively target and monitor conservation policies while considering provisioning services and related food security implications. More specifically, we estimate the opportunity cost of marginally extensifying 30% of Germany’s permanent grasslands to meet the 30 by 30 goal to protect 30% of terrestrial ecosystems by 2030, as set out in the new Kunming-Montreal Global Biodiversity Framework28.

Results

Field-level estimates of plant species richness in our panel dataset (N = 5,008,614) over the four years considered, 2017–2020, vary between 4 and 69 species across all mowing frequencies, with a mean of 23.3 (SD = 5.7). We calculated plant species richness for the whole of Germany using a previously calibrated model, for which the authors report an overall coefficient of determination (r²) of 0.42 based on data from the Biodiversity Exploratories25. The observed annual mowing frequency in our dataset ranges from 0 to 5.5 with a mean of 1.9 (SD = 0.82). The authors of the underlying mowing event dataset report an average F-score of 0.6 across all years and test regions24. This treatment indicator represents the number of uses (mowing events) per year and is continuous because pixel-level counts, which can vary within a field, are averaged at field level.

Causal effect of mowing frequency on plant species richness

We find that plant species richness decreases with increasing mowing frequency, but the magnitude of this effect varies across space. This relationship is causal in the sense that we isolate the causal effect of mowing frequency from other observed and unobserved factors that explain both mowing frequency and plant species richness. We mitigate bias from reverse causality by looking at plant species richness in the subsequent year. Our generalized random forest analysis shows that, on average, a unit increase in mowing frequency leads to 1.6 (SE: 0.05) fewer plant species in the following year, with a full range from −3.5 to 0.2 (SD: 0.39) (Fig. 1a). Using discrete mowing frequency instead of a continuous indicator as the treatment variable leads to an effect of similar magnitude (Fig. S5). The results of the model calibration tests show a strong predictive ability of the generalized random forest model (p < 0.001). The omnibus tests for heterogeneity indicate that the effect varies across space and time (p < 0.001, Table S3, row 1). Identifying spatial heterogeneity, as in our analysis, is highly policy relevant. For example, decision makers may be interested in spatially prioritizing or avoiding changes in mowing regimes in certain areas depending on the expected impact on ecological indicators, such as plant species richness in the application shown here. Panels B and C in Fig. 1 show how the effect varies along observed mowing frequency and species richness. They show that intensification has a slightly smaller negative impact on already intensively managed fields with already low species richness, while it has a much larger impact on fields that initially had a high species richness. The 5% most species-rich fields exhibit a partial effect of mowing frequency three times higher than that of the 75% of fields with lower species richness.
Panel D shows the geographic distribution of extensification impacts; i.e., the absolute impact of a unit reduction in mowing frequency. The largest gains from management extensification in terms of plant species richness occur in the northeast, while the smallest gains occur in the more intensively managed northwest and south of Germany.

Fig. 1: Impact of changes in mowing frequency on plant species richness.

Left panels show doubly robust average partial effects across (a) quantiles of the predicted effect, (b) quantiles of mowing frequency and (c) outcome variable with 95 percent confidence intervals in gray. The impact magnitude slightly diminishes with increasing mowing frequency. In contrast, the impact of mowing frequency becomes drastically stronger for very species-rich fields. d shows the geographic distribution of the predicted field-level impacts of a one-unit increase in mowing frequency aggregated to 1 km grid cells. Only 0.34% of the fields have an estimated positive impact of increased mowing frequency.

Drivers of impact heterogeneity

These spatial patterns are driven by spatial and temporal patterns in contextual factors, which we reveal through their variable importance. Average temperatures and precipitation in the spring and summer months are among the most important variables for predicting the impact of mowing frequency on plant species richness (Fig. S4). Furthermore, structural variables at the municipal level such as average farm size and cattle density predict impact heterogeneity, as is visible from some marked district boundaries, especially in the south (Fig. 1, panel d). Variables describing field structure (size, diameter) are important predictors of impact, which is intuitive, given that these variables reflect long-term land use and management decisions and historic adaptation to local production conditions. In contrast, soil characteristics, location dummies and surrounding landscape diversity are not found to be important predictors. In the context of generalized random forests, this does not necessarily imply that these predictors are irrelevant, but rather that their explanatory power may be captured by other variables.

Plant species richness in grasslands has been shown to decrease with increasing mowing frequency11. To explore this further, we test the sensitivity of our estimates using subsets of fields whose mowing frequency increased or decreased monotonically between 2017 and 2020. We also estimate the average mowing frequency over time and use it as a treatment indicator. All three specifications result in much larger effect magnitudes, implying that the impact proliferates over time (see Table S3, rows 4, 7, and 8). In line with our expectations and previous research11, we also find that delaying the first mowing activity leads to higher species richness (Table S3, row 5).

Robustness checks

Our treatment variable (mowing frequency) will not capture all relevant management dimensions affecting species richness. As an alternative proxy, we therefore use a composite index that includes mowing frequency, grazing intensity and fertilization intensity, which is available for Germany for 2017 and 201829,30. We find a similar, but 30% smaller average partial effect size, which is moderately correlated with our main estimates (r = 0.778). The smaller magnitude could be due to the limited temporal coverage of the alternative data; our main indicator also yields smaller effects when restricted to the years 2017 and 2018 (see Table S3, rows 18 + 19).

We visually confirm that our common support assumption holds, namely that the probabilities of different mowing frequencies overlap (Fig. S1). For example, we observe fields that are mowed twice but have a predicted mowing propensity of three, and other fields mowed three times but have a predicted mowing propensity of two. This overlap in propensities allows us to compare fields under different mowing regimes and make causal inferences because the fields are comparable based on their covariate space. Our results are robust to spatial and temporal restrictions of the training data, the omission of important predictor variables, and potential observation bias, i.e., different probabilities of fields being covered by remote sensing (Table S3, rows 9-20, Fig. S2). Furthermore, placebo tests show that our model is not sensitive to spurious patterns in the training data that could bias our estimates (Table S3, rows 23-24).

Quantifying yield trade-off implied by lower mowing frequency

Reducing mowing frequency increases plant species richness but leads to a reduction in total harvested biomass14 and thus yield (relevant for food security), which implies opportunity costs for farmers in the form of forgone revenue. To quantify this biodiversity-yield trade-off arising from lower mowing frequencies, we assess the implications of reduced mowing frequency on primary production. We aim to derive spatially explicit upper-bound opportunity costs for field owners in three steps (see Method section for details). First, we use a biophysical growth model to estimate baseline field-level dry matter yields based on actual weather, soil, and management31. Second, we use generalized random forests to predict the change in yield associated with a change in mowing frequency27. That is, we re-estimate yields under a unit change in mowing frequency to get the difference between actual and counterfactual yields. Because mowing frequency and yields are highly endogenous (i.e., mowing affects yield, but expected yield also affects the mowing decision), we use latitude, longitude, and elevation as the only predictors, so our yield model (unlike the species richness model) has no causal interpretation. Third, we multiply the predicted changes in dry matter yield by an average hay price of €70 per ton32 to monetize yield losses. Panel A in Fig. 2 shows the distribution of yield losses associated with marginal, i.e., unit-wise, changes in mowing frequency. They range from 1 to 3 t ha−1 dry matter yield (mean: 2.4 t ha−1). Panel B in Fig. 2 shows the cumulative monetary value of forgone yield (i.e., upper bound opportunity cost) for one additional plant species averaged over the years 2017–2020. For example, the presence of one additional plant species due to lower mowing frequency on the 30,000 km2 grassland with the lowest opportunity costs is associated with forgone yields worth approximately 70 million euros.
Panel C shows the spatial distribution of the upper bound opportunity costs. Notably, an additional plant species would incur ten times higher opportunity costs in terms of forgone yields in the southern parts of Germany compared to the central and eastern parts. This is likely due to the lower prevailing plant species richness in these intensively used regions, resulting in a lower impact of extensification on this outcome measured in absolute terms (see Fig. 1), and possibly partly due to environmental factors that favor higher biomass production and mowing frequencies, like less sandy soils and more precipitation.
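
The monetization in the third step, and its conversion into a cost per additional species, can be sketched as follows. This is a minimal Python illustration rather than the code used in this study; the function name and the way the two quantities are combined per field are assumptions, and the example simply plugs in the mean yield loss (2.4 t ha−1) and the mean species effect (1.6) reported above.

```python
HAY_PRICE_EUR_PER_T = 70.0  # average hay price stated in the text


def opportunity_cost_per_species(yield_loss_t_ha, species_gain,
                                 price=HAY_PRICE_EUR_PER_T):
    """Upper-bound cost (EUR per ha) of one additional species on a field.

    yield_loss_t_ha: forgone dry matter yield under one mowing event less.
    species_gain: predicted increase in species richness for the same change.
    """
    if species_gain <= 0:
        # No predicted gain: an additional species cannot be "bought" here.
        return float("inf")
    return yield_loss_t_ha * price / species_gain


# Illustrative only: combines two reported averages, not a field-level result.
example_cost = opportunity_cost_per_species(2.4, 1.6)
print(round(example_cost, 1))
```

With these average values the sketch gives roughly 105 euros per hectare and additional species; field-level values differ because both the yield loss and the species gain vary across space.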

Fig. 2: Upper bound opportunity cost of increased plant species richness.

a shows the distribution of forgone dry matter yield per hectare under a one unit lower mowing frequency. b shows the cumulative upper bound opportunity cost curve in terms of forgone hay production, assuming an average hay price of €70/t. The gray confidence band indicates annual variability and prediction error. c shows the spatial distribution of associated opportunity costs in terms of the monetary value of forgone hay yield per unit increase in plant species richness. Plot-level estimates are aggregated over four years (2017–2020) and to a 1 by 1 km grid for visualization.

High potential for effective spatial targeting

Financial resources for species protection are limited33. Cost-effective targeting of biodiversity policies is therefore crucial to efficiently balance biodiversity and food security as well as costs to farmers (if we impose measures) and taxpayers (if we compensate farmers for voluntary measures). However, the past Common Agricultural Policy of the European Union has been shown to be ineffective in conserving high value grasslands and promoting their biodiversity34. To shed light on the relevance of spatial targeting, we compare the environmental and economic performance of different contextual targeting policies. We consider the 30 by 30 goal set in the Global Biodiversity Framework28, which aims to protect 30% of land by 2030. In Table 1, we compare the effectiveness and cost-efficiency of different policy scenarios that compensate for a marginal decrease in mowing frequency. The first column of Table 1 shows the baseline protected area, average plant species richness and yields (at the status quo level), while subsequent columns show differences relative to the baseline. In the second column, we consider a scenario in which grassland fields located within currently existing protected areas marginally reduce their current mowing frequency by one unit. Our results indicate that this would affect 19.7% of permanent grasslands and increase the average species richness by 1.72 (7%), but it would also be associated with a decrease in dry matter biomass yield of 1.8 t ha−1 (20%) on average. Without any spatial targeting, reducing mowing frequency on a randomly selected 30% of the grasslands in our sample results in an average increase of 1.46 plant species but also implies an average decrease in dry matter yield of 1.9 t ha−1 (column 3). Targeting the 30% of land with the highest predicted impact of extensification on plant species richness results in an average species gain of 2.1 (i.e., 47% more effective, see column 4).
Furthermore, targeting the fields with the lowest yield losses from reduced mowing frequency can reduce the associated forgone yield by 0.5 t ha−1 while achieving a higher impact on plant species richness than a non-targeting policy (1.68 vs. 1.46, see column 5). Finally, we estimate that the most cost-efficient policy, which targets minimum forgone yield per species gained, would incur only 40 percent of the compensation costs of a non-targeting policy (column 6). The annual upper-bound opportunity costs of extensifying 30% of all permanent grasslands range from 131 million euros (target low yield changes) to 181 million euros (no targeting), or 98–135 euros per hectare.
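
The logic of these targeting scenarios can be sketched in a few lines of Python. The ten fields below are hypothetical, as are the function names and the greedy area-based selection rule; the sketch only shows how ranking the same fields by different criteria leads to different selections for a fixed 30% area target, and it is not the procedure used to produce Table 1.

```python
HAY_PRICE_EUR_PER_T = 70.0  # average hay price stated in the text

# Hypothetical fields: (id, area in ha, predicted species gain,
# predicted yield loss in t/ha) for one mowing event less.
FIELDS = [
    (1, 1.0, 2.4, 2.8), (2, 1.0, 2.2, 2.6), (3, 1.0, 2.0, 1.5),
    (4, 1.0, 1.8, 1.2), (5, 1.0, 1.6, 2.4), (6, 1.0, 0.8, 1.0),
    (7, 1.0, 1.4, 2.2), (8, 1.0, 1.2, 2.0), (9, 1.0, 1.0, 2.5),
    (10, 1.0, 0.8, 1.8),
]


def select(fields, rank_key, share=0.3):
    """Greedily pick fields in ascending rank order up to `share` of the area."""
    budget = share * sum(f[1] for f in fields)
    chosen, covered = [], 0.0
    for f in sorted(fields, key=rank_key):
        if covered + f[1] <= budget + 1e-9:  # small tolerance for float sums
            chosen.append(f)
            covered += f[1]
    return chosen


def summarize(chosen):
    """Area-weighted mean species gain and mean yield loss of a selection."""
    area = sum(f[1] for f in chosen)
    gain = sum(f[1] * f[2] for f in chosen) / area
    loss = sum(f[1] * f[3] for f in chosen) / area
    return gain, loss


# Ranking criteria (the cost ratio assumes strictly positive species gains).
strategies = {
    "max impact": lambda f: -f[2],
    "min yield loss": lambda f: f[3],
    "min cost per species": lambda f: f[3] * HAY_PRICE_EUR_PER_T / f[2],
}
selected = {name: select(FIELDS, key) for name, key in strategies.items()}
for name, chosen in selected.items():
    gain, loss = summarize(chosen)
    print(f"{name}: fields {sorted(f[0] for f in chosen)}, "
          f"gain {gain:.2f} species, loss {loss:.2f} t/ha")
```

In this toy example the three criteria select overlapping but different sets of fields, which mirrors the reason why the scenarios in Table 1 differ in both species gains and forgone yield.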

Table 1 Comparison of targeting scenarios

Discussion

Our main contribution is to quantify the causal relationship between changes in mowing frequency and grassland plant species richness and its spatial variation. The field-level effects of increased mowing frequency were somewhat smaller on already intensively managed fields with low species richness (especially in northwestern and southern Germany), whereas they were much larger on fields with initially high species richness (see Fig. 1, panels B and C). The spatial patterns we identified differ from a previous study that found larger effects across observed mowing regimes in Southern Germany11, possibly because we account for self-selection in mowing regimes and estimate potential outcomes. The general patterns and relationships between species richness and mowing frequency reflect our expectations: Low-intensity mowing is a moderate disturbance that affects individual plant growth and thus community characteristics and species richness, and tends to increase species diversity in grasslands (compared to no mowing, which was not considered here due to the scope of this study). In contrast, high-intensity mowing tends to drastically reduce species richness11. This is known as the intermediate disturbance hypothesis35, where seed limitation at high disturbance levels and micro-site limitation at low disturbance levels play a role11. During intensification, as a result of increased mowing frequency, plant species tend to disappear faster than they are able to repopulate after extensification, due to dominant cover of mowing tolerant species (especially grasses) and lack of connectivity with source populations18. This asymmetry is of great policy relevance, for example when deciding where to support higher species richness and where to prioritize avoiding its loss. While we find that impacts proliferate over time, our approach does not lend itself to directly testing the above-mentioned mechanism, i.e., whether repopulation is slower than disappearance.
The average causal estimate presented here of −1.6 species per additional mowing event (or plus 1.6 species for one fewer mowing event) is smaller than correlational estimates reported in previous studies11,18,19, underscoring the importance of controlling for confounding contextual variables36. The timing of mowing, i.e., before, during or after the species-specific flowering and fruiting periods, has been shown to affect species-level reproductive success37. Consistent with these findings, we also found that the earlier the day of the first cutting, the lower the species richness (Supplementary Table S3, row 5).

The mowing event data used here has been reported to also count some grazing activities as mowing events24. Grazing generally affects vegetation in a similar way to mowing, but produces a more patchy distribution across space and time13. In contrast, mowing exerts a more uniform spatial and temporal influence, even over large areas, by consistently removing biomass within a short period, leading to vegetation homogenization11. The impact of grazing varies with stocking density and grazer species, often favoring more unpalatable taxa while suppressing others. We acknowledge that our dataset cannot always distinguish precisely between grazing and mowing, but we do not expect this to affect our general conclusions on the relationships between management intensity and plant species richness.

Regarding the spatial representativeness of our data, we address the potential bias arising from geographically clustered training data25,26 by using an area of applicability mask38. The validation of our plant species predictions against spatially representative independent data39 (Supplementary Information Fig. S7) shows no evidence of systematic bias, but we recognize the need to harmonize data collection protocols for better comparability. Restricting our predictions to the area of applicability limits spatial error propagation, thereby avoiding spatially correlated errors40. Given the large sample size, this approach addresses potential issues related to the unknown degree of individual prediction uncertainty. We also acknowledge the possibility of correlated measurement errors in both treatment and outcome indicators due to partial reliance on the same Sentinel-2 satellite imagery. Such errors could affect both indicators in the same way and thus introduce unobservable bias. In our particular case, we are not concerned that correlated measurement errors could invalidate our findings, because of the fundamentally different mapping approaches (rule-based for mowing frequency versus neural network for species richness) used to generate outcome and treatment variables.

We use plant species richness as an available proxy for biodiversity to illustrate our approach, while acknowledging its limitations in fully encapsulating the multiple qualitative aspects of biodiversity. The quality of remotely sensed indicator maps of land management intensity and biodiversity required for more comprehensive assessments has steadily improved, and this trend is likely to continue. The approach presented here is flexible and can be easily adapted to new predictors and outcomes. Future research will thus benefit from improved remote sensing data with increased measurement precision, allowing for a wider range of conceptually relevant variable constructs. These could include ecologically more nuanced indicators that clearly differentiate mowing from grazing as well as biodiversity indicators such as relative abundances or the presence of endangered species41, or regionally characteristic species42, as effects may vary by subgroups18. Future research should explore additional metrics such as functional traits and diversity that are directly linked to ecosystem services, their trade-offs and synergies in grasslands6. Remote sensing of functional plant diversity in grasslands is an active field of research43, and improvements in this area are anticipated. Despite this, plant species richness – particularly dicotyledonous species indicative of low land-management intensity – remains the most widely used ecological indicator in results-based agri-environmental schemes across Europe due to its simplicity and ease of communication44. Similarly, the focus on measuring outcomes should extend beyond singular dimensions of provisioning ecosystem services, like yield, to encompass the broader spectrum of potential impacts on various ecosystem services12,17,45. Finally, future research should refine the opportunity cost estimates by considering not only revenues but also production costs to allow for a more comprehensive economic analysis.

To the best of our knowledge, this study is the first to leverage a comprehensive set of remotely sensed, field-level estimates of the effects of management changes on biodiversity, both measured in real-world agricultural systems, to inform conservation targeting at the national level (here Germany). In addition, our results contribute to a better understanding of spatially explicit trade-offs between different objectives such as high yields and high biodiversity. Our methodological framework illustrates the potential of integrating high-resolution remote sensing data with causal machine learning and biophysical modeling20. It is flexible and can accommodate improved and more comprehensive data sources that are likely to emerge over time.

Overcoming current limitations could substantially improve the targeted design of conservation policies, but important implications arise already at this stage. Our results reveal that one-size-fits-all policy solutions, such as reducing mowing frequency by the same magnitude everywhere, tend to be ineffective. This insight follows from the spatial heterogeneity of mowing impacts and corroborates findings from previous studies21,46. In practice, however, this heterogeneity of impacts has often been ignored, resulting in heuristic, but suboptimal, policy design and legislation47. Since the impact of higher mowing frequency on plant species richness is greatest in areas that are currently characterized by high plant species richness, policy should focus on keeping mowing frequency low in these areas. This is supported by research showing that agri-environmental schemes are more effective in marginal areas than in intensively used farmland48. Conversely, extensification efforts are likely less effective and cost-efficient in areas of high mowing frequency, such as the northwest and southern regions of Germany. This implies a need for the development and sustainable use of technological and management innovations that raise the profitability of species-rich grasslands. In other words, innovations that focus on grassland productivity can only affect the intensive and extensive margins of production leading to trade-offs with species richness.

In summary, we show that considerable gains in policy effectiveness and efficiency can be expected from leveraging large spatial datasets and digital tools for grassland management23. For example, our field-specific estimates of plant species richness and associated upper bound opportunity costs can be used to design and monitor more cost-effective agri-environmental schemes. In particular, our spatially explicit impact predictions can help to design and implement change-based or result-based rather than action-based payment schemes49. Such schemes have been shown to be the preferred option among farmers in some situations50 and could be implemented using a revealed cost approach, e.g., in combination with auctions. The policy analysis shows that targeting areas with low opportunity costs is not only the most cost-efficient, but also a very effective approach to increasing plant species richness. Our results underscore the need for targeted, context-specific conservation policies, rather than one-size-fits-all solutions, enabled by the integration of large spatial datasets and digital tools. Potential real-world applications of our approach include the operationalizing of baselines for biodiversity credits or supporting regional land use and conservation planning. Our findings can support the design of tailored agri-environmental schemes and ultimately contribute to broader efforts towards reconciling biodiversity conservation and provisioning ecosystem services in grasslands.

Methods

This study covers a random subset of the available data (N = 50,000) sampled from all permanent grassland fields51 across the entire Federal Republic of Germany (n = 1,313,073) observed over the four years 2017–2020. We applied an agricultural grassland mask52 to ensure that considered fields are comparable with respect to their land use and have the primary purpose of fodder production. We used temporally and spatially explicit contextual variables to identify determinants of different mowing frequencies and to estimate heterogeneous treatment effects along the most relevant of these variables. Table S1 shows details and references of all data sources used. The following subsections describe the outcome variable, treatment indicator, contextual variables and sampling strategy. Summary statistics of all input variables are presented in Table S2, annual maps of treatment and outcomes in Fig. S6.

Outcome variable: plant species richness

To approximate biodiversity, we generated a novel dataset of plant species richness for the whole of Germany at a resolution of 20 m for the years 2017–2021 based on a previously published method25. This method involved model calibrations with plant species inventories from the Biodiversity Exploratories25,26. To ensure the spatial transferability of the trained model for prediction across Germany, the feature space of the input Sentinel-2 data was used to mask out areas where the model was not applicable38. We thereby excluded from the prediction map those grassland fields whose spectral-temporal signature differed from that of the training fields. Based on the area of applicability38, our prediction is valid for 70% of the German grassland areas. Even though the calibration areas already cover a wide range of environmental and management conditions26, we acknowledge that they might not be fully representative of the relationships between species richness and satellite-derived information throughout Germany. Therefore, additional uncertainties in species richness estimates are expected, even within the area of applicability. We use independent secondary data39 to validate the spatial representativeness of our estimates (see Fig. S7 and Supplementary Note S1 for details). To reduce bias from non-applicable areas in our estimation, we used the percentage of overlap of the plant species richness map with grassland fields as weights in our analysis. This essentially gives a lower weight to fields with missing pixels, reducing the associated uncertainty regarding applicability. Time series of Sentinel-2 imagery for Germany were downloaded and interpolated using the FORCE processing software53, before applying the predictive model. The resulting product is a 20 m resolution raster that estimates species numbers at 16 m2, which is the original size of the plant inventory data.
The spatial mismatch between plant inventory size and raster size does not affect our interpretation of species number as an intensive variable, i.e., relative to area.
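
The overlap weighting described above can be illustrated with a small Python sketch. The pixel counts and richness values are made up, the function names are hypothetical, and a weighted average is only the simplest place where such weights can enter; how the weights are passed to the estimation in this study is not reproduced here.

```python
def overlap_weight(valid_pixels, total_pixels):
    """Share of a field that is covered by the species richness map."""
    if total_pixels <= 0:
        raise ValueError("a field needs at least one pixel")
    return valid_pixels / total_pixels


def weighted_mean(values, weights):
    """Average of values in which each field counts according to its weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("all weights are zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical fields: (estimated richness, valid pixels, total pixels).
fields = [(22.0, 50, 50), (30.0, 10, 40)]
weights = [overlap_weight(valid, total) for _, valid, total in fields]
richness_mean = weighted_mean([r for r, _, _ in fields], weights)
print(weights, round(richness_mean, 2))
```

The second field, of which only a quarter lies within the area of applicability, contributes correspondingly less to the weighted average than the fully covered first field.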

Treatment variable: mowing frequency

We use the number of mowing events per year as a proxy for grassland management intensity for the years 2017–202024. Mowing frequency has previously been identified as a useful indicator of management intensity within mowed grasslands54 as it is directly related to the human appropriation of net primary production. The pixel-level number of mowing events has previously been mapped across Germany from temporal changes in reflectance in Sentinel-2 and Landsat-8 time series using a rule-based classification approach with dynamic thresholds that vary with environmental conditions24. The authors report a state-of-the-art overall accuracy of 60% with a slight tendency to underestimate mowing events in regions that were often covered by clouds and thus could not be observed with sufficient temporal resolution. We here calculate the average mowing frequency per field, resulting in a continuous indicator. In addition to mowing frequency, the intensity of grassland management is commonly characterized by fertilization input and grazing pressure, as these dimensions also affect, among other things, nutrient loads in the soil and flora29,55. Fertilization and mowing frequency are often positively correlated, as higher nutrient availability stimulates biomass growth and thus allows or requires more frequent mowing. Grazing may be either positively or negatively associated with mowing, depending on whether the two complement or substitute for each other in the management regime. The mowing product we used cannot differentiate between meadows and pastures, as the temporal reflectance signature of a mowing event may be indistinguishable from an intensive rotational grazing scheme. In contrast, extensive grazing typically produces a less pronounced temporal reflectance signal and thus is not always identified as a mowing event in this product24. We provide estimates based on an alternative indicator30 as a robustness check.
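
The field-level averaging that turns pixel-level counts into a continuous indicator can be sketched as follows. The field identifiers and pixel values are made up, and the function name is hypothetical; the sketch only illustrates why a field whose pixels disagree receives a non-integer mowing frequency.

```python
from collections import defaultdict


def field_mowing_frequency(pixels):
    """Average the pixel-level number of mowing events within each field.

    pixels: iterable of (field_id, mowing_events) pairs for one year.
    Returns a dict mapping field_id to the mean number of events.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for field_id, events in pixels:
        sums[field_id] += events
        counts[field_id] += 1
    return {field_id: sums[field_id] / counts[field_id] for field_id in sums}


# Hypothetical pixels: field A is mowed twice in one half and three times
# in the other half; field B is mowed once everywhere.
pixels = [("A", 2), ("A", 2), ("A", 3), ("A", 3), ("B", 1), ("B", 1)]
frequency = field_mowing_frequency(pixels)
print(frequency)
```

Field A receives a frequency of 2.5 and field B a frequency of 1.0, so within-field heterogeneity in the pixel counts is what makes the indicator continuous.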

Contextual variables

We include a wide range of topographic, climatic, meteorological, and pedological properties to account for environmental conditions that may affect both management intensity and species richness17 (see Table S1 for sources). Apart from environmental conditions, regional production characteristics as well as social networks may influence management intensity through market proximity or economies of scale56. Therefore, we include socioeconomic variables related to regional characteristics of the agricultural economy such as average farm sizes, percentage of permanent grassland, percentage of farms with cattle, cattle density per hectare, and share of organic farms at district level (i.e., NUTS, Nomenclature of Territorial Units for Statistics, 3rd level). We used 2016 values to avoid potential feedback loops that would bias our causal estimates. For example, grassland areas or stocking densities could have changed due to expected mowing yields, especially after the drought years of 2017 and 2018. As a proxy for landscape heterogeneity, which has been shown to influence grassland species richness57, we calculate a land cover diversity index within a 1000-meter buffer around each field. We also control for field shape, as distance to field boundaries may affect species composition, spatial clustering, and temporal dispersal. Local and regional differences in regulations are accounted for by including binary indicators for strictly protected area status, Natura 2000 site, and each federal state. We also include latitude and longitude and binary year indicators to control for unobserved factors. The selection of contextual variables follows the basic principles of causal identification to avoid potential endogeneity problems58. That is, we include variables that directly influence mowing regime and plant species richness (e.g., environmental conditions, land use context, and socioeconomic characteristics just before the years covered by our analysis) while excluding those that may be influenced by them (e.g., yields, soil nutrient levels, and socioeconomic characteristics during or after the time of our analysis).
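The land cover diversity index mentioned above could, for instance, be computed as a Shannon index over the land cover classes found in the buffer. The text does not state which index is used, so the Shannon form, the function name `shannon_diversity`, and the class labels below are assumptions made purely for illustration.

```python
import math
from collections import Counter


def shannon_diversity(land_cover_pixels):
    """Shannon index H = -sum(p_k * ln(p_k)) over land cover classes.

    `land_cover_pixels` is a list of class labels, one per pixel in the
    buffer (hypothetical input format, not taken from the paper).
    """
    counts = Counter(land_cover_pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("buffer contains no land cover pixels")
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Hypothetical class labels of the pixels inside one field's buffer.
buffer_pixels = ["grassland"] * 50 + ["cropland"] * 30 + ["forest"] * 20
h = shannon_diversity(buffer_pixels)
```

A buffer with a single land cover class yields an index of zero, and the index grows as classes become more numerous and more evenly distributed.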

Sampling strategy

We conduct a field-level analysis focusing on permanent grassland across Germany. To identify fields with permanent grassland, we use a recently published dataset of field boundaries51 and retain only those boundaries that are labeled as permanent grassland. We extract features from polygon data (such as being in a protected area) at the field centroid. For raster-based input data, including mowing frequency and plant species richness, we calculate the area-weighted average of all pixels intersecting the field. This spatial aggregation results in a continuous mowing frequency variable, which serves as a proxy of grassland management intensity. In a robustness check (Supplementary Note S3 and Fig. S5), we also consider the majority mowing frequency per field as a treatment indicator. For each model run, including robustness checks, we draw a simple random sample of 50,000 observations for reasons of computational efficiency and to reduce spatial autocorrelation. Errors are clustered at field level to account for cases where multiple observations (from different years) were drawn from one field.
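The aggregation and sampling steps can be sketched as follows. This is a minimal illustration that assumes the overlap area between each pixel and the field is already known. The function names, the example values, and the reduced sample size are hypothetical and do not reproduce the actual processing pipeline.

```python
import random
from collections import defaultdict


def area_weighted_mean(pixel_values, overlap_areas):
    """Average of pixel values, weighted by each pixel's overlap with the field."""
    total_area = sum(overlap_areas)
    if total_area <= 0:
        raise ValueError("field does not overlap any pixel")
    weighted_sum = sum(v * a for v, a in zip(pixel_values, overlap_areas))
    return weighted_sum / total_area


def majority_value(pixel_values, overlap_areas):
    """Pixel value that covers the largest share of the field area."""
    area_by_value = defaultdict(float)
    for v, a in zip(pixel_values, overlap_areas):
        area_by_value[v] += a
    return max(area_by_value, key=area_by_value.get)


# Hypothetical field covered by three pixels with 2, 3 and 2 mowing events;
# the overlap areas are expressed as shares of the field area.
values = [2, 3, 2]
areas = [0.5, 0.25, 0.25]
field_mean = area_weighted_mean(values, areas)
field_majority = majority_value(values, areas)

# Simple random sample of field-year observations without replacement
# (500 here instead of 50,000 to keep the illustration small).
rng = random.Random(42)
observations = [(field_id, year)
                for field_id in range(1000)
                for year in (2017, 2018, 2019, 2020)]
sample = rng.sample(observations, k=500)
```

Because the sample is drawn at the level of field-year observations, the same field can appear more than once, which is the case that clustering errors at field level is meant to handle.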

Empirical framework

We are interested in the difference in grassland species richness between the observed mowing frequency and a counterfactual scenario of a different mowing frequency. However, it is impossible to observe the same field in the same year under different mowing regimes. We therefore build on the potential outcomes framework, where treatment effects are estimated by comparing observed outcomes with counterfactual outcomes under an alternative treatment59,60. The potential outcomes framework is related to, but not identical to, dark diversity61, the regional pool of species that could potentially inhabit an area but are not found there. Rather than estimating the size or composition of a potential species pool based on the observed state of the environment and the functional species traits42, our framework statistically models the difference between an observed and an unobserved state of the environment and attributes the difference to an intervention, in our case the change in mowing frequency. In other words, this approach tries to answer the question: what species richness would we observe on each specific field if the field had been mowed differently?
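In potential outcomes notation, the question can be written compactly. The symbols below are introduced here for illustration only and are not taken from the original text: Y_i(w) denotes the species richness that field i would show under mowing frequency w, W_i the observed mowing frequency, and X_i the contextual variables.

```latex
% Observed outcome: the potential outcome at the observed mowing frequency
Y_i = Y_i(W_i)

% Field-level effect of switching from the observed frequency W_i to w'
% (never directly observable, since only Y_i(W_i) is seen)
\Delta_i(w') = Y_i(w') - Y_i(W_i)

% Conditional average effect for fields with contextual variables x
\tau(x) = \mathbb{E}\left[\, Y_i(w') - Y_i(w) \mid X_i = x \,\right]
```

Only the first quantity is observed for any field-year; the other two must be estimated, which is why assumptions such as unconfoundedness are needed.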

To estimate the causal impact of mowing frequency on plant species richness, we use causal forests, a specific case of generalized random forests27,62. Specifically, we use the augmented inverse probability-weighted estimator27. This approach has previously been used to estimate, e.g., the impact of tillage on yields63, the impact of agri-environmental schemes on environmental outcomes64, or the effect of cover crop adoption on maize and soybean yield losses65. A key advantage of generalized random forests is that they allow us to learn about treatment effect heterogeneity. Furthermore, the estimator is doubly robust, i.e., under unconfoundedness it provides consistent treatment effect estimates as long as either the treatment propensity model or the outcome model is correctly specified66. The unconfoundedness assumption posits that there are no unobserved variables that influence both the treatment and the outcome. Another advantage of generalized random forests is their ability to partially capture latent heterogeneity, as long as the latent, i.e., unobserved, variables are (non-)linear representations of the observed covariate space (see Fig. S8).
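For intuition, the textbook form of the augmented inverse probability-weighted (AIPW) score is shown here for a binary treatment W_i in {0, 1}. This is a simplification: mowing frequency in our setting is continuous, so the expression illustrates the doubly robust construction rather than the exact estimator applied. The notation (fitted outcome models \hat{\mu}_0 and \hat{\mu}_1, fitted propensity score \hat{e}) is assumed for this illustration.

```latex
% AIPW score for observation i (binary treatment case)
\hat{\Gamma}_i = \hat{\mu}_1(X_i) - \hat{\mu}_0(X_i)
  + \frac{W_i}{\hat{e}(X_i)} \bigl( Y_i - \hat{\mu}_1(X_i) \bigr)
  - \frac{1 - W_i}{1 - \hat{e}(X_i)} \bigl( Y_i - \hat{\mu}_0(X_i) \bigr)

% Average treatment effect estimate
\hat{\tau} = \frac{1}{n} \sum_{i=1}^{n} \hat{\Gamma}_i
```

The first difference is the outcome-model prediction of the effect, and the two weighted residual terms correct it using the propensity score. If either model is correct, the error of the other is averaged out, which is the source of the double robustness property.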

In general terms, the approach consists of two steps. First, the observed mowing frequency is predicted from the contextual variables to serve as a propensity score. Second, the propensity score is used as an inverse weight to estimate the causal impact of mowing on species richness, while controlling for all other observed contextual variables. We estimate the impact of mowing in year t on species richness in the subsequent year (t + 1), because treatment must logically occur before any effect takes place, and because our species richness estimates relate to May-June, whereas mowing events can also occur later in the year. In this way, we also avoid reverse causality, by assuming that species richness in t + 1 does not affect mowing at time t. Since the treatment (mowing frequency) is not randomly distributed, we control for potential selection bias by including the set of contextual variables described in Table S1. In particular, we calculate a propensity score for the treatment variable to account for selection bias. We assume unconfoundedness, i.e., we expect that our chosen set of contextual variables largely captures self-selection and serves as a proxy for the effects of potential unobserved confounders that are not included in the analysis. We assess the validity of this assumption by comparing the distribution of propensity scores across quintiles of grassland mowing frequency (Fig. S1). We identify and discuss the most important variables for predicting the mowing propensity as well as the overall treatment effect.
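The logic of first modeling the treatment and then estimating its effect can be illustrated with a deliberately simplified stand-in. The sketch below uses simulated data with a single confounder and a linear residual-on-residual (partialling-out) regression. It is neither a causal forest nor an inverse-weighting estimator; all variable names, coefficients, and the simulated effect of -1.5 are assumptions chosen only to show how adjusting for a confounder changes the estimate.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def ols_slope_intercept(x, y):
    """Slope and intercept of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(x, y):
    """Part of y that is not linearly explained by x."""
    slope, intercept = ols_slope_intercept(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


# Simulated data (hypothetical): x is a confounder such as site productivity,
# w the mowing frequency in year t, y the species richness in year t + 1.
rng = random.Random(0)
n = 5000
true_effect = -1.5
x = [rng.gauss(0, 1) for _ in range(n)]
w = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [true_effect * wi + 2.0 * xi + rng.gauss(0, 1) for wi, xi in zip(w, x)]

# Step 1: predict the treatment from the confounder and keep the residual,
# i.e., the variation in mowing not explained by context.
w_res = residuals(x, w)

# Step 2: remove the confounder from the outcome as well, then regress the
# outcome residual on the treatment residual.
y_res = residuals(x, y)
effect_estimate, _ = ols_slope_intercept(w_res, y_res)

# Naive regression of y on w that ignores the confounder, for comparison.
naive_estimate, _ = ols_slope_intercept(w, y)
```

In this simulation the adjusted estimate lies close to the simulated effect, whereas the naive estimate is pulled toward zero because the confounder raises both mowing frequency and species richness.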

Yield data

We model dry matter grass yields using the biophysical growth model LINGRA-N, which was developed to model grass yields across the European Union31. The model takes daily weather data, soil hydraulic properties, nitrogen applications, and mowing dates as inputs. Details of the input data, implementation, and validation can be found in Supplementary Note S2 and Fig. S3.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.