Dynamical downscaling CMIP6 models over New Zealand: added value of climatology and extremes

Gibson, Peter B.; Stuart, Stephen; Sood, Abha; Stone, Dáithí; Rampal, Neelesh; Lewis, Hamish; Broadbent, Ashley; Thatcher, Marcus; Morgenstern, Olaf

doi:10.1007/s00382-024-07337-5

Original Article
Open access
Published: 17 July 2024

Volume 62, pages 8255–8281, (2024)

Climate Dynamics

Peter B. Gibson¹,
Stephen Stuart¹,
Abha Sood¹,
Dáithí Stone¹,
Neelesh Rampal²,
Hamish Lewis²,³,
Ashley Broadbent¹,
Marcus Thatcher⁴ &
Olaf Morgenstern¹

Abstract

Dynamical downscaling provides physics-based high-resolution climate change projections across regional and local scales. This is particularly important for island nations characterized by complex terrain, where the coarse resolution of global climate model (GCM) output often prohibits direct use. One of the main motivations for dynamical downscaling is to reduce biases relative to the host GCM at the local scale, which can be quantified through assessing ‘added value’. However, added value from downscaling is not guaranteed; quantifying this can help users make informed decisions about how best to use available climate projection data. Here we describe the experiment design of the updated national climate projections for New Zealand based on dynamical downscaling. The global non-hydrostatic Conformal Cubic Atmospheric Model (CCAM) is primarily used for downscaling, with a global stretched grid targeting high resolution over New Zealand (12 km) and the wider South Pacific region (12–35 km). Focusing on the historical simulations, we assess added value for a range of metrics, climatological fields, extreme indices, and tropical cyclones. The main strengths of the downscaling include generally large improvements relative to the host GCM for temperature and orographic precipitation. Inter-annual variability in temperature is well captured across New Zealand, and several temperature and precipitation-based extreme indices show large improvements. The representation of tropical cyclones reaching at least category 2 intensity is generally improved relative to the large, consistent under-representation in the host GCMs. The remaining biases are explored and discussed, forming the basis for ongoing bias-correction work.

1 Introduction

Reliable high-resolution climate change projections are highly sought after by the climate impacts community (Giorgi et al. 2009). Despite ongoing investment in the latest generation of global climate models (GCMs) within the Coupled Model Intercomparison Project Phase 6 (CMIP6), the typical resolution of these models (~ 100-km) remains prohibitive for direct use in many climate impacts applications and contexts. At this resolution, GCMs often struggle to adequately represent various features of atmospheric circulation that drive mid-latitude temperature and precipitation extremes, including atmospheric blocking (Davini et al. 2017; Quinting and Vitart 2019; Davini et al., 2020), tropical cyclones (Roberts et al. 2020; Gibson et al. 2023) and short-duration convective storms (Chan et al. 2014; Thomassen et al. 2021). At finer local and regional scales, the biases from coarse resolution GCMs can be larger and more detrimental. This is especially the case in regions of complex terrain, where the interaction between mesoscale circulation features and orography creates fine scale spatial variability poorly resolved by GCMs (Giorgi and Gutowski 2015; Gibson et al. 2023). These features are highly important for defining the regional climate and include orographic precipitation, mountain and valley breezes, lapse rates in the boundary layer, as well as sea and land breezes.

Given the current limitations in the resolution of GCMs, regional climate models (RCMs), which are generally run at much finer resolution (e.g. 12–25 km), are often relied upon to ameliorate these issues. The Coordinated Regional Climate Downscaling Experiment (CORDEX) is a prominent example, designed to coordinate international regional downscaling efforts for climate change impact and adaptation studies (Giorgi and Gutowski 2015). Through CORDEX, several participating groups implement RCMs to downscale selected GCMs for a particular region. The climate projection stream first involves selecting a small number of best-performing GCMs to downscale with a small number of RCMs, where performance is evaluated on a regional basis. Other factors in the GCM selection process include model independence and spread in future warming rates. A balanced selection of models across different institutional ‘families’ and spanning a reasonable range of future warming rates is desirable (Brunner et al. 2020; Hausfather et al. 2022; Grose et al. 2023). The RCM performance and chosen configuration are typically based on evaluating biases across reanalysis-driven RCMs, or different configurations of a single RCM, under “perfect” boundary conditions. Reanalysis-driven simulations are particularly useful for isolating biases that come directly from the RCM, as well as providing a best-case expectation of performance when the RCM is used to downscale GCMs (Di Virgilio et al. 2019).

Building upon CMIP5, several studies have evaluated the added value of RCM downscaling in various regions. In this context, added value refers to the reduction in biases in the RCM relative to the host GCM at the regional scale (Giorgi and Gutowski 2015). The concept was extended by Di Virgilio et al. (2020) to include ‘realised added value’ where added value from the RCM refers to both a reduction in biases and a modification of spatial details in the climate change signal. Several studies have reported added value from RCMs, especially for representing precipitation variability and extremes in various regions (Lloyd et al. 2021; and references therein). Across Australia, Di Luca et al. (2016) and subsequently Di Virgilio et al. (2020) found evidence for added value in both temperature and precipitation mean and extreme fields, though this varied across regions and seasons. Even in cases of added value, important systematic biases in RCM output can remain which require careful evaluation. Over parts of Europe, Knist et al. (2017) found a tendency for RCMs to overestimate land-atmosphere coupling strength due to issues in the land-surface models, with implications for the representation of heatwaves. Under ‘perfect’ boundary conditions (i.e. reanalysis-driven), Di Virgilio et al. (2019) noted a tendency in some RCMs to underestimate maximum temperature over large parts of Australia by more than 5 °C, while Hirsch et al. (2019) noted a tendency to underestimate heatwave frequency and intensity in the same RCM ensemble. For precipitation extremes across the United States, Gibson et al. (2019) noted cases of RCM added value through increasing precipitation rates that were otherwise too small in most GCMs, though the highest resolution RCMs evaluated (12 km) began to overestimate extreme events. Other studies have emphasized ongoing systematic biases with the large-scale atmospheric circulation features in GCMs which are inherited by RCMs and degrade the downscaled output (Lloyd et al. 2021), though this issue can be at least partly addressed through the GCM selection process. Due to these regional and seasonal variations in RCM performance, a comprehensive evaluation of added value is important for setting expectations regarding the use of the output from RCMs in climate impact and adaptation applications.

New Zealand provides an ideal case study for evaluating added value from RCMs. Situated within the prevailing Southern Hemisphere storm track, New Zealand is exposed to a diverse range of synoptic conditions that drive extreme events. Especially notable in summer, ex-tropical cyclones (Lorrey et al. 2014) and atmospheric rivers (Prince et al. 2021) tap into sub-tropical sources of moisture to drive extreme rainfall. In winter, a diverse range of fronts, cyclones, and marine polar air masses reach New Zealand relatively unimpeded (Sturman and Tapper 1996). At smaller scales, the highly complex mountainous and coastal terrain gives rise to important local and mesoscale circulation features (Sturman et al. 1999). Due to the uplift from the Southern Alps, orographic precipitation in western regions results in some of the largest annual rainfall totals on Earth (Ibbitt et al. 2000). To be fully successful, regional downscaling must capture the full range of these processes and their relevant interactions with complex terrain at high spatial resolution (Drost et al. 2007; Rampal et al. 2022a).

Compared to other regions, the literature evaluating dynamical RCMs and added value for New Zealand is relatively sparse. Early work, prior to the availability of CMIP3 output, focused on applying a relatively coarse RCM (50-km) developed by CSIRO to output from a single GCM (Renwick et al. 1998, 1999). This included a simple climatological evaluation as well as describing the regional transient climate response to doubling of CO₂. Also prior to CMIP3, Drost et al. (2007) evaluated the HadRM3H regional model (40-km) embedded within the global model HadAM3H, which is derived from the atmosphere component of the coupled model HadCM3. The authors show several improvements at high resolution, despite a tendency for simulated temperatures to be too low and the precipitation totals to be too high in high elevation regions. Around the time of CMIP3, Ackerley et al. (2012) evaluated an updated version of the same regional model at higher resolution, HadRM3P (~ 30 km) driven by both reanalysis and GCM data. The authors describe general improvements at this higher resolution, including for the west-east gradients of precipitation around the Southern Alps but with an overall negative bias in climatological precipitation totals. There was also a general tendency for a negative bias in maximum air temperatures and a positive bias in minimum air temperatures (i.e. overall low bias in daily temperature range).

Based on the results from Ackerley et al. (2012), the CMIP5 downscaling for New Zealand was carried out using bias-corrected SST and sea-ice concentration (SIC) fields from 6 selected CMIP5 GCMs to drive the global HadAM3P model (~ 150 km), which was then downscaled over the New Zealand domain in a second step with HadRM3P (~ 30 km) (Ministry for the Environment 2018). These CMIP5 downscaled simulations have been used extensively in various applications in New Zealand, including for catchment-scale (Jobst et al. 2018; Akhter et al. 2019) and national-scale hydrology (Collins 2020), fire weather (Melia et al. 2022), and damages from extreme events (Pastor-Paz et al. 2020), among others. While general climatological biases have been evaluated, no comprehensive assessment of added value has been carried out for RCMs in the region, including for extreme events. More recently, Gibson et al. (2023) applied and evaluated the global non-hydrostatic Conformal Cubic Atmospheric Model (CCAM) with a stretched grid configuration producing high-resolution (~ 12 km) simulations over New Zealand. The experiment design there focused on internal variability over the historical period, generating a 10-member initial-condition ensemble from observed SST/SIC alone. Encouraging results were obtained over the New Zealand region in terms of precipitation and temperature-based extreme indices, which motivated the wider application and evaluation of CCAM for CMIP6 dynamical downscaling in the present study.

The present study introduces the experiment design for the updated CMIP6 dynamical downscaling; this project will provide updated and improved projections of future climate over New Zealand (Sect. 2, Methods). The primary focus of this study is then to comprehensively evaluate and quantify the historical biases and added value from the newly produced downscaled GCM/RCMs (Sect. 3, Results and Discussion). By comparing reanalysis-driven RCM biases with GCM-driven RCM biases, we gain insight into the origins of various biases. A comprehensive evaluation of extreme events is presented, including various temperature and precipitation-based indices and tropical cyclone frequency and intensity.

2 Methods

2.1 Overview of experiment design

The primary model used here for climate downscaling is the Conformal Cubic Atmospheric Model (CCAM) (version CCAM-2206) developed by the Commonwealth Scientific and Industrial Research Organisation (CSIRO) (Table 1). Further details about the specific CCAM configuration used are given in Sect. 2.2. Two other RCMs (WRF and UM) are also included in Table 1 as part of the broader downscaling project for New Zealand, namely, to compare biases across RCMs when driven by reanalyses and to compare the climate change signal between RCMs for select GCMs. Here we focus primarily on the historical downscaling and added value from CCAM, while this other reduced set of inter-RCM comparisons will be presented in a separate paper.

Table 1 Details for the downscaling experimental design, including the host models (GCM/reanalysis) and RCMs. Each RCM has an atmospheric model grid of approximately 12 km resolution over the New Zealand region. Models with asterisks (i.e. WRF and UM) are not analysed in this study but are included in the table to show data availability for the wider national climate downscaling effort. Further details of the model setup are given in Sect. 2. For ACCESS-CM2, r4i1p1f1 (instead of r1i1p1f1) was used due to data availability

Traditionally, most RCMs receive lateral boundary conditions from the host model (GCM or reanalysis) for a limited area domain, with the RCM evolving freely across the inner domain. Here, CCAM is instead run as a global atmospheric model with global spectral nudging across a stretched grid configuration. Spectral nudging (sometimes referred to as scale-selective downscaling) removes the need for pre-defined lateral boundaries and instead uses a cutoff in the spectral domain at a particular length scale (Thatcher and McGregor 2009). From this, the state of the regional atmosphere at large length scales is determined by the host model, while smaller scales (e.g. mesoscale circulation features) are allowed to evolve freely. The grid configuration in CCAM allows the placement of a high-resolution face centred over the domain of interest with gradually reduced resolution away from this region. Compared to limited area RCMs that are exposed to a step change in resolution at the domain boundary, this more seamless grid configuration allows coupling between the global and regional spatial scales on the same grid and may provide benefits for the representation of storms as they approach the domain of interest (Gibson et al. 2023). This approach can also alleviate important long-standing issues concerning the size and placement of the inner domain in limited area RCMs (Davies 1976; Jones et al. 1995).

Since CCAM is a global model, climate downscaling can either be performed through direct spectral nudging (Thatcher and McGregor 2009) to atmospheric fields from the host model (e.g. Grose et al. 2023) or through an AMIP-style simulation driven only by SST/SIC at the lower boundary (e.g. Hoffmann et al. 2016; Di Virgilio et al. 2020). Each approach (‘spectral nudging’ and ‘SST/SIC driven’) is known to have strengths and weaknesses. The spectral nudging approach is more consistent with standard climate downscaling performed through CORDEX and ensures relatively close consistency with the direct atmospheric fields from the GCM. The degree of consistency with atmospheric GCM fields can be further enhanced or relaxed through nudging parameters allowing smaller scale features to evolve at high resolution more freely. Alternatively, in the SST/SIC-driven approach, atmospheric conditions are simulated by CCAM itself as a global atmospheric model, while the rate of warming is largely determined by the SSTs from the host GCM. This approach offers an opportunity to reduce biases in the SST fields prior to the downscaling. For example, Hoffmann et al. (2016) found that bias correction of input SST fields prior to downscaling with CCAM improved the downscaled tropical precipitation climatology and the response to ENSO. Similarly, Di Virgilio et al. (2020) found that bias corrected SSTs in the output of the host GCM helped reduce CCAM biases in downscaled precipitation over Australia. Since New Zealand is an island nation where the surface air temperature over land is heavily influenced by regional SSTs (Gibson et al. 2023), bias correction of this nature can be particularly beneficial.

The output from each approach can be combined and considered part of an ensemble of regional climate projections (Grose et al. 2023). In this study, using CCAM, three GCMs are downscaled through direct spectral nudging to atmospheric conditions and another three GCMs are downscaled through bias-corrected SST/SIC-driven simulations (see Table 1 for details). The use of both approaches has also helped address important data availability issues from the CMIP6 output described in Sect. 2.3.

2.2 CCAM configuration details

CCAM-2206 was used in all simulations (Table 1) as a global atmospheric model with a stretched grid configuration (C288) and a Schmidt stretching factor of ~ 0.343. The Schmidt transformation determines the degree of stretching away from the region of highest resolution: this provided high resolution over the target NZ domain (12 km) and relatively high resolution over the wider South Pacific region (~ 12–35 km) (see Fig. 1). This grid configuration was shown to have several advantages for the New Zealand region (Gibson et al. 2023). CCAM adopts a non-hydrostatic, semi-Lagrangian dynamical core with a range of physical parameterizations. The CCAM radiation parameterizations (Freidenreich and Ramaswamy 1999; Schwarzkopf and Ramaswamy 1999) are based on the GFDL-CM3 radiation code with recent updates for CMIP6 radiative forcings. The cloud microphysics are based on a single-moment prognostic cloud condensate scheme from the CSIRO-Mk3.6 model (Rotstayn 1997). Prognostic (single-moment) aerosols were switched on and based on the CSIRO-Mk3.6 aerosol scheme with modifications for coupling to CCAM physical parameterizations. Turbulent mixing in the atmosphere is based on the k-ε turbulence closure scheme following Hurley (2007). The convection scheme in CCAM implements a mass-flux closure scheme and includes downdrafts, entrainment, and detrainment (McGregor 2003). The land surface in CCAM adopts the CABLE land surface model (Kowalczyk et al. 2006).
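As a rough consistency check on the grid description above, the nominal spacing can be estimated with a back-of-envelope calculation. The sketch below assumes a common rule of thumb that is not stated in the text: an unstretched CN conformal-cubic grid has about 4 × N cells around the equator, and the spacing over the high-resolution face scales with the Schmidt factor. The function names and the Earth circumference constant are illustrative choices, not part of CCAM.

```python
# Back-of-envelope grid spacing for a stretched conformal-cubic grid.
# ASSUMPTION: spacing ~ (Earth circumference / (4 * N)) * Schmidt factor
# over the high-resolution face. This is a rule of thumb, not CCAM code.

EARTH_CIRCUMFERENCE_KM = 40075.0  # approximate equatorial circumference


def unstretched_spacing_km(n_cells_per_face_edge):
    """Approximate spacing of an unstretched CN grid (4 * N cells around the equator)."""
    return EARTH_CIRCUMFERENCE_KM / (4 * n_cells_per_face_edge)


def stretched_spacing_km(n_cells_per_face_edge, schmidt_factor):
    """Approximate spacing over the high-resolution face after Schmidt stretching."""
    return unstretched_spacing_km(n_cells_per_face_edge) * schmidt_factor


# Values quoted in the text: C288 grid, Schmidt factor of about 0.343.
c288_unstretched = unstretched_spacing_km(288)
c288_stretched = stretched_spacing_km(288, 0.343)
```

Under these assumptions the stretched spacing comes out close to the 12 km quoted for the New Zealand domain, and the unstretched spacing is near the upper end of the ~ 12–35 km range quoted for the wider South Pacific.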

CCAM was run with 35 vertical levels in the atmosphere and a 4-minute dynamical timestep. For both the reanalysis and GCM fields, atmospheric spectral nudging was applied to 6-hourly surface pressure, winds and air temperature for levels between 850 hPa and 10 hPa. For downscaling both reanalysis and GCMs, no nudging to water vapour was performed. The spectral filter nudging length (Thatcher and McGregor 2009) was set to a length scale of 3000 km. The reanalysis-driven run used atmospheric fields from ERA5 reanalysis (Hersbach et al. 2020) with daily SST/SIC fields from the Operational Sea Surface Temperature and Sea Ice Analysis (OSTIA, Donlon et al. 2012). Land use/land cover change was switched off.

The historical simulation period for downscaling each GCM spanned years 1960–2014, with an additional 1-year spin-up (year 1959). Historical greenhouse gas, aerosol, ozone, and solar forcings from the CMIP6 historical experiment were used. The reanalysis-driven simulation period spanned years 1982–2020 with a 1-year spin-up (year 1981). The future simulation period for each GCM spanned years 2015–2099 across a range of future scenarios. Since the focus of this study is on historical evaluation, further details of these future projections will be provided in a separate paper. Each CCAM simulation was run independently on the Cray Kupe supercomputer at NIWA, using 22 XC50 compute nodes (880 physical CPU cores) and generating approximately 3.5 simulation years per wall-clock day.

2.3 Selection of CMIP6 GCMs for downscaling

The computational cost of running RCMs typically dictates that a relatively small number of GCM/RCM pairs are used in downscaling. As is common in other downscaling projects (e.g. Evans et al. 2014; Jacob et al. 2020; Di Virgilio et al. 2022; Grose et al. 2023), the choice of which GCMs to downscale was made by balancing: (1) the regional performance of the GCM over the historical period; (2) model independence; (3) the rate of future warming. As mentioned above, a further practical consideration is data availability required to run the RCM, since several CMIP6 modelling groups did not provide the necessary sub-daily output.

The initial performance evaluation of CMIP6 GCMs was carried out following the approach for selecting CMIP5 GCMs in the previous downscaling exercise for New Zealand (Mullan and Dean 2009; Ministry for the Environment 2018) with updates for CMIP6. Model rankings were based on a comprehensive 62-element evaluation of the historical climate over the New Zealand region (120–210°E, 20–60°S) and wider South-West Pacific (90–240°E, 10°N–60°S), relative to ERA-Interim reanalysis. The four categories of indicators of climate and circulation features are:

1) The annual mean, seasonal cycle and the interannual standard deviation for mean sea level pressure, surface air temperature and precipitation. The Pearson pattern correlation and RMSE metrics were computed for the two domains. Combined this gives 36 elements.
2) The correlation between the Southern Oscillation Index (SOI) and mean sea level pressure, surface air temperature and precipitation. This is computed across two domains giving 6 elements.
3) Annual cycle in climatological mean sea level pressure differences used to diagnose regional circulation indices Z1, Z2, M1 (Trenberth 1976), and the SOI, giving 4 elements.
4) The position and the intensity of the winter and summer southern hemisphere zonal wind maxima and high-pressure belt maxima, computed across the two domains. Combined this gives 16 elements.

The ranking algorithm weights each of the above four categories to ensure relatively consistent performance across all aspects in the model selection. The ranking was applied to the historical period (years 1979–2014) for all available CMIP6 models and ensemble members (over 60 GCMs and 477 members in total). The intention of using multiple ensemble members was not to select the best performing member (for a given model) but rather to examine the sensitivity of the model selection to internal variability.
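To make the bookkeeping concrete, the sketch below tallies the element counts of the four categories and shows one possible way of weighting categories rather than individual elements. The actual weights used in the ranking are not given here, so the equal-weight-per-category scheme, the category labels, and the function names are assumptions made purely for illustration.

```python
# Element counts per category, as listed in the text (36 + 6 + 4 + 16 = 62).
CATEGORY_SIZES = {
    "mean_state_and_variability": 36,
    "soi_correlations": 6,
    "circulation_indices": 4,
    "wind_and_pressure_maxima": 16,
}


def element_mean_score(element_scores):
    """Plain average over all elements; large categories dominate."""
    all_scores = [s for scores in element_scores.values() for s in scores]
    return sum(all_scores) / len(all_scores)


def category_balanced_score(element_scores):
    """ASSUMED scheme: average within each category, then across categories,
    so each category contributes equally regardless of its size."""
    category_means = [sum(s) / len(s) for s in element_scores.values()]
    return sum(category_means) / len(category_means)


# Hypothetical model that scores 1.0 on every category-1 element and 0.0 elsewhere.
example_scores = {
    name: [1.0 if name == "mean_state_and_variability" else 0.0] * size
    for name, size in CATEGORY_SIZES.items()
}
plain = element_mean_score(example_scores)          # driven by the 36-element category
balanced = category_balanced_score(example_scores)  # each category counts once
```

In this hypothetical case the category-balanced score is much lower than the plain element average, which illustrates why weighting by category discourages selecting a model that performs well in only one aspect.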

For the CCAM simulations with atmospheric nudging (detailed in Table 1), this complete set of indicators, across all four categories, was used in the ranking and GCM selection. The final ranking for individual models is presented in Supplementary Material Figure S1, with the models selected for downscaling highlighted (EC-Earth3, ACCESS-CM2, NorESM2-MM). As shown, the selected models are among the top performing CMIP6 models, and differences in the overall score between these selected models are relatively small after accounting for spread across ensemble members. Note that we did not simply select the three single highest scoring CMIP6 models due to important issues around data availability and model independence. In terms of availability, despite ranking well, TaiESM1 and GFDL-ESM4 did not provide all required 6-hourly fields for downscaling so could not be used. In terms of model independence, different models from the same institution often performed similarly well in the rankings so only the top performing model was selected from these (e.g. the different configurations of the EC-Earth3 model).

For the CCAM SST/SIC-driven downscaling runs, since these are not nudged to the host model atmospheric conditions, the atmospheric fields from the host model are less relevant for evaluation. Instead, the primary consideration for model selection here was the surface air temperature features (i.e. annual mean, seasonal cycle, and interannual standard deviation). The ranking and selection of models based on this consideration is shown in Supplementary Material Figure S2. For this selection, data availability is not a limiting factor since all models have the required fields to drive CCAM. This enabled a more balanced consideration of diversity in the equilibrium climate sensitivity (ECS) and model independence alongside model performance. The two highest performing models were first selected (AWI-CM-1-1-MR and CNRM-CM6-1) followed by GFDL-ESM4. The selection of GFDL-ESM4 was made from the consideration of including an additional relatively low ECS model in the downscaled ensemble, while being relatively independent from the other top performing models (described below).

The ECS of the 6 models selected for downscaling are shown alongside those of the CMIP6 ensemble in Fig. 2. The 6 selected models span the overall CMIP6 model ECS range well, while additionally being constrained to the IPCC ‘very likely range’ for ECS (between 2 and 5 °C). Our selection has ensured that both the lower and higher end of the CMIP6 ECS ‘very likely’ range are included. This resulted in the exclusion of certain ‘hot models’ that fell outside this range (Hausfather et al. 2022), which otherwise performed reasonably well in the historical evaluation (e.g. UKESM1-0-LL). The mean ECS from the six downscaled models is similar to, albeit slightly lower than, the CMIP6 mean ECS (3.69 °C versus 3.77 °C, respectively). While ECS describes stabilized warming at equilibrium in a global mean sense, these 6 selected models also sample well the range of CMIP6 warming rates when assessed in terms of transient warming over the New Zealand region (Supplementary Material Figure S3).

Adopting the concept of ‘families’ of models within CMIP (Knutti et al. 2013; Abramowitz et al. 2019; Brunner et al. 2020), model independence was assessed qualitatively based on a priori knowledge of each model. This includes obvious institutional dependencies (e.g. the different variants of the EC-Earth3 model) as well as known sharing of major model components across models from different institutions (e.g. ACCESS-CM2 and UKESM1-0-LL models share the same underlying atmospheric model). As described in the model independence framework of Brunner et al. (2020), each of the 6 selected GCMs can be considered as belonging to a separate ‘family’. While all 6 GCMs have notably different atmospheric models, in certain cases their ocean models have some dependencies. Namely, ACCESS-CM2 and GFDL-ESM4 implement different versions of the Modular Ocean Model (MOM) while CNRM-CM6-1 and EC-Earth3 both implement version 3.6 of the Nucleus for European Modelling of the Ocean (NEMO). In the context of how these GCMs are downscaled here, the importance of this ocean model dependency is somewhat reduced since bias correction of SST fields is first performed on GFDL-ESM4 and CNRM-CM6-1 models within the SST/SIC-driven approach to downscaling (see Table 1). Furthermore, although not explicitly considered as part of the selection process, the six models chosen here also span a wide range of the larger CMIP6 ensemble in terms of temperature and circulation climatological fields between CMIP6 models as quantified in Brunner et al. (2020).

2.4 Bias adjustment of SST/SIC-driven runs

As detailed above (Table 1), when CCAM is driven only from the SST/SIC fields of the GCM (i.e. for downscaling AWI-CM-1-1-MR, CNRM-CM6-1, and GFDL-ESM4) bias correction of the input SST and SIC fields was first performed.

2.4.1 SST adjustment

The reduction of systematic biases in prescribed SST and sea-air fluxes of heat and moisture can improve the representation of large-scale circulation and precipitation in atmospheric models (Hoffmann et al. 2016; Nguyen et al. 2012; Ashfaq et al. 2011; Chapman et al. 2023); this can also reduce the inter-model spread of future climate projections (He and Soden 2016). We reduced biases in the climatological mean and interannual variance of monthly SST from CMIP6 models, prescribed to CCAM, using a method similar to Hoffmann et al. (2016), hereafter H2016. The adjustments for each CMIP6 model were derived with reference to SST observed from 1982 to 2014 using OSTIA data. OSTIA compares well to other global observational SST data sets (Yang et al. 2021) and has been used as surface boundary forcing in recent atmospheric reanalyses over Australasia (Su et al. 2021) and a free-running CCAM historical ensemble targeting the southwest Pacific region (Gibson et al. 2023). While bias-correcting the CMIP6 SST, several steps were taken to avoid over-adjusting the variance and to preserve the long-term trends from the models, similar to H2016. In particular, linear trends and non-linear multidecadal variations were temporarily removed, the SST variance was then adjusted by a correction factor, and the linear trends and multidecadal variations were subsequently restored. This variance correction factor was derived from the ratio of the standard deviations of observed and modelled SST, and was limited to the range 0.1–3.0 to avoid making very large adjustments to the model SST variability. In certain cases, the correction of SST variability at high latitudes may be impaired by mismatches between modelled and observed sea ice cover. Therefore, and similar to H2016, the variance correction was linearly relaxed with latitude, from full adjustment at the equator to zero adjustment poleward of 50 °N and 60 °S. Thus the variance correction was extended slightly further poleward in the Southern Hemisphere, where sea ice is climatologically concentrated slightly further poleward, and this allowed additional correction of SST variability across NZ latitudes (~ 34–47 °S).
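The variance adjustment described above can be illustrated for a single grid cell with a minimal sketch. It is a simplification under stated assumptions rather than the method as implemented: only a linear trend is removed (the multidecadal component is ignored), the mean bias correction is omitted, and the latitude relaxation is assumed to blend the correction factor linearly towards 1. All function names are hypothetical.

```python
import math


def detrend(series):
    """Split a series into its least-squares linear trend and the residual."""
    n = len(series)
    x_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(series))
    slope = sxy / sxx
    trend = [y_mean + slope * (i - x_mean) for i in range(n)]
    residual = [v - t for v, t in zip(series, trend)]
    return trend, residual


def std(values):
    """Population standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def latitude_weight(lat):
    """1 at the equator, decreasing linearly to 0 at 50N and at 60S."""
    limit = 50.0 if lat >= 0 else 60.0
    return max(0.0, 1.0 - abs(lat) / limit)


def variance_factor(model_resid, obs_resid, lat, lower=0.1, upper=3.0):
    """Ratio of observed to modelled standard deviation, limited to
    [lower, upper], then relaxed towards 1 with latitude (ASSUMED blending)."""
    raw = std(obs_resid) / std(model_resid)
    limited = min(upper, max(lower, raw))
    return 1.0 + latitude_weight(lat) * (limited - 1.0)


def adjust_variance(model_sst, obs_sst, lat):
    """Rescale detrended model variability, then restore the model trend."""
    trend, resid = detrend(model_sst)
    _, obs_resid = detrend(obs_sst)
    factor = variance_factor(resid, obs_resid, lat)
    return [t + factor * r for t, r in zip(trend, resid)]
```

Because the residual is rescaled about the fitted trend, the model's own linear trend is left unchanged, which is the property the procedure is designed to preserve. Under the assumed blending, a cell at 40 °S would receive one third of the full correction.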

2.4.2 SIC adjustment

In addition to SST, CCAM also requires global prescribed sea ice concentrations (SIC). We cannot take the SIC directly from the CMIP6 models because it would be physically inconsistent with the bias-adjusted SSTs (described above) and because there can be large biases in SIC in the CMIP6 models. Instead, we use the statistical algorithm based on Stone and Pall (2021), hereafter SP2021, to estimate the SIC field directly from the bias-adjusted SST field, described above. Whereas SP2021 used the algorithm to estimate perturbations in an SIC field given perturbations in an SST field, here we use it as a direct predictor of SIC. The first step involves calculating a relationship between SST and SIC based on observed values in OSTIA during the 2001–2010 period, separately for each hemisphere. We take monthly mean values at all grid cells within the hemisphere, group the cells into 100 evenly spaced SIC bins (from no-ice to full-ice), and take the median SST value of the cells in each bin. In contrast to SP2021, here we perform the calculation separately for each calendar month, and instead of fitting a linear function to the median SSTs from each of 100 SIC bins we smooth the relationship in the 100 SIC bins into a monotonic function of SST. This function can then take any SST value from any grid cell from a GCM and translate it to a unique SIC value. The resultant Southern Hemisphere SIC tends to have the same mean over the 1982–2014 period but with more interannual variability, while future trends in SIC are preserved.
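The binning and lookup steps can be sketched as follows for a single hemisphere and calendar month. The smoothing method is not specified in the text, so this example assumes a running minimum to enforce monotonicity and piecewise-linear interpolation between bin centres; returning 0 or 1 outside the sampled SST range is likewise an assumption, and all function names are hypothetical.

```python
from statistics import median


def median_sst_by_sic_bin(sst, sic, n_bins=100):
    """Median SST of the cells in each evenly spaced SIC bin (SIC as a 0..1 fraction).
    Empty bins are skipped."""
    bins = [[] for _ in range(n_bins)]
    for t, c in zip(sst, sic):
        idx = min(int(c * n_bins), n_bins - 1)  # SIC of exactly 1 goes in the last bin
        bins[idx].append(t)
    centres, medians = [], []
    for i, members in enumerate(bins):
        if members:
            centres.append((i + 0.5) / n_bins)
            medians.append(median(members))
    return centres, medians


def make_monotonic(medians):
    """ASSUMED smoothing: a running minimum, so SST never increases with SIC."""
    out, current = [], float("inf")
    for m in medians:
        current = min(current, m)
        out.append(current)
    return out


def sic_from_sst(t, centres, medians):
    """Map an SST value to SIC by linear interpolation along the monotonic curve.
    SST warmer than the warmest bin gives no ice; colder than the coldest gives full ice."""
    if t >= medians[0]:
        return 0.0
    if t <= medians[-1]:
        return 1.0
    for i in range(len(medians) - 1):
        warm, cold = medians[i], medians[i + 1]
        if cold <= t <= warm:
            if warm == cold:
                return centres[i]
            frac = (warm - t) / (warm - cold)
            return centres[i] + frac * (centres[i + 1] - centres[i])
    return 1.0
```

In use, the curve would be built once per hemisphere and calendar month from observed SST and SIC pairs, then applied to each bias-adjusted GCM SST value in turn.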

2.5 Added value metrics

In the context of RCM evaluation, added value refers to the reduction in biases in RCM output relative to the host GCM output at the regional scale. This is highly relevant since one of the main motivations for performing downscaling is to reduce important GCM biases at these finer scales. Here added value is quantified through various statistics, each applied to climatological (i.e. annually or seasonally time-averaged) fields over New Zealand land-based grid cells. These are: root-mean-square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE) and pattern correlation. For each of these metrics, the percentage difference between the RCM and GCM error was computed, where positive values indicate a reduction in the error from the RCM (i.e. added value).

Analysing a range of added value metrics is beneficial since they account for different aspects of model performance. RMSE and MAE both penalize the absolute magnitude of the errors, while MAPE penalizes the magnitude of percentage errors, which is useful for characterizing errors across regions of relatively low climatological precipitation. In contrast, pattern correlation penalizes based on differences in the spatial distribution of the climatology. Following Di Virgilio et al. (2020), a final added value metric was included to account for the overall spatial coverage of added value, which we refer to as Land%. This is defined as the percentage of all grid cells where the RCM shows added value defined by MAE, scaled to range between −50 and +50. As such, a Land% value > 0 indicates that the RCM added value covers more than half of the total land area, while a value of 50 indicates that the RCM added value covers all land area. Climatological fields of precipitation, daily maximum air temperature (tasmax), and daily minimum air temperature (tasmin) were assessed for added value for each GCM/RCM, along with a seasonal breakdown (annual, summer, winter). Added value was then extended to extreme events in each GCM/RCM by evaluating select temperature and precipitation-based indices (Table 2) from the Expert Team on Climate Change Detection and Indices (ETCCDI, Zhang et al. 2011). This enabled different parts of the distribution to be evaluated as well as duration-based considerations (e.g. the length of wet and dry spells).
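
As an illustration of how these metrics relate to one another, the following sketch computes them for climatological fields flattened to lists of land grid cells. It assumes unweighted grid cells and that the percentage difference is expressed relative to the GCM error; neither choice is confirmed by the text above, and the function names are illustrative.

```python
import math

def rmse(model, ref):
    return math.sqrt(sum((m - r) ** 2 for m, r in zip(model, ref)) / len(ref))

def mae(model, ref):
    return sum(abs(m - r) for m, r in zip(model, ref)) / len(ref)

def mape(model, ref):
    # Undefined where the reference is zero; assumes ref values are non-zero.
    return 100.0 * sum(abs(m - r) / abs(r) for m, r in zip(model, ref)) / len(ref)

def pattern_correlation(model, ref):
    """Pearson correlation across grid cells (no area weighting)."""
    n = len(ref)
    mm, rm = sum(model) / n, sum(ref) / n
    cov = sum((m - mm) * (r - rm) for m, r in zip(model, ref))
    var_m = sum((m - mm) ** 2 for m in model)
    var_r = sum((r - rm) ** 2 for r in ref)
    return cov / math.sqrt(var_m * var_r)

def percent_error_reduction(gcm_error, rcm_error):
    """Positive values mean the RCM error is smaller (added value)."""
    return 100.0 * (gcm_error - rcm_error) / gcm_error

def land_percent(gcm, rcm, ref):
    """Percentage of cells where the RCM absolute error is smaller, minus 50."""
    improved = sum(1 for g, c, r in zip(gcm, rcm, ref) if abs(c - r) < abs(g - r))
    return 100.0 * improved / len(ref) - 50.0
```

With this definition, the Land% value of 10.22 reported later for ACCESS-CM2 precipitation corresponds to 10.22 + 50 = 60.22% of land grid cells showing a smaller absolute error after downscaling.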

Table 2 Extreme event ETCCDI indices evaluated across models in this study

The reference dataset for assessing added value was from the daily Virtual Climate Station Network (VCSN) (~5-km grid, Tait et al. 2006; Tait et al. 2012; Tait and Macara 2014). The gridded daily VCSN product is constructed based on spatial interpolation of an extensive network of station data across New Zealand. The interpolation uses a second-order trivariate thin-plate smoothing spline. Additional information on location and climatological fields is used in the interpolation. The updated ‘Norton adjusted’ version of VCSN was used for tasmax and tasmin which improves temperature estimates in high elevation regions (Tait and Macara 2014). The updated ‘augmented’ version of VCSN was used for precipitation which includes a much larger number of stations in the final interpolated product (approximately 1200), as discussed in Tait et al. (2012). For assessing added value across the common overlapping years of 1982–2014, all products (i.e. VCSN, reanalysis and models) were regridded to the 12-km CCAM grid over New Zealand using conservative regridding. Regridding was performed after the computation of extreme indices (e.g. Gibson et al. 2019).

2.6 Circulation regimes

Circulation regimes were analysed to evaluate relationships between a range of synoptic circulation conditions and precipitation variability in the RCM output. This adds additional understanding of model biases on top of the climatological added value evaluation. Circulation regimes were defined following the approach of Rampal et al. (2022b) applied to ERA5 daily 1000-hPa geopotential heights (z1000), where the seasonal cycle and trend were first removed. In this approach affinity propagation is first applied to determine an ‘optimal’ number of clusters (n = 9), from which K-means clustering is applied to the first three empirical orthogonal functions (EOFs) computed over the New Zealand domain.

In this approach, the selection of 9 clusters was recommended by Rampal et al. (2022b), which highlighted that fewer than the 12 clusters originally identified by Kidson (2000) were needed to capture the main features of New Zealand’s circulation variability. It is important to note that many of the clusters obtained in Rampal et al. (2022a) closely resemble those of Kidson (2000). The first three EOFs were chosen because they accounted for 85% of the total variance in z1000. Sensitivity testing to a larger number of EOFs produced very similar clusters and the evaluation of CCAM precipitation composites remained very similar (not shown), highlighting that the results are largely insensitive to these methodological choices. For further discussion on the clustering methodology and comparisons with Kidson (2000), the reader is referred to Rampal et al. (2022b). The circulation regimes defined from ERA5 data were applied to daily GCM z1000 fields to assign a daily circulation regime. From this we assessed biases in the GCM circulation regime frequency and the association (i.e. composites) with RCM simulated precipitation.
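
The final step, applying the ERA5-derived regimes to GCM fields, can be sketched as follows. The exact assignment rule is not stated above; this sketch assumes each daily z1000 anomaly field is projected onto the ERA5 EOFs and assigned to the nearest cluster centroid in that reduced space. The function names and the flattened-list data layout are illustrative assumptions.

```python
from collections import Counter

def project(field, eofs):
    """Project one flattened daily anomaly field onto each EOF pattern."""
    return [sum(f * e for f, e in zip(field, eof)) for eof in eofs]

def nearest_regime(pcs, centroids):
    """Index of the centroid closest (Euclidean) to the projected field."""
    def sqdist(c):
        return sum((p - q) ** 2 for p, q in zip(pcs, c))
    return min(range(len(centroids)), key=lambda k: sqdist(centroids[k]))

def regime_frequencies(daily_fields, eofs, centroids):
    """Percentage of days assigned to each regime."""
    labels = [nearest_regime(project(f, eofs), centroids) for f in daily_fields]
    counts = Counter(labels)
    return {k: 100.0 * counts.get(k, 0) / len(labels) for k in range(len(centroids))}
```

Comparing the frequencies obtained from a GCM with those obtained from ERA5 gives the regime frequency biases discussed later.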

2.7 Cyclone tracking

Tropical cyclones and associated ex-tropical transitions were evaluated in each GCM and RCM (CCAM) pair. This allows for an assessment of whether CCAM improves the representation of these storms relative to the host GCM. Cyclone tracking was performed through the TempestExtremes v2.1 tracking algorithm (Ullrich et al. 2021). Following Gibson et al. (2023), 6-hourly mean sea level pressure (MSLP) was tracked by identifying a local minimum in the MSLP field. A single closed contour criterion is specified, based on MSLP increasing by 200-Pa over a 6.0-degree great circle distance outward from the candidate point minimum, which ensures the low-pressure region is sufficiently strong and compact to be considered a coherent feature. Candidates are eliminated if another stronger MSLP minimum exists within a 6.0-degree great circle distance. As these candidate features are tracked, additional criteria are imposed: a track must persist for at least 60 h to be considered, and the maximum distance travelled between subsequent detections is 6.0-degrees. We focus our evaluation on subsets of cyclones based on genesis location latitude (between 0–25°S) and in terms of strength defined by MSLP along the track. Since tracking parameter choices can influence the overall cyclone frequency (Ullrich et al. 2021), we also carried out sensitivity testing of tracking parameters on CCAM output including the use of a warm core criterion.
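
The track-level criteria above can be expressed compactly. The sketch below is not TempestExtremes itself; it is a hypothetical post-hoc check on an already stitched track, assuming 6-hourly detections and that duration is measured from the first to the last detection.

```python
import math

def great_circle_deg(lat1, lon1, lat2, lon2):
    """Great-circle distance in degrees of arc (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def keep_track(track, step_hours=6, min_hours=60, max_step_deg=6.0,
               genesis_lat=(-25.0, 0.0)):
    """track: list of (lat, lon, mslp_pa) tuples at consecutive time steps."""
    if not track:
        return False
    if (len(track) - 1) * step_hours < min_hours:
        return False  # does not persist long enough
    if not genesis_lat[0] <= track[0][0] <= genesis_lat[1]:
        return False  # genesis outside 0-25 S
    for (la1, lo1, _), (la2, lo2, _) in zip(track, track[1:]):
        if great_circle_deg(la1, lo1, la2, lo2) > max_step_deg:
            return False  # jump between detections too large
    return True

def min_pressure_hpa(track):
    """Minimum central pressure along the track, in hPa."""
    return min(p for _, _, p in track) / 100.0
```

A track can then be classed by strength from `min_pressure_hpa`, for example using the category 3 threshold of MSLP < 965 hPa applied in the results.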

The climatology of tropical cyclone count, tracks, and maximum intensity was assessed relative to IBTrACS Version 4 (Knapp et al. 2018) and SPEArTC (Diamond et al. 2012) over the southwest Pacific. The use of two reference products enabled an assessment of observational uncertainty in the context of model evaluation. SPEArTC can be considered an update and reanalysis of IBTrACS specific to this basin (Magee et al. 2016); it includes ex-tropical cyclones in the database which are important for rainfall extremes over New Zealand (e.g. Harrington et al. 2023).

3 Results and discussion

3.1 Climatological added value

We begin by investigating added value from CCAM in the context of downscaling ERA5 reanalysis and a single GCM, ACCESS-CM2 (Fig. 3 and Supplementary Material Figure S4). Later we extend this added value analysis to include all downscaled GCMs (Table 1) including for different variables, seasons and error metrics. With the assumption that the large-scale fields are well represented in ERA5, comparing downscaled biases between ERA5 and the GCMs is useful for shedding light on different sources of model bias (e.g. large-scale circulation induced versus RCM physics).

3.1.1 Precipitation

For downscaling precipitation, the top row of Fig. 3 compares the annual precipitation climatology relative to VCSN (reference) for: ERA5, ERA5 downscaled by CCAM, ACCESS-CM2, and ACCESS-CM2 downscaled by CCAM. Prior to downscaling, both ERA5 and ACCESS-CM2 substantially underestimate precipitation in high elevation regions. This underestimation is most apparent across the Southern Alps of the South Island, where the highest average annual precipitation totals are in the 3–5 m range for ERA5 and ACCESS-CM2, in contrast to approaching 12 m in VCSN. This underestimation is also apparent in the North Island in both ERA5 and ACCESS-CM2 with a lack of orographic enhancement in the precipitation climatology. After downscaling with CCAM, this high-elevation dry bias issue is considerably reduced, where the annual precipitation climatology is in much better agreement with VCSN.

The added value is more formally quantified by comparing the magnitude of climatological biases in the middle and bottom rows of Fig. 3 for various error metrics. For example, the annual climatological precipitation RMSE is reduced by around 30% when downscaling ERA5 and by around 45% when downscaling ACCESS-CM2. The improved representation of high elevation precipitation also results in substantially improved spatial pattern correlations, which increases from 0.52 to 0.92 after downscaling ACCESS-CM2 with CCAM. As noted earlier, comparing error metrics and the spatial patterns of biases between reanalysis and GCM-downscaled fields can shed light on different sources of model bias. As expected, when aggregated across the country, errors are slightly larger when downscaling ACCESS-CM2 compared to downscaling ERA5 (Fig. 3, middle row). This indicates that circulation biases induced by the GCM have partly contributed to the final downscaled precipitation biases. Interestingly, a dry bias across parts of the Southern Alps when downscaling ERA5 with CCAM is switched in sign to wet bias when downscaling ACCESS-CM2. In Sect. 3.2 we investigate this in greater detail from a circulation regimes perspective. On the other hand, certain regional biases are evident across both downscaled products indicating important biases induced by CCAM itself. A notable example of this is in the lee of the Southern Alps (i.e. across the eastern South Island) where CCAM consistently displays a positive wet bias relative to VCSN.

The spatial maps in the bottom row of Fig. 3 indicate that the reduction in biases from downscaling is not uniform across the country (i.e. positive values indicate a reduction in biases, while grey values indicate no change or enhanced biases). Regions of negative added value (i.e. enhanced biases) are further highlighted in Supplementary Material Figure S4. As expected, high elevation regions are where downscaling with CCAM shows the largest reduction in biases (Fig. 3). The most likely explanation for this is that low resolution models (i.e. around 100 km resolution) have insufficient representation of orography so lack orographic enhancement (e.g. Renwick et al. 1998). It is notable that ERA5 (prior to downscaling, ~30 km resolution) also has a large dry bias over the Southern Alps, also reported by Pirooz et al. (2021). Rain gauge data are not directly assimilated into ERA5 (Lavers et al. 2022), therefore the biases here cannot simply be attributed to the low station density in this region. This suggests that the driving atmospheric model’s low resolution (i.e. relative to the terrain height and complexity) and physics parameterizations in ERA5 have contributed to the dry bias.

While a considerable reduction in biases is evident in many regions (particularly high elevation regions), added value from downscaling via CCAM is not apparent across the entire country for precipitation. When downscaling ACCESS-CM2, the Land% metric (10.22) indicates that just over 60% of the country shows improvements after downscaling. As noted above, the lee of the Southern Alps is a large region where CCAM appears to struggle to reduce biases from the GCM, suffering from a regionally consistent wet bias. The general spatial pattern of biases, especially the wet bias in the lee of the Southern Alps, is evident when downscaling other GCMs (e.g. EC-Earth3, Supplementary Material Figure S5). Interestingly, this same regional wet bias is shown across high resolution convective-scale forecasts and regional reanalysis products (~1.5-km resolution) in Pirooz et al. (2023), indicating that this issue is not unique to CCAM and not necessarily resolved at higher resolution.

3.1.2 Tasmax

The added value for daily maximum air temperature (tasmax) is shown in Fig. 4 and Supplementary Material Figure S6. As was the case for precipitation, there is a clear lack of spatial variability in the temperature climatology in high elevation regions prior to downscaling. This is particularly notable in ACCESS-CM2 with a complete lack of spatial variation in temperature by elevation. Due to the magnitude of these biases, it is unsurprising that CCAM shows relatively large added value. For example, after downscaling, RMSE and MAE are reduced by over 30%, and the spatial pattern correlation improves from 0.69 to 0.93. In the case of tasmax, the added value is more uniformly spread across the country than for precipitation (Fig. 4, bottom row).

It is noteworthy that when downscaling ERA5 with CCAM, similarly large reductions in tasmax biases are evident. In particular, RMSE and MAE have reductions in the 30–40% range and the added value is spread across over 70% of the country. This is an impressive result, since ERA5 directly assimilates near-surface air temperature observations and is therefore more difficult to improve upon through downscaling. For example, across Australia, Di Virgilio et al. (2020) reported a general lack of added value for temperature fields from RCMs driven by reanalysis.

When comparing ERA5 downscaled by CCAM with ACCESS-CM2 downscaled by CCAM (Fig. 4 middle row) the consistency in the spatial pattern and magnitude of the biases suggests that large-scale circulation biases inherited from the GCM are not the primary source of CCAM tasmax biases. All other downscaled GCMs show a very similar spatial pattern in the bias for tasmax (not shown), despite some differences in the overall magnitudes for certain models. One notable difference is for EC-Earth3 which is approximately 1°C warmer in the downscaled climatology averaged across New Zealand (Supplementary Material Figure S7). Relative to VCSN, CCAM appears to induce a warm bias across parts of the eastern North Island and the top of the South Island. These warm biases are similar in location and magnitude to those described in Gibson et al. (2023) when CCAM was driven by observed SST (no atmospheric nudging), which again suggests that these biases are driven by CCAM at the local scale. These results are consistent with other studies showing that RCMs can generate their own internal biases in near-surface air temperature independent of the driving GCM biases (e.g. Di Virgilio et al. 2020). As discussed in Gibson et al. (2023) through sensitivity testing, the spatial pattern of this apparent bias may be related to roughness length values in CCAM for tall forest canopy regions. Inconsistencies may also arise given the assumption that observational measurements of temperature are made in clearings in these regions. The apparent cold bias in CCAM across the central South Island was also found in Gibson et al. (2023). As described there, VCSN tends to have an apparent warm bias in this high-elevation region which may partly contribute to the apparent discrepancy with CCAM.

3.1.3 Tasmin

In comparison to precipitation and tasmax, tasmin generally displays the largest added value from downscaling (Fig. 5 and Supplementary Material Figure S8). For the case of downscaling ACCESS-CM2, RMSE and MAE decrease by more than 80% (Fig. 5, middle row) and added value is spread across almost the entire country (Fig. 5, bottom row). For the case of downscaling ERA5, error reductions are somewhat smaller but consistent in direction and are also spatially widespread. As was the case for tasmax, the similarity of tasmin biases when downscaling ERA5 and ACCESS-CM2 again suggests that large-scale circulation biases from the GCM have made only a relatively small contribution to this. The small magnitudes of the tasmin biases in both cases are generally impressive, with climatological MAE aggregated across the country of ~0.7°C.

3.1.4 Summary across models

A summary of added value is presented in Fig. 6, comparing across variables, GCMs, error metrics and seasons. The heatmap displays percentage reductions in biases after downscaling, whereby red positive values indicate a reduction in bias (i.e. added value from downscaling), while blue negative values indicate an enhancement in bias.

As noted earlier for ACCESS-CM2, the overall added value is typically largest for tasmin compared to other variables. As shown in Fig. 6c, the large reductions in tasmin biases, particularly MAE and RMSE, that were shown earlier for ACCESS-CM2 are consistent across models and seasons. The only exception to this is when downscaling ERA5, which as discussed earlier is expected to be challenging for temperature (compared to precipitation) since ERA5 directly assimilates surface temperature data. It is also encouraging that for tasmax, in almost all cases, there is consistent added value across models, seasons, and error metrics. The only exception to this is for EC-Earth3 in summer where tasmax biases increase slightly for certain metrics (i.e. MAE and Land%).

For precipitation, while there is generally added value shown across models (Fig. 6a), a notable exception is in winter for the models where CCAM is driven by SST/SIC fields (i.e. AWI-CM-1-1-MR, GFDL-ESM4, CNRM-CM6-1). For these models, CCAM generally enhances biases in winter precipitation compared to the host model (although the spatial pattern correlation is still consistently improved after downscaling). Individual inspection of these model precipitation biases (e.g. GFDL-ESM4, Supplementary Material Figure S9) reveals a consistent pattern of bias across SST/SIC driven models characterised by a wet bias in winter precipitation across the Southern Alps. This bias, and the associated reduction in added value, is limited to winter precipitation, and does not impact other seasons (i.e. MAM, SON, not shown).

Comparisons to the CCAM biases in Gibson et al. (2023), whereby CCAM was forced by observed SST/SIC (OSTIA), provide useful context for explaining these winter precipitation biases. The biases from that study provide an upper bound for how CCAM is expected to perform in this study when driven by bias-corrected SST/SIC fields from GCMs. As discussed in Gibson et al. (2023), when driven by observed SST/SIC, CCAM has a tendency for too strong winter ridging over Southern Australia and the Tasman Sea and an associated too strong winter jet and storm track activity, particularly affecting the south of the South Island (Supplementary Material Figure S10), consistent across initial condition ensemble members. These same circulation biases are shown to be very similar across the SST/SIC driven CCAM runs here, with the magnitude increased slightly here (Supplementary Material Figure S11). In Gibson et al. (2023) evidence was presented to suggest that these circulation biases over the New Zealand region may be related to Rossby wave train biases induced by remote deep convection biases in certain regions of the tropics in CCAM. Given the similarities, it seems plausible that this is also partly responsible for the circulation biases here which subsequently drive the wet bias in high elevation precipitation across the Southern Alps. Since this approach to dynamical downscaling (i.e. CCAM driven by SST/SIC) is also commonly performed by other studies (e.g. Chapman et al. 2023; Grose et al. 2023), we suggest that these findings carry wider relevance for downscaling applications in other regions and settings. As such, similar process-based investigation into large-scale circulation biases and their causes in CCAM is warranted. Testing the sensitivity of these biases to the grid configuration and physics schemes in CCAM would be particularly useful.

3.2 Circulation regimes

The above model evaluation considered precipitation only in a climatological sense. Here we present a more comprehensive evaluation of CCAM precipitation through the perspective of circulation regimes. To remove any precipitation biases induced by large scale circulation biases, here we focus on the ERA5 reanalysis-driven CCAM simulation.

Precipitation composites are shown by circulation regime in Fig. 7 comparing VCSN (top panels) and CCAM (bottom panels). Note that the circulation regime composites themselves are almost identical, as both are driven by ERA5. Very minor differences can however arise due to how CCAM modifies the circulation fields from ERA5 through spectral nudging. Instead, we focus on how the precipitation composites compare between VCSN and CCAM across circulation regimes, quantified through the spatial pattern correlation assigned to each regime. As expected, the presence of widespread cyclonic conditions (namely the L and LSW regimes) typically results in large widespread precipitation totals over much of the country which is reproduced well by CCAM (pattern correlation of 0.86–0.92). Strong zonal westerly flow conditions (most apparent under the HW circulation regime) result in a large west/east gradient of precipitation associated with orographic enhancement over the Southern Alps and dry conditions to the lee. This is also well captured by CCAM (pattern correlation of 0.93).

In terms of weaknesses of CCAM, as noted earlier, CCAM tends to produce too much climatological precipitation in regions to the lee of the Alps. From the perspective of circulation regimes, we can see that this regional wet bias manifests mostly through synoptic northerly and north-easterly flow conditions (i.e. the HSE and LNW regimes) as opposed to westerly flow conditions. Nevertheless, the general finding of highly consistent agreement in these precipitation composites further suggests that CCAM can represent precipitation in a spatially and physically correct manner under typical synoptic circulation regimes. Despite not being commonplace in RCM evaluation studies, we suggest that this type of circulation regime-based evaluation is a valuable addition to model evaluation, providing additional insights into model biases beyond simple climatological statistics.

The frequency of circulation regimes in the GCMs can also be useful for further understanding aspects of the downscaled climatological precipitation biases presented earlier. For example, when comparing CCAM downscaled precipitation from ACCESS-CM2 against ERA5, ACCESS-CM2 had a significant increase in precipitation across the Southern Alps. One possible explanation for this is that zonal circulation regimes associated with westerly or north-westerly flow conditions (namely LSW, HW, HSE), shown to produce large precipitation totals over the Alps (Fig. 7), occur too frequently in ACCESS-CM2. There is some evidence for this shown in Fig. 8, where the LSW and HSE regimes occur around 5% and 10% more frequently, respectively. Another apparent cause for the wet precipitation bias in ACCESS-CM2 is that there is also too much precipitation occurring under these key circulation regimes, especially for the north to north-westerly flow conditions under the HSE regime (Supplementary Material Figure S12). The circulation regime frequency analysis (Fig. 8) also reveals that both the SST/SIC driven CCAM runs and the nudged CCAM runs have overall roughly similar biases in regime frequency on an annual basis. This is encouraging given the tendency for an overly strong winter jet and storm track bias in the SST/SIC driven CCAM runs, as discussed earlier. While the focus here has been on historical evaluation, it would be interesting in future work to assess future projections of these circulation regimes in CMIP6, building upon prior regional attribution and CMIP5 studies (e.g. Gibson et al. 2016; Thomas et al. 2023), including from this subset of downscaled projections.

3.3 Inter-annual variability

There is also interest in whether the downscaled output from CCAM appropriately captures inter-annual variability in temperature over land. Although New Zealand is an island country, where temperature variability over land is strongly related to SST variability, land surface feedbacks may also be important in certain inland regions, particularly for temperature extremes in late summer (Harrington 2021). Since SSTs are prescribed to CCAM in these simulations, changes in inter-annual variability in land temperature through downscaling (i.e. relative to the host GCM) are likely indicative of the role of the land surface model. Furthermore, for the SST/SIC-driven CCAM simulations, since bias correction is first performed on the SSTs, changes in inter-annual variability in temperature after downscaling are useful for assessing how the bias correction has influenced temperature variability over land.

In the nudged simulations for tasmax and tasmin in summer (Fig. 9), there is relatively little change in variability after downscaling. Depending on the host GCM, downscaling can either slightly increase or decrease inter-annual variability, but these changes are small relative to the overall differences between models. As expected, this implies that differences in temperature variability between the host GCMs have a strong first order control on downscaled temperature variability over land. For the SST/SIC driven simulations (marked by asterisk) downscaling has a much larger influence on inter-annual variability, which is very likely due to the bias and variance correction of SSTs performed prior to downscaling (see Sect. 2.4 for details). Notably, this change in variability is consistently in the direction of improving variability relative to VCSN. Overall, it is encouraging that for both nudged and SST/SIC driven simulations CCAM appears to produce similar temperature variability to VCSN. Similar results were found for other seasons (not shown).

3.4 Extreme events

Next, we describe added value in the context of extreme events through analysis of ETCCDI indices. As before, we begin with an in-depth evaluation of added value for reanalysis (ERA5) alongside a single GCM (ACCESS-CM2) for select indices before summarizing across all GCMs and indices.

3.4.1 Rainfall extremes

For the climatology of the annual wettest 3 consecutive days (Rx3day) (Fig. 10 and Supplementary Material Figure S13) the spatial patterns of biases are broadly consistent and scale with the biases described earlier for mean precipitation. For example, for the reanalysis-driven simulation, there is a tendency for a dry bias in extreme rainfall in high elevation regions, and a wet bias in the lee of the Alps. Regarding the dry bias, the maximum Rx3day in CCAM is 478 mm and 572 mm when driven by ERA5 and ACCESS-CM2, respectively, while the maximum Rx3day in VCSN is 720 mm. However, despite these biases, this still constitutes a large improvement in Rx3day relative to ERA5, in which the maximum Rx3day is less than 300 mm (i.e. dry biased) and the mean is also considerably dry biased. The apparent wet bias for CCAM rainfall extremes in the lee of the Alps might be somewhat inflated due to issues with VCSN underestimating rainfall extremes in this region in certain historical events (Stone et al. 2022). One region of relatively large improvements for CCAM is across the eastern North Island where Rx3day values are a close match to VCSN but are substantially underestimated in ERA5 and ACCESS-CM2 by as much as 50%. As shown by Cyclone Gabrielle recently (Harrington et al. 2023), this region can receive significant multiday rainfall totals from ex-tropical cyclones with devastating consequences, making CCAM’s improvement over the global products an important example of added value. Similar results (i.e. spatial patterns of biases and added value) are found for single day extreme rainfall analysis from Rx1day (Supplementary Material Figure S14).

3.4.2 Temperature extremes

For the climatology of the annual hottest day (TXX) (Fig. 11 and Supplementary Material Figure S15) the spatial patterns of biases from CCAM are again generally consistent with those of the daily tasmax climatology shown earlier. Notably, there are regions where CCAM tends to overestimate the hottest day of the year by more than 2°C, namely across the top of the South Island and across eastern North Island. Conversely, CCAM tends to underestimate the hottest day of the year over inland regions of the South Island, though as discussed earlier there is also considerable observational uncertainty across this region. Since similar spatial patterns of biases are evident across both ERA5 and other GCMs (not shown) this suggests that these biases in TXX are again driven predominantly by CCAM at the local scale and less so by specific biases in the host GCM. Despite these biases, for downscaling both ERA5 and ACCESS-CM2 there is generally large added value with reductions in RMSE and MAE in the range of 25–44% when aggregated across New Zealand. As indicated by the Land% metric, the spatial consistency of added value for temperature-based extremes is generally larger (i.e. more widespread) compared to that for precipitation-based extremes shown earlier (Fig. 10).

3.4.3 Summary of ETCCDI

A summary of added value across GCMs and ETCCDI indices is presented in Fig. 12 based on climatological RMSE comparisons. From this, widespread evidence of added value is apparent from downscaling. Generally, the largest added value is for temperature-based indices, especially daily temperature range (DTR) and frost days (FD). For both of these indices, the host GCMs have a strong tendency to underestimate magnitudes (i.e. overall too few frost days and insufficient diurnal temperature range, not shown) which are greatly improved upon in the downscaling by CCAM across GCMs. Despite certain biases described above for precipitation extremes, added value from downscaling is also consistent across GCMs for the various indices analysed (i.e. Rx1day, Rx3day, Rx5day, R95p).

Generally, the smallest or degraded added value is for consecutive wet days (CWD) and consecutive dry days (CDD). For these indices, there are single model examples (i.e. for NorESM2-MM and AWI-CM-1-1-MR GCMs) where the downscaling from CCAM has not led to overall improvements. Although not shown, CCAM tends to slightly overestimate CWD in most locations (i.e. wet spells are too long) and underestimate CDD (i.e. dry spells are too short). When driven by reanalysis and aggregated across the country, CCAM tends to also overestimate the number of wet days (R1mm) and slightly underestimate the average rainfall on wet days (SDII). This implies that the biases in CWD and CDD shown here are driven by too frequent low-intensity rainfall events in CCAM. A similar finding was reported in Gibson et al. (2023), where CCAM was not nudged to atmospheric conditions from GCMs, suggesting that the biases in these indices are generated by CCAM at the local scale. It is also noteworthy that the SST/SIC-driven models in Fig. 12 show an overall similar degree of added value as the nudged runs, highlighting the combined usefulness of both approaches in the ensemble.
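
To make the link between these four indices concrete, the sketch below computes them from a single daily precipitation series using the standard ETCCDI wet-day threshold of 1 mm. It is a simplified illustration for one grid cell and one period, not the processing chain used in this study, and the function names are assumptions.

```python
def longest_run(flags):
    """Length of the longest consecutive run of True values."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best

def spell_indices(daily_mm, wet_threshold=1.0):
    """CWD, CDD, R1mm and SDII for one series of daily totals (mm)."""
    wet = [p >= wet_threshold for p in daily_mm]
    r1mm = sum(wet)
    wet_total = sum(p for p, w in zip(daily_mm, wet) if w)
    return {
        "CWD": longest_run(wet),                   # longest wet spell (days)
        "CDD": longest_run([not w for w in wet]),  # longest dry spell (days)
        "R1mm": r1mm,                              # number of wet days
        "SDII": wet_total / r1mm if r1mm else 0.0, # mean rainfall on wet days
    }
```

For example, a series such as `[2, 2, 0, 3, 2, 2, 0, 0.5, 2, 2]` has more wet days, a longer CWD, a shorter CDD and a lower SDII than `[0, 0, 0, 12, 0, 0, 0, 0, 9, 0]`, which is the signature of too frequent low-intensity rainfall described above.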

Assessments of RCM added value from other studies and regions have generally reported mixed results, with examples of both improvements and deteriorations after downscaling (Di Luca et al. 2016 and references therein). As discussed in Di Luca et al. (2016) for Australia, the largest added value was generally found in regions of complex topography and land-sea contrasts, where GCM biases can be very large. They also generally reported the largest added values for extreme statistical quantities compared to mean fields. Our results are broadly consistent, whereby we report the largest added value for precipitation over high elevation regions of complex terrain, where precipitation totals (for both the mean and extreme statistics) are consistently and considerably underestimated across all GCMs. Similarly for temperature, considerable improvements are found in high elevation regions where the representation of more complex fine-scale patterns becomes apparent. In contrast, the temperature climatology fields in GCMs over New Zealand tend to be too spatially homogeneous, often varying only by latitude (e.g. ACCESS-CM2, Fig. 4).

Using a larger RCM ensemble from CORDEX, Di Virgilio et al. (2020) found that temperature (maximum and minimum), for both mean and extreme statistics over Australia, was generally degraded in the RCM output when downscaling reanalysis. For downscaling GCMs, they also reported instances where downscaling did not add value for temperature fields over the historical period. It is therefore encouraging that in this study for New Zealand we find generally greater consistency in added value for temperature extremes (Fig. 12). A likely reason for this difference is that, compared to Australia, a greater proportion of New Zealand land area is dominated by complex terrain and land-sea contrasts, allowing ample opportunities for the RCM to ameliorate substantial GCM biases. Consistent with this, Di Virgilio et al. (2020) also found that added value was generally much greater for the mean and extremes of temperature fields across isolated stretches of Australian coastline and across the Australian Alps.
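As a rough illustration of how grid-point added value of this kind can be quantified, the sketch below uses one common formulation: the squared error of the GCM minus the squared error of the RCM, both relative to observations, so that positive values indicate an improvement from downscaling. This formulation is an assumption made here for illustration and is not necessarily the metric used in this study; the function name and the example values are hypothetical.

```python
def added_value(gcm, rcm, obs):
    """Per-grid-point added value: squared GCM error minus squared RCM error.

    Positive values mean the downscaled field is closer to observations than
    the host GCM; negative values mean downscaling degraded the field.
    All three inputs are assumed to be on a common grid, flattened to lists.
    """
    return [(g - o) ** 2 - (r - o) ** 2 for g, r, o in zip(gcm, rcm, obs)]


# Hypothetical values at three grid points (e.g. a climatological mean field).
obs = [8.0, 2.0, 12.0]
gcm = [10.0, 9.0, 11.0]
rcm = [8.5, 3.0, 13.5]

av = added_value(gcm, rcm, obs)
improved_fraction = sum(1 for v in av if v > 0) / len(av)
print(av, improved_fraction)
```

Summaries such as the fraction of grid points with positive added value give a compact way to compare variables, indices or regions, with the caveat that any such measure inherits the observational uncertainty of the reference dataset.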

3.5 Tropical cyclones

Tropical cyclone climatology was analysed on a per-model basis, before and after downscaling with CCAM, with tracking from TempestExtremes (Ullrich et al. 2021). Consideration of cyclone intensity was based on the relatively simple metric of minimum cyclone central pressure along the track (Roberts et al. 2020; Gibson et al. 2023); this reduces data requirements and helps improve consistency across tracking methodologies and best track databases. Consideration of cyclone strength is important in the context of downscaling, given issues in coarse-resolution GCMs (and reanalysis) where strong tropical cyclones generally occur too infrequently (Roberts et al. 2020).

We begin by showing an example (Fig. 13) of the tropical cyclone climatology from NorESM2-MM focusing on relatively strong cyclones (reaching at least category 3, MSLP < 965-hPa). As shown in Fig. 13a, compared to the reference SPEArTC, NorESM2-MM substantially underestimates the frequency of cyclones that reach this strength (NorESM2-MM = 35 versus SPEArTC = 119). While observational uncertainties exist for tropical cyclone counts (Schreck et al. 2014), this uncertainty is likely small relative to the magnitude of this bias. For example, IBTrACS was found to produce a very similar overall climatology (n = 123, not shown). After downscaling NorESM2-MM with CCAM (Fig. 13b), the underrepresentation is reduced considerably, with CCAM producing n = 100 events over this time period. The genesis locations and tracks also appear to be generally well captured by CCAM in a climatological sense. One apparent difference is that the tracks from CCAM appear to reach into high latitudes more readily, especially to the east of the International Date Line. A similar finding was reported in Gibson et al. (2023) when CCAM was driven by observed SSTs, indicating that this apparent difference is not driven by biases in the GCM large-scale circulation fields or biases in SST. It is difficult to determine whether this is a true bias in CCAM or whether it stems more from underlying differences in how storms are recorded in these observational track databases (across both IBTrACS and SPEArTC). For example, as discussed in Schreck et al. (2014), counts can be more uncertain as storms traverse into higher latitudes due to differences in counting procedures between reporting agencies and changes to these procedures across time which are unaccounted for. Future research evaluating the background atmospheric environmental conditions generated by CCAM relevant to the maintenance of cyclones at higher latitudes, namely vertical wind shear, would also be beneficial (Walsh and Katzfey 2000).

Next, we extend this analysis of added value in the representation of tropical cyclone climatology across GCMs and for different intensity categories. As shown in Fig. 14a, for storms that reach at least category 2 (MSLP < 980-hPa), downscaling considerably improves cyclone frequency, which was otherwise consistently underrepresented across GCMs. This general, and often substantial, underrepresentation in storm frequency in the coarse-resolution GCMs is consistent with other studies in other basins (e.g. Roberts et al. 2020). However, it is notable that the number of storms after downscaling with CCAM is shown here to be strongly dependent on the GCM. For example, after downscaling, NorESM2-MM and EC-Earth3 have too many events, while the other downscaled GCMs compare reasonably well with observational estimates at this intensity.
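The intensity-based counting used in this section can be sketched as follows: each track is assigned its minimum central pressure, and a storm counts towards a category if that minimum falls below the category threshold. Only the two thresholds stated in the text are used (category 2 and above: MSLP < 980 hPa; category 3 and above: MSLP < 965 hPa). The data layout, function name and example tracks are hypothetical, and the sketch does not reproduce the TempestExtremes tracking itself.

```python
# Thresholds taken from the text; thresholds for higher categories are not
# stated in this section and are therefore deliberately left out.
THRESHOLDS_HPA = {"cat2+": 980.0, "cat3+": 965.0}


def count_by_intensity(tracks):
    """Count tracks whose minimum central MSLP falls below each threshold.

    `tracks` maps a track identifier to the list of central MSLP values (hPa)
    along that track; this layout is an assumption made for illustration.
    """
    counts = {name: 0 for name in THRESHOLDS_HPA}
    for pressures in tracks.values():
        if not pressures:
            continue  # skip empty tracks
        track_min = min(pressures)
        for name, limit in THRESHOLDS_HPA.items():
            if track_min < limit:
                counts[name] += 1
    return counts


# Hypothetical tracks: A reaches category 2 only, B reaches category 3,
# C never deepens below 980 hPa.
example_tracks = {
    "A": [1002.0, 985.0, 972.0, 990.0],
    "B": [998.0, 970.0, 960.0, 975.0],
    "C": [1005.0, 996.0, 988.0],
}

print(count_by_intensity(example_tracks))
```

Because a category 3 storm also satisfies the category 2 threshold, the counts are cumulative ("at least category N"), matching how the frequencies are reported here. Applying the same function to GCM, downscaled and best-track data gives directly comparable counts, provided the pressure values are sampled consistently.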

For stronger category 3+ events (MSLP < 965-hPa, Fig. 14b), while CCAM considerably increases the frequency across all GCMs (typically by more than a factor of 2), the frequency remains generally underestimated by CCAM after downscaling. The exception is EC-Earth3, which slightly overpredicts frequency at this intensity. Similarly, for category 4+ events (Fig. 14c) the underrepresentation further increases across all models. The sensitivity to tracking parameters including warm core criteria (Ullrich et al. 2021) is shown in Supplementary Material Figure S16. When applied to downscaled CCAM output, modification to tracking parameters and warm core inclusion typically reduces tropical cyclone frequency by around 5–20% (depending on the model and intensity) and the overall variability between models is preserved. From this we conclude that sensitivity to tracking parameters is small in the context of the often-large improvements (especially in the category 2–3 range) in frequency from downscaling.

Over the historical period, only two category 5 events were simulated by CCAM, both when downscaling EC-Earth3, compared to 19 events of this magnitude in observations across both IBTrACS and SPEArTC (not shown). A snapshot of the MSLP and wind fields from CCAM during one of these extreme TC events is shown in Supplementary Material Figure S17. During this event, the central MSLP reached a track minimum of 913.5-hPa with hourly maximum 10-m surface wind speeds of 72.6 m s⁻¹. Although clearly sensitive to the driving GCM fields, it is encouraging that CCAM can produce an event of this magnitude at this resolution.

The differences in the representation of tropical cyclones between downscaled GCMs indicate that the thermodynamic and dynamic conditions responsible for cyclone genesis, intensification and maintenance differ strongly between GCM driving fields. To better understand and quantify this, a target for future research is to investigate these conditions (e.g. SSTs, SST gradients, wind shear) across GCMs before and after downscaling. Another possible cause for the underrepresentation of the most extreme events is the model spatial resolution across the tropics and subtropics north of New Zealand. While the CCAM resolution (variable by ~15–25-km) in this region is greatly improved compared to the host GCMs (~80–150-km depending on the model), it may still be a limiting factor. Roberts et al. (2020) found that models with enhanced resolution around 25-km typically tend to have more frequent and stronger tropical cyclones with generally reduced biases compared to events in CMIP6-class models. However, category 3 and 4 storms were still commonly underrepresented by GCMs at this higher resolution as well as in modern reanalysis products.

As discussed in Roberts et al. (2020), further work is needed to understand how parameterizations of unresolved processes could benefit representation of the most intense tropical cyclones in models at ~25-km resolution. Process-based examinations of the boundary layer, convection and surface drag schemes may lead to additional insights. There are at least a few examples of global models at ~15-km resolution that can simulate very strong tropical cyclones well (e.g. Chauvin et al. 2020), but the results appear rather sensitive to the coefficients of the turbulence scheme that enhance convection (Roberts et al. 2020). Future research should examine this further in CCAM, targeting the impact of the turbulence scheme and different aspects of the stretched grid configuration. There is already preliminary evidence from Gibson et al. (2023) for the importance of the grid configuration in CCAM, where the number of storms reaching category 2 decreases by ~50% over the South Pacific basin when the model is run with a quasi-uniform ~100-km resolution grid compared to the high-resolution stretched grid used by CCAM in the present study. Further examination of the sensitivity of strong tropical cyclones to the placement of the high-resolution grid (i.e. including placing over the main genesis regions) and to the Schmidt grid stretching factor would be useful.

Despite these remaining biases in CCAM, it is worth highlighting that the large improvements in tropical cyclone frequency for events reaching at least category 2 (and more modest improvements for events reaching at least category 3) are likely to be valuable for future studies exploring tropical cyclone risk under climate change in this region. For example, one of New Zealand’s most devastating weather events on record was caused by Cyclone Gabrielle in February 2023 (Harrington et al. 2023), which had a lowest track MSLP of 958.1-hPa (category 3) before undergoing ex-tropical transition and impacting New Zealand with a strength of 966.6-hPa recorded over land (equivalent to category 2 strength). Given the similarly important role of atmospheric rivers in driving precipitation extremes over New Zealand (e.g. Prince et al. 2021; Reid et al. 2021), future work will focus on evaluating and understanding future projections of these features in this new set of downscaled simulations.

3.6 Observational uncertainty and caveats

Observational uncertainty is an important consideration in the context of climate model evaluation, especially for extreme events and in regions where observations are sparse (e.g. Gibson et al. 2019; Alexander et al. 2020). Compared to many regions of the globe, New Zealand’s observational network is relatively dense over the time period of model evaluation used here (1982–2014). A caveat is that few long-term station observations are available in high-elevation regions where precipitation is often highly heterogeneous (Tait et al. 2012). As discussed earlier, the observational uncertainty in VCSN in these regions is relatively large, which imposes certain caveats on the interpretation of model biases. It would be useful to re-assess the model biases documented here for any future releases of VCSN that account for additional station data or more sophisticated approaches to spatial interpolation that directly account for the complexities of extreme events.

Despite these caveats, we suggest that VCSN currently provides the most reliable high-resolution daily gridded rainfall and temperature observations for evaluation of model climatologies over New Zealand. Vishwanathan et al. (2023) recently compared several existing gridded products over New Zealand relative to station observations, including VCSN (non-bias corrected), merged satellite-gauge products and reanalysis. The authors report that VCSN and MSWEP (a merged satellite-gauge product, Beck et al. 2019) generally compare well and provide the overall best agreement with station observations, though this varies by region and metric. However, their assessment of MSWEP is complicated by the fact that it was constrained to a limited set of station observations, which do not sample high elevation regions well. More recently, Campbell et al. (2024; submitted) found that MSWEP tends to substantially underestimate high-elevation precipitation, including for both mean and extremes. This likely stems from the fact MSWEP is weighted towards lower resolution reanalysis products that are known to underestimate high elevation precipitation across New Zealand (e.g. Pirooz et al. 2021 and Fig. 3 above) combined with considerably fewer in-situ precipitation observations being ingested than in VCSN. For this reason, and despite the important caveats of VCSN described here, we suggest that the use of MSWEP would not provide additional value.

The evaluation here was based on assessing the raw downscaled model output (i.e. no additional bias correction has been performed). This provides a useful overview of the model strengths and weaknesses and is also important since certain aspects of model biases cannot be easily corrected through bias correction (Maraun et al. 2017). However, in our experience, since bias corrected output is highly sought after and often essential for certain climate impacts applications and services, this will be performed across all simulations for a reduced set of core variables. The bias correction results and methodology will be documented and made available in a separate publication.

4 Conclusions

The overall aims of this study were to: (1) document the experiment design for producing the updated CMIP6 dynamically downscaled climate projections for New Zealand; and (2) comprehensively assess biases and quantify the added value of the downscaling over the historical period across climatological fields, circulation regime composites, extreme indices, and tropical cyclones.

The experiment design involves downscaling six GCMs, primarily with CCAM, using a stretched grid configuration. The stretched grid centres high resolution over New Zealand (~12-km) accompanied by relatively high resolution (12–35-km) over the wider South Pacific. This extended high-resolution domain appears favourable for the representation of tropical cyclone frequency and may also prevent issues around the placement of the inner domain in typical limited area RCMs. While focus here was exclusively on CCAM, the wider downscaling project for New Zealand also includes two other RCMs (WRF and the UM) downscaling reanalysis and select GCMs. The selection of which GCMs to downscale was informed by balancing evaluated performance over the historical period, future rate of warming, and model independence. From this, three of the six GCMs were downscaled directly with CCAM through spectral nudging to atmospheric fields. The remaining three GCMs were downscaled using only their bias-corrected SST/SIC fields as lower boundary conditions to drive CCAM.

In terms of added value, the main strengths of the downscaled output are:

Precipitation and temperature mean climatology fields are generally greatly improved compared to the host GCM. The largest and most consistent improvements are for daily minimum temperature. While certain biases persist, the representation of orographic precipitation is greatly improved after downscaling both reanalysis and GCMs.
When driven by reanalysis, relative to observations, CCAM produces very similar precipitation patterns and amounts according to circulation regime type.
Interannual variability in temperature is generally in close agreement with observations. There is no evidence of systematic under or overestimation of variability.
Large consistent improvements for several temperature and precipitation-based extreme indices after downscaling relative to the host GCM.
Generally large improvements to the representation of tropical cyclone frequency in the category 2 and 3 intensity range, which are consistently and strongly under-represented in the host GCMs.

The main consistent weaknesses of the downscaled output are:

Climatological precipitation in certain low elevation regions appears overestimated by CCAM relative to VCSN, particularly in the lee of the Southern Alps.
Positive precipitation bias in winter across the SST/SIC-driven simulations. This is related to CCAM producing an overly active storm track which impacts the South Island.
Of the extreme indices evaluated, consecutive wet days (CWD) and consecutive dry days (CDD) generally have the largest relative biases. In particular, CCAM tends to slightly overestimate CWD in most locations (i.e. wet spells are too long) and underestimate CDD (i.e. dry spells are too short) driven by too frequent low intensity rainfall.
In certain regions, the temperature during hot extremes has a warm bias, but biases are generally much lower than the host GCMs.
While there are general improvements to TC frequency, the frequency of very strong category 4 and 5 events remains too low in CCAM. However, it is encouraging that CCAM can occasionally simulate a category 5 intensity event.

Through comprehensively documenting the relative strengths and weaknesses of the downscaled output in this study, we hope that users will be able to make better informed decisions about the most appropriate use of the data for a given climate application. In our ongoing bias correction work, we are targeting specific methods that best address the largest remaining model output biases documented here, while seeking to retain physically plausible climate change signals in the future projections.

Data availability

The ERA5 data (Hersbach et al. 2020) used in this study is available from: https://doi.org/10.24381/cds.bd0915c6. The IBTrACS Version 4 data (Knapp et al. 2018) used in this study is available from: https://doi.org/10.25921/82ty-9e16. The SPEArTC data (Diamond et al. 2012) (version November 28 2022) is available from: http://apdrc.soest.hawaii.edu/projects/speartc/. The CMIP6 data is available from the Earth System Grid Federation (ESGF) archive: https://esgf-node.llnl.gov/projects/cmip6/. The VCSN data (Tait et al. 2012) is developed and maintained by NIWA; data access is available by applying through https://niwa.co.nz/climate/our-services/virtual-climate-stations. The total size of the CCAM output produced exceeds 2 PB; access to core variables from this dataset can be obtained through a free NeSI account: https://www.nesi.org.nz/services/applyforaccess. When finalized, information about how to access the model output will be available through: https://niwa.co.nz/climate/research-projects/updated-national-climate-projections-for-aotearoa-new-zealand.

Code availability

The CCAM model used here is fully open source and made available by CSIRO: https://confluence.csiro.au/display/CCAM/CCAM. The TempestExtremes package v2.1 used for the detection of cyclones is available from: https://github.com/ClimateGlobalChange/tempestextremes.

References

Abramowitz G, Herger N, Gutmann E, Hammerling D, Knutti R, Leduc M, Lorenz R, Pincus R, Schmidt GA (2019) ESD reviews: model dependence in multi-model climate ensembles: weighting, sub-selection and out-of-sample testing. Earth Syst Dynam 10(1):91–105. https://doi.org/10.5194/esd-10-91-2019
Ackerley D, Dean S, Sood A, Mullan AB (2012) Regional climate modelling in New Zealand comparison to gridded and satellite observations. Weather Clim 32(1):3–22. https://doi.org/10.2307/26169722
Akhter MS, Shamseldin AY, Melville BW (2019) Investigation of climate change impacts on flow regime in the Lucas Creek catchment using multiple CMIP5 ensembles. Urban Water J 16(5):389–401. https://doi.org/10.1080/1573062X.2019.1669199
Alexander LV, Bador M, Roca R, Contractor S, Donat MG, Nguyen PL (2020) Intercomparison of annual precipitation indices and extremes over global land areas from in situ, space-based and reanalysis products. Environ Res Lett 15(5):055002
Ashfaq M, Skinner CB, Diffenbaugh NS (2011) Influence of SST biases on future climate change projections. Clim Dyn 36:1303–1319
Beck HE, Wood EF, Pan M, Fisher CK, Miralles DG, Van Dijk AI, McVicar TR, Adler RF (2019) MSWEP V2 global 3-hourly 0.1 precipitation: methodology and quantitative assessment. Bull Am Meteorol Soc 100(3):473–500
Brunner L, Pendergrass AG, Lehner F, Merrifield AL, Lorenz R, Knutti R (2020) Reduced global warming from CMIP6 projections when weighting models by performance and independence. Earth Syst Dynam 11(4):995–1012. https://doi.org/10.5194/esd-11-995-2020
Campbell I et al (2024) Comparison of three reanalysis-driven regional climate models over New Zealand: climatology and extreme events. International Journal of Climatology (submitted)
Chan SC, Kendon EJ, Fowler HJ, Blenkinsop S, Roberts NM, Ferro CAT (2014) The Value of High-Resolution Met Office Regional Climate Models in the Simulation of Multihourly Precipitation extremes. J Clim 27(16):6155–6174. https://doi.org/10.1175/JCLI-D-13-00723.1
Chapman S, Syktus J, Trancoso R, Thatcher M, Toombs N, Wong KKH, Takbash A (2023) Evaluation of dynamically downscaled CMIP6-CCAM models over Australia. Earths Future 11(11):e2023EF003548
Chauvin F, Pilon R, Palany P, Belmadani A (2020) Future changes in Atlantic hurricanes with the rotated-stretched ARPEGE-Climat at very high resolution. Clim Dyn 54:947–972
Collins DB (2020) New Zealand river hydrology under late 21st century climate change. Water 12(8):2175
Davies H (1976) A lateral boundary formulation for multi-level prediction models. Q J R Meteorol Soc 102(432):405–418
Davini P, D’Andrea F (2020) From CMIP3 to CMIP6: Northern Hemisphere Atmospheric Blocking Simulation in Present and Future Climate. J Clim 33(23):10021–10038. https://doi.org/10.1175/JCLI-D-19-0862.1
Davini P, Corti S, D’Andrea F, Rivière G, von Hardenberg J (2017) Improved Winter European Atmospheric Blocking frequencies in High-Resolution Global Climate simulations. J Adv Model Earth Syst 9(7):2615–2634. https://doi.org/10.1002/2017MS001082
Di Luca A, Argüeso D, Evans JP, de Elía R, Laprise R (2016) Quantifying the overall added value of dynamical downscaling and the contribution from different spatial scales. J Geophys Research: Atmos 121(4):1575–1590
Di Virgilio G, Evans JP, Di Luca A, Olson R, Argüeso D, Kala J, Andrys J, Hoffmann P, Katzfey JJ, Rockel B (2019) Evaluating reanalysis-driven CORDEX regional climate models over Australia: model performance and errors. Clim Dyn 53:2985–3005
Di Virgilio G, Evans JP, Di Luca A, Grose MR, Round V, Thatcher M (2020) Realised added value in dynamical downscaling of Australian climate change. Clim Dyn 54:4675–4692
Di Virgilio G, Ji F, Tam E, Nishant N, Evans JP, Thomas C, Riley ML, Beyer K, Grose MR, Narsey S, Delage F (2022) Selecting CMIP6 GCMs for CORDEX Dynamical Downscaling: Model Performance, Independence, and Climate Change signals. Earths Future 10(4):e2021EF002625. https://doi.org/10.1029/2021EF002625
Diamond HJ, Lorrey AM, Knapp KR, Levinson DH (2012) Development of an enhanced tropical cyclone tracks database for the Southwest Pacific from 1840 to 2010. Int J Climatol 32(14):2240–2250. https://doi.org/10.1002/joc.2412
Donlon CJ, Martin M, Stark J, Roberts-Jones J, Fiedler E, Wimmer W (2012) The operational Sea Surface temperature and sea ice analysis (OSTIA) system. Remote Sens Environ 116:140–158. https://doi.org/10.1016/j.rse.2010.10.017
Drost F, Renwick J, Bhaskaran B, Oliver H, McGregor J (2007) Simulation of New Zealand’s climate using a high-resolution nested regional climate model. Int J Climatology: J Royal Meteorological Soc 27(9):1153–1169
Evans JP, Ji F, Lee C, Smith P, Argüeso D, Fita L (2014) Design of a regional climate modelling projection ensemble experiment – NARCliM. Geosci Model Dev 7(2):621–629. https://doi.org/10.5194/gmd-7-621-2014
Freidenreich S, Ramaswamy V (1999) A new multiple-band solar radiative parameterization for general circulation models. J Geophys Research: Atmos 104(D24):31389–31409
Gibson PB, Perkins-Kirkpatrick SE, Renwick JA (2016) Projected changes in synoptic weather patterns over New Zealand examined through self-organizing maps. Int J Climatol 36(12):3934–3948. https://doi.org/10.1002/joc.4604
Gibson PB, Waliser DE, Lee H, Tian B, Massoud E (2019) Climate model evaluation in the presence of observational uncertainty: precipitation indices over the contiguous United States. J Hydrometeorol 20(7):1339–1357
Gibson PB, Stone D, Thatcher M, Broadbent A, Dean S, Rosier SM, Stuart S, Sood A (2023) High-resolution CCAM simulations over New Zealand and the South Pacific for the detection and attribution of Weather extremes. J Geophys Research: Atmos 128(14):e2023JD038530. https://doi.org/10.1029/2023JD038530
Giorgi F, Gutowski WJ Jr (2015) Regional dynamical downscaling and the CORDEX initiative. Annu Rev Environ Resour 40:467–490
Giorgi F, Jones C, Asrar GR (2009) Addressing climate information needs at the regional level: the CORDEX framework. World Meteorological Organ (WMO) Bull 58(3):175
Grose MR, Narsey S, Trancoso R, Mackallah C, Delage F, Dowdy A, Di Virgilio G, Watterson I, Dobrohotoff P, Rashid HA (2023) A CMIP6-based multi-model downscaling ensemble to underpin climate change services in Australia. Clim Serv 30:100368
Harrington LJ (2021) Rethinking extreme heat in a cool climate: a New Zealand case study. Environ Res Lett 16(3):034030. https://doi.org/10.1088/1748-9326/abbd61
Harrington LJ, Dean SM, Awatere S, Rosier S, Queen L, Gibson PB, Barnes C, Zachariah M, Philip S, Kew S (2023) The role of climate change in extreme rainfall associated with Cyclone Gabrielle over Aotearoa New Zealand’s East Coast
Hausfather Z, Marvel K, Schmidt GA, Nielsen-Gammon JW, Zelinka M (2022) Climate simulations: recognize the ‘hot model’ problem. Nature 605(7908):26–29
He J, Soden BJ (2016) The impact of SST biases on projections of anthropogenic climate change: a greater role for atmosphere-only models? Geophys Res Lett 43(14):7745–7750
Hersbach H, Bell B, Berrisford P, Hirahara S, Horányi A, Muñoz-Sabater J, Nicolas J, Peubey C, Radu R, Schepers D (2020) The ERA5 global reanalysis. Q J R Meteorol Soc 146(730):1999–2049
Hirsch AL, Evans JP, Di Virgilio G, Perkins-Kirkpatrick SE, Argüeso D, Pitman AJ, Carouge CC, Kala J, Andrys J, Petrelli P, Rockel B (2019) Amplification of Australian heatwaves via Local Land-Atmosphere Coupling. J Geophys Research: Atmos 124(24):13625–13647. https://doi.org/10.1029/2019JD030665
Hoffmann P, Katzfey JJ, McGregor JL, Thatcher M (2016) Bias and variance correction of sea surface temperatures used for dynamical downscaling. J Geophys Research: Atmos 121(21):12,877–12,890. https://doi.org/10.1002/2016JD025383
Hurley P (2007) Modelling mean and turbulence fields in the dry convective boundary layer with the eddy-diffusivity/mass-flux approach. Boundary Layer Meteorol 125(3):525–536
Ibbitt RP, Henderson R, Copeland J, Wratt D (2000) Simulating mountain runoff with meso-scale weather model rainfall estimates: a New Zealand experience. J Hydrol 239(1–4):19–32
Jacob D, Teichmann C, Sobolowski S, Katragkou E, Anders I, Belda M, Benestad R, Boberg F, Buonomo E, Cardoso RM (2020) Regional climate downscaling over Europe: perspectives from the EURO-CORDEX community. Reg Environ Change 20:1–20
Jobst AM, Kingston DG, Cullen NJ, Schmid J (2018) Intercomparison of different uncertainty sources in hydrological climate change projections for an alpine catchment (upper Clutha River, New Zealand). Hydrol Earth Syst Sci 22(6):3125–3142. https://doi.org/10.5194/hess-22-3125-2018
Jones R, Murphy J, Noguer M (1995) Simulation of climate change over europe using a nested regional-climate model. I: Assessment of control climate, including sensitivity to location of lateral boundaries. Q J R Meteorol Soc 121(526):1413–1449
Kidson JW (2000) An analysis of New Zealand synoptic types and their use in defining weather regimes. Int J Climatology: J Royal Meteorological Soc 20(3):299–316
Knapp KR, Diamond HJ, Kossin JP, Kruk MC, Schreck CJ (2018) International best track archive for climate stewardship (IBTrACS) project, version 4. NOAA Natl Centers Environ Inform, 10
Knist S, Goergen K, Buonomo E, Christensen OB, Colette A, Cardoso RM, Fealy R, Fernández J, García-Díez M, Jacob D, Kartsios S, Katragkou E, Keuler K, Mayer S, van Meijgaard E, Nikulin G, Soares PMM, Sobolowski S, Szepszo G, Teichmann C, Vautard R, Warrach-Sagi K, Wulfmeyer V, Simmer C (2017) Land-atmosphere coupling in EURO-CORDEX evaluation experiments. J Geophys Research: Atmos 122(1):79–103. https://doi.org/10.1002/2016JD025476
Knutti R, Masson D, Gettelman A (2013) Climate model genealogy: Generation CMIP5 and how we got there. Geophys Res Lett 40(6):1194–1199
Kowalczyk E, Wang Y, Law R, Davies H, McGregor J, Abramowitz G (2006) The CSIRO Atmosphere Biosphere Land Exchange (CABLE) model for use in climate models and as an offline model. CSIRO Mar Atmospheric Res Paper 13:42
Lavers DA, Simmons A, Vamborg F, Rodwell MJ (2022) An evaluation of ERA5 precipitation for climate monitoring. Q J R Meteorol Soc 148(748):3152–3165. https://doi.org/10.1002/qj.4351
Lloyd EA, Bukovsky M, Mearns LO (2021) An analysis of the disagreement about added value by regional climate models. Synthese 198(12):11645–11672. https://doi.org/10.1007/s11229-020-02821-x
Lorrey AM, Griffiths G, Fauchereau N, Diamond HJ, Chappell PR, Renwick J (2014) An ex-tropical cyclone climatology for Auckland, New Zealand. Int J Climatol 34(4):1157–1168
Magee AD, Verdon-Kidd DC, Kiem AS (2016) An intercomparison of tropical cyclone best-track products for the Southwest Pacific. Nat Hazards Earth Syst Sci 16(6):1431–1447
Maraun D, Shepherd TG, Widmann M, Zappa G, Walton D, Gutiérrez JM, Hagemann S, Richter I, Soares PMM, Hall A, Mearns LO (2017) Towards process-informed bias correction of climate change simulations. Nat Clim Change 7(11):764–773. https://doi.org/10.1038/nclimate3418
McGregor JL (2003) A new convection scheme using simple closure
Melia N, Dean S, Pearce H, Harrington L, Frame D, Strand T (2022) Aotearoa New Zealand’s 21st-Century Wildfire Climate. Earths Future 10(6):e2022EF002853
Ministry for the Environment (2018) Climate Change Projections for New Zealand: Atmosphere Projections Based on Simulations from the IPCC Fifth Assessment, 2nd Edition. In. Wellington, NZ
Mullan B, Dean S (2009) AR4 climate model validation and scenarios for New Zealand. In: 9th International Conference on Southern Hemisphere Meteorology and Oceanography, Melbourne, February 2009 (pp. 9–13)
Nguyen KC, Katzfey JJ, McGregor JL (2012) Global 60 km simulations with CCAM: evaluation over the tropics. Clim Dyn 39:637–654
Pastor-Paz J, Noy I, Sin I, Sood A, Fleming-Munoz D, Owen S (2020) Projecting the effect of climate change on residential property damages caused by extreme weather events. J Environ Manage 276:111012. https://doi.org/10.1016/j.jenvman.2020.111012
Pirooz A, Moore S, Carey-Smith T, Turner R, Su C-H (2021) Evaluation of global and regional reanalyses performance over New Zealand. Weather Clim 41(1):52–71
Pirooz A, Moore S, Carey-Smith T, Turner R, Su C-H (2023) The New Zealand reanalysis (NZRA): development and preliminary evaluation. Weather Clim 42(1)
Prince HD, Cullen NJ, Gibson PB, Conway J, Kingston DG (2021) A climatology of atmospheric rivers in New Zealand. J Clim 34(11):4383–4402
Quinting JF, Vitart F (2019) Representation of synoptic-scale Rossby Wave packets and blocking in the S2S Prediction Project Database. Geophys Res Lett 46(2):1070–1078. https://doi.org/10.1029/2018GL081381
Rampal N, Gibson PB, Sood A, Stuart S, Fauchereau NC, Brandolino C, Noll B, Meyers T (2022a) High-resolution downscaling with interpretable deep learning: rainfall extremes over New Zealand. Weather Clim Extremes 38:100525. https://doi.org/10.1016/j.wace.2022.100525
Rampal N, Lorrey A, Fauchereau N (2022b) An objective Weather Regime classification for Aotearoa New Zealand using a two-Tiered K-Means Clustering Approach. Mon Weather Rev 150(11):3103–3122. https://doi.org/10.1175/MWR-D-22-0059.1
Reid KJ, Rosier SM, Harrington LJ, King AD, Lane TP (2021) Extreme rainfall in New Zealand and its association with Atmospheric Rivers. Environ Res Lett 16(4):044012
Renwick JA, Katzfey JJ, Nguyen KC, McGregor JL (1998) Regional model simulations of New Zealand climate. J Geophys Res Atmos 103(D6):5973–5982
Renwick JA, Katzfey JJ, McGregor JL, Nguyen KC (1999) On regional model simulations of climate change over New Zealand. Weather Clim 19:3–13. https://doi.org/10.2307/44279923
Roberts MJ, Camp J, Seddon J, Vidale PL, Hodges K, Vanniere B, Mecking J, Haarsma R, Bellucci A, Scoccimarro E, Caron L-P, Chauvin F, Terray L, Valcke S, Moine M-P, Putrasahan D, Roberts C, Senan R, Zarzycki C, Ullrich P (2020) Impact of Model Resolution on Tropical Cyclone Simulation using the HighResMIP–PRIMAVERA Multimodel Ensemble. J Clim 33(7):2557–2583. https://doi.org/10.1175/JCLI-D-19-0639.1
Rotstayn LD (1997) A physically based scheme for the treatment of stratiform clouds and precipitation in large-scale models. I: description and evaluation of the microphysical processes. Q J R Meteorol Soc 123(541):1227–1282
Schreck CJ, Knapp KR, Kossin JP (2014) The impact of best track discrepancies on global Tropical Cyclone climatologies using IBTrACS. Mon Weather Rev 142(10):3881–3899. https://doi.org/10.1175/MWR-D-14-00021.1
Schwarzkopf MD, Ramaswamy V (1999) Radiative effects of CH4, N2O, halocarbons and the foreign-broadened H2O continuum: a GCM experiment. J Geophys Res Atmos 104(D8):9467–9488
Stone DA, Pall P (2021) Benchmark estimate of the effect of anthropogenic emissions on the ocean surface. Int J Climatol 41(5):3010–3026
Stone DA, Rosier SM, Bird L, Harrington LJ, Rana S, Stuart S, Dean SM (2022) The effect of experiment conditioning on estimates of human influence on extreme weather. Weather Clim Extremes 36:100427
Sturman AP, Tapper NJ (1996) The weather and climate of Australia and New Zealand. Oxford University Press, USA
Sturman AP, McGowan HA, Spronken-Smith RA (1999) Mesoscale and local climates in New Zealand. Prog Phys Geogr 23(4):611–635
Su C-H, Eizenberg N, Jakob D, Fox-Hughes P, Steinle P, White CJ, Franklin C (2021) BARRA v1.0: kilometre-scale downscaling of an Australian regional atmospheric reanalysis over four midlatitude domains. Geosci Model Dev 14(7):4357–4378
Tait A, Macara G (2014) Evaluation of interpolated daily temperature data for high elevation areas in New Zealand. Weather Clim 34:36–49
Tait A, Henderson R, Turner R, Zheng X (2006) Thin plate smoothing spline interpolation of daily rainfall for New Zealand using a climatological rainfall surface. Int J Climatol 26(14):2097–2115
Tait A, Sturman J, Clark M (2012) An assessment of the accuracy of interpolated daily rainfall for New Zealand. J Hydrology (New Zealand), 25–44
Thatcher M, McGregor JL (2009) Using a scale-selective filter for Dynamical Downscaling with the Conformal Cubic Atmospheric Model. Mon Weather Rev 137(6):1742–1752. https://doi.org/10.1175/2008MWR2599.1
Thomas A, McDonald A, Renwick J, Tradowsky JS, Bodeker GE, Rosier S (2023) Increasing temperature extremes in New Zealand and their connection to synoptic circulation features. Int J Climatol 43(3):1251–1272. https://doi.org/10.1002/joc.7908
Thomassen ED, Kendon EJ, Sørup HJD, Chan SC, Langen PL, Christensen OB, Arnbjerg-Nielsen K (2021) Differences in representation of extreme precipitation events in two high resolution models. Clim Dyn 57(11):3029–3043. https://doi.org/10.1007/s00382-021-05854-1
Trenberth KE (1976) Fluctuations and trends in indices of southern hemispheric circulation. Q J R Meteorol Soc 102(431):65–75
Ullrich PA, Zarzycki CM, McClenny EE, Pinheiro MC, Stansfield AM, Reed KA (2021) TempestExtremes v2.1: a community framework for feature detection, tracking, and analysis in large datasets. Geosci Model Dev 14(8):5023–5048. https://doi.org/10.5194/gmd-14-5023-2021
Vishwanathan G, McDonald A, Stone DA, Rosier S, Rana S, Noble C (2023) Mean and extreme precipitation over Aotearoa New Zealand: a comparison across multiple different estimation techniques. Int J Climatol 43(7):3072–3093. https://doi.org/10.1002/joc.8017
Walsh KJ, Katzfey JJ (2000) The impact of climate change on the poleward movement of tropical cyclone–like vortices in a regional climate model. J Clim 13(6):1116–1132
Yang C, Leonelli FE, Marullo S, Artale V, Beggs H, Nardelli BB, Chin TM, De Toma V, Good S, Huang B, Merchant CJ, Sakurai T, Santoleri R, Vazquez-Cuervo J, Zhang H-M, Pisano A (2021) Sea Surface temperature intercomparison in the Framework of the Copernicus Climate Change Service (C3S). J Clim 34(13):5257–5283. https://doi.org/10.1175/JCLI-D-20-0793.1
Zelinka MD, Myers TA, McCoy DT, Po-Chedley S, Caldwell PM, Ceppi P, Klein SA, Taylor KE (2020) Causes of higher climate sensitivity in CMIP6 models. Geophys Res Lett 47(1):e2019GL085782
Zhang X, Alexander L, Hegerl GC, Jones P, Tank AK, Peterson TC, Trewin B, Zwiers FW (2011) Indices for monitoring changes in extremes based on daily temperature and precipitation data. WIREs Clim Change 2(6):851–870. https://doi.org/10.1002/wcc.147

Acknowledgements

We acknowledge the World Climate Research Programme, which, through its Working Group on Coupled Modelling, coordinated and promoted CMIP6. We thank the climate modelling groups for producing and making available their model output, the Earth System Grid Federation (ESGF) for archiving the data and providing access, and the multiple funding agencies who support CMIP6 and ESGF. We acknowledge ECMWF for producing and making available the ERA5 dataset and the Copernicus Climate Data Store for archiving the data and providing access.

Funding

Funding for this work was provided by the Building for Climate Change programme of New Zealand’s Ministry of Business, Innovation and Employment (MBIE) and is contracted through the Ministry for the Environment (MfE contract number 25679). The work is also supported by MBIE Strategic Science Investment funding (contract number C01X1703).

Author information

Authors and Affiliations

National Institute of Water and Atmospheric Research (NIWA), Wellington, New Zealand
Peter B. Gibson, Stephen Stuart, Abha Sood, Dáithí Stone, Ashley Broadbent & Olaf Morgenstern
National Institute of Water and Atmospheric Research (NIWA), Auckland, New Zealand
Neelesh Rampal & Hamish Lewis
Department of Physics, University of Auckland, Auckland, New Zealand
Hamish Lewis
Commonwealth Scientific and Industrial Research Organisation (CSIRO) Oceans and Atmosphere, Aspendale, VIC, Australia
Marcus Thatcher

Corresponding author

Correspondence to Peter B. Gibson.

Ethics declarations

Competing interests

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Gibson, P.B., Stuart, S., Sood, A. et al. Dynamical downscaling CMIP6 models over New Zealand: added value of climatology and extremes. Clim Dyn 62, 8255–8281 (2024). https://doi.org/10.1007/s00382-024-07337-5

Received: 10 January 2024
Accepted: 05 July 2024
Published: 17 July 2024
Issue Date: August 2024
DOI: https://doi.org/10.1007/s00382-024-07337-5
