Introduction

Wheat (Triticum aestivum L.) is a globally important source of nutrition. Climate change poses a major risk for crop production worldwide and thus new varieties adapted to specific regions are needed. Various scenarios have been described and various possible benefits and losses predicted for different crops, including spring and winter wheat. An increase in global temperature by 1 °C was predicted to result in wheat yield reductions ranging from 4.1 to 6.4%, when adaptation and CO2 fertilization effects were not considered (Liu et al. 2016). Breeding of region-specific new varieties requires genetic diversity and adaptation to changing weather conditions (Kahiluoto et al. 2019; Kaseva et al. 2023; Morgounov et al. 2018; Peltonen-Sainio et al. 2018; Zabel et al. 2021; Zhang et al. 2022). More than 100 years have passed since the beginning of systematic wheat breeding in Nordic and Baltic countries, and wheat has become the most widely grown cereal in the region. The combination of long summer days and usually adequate moisture levels makes cultivation of wheat attractive for farmers. The biggest challenge for breeders is changing climate, accompanied by frequent extreme weather events such as heatwaves, droughts, and heavy rainfalls (Mäkinen et al. 2018; Wirehn 2018). Therefore, recent spring wheat yields have been fluctuating, and to cope with future decline, varieties better adapted to changing climatic conditions will be needed in the region.

Spring wheat is known for its high quality. In Norway, spring wheat currently dominates wheat production (Mroz et al. 2022). The area of wheat in 2019 was 944.2 kha in Lithuania, 537.3 kha in Latvia, 180.0 kha in Estonia, and 63.6 kha in Norway. In 2021, spring wheat accounted for about 25% of the total wheat area in the Baltic region (FAOSTAT 2021).

Wheat yields in several European countries have stagnated since the 1990s due to changes in agricultural practices and climate, particularly drought during stem elongation and heat stress during grain filling (Le Gouis et al. 2020). Breeding climate-resilient spring wheat varieties is an important task for the Baltic and Nordic countries to provide the food industry with high-quality grain and to ensure a sustainable food supply in the region. The current geopolitical situation further heightens the importance of food security. In addition, a better understanding of genotype × environment (GxE) interaction is required for the wheat germplasm used in the region in order to identify adapted genotypes and preserve diversity. Regional collaboration will assist in that effort (Chawade et al. 2018).

Grain yield (GY) is the main target in selection, followed by several quality traits that determine nutritional value and processing attributes. Mature wheat grain consists of 55–75% carbohydrate and 10–20% protein (Gillies et al. 2012). Typically, GY and protein content (PC) are negatively correlated, hindering attempts to increase both traits (Simmonds 1995). In addition to PC, thousand kernel weight (TKW) and test weight (TW) represent basic quality traits for milling, baking and export (Koppel et al. 2020). These grain characteristics are essential indicators of processing and end-use quality (Nehe et al. 2019). New varieties should combine high, stable yield performance with the required quality traits (Chawade et al. 2018; Kyratzis et al. 2022). This can only be achieved by using appropriate parents in breeding programs (Al-Ashkar et al. 2023).

This study, carried out in the framework of the NOBALwheat project: “Breeding toolbox for sustainable food system of the NOrdic BALtic region”, was designed to identify superior spring wheat genotypes with genetic plasticity and adaptive capacity for breeding biotic and abiotic stress-resistant varieties. Here, we determine the extent of diversity present among spring wheat genotypes adapted to Northern European and Baltic climates. We characterized the variation in GY and quality traits across environments over two years in order to identify genotypes with high and stable yield potential combined with superior grain quality. These genotypes can be exploited in breeding programs for the region and beyond.

Materials and methods

Plant material, experimental design and measured traits

This study was conducted to assess the level of genetic diversity in a panel of 300 spring wheat genotypes. The experimental materials included local breeding lines and released varieties adapted to the region. The genotypes were tested over two consecutive years (2021–2022) in four countries (Lithuania, Latvia, Estonia, and Norway) with latitudes N 55°40’ to N 59°66’. Field experiments were carried out at the Lithuanian Research Centre for Agriculture and Forestry (LAMMC, Lithuania), the Institute of Agricultural Resources and Economics (AREI, Latvia), the Centre of Estonian Rural Research and Knowledge (METK, Estonia), and the Norwegian University of Life Sciences (NMBU, Norway).

The applied fertilizer level was 120 kg nitrogen (N) ha−1. Weeds were controlled using herbicides chosen according to the dominant weed species. Fungicide-dressed seeds were used to control seed-borne diseases, and 1–2 treatments of foliar fungicide were applied at each location. Insecticides were used when the control thresholds of insects were exceeded. All field management procedures were standard to the locations. Daily agro-meteorological variables were recorded in all the environments from weather stations placed at each site. Normal or long-period average climatic data (LPA, 1991–2020) were compared with the two-year trial data. Field plot sizes ranged from 5 to 7.5 m2 at the different sites. The genotypes in the yield trial were sown based on nearest neighbour analysis, alpha-lattice or a randomized complete block design with two replications.

The plots were harvested mechanically at maturity, and GY (g m−2) was expressed on a 14% grain moisture basis. Phenotypic data were collected for 13 agronomic traits: emergence date (70% seedling emergence), early vigour (ground cover in % at the beginning of tillering), heading date (70% head emergence), maturity date, plant height (at maturity), lodging (1–9 visual scale, where 1 was completely lodged and 9 represented no lodging), grain yield (GY, g m−2), thousand kernel weight (TKW, g, 500 seeds per replicate), and test weight (TW, kg hl−1) measured by a Perten AM 5200 in Estonia, by an Infratec NOVA whole grain analyser in Latvia, and by an Infratec 1241 whole grain analyser in Lithuania. Basic grain biochemical quality traits (protein content (PC), gluten and starch contents, and sedimentation values) were determined on whole kernels by rapid NIT or NIR reflectance spectrometers. Length of growing period (GP) was calculated in days from sowing to maturity, and the period to heading in days from sowing to heading. Trials were sown between April 19 (Norway) and May 2 (Latvia) in 2021 and between April 21 (Lithuania) and May 3 (Estonia) in 2022. The trials were harvested in the first two weeks of August (Baltic countries) and on August 23 (Norway) in 2021, and in the last two weeks of August in all four countries in 2022.
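As an illustration of the 14% moisture basis mentioned above, the following minimal sketch rescales a plot yield measured at an arbitrary grain moisture to the 14% standard and converts g m−2 to t ha−1. The function names and the example moisture of 18% are assumptions made for illustration; the exact procedure used at each site is not described in the text.

```python
def to_standard_moisture(yield_g_m2, moisture_pct, standard_pct=14.0):
    """Rescale a plot yield to a standard grain moisture basis.

    The dry matter is kept constant and only the water share changes.
    Hypothetical helper; not taken from the study's own workflow.
    """
    if not 0.0 <= moisture_pct < 100.0:
        raise ValueError("moisture_pct must be in [0, 100)")
    dry_matter = yield_g_m2 * (100.0 - moisture_pct) / 100.0
    return dry_matter * 100.0 / (100.0 - standard_pct)


def g_m2_to_t_ha(yield_g_m2):
    """Convert g m-2 to t ha-1 (1 g m-2 = 10 kg ha-1 = 0.01 t ha-1)."""
    return yield_g_m2 * 0.01


# A plot harvested at 18% moisture (assumed value) yields less at 14% moisture.
adjusted = to_standard_moisture(600.0, 18.0)
print(round(adjusted, 1), "g m-2 =", round(g_m2_to_t_ha(adjusted), 2), "t ha-1")
```

With this conversion, a yield of 600 g m−2 corresponds to 6 t ha−1, which makes the g m−2 values reported in the Results directly comparable with the t ha−1 values quoted in the Discussion.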

Data analysis

A two-way analysis of variance (ANOVA) was used to determine the effects of genotype and environment, and their interaction, on grain yield and the other tested traits. Location and year effects were combined into a single environmental effect.
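The partitioning behind such a two-way ANOVA can be sketched as follows. This is a simplified illustration that assumes a fully balanced layout (every genotype in every environment with the same number of replicates) and tests all effects against the residual mean square; it ignores the alpha-lattice and nearest neighbour adjustments used at some sites, and the toy data are invented.

```python
def two_way_anova(data):
    """Sums of squares, degrees of freedom and F ratios for G, E and GxE.

    data[i][j] is the list of replicate values for genotype i in
    environment j (balanced design assumed).
    """
    n_g, n_e, n_r = len(data), len(data[0]), len(data[0][0])
    grand = sum(v for row in data for cell in row for v in cell) / (n_g * n_e * n_r)
    g_mean = [sum(v for cell in row for v in cell) / (n_e * n_r) for row in data]
    e_mean = [sum(v for row in data for v in row[j]) / (n_g * n_r) for j in range(n_e)]
    c_mean = [[sum(cell) / n_r for cell in row] for row in data]

    ss = {
        "G": n_e * n_r * sum((m - grand) ** 2 for m in g_mean),
        "E": n_g * n_r * sum((m - grand) ** 2 for m in e_mean),
        "GxE": n_r * sum(
            (c_mean[i][j] - g_mean[i] - e_mean[j] + grand) ** 2
            for i in range(n_g) for j in range(n_e)
        ),
        "error": sum(
            (v - c_mean[i][j]) ** 2
            for i in range(n_g) for j in range(n_e) for v in data[i][j]
        ),
    }
    df = {
        "G": n_g - 1,
        "E": n_e - 1,
        "GxE": (n_g - 1) * (n_e - 1),
        "error": n_g * n_e * (n_r - 1),
    }
    ms_error = ss["error"] / df["error"]
    f_ratio = {k: (ss[k] / df[k]) / ms_error for k in ("G", "E", "GxE")}
    return ss, df, f_ratio


# Invented toy data: 2 genotypes x 2 environments x 2 replicates.
toy = [[[10, 12], [20, 22]], [[14, 16], [18, 20]]]
ss, df, f_ratio = two_way_anova(toy)
print(ss, df, f_ratio)
```

In the toy data the environment sum of squares is by far the largest component, which mirrors the pattern reported for the real trials, where environment was the main source of variation.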

The mean, minimum and maximum values, coefficients of variation (CV), genotypic variance and genotype x environment variation, and broad sense heritability (H2) values were calculated for all 13 traits. Broad sense heritability indicating the share of phenotypic variation explained by genotype was calculated according to Piepho and Möhring (2007) for each trait by: \({H}^{2}={\sigma }_{g}^{2}/({\sigma }_{g}^{2}+{\sigma }_{ge}^{2}/m+{\sigma }^{2}/rm)\), where \({\sigma }_{g}^{2}\) is the genetic variance, \({\sigma }_{ge}^{2}\) is the genotype x environment interaction variance, and \({\sigma }^{2}\) is the residual error variance, r is the number of replicates per trial, and m is the number of environments.
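The heritability formula above translates directly into code. The sketch below assumes the variance components have already been estimated (for example from a mixed model); the numerical values are invented for illustration and are not taken from Table 2.

```python
def broad_sense_heritability(var_g, var_ge, var_error, n_env, n_rep):
    """H2 = var_g / (var_g + var_ge / m + var_error / (r * m)).

    var_g     genotypic variance
    var_ge    genotype x environment interaction variance
    var_error residual error variance
    n_env     number of environments (m)
    n_rep     number of replicates per trial (r)
    """
    phenotypic = var_g + var_ge / n_env + var_error / (n_rep * n_env)
    return var_g / phenotypic


# Invented variance components; 8 environments and 2 replicates as in the trials.
h2 = broad_sense_heritability(var_g=10.0, var_ge=8.0, var_error=16.0, n_env=8, n_rep=2)
print(round(h2, 3))
```

Because the interaction and error variances are divided by the number of environments and plots, H2 on an entry-mean basis rises as more environments are added, even when the variance components themselves do not change.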

Spearman rank correlation analysis was applied to assess pairwise relationships between measured traits. Violin plots were used to show the distribution of the 13 trait values by country and year. A box plot indicating the median and inter-quartile range was presented for each country and year combination. Average values were calculated for countries and years, by countries over years, and overall means. The significance of pairwise differences between average means of the traits between years and locations was determined by Tukey–Kramer (HSD) tests.
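For readers unfamiliar with the Spearman coefficient, a minimal sketch is given below: each trait is converted to ranks (ties receive the average rank) and the Pearson correlation of the ranks is computed. The study itself used R; this pure Python version and the example values are illustrative assumptions only.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Invented genotype means for two traits.
yield_means = [520.0, 560.0, 600.0, 640.0, 660.0]
protein_means = [14.1, 13.6, 13.2, 12.9, 12.5]
print(spearman(yield_means, protein_means))
```

Because only ranks enter the calculation, the coefficient is insensitive to the differing units and scales of the traits being compared.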

The additive main effects and multiplicative interaction model (AMMI), combining the ANOVA for the genotype and environment main effects with principal components analysis of genotype x environment interaction (Zobel et al. 1988), was applied for analyses of trait stability. The Weighted Average of Absolute Scores (WAASB) stability index proposed by Olivoto et al. (2019) was used for estimation of stability. The index was calculated as follows: \(WAASB_{i}=\sum_{k=1}^{p}\left|IPCA_{ik}\times EP_{k}\right|/\sum_{k=1}^{p}EP_{k}\), where WAASBi is the weighted average of absolute scores of the i-th genotype, IPCAik is the score of the i-th genotype in the k-th interaction principal component axis (IPCA), EPk is the variance explained by the k-th IPCA, and p is the number of IPCAs. A low WAASB value indicated high genetic stability. Most of the statistical analyses, including the calculation of Spearman correlation coefficients and Tukey–Kramer tests, were performed in R version 3.5.2, except AMMI, which was calculated using Agrobase software version SQL.
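The WAASB formula can be illustrated with a short sketch. It assumes the genotype IPCA scores and the variance explained by each axis are already available from the interaction decomposition; the scores and percentages below are invented, and the function is a plain transcription of the formula rather than the routine used in the study.

```python
def waasb(ipca_scores, explained):
    """Weighted average of absolute scores for each genotype.

    ipca_scores[i][k] is the score of genotype i on the k-th IPCA;
    explained[k] is the variance explained by the k-th IPCA.
    """
    total = sum(explained)
    return [
        sum(abs(score * ep) for score, ep in zip(row, explained)) / total
        for row in ipca_scores
    ]


# Invented example: two genotypes, two IPCAs explaining 60% and 40%.
scores = [[0.5, -0.2], [-1.0, 0.4]]
values = waasb(scores, [60.0, 40.0])
print(values)  # the first genotype has the lower value, i.e. it is more stable
```

Weighting by explained variance means that a large score on a minor axis contributes little to the index, so genotypes are ranked mainly by their behaviour on the axes that capture most of the interaction.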

Results

Meteorological data

April and June 2021 were relatively dry in all countries, especially June in Estonia, when total precipitation was only 11 mm (Fig. 1). In May, the Baltic countries experienced higher precipitation than the long-period averages (LPA). June and July were warmer than the respective LPAs in all four countries (Fig. 2). June in Estonia and July in Lithuania had the largest differences between current and LPA monthly temperatures, + 4.0 °C and + 4.2 °C, respectively. As in the Baltic countries, the Norwegian growing season of 2021 was drier (April, June, August) and hotter (June, July) than average.

Fig. 1

Total precipitation from April to August in 2021 and 2022 at the four study locations (NO, Norway; EE, Estonia; LV, Latvia; and LT, Lithuania) in comparison with respective long-period averages (LPA)

Fig. 2

Average air temperatures from April to August in 2021 and 2022 at the four study locations (NO, Norway; EE, Estonia; LV, Latvia; and LT, Lithuania) in comparison with respective long-period averages (LPA)

In 2022, Estonia and Norway experienced less precipitation than the LPA in four of five recorded months. However, Lithuania was characterized by heavy rains exceeding LPA in four of five months (April–July), which led to temporary flooding of the experimental site. June in Lithuania and July in Latvia were exceptionally rainy, with precipitation exceeding LPA by 92 mm and 84 mm, respectively. The average air temperatures in spring (April, May) were colder, whereas in summer (June, August) warmer than LPA in the Baltic countries; the air temperatures in Norway were close to the LPA. Thus, the Baltic countries experienced weather extremes during the two-year trial; a severe drought in Estonia in 2021 and excess precipitation with flooding in Lithuania in 2022. In Norway, the weather was more stable.

Trait analyses and correlations

Environment was the main source of variation for all studied traits (Table 1). The effects of genotype (G), environment (E), and GxE were highly significant for all traits.

Table 1 Analysis of variance (ANOVA) of the effects of genotype (G), environment (E) and their interaction (GxE) for all studied traits

The minimum, maximum, mean, coefficient of variation (CV), genotypic variance, GxE variance, and H2 were used to characterize the variation of studied traits (Table 2). The main traits: GY, PC, TKW, TW, and GP varied 1.9, 1.4, 1.6, 1.1 and 1.1 times, respectively. High variability was also found for other traits.
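The "times" figures above are the ratio of the largest to the smallest genotype mean, and the CV is the standard deviation expressed as a percentage of the mean. The sketch below shows both; using the sample standard deviation is an assumption, since the text does not state which estimator was used. The GY and PC ranges are those quoted in the Discussion (352–669 g m−2 and 11.7–16.4%).

```python
from statistics import mean, stdev


def fold_range(values):
    """Ratio of the maximum to the minimum value."""
    return max(values) / min(values)


def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation assumed)."""
    return 100.0 * stdev(values) / mean(values)


# Ranges of genotype means quoted in the Discussion.
print(round(fold_range([352.0, 669.0]), 1))  # GY
print(round(fold_range([11.7, 16.4]), 1))    # PC

# Invented values to show the CV calculation.
print(cv_percent([90.0, 100.0, 110.0]))
```

The fold range depends only on the two extreme genotypes, whereas the CV summarises the spread of the whole panel, which is why a trait can show a wide range and still have a modest CV.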

Table 2 Minimum (Min), maximum (Max), mean, coefficient of variation (CV), GxE variance (σ2GE), genotypic variance (σ2G), their ratio (σ2GE/σ2G) and heritability (H2) of studied traits

The CV of yield was higher (11.3%) than for the other traits. The CV values for phenology traits including the length of the GP, days for seedling emergence, and days to heading were much smaller, as was the case for test weight. For important grain yield and quality-related traits such as TKW and PC, the variation caused by genotype was considerably higher than the variation caused by GxE. On the other hand, the variation caused by GxE was higher than the genotypic variation for seedling emergence. These traits also showed the lowest heritability (H2) values. The H2 for other traits across environments was high, ranging from 0.88 to 0.95.

As shown in Fig. 3 and Table S1, grain yield (GY) showed significant (p < 0.01) negative correlations with protein content (PC) (r = − 0.37), gluten content and sedimentation values (r = − 0.41), and positive correlations with TKW (r = 0.46), plant height (r = 0.45), TW (r = 0.34), starch content (r = 0.51), and GP (r = 0.26). PC was strongly positively correlated with gluten content (r = 0.98) and sedimentation value (r = 0.95), but negatively correlated with most other traits, such as starch content (r = − 0.81), TW (r = − 0.52), TKW (r = − 0.52), and GP (r = − 0.51). Additionally, the average PCs for the 8 environments were positively correlated (r = 0.81*) with air temperatures during grain filling. Genotypes with the highest PC, such as SHA3/CBRD (16.4%) and Sport (15.8%), had shorter than average GP (Fig. S1) and low GY (Fig. S2). However, high GY (599.8–622.0 g m−2) in some genotypes (DS-655–5-DH, DS-661–11-DH, DS-661–14-DH and Daugana) was accompanied by above-average PC of 13.4–13.5%.

Fig. 3

Spearman’s rank correlation coefficients between fourteen phenotypic traits assessed on 300 spring wheat genotypes evaluated in 8 environments. Positive and negative correlations are indicated by blue and red, respectively, with the size and colour intensity representing the magnitude of the correlation

Phenotypic variability in spring wheat traits

There were considerable differences in GY between environments (Fig. 4a). GY was significantly higher in 2022 than in 2021 across all locations except Lithuania. The two-year average GYs were 501.5, 577.5, 570.9, and 563.4 g m−2 in Lithuania, Estonia, Norway and Latvia, respectively. The stretched shapes of the violin plots portray the considerable variation in GY across genotypes.

Fig. 4

Distribution of the main trait values: (a) grain yield, (b) protein content, (c) thousand kernel weight, (d) test weight, and (e) growing period by country (EE, Estonia; LT, Lithuania; LV, Latvia; NO, Norway) and year. For each country and year combination, a box plot indicating the median and inter-quartile range is shown in grey in the background. Bold black lines and black numerical values above the figures mark the average values by country and year (means without common superscript letters are significantly different); red dashed lines and red numerical values above the figure mark the average values for countries over years (means without common superscript letters are significantly different); the horizontal dotted line and the numerical value at the left side of the figure indicate the overall mean. Country and year combinations without data are empty. P-values under the figure show the significance of country (pC), year (pY) and country by year interaction (pC×Y) effects for all traits

Across environments, the highest PC was 16.1% in Latvia whereas the lowest was 9.7% in Norway, both in 2021 (Fig. 4b). PC was significantly higher in 2021 than in 2022 at all locations except for Norway, where it was significantly lower in 2021. The two-year average PC values varied significantly across locations, ranging from 10.6 to 14.8%.

Thousand kernel weight (TKW) ranged from 33.3 to 43.5 g (Fig. 4c) and was significantly higher in 2022 than in 2021 across all locations. The two-year average TKW was significantly highest in Norway (42.0 g) and lowest in Lithuania (33.7 g). Based on the shapes of the violin plots, some genotypes had considerably higher or lower TKW values than the trait means.

The values of test weight (TW) were significantly higher in 2022 than in 2021 at all locations where that trait was studied (Fig. 4d). The two-year average TW differed significantly among Estonia, Latvia and Lithuania, with Estonia showing the highest values (80.9 kg hl−1). The long lower tails of the violin plots indicated low TW values, especially for some genotypes in Lithuania in 2022.

The shortest and longest mean growing periods (GP) were in Latvia, and mean GPs were significantly longer in 2022 than 2021 in all locations except Norway (Fig. 4e).

There were significant differences in plant emergence and early vigour between six environments (Fig. S3 a, b). Seedling emergence depended on the weather conditions after sowing and was shortest in Latvia in 2021 (12 days) and longest in Estonia in 2021 (23 days). Early vigour was significantly higher in 2021 in Estonia and Latvia and lower in Lithuania compared to 2022.

The average period from sowing to heading varied from 53 to 68 days in different environments depending on weather conditions (Fig. S3 c). There were significant differences between the two-year average values of the period from sowing to heading for all locations. The shortest period from sowing to heading was in Latvia (59 days).

Average plant height ranged from 74.0 to 95.4 cm (Fig. S3d); shortest in Estonia in 2021, and tallest in Latvia in 2022. The average plant height was significantly higher in 2022 compared to 2021 at all locations except Norway. The two-year average plant heights in Estonia and Lithuania were significantly shorter than those in Latvia and Norway. Violin plots of plant height showed high diversity among the genotypes. The average estimated lodging values of plants over environments varied from 5.08 to 9.00 (Fig. S4a). The most severe lodging was in 2021 in Latvia. Two-year average lodging was highest in Latvia.

Average gluten contents varied from 22.1 (Lithuania, 2022) to 35.3% (Latvia, 2021) and sedimentation values from 30.4 (Lithuania, 2022) to 63.4 ml (Latvia, 2021) (Fig. S4 b, c). Starch content was significantly higher in 2022 than 2021 at all locations (Fig. S4 d). The highest starch content was in Estonia in 2022 (69.3%) and the lowest in Latvia in the same year (63.8%).

Stability analyses of the grain yield and main quality traits

No significant relationship between GY and WAASB stability index was detected (Fig. 5). The top-yielding (658.6–669.1 g m−2) genotypes Kanyuk, Cornetto, DS-779–2-DH, and Leidi exhibited below-average stability rankings. An exception from this group was DS-674–9-DH (GY 660.6 g m−2), which ranked among the more stable ones (1.16). Genotypes DS-762–3-DH, DS-14 and DS-10–18-DH also showed high (644.4–655.3 g m−2) and stable yields, whereas the genotypes CKV 13–3, Tybalt and Uffo had lower but highly stable yields. The GY of the most stable genotype, 10 kV 3 FKV 75 (531.7 g m−2), was below average.

Fig. 5

Relationship between average grain yield (g m−2) and weighted average of absolute scores (WAASB) for best linear unbiased prediction of the GxE interaction of 300 spring wheat genotypes evaluated in 8 environments

There was a significant negative relationship between PC and stability (Fig. 6); genotypes with the highest PC, SHA3/CBRD (16.4%), Sport (15.8%), N894037 (15.7%), Sumai #3–1 (15.6%), and Dulus (15.4%), exhibited relatively low WAASB stability rankings. Cervino (14.8%), Tjalve (14.0%) and Polkka (14.0%) combined good PC (14–15%) and high stability ranking; genotypes 10 kV 3 FKV 75 (13.0%), 15 kV 4 K31 (13.0%) and KWS Akvilon (12.6%) had the best stability rankings (3rd, 1st, and 2nd, respectively), but lower than average PC.

Fig. 6

Relationship between average protein content (%) and weighted average of absolute scores (WAASB) for best linear unbiased prediction of GxE interaction of 300 spring wheat genotypes evaluated in 8 environments

Unlike several other traits, high TW correlated with high stability (Fig. 7). Even though genotypes with the highest TW: Vanek (82.1 kg hl−1), KWS Mistral (81.2 kg hl−1), DS-720–2-DH (80.9 kg hl−1), and Carasso (80.9 kg hl−1), exhibited average stability rankings by WAASB, there were several genotypes that combined both high and stable TW: e.g., Soliat (80.9 kg hl−1), Paros (80.8 kg hl−1), and Crickett (80.8 kg hl−1), and the most stable genotype, 990–2.

Fig. 7

Relationship between average test weight (kg hl−1) and weighted average of absolute scores (WAASB) for the best linear unbiased prediction of GxE interaction of 300 spring wheat genotypes evaluated in 8 environments

Increased TKW was accompanied by a slight trend towards higher instability (Fig. 8). Genotypes with the highest TKW, such as DS-732–5-DH (47.2 g), TUI/RL4137 (47.0 g), DS-720–3-DH (46.8 g), DS-655–5-DH (45.9 g), DS-661–11-DH (45.8 g), and Parabola (45.6 g), were relatively unstable. Genotypes Irsijevskaja (43.7 g), Cornetto (43.3 g), and Telimena (42.3 g) showed stable and high TKW, whereas the most stable genotype, Krabat, had below-average TKW (33.8 g).

Fig. 8

Relationship between average thousand kernel weight (g) and weighted average of absolute scores (WAASB) for the best linear unbiased prediction of the GxE interaction of 300 spring wheat genotypes evaluated in 8 environments

Interestingly, genotypes with the highest GP stability were characterized by average length of GP: 96–54-1796 (105 days), DS-720–2-DH (105 days), and Sonett (105 days) (Fig. 9). Genotypes with the shortest GP, Manu and T2038 (both 101 days), as well as genotypes with the longest GP, DS-530–10-DH (110 days) and DS-638–5-DH (109 days), were the most unstable.

Fig. 9

Relationship between average length of growing period (days) and weighted average of absolute scores (WAASB) for the best linear unbiased prediction of the GxE interaction of 300 spring wheat genotypes evaluated in 8 environments

Genotypes showing high yield and superior quality traits

We identified five groups of genotypes with superior gene combinations (Table 3). The first group of four genotypes combined high GY (591–669 g m−2) with high TKW (40.6–47.2 g); the second group of three cultivars combined high GY with high TW (79.0–80.1 kg hl−1); the third group of five genotypes combined high GY, high TKW and high TW; the fourth group of three genotypes combined high GY with above-average PC (13.4–13.5%); and the fifth group of four lines combined high GY with high or above-average values of PC, TKW and TW.

Table 3 Ranking of the superior genotypes based on their yield x trait x stability values

Discussion

The NOBALwheat germplasm collection consisted of commercial spring wheat varieties and breeding lines bred for the Nordic-Baltic climate. The present study focused on identification of diversity and selection of genotypes with balanced and stable grain yield (GY) combined with superior grain quality traits with the vision of these genotypes being used in future wheat improvement. The lack of genotypes adapted to diverse growing conditions is a key problem in breeding climate-resilient wheat varieties (Reynolds et al. 2021) and this research provides information to address this task.

Meteorological conditions

Two years of contrasting weather conditions, especially in the Baltic countries, contributed to the identification of genotypes adapted to different weather conditions. As periods of drought and excessive precipitation are predicted to increase in the Baltic countries, there is a need to identify genotypes adapted to such conditions (Maciulyte et al. 2023; Rimkus et al. 2020; Tammets 2010). In Norway, the weather conditions fluctuated less, as was also found in a long-term study (Mohammadi et al. 2023). Among the important factors affecting yield and other traits, weather conditions must be considered when breeding high-yielding, drought-tolerant cultivars (Chawade et al. 2018). Nóia Júnior et al. (2023) emphasised the need to breed wheat varieties with an adaptive capacity for cultivation in more variable environments.

Weather conditions in 2022 favoured spring wheat growth and yield formation in Estonia, Latvia and Norway. However, Lithuania experienced heavy rainfall (seven daily events > 20 mm) during critical wheat development phases (before heading and during grain filling). At the same time, Estonia escaped excess precipitation events (no daily events > 20 mm) and, therefore, some genotypes achieved grain yields higher than 9 t ha−1. The situation was reversed in 2021, when all four countries had drier than average conditions, especially during June in Estonia. The fluctuations in the weather during the trial years confirmed the increase in the extremes of weather conditions in recent years.

Trait analyses and correlations

For crop improvement, the initial parental material should have appropriate genetic variability (Iqbal et al. 2022). We found considerable diversity of GY, quality and other traits within the NOBALwheat panel that consisted of locally adapted germplasm. GY and protein content (PC) are critical traits that largely determine the economic value of grain. Together with thousand kernel weight (TKW) and test weight (TW), these are major wheat selection traits in breeding (Koppel et al. 2020; Koppel and Ingver 2008).

An almost two-fold difference in GY (352–669 g m−2) was detected in the study whereas the PCs were generally high (11.7–16.4%). Even wider variation in GY and PC was reported for spring wheat by Koppel et al. (2020), but in other European studies the variation was similar to the present results (Knapp et al. 2017; Kyratzis et al. 2022; Nehe et al. 2019). Variation in TKW between genotypes was higher than for TW as commonly found in other studies (Knapp et al. 2017; Kyratzis et al. 2022).

Despite a considerable difference between the minimum and maximum values of the length of the growing period (GP) (101–110 days), variation in this trait was low. In our study, typical correlations between GY and several other traits were confirmed, such as: (1) an inverse relationship between GY and PC (Guardia-Velarde et al. 2023; Koppel et al. 2020; Lama et al. 2023; Mroz et al. 2022; Tanin et al. 2022); and (2) a positive correlation between GY and TKW (Al-Ashkar et al. 2023; Baye et al. 2020; Guardia-Velarde et al. 2023; Mroz et al. 2022; Tanin et al. 2022). In contrast to Guardia-Velarde et al. (2023), our results indicated a positive relationship between GY and TW.

G × E variations were significant for all traits (Table 1). Significant G × E effects for GY and main quality traits (PC, TW, and TKW) were also identified in other studies (Guzmán et al. 2016; Kyratzis et al. 2022; Nagarajan et al. 2007). In contrast, Nehe et al. (2019) found no significant impact of GxE for GY and quality traits, except for TW. Al-Ashkar et al. (2023) reported significant effects of GxE on GY, accounting for 20% of the total variation. For TKW, the GxE variance was only 18% of the genotypic variance (Table 2); thus, genotypes responded more similarly to the variable environmental conditions. For GY, PC, and TW, the GxE variance was larger relative to the genotypic variance.

Heritabilities (H2) were high (0.88–0.95) for GY, PC, TKW, and TW, indicating that these traits could be easily selected in breeding. High or even higher H2 for GY and TW was reported by Kumar et al. (2014), whereas lower H2 values of GY, PC and TW were obtained in Kyratzis et al. (2022) and Yabwalo et al. (2018).

Phenotypic variability of spring wheat traits

The largest difference in average GY over two years was for Estonia, where the yield of the drought year of 2021 was only 60% of the yield for 2022. In Lithuania, the yield potential was not realized because of insufficient moisture in 2021, and excess precipitation in 2022 that led to short-term flooding. Zampieri et al. (2017) stated that wheat can be even more sensitive to water excess than to drought. Some genotypes in our panel were better adapted to extreme weather conditions and produced high GYs even in adverse weather conditions. Even greater yield differences between environments were obtained by Kyratzis et al. (2022), where the average GY of twenty durum genotypes exceeded 7,000 kg ha−1 under favourable conditions and was less than 2,000 kg ha−1 under stress during grain filling. Large differences in GY (1.3–5.2 t ha−1) among 299 wheat genotypes between 11 environments across the USA were reported by Grogan et al. (2016). Subira et al. (2015) found more than four-fold and Knapp et al. (2017) two-fold differences in wheat GY between high and low yielding environments.

Several studies have demonstrated the negative impact of extreme weather events on yield and quality traits in wheat. For example, recent results by Sareen et al. (2014) indicated that heat, drought, or combined stress reduced wheat yield by 17–49%, whereas Nóia Júnior et al. (2023) reported a yield reduction of up to 72% in France in 2016, caused mainly by excess precipitation in combination with diseases, low solar radiation, and anoxia.

PC is largely influenced by weather conditions and this was also the case in our study, where PC was higher in 2021 in the Baltic countries due to drier than average conditions. In Norway, PC remained low in both years as growing conditions during grain filling favoured formation of high TKW and a high proportion of starch. Similarly, Knapp et al. (2017) demonstrated an increase in PC from 11.9 to 15.4% in Swiss and from 13.8 to 16.4% in Austrian datasets due to variable growing conditions, and Sattar et al. (2020) reported an increase from 8.8 to 14.6% caused by combined heat and drought stress.

In contrast to PC, TW and TKW were lower due to drought stress in the locations where these traits were assessed. A decrease of up to 52% in TKW was attributed to heat and drought stress by Sattar et al. (2020). A reduction of TKW by 15–34% caused by heat, drought, or combined stress was reported by Sareen et al. (2023). An earlier study by Sareen et al. (2014) showed a reduction in TKW of 14.1% under heat stress conditions, but a 1.1% increase under drought conditions, attributed to a compensatory effect offsetting the decrease in number of kernels per spike.

Variation in TW remained between 73.9 and 82.1 kg hl−1 and in TKW between 32.3 and 43.5 g in our trial. In the study of Knapp et al. (2017), TW displayed variation of the same magnitude (77.0–84.5 kg hl−1 in Swiss and 77.5–82.7 kg hl−1 in Austrian datasets).

GP was shorter in drier conditions in the Baltic countries than in Norway, where weather conditions were more stable and average GPs were similar in both years. Similarly, GP was shown to decrease by 4.5 and 10.7% in drought and heat stress conditions by Sareen et al. (2017).

Stability of grain yield and main quality traits

Stability of yield and quality across environments is an important breeding objective. The AMMI model based WAASB index proposed by Olivoto et al. (2019) was used in data processing to identify genotypes with stable yield and quality traits. The highest yielding genotypes did not show the highest yield stability, and there was no correlation between GY and its stability. However, we found genotypes that combined high yield with high stability. These genotypes (CKV 13–3, Tybalt, and Uffo) achieved close to 6 t ha−1 in yield and were ranked among the five most stable ones. Similarly, varieties with the highest GYs were not among the more stable ones when assessed by AMMI stability values (ASV) in the study by Al-Ashkar et al. (2023). Grogan et al. (2016) found a negative relationship between stability and GY of winter wheat in a multi-location trial involving numerous genotypes.

Although there was a significant negative correlation between PC and its stability, we identified three genotypes (Cervino, Tjalve, and Polkka) that combined high PC (14–15%) and high stability. In contrast, Tanin et al. (2022) found no clear relationship between the level and stability of PC.

There was a negative correlation between TKW and its stability, and the genotypes with the highest TKW showed lower than average stability in this trait. The best-balanced genotypes in terms of the level and stability of TKW were Irsijevskaja, Cornetto, and Telimena. Contrary to our results, Ali (2017) reported a slight positive trend between TKW and AMMI ASV. Some genotypes with high TKW also had good stability. Karimizadeh et al. (2012) recorded high stability of durum genotypes with small, medium, and high TKW values, but there was no relationship between stability and TKW among genotypes. Lama et al. (2023) reported that some spring wheat genotypes with the highest stability had the lowest mean values for TKW.
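When reading correlations between a trait and its stability, the direction of the stability index matters: for indices such as WAASB or ASV, a lower value means a more stable genotype. The sketch below illustrates this with a Pearson correlation. The genotype values are invented purely for illustration and are not results from this study, and the helper name pearson_r is hypothetical.

```python
import numpy as np

# Hypothetical genotype-level values, invented for illustration only.
trait_mean = np.array([34.0, 36.5, 38.0, 40.5, 42.0])  # e.g. TKW in g
index_value = np.array([0.20, 0.35, 0.30, 0.55, 0.60])  # lower = more stable

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

# Correlation between the trait level and the raw index value.
r_index = pearson_r(trait_mean, index_value)
# Stability runs opposite to the index, so the sign flips.
r_stability = -r_index

print(f"trait vs index value: r = {r_index:.2f}")
print(f"trait vs stability:   r = {r_stability:.2f}")
```

Here a positive correlation with the index value corresponds to a negative correlation between trait level and stability, so the sign convention should be checked before comparing studies that use different indices.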

Contrary to PC, TW stability increased along with TW. Despite this, the genotypes with the highest TW did not show the highest stability. Genotypes Soliat, Paros, and Crickett were characterized by high stability and TW. Yabwalo et al. (2018) likewise found some spring wheat varieties combining high TW with high stability; other varieties had high TW but low stability across 10 environments. Karimizadeh et al. (2012) reported a negative trend between AMMI stability and the level of TW, although the most stable genotype also had a high TW.

Interestingly, the stability of GP was highest for the genotypes characterized by average length of GP. As the GP shortened or extended, there was a clear reduction in stability. GP varied significantly depending on temperature and rainfall in a long-term study (Kheiri et al. 2021). Differences between cultivars in the stability of grain filling duration were shown by Wu et al. (2018) in a multi-environment experiment using AMMI ASV stability analyses.

Genotypes with high yield and superior quality traits

An essential aim of breeding is to combine high yield potential with superior quality traits. Most high-yielding genotypes in this study had lower PC, whereas TKW and TW correlated positively with yield. In the studied NOBALwheat panel of 300 genotypes, we found some lines and varieties that had high yield combined with one or more important quality traits.

Conclusions

Weather conditions were variable during the two study years, especially in the Baltic countries. Extreme weather events such as severe drought and excessive rainfall occurred, consistent with the general trend towards a less predictable climate in the region. The yield and quality traits of spring wheat genotypes were highly variable, mainly due to the differing environments. GxE for all studied traits was highly significant. However, we identified genotypes that were consistent in producing high grain yield and quality under variable conditions. These genotypes have a high potential to adapt to a changing climate. Although grain yield and yield stability were not correlated, we were able to identify promising genotypes with high, stable yield and above-average quality. Based on the results, we selected 19 genotypes with high yield and superior quality traits. Taken together, a collaborative multi-national approach comprising a large panel of genotypes from the Baltic countries and Norway phenotyped across the region enabled selection of a number of potential breeding parents to address future Northern European climatic conditions and thereby improve wheat production in the region.