Background

Ethanol is a biofuel produced by Saccharomyces cerevisiae via the fermentation of hexose sugars derived from feedstocks, such as corn, sugarcane, and beet [1]. Further, this yeast species is the dominant organism driving fermentation during the manufacture of historically and economically important beverages, including wine, beer, and sake [2]. One of the reasons that S. cerevisiae predominates in alcoholic fermentations is related to its metabolic preference for ethanol production over respiratory processes, even in the presence of oxygen (i.e., the Crabtree effect). In addition, S. cerevisiae has an exceptional tolerance to life-restricting ethanol levels that occur toward the end of fermentations, potentially allowing this species to outcompete other microorganisms for ecological dominance [2].

Ethanol injures organisms primarily by disrupting the phospholipid structure of the cell membrane, which increases permeability and dissipates the electrochemical gradient [3, 4]. Damage to the cell membrane hinders the function of protein transporters and uptake of nutrients. Moreover, ethanol causes inner-cell damage by denaturing proteins and impairing mitochondrial function, provoking the formation of reactive oxygen species [3,4,5]. Furthermore, ethanol has mutagenic effects by interfering with the DNA replication system [6]. Despite an intrinsically high alcoholic tolerance, S. cerevisiae strains used in industrial processes are inevitably exposed to the detrimental effects of ethanol build-up at the final stage of fermentation, which inhibits yeast metabolism and thereby ethanol production [7, 8]. Therefore, industrial yeast strains with higher ethanol tolerance could more efficiently convert sugar into alcohol, improving ethanol yields while lowering distillation costs and reducing the waste footprint [7,8,9,10]. For a large-scale ethanol plant operating daily with fermentation vats having capacities > one million liters, even a minimal gain in ethanol yield per fermentation batch represents a considerable increment in annual production [9].

To increase such economic gains provided by alcohol-tolerant yeasts, we intended to design selection screens for isolating strains that are more suitable for the bioethanol industry. Consequently, how ethanol tolerance is defined and quantified should be considered [11, 12]. Evaluating the growth capacity of yeast cells in the presence of high ethanol concentrations is the most commonly used approach for assessing alcoholic tolerance [11,12,13]. Cell propagation under restrictive ethanol levels provides a practical selection method, permitting estimation of alcoholic tolerance via spot assays, maximum specific growth rates (µmax) on growth curves, or competition assays against reference strains. However, a higher growth rate in the presence of high ethanol concentrations has only shown a weak to moderate correlation with a higher ethanol production capacity in fermentations, indicating that the two phenotypic traits have a partially distinct genetic basis [7, 10]. Furthermore, strains displaying tolerance to moderate ethanol levels (e.g., 6–8% v/v) under laboratory conditions sometimes do not exhibit the best performance under the higher ethanol concentrations (12%–17% v/v) reached in industrial fermentations [7]. This implies the existence of various genetic factors controlling yeast tolerance to distinct ethanol levels [7, 13]. Another drawback is that several studies conducted to assess ethanol tolerance have used laboratory strains that are usually stress sensitive, making it difficult to extrapolate the results to industrial yeasts [11].

The ability to produce ethanol under very high gravity fermentations (reaching ethanol titers of 15%–17% v/v) is a more realistic proxy for assessing ethanol tolerance under industrial conditions [10, 11]. However, using high ethanol production capacity as a benchmark in selection screenings is rather impractical because it requires fermentations lasting for several days and quantification of ethanol accumulating in several individual strains [10]. Alternatively, proliferation at moderate (e.g., 10%–12% v/v) [7] and high (17% v/v) [8] ethanol concentrations has been used for selecting strains demonstrating high ethanol accumulation. Through these various approaches, quantitative trait loci (QTLs) mapping studies have revealed the genetic basis of maximum ethanol production capacity in yeasts by identifying specific causative alleles related to genes MKT1, APJ1, VPS70, and KIN3 [8, 10, 14].

Another commonly used parameter to evaluate ethanol tolerance is cell survival at prohibitive ethanol concentrations [7, 11, 12]. Ability to survive a few hours of exposure to very high ethanol levels (hereafter referred to as ethanol shocks) provides an easy method for screening tolerant yeasts [7, 15]. Selection schemes using ethanol shocks can be applied in bulk, and high lethality facilitates selective sweeps, wherein adaptive genetic variants swiftly reach dominance in a population [15, 16]. However, an increase in fitness for surviving harsh treatments may compromise fitness in another condition [15,16,17]. Adaptation to severe stresses is even more prone to trade-off effects associated with low growth fitness; cells tend to reallocate energy resources, normally used for propagation, to overcome a life-threatening condition [18,19,20]. Therefore, the main challenge of genetic screenings and adaptive laboratory evolution (ALE) schemes is isolating stress-resistant yeast strains that exhibit good growth and fermentation performance [7, 15, 17, 21].

Herein, we used an ALE protocol that combined ethanol shocks with a growth recovery period to balance the selection of strains able to tolerate acute ethanol treatments with minimal losses in growth fitness. ALE is used to isolate alcohol-tolerant yeasts, mostly via propagation in the presence of ethanol [13, 22,23,24,25,26,27,28,29,30,31]. Although ALE experimental designs have also used drastic ethanol treatments [32, 33], numerous questions regarding the outcome and utility of this selection method remain unanswered. These questions pertain to the extent to which attained tolerance to ethanol shocks causes a decline in growth fitness. Furthermore, much regarding the mutational spectrum underlying tolerance to drastic ethanol treatments remains unknown. For example, do all the adaptive alleles for survival in ethanol shocks negatively impact growth rates, or do some mutations benefit growth in the presence of ethanol, supporting higher ethanol yields during fermentation? Another important question is whether yeast selection via harsh ethanol treatments could yield robust strains with superior performance in bioprocesses involving various stress factors; this question is pertinent to Brazilian bioethanol production from sugarcane substrates, which involves high cell densities wherein yeasts are exposed to fluctuations in osmotic pressure, temperature, microbial contamination, toxic compounds, and a sharp sulfuric acid shock intercalating sequential fermentation cycles [1, 34, 35].

Results

Yeast ALE with drastic ethanol treatments

We created four haploid ALE populations (P1–P4, Fig. 1A) of the Brazilian bioethanol strain S. cerevisiae PE-2 [34]. P1, P2, and P3 are replicate MATa populations derived from a colony propagated from the progenitor PE-2_H4 (Additional file 1: Table S1), whereas P4 originated from a MATα spore resulting from a cross between PE-2_H3 and PE-2_H4 [36]. For ALE, P1–P4 populations were subjected to several cycles consisting of ethanol shocks for 2 h followed by recovery periods of 2–4 days of growth in liquid yeast extract-peptone plus 2% sucrose (YPS) (Fig. 1A). For each ethanol shock, 1 mL (about 10⁸ cells) from the previous recovery growth was transferred into a 1.5 mL microtube, pelleted by centrifugation, washed, and resuspended in 1 mL PBS 1X buffer with ethanol (beginning with 19% v/v) to initiate a 2-h treatment at 32 °C. After the ethanol exposure, cells were centrifuged, and the pellet was washed, resuspended, and transferred into 20 mL YPS for a 2–4-day recovery growth (32 °C, 60 rpm). After recovery, 1 mL of cells was taken to the next shock/recovery cycle. To challenge the yeast adaptation, the 2-h shocks entailed progressive increments of ethanol concentration from 19% up to 30% (v/v) (Fig. 1B). At the end of the evolution experiment, a single colony (clone) was selected from solid YPS medium cultures (plus 8% v/v ethanol) of each population (Fig. 1A). Isolated clones P1c–P4c were subjected to whole-genome sequencing (WGS), and P1c–P3c were phenotypically characterized. The isolated P4c was excluded from most fitness analyses as the P4 population evolved a snowflake-type aggregation (see below), which precluded the detection of individual cells through flow cytometry and colony-forming assays.

Fig. 1

ALE experiment with ethanol treatments. A The scheme of ALE wherein populations P1–P4 were subjected to repeated cycles of 2-h ethanol shocks and outgrowth. At the end of the experiment, one colony from solid medium with 8% (v/v) ethanol was isolated from each population for WGS. B The increase in ethanol concentrations (red) is plotted according to the number of shock/recovery cycles for each population. C The progenitor (Prog.), P1c, P2c, and P3c were propagated in YPS for five cycles, then separately mixed in equal proportions with the reference tester-GFP. A competition assay was performed through one cycle of ethanol shock (24% v/v for 2 h) and growth recovery in YPS. Flow cytometry recorded the proportion of cells for each competitor just before the ethanol shock (Start) and at the stationary phase following recovery growth (End). D Selection coefficients (S) were calculated for nonadapted competitors (panel C, Additional file 1: Table S2) and for cells adapted by five progressive ethanol treatments (EtOH-adapted). S > 0 indicates that the cell numbers of evolved clones largely exceeded those of the progenitor at the competition end point. S was normalized with data obtained for the progenitor. E For cellular viability estimation, after 19% (v/v) ethanol shocks, cells were diluted and plated onto solid medium for counting the resulting colonies. For trehalose quantification, postshocked cells were recovered in YPS. Quantification was obtained from 10⁸ cells by the enzymatic conversion of trehalose into glucose. (*) p < 0.001, one-way ANOVA followed by Bonferroni post-test for multiple comparisons

To test whether the increase in ethanol tolerance observed during ALE (Fig. 1B) was a stable trait, we propagated P1c, P2c, and P3c over five passages in YPS without ethanol. Then, competition assays were performed wherein the progenitor and evolved clones P1c–P3c were separately mixed in equal proportions with a green fluorescent protein (GFP)-expressing PE-2_H4 strain (tester-GFP). The competitors were exposed to a 24% (v/v) ethanol shock for 2 h, followed by a recovery in YPS until the stationary phase. The proportion of each competitor at initial and final points of the assay was determined using a flow cytometer (Attune NxT, Thermo Fisher Scientific) (Fig. 1C and Additional file 1: Table S2). The ALE progenitor showed a slight decrease in proportion when competing with the tester-GFP, whereas P1c–P3c outcompeted the tester-GFP by a large margin (Fig. 1C). The selection coefficients, expressing the relative fitness of the evolved clones to the progenitor (see methods for calculations), demonstrated a pronounced improvement during evolution (Fig. 1D). We repeated the fitness assays by subjecting all competitors to a preacclimation (i.e., progressive exposures to 15%, 15%, 20%, 20%, and 22% v/v ethanol) before the 24% (v/v) ethanol shock. After ethanol treatment and recovery, the selection coefficients for preadapted P1c–P3c were only slightly higher than those of the nonadapted clones, and only significant for P1c (Fig. 1D). Ethanol tolerance of the evolved clones was also characterized by a marked increase in survival rates following 2-h treatments with 19% (v/v) ethanol (Fig. 1E). Interestingly, following exposure to ethanol, trehalose accumulation by P1c–P4c demonstrated similar improvements as their postshock fitness and survival rates, indicating a common mechanism for ethanol tolerance (Fig. 1E). For example, trehalose content was highest in P1c, which also displayed the highest survival rate and selection coefficient following the ethanol shock.
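The selection coefficient calculation is described in the methods, which are not reproduced in this section. As a minimal sketch, assuming a commonly used definition (the natural logarithm of the change in the test-to-tester ratio between the start and end of the assay, normalized by subtracting the value obtained for the progenitor), the computation could look as follows. The function names and the example fractions are hypothetical and are not data from this study.

```python
import math


def selection_coefficient(start_fraction, end_fraction):
    """Return ln(end_ratio / start_ratio) for one competition assay.

    ASSUMPTION: this log-ratio definition is a common convention, not
    necessarily the formula used in the study's methods.
    Each fraction is the share of non-GFP (test) cells measured by flow
    cytometry; the tester-GFP share is taken as 1 - fraction.
    """
    for fraction in (start_fraction, end_fraction):
        if not 0.0 < fraction < 1.0:
            raise ValueError("fractions must lie strictly between 0 and 1")
    start_ratio = start_fraction / (1.0 - start_fraction)
    end_ratio = end_fraction / (1.0 - end_fraction)
    return math.log(end_ratio / start_ratio)


def normalized_s(test_assay, reference_assay):
    """Subtract the reference (progenitor vs. tester) S from the test S."""
    return selection_coefficient(*test_assay) - selection_coefficient(*reference_assay)


# Hypothetical (start, end) fractions, for illustration only.
evolved_clone = (0.50, 0.90)
progenitor = (0.50, 0.45)
s_relative = normalized_s(evolved_clone, progenitor)
```

Under this convention, a positive value means the tested clone gained ground on the tester-GFP relative to how the progenitor fared in the same assay.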

WGS of evolved clones

The Illumina MiSeq platform was used for sequencing the genomic DNA of P1c, P2c, P3c, and P4c. Variant calling for P1c, P2c, and P3c was accomplished using the PE-2_H4 genome as a reference, whereas the P4c read mappings to both PE-2_H3 and PE-2_H4 parental genomes [36] were used to identify de novo single nucleotide polymorphisms (SNPs) not present in either reference genome. We cataloged 46 mutations in total (Fig. 2A and Additional file 1: Table S3). We identified potentially adaptive alleles related to genes that were mutated in the various populations (Fig. 2A–C). The gene ATH1 (encoding the acid trehalase [37]) had defective mutations in P1c, P2c, and P3c (Fig. 2A–C). CYR1 (adenylate cyclase [38]) exhibits single nucleotide polymorphisms in P1c and P2c, whereas PTR2 (Peptide TRansport 2 [39]) is mutated in P3c and P4c. MDS3 (Mck1 Dosage Suppressor 3 [40]), ROM2 (Guanine nucleotide exchange factor for Rho1p and Rho2p [41]), and USV1 (a C2H2 zinc finger transcription factor [42]) display single nucleotide variants in P1c and P3c. Mutations in different genes acting on the same metabolic or signaling pathway imply parallelism. Mutations affecting genes related to the cAMP/PKA pathway such as CYR1, MDS3 and its paralog PMD1 [40], RAS2 (GTP-binding protein regulator of Cyr1p [43]) and IRA2 (GTPase activator that negatively regulates Ras1/2p [44]) are dominant in our dataset. In addition to the three inactive ath1 alleles, NTH1 (encoding the neutral trehalase) had a frameshift mutation in P3c, indicating that blocked trehalose catabolism was under selection during ALE. This triage of mutations was important for guiding our reverse-engineering program described below.

Fig. 2

WGS data from P1c, P2c, P3c, and P4c. A Venn diagram displaying unique mutated genes found for each clone, and genes that had mutations in more than one clone (intersections between circles). Highlighted in bold black font are genes whose mutated alleles are important for this study. B Sanger sequencing of selected alleles in progenitors, final, and intermediate populations. Gene names in uppercase indicate the alleles found in progenitors. Gene names in lowercase indicate the mutated alleles at the point they were first detected by Sanger sequencing. When progenitor and evolved alleles are depicted, the evolved alleles are at an intermediate frequency at that sampled point. C List of key mutated genes and their encoded proteins important for further analysis in this work

We confirmed a selected set of mutations via Sanger sequencing of PCR fragments amplified from the progenitor and final populations (Fig. 2B and Additional file 2: Appendix S1). With the exception of the ira2 allele (observed in the P3c, but not in the P3 final population), we confirmed that all mutations were present in the corresponding final populations, whereas, as expected, progenitors had wild-type alleles. Furthermore, we Sanger sequenced the same loci in intermediate ALE populations sampled following ethanol shocks 20, 40, and 60 (Fig. 2B and Additional file 2: Appendix S1), allowing us to identify when the mutation became dominant in the population (i.e., there was a unique peak in the corresponding position on the chromatogram) or when alleles had intermediate frequencies in the population (i.e., overlapping peaks corresponding to the wild-type and evolved alleles were present on the chromatogram). Generally, mutations with large fitness effects appear early during ALE experiments [15, 16]. Therefore, in our evolution experiment, cyr1, mds3, rom2, ras2, and ptr2 alleles (all present at shock 20) are the most likely drivers of ethanol tolerance (Fig. 2B). Interestingly, although three ath1 defective alleles appeared in different populations, they emerged only later during ALE (Fig. 2B), suggesting that the earlier mutations (e.g., cyr1, mds3, and pmd1 that affected the cAMP/PKA pathway and increased trehalose accumulation) could have potentiated their appearance.

Evolved clones increased Msn2/4-mediated stress responses

Mutations affecting the cAMP/PKA pathway occurred in all four ALE populations. Disruptive mutations, such as those present in RAS2 (P4), MDS3 (P2 and P3), PMD1 (P3), and nonsynonymous mutations in CYR1 (P1 and P2), can downregulate protein kinase A (PKA) and its downstream responses (Fig. 3A) [38, 40, 45, 46]. If PKA is inhibited, transcription factors Msn2/4 are retained in an unphosphorylated form and shuttle into the nucleus to drive the expression of stress-responsive genes [46]. One such stress-regulated gene is HSP12, which encodes a heat-shock protein, whose expression when fused with GFP has been used as a standard biosensor to measure Msn2/4 regulation (Fig. 3A) [45]. We introduced a chromosome-integrated HSP12-GFP biosensor into P1c, P2c, P3c, and in the parental PE-2_H4 backgrounds. Furthermore, the biosensor was introduced into a strain reverse-engineered with the P1c allele cyr1A1474T (Fig. 3B–D and Additional file 3: Fig. S1A-D). Relative to the PE-2_H4 reference, the P1c–P3c displayed a constitutive upregulation of the HSP12-GFP expression; the expression was more pronounced in the P2c (Fig. 3B and C, Additional file 3: Fig. S1). Moreover, the cyr1A1474T reverse-engineered strain exhibited a slightly higher HSP12-GFP expression than that of its reference PE-2_H4. The higher GFP expression in evolved clones than in the parental strain seems to be constitutive, and not dependent on the ethanol treatment (Fig. 3D), supporting the hypothesis that evolved clones, carrying mutations in CYR1, RAS2, MDS3, and PMD1 (Fig. 2), display a lower PKA-activation phenotype wherein Msn2/4p is activated and upregulates the expression of stress-responsive genes [38, 40, 45,46,47]. The boost in trehalose accumulation, demonstrated by the higher content of trehalose observed in the evolved P1c–P4c (Fig. 1E), is typical of strains with low PKA activity [38, 47].

Fig. 3

Upregulation of Msn2/4p-mediated stress responses in evolved clones. A Scheme depicting the HSP12-GFP biosensor to measure Msn2/4p-mediated stress responses. Msn2/4p binds to STRE elements in the HSP12 promoter, inducing biosensor expression. Components of the cAMP/PKA pathway controlling the Msn2/4p function have been shown. Mutations, found through WGS of evolved clones, affect molecular components controlling the PKA, putatively resulting in upregulation of Msn2/4p-mediated transcription. The biosensor was integrated into the P1c, P2c, P3c, and cyr1A1474T backgrounds. B Time course of GFP fluorescence signal, obtained at the stationary (sta) and logarithmic (log) growth phases, during four serial transfers in YPS (without ethanol). Fluorescence measurements are expressed as log2 of fluorescence fold changes (FC) relative to the PE-2_H4 signal at the same time point. C Time course of GFP fluorescence along 24 h in cells propagating in 8% (v/v) ethanol. Data represent log2 FC relative to the PE-2_H4 fluorescence at the same time point. D Relative fluorescence FC at 8% (v/v) ethanol were obtained by comparing with the cells propagating without ethanol at the same time point. Values indicate the mean of measurements obtained from three independent replicates

Gains in tolerance to ethanol shocks involve growth fitness losses

We reverse-engineered key evolved mutations into the parental background PE-2_H4 (Additional file 1: Tables S1 and S4; Additional file 4: Text S1). Loss-of-function mutations (i.e., premature stop codons or frameshift mutations) were mimicked by insertional disruption of genes with the MX cassette [48]. In other instances, coding frame deletions were edited using the CRISPR/Cas9 EasyGuide method [49]. Furthermore, point mutations were introduced via CRISPR/Cas9. Thus, a series of single- and double-mutants was constructed and tested via three different competition assays against the tester-GFP (Fig. 4A) (Additional file 1: Table S5).

Fig. 4

Ethanol tolerance of ALE clones and reverse-engineered strains. A Scheme showing three assays wherein a strain under test was competed against a GFP-marked parental PE-2_H4 (tester-GFP). Competition 1 was based on 20% (v/v) ethanol treatments for 2 h followed by recovery growth and flow cytometry measurements. Competitions 2 and 3 involved propagations in liquid YPS medium without ethanol (approximately 8–10 doublings) and with 8% (v/v) ethanol (approximately 5–7 doublings), respectively. B Initial and final ratios of competitors (Additional file 1: Table S5) were used to calculate the selection coefficients (S) of each tested strain for assays 1 (left), 2, and 3 (right) normalized with S obtained for competitions of the reference strains ALE progenitor (Prog), PE-2_H4 (H4), and PE-2_H4::kanMX (H4::MX) against the tester-GFP. Values were obtained from three replicates. For competitions 2 and 3 the S values are expressed per cell doubling. A positive S indicates that the strain under test has a higher fitness than its parental. (*) p < 0.05, one-way ANOVA followed by Bonferroni post-test for multiple comparisons. C Microplate growth assays in 8% (v/v) ethanol and normal YPS medium. D Maximum specific growth rates (µmax) obtained in 8% ethanol and without ethanol were plotted. The strain cyr1A1474T/usv1Δ is highlighted (red). E Plotting of µmax vs. maximum optical density (ODmax) at the plateau of growth curves in 8% (v/v) ethanol

For competition assay 1, we used a cycle of 20% (v/v) ethanol shock for 2 h, followed by a recovery growth in YPS (Fig. 4A and B). Under this condition mimicking our ALE protocol, evolved P1c, P2c, and P3c, as well as individual mutants cyr1A1474T, mds3::MX, and ras2::MX that in different studies have been related to the cAMP/PKA pathway [38, 40, 45, 46], exhibited higher fitness (i.e., positive selection coefficient, S > 0) than their respective parentals (Fig. 4B). The ath1Δ deletion and usv1::MX disruption were also adaptive. A combination of cyr1A1474T plus ath1::MX displayed an additive effect, and the cyr1A1474T/usv1Δ double-mutant demonstrated a slightly, though not significantly, better fitness than cyr1A1474T alone. However, no positive fitness contributions were observed for rom2G440R and ptr2W362G even when they were combined with other alleles. These observations were contrary to our expectations, as ROM2 and PTR2 displayed mutations that emerged early during ALE in two different populations (Fig. 2A and B).

In addition to testing yeast tolerance to a transient 2-h pulse of high alcohol concentration (20% v/v), we probed adaptation for propagation in constant ethanol exposure. Competition assay 2 involved inoculation of competitors in YPS at an 8% (v/v) ethanol concentration and propagation for up to 3 days until the stationary phase (Fig. 4A and B). Furthermore, we analyzed the growth for 24-h wherein competitors were inoculated in YPS without ethanol (competition assay 3). Most evolved clones and reverse-engineered strains had a notable fitness loss when propagated in the presence of 8% (v/v) ethanol and also without ethanol (Fig. 4B). Despite being highly adapted to ethanol shocks, the evolved clones P1c–P3c and mutants cyr1A1474T/ath1::MX, cyr1A1474T/mds3Δ, cyr1A1474T/rom2G440R, ras2::MX, and mds3::MX exhibited a pronounced decline in fitness when propagated under 8% (v/v) ethanol, and most of them also had a decreased performance when grown in normal YPS. These observations indicate trade-off effects for mutations that cause adaptation to transient and drastic ethanol treatments; i.e., although mutations promote cell survival in a stressful environment, they imposed a fitness cost for propagation under mild or nonstressful conditions.

However, cyr1A1474T, usv1::MX, ath1Δ, and cyr1A1474T/usv1Δ maintained growth fitness similar to that of their parental strains, indicating that an equilibrium between acute stress tolerance and propagation fitness might be attainable. To gain further insights into the tested strains, we performed microplate growth assays involving the PE-2_H4 parental, the evolved clones P1c–P3c, and the best-performing reverse-engineered strains that could balance ethanol shock tolerance with cell-doubling fitness (Fig. 4C and D, Additional file 1: Table S6). The ath1Δ mutant displayed excellent growth rates in YPS with and without ethanol; however, it had a very low maximal optical density (ODmax) in the presence of ethanol. In contrast, the evolved clone P1c demonstrated superior biomass yield at 8% (v/v) ethanol, a selective trait that was presumably advantageous during the ALE postshock recovery growth until the stationary phase. We recorded an optimal performance for strain cyr1A1474T/usv1Δ, which presented the highest specific growth rate (µmax) at 8% (v/v) ethanol; in normal YPS it maintained a µmax slightly below that observed for the parental PE-2_H4. These results indicated that our screening with ethanol shocks could select alleles that, in combination, improved yeast growth in the presence of ethanol.
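How µmax and ODmax were extracted from the microplate curves is not detailed in this section. One common approach, shown here as a sketch under stated assumptions, takes µmax as the steepest least-squares slope of ln(OD) versus time within a sliding window and ODmax as the highest reading of the curve. The window size, function names, and synthetic readings are hypothetical and are not taken from this study.

```python
import math


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("time points in a window must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def growth_parameters(times_h, od_readings, window=4):
    """Return (mu_max per hour, od_max) for one growth curve.

    ASSUMPTION: mu_max is the largest slope of ln(OD) vs. time over any
    run of `window` consecutive readings; the window size is arbitrary.
    """
    if len(times_h) != len(od_readings) or len(od_readings) < window:
        raise ValueError("need equal-length series with at least `window` points")
    if min(od_readings) <= 0:
        raise ValueError("OD readings must be positive (blank-corrected)")
    log_od = [math.log(value) for value in od_readings]
    slopes = [
        least_squares_slope(times_h[i:i + window], log_od[i:i + window])
        for i in range(len(log_od) - window + 1)
    ]
    return max(slopes), max(od_readings)


# Synthetic curve for illustration: exponential growth, then a plateau.
example_times = list(range(11))
example_od = [0.05 * math.exp(0.4 * min(t, 6)) for t in example_times]
example_mu_max, example_od_max = growth_parameters(example_times, example_od)
```

Reporting both values separates a strain that grows fast but plateaus low from one that grows more slowly yet reaches a higher final density.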

Population P4 evolved a snowflake phenotype driven by BUD3 disruption

Yeast populations that form multicell aggregations displaying a “snowflake” appearance result from mutations that preclude the correct separation of mother-daughter cells following mitotic division [50, 51]. Generally, these mutations are related to the RAM network that involves the Ace2p transcription factor [52, 53]. Cell-to-cell attachments nucleating large clusters of yeasts often emerge in laboratory evolution experiments and in industrial settings from selections for increased sedimentation rates [50,51,52], or as adaptations to withstand various stresses [53]. We observed that the P4 population displayed a snowflake-type phenotype that became apparent by ethanol shock/recovery cycle number 40 (Fig. 5A). This coincided with the dominance of the bud3F851fs loss-of-function allele caused by a single nucleotide deletion (Fig. 5A and B, Additional file 1: Table S2). Although this mutation was present at low frequency at sampled point 20, it prevailed following ethanol treatment 40 (Fig. 5B and Additional file 2: Appendix S1). Bud3p acts as a determinant for axial bud site selection and localizes to the bud neck contractile ring during mitosis [54, 55]. By mimicking the ras2 and bud3 loss-of-function alleles, which prevail by shock 20/40, we observed that only the bud3 insertional knockout generated an aggregative phenotype resembling the one observed for P4 (Fig. 5C). Interestingly, the diploid bud3::MX knockout did not display the snowflake phenotype (Additional file 3: Fig. S2); this was in accordance with the role played by Bud3p only in haploid cells, wherein it helps to set a landmark for an axial budding pattern, while diploid cells undergo a different bipolar budding program [54, 55]. Clusters of cells in P4 may represent a collective adaptation to ethanol stress; i.e., yeasts inside the cluster may be more protected from the damaging effects of alcohol, which predominantly affect the cell plasma membrane [56, 57].

Fig. 5

Snowflake phenotype related to P4. A Time course of ethanol shocks with population P4 and dominance of the snowflake phenotype at sampled point 40. B Although the mutation 2553DelT (bud3F851fs) was already present at sampled point 20, it became dominant by ethanol shock/recovery number 40. C Genetic underpinnings of the snowflake phenotype. The loss-of-function mutations ras2::MX and bud3::MX were constructed in the PE-2_H4 background. The cell-aggregation phenotype is associated with the insertional mutant bud3::MX, but not with ras2::MX. Magnification, × 40

Evolved clone P1c and the double-mutant cyr1A1474T/usv1Δ display optimal balance between ethanol tolerance and fermentation performance

Our screening involving transient ethanol shocks selected strains that were tolerant to an acute stress condition. The Brazilian bioethanol production system, from sugarcane juices and molasses, exposes yeasts to a dynamic fluctuation of stresses during fermentations in industrial vats [1, 34, 35]. These stresses include osmotic pressure, heat, toxic compounds, and build-up of ethanol concentration at the end of the process. Fermentation with cell recycling, used in the Brazilian ethanol production, adds a particular stressful factor; following each fermentation cycle, yeast cells are centrifuged and treated with sulfuric acid (pH = 2.5) for about 1 h to kill bacterial contaminants [1, 34, 35]. This drastic acid shock constitutes a pulse of stress that circumstantially resembles the ethanol treatments used in our ALE protocol. Therefore, we explored whether tolerant yeasts selected using our shock/recovery regime exhibited a good performance in benchtop fermentations that simulate Brazilian ethanol production using cell recycling and sulfuric acid treatments [34, 35].

During fermentations using cell recycling a biomass increase of about 10% can occur. Furthermore, cell death may account for fluctuations in cell numbers from one cycle to the next [9, 34, 35, 58]. Considering such variations in biomass, we asked whether our selected strains would persist from one cycle to another in a competition against the reference during fermentations of sugarcane molasses using cell recycling. This would allow us to calculate cumulative selection coefficients (S), relative to the progenitor, which express the propensity of the strain to increase (or not) a viable biomass during the stressful fermentation. To test the fitness of evolved clones P1c and P3c, probed strains and the parental ALE progenitor (Prog.) were separately mixed with the PE-2_H4 tester-GFP to initiate eight cycles of sugarcane molasses fermentations (Fig. 6A). During the process, cells were sampled after each cycle and the proportions of GFP-labeled to nonlabeled cells were measured using an Attune NxT flow cytometer. While P3c was outcompeted by the tester-GFP during the fermentation cycles, the increasing cumulative fitness of P1c showed that the strain was better adapted than its progenitor to the harsh conditions of ethanol fermentation with cell recycling (Fig. 6A).

Fig. 6

Fitness and fermentation performance of ethanol tolerant strains. A P1c, P3c, and the ALE progenitor (Prog.) were individually mixed with the tester-GFP and subjected to eight fermentation cycles. The total reducing sugars (TRS) inputs for cycles 1–8 were, respectively, 14.9%, 16.7%, 18.8%, 21.3%, 21.6%, 17.2%, 20.7%, and 20.9%. The conditions resembled the Brazilian ethanol production wherein cells are recovered by centrifugation following each fermentation and treated with sulfuric acid (H2SO4) 0.5 M for 1 h (pH 2.5). Flow cytometry measurements allowed estimation of the cumulative selection coefficient (S) after each cycle, expressed as the difference from the cumulative fitness recorded for the progenitor in a competition against the tester-GFP. B Fermentation of sugarcane molasses during seven cycles with TRS content of 14.9%, 18.5%, 21.4%, 22.0%, 22.0%, 21.7%, and 21.5%, respectively. Final ethanol accumulation and cell viabilities for P1c, the ALE progenitor (Prog.), and parental PE-2_H4 were quantified at each cycle. Trehalose content was estimated at the beginning and end of the experiment and was expressed as the wet cell weight percentage for each strain. C Competition assays between the cyr1A1474T/usv1Δ double-mutant and the tester-GFP through sequential passages in YPS with 8% (v/v) ethanol and without alcohol. Plotted are the cumulative S per number of calculated generations at each passage. D Fermentation of sugarcane molasses with the cyr1A1474T/usv1Δ and PE-2_H4. Fermentation cycles 1–5 were performed with initial TRS content of 17.8%, 20.3%, 22.3%, 22.8%, and 22.5%, respectively. The content of ethanol, biomass, and cellular viabilities were estimated at the end of each cycle. Trehalose was quantified at the beginning and end of the fermentation. (*) p < 0.05, one-way ANOVA followed by Bonferroni post-test for multiple comparisons

A separate fermentation assay conducted using individual strains demonstrated that P1c achieved ethanol titers similar to those of the ALE progenitor and parental PE-2_H4 (Fig. 6B). More importantly, P1c displayed higher cell viability after each cycle (Fig. 6B), indicating that this strain was better adapted to the fermentation conditions. Trehalose accumulation, a hallmark of stress tolerance in Brazilian bioethanol yeasts [34, 35], which was already higher in the P1c compared with the parental strains at the beginning of the experiment, reached over 23% of cell wet mass at the end, nearly double the trehalose content reached by the parental strains (Fig. 6B). Despite the superior traits exhibited by P1c, strains reverse-engineered with P1c mutations, cyr1A1474T, ath1Δ, and cyr1A1474T/ath1::MX, during fermentations with cell recycling displayed slightly reduced cell viabilities, while maintaining ethanol titers similar to those of the reference, PE-2_H4 (Additional file 3: Fig. S3).

We tested the double-mutant cyr1A1474T/usv1Δ by competitions against the tester-GFP via successive transfers in liquid YPS medium with 8% (v/v) ethanol and without ethanol (Fig. 6C). At each passage the proportion of the competitors was recorded, and the cells were counted (Additional file 1: Table S7). Plotting the calculated cumulative S against the number of cell doublings at each passage allowed the estimation of S per doubling using linear regression obtained from the data points. Values were normalized for fitness calculated for the parental PE-2_H4 competing against tester-GFP. The double-mutant cyr1A1474T/usv1Δ exhibited a consistent fitness gain of 5.46% per doubling in the presence of 8% (v/v) ethanol, whereas it exhibited a fitness loss of 1.07% in the absence of ethanol (Fig. 6C).

Finally, we subjected the engineered strain cyr1A1474T/usv1Δ and PE-2_H4 to five cycles simulating the sugarcane molasses fermentation process. A progressive increase (17.8%, 20.3%, 22.3%, 22.8%, and 22.5%) of total reducing sugars (TRS) from cycles 1 to 5 challenged the stress tolerance of the strains. Under this condition, the double-mutant cyr1A1474T/usv1Δ accumulated approximately 1% more ethanol from cycles 3 to 5 than its parental strain (Fig. 6D). More importantly, the average biomass gain of cyr1A1474T/usv1Δ was higher than that of PE-2_H4 throughout the five fermentation cycles. Cell viability did not significantly differ between strains (Fig. 6D).

The results from competition assays in 8% (v/v) ethanol indicated that mutations cyr1A1474T/usv1Δ improved strain ethanol tolerance in a rich medium. Furthermore, fermentations of sugarcane molasses with cell recycling by the double-mutant and P1c indicated fermentation performance with higher biomass formation (for cyr1A1474T/usv1Δ) and higher cell viabilities (for P1c) than that of the parental PE-2_H4. Although ethanol titers had not improved significantly, it should be noted that both strains had been challenged during fermentations via a combination of stress factors, such as toxic compounds in molasses, high ethanol levels, temperature, and acid treatment [34, 35]. Under these stresses, the cyr1A1474T/usv1Δ double-mutant and P1c exhibited higher capacities than their parental strains to sustain a viable biomass from one cycle to another. Such persistency represents the most important technological property for Brazilian bioethanol production as it allows the yeast to survive and maintain high ethanol production rates throughout the whole fermentation season of about eight months [34, 35, 58].

Discussion

Alcoholic tolerance is a key trait of yeasts used in large-scale production of the biofuel ethanol [1]. To understand the polygenic basis of alcohol tolerance for improving ethanologenic yeasts, several studies have been conducted using various approaches, such as transcriptomics [3, 59], screening of genome-wide knockout collections [59], global transcriptional machinery engineering [60], and QTL mapping [8, 10, 14]. Herein, we used ALE to investigate the genetic underpinnings of ethanol tolerance in the bioethanol strain PE-2_H4. Adaptive evolution has been used for raising the alcoholic tolerance of S. cerevisiae strains, mostly through protocols involving continuous propagation over hundreds of generations in the presence of increasing ethanol levels [13, 22,23,24,25,26,27,28,29,30,31]. Such studies highlighted the role of diploidization [13, 25], increase in cell size [29], and remodeling of membrane lipids [29] and cell wall [30] as important adaptations of yeasts to high ethanol concentrations.

Instead of propagation under constant ethanol exposure, our ALE approach, and that of two other studies [32, 33], favored cell survival to ethanol shocks as a paradigm for investigating ethanol tolerance in yeasts. The shock protocol that we applied was substantially different from these two previous ALE experiments, which used 10–12 rounds of quick 2-min exposures [32] or up to 30 cycles of 1–3 h shocks of up to 25% (v/v) ethanol [33]. These protocols also relied on a postshock propagation in rich culture medium to promote physiologic recovery and selection for growth fitness [32, 33]. Although, in those cases, the bimodal protocol supported ethanol tolerance combined with good fermentation performance [32, 33], in our ALE experiment, most reverse-engineered strains and evolved clones showed low growth fitness, either in the presence or absence of ethanol. Possibly, these side effects were observed only in our study because of the severe treatments that we used, which involved application of 2-h ethanol exposures and usage of far more cycles of ethanol shocks (up to 82 rounds) with higher ethanol concentrations (up to 30% v/v) than those used before [32, 33].

To shed light on the genetic basis of cell survival during acute ethanol stress, our ALE study used WGS of evolved yeasts and performed a comprehensive evaluation of the fitness exhibited by reverse-engineered alleles. Regarding ALE protocols for ethanol tolerance in yeasts, genomic surveys were conducted in only two previous studies [13, 30]. One of them relied on three rounds of turbidostat cultivation with increasing amounts of ethanol. WGS of isolated clones revealed alleles associated with SSD1 and UTH1 as determinants of ethanol tolerance [30]. Another study was based on a turbidostat cultivation of six S288C populations for 200 generations with ethanol levels raised from 6 to 12% (v/v) [13]. The WGS uncovered hundreds of mutated genes associated with various functions, such as stress response, cell cycle control, DNA replication/repair, and respiration. Fitness measurements demonstrated that evolved alleles associated with PRT1, MEX67, and VPS70 were adaptive to high levels of ethanol [13].

Similar to previous studies, our ALE experiment also detected diverse mutations (46 in total) related to various cellular functions, emphasizing the complex polygenic nature of ethanol tolerance trait in yeasts. However, our four evolved populations were particularly enriched in mutations affecting components of the cAMP/PKA signaling and trehalose degradation pathways. The Protein kinase A (PKA) complex, under glucose abundance, acts as an effector for cell proliferation, cell cycle progression, and ribosome biogenesis [38, 46]. Simultaneously, active PKA is an inhibitor of Msn2p and Msn4p transcription factors [45,46,47]. When glucose is depleted, causing low cAMP levels, the PKA-mediated inhibition is reverted and Msn2p and Msn4p become dephosphorylated and active; they shuttle into the nucleus to activate the environmental stress response, which includes factors involved in heat shock, cell wall remodeling, DNA repair, antioxidant defense, and trehalose biosynthesis [45,46,47]. Mutants that downregulate the Ras2p/cAMP pathway display a characteristically low PKA phenotype marked by the ectopic activation of Msn2/4p-mediated stress responses [38, 45,46,47]. In our ALE populations, we detected mutated alleles of RAS2 and CYR1, as well as of MDS3 and PMD1; knockouts of the latter two genes mimicked a low PKA phenotype [40]. We propose that downregulation of the PKA function is a central adaptation during ALE to mobilize yeast stress responses to withstand ethanol lethality. This was corroborated by the fact that evolved clones P1c, P2c, and P3c displayed higher trehalose accumulation and constitutive upregulation of HSP12-GFP expression, a standard biosensor responsive to Msn2/4p [45]. Accordingly, in our study, strains with the engineered ras2::MX, mds3::MX, and cyr1A1474T alleles exhibited higher fitness than the parental strain to tolerate the ethanol shock.

The disaccharide trehalose is a well-known stress protectant that possibly acts as a chemical chaperone, preserving protein folding and maintaining plasma membrane integrity during stress [61]. ATH1 encodes an acid trehalase that is presumably involved in the extracellular degradation of trehalose [61, 62]. Knockouts of ATH1 are tolerant to several stresses, such as heat [62], dehydration, freezing, and high ethanol concentration [37]. In three of our ALE populations, ath1 defective alleles swept to dominance after a cAMP/PKA-related allele had emerged. Furthermore, we obtained the highest-fitness strain for tolerating ethanol shocks from the synergism between cyr1A1474T and ath1::MX mutations. Higher trehalose synthesis triggered by mutations that downregulate the cAMP/PKA pathway (e.g., mds3, pmd1, and cyr1) [38, 40] may potentiate the emergence of defective ath1 alleles, blocking trehalose degradation and promoting its cellular accumulation and protective association with the plasma membrane [37, 61, 62].

Such peripheral protection may be an important mechanistic principle for ethanol tolerance, which is also supported by the snowflake phenotype of population P4. A defective allele affecting BUD3 (acting on the axial bud site selection in haploid cells [54, 55]) probably compromises correct daughter cell separation following cytokinesis, resulting in multicellular aggregations. Possibly, within these clumps, inner cells may be shielded from ethanol damage, which is constrained to the cell cluster periphery [56]. Consistent with this idea is the fact that yeasts with aggregation phenotypes are more resistant to multiple stresses (including freeze/thaw, hydrogen peroxide, heat, and ethanol treatments) than individual cells [56, 57]. A further possible link between ethanol tolerance and cell wall shielding is provided by the ROM2 alleles identified in P1 and P3. Rom2p is a guanine nucleotide exchange factor for Rho1p and Rho2p GTPases. It plays a role in the cell wall integrity signaling to remodel the cell wall in response to environmental stresses [41]. Moreover, Rom2p mediates stress resistance and cell growth by interacting with the Ras-cAMP pathway [63]. However, we could not demonstrate any fitness contribution to ethanol tolerance by the rom2::MX disruption or rom2G440R allele. A similar lack of validation is the case for alleles related to PTR2, which encodes a di–tripeptide transporter at the membrane [39]. Because we narrowed our reverse-engineering scope to a few genes or pathways that had parallel mutations in at least two populations, it is possible that ROM2- and PTR2-related alleles may be adaptive in association with mutations not tested in this study. Multiple genetic interactions and the diversity of 46 evolved alleles recovered via WGS in our ALE are consistent with the idea that ethanol tolerance may be achieved through various and complex mutational pathways [13].

In our genetic analysis, we combined alleles cyr1A1474T/usv1Δ to generate a strain with higher fitness to acute ethanol treatments and better growth in the presence of ethanol than its parental strain. Usually, CYR1 mutations that decrease cAMP levels tend to negatively impact growth rates. However, some alleles related to CYR1 have been shown to combine stress tolerance with optimal fermentation and growth performance. For example, progressive 90- and 120-bp deletions of the CYR1 promoter rendered strains with decreased CYR1 expression and lower cAMP levels that displayed 14% and 15% higher ethanol yield, respectively, during very high gravity fermentation [64]. A further case is provided by the fil1 mutant that exhibits a glutamate to lysine exchange at residue 1682 of Cyr1p. Engineering the fil1 mutation into the Y55 strain improved its freeze and drought resistance without compromising fermentation performance [38]. In our case, higher fitness of the cyr1 mutant for propagation in the presence of ethanol was only achieved when in combination with USV1 deletion. Usv1p is a C2H2 zinc finger transcription factor that activates gene expression under nonfermentable carbon sources and respiratory conditions [42, 65], and represses the transcription of genes involved in sulfur metabolism [66]. Interestingly, USV1 takes part in transcriptional responses to hyperosmolarity [42] and its expression is induced in the multi-stress-tolerant strain BT0510 as part of a common response to ethanol treatments applied sequentially to osmotic, oxidative, and glucose withdrawal stresses [67]. Furthermore, evolved strains isolated through ALE protocols to improve yeast tolerance to oxidative stress [68], iron [69] and silver [70] toxicity, and caloric restriction [71] exhibit upregulation of USV1 expression as part of the adaptive responses. These results correlating USV1 expression with stress adaptation are in contrast to our findings that USV1 loss-of-function confers ethanol tolerance to the PE-2_H4 strain. We suggest that such discrepant results and the reasons why usv1Δ exhibits positive epistasis with cyr1A1474T may be clarified in further research focusing on the transcriptional responses under stress conditions related to the wild-type and knockout USV1 alleles.

Trade-off effects are common in ALE populations subjected to selection pressure for prolonged periods [15,16,17]. ALE populations tend to become specialists in the selective environment frequently displaying lower fitness than their ancestors when moved to an alternative niche, subjected to a different propagation mode, or nutrient condition [15,16,17]. Therefore, it is not surprising that several clones and reverse-engineered strains derived from yeast populations adapted to 68–82 ethanol shocks exhibit low growth fitness. We suggest that propagation fitness decay may partially reflect the overall downregulation of protein biogenesis and other growth-promoting components resulting from the constitutive activation of the environmental stress response [47, 72]. Slower growth rates related to mutants downregulating the PKA pathway and accumulating trehalose have been well documented [38, 46, 47]. The demonstrated inverse correlation between growth rates and resistance to severe stresses, a phenomenon that conforms to the principle of energy balance wherein resources used for growth under nonstress conditions are redirected to overcome the environmental stress [18,19,20], is relevant in our case, which involves yeasts tolerant to drastic ethanol treatments.

Conclusion

Considering the possible negative impact on fermentation performance, it is questionable whether strain selection protocols that apply harsh treatments are worth pursuing [15]. Through ALE, we selected the evolved clone P1c as the one presenting the highest tolerance to ethanol shocks. Using WGS information of ALE clones, we reconstructed the cyr1A1474T and usv1Q73stop alleles into the parent background and could demonstrate that the combined mutations improved yeast tolerance for growing in the presence of ethanol. The alleles usv1Q73stop and cyr1A1474T are part of the P1c genetic makeup. Similar to the double-mutant cyr1A1474T/usv1Δ, P1c exhibited excellent performance under conditions simulating the Brazilian ethanol production system (i.e., using cell recycling and sulfuric acid shocks). These results indicate that harsh selection schemes may be useful for isolating strains suitable for bioprocesses wherein yeasts are subjected to multiple stresses, such as the fermentation of highly toxic biomass hydrolysates for cellulosic ethanol production [15]. Moreover, even if shock-based screenings tend to present negative side effects, we suggest that it is possible to disentangle adaptive alleles from fitness-costing mutations through reverse engineering of adaptive alleles into a parental background (as shown here) [15, 17], or by applying sexual strategies to dissociate beneficial mutations from the deleterious ones [16, 73]. This may involve backcrossing shock-selected yeasts with their parental strains and subjecting the resulting recombinants to further selection for growth fitness under stress (de Bem and Gross, manuscript in preparation). These approaches may provide excellent complementary procedures for refining protocols for selection of stress-tolerant ethanologenic yeasts.

Materials and methods

Adaptive laboratory evolution

The S. cerevisiae haploid strains PE-2_H3 (MATα) and PE-2_H4 (MATa) are spore derivatives dissected from a tetrad of the Brazilian bioethanol isolate PE-2 [34, 36]. For populations P1, P2, and P3, a single progenitor was generated by integrating the natMX marker into the PE-2_H4 genome (Additional file 1: Table S1 and Additional file 4: Text S1). Then, this transformant was mated with PE-2_H3 (MATα). After sporulation, a nourseothricin-resistant MATα spore was propagated to establish the P4 haploid population. All the strains used in this study are listed in Additional file 1: Table S1.

To initiate the ALE experiment, populations P1–P4 were propagated overnight in 20 mL of YPS medium (yeast extract-peptone plus 2% sucrose) in 50 mL Erlenmeyer flasks (60 rpm at 32 °C). From this initial preinoculum, and after each recovery growth during ALE, 1 mL (containing approximately 10⁸ cells) was transferred into a 1.5 mL microcentrifuge tube and centrifuged at 8500 × g for 5 min. Then, the obtained pellet was washed once with 500 μL deionized water. Next, the cell pellet was resuspended in 1 mL 1X PBS buffer containing an initial ethanol concentration of 19% (v/v). During the ethanol treatment (i.e., the ethanol shock) cells were kept static at 32 °C for 2 h. Following treatment, cells were centrifuged at 8500 × g for 5 min, washed with 500 μL deionized water, resuspended, and transferred into 20 mL YPS for a recovery growth in a shaker incubator (32 °C, 60 rpm). Nourseothricin (100 µg/mL) was supplemented to prevent contamination. The culture was maintained for 2–4 days, i.e., until the yeast cells achieved a stationary phase. Then, 1 mL of the culture was taken to begin a new shock/recovery cycle, and another 0.5 mL was cryopreserved in a 25% glycerol stock. Generally, during ALE, ethanol concentrations used for shocks were increased whenever the population showed fast postshock growth (e.g., reaching the stationary phase in 2 days). Conversely, the ethanol concentration was decreased whenever the postshock culture for a given population repeatedly showed prolonged recovery (approximately 4 days growth), indicating poor adaptation to the applied ethanol level. Final ethanol concentrations reached 30% (v/v) for populations P1 and P3, and 29% (v/v) for P2 and P4. Furthermore, the number of shock/recovery cycles varied among the populations (Fig. 1B). At the end of the evolution, for each population, the single best growing colony on solid medium (YPS) supplemented with 8% (v/v) ethanol was selected for WGS and downstream analyses. These colonies gave rise to the evolved clones P1c, P2c, P3c, and P4c.

WGS of evolved clones and variants identification

The evolved clones P1c–P4c were grown in liquid YPS until saturation. The sampled cellular mass from each clone was transferred into a lysis buffer (DNeasy® Plant Mini Kit, QIAGEN), and cells were disrupted by 1.5-min vortexing (Beadbeater, BioSpec) in the presence of zirconia beads. Genomic DNA was extracted from the lysate using the DNeasy® Plant Mini Kit (QIAGEN) following the manufacturer’s protocol. The isolated DNAs were fragmented using the NEBNext dsDNA Fragmentase (New England BioLabs), and paired-end libraries were constructed according to the Illumina TruSeq DNA PCR-Free Low Throughput Library Prep Kit (Illumina). The four paired-end libraries were quantified by qPCR using the KAPA Library Quantification Kit (Roche). Genome sequencing was conducted on the Illumina MiSeq platform at the Federal University of Rio Grande do Sul, Brazil, using the MiSeq Reagent Kit v3 supporting 600-cycles of 2 × 300 paired-end reads (Illumina). The resulting sequence reads were filtered following a cut-off of Phred quality scores ≥ 30 and read length ≥ 75 bases. The sequence reads obtained for P1c–P4c were submitted to NCBI (https://www.ncbi.nlm.nih.gov) under the BioProject number PRJNA1026594.

Sequence reads derived from P1c, P2c, and P3c genomic libraries were mapped against the PE-2_H4 genome (GenBank accession number GCA_905220315.1) using the Burrows–Wheeler aligner algorithm implemented by the CLC Genomics Workbench 8.01 (QIAGEN). A cut-off of 0.8 for aligned read length and 0.8 for minimal required identity were set. The mapped reads were subjected to a variant detection performed on the CLC Genomics Workbench. A frequency of ≥ 50% was set as the cut-off; however, selected variants had frequencies tending to 100%, in accordance with a haploid background derived from a single colony. The progenitor of population P4 had a recombinant haploid genome derived from a PE-2_H4 vs. PE-2_H3 crossing. To identify mutations in the P4c sequence reads, we first mapped PE-2_H3 reads (GenBank accession number GCA_905220325.1) against the PE-2_H4 genome to obtain the coordinates for all polymorphisms (SNPs and small InDels) distinguishing the two genomes. We mapped the P4c sequence reads against the PE-2_H4 genome to call variants as described above. By comparing the coordinates obtained with those observed for the PE-2_H3 polymorphisms (Microsoft Excel), we identified P4c variants that were absent in PE-2_H3 and PE-2_H4 genomes. Key mapped mutations were confirmed by Sanger sequencing of PCR fragments derived from the final populations and their respective progenitors. PCR oligonucleotides were designed to flank the analyzed mutations (Additional file 1: Table S4). The presence of evolved alleles was also examined by Sanger sequencing of the PCR fragments derived from DNA extracted from cryopreserved intermediate populations (Additional file 2: Appendix S1).
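The coordinate comparison used for P4c, which was performed in Microsoft Excel, amounts to a set difference between two variant lists. The following is a minimal Python sketch of that idea, not the procedure actually used; the `Variant` tuple layout, the function name `private_variants`, and the example coordinates are hypothetical and only illustrate the logic.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical variant record: coordinates on the PE-2_H4 reference."""
    chrom: str
    pos: int
    ref: str
    alt: str


def private_variants(evolved, parental_polymorphisms):
    """Return variants of the evolved clone that are not parental polymorphisms.

    Both inputs are iterables of Variant called against the same reference,
    so identical coordinates and alleles denote the same polymorphism.
    """
    known = set(parental_polymorphisms)
    return [v for v in evolved if v not in known]


# Illustrative (made-up) calls: one inherited polymorphism, one new mutation.
p4c_calls = [Variant("chrIV", 1200, "A", "T"), Variant("chrX", 500, "G", "C")]
h3_polymorphisms = [Variant("chrX", 500, "G", "C")]
print(private_variants(p4c_calls, h3_polymorphisms))
```

Matching on the full (chromosome, position, reference, alternative) tuple rather than on position alone avoids discarding a new mutation that happens to fall on a site that is polymorphic between PE-2_H3 and PE-2_H4.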

Yeast molecular genetics and strains construction

The strains constructed in this study were all derived from the Brazilian bioethanol yeast PE-2_H4 (MATa) [36] (Additional file 1: Table S1 and Additional file 4: Text S1). An exception was the P4 progenitor (described above). Several strains were constructed with mutations mimicking the ALE evolved alleles. Constructs for insertional disruption of alleles via homologous recombination were assembled with PCR products of the targeted genes flanking the kanMX PCR fragment [48]. PCR reactions were performed using Phusion® high-fidelity DNA polymerase according to the manufacturer’s protocol (New England BioLabs). Amplified flanking recombination regions were merged with the MX cassette through Circular Polymerase Exchange Cloning into the pUC19 plasmid [74]. Alternatively, a construct assembly was performed in two steps via in vivo cloning in E. coli [75]. Insertional kanMX cassettes were amplified using PCR (Phusion® high-fidelity DNA polymerase), and about 500–1000 ng PCR products were transformed into the PE-2_H4 via standard lithium acetate protocol [76]. In some cases, the kanMX cassette was directly amplified, with primers carrying tails with 40-nt homology at the 5′ region for integration into the targeted locus, and readily transformed into the yeast [48].

Targeted deletions and point mutations were introduced via CRISPR/Cas9 following the EasyGuide method developed by our group [49]. Donor sequences specifying genome edits were preassembled into the pUC19 vector, and then amplified using PCR (Phusion® high-fidelity DNA polymerase), and cotransformed with gRNA-encoding PCR fragments for in vivo recombination [49]. Alternatively, donors were directly amplified by PCR with oligonucleotides carrying at least 40-nts homology arms for recombination, and readily cotransformed with gRNA-encoding PCR fragments [49]. Diagnostic PCRs (Taq DNA polymerase, Thermo Fisher Scientific) were routinely used to confirm genome edits and kanMX insertions. Single nucleotide edits were confirmed by Sanger sequencing. All primers used in this study for strain construction and authentication are listed in Additional file 1: Table S4. Molecular genetic procedures are detailed in Additional file 4: Text S1.

Phenotypic analyses

Evolved clones P1c, P2c, and P3c were subjected to phenotypic analyses using the ALE progenitor as a reference. For cell viability, P1c–P3c and the progenitor were acclimatized by two shocks of 15% (v/v) ethanol, each followed by recovery growth. From the second outgrowth culture, 1 mL (10⁸ cells) was pelleted, washed, and resuspended in 1 mL 1X PBS buffer containing an ethanol concentration of 19% (v/v). Ethanol treatment lasted 2 h at 32 °C. A 10⁻⁵ dilution was plated onto solid YPS medium. After 3 days, colony forming units (CFU) were counted and compared to the number of CFUs obtained in a control treatment without ethanol (representing 100% survival). Assays were performed in triplicate. Statistical analyses were performed using the GraphPad Prism 9.5.0 package (GraphPad Software, La Jolla, California, United States).
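The survival percentage described above is a simple ratio of CFU counts. The sketch below shows one way to compute it in Python; normalizing each treated replicate against the mean of the control replicates is an assumption made here for illustration, and the function name and CFU counts are hypothetical.

```python
from statistics import mean


def percent_survival(cfu_treated, cfu_control):
    """Express treated CFU counts as a percentage of the untreated control.

    cfu_treated, cfu_control: CFU counts per replicate plate, obtained at the
    same dilution. The control mean is taken as 100% survival (assumption).
    """
    control_mean = mean(cfu_control)
    if control_mean <= 0:
        raise ValueError("control plates must contain colonies")
    return [100.0 * c / control_mean for c in cfu_treated]


# Made-up triplicate counts, for illustration only.
survival = percent_survival([30, 36, 24], [100, 120, 80])
print(survival, mean(survival))
```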

Trehalose estimation following ethanol shocks was performed through the enzymatic trehalase assay [77]. Cells from P1c–P4c and the progenitor were acclimatized by two shocks of 15% (v/v) ethanol and outgrowths, according to our ALE protocol. About 10⁸ cells were taken from the last outgrowth, pelleted, washed, and resuspended in 1 mL 1X PBS buffer containing 19% (v/v) ethanol. The treatment lasted 2 h before recovery growth in YPS. After 2 days, 10⁸ cells, counted in a Neubauer chamber, were centrifuged and washed twice with cold water to remove any residual ethanol. The cells were resuspended in 0.25 M Na2CO3 solution, vortexed, and incubated at 95 °C for about 3 h. The pH was adjusted to 5.5, and the cells were vortexed and transferred to a fresh tube to estimate the amount of trehalose. Porcine kidney trehalase (Sigma-Aldrich) was used with a pH adjusted to 5.8. The cells were incubated at 37 °C overnight. The glucose liberated was quantified using a glucose colorimetric detection kit, following the manufacturer’s protocol (Invitrogen) and normalized for the number of cells used in the sample. The amount of trehalose was calculated in µg of glucose-equivalents. Assays were conducted in triplicate.

The HSP12-GFP Msn2/4p-responsive biosensor was constructed by chromosomal integration (conducted using the CRISPR EasyGuide [49]) of the ORF expressing the ymUkG1 GFP [78] to produce an in frame fusion with the HSP12 gene [45] (Additional file 4: Text S1). The biosensor was constructed into the PE-2_H4 parental strain and P1c, P2c, P3c, and cyr1A1474T haploid backgrounds. The first assay was performed by propagating three replicates for each strain in YPS and recording fluorescence signals at the stationary and logarithmic growth phases over 4 days. Fluorescence measurements were obtained using the Attune NxT flow cytometer (Thermo Fisher Scientific) by recording 10,000 events. Fluorescence intensity values were taken from the median derived from flow cytometry histograms [45]. Fluorescence intensity values for the strains were normalized by the average numbers obtained for the PE-2_H4 reference at the same sampled point (Fig. 3B). A second assay involved comparing fluorescence signals recorded from cells growing for 2 h, 4 h, 6 h, 8 h, and 24 h after inoculation with or without 8% (v/v) ethanol. Normalization was made with the PE-2_H4 values at the same sampled points (Fig. 3C). Alternatively, to demonstrate the effect of ethanol treatment on the HSP12-GFP expression for each strain, at each sample point values were expressed as a ratio of the fluorescence obtained for the ethanol exposed and nonexposed cells (Fig. 3D).

Competition experiments and fitness measurements

Haploid PE-2 yeasts tend to aggregate in groups of about two–six cells (Fig. 5B). This mild aggregative phenotype is absent in the diploid background, in which separate cells can be observed. To facilitate counting of individual cells by flow cytometry during competition assays, all strains used were diploidized by mating-type switching, induced via plasmid-based expression of the HO endonuclease, followed by mating [79]. For an initial phenotypic evaluation, diploid P1c–P3c, the progenitor, and the GFP-tagged PE-2_H4 strain (tester-GFP, expressing the ymUkG1 GFP [78] integrated into the HO locus) were subjected to five passages in 20 mL YPS (with no addition of ethanol). Then, for each sample, about 5 × 10⁷ cells were mixed with an approximately equal amount (1:1) of the tester-GFP in 1X PBS buffer. The initial proportion of GFP-marked and nontagged cells was then recorded (about 10,000 events in 10 µL) in the Attune NxT flow cytometer (Thermo Fisher Scientific) (Additional file 1: Table S2). Gating parameters were set with separate cultures of GFP-expressing cells and nonmarked yeasts. Mixed cells were distributed in three replicates, and each one was subjected to ethanol treatment (24% [v/v]) for 2 h, and then to a recovery growth in 20 mL of YPS (32 °C, 60 rpm) until the stationary phase was achieved. After outgrowth, the proportions of the competitors were estimated in the Attune NxT flow cytometer and plotted. In a parallel experiment, P1c–P3c, the progenitor, and the tester-GFP were acclimatized by sequential 2-h shocks of 15%, 15%, 20%, 20%, and 22% (v/v) ethanol, each one followed by an outgrowth. Then, ethanol-adapted cells were counted, mixed (1:1) with tester-GFP, and subjected to a 24% (v/v) ethanol treatment and recovery growth, as described above.

By recording the initial (i) percentage of GFP-marked (GFPi) and nonmarked evolved (EVOi) cells, and the final (f) proportion of competitors (GFPf and EVOf) after outgrowth, it is possible to calculate the selection coefficient (S), describing the fitness of the evolved cells, according to the following equation:

S = ln[EVOf/GFPf] − ln[EVOi/GFPi] [13, 80].

All selection coefficient values were normalized to the S obtained in a competition of the progenitor vs. tester-GFP (designated as S = 0). Therefore, S always expressed the fitness of evolved clones, or reverse-engineered strains, relative to the progenitor and discounted any fitness effects for GFP expression.
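As a worked illustration of the equation and normalization above, the following Python sketch computes S from the recorded proportions and then subtracts the S of the progenitor vs. tester-GFP competition. The function names and the example percentages are hypothetical; normalization by subtraction follows the description of S as a difference from the progenitor's value.

```python
import math


def selection_coefficient(evo_i, gfp_i, evo_f, gfp_f):
    """S = ln(EVOf/GFPf) - ln(EVOi/GFPi).

    Arguments are the initial (i) and final (f) proportions, or event counts,
    of nonmarked evolved cells (EVO) and GFP-marked tester cells (GFP).
    """
    return math.log(evo_f / gfp_f) - math.log(evo_i / gfp_i)


def normalize_to_progenitor(s_strain, s_progenitor):
    """Express S relative to the progenitor, whose own value becomes S = 0."""
    return s_strain - s_progenitor


# Made-up proportions (%), for illustration only.
s_clone = selection_coefficient(evo_i=50.0, gfp_i=50.0, evo_f=60.0, gfp_f=40.0)
s_prog = selection_coefficient(evo_i=50.0, gfp_i=50.0, evo_f=48.0, gfp_f=52.0)
print(normalize_to_progenitor(s_clone, s_prog))
```

Because the progenitor competes against the same tester-GFP, any fitness effect of GFP expression appears in both terms and cancels in the subtraction.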

A slightly different ethanol shock assay was used for systematic analyses of evolved clones and reverse-engineered strains (Competition 1, Fig. 4A and B). This included two cycles of adaptation with a 15% (v/v) ethanol shock and recovery, followed by a 20% (v/v) ethanol shock after mixing the evolved strain and the tester-GFP in an approximately 1:1 ratio (Additional file 1: Table S5). The proportion of competitors was recorded following outgrowth, and fitness was calculated using the equation above, normalizing to the S obtained for the ALE progenitor and for the parental PE-2_H4 strain used for reverse engineering. For competitions 2 and 3 (Fig. 4A and B), the competitors were acclimatized via two passages in 20 mL YPS without ethanol and with 4% and 6% (v/v) ethanol, respectively. Competitors were then mixed with the tester-GFP in an approximately equal ratio. From this initial mixture, 20 µL for competition 2 and 100 µL for competition 3 were inoculated into 20 mL of medium for propagation at 32 °C (60 rpm) up to the stationary phase. The proportions of GFP-labeled and nonlabeled cells were recorded at the end of the experiment. At the beginning and end of propagation, the cells were counted in the Attune NxT flow cytometer, allowing the estimation of cell doublings during the propagation. A selection coefficient per cell doubling (S/d) was estimated by dividing S by the total number of doublings [13, 80]. All values for S/d were normalized to the data obtained for the parental strains (Additional file 1: Table S5).
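The S/d calculation can be sketched as follows. The text does not give the formula used to estimate doublings from cell counts; the sketch assumes the conventional log2 of the ratio of final to initial cell numbers, and the function names and cell counts are hypothetical.

```python
import math


def cell_doublings(n_initial, n_final):
    """Number of population doublings between two cell counts.

    Assumes the conventional estimate log2(N_final / N_initial); the exact
    formula used in the study is not stated.
    """
    if n_initial <= 0 or n_final <= 0:
        raise ValueError("cell counts must be positive")
    return math.log2(n_final / n_initial)


def s_per_doubling(s, n_initial, n_final):
    """Selection coefficient per cell doubling (S/d)."""
    return s / cell_doublings(n_initial, n_final)


# Made-up counts: a 1024-fold expansion corresponds to 10 doublings.
print(cell_doublings(1e5, 1.024e8))
print(s_per_doubling(0.5, 1e5, 1.024e8))
```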

Competitions of cyr1A1474T/usv1Δ and PE-2_H4 against the tester-GFP were conducted through five passages in liquid YPS medium with 8% (v/v) ethanol or without ethanol. Acclimation and initiation of competitions were as described above. The proportion of competitors at the end of the first passage was taken as the starting point for measurements (S = 0). Cumulative S and cell doublings were estimated at each transfer (Additional file 1: Table S7). Values plotted for cyr1A1474T/usv1Δ were normalized to those obtained in the PE-2_H4 vs. tester-GFP competition. Selection coefficients per cell doubling (S/d) were calculated from the trendline fitted to the data points.

Microplate growth assays were conducted on the Tecan Sunrise© (Tecan). The preadaptation of strains with or without ethanol was as described above. About 2 × 10^6 cells were inoculated into 200 µL medium per well. Assays were conducted in triplicate under static conditions at 28 °C. The optical density at 600 nm (OD600) was recorded every 15 min (Additional file 1: Table S6). Maximum specific growth rates (µmax) were calculated from OD600 values obtained during the exponential growth phase (ranging from OD600 0.4 to 0.9). The natural logarithm of the OD600 values was plotted against time (h) and fitted by simple linear regression (Microsoft Excel), and the µmax (h−1) was obtained from the slope [81]. The µmax was calculated separately for each replicate and reported as the average value. At 8% (v/v) ethanol, strains cyr1A1434T, ath1Δ, cyr1A1434T/ath1::MX, and usv1::MX displayed a very poor growth profile, reaching a plateau of maximum optical density (ODmax) below 0.9. In these cases, the µmax was obtained from OD600 values in the range between 0.1 and 0.5. The ODmax values were obtained at the stationary growth phase.
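The µmax estimation described above can be sketched in Python rather than in a spreadsheet. This is only an illustration: the function name, the default OD600 window arguments, and the synthetic growth curve are assumptions, and the fit is an ordinary least-squares line through ln(OD600) versus time.

```python
import math


def mu_max(times_h, od600, od_low=0.4, od_high=0.9):
    """Slope of ln(OD600) vs. time (h) within the chosen OD600 window.

    Returns the specific growth rate in h^-1. The window defaults to
    0.4-0.9; for poorly growing strains the text uses 0.1-0.5 instead.
    """
    points = [(t, math.log(od)) for t, od in zip(times_h, od600)
              if od_low <= od <= od_high]
    if len(points) < 2:
        raise ValueError("need at least two readings inside the OD window")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    return sxy / sxx


# Synthetic curve (not real data): OD = 0.05 * exp(0.3 * t), read every 15 min
times = [i * 0.25 for i in range(60)]
ods = [0.05 * math.exp(0.3 * t) for t in times]
print(round(mu_max(times, ods), 3))
```

For triplicate wells, the function would be applied to each replicate separately and the three slopes averaged, as described in the text.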

Sugarcane molasses fermentation

Fermentations of sugarcane molasses were performed according to the scale-down system proposed by Basso et al. [34] and described by Raghavendran et al. [35]. This closely simulates the Brazilian industrial fed-batch process with cell recycling and sulfuric acid treatment. Briefly, strains were propagated in molasses medium (10% w/v hexose sugars) at 32 °C, centrifuged at 3000 × g for 20 min, and resuspended. For the initial inoculum, and at each new fermentation cycle, a cell suspension (70% w/v) was diluted in water up to 30% (12 mL) of the total fermentation volume (40 mL). The added substrate (diluted molasses) constituted the remaining 70% (28 mL), with the total reducing sugar (TRS) content adjusted to yield, at the end of each fermentation round, the intended ethanol titer, calculated assuming ~90% of the theoretical conversion of 0.511 g ethanol per g of reducing sugars (glucose/fructose) [35]. Fermentations were performed in triplicate at 34 °C in a volume of 40 mL in 50-mL centrifuge tubes. At the end of the fermentation, cell viability counts were taken as described by Basso et al. [34]. Cells were pelleted before the wet biomass was weighed. The supernatant was saved for ethanol estimation as described by Basso et al. [34]. Preceding each cycle, the biomass collected from the previous cycle was subjected to a sulfuric acid treatment, performed for 1 h by adding 0.5 M H2SO4 (pH 2.5). Molasses feeding for the next cycle was performed as described above. Trehalose content was estimated from biomass collected at the beginning of the fermentations and at the end of the last cycles, as previously described by Basso et al. [34].
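The TRS adjustment can be illustrated with a rough calculation. This sketch rests on several assumptions that are not stated in the text: the target titer is expressed in % (v/v) of the full 40 mL, the ethanol density is taken as about 0.789 g/mL, and ethanol carried over with recycled cells as well as the volume occupied by biomass are ignored. Function names are hypothetical.

```python
ETHANOL_DENSITY_G_PER_ML = 0.789  # approximate value at 20 °C (assumption)
THEORETICAL_YIELD = 0.511         # g ethanol per g reducing sugars (from the text)
ASSUMED_EFFICIENCY = 0.90         # ~90% of theoretical conversion (from the text)


def trs_required_g(target_titer_vv, total_volume_ml=40.0):
    """Grams of TRS needed to reach a target ethanol titer (fraction v/v)."""
    ethanol_ml = target_titer_vv * total_volume_ml
    ethanol_g = ethanol_ml * ETHANOL_DENSITY_G_PER_ML
    return ethanol_g / (THEORETICAL_YIELD * ASSUMED_EFFICIENCY)


def feed_trs_g_per_l(target_titer_vv, total_volume_ml=40.0, feed_volume_ml=28.0):
    """TRS concentration (g/L) of the 28 mL molasses feed under the same assumptions."""
    return trs_required_g(target_titer_vv, total_volume_ml) / (feed_volume_ml / 1000.0)


# Hypothetical target of 8% (v/v) ethanol
print(round(trs_required_g(0.08), 2), round(feed_trs_g_per_l(0.08), 1))
```

Because the substrate makes up only 70% of the fermentation volume, the feed must be more concentrated than the sugar level implied by the final titer alone.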