Introduction

Antimicrobial resistance (AMR) is a global health concern. Humans are in a constant arms race with bacteria that are continually evolving resistance to multiple lines of antibiotic treatment. AMR can be encoded either chromosomally or on plasmids, with the latter enabling the rapid spread of resistance within and between populations [1]. In many cases, several genes encoding resistance to a variety of antibiotic classes can be acquired, leading to the emergence of multidrug resistance [2]. Multidrug-resistant (MDR) pathogens, including Escherichia coli, can cause more severe, prolonged infections and can ultimately have global pandemic potential [3].

The E. coli species shows considerable genotypic and phenotypic variation, with a resistance spectrum that ranges from largely drug-susceptible commensals to globally disseminated MDR pathogenic clones [4, 5]. Pathogenic E. coli include enteropathogenic E. coli (EPEC), enterotoxigenic E. coli (ETEC), enterohaemorrhagic E. coli (EHEC), enteroaggregative E. coli (EAEC), and extraintestinal pathogenic E. coli (ExPEC) [1], with increasing diversity as the species expands and evolves. The E. coli species can be broadly divided into phylogroups A, B1, B2, C, D, E, or F, with further categorization into sequence types (ST) (Fig. 1). ExPEC can be important pathogenic MDR clones, and their STs mainly fall into phylogroup B2 (ST131, ST1193), although some are found in phylogroups A (ST167, ST410) and F (ST648) [6,7,8,9,10]. STs can be split further into clades (monophyletic groups) and clones, here defined as organisms that share common phenotypic or genotypic traits, as characterized by a strain typing method, and that descend from a common ancestral organism by non-sexual reproduction [11]. Within the E. coli species, “clone” may refer to a subset of a clade (e.g., ST131-H30R1) or a whole ST (e.g., ST1193) (Fig. 1). The development of AMR, and in particular multidrug resistance, is a key feature of a pandemic clone. In addition, pandemic clones must be easily transmitted, persistent, and able to adapt to a variety of hosts if they are to be found globally and lead to infection on a pandemic scale [9]. Here, we focus on pandemic MDR clones of E. coli that encode resistance to antibiotics including β-lactams and carbapenems.

Fig. 1

The E. coli species can be subdivided (by increasing level of specificity) into phylogroups, sequence types, clades, and subclades. Key groups discussed in this review are represented to show their relative relationships to one another. Branch lengths are not to scale. *Entity referred to as a clone in this review. **C1-M27 is referred to as a subclone

The emergence of MDR E. coli lineages has been rapid (Fig. 2), with a notable rise in the last three decades that is attributed widely to increased global travel [12,13,14] and the current usage of antimicrobials in humans and animals. Clinical causes of concern are extended-spectrum β-lactamase (ESBL)-producing E. coli, resistant to β-lactam antibiotics including penicillins, cephalosporins, and monobactams [15], and carbapenem-resistant E. coli (CREC). The increasing presence of CREC further limits antibiotic treatment options, with colistin as one of the few remaining drugs of last resort. Intriguingly, multidrug resistance is only observed within certain E. coli lineages and is restricted to specific clones within them [6]. Understanding this restricted distribution of multidrug resistance in E. coli is essential if we are to predict and prevent the emergence of new clones.

Fig. 2

Key evolutionary dates for ST131 (yellow), ST410 (blue), ST648 (purple), and ST1193 (red). The timeline demonstrates emergence dates estimated by studies using time-scaled phylogeny techniques (*). Examples of key dates first reported (†) and first observed (‡) in published papers are also noted. Estimated dates vary between different datasets, and these have been included as alternative suggestions. Key references are provided at the relevant points in the text

In this review, we describe the emergence and evolution of pathogenic E. coli MDR clones, using the well-studied ST131 as an illustrative example, and identify parallelisms in other lineages. We present a hypothesis of the stepwise evolution of MDR clones that arise by a series of potentiating mutations. We argue that resistance is not the primary evolutionary driving factor; rather, the acquisition of resistance is the result of a series of founding events that prime a lineage to develop an MDR phenotype (Table 1).

Table 1 The potentiating mutations discussed in this review, including the stage of MDR clone evolution in which they may play an important role and key clone-specific references

ST131 as a model lineage for MDR clone evolution

ST131 is a widely studied lineage of E. coli, capable of frequent interspecies transmission and containing within it several important pandemic MDR clones [16]. The most recent common ancestor (MRCA) of ST131 is estimated to have emerged in the mid-1800s [6] (Fig. 2). ST131 possesses a three-clade structure that is dictated by the possession of different FimH alleles: H41, H22, and H30 (alternatively referred to as clades A, B, and C in the Petty classification) [6, 17]. ST131-H30 incorporates an H30S (susceptible) basal component (known alternatively as C0), which gave rise to the ST131-H30R (resistant) clade [18,19,20,21].

ST131-H30R, characterized by the FimH30 allele, is thought to have diverged from the drug-susceptible H30S basal component in 1982 before splitting into parallel sister subclades H30R1 and H30Rx [18, 22]. Both ST131-H30R1 and ST131-H30Rx clones harbor fluoroquinolone resistance (FQR) and encode blaCTX-M genes [6, 23]. ST131-H30R, possessing blaCTX-M-14, was prevalent in the early 2000s, but numerous studies indicate that it has since been superseded by ST131-H30R1 (blaCTX-M-27) and ST131-H30Rx (blaCTX-M-15) [19, 24,25,26,27]. The extent of the prevalence of these different ESBLs varies by geography and study location (e.g., community, hospital) [19, 24,25,26,27]. In 2016, a Japanese epidemic of ESBL-producing ExPEC was reported, caused by the rapid evolution of ST131-H30R1-blaCTX-M-27 (C1-M27 subclone) [19]. C1-M27 had already spread to five countries on three continents at the time of its discovery [19], and prevalence has remained high since [28]. The ST131-H30Rx subclade encodes the ESBL blaCTX-M-15 [6, 23]. Global nosocomial infections of ST131-H30Rx were reported by 2008, and this clone has since become dominant [19, 29, 30]. ST131 is perhaps an unusual lineage because it is a successful pathogen while encoding multiple plasmid-encoded resistance genes, thereby contradicting the expectation that fitness costs associated with plasmid carriage would decrease pathogenicity [29].
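The nested nomenclature used above (phylogroup, sequence type, clade, subclade, subclone) can be easier to follow as a small lookup structure. The following is a minimal sketch in Python that encodes only the groupings stated in this review; the names `TAXONOMY` and `find_path` are illustrative, and placing H30S and H30R side by side under H30 is a simplification of the "basal component" relationship described in the text.

```python
# Nested mapping: phylogroup -> sequence type -> clade -> subclade -> subclone.
# Only groupings named in the text are encoded; an empty dict means
# "not subdivided further in this sketch", not "has no subdivisions".
TAXONOMY = {
    "B2": {
        "ST131": {
            "H41": {},  # clade A in the Petty classification
            "H22": {},  # clade B
            "H30": {    # clade C
                "H30S": {},  # drug-susceptible basal component (C0)
                "H30R": {
                    "H30R1": {"C1-M27": {}},
                    "H30Rx": {},
                },
            },
        },
        "ST1193": {},
    },
    "A": {"ST167": {}, "ST410": {}},
    "F": {"ST648": {}},
}


def find_path(name, tree=TAXONOMY, prefix=()):
    """Return the tuple of group names leading to `name`, or None if absent."""
    for key, subtree in tree.items():
        path = prefix + (key,)
        if key == name:
            return path
        found = find_path(name, subtree, path)
        if found:
            return found
    return None


print(" > ".join(find_path("C1-M27")))
```

Looking up a group returns its full lineage, which makes explicit that C1-M27 is a subclone of H30R1 within clade H30 of ST131.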

The emergence of ST131 as a lineage containing pandemic MDR clones has been rapid and has come under extensive observation. Here, we examine the signatures of the ST131 pandemic clones that have likely contributed to their success. We postulate that the ability to colonize a niche, and subsequently expand within it, are the first steps in this pathway.

Colonization

Pathogenic lineages of E. coli are heavily shaped by foothold genetic changes that can be considered primary events that give rise to phenotypes favoring colonization [31]. A key factor that shapes the ability of E. coli lineages to colonize a novel niche, particularly their transition from commensal to pathogenic, is their complement of virulence factors (VFs). VFs can be introduced via mobile genetic elements or located on chromosomal pathogenicity islands [32], with the former being particularly problematic. A mobile VF that improves the fitness of a pandemic clone may result in rapid spread through a community or healthcare setting.

ExPEC-specific traits, such as enhanced iron uptake and adhesion to extraintestinal tissue, are encoded by VFs that provide ST131 with a competitive advantage in host colonization [10, 33, 34]. The possession of VFs does not necessarily manifest a more virulent phenotype directly [33]. VFs may instead contribute to the success of a pandemic clone in more nuanced ways. The production of adhesins, including those encoded by papACEFG [33], may, for example, increase the ability of a clone to colonize a host via attachment to epithelial cells. Proteases such as OmpT may also aid adhesion to host tissues and play a role in the formation of intracellular bacterial communities [35]. E. coli pandemic clones are associated with clone-specific alleles of the fimH gene, encoding the type 1 fimbrial adhesin, and these alleles are associated with increased pathogenicity [17, 32, 36, 37]. FimH is a mannose-binding adhesin that is critical for the colonization of bladder epithelial cells during infection by uropathogenic E. coli [35]. The ST131-H30R clade is associated with the FimH30 allele, and there is speculation that clones within this clade may have greater fitness in a pathogenic niche in comparison to other E. coli, although an exact cause of this phenotype has not been identified [36]. Greater fitness could conceivably contribute to the rapid, global dissemination of ST131-H30Rx.

Furthermore, clones can successfully overcome host restriction of iron availability, a common defense strategy to combat bacterial pathogens [38]. A clone secretes siderophores to bypass this nutritional immunity: siderophores bind Fe3+ with a greater affinity than the host proteins transferrin and lactoferrin and so strip sequestered iron from them [38]. E. coli ST131 clones are known to encode siderophore-related genes including the aerobactin siderophore receptor iutA [33]. VFs may therefore increase the likelihood that a clone will colonize successfully through their contribution to cellular invasion and adhesion [39, 40].

Characteristics provided by VFs, beyond direct virulence, may therefore represent contributing factors that set a clone on a pathway to multidrug resistance by increasing the likelihood that it will establish a foothold within a given niche. This increased likelihood may arise from an overall enhancement of clonal fitness rather than from a direct influence on the severity of infection inflicted by the clone. This hypothesis is supported by investigations in a murine sepsis model, which showed that ST131 isolates were not associated with higher infection severity than non-ST131 isolates [41]. In this study, the authors noted that the unparalleled success of ST131 could therefore be explained solely by enhanced colonization capabilities. These capabilities would result in a greater proportion of hosts being colonized by ST131, consequently increasing the likelihood of a persistent infective population that could go on to acquire resistance.

Niche domination and niche migration

Following successful colonization, the next step on the pathway to a widely disseminated MDR clone is expansion both within its environment and beyond. Metabolic flexibility may assist with the former. An ability to adapt rapidly to fluctuating nutrient availability via a larger complement of metabolic transporters or enzymes might allow one species, clade, or clone to outcompete another, increasing the likelihood that an isolate can persist within a given environment. A metabolically flexible clone may be more likely to subsequently acquire plasmids encoding multidrug resistance as a product of its increased persistence. This theory has been explored in drug-resistant clades of ST131. Specifically, genes involved in anaerobic metabolism were found to be more diverse in the drug-resistant clade ST131-H30 in comparison to the drug-susceptible ancestral clades [34]. For example, seven variants of the eutA gene, encoding an ethanolamine ammonia-lyase reactivase that acts in the ethanolamine degradation pathway, were found in ST131-H30 genomes. An allele of the cobW gene, involved in the biosynthesis of the cofactor cobalamin, was also found in the drug-resistant clade. Together, these imply selection for increased metabolic flexibility in the anaerobic gut environment [34]. A recent study has also found that, in the presence of antibiotics, strains of E. coli from clinically relevant lineages will develop mutations in core metabolic genes [42]. These include mutations in the icd gene, which encodes isocitrate dehydrogenase, a key enzyme within the tricarboxylic acid (TCA) cycle. That variation can occur in core central metabolism genes under selection pressure from antibiotics underscores the likelihood of a relationship between metabolism and resistance that should be explored further.

For the successful pandemic clones of ST131, wider expansion is key. Most ExPEC lineages show stable population frequencies over 10 years following the emergence of ST131 in the UK despite exhibiting inferior resistance profiles. This suggests that ExPEC lineage distribution is shaped by negative frequency-dependent selection, preventing the MDR lineages from sweeping to fixation and completely dominating a population [10, 34].
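The reasoning behind negative frequency-dependent selection can be made concrete with a toy simulation. The sketch below is not the model used in the cited studies; it is a minimal illustration, assuming each lineage has a hypothetical equilibrium share and that its relative fitness falls as its frequency rises above that share. The function name, the `strength` parameter, and all numerical values are assumptions chosen for illustration only.

```python
def simulate_nfds(freqs, equilibrium, strength=0.5, generations=500):
    """Iterate a toy negative frequency-dependent selection model.

    Each lineage's relative fitness is 1 + strength * (equilibrium_i - freq_i),
    so a lineage is favored while rarer than its equilibrium share and
    penalized once it becomes more common than that share.
    """
    freqs = list(freqs)
    for _ in range(generations):
        fitness = [1 + strength * (e - f) for e, f in zip(equilibrium, freqs)]
        mean_fitness = sum(f * w for f, w in zip(freqs, fitness))
        # Standard discrete-time selection update, renormalized to sum to 1.
        freqs = [f * w / mean_fitness for f, w in zip(freqs, fitness)]
    return freqs


# Hypothetical scenario: an MDR lineage introduced at 1% frequency alongside
# a resident susceptible lineage, with assumed equilibrium shares of 30%/70%.
final = simulate_nfds(freqs=[0.01, 0.99], equilibrium=[0.3, 0.7])
print([round(f, 3) for f in final])
```

Under these assumptions the invading lineage rises rapidly while rare but stalls at its equilibrium share instead of sweeping to fixation, which is the qualitative pattern described above.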

It is important to note that clonal success may not depend purely on the possession of a resistant phenotype. A recent study of surveillance isolates collected in Norway found that the largely drug-susceptible ST131-H41 clade, notably lacking blaCTX-M and FQR genes, was well-established and dominant relative to more resistant ST131-H22 isolates [43]. If resistance were the sole driver of the success of a clone, one might expect that the MDR clone, here from ST131-H22, would over time become the dominant isolate. A reanalysis of this dataset has, however, found that 61 of the 75 ST131-H41 genomes encoded non-blaCTX-M β-lactamases, raising the question of whether ST131-H41 can in fact be defined as drug susceptible [44]. Further, a longitudinal study of E. coli isolates in Oxfordshire, UK (2008–2018), has found that the presence of AMR genes is linked to an increased incidence of bloodstream infections [45]. The drivers of E. coli clonal success are therefore likely to be multifaceted, with AMR not necessarily essential for the success of a clone [46].

Mutations that give rise to resistance

Colonization and expansion can be considered as primary events in ST131 MDR clone evolution, increasing the likelihood that the clone will acquire mutations that lead to resistance. This proposed trajectory is supported by work that revealed epistasis between the duration of host bacterial carriage and resistance; resistance confers a greater fitness advantage when the duration of carriage is extensive [46].

FQR is a key characteristic common to pandemic E. coli clones. FQR in E. coli is encoded chromosomally as a result of mutations in several genes, notably those encoding DNA gyrase (gyrA) and topoisomerase IV (parC) [47]. A correlation has been established in clinical isolates of E. coli between the number of resistance mutations carried by a strain and the minimum inhibitory concentration of fluoroquinolones against that strain, with resistant isolates also found to have a higher mutation rate than susceptible isolates [48]. FQR is notably present in ST131-H30R, within which a blaCTX-M-15-carrying clone, ST131-H30Rx, has also been discovered [23, 49]. It is perhaps most likely that the mutations conferring FQR have been fixed once, followed by clonal expansion of the resistant ancestor. As mentioned previously, a fixed alternative FimH allele in a clone is also often heavily associated with FQR [36].

FQR in isolation is not known to trigger the expansion of a pandemic clone, but the fitness advantage that the FQR phenotype presented for ST131 in the clinical context of 1980–1990 is likely to have been a factor in its overall success [18]. Indeed, the likely importance of the sequential acquisition of VFs and FQR as a contributor to the global dissemination of ST131 has been suggested previously [18], lending support to the hypothesis that stepwise changes are key to the emergence of MDR clones. The evolution of FQR therefore aptly demonstrates the influential role that small but significant foothold mutations play in MDR clone evolution.

Priming for multidrug resistance plasmid acquisition

Contrary to the accepted “plasmid paradox” whereby plasmid carriage imposes significant fitness costs, ESBL plasmids have been shown to incur no general fitness cost in ST131 and in some cases even increase virulence properties [50]. It has been suggested that mutations in gene regulatory regions may compensate for plasmid fitness costs and contribute to the dominance of ST131 as an MDR pathogen in the clinical setting [16, 34, 51, 52].

Analysis of gene regulatory regions across ST131-H41, ST131-H22, and ST131-H30 has shown that alleles found in ST131-H41 and ST131-H22 are more similar to one another than those of ST131-H30, suggesting that the long-term presence of multidrug resistance plasmids in the latter could be the influential factor [16]. Numerous studies have shown that the genetic background of the recipient heavily influences the fitness impact of a plasmid, although more work is needed to pinpoint the specific epistatic features that enable these clones to carry multidrug resistance plasmids so efficiently [51, 53, 54].

The gain of multidrug resistance plasmids

The acquisition of multidrug resistance-inducing mobile genetic elements is the final step in the evolution of a pandemic ST131 clone. CTX-M class ESBL genes are a common feature, conferring resistance to third-generation cephalosporins and typically brought in by IncF-type plasmids [52]. ST131-H30Rx is thought to be particularly proficient in acquiring and maintaining large multidrug resistance plasmids [34]. The acquisition of the blaCTX-M-15 gene via IncFII plasmids is a key evolutionary event in ST131-H30Rx, a subclade of ST131-H30R [6, 9]. The dominant variant harbored by ST131-H30Rx is blaCTX-M-15, but other variants including blaCTX-M-14 and blaCTX-M-27 can be found within ST131-H30R1 at lower frequencies [6]. Beyond resistance, ESBL plasmids can also introduce beneficial colonization traits including biofilm formation, further solidifying the repertoire of enhanced survival capabilities [10, 16].

The evolution of pandemic clones within ST131 could therefore be considered to occur via a series of stepwise events that together contribute to the success of these MDR clones. Interesting parallelisms have been observed in other E. coli STs, albeit with lineage-specific variation. This lends support to the hypothesis that these steps are important to the development of an MDR clone, while the exact nature of certain adaptations brought about by potentiating mutations may be lineage dependent.

Parallelisms in other lineages: ST410

Like ST131, ST410 is estimated to have emerged in the 1800s, with key clones, specifically B2/H24R and B3/H24Rx, arising in the 1980s [9] (Fig. 2). This was followed by the emergence of carbapenem-resistant ST410-B4/H24RxC, encoding the carbapenemases blaOXA-181 and blaNDM-5 in addition to the blaCTX-M-15 encoded by ST410-B3/H24Rx [9, 55]. Crucially, multiple polymorphisms have been identified that have influenced the evolution of ST410-B4/H24RxC and that are associated with the acquisition of these carbapenemase genes. These potentiating mutations occur in the porin genes ompC and ompF and in ftsI, which encodes a β-lactam target [56]. The acquisition of a blaCTX-M-15 gene, via an IncFII plasmid, is another feature of ST410-B4/H24RxC in common with ST131 clones [6, 9].

Clones within the ST410 lineage display parallelisms with ST131 pandemic clones in several other regards. ST410-B4/H24RxC is, for example, able to sequester iron from lactoferrin [55], a property which promotes gut colonization. The genetic causes of this key phenotype are single nucleotide polymorphisms (SNPs) and clone-specific core gene alleles, most notably a unique allele of fhuA, encoding the ferrichrome porin responsible for ferric hydroxamate uptake [57]. In this instance, the unique fhuA allele was introduced via a homologous recombination event early in the formation of the clone. ST410-B4/H24RxC is also associated with distinct clone-specific alleles of fimH (FimH24) [9] and of yadC, which encodes a fimbriae-like protein [55]. The speculation that the FimH30 allele of ST131-H30Rx improves the fitness of the clone may reasonably be extended to ST410-B4/H24RxC and FimH24. Another trait mirrored between ST410 and ST131 pandemic clones is the observation that ST410-B4/H24RxC does not show an augmented virulence phenotype in comparison to other isolates of the ST410 lineage [54].

Furthermore, there is evidence for clone-specific SNPs in anaerobic metabolism loci in ST410-B4/H24RxC [55] that mirror prior findings within ST131 [16]. Specifically, a study of clinical CREC strains identified 382 SNPs in ST410-B4/H24RxC, of which 60 were non-synonymous mutations in 48 genes [55]. The authors note that the majority of these genes function in metabolism, including araAB (L-arabinose degradation) and thiBPQ (thiamine transport), and three genes that encode dehydrogenases involved in anaerobic metabolism: leuB, lpdA, and the putative shikimate dehydrogenase gene. In another parallelism with ST131, FQR, arising from mutations in gyrA and parC, is known to be present in ST410 clones [9].
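The distinction between synonymous and non-synonymous SNPs that underlies these counts can be illustrated with a short sketch. The code below is not the pipeline used in the cited study; it is a minimal example, assuming a single-base change within one in-frame codon and using the standard genetic code (whose amino acid assignments are shared by bacteria). The function name `classify_snp` and the example codons are illustrative choices.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, position, alt_base):
    """Classify a single-base substitution within one in-frame codon.

    `position` is the 0-based index of the changed base within the codon.
    Returns "synonymous" if the encoded amino acid (or stop) is unchanged,
    otherwise "non-synonymous".
    """
    codon = codon.upper()
    mutant = codon[:position] + alt_base.upper() + codon[position + 1:]
    if CODON_TABLE[codon] == CODON_TABLE[mutant]:
        return "synonymous"
    return "non-synonymous"


# Illustrative codons only: a third-position change that keeps serine,
# and a second-position change that swaps serine for leucine.
print(classify_snp("TCT", 2, "C"))
print(classify_snp("TCG", 1, "T"))
```

Applying such a classification across every coding SNP in a clone is what yields a summary of the kind reported above, where only a subset of the identified SNPs alter the encoded protein.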

The ST410-B4/H24RxC MDR clone is further distinguished from the rest of the ST410 lineage by specific SNPs in genetic loci involved in gene regulation [55]. The intergenic regions in which these SNPs have been found include those upstream of fhuA and downstream of sgrR and sgrT, the regulator and inhibitor, respectively, of a glucose transporter [55]. Intergenic SNPs, including those related to metabolic processes, may therefore be considered as a species-wide evolutionary marker of an MDR clone, given the parallelisms between ST410 and ST131.

ST167

Like ST410, ST167 is a lineage within phylogroup A (Fig. 1). ST167 emerged from the ST10 clonal complex and has only been reported clinically within the last 10 years [58]. It is now a significant CREC lineage that is known to carry the carbapenemase blaNDM-5 [8, 58].

The evolution of ST167 draws parallels with ST131-H30 in the diversity exhibited in its anaerobic metabolism genes. It is known to encode, for example, unique alleles of tdcD, a propionate kinase that functions in the anaerobic degradation of L-threonine [8]. A clone-specific allele of eutA is present in ST167, a reactivase that also has clone-specific alleles present within ST131 [8, 34]. While ST410-B4/H24RxC is known to have a unique allele of fhuA, ST167 possesses a variant of fhuB, encoding an iron(III) hydroxamate ATP-binding cassette (ABC) transporter membrane subunit [8]. These variations, and that observed in the anaerobic metabolism-related genes of other MDR clones, may contribute to increased fitness in the environment of the mammalian gut and to the overall success of a pandemic clone.

In a similar manner to clones of ST131 and ST410, small mutations also likely played a role in the expansion of the ST167 lineage. ST167 exhibits clone-specific compensatory mutations in intergenic regions, namely, IGR0012 and IGR0014, mirroring observations in ST131 and ST410. These mutations in intergenic regions are proposed to be an evolutionary event underpinning the successful integration of large AMR plasmids and hence the emergence of multidrug resistance [8].

This CREC lineage is also known to have acquired mutations linked to resistance, notably in ftsI, which encodes penicillin-binding protein 3. The resulting allele is highly similar to an allelic variant of the same gene that is present in carbapenem-resistant ST410 strains [8, 56]. It could therefore be concluded that ST167 displays many of the same hallmarks of multidrug resistance evolution that are epitomized in ST131.

ST648

ST648 is a globally disseminated ExPEC lineage that is known to encode ESBLs [40, 59, 60]. It was first reported in 2009 [61] (Fig. 2), and less is known about this lineage than about ST131. There are, however, still parallelisms in ST648 that could be regarded as potentiating mutations.

ST648, for example, is known to harbor more virulence-associated genes than the mostly commensal ST10 lineage [10]. These include VFs that promote colonization, a trait observed in ST131 pandemic clones. VFs that relate to colonization and competitiveness and that are common to both lineages include the previously mentioned iutA and ompT [40]. The ST648 pandemic lineage also exhibits enhanced curli fiber production in comparison to ST10, aiding adherence to epithelial cells and biofilm formation [10]. This phenotype is particularly interesting given that the ST648 isolates investigated by Schaufler et al. were found to lack certain genes, including the pap operon, that are typically associated with urinary tract infections [10]. There are, however, other isolates of ST648 that are predicted to encode pap genes [62], indicating there may be interesting differences within this lineage. Additionally, ST648 has been shown experimentally to have colonization capabilities comparable with ST131 in a murine in vivo assay [62].

There are therefore clear similarities between ST648 and clones of both ST131 and ST410. ST648 is categorized into phylogroup F (Fig. 1), demonstrating that parallelisms to ST131 can be found across the E. coli phylogeny. As yet, little is known about the potential for important metabolic variations within ST648. Given the observations surrounding SNPs in anaerobic metabolism genes in other E. coli lineages, this should be investigated as a priority in ST648 to establish whether metabolic variation is an important part of its evolution.

ST1193

ST1193 is an emerging FQR clone of E. coli that carries the FimH64 allele [63, 64]. This lineage has emerged much more recently than ST131 and ST410 (Fig. 2) but is nevertheless rapidly acquiring the hallmarks of a pandemic clone. ST1193 has been reported to be one of the most common E. coli STs in some studies of nosocomial infections, second only to ST131 [65, 66]. ST1193 isolates are known to encode blaCTX-M-15, blaCTX-M-27, and blaTEM, as well as resistance to trimethoprim-sulfamethoxazole and tetracycline [67, 68]. Significantly, it has also been reported to possess the iron uptake gene iutA at a frequency of over 90% in a screen of Chinese hospital isolates in 2014–2015, among other VFs including fyuA (yersiniabactin) and kpsM II (group II capsule) [64].

ST1193 differs from ST131 in its evolution of FQR, with an additional mutation in parE in some ST1193 isolates [69]. The mutations that led to FQR in ST1193 are thought to have been gained simultaneously in 11 homologous recombination events [70]. This evolution of FQR is a significant feature of ST1193, with Tchesnokova et al. suggesting that this could be a key driver behind the success of this clone [70]. It would be interesting to monitor the evolution of ST1193 to track further similarities and differences between this recently emerged clone and the pandemic clones of ST131.

Concluding remarks

In many respects, E. coli ST131 could be considered a model for the evolution of MDR clones. The stepwise emergence of multidrug resistance as a result of a series of potentiating mutations promoting colonization and expansion is a pathway that could conceivably be mirrored in other species. Indeed, there are several parallelisms (the influence of enhanced colonization capabilities, the role of point mutations, and the fixing of FQR) in pandemic clones of other E. coli STs. This is a highly clinically relevant consideration; the acquisition of multidrug resistance plasmids may be the last link in a chain that could be broken earlier in non-resistant lineages. Specifically, one could consider whether variations in virulence factors or metabolic genes could be an early warning for the propensity of a lineage to subsequently evolve into an MDR clone.

Overall, the evolution and emergence of MDR E. coli clones are multifaceted. The influential factors discussed here are likely only small parts of the story, and uncovering the remainder of the narrative will have far-reaching clinical implications.