Introduction

Helicobacter pylori infects more than 50% of the world's population and is recognized as an important public health concern. The infection may be associated with gastrointestinal diseases such as chronic gastritis, gastric/duodenal ulcers, MALT lymphoma and stomach cancer1.

Various regimens for H. pylori eradication therapy are used worldwide, and the choice of an appropriate regimen is determined by local resistance patterns2. Conventional triple therapy (clarithromycin (CLR), proton pump inhibitor (PPI), and metronidazole (MTZ) or amoxicillin (AMX)) may be recommended as first-line treatment in areas of low (< 15%) CLR resistance, while quadruple therapy (bismuth, PPI, MTZ, tetracycline) is recommended if local CLR resistance exceeds 15%2,3,4. According to the recently published Maastricht VI guidelines, routine susceptibility testing (molecular or after culture) is recommended even before prescribing first-line treatment. However, systematic antimicrobial susceptibility testing (AST) in clinical practice remains controversial because of the practical, economic and logistical issues that must be considered3,5.

Russia is among the countries with a high prevalence of H. pylori infection (65–90% depending on the region)6. According to a meta-analysis that summarized studies of H. pylori antibiotic resistance in various regions of Russia over 10 years (2011–2020), the resistance rate to CLR reached 10.39%. This indicates a low level of resistance (< 15%) and allows the triple regimen to be considered as first-line empirical therapy (that is, therapy prescribed in the absence of AST data) in our country7. However, empirical therapy frequently results in treatment failure, which, in turn, promotes the development of antibiotic resistance in the pathogen.

During the last decade, many molecular approaches, such as polymerase chain reaction (PCR), Sanger sequencing and whole genome sequencing (WGS), have been used to identify known and novel markers for prompt prediction of phenotypic antimicrobial resistance. It is generally accepted that the point mutations A2142G and A2143G (according to Taylor et al. numbering) in the V domain of the 23S rRNA gene are mainly responsible for H. pylori resistance to CLR8. At the same time, point mutations outside these positions (T1942C, G1939A, C2147G, G2172T, T2182C, A2116G, A2144G/T, A2115G, G2111A, T2117C, G2141A, T2717C, T2289C, G2224A, C2245T) vary geographically, and their role in CLR resistance is not yet clear9,10. Based on WGS results, some sequence changes in the rpl22 (encoding a ribosomal protein that interacts with the 23S rRNA domains) and infB (encoding a translation initiation factor, IF-2) genes in CLR-resistant H. pylori strains have been described11. In addition, four gene clusters of efflux pump systems (HP0605-HP0607 (hefABC), HP0971-HP0969 (hefDEF), HP1327-HP1329 (hefGHI), and HP1489-HP1487) were identified as members of the resistance-nodulation-cell division (RND) family and are expected to be involved in the development of CLR resistance. Some studies suggest that the number of single nucleotide variants (SNVs) in the RND family is significantly higher in CLR-resistant than in CLR-susceptible H. pylori isolates8,12,13.

Because data on CLR resistance patterns in Russian H. pylori clinical isolates are lacking, we aimed to comprehensively investigate sequence variations using a WGS-based approach, to detect putative markers of CLR resistance in H. pylori clinical isolates, and to evaluate the correlation between genotypic and phenotypic AST.

Results

Phenotypic AST of study cohort

For our analysis we selected two types of isolates: resistant (inhibition zone diameter < 10 mm) and susceptible (> 41 mm). Isolates with intermediate susceptibility were not included in our study. Of 44 H. pylori clinical isolates, 27 were obtained from newly diagnosed patients and 17 from previously treated patients after eradication failure.

As a result, 23 out of 44 H. pylori isolates were phenotypically determined as CLR-resistant, while 21 isolates showed susceptibility to CLR. All isolates from previously treated patients were CLR-resistant by phenotypic AST.

Plasmid detection in H. pylori isolates

Of the 44 H. pylori genomes analyzed in our study, 40 comprised hypothetical proteins, recognized by annotation tools as plasmid sequences. However, no resistance genes in putative plasmids were found.

Single nucleotide polymorphisms in 23S rRNA genes

Among the 44 H. pylori isolates, 21 nucleotide substitutions were identified in the 23S rRNA gene; however, only one of them, A2147G (also known as A2143G), was significantly associated with phenotypic resistance of H. pylori to CLR: of the 23 CLR-resistant isolates, 13 (56.5%) carried the 2147G allele (p = 0.00025) (Table 1). The A2147 allele was related to CLR susceptibility in 95.2% (20/21) of cases, since one strain phenotypically determined as CLR-susceptible possessed the A2147G mutation.

Table 1 nsSNVs and Indels in coding regions of CLR-resistant and CLR-susceptible H. pylori clinical isolates compared to the H. pylori 26695 reference genome.

The other point mutation, A2146G (also known as A2142G), was found in 13.0% (3/23) of CLR-resistant isolates (p = 0.33). The A2146C allele was not detected in our study. All CLR-susceptible isolates (100%; 21/21) possessed the A2146 allele. None of the CLR-resistant isolates carried the A2146G and A2147G mutations simultaneously: all 2147G isolates carried the A2146 allele and, vice versa, all 2146G isolates carried the A2147 allele. Seven CLR-resistant isolates had neither the A2146G nor the A2147G mutation. The agreement between genotypic and phenotypic resistance to CLR based on the A2146G/A2147G mutations was 69.6% (16/23).
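The percentages in this section can be recomputed from the reported counts. The Python sketch below does so for the combined A2146G/A2147G genotype (16 of 23 resistant isolates mutation-positive, 1 of 21 susceptible isolates mutation-positive). The function and variable names are illustrative and are not part of the study's pipeline, and the overall agreement across all 44 isolates is an extra quantity derived here from the same counts rather than a value reported in the text.

```python
def concordance(mut_res, n_res, mut_sus, n_sus):
    """Summarise genotype-phenotype agreement from two-by-two counts.

    mut_res / n_res: mutation-positive and total resistant isolates
    mut_sus / n_sus: mutation-positive and total susceptible isolates
    """
    resistant_detected = mut_res / n_res             # resistant isolates carrying a mutation
    susceptible_wild_type = (n_sus - mut_sus) / n_sus  # susceptible isolates without a mutation
    overall = (mut_res + (n_sus - mut_sus)) / (n_res + n_sus)
    return resistant_detected, susceptible_wild_type, overall


# Counts taken from the text: 13 + 3 = 16 mutated resistant isolates,
# and one A2147G-positive isolate among the 21 susceptible ones.
sens, spec, overall = concordance(mut_res=16, n_res=23, mut_sus=1, n_sus=21)
print(f"resistant isolates with A2146G/A2147G: {sens:.1%}")
print(f"susceptible isolates with wild type:   {spec:.1%}")
print(f"overall agreement (derived):           {overall:.1%}")
```

The first two values correspond to the 69.6% and 95.2% figures given above.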

Of the 17 H. pylori isolates from previously treated patients, 13 (76.5%) possessed either the A2146G or the A2147G mutation. This observation is consistent with the generally accepted view that resistance to CLR increases during treatment.

Since the H. pylori genome contains two copies of the 23S rRNA gene, we assessed the presence of putative allelic variants due to mutational differences between the copies. A survey of the depth of mapped reads showed that the 2146G and 2147G alleles were present in 100% of reads in 15/16 CLR-resistant isolates, which indicates that the mutations have occurred in both copies of 23S rRNA. However, one CLR-resistant isolate (HP220) demonstrated heterogeneity between the two copies of 23S rRNA: the 2147G allele was detected in only 50% of reads of one copy, while the second copy showed 80.4% of reads with the A2147 allele.

The remaining mutations outside 2146/2147 positions in 23S rRNA gene were found in H. pylori isolates regardless of phenotypic susceptibility (Table S1).

Nucleotide variants in rpl22, infB and RND family genes

Since the presence of four RND efflux clusters (HP0605-HP0607, HP0971-HP0969, HP1327-HP1329, HP1487-HP1489) and certain genetic variants in rpl22 (HP1314) and infB (HP1048) genes have been reported to be presumably associated with CLR-resistance, we analysed the distribution of sequence variations in these genes between two phenotypic groups as well as between the A2146G/A2147G-mutant isolates and the wild type8,10,11,13.

As a result, we found that all 12 efflux pump genes were present in all 44 H. pylori genomes. Moreover, we did not find any significant differences in either individual SNVs or the total number of SNVs between the CLR-resistant and CLR-susceptible groups.

In the analysis of infB sequence variations, we did not identify any of the C718T, A1899G, G160A or G60A mutations, which have been considered to be involved in increasing antibiotic resistance9,10,11. Among all types of SNVs, only one, C2763A, significantly predominated in CLR-susceptible isolates (p = 0.03; OR: 4.6743) (Table 1). The other synonymous variant, G2118A, was found specifically in CLR-resistant isolates (26.1%; p = 0.021); however, silent mutations are not likely to affect the phenotype. We also did not find significant differences in the total number of missense mutations in the infB gene, either between the CLR-resistant and CLR-susceptible groups or between the A2146G/A2147G-mutant isolates and the wild type.

The most interesting finding among genetic variants in the rpl22 gene is the absence of two types of mutations, a 3 bp (GTG) deletion and a 9 bp (TTCCATGTA) insertion, which have been detected in numerous studies8,9,10,11,33. Instead, we identified three missense variants, none of which was associated with phenotypic resistance/susceptibility status (Table S1).

Distributions of virulence-associated genes in H. pylori clinical isolates

To investigate whether the presence or absence of virulence-associated genes is related to phenotypic resistance of H. pylori to CLR, we evaluated the presence of cag pathogenicity island (cagPAI) genes (cag1-cag26), vacuolating cytotoxin gene (vacA), ulcer-associated restriction endonuclease (induced by contact with epithelium) and adenine specific DNA methyltransferase genes (iceA1, hpyIM), urease gene cluster (ureABCIEFGH), gamma-glutamyltranspeptidase gene (ggt), virulence factor protein (mviN), outer membrane protein family (omp1-omp32, ompP1), toxin-like and iron-regulated outer membrane proteins genes (fecA, frpB), flagellar family genes (flaA, flaB, fla, flaA1, flgH, fliF, flgI, fliE, flgB, flgC, flgG, flhA, flhF, flhB, fliP, fliR, fliH, fliI, flgE, flgK, fliD, motA, motB, fliG, fliM, flaG, hpaA, fliN), heat shock protein family genes (hslU, hslV, htpX, ibpB) in two groups of H. pylori isolates. Since the cagA (cag26) gene is considered to be a marker of the cag pathogenicity island, the presence of the cagPAI region was assessed by detection of the cagA gene.

As a result, the majority of virulence genes were observed in all 44 H. pylori genomes. Only the presence of cagPAI differed between groups: it was higher in the CLR-susceptible group than in the CLR-resistant group (16/21 vs. 13/23), although not significantly (p = 0.42) (Fig. 1). Interestingly, none of the Russian isolates contained the cag2 (HP0521) gene, and all 44 H. pylori isolates, regardless of antimicrobial susceptibility, harbored the A589G mutation in the cag13 (HP0534) gene, leading to the loss of the stop codon and fusion with the coding region HP0533.

Fig. 1

Comparison of the presence of cag pathogenicity island (cagPAI) between resistant and susceptible groups of H. pylori isolates.

Liu et al. (2022) found that the average frequency of the iron-regulated OMP genes (fecA and frpB) in the CLR-resistant group was significantly lower than that in the CLR-susceptible group (p = 0.01)27. In our study, both genes were present in all H. pylori isolates, and the average number of mutations in the fecA and frpB genes was not associated with any phenotypic group.

For the first time, we also compared the average number of all fsIndel and nonsynonymous variants in 106 genes listed above between CLR-resistant and CLR-susceptible groups of H. pylori isolates. There were no statistical differences in the average number of mutations between two phenotypic groups (p > 0.05).

Novel sequence variations in coding regions of Russian H. pylori isolates

To detect new putative markers of phenotypic resistance/susceptibility to CLR, we conducted a global gene analysis of all H. pylori isolates compared to the H. pylori 26695 reference genome. We investigated sequence variants assessed by the annotation tool as having a significant impact on the translated protein and evaluated their association with the antibiotic phenotype. The majority of variants were equally shared between the two phenotypic groups of isolates. However, we were able to detect five novel resistance-associated variants, as well as variants that predominated in either the resistant or the susceptible phenotypic group. These findings are summarized in Table 1 and Table S1.

Among the OMP (Outer Membrane Protein) family genes, the most interesting findings were related to the omp32 (HopW), omp5 and omp25 (HopI) genes. The missense mutation A2C resulting in loss of the start codon in the omp32 gene, the conservative inframe insertion 7:insTTTTGA in the omp5 gene, and the nsSNV C1400T in the omp25 gene were each detected in 21.7% (5/23) of CLR-resistant isolates (p = 0.049). None of these mutations were found in CLR-susceptible isolates.
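As an illustration of how such an association can be tested, the following Python sketch implements a two-sided Fisher's exact test from first principles and applies it to the 5/23 versus 0/21 table described above. It is a minimal sketch, not the study's code (the analysis itself was run in R), and the function name is an assumption made for this example.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Probability of k variant carriers in the first row, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / total

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Variant present/absent in resistant (5, 18) and susceptible (0, 21) isolates.
p = fisher_exact_two_sided(5, 18, 0, 21)
print(f"p = {p:.4f}")
```

For this table the sketch gives p ≈ 0.0497, in line with the reported p = 0.049.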

Several unique frameshift (fs) alterations in the HP0820 coding region were detected in 30.4% (7/23) of CLR-resistant isolates. The fs deletion 252:delCGGGT was linked with another fsIndel further downstream in HP0820: a deletion of two (262–263delGG) or three (262–264delGGG) nucleotides and an insertion of a TTAGCACA fragment.

Among the 28 flagellar family genes, three nsSNVs in the flaB gene were found predominantly in the resistant group; however, only one of them (C783A) was significantly associated with resistance (p = 0.021). Conversely, the prevalence of the nsSNV C1997G in the fliD gene was significantly higher in the CLR-susceptible than in the CLR-resistant group (33.3 vs. 4.3%, respectively) (p = 0.019, OR: 0.1057741).

Three multidrug resistance protein genes (hetA, msbA, spaB) were analysed in the present study. The proportion of CLR-resistant H. pylori isolates carrying the missense mutation G98A in the multidrug resistance protein SpaB gene was significantly higher than that of CLR-susceptible isolates [52.2 vs. 19.0% (p = 0.030; OR: 4.3874)]. Interestingly, using WGS analysis, Liu et al. (2022) showed that spaB was absent in most strains isolated from Shanghai, whereas in our study this gene was observed in all clinical isolates27.

In addition, the same report showed that four variations (S275N, D353N, 120-121delGG and ins116G117) within the integrase/recombinase gene xerD were significantly associated with both CLR and MTZ resistance27. In the present study, the S275N mutation was found without significant differences between resistant and susceptible phenotypes (82.6 vs. 80.9%; p > 0.05) (Table S1). Instead of the 120-121GG deletion, we identified multiple linked missense substitutions of four nucleotides (117-124CTAGGATT → TTAATATC) that were more prevalent in the resistant (34.8%) than in the susceptible (19.0%) group, although not significantly [p = 0.32; OR: 2.1937]. The insertion 116G117 and the missense variant D353N were not detected in our study.

Five point substitutions within the recombinational repair protein RecA gene were linked together and were carried predominantly by resistant (34.8%) rather than susceptible (9.5%) isolates [p = 0.072; OR: 4.674359]. The prevalence of the missense variant (C1T) in the type III restriction enzyme gene (res2), which replaces the alternative start codon Val with the standard Met, was higher in the resistant group, although not significantly [p = 0.097; OR: 0.1472963]. It is known that changing GUG to AUG results in a several-fold increase in translation efficiency28.

Since rpl genes are considered to be the genes with the greatest value, we initially analysed 49 ribosomal protein genes. As a result, we found one nsSNV (C257T) within the rpl10 gene (encoding ribosomal protein L10 of the 50S subunit) that predominated in the CLR-resistant group, but not significantly [p = 0.072; OR: 4.6743].

Further analysis of the distribution of newly detected mutations specific for resistant group in the A2146G/A2147G-mutated H. pylori clinical isolates revealed that 6 of 7 (85.7%) HP0820-mutated isolates possessed either 2146G or 2147G alleles (Table 2). In total, more than half (67.8%) of CLR-resistant isolates with mutations in five coding regions (omp5, omp25, omp32, HP0820, flaB) were A2146G/A2147G-positive.

Table 2 The distribution of newly detected mutations in A2146G/A2147G-positive and A2146G/A2147G-negative CLR-resistant H. pylori clinical isolates.

Phylogenetic analysis and relationship between isolate relatedness and CLR-resistance patterns in H. pylori isolates

Based on 153,292 core SNVs of the 44 H. pylori genomes, we constructed a maximum likelihood phylogenetic tree and annotated it with CLR-resistance patterns (Fig. 2). No relationship between isolate relatedness and the antibiotic phenotype was revealed. Instead, the isolates had distinct resistance profiles regardless of their genetic relatedness.

Fig. 2

Phylogenetic analysis of 44 Russian H. pylori genomes with respect to CLR-resistance patterns. Maximum likelihood tree based on core SNVs alignment from draft genome mapping against H. pylori 26695 reference strain and resistance profile corresponding to specific genetic determinants that differ between CLR-susceptible and CLR-resistant groups of H. pylori isolates. The susceptible and resistant patterns are denoted by blue and red rectangles; the presence and absence of the specific loci are denoted by green and grey rectangles, respectively.

Discussion

Despite insufficient knowledge and dynamic interregional differences in the resistance of H. pylori to the main antimicrobials, the level of primary CLR resistance in Russia is low (10.39%), which allows a standard triple eradication regimen to be considered as first-line therapy. Thus, routine H. pylori susceptibility testing is usually not performed in clinical settings, while empirical therapy may result in eradication failure as well as an increasing resistance rate. According to the latest data (2020), the prevalence of primary CLR resistance of H. pylori reached 10.87% in Moscow, 5.74% in Smolensk, 10.0% in Kazan and 22.26% in St. Petersburg29,30. Such a difference indicates wide variation in resistance rates across Russia, highlighting the need for ongoing monitoring of CLR resistance as well as cure rates in each geographic region.

For our study, we used the conventional disc diffusion method to determine the susceptibility of H. pylori isolates to CLR. Although EUCAST has recommended the E-test for MIC determination, the high cost of E-test strips makes it difficult to use routinely. Recently, Tang et al. (2020) assessed the disc diffusion technique against the E-test and estimated that the susceptibility agreement between the two methods was 96.0% for CLR14. Disc diffusion has been demonstrated to be a simple and affordable alternative to the E-test for in vitro CLR susceptibility testing31,32.

The mechanism of drug resistance in H. pylori is believed to be due to chromosomal mutations and not related to horizontal transfer of plasmids or other mobile elements33. It is generally accepted that H. pylori resistance to CLR is associated with two point mutations, A2146C/G and A2147G (according to H. pylori reference strain 26695 numbering), in the V domain of the 23S rRNA gene, which decrease the affinity of drug binding to the ribosome by disrupting the peptidyl transferase loop conformation6. In the present study, A2147G was the most frequent mutation found in CLR-resistant H. pylori isolates (56.5%) and was highly correlated with CLR resistance. However, the A2146G mutation was detected in only 13% of CLR-resistant isolates, which does not allow us to assess the actual role of this mutation in resistance development. Taken together, the A2146G and A2147G mutations were detected in 69.6% of CLR-resistant isolates, which is consistent with prior studies conducted in Europe demonstrating that these mutations were present in 60–90% of CLR-resistant isolates8,33,34. In addition, we did not find any isolates carrying both mutations simultaneously, which is also in agreement with most previous observations8,9,34,35. The high rate of CLR-resistant isolates without A2146G/A2147G mutations (30.4%) in our study indicates that these isolates may have genetic determinants other than the 23S rRNA gene that are responsible for resistance development.

Notably, one isolate phenotypically determined as CLR-susceptible in our study possessed the A2147G mutation. Such a case may indicate heteroresistance caused either by evolutionary changes in a single strain or by the presence of mixed susceptible and resistant isolates. As shown in our previous study, this isolate was positive for mixed combinations of vacA alleles (s1s2 and i1i2), indicating the coexistence of several genetically different H. pylori isolates in gastric sites as a result of mixed infection6.

Furthermore, there are two copies of the 23S rRNA operon within the H. pylori genome. Sequence differences between the two copies have rarely been studied, and data on whether 2146/2147 mutations in one or both copies of 23S rRNA are necessary for resistance to occur are still controversial35. In this study, 16 CLR-resistant isolates possessed A2146G/A2147G mutations, and only one of them demonstrated heterogeneity between the two copies of 23S rRNA: the 2147G allele was detected in 50% of reads of one copy, while the second copy showed 80.4% of reads with the A2147 allele. This observation suggests that H. pylori isolates should be considered resistant if they have the mutation in at least one copy of the 23S rRNA gene.

Other genes assumed to be involved in CLR resistance are rpl22 and infB. Several point mutations in infB (C718T, A1899G, G160A, G60A), as well as a 9-bp insertion and/or 3-bp deletion in rpl22, have been described, which in combination with 23S rRNA mutations were predicted to increase the resistance level of H. pylori isolates11,36. Interestingly, no indels in the rpl22 gene were detected in our study; instead, we found a 3-bp deletion in the infB gene, primarily in CLR-susceptible isolates, although the difference was not significant. Notably, we did not find any of the point mutations in the infB gene that were detected in other studies. Among all SNVs in infB, no substitutions unique to resistant isolates were identified. Altogether, these data provide further evidence that the high genetic diversity of H. pylori populations is a crucial factor in the search for putative determinants of resistance.

Another possible mechanism for CLR resistance is multidrug efflux pump systems. It has been reported that four conserved RND families of efflux pump transporters may be responsible for the development of macrolide resistance in H. pylori by reducing the intracellular antimicrobial concentration8,10. As shown by Iwamoto et al. (2014), there is a significant difference in the number of SNVs in the hefABC (hp0605–hp0607) cluster between CLR-resistant and susceptible groups, and CLR-resistant isolates are more prone to single-nucleotide variants in all four clusters of efflux genes13. Moreover, Chen et al. (2018) revealed that the number of mutations in the RND family was significantly higher in H. pylori isolates harboring the A2147G point mutation in the 23S rRNA gene9. In our study, we did not find any significant differences in the number of SNVs among the 12 genes, either between resistant and susceptible phenotypes or between isolates carrying the A2147G mutation and the wild type. Notably, among all types of genetic variants, we found only single-nucleotide substitutions: no deletions or insertions were detected in the four clusters of efflux genes. This result is consistent with the hypothesis that SNVs in efflux pump genes may contribute to CLR resistance through a synergistic effect between 23S rRNA mutations and efflux pumps.

Given the enormous genetic diversity of H. pylori and the high heterogeneity of resistance genotypes across geographical regions, identifying novel resistance patterns and interpreting the results obtained are challenging. WGS-based approaches provide a comprehensive view of bacterial genotypes and are particularly useful for tracking novel genetic determinants responsible for antimicrobial resistance of the pathogen10.

To identify novel mutations potentially contributing to antibiotic resistance in H. pylori isolates, we next analyzed genetic variants in other gene families presumably associated with virulence and multidrug resistance. In the current study, the largest number of mutations specific to the resistant group was observed among OMP family genes: omp5, omp25 and omp32. It is known that OMPs are associated with macrolide resistance in Gram-negative bacteria. Lin et al. (2023) suggested that alteration of outer membrane proteins is a potential mechanism for CLR resistance development in H. pylori36. Moreover, aside from mutations, the expression of the omp25 and omp32 genes was reported to be down-regulated by long-term low-dose AMX, which has a significant impact on treatment efficacy36,38. Interestingly, Thompson et al. (2003) showed close coexpression of the cagA gene with the omp5 and omp29 genes, suggesting that these OMPs are involved in the secretion and/or activation of CagA37. Among all the variants we identified in the OMP family genes, the missense mutation in the anion-selective porin Omp32, leading to the loss of the start codon, had the greatest impact on the gene. Since it is unknown whether the accumulation of such mutations may be indirectly involved in resistance development, their hypothetical impact needs to be further investigated.

Another interesting finding is the resistance-associated fs deletion in the coding region HP0820 (cj1247c in C. jejuni). Porcelli et al. (2013) described this gene as a predicted leaderless mRNA in all species containing it, where it is always located upstream of the uvrC DNA repair gene. Genes translated from leaderless mRNAs have been shown to encode proteins involved in stress responses, as well as outer membrane efflux and multidrug efflux pump proteins39. It is well known that antibiotics create conditions that favor selection of beneficial mutants with elevated mutation rates; this is most often linked to the DNA repair system and the SOS response, which can lead to increased activity of associated multidrug efflux pumps. Given that H. pylori has a defective DNA mismatch repair system (lacking genes such as mutS, mutL, and mutH implicated in this pathway) and lacks the SOS regulon, antibiotic-induced mutagenesis in various types of regulatory genes increases the genetic diversity of bacterial populations and, as a consequence, enhances the evolution of antibiotic resistance35,40.

Missense mutations in two different flagellar genes, flaB and fliD, may also be candidates indirectly involved in the mechanisms underlying resistance and/or susceptibility of H. pylori. The pathogenesis of H. pylori infection is thought to be determined in part by flagellar motility, which has a direct impact on colonization, inflammation, and immune evasion. The flagellar filament protein FlaB and the filament-capping protein FliD are among the main structural proteins of the H. pylori flagellum and play an important role in bacterial motility41,42. Even though the actual impact of these mutations on resistance development is difficult to assess, the mutational buildup in the genome may be a result of the response to antimicrobials and an important predictor of treatment failure.

As for the relationship between virulence-associated genes and phenotypic resistance, we found no significant correlation between the presence of the main virulence factors and resistance to CLR. On the one hand, our findings are consistent with the majority of previous reports27,43,44. However, it is worth noting that the proportion of cagPAI-negative isolates in the CLR-resistant group was roughly twofold higher than in the susceptible group (10/23 vs. 5/21). Thus, on the other hand, this observation is consistent with other findings showing that CLR resistance is higher in cagA-negative than in cagA-positive isolates45,46.

The present study has several limitations. First, despite the high level of agreement between the disc diffusion method and the E-test, the main drawback of this research is the inability to determine MIC values, which does not allow us to reveal a causal relationship between mutations and high/low MIC values of phenotypically resistant isolates. Second, the sample size is rather small, and the results obtained do not necessarily reflect resistance patterns in the general Russian H. pylori population. Third, we did not perform further experimental analysis to confirm the involvement of the newly detected mutations in CLR resistance development. Nevertheless, in spite of these limitations, the data obtained provide a foundation for future investigations.

Conclusions

This study presents the first WGS insight into the genetic diversity of H. pylori in Russia with a particular focus on the molecular basis of antibiotic resistance. For the first time, the average number of all fsIndel and nonsynonymous variants in 106 virulence-associated genes was compared between resistant and susceptible groups. Among all the genetic variants we obtained, the A2146G/A2147G mutations in the 23S rRNA gene are the most reliable for prediction of phenotypic AST. Sequencing data did not reveal the involvement of nucleotide variants in any efflux pump, infB or rpl22 genes in phenotypic prediction of CLR resistance, emphasizing the enormous genetic diversity of H. pylori populations. Novel mutations in other genes were described in our study as potential markers of resistance development. Among them, the most prominent is the frameshift deletion (252:delCGGGT) in the HP0820 coding region, which is a good candidate for further investigation.

Materials and methods

Study design and sampling collection

In total, 44 H. pylori clinical isolates collected between 2014 and 2022 were retrieved from the bacterial strain collection of the St. Petersburg Pasteur Institute, St. Petersburg, Russia. All strains were isolated from gastric biopsy specimens taken during endoscopy from individual adult patients with chronic gastritis (n = 32), duodenal ulcer (n = 11) and gastric cancer (n = 1). The patients were 18 men (40.9%) and 26 women (59.1%); median age 45.8 ± 4.5 years (range 22–70 years). For this study, H. pylori isolates were deliberately selected based on their CLR-resistance phenotype and hence are not an epidemiologically representative sample of H. pylori primary antibiotic resistance.

Endoscopic biopsy specimens were homogenized and cultured on H. pylori-selective medium (Columbia agar base with the addition of 5–7% defibrinated horse blood and 1% IsoVitalex solution). The bacterial cultures were incubated at 37 °C under microaerophilic conditions (10% CO2, 85% N2, 5% O2, GasPak 100, BD Biosciences, USA). After 7–10 days of incubation, H. pylori colonies were identified by microscopy of Gram-stained culture smears and biochemical tests (urease, catalase, and oxidase). Final identification was made using MALDI-TOF MS (Autof MS1000, China). The H. pylori cultures were stored at −80 °C for further examination.

Ethics approval

The retrospective study was approved by the Independent Ethics Committee of the St. Petersburg Pasteur Institute, Russia (protocol No. 50/04-2019, 22.06.2020). All patient-related data were treated anonymously. All methods in our study were conducted in accordance with relevant guidelines and regulations.

Phenotypic antimicrobial susceptibility testing (AST)

The susceptibility of H. pylori isolates to CLR was determined by the conventional disc diffusion method. The test was performed by direct suspension of 72-h cultures of H. pylori strains in Mueller-Hinton broth adjusted to a density of 0.5 OD600 or 1–2 × 10^8 CFU/ml according to the McFarland scale. The bacterial suspension (0.1 ml) was inoculated onto plates with fastidious Mueller-Hinton agar medium supplemented with 5% defibrinated horse blood and evenly distributed over the surface with a spatula. Immediately after inoculation, 6-mm-diameter discs containing CLR (15 µg) were placed on the plate surface (1 disc per plate). After incubation at 37 °C under microaerophilic conditions for 72 h, the inhibition zone diameters were measured in millimeters (mm). The H. pylori NCTC 12823 strain was used as a control for susceptibility to CLR. To date, there are no standardized cutoff values for the disc diffusion method. Using linear regression analysis, Tang et al. (2020) calculated the inhibition zone diameter breakpoint to be 41 mm for CLR, corresponding to the EUCAST MIC cutoff value of 0.5 mg/L14. Inhibition zone diameters obtained using 15-μg CLR discs also correlate well with the MIC by E-test (r = −0.894).
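The selection rule that follows from this breakpoint and from the criteria stated in the Results (resistant: zone < 10 mm; susceptible: zone > 41 mm; everything in between excluded) can be sketched in a few lines of Python. The constant and function names, and the example diameters, are hypothetical and serve only to illustrate the rule.

```python
RESISTANT_BELOW_MM = 10    # study criterion: zone < 10 mm is scored resistant
SUSCEPTIBLE_ABOVE_MM = 41  # breakpoint from Tang et al.: zone > 41 mm is scored susceptible


def classify_zone(diameter_mm: float) -> str:
    """Assign a CLR phenotype from a disc diffusion inhibition zone diameter."""
    if diameter_mm < RESISTANT_BELOW_MM:
        return "resistant"
    if diameter_mm > SUSCEPTIBLE_ABOVE_MM:
        return "susceptible"
    # Isolates in this range were not included in the study cohort.
    return "intermediate"


# Hypothetical measurements; 6 mm equals the disc diameter (no inhibition).
for zone in (6, 25, 50):
    print(zone, "mm ->", classify_zone(zone))
```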

DNA extraction and sequencing

The total DNA of H. pylori isolates was extracted using the QIAamp DNA Mini Kit (QIAGEN GmbH, Germany) according to the manufacturer's guidelines. The DNA concentration of each sample was quantified on a Qubit 4.0 fluorometer. Whole-genome shotgun DNA libraries were prepared using the MGIEasy FS DNA Library Prep Set and then sequenced on a DNBSEQ-G50 sequencer (MGI Tech Co. Ltd, Beijing, China) in 2 × 100 bp paired-end (PE) mode.

Bioinformatics analysis

The raw paired-end reads were initially analyzed using FastQC software (v.0.12.1; Babraham Institute, Cambridge, UK) to assess their acceptability for further analysis. The reads were trimmed to remove adapters and low-quality sequences (Q-score < 20) and filtered by Trim Galore! (version 0.6.7). Bacterial genomes were assembled de novo using SPAdes assembler software (version 3.13.1)15, and the results were evaluated with QUAST (version 5.2.0)16.

To identify plasmid-encoded resistance genes, trimmed FASTQ files were analyzed with the plasmidSPAdes pipeline (version 3.15.4)17. Next, high-quality reads were merged with PEAR software18, installed via a Conda environment19, and aligned to the H. pylori 26695 reference genome, available at NCBI GenBank under the accession number AE000511.1. To evaluate the genetic variations between H. pylori isolates and reveal potential genotype-to-phenotype correlations, insertions/deletions (indels) and SNVs were called from the alignments using the Snippy pipeline v.4.6.0 (https://github.com/tseemann/snippy)20 with the following parameters: minimum base quality score, 30; minimum mapping quality, 60; minimum coverage, 20; proportion for variant evidence, 0.9. CLR-resistant isolates were also assessed for any novel point mutations. The putative impact of identified variants was predicted using SnpEff software v.4.321. Aligned nucleotide sequences were visually analyzed using UGENE v. 38.122. Particular genome features were retrieved from the UniProt platform (https://www.uniprot.org/)23. Mutation clustering analysis was performed and a maximum likelihood tree was constructed using RAxML-NG, a phylogenetic tree inference tool, based on the core genome obtained using Snippy core24. The reliability of the phylogenetic tree branches was evaluated by the bootstrapping method with 1000 replications.
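To make the effect of the coverage and variant-evidence thresholds concrete, the Python sketch below applies them to per-site read counts. It is not Snippy's implementation: the `Site` record, the function name and the example counts are hypothetical, and the base-quality and mapping-quality filters are assumed to have been applied before the reads are counted.

```python
from dataclasses import dataclass

MIN_COVERAGE = 20    # minimum number of reads covering the site
MIN_FRACTION = 0.9   # minimum proportion of reads supporting the variant


@dataclass
class Site:
    depth: int       # quality-filtered reads covering the position
    alt_reads: int   # reads among them that carry the variant allele


def passes_thresholds(site: Site) -> bool:
    """Return True if a candidate variant meets both thresholds."""
    if site.depth < MIN_COVERAGE:
        return False
    return site.alt_reads / site.depth >= MIN_FRACTION


# Hypothetical sites: well supported, too shallow, and a mixed site.
examples = [Site(120, 118), Site(15, 15), Site(100, 50)]
for site in examples:
    print(site, "->", passes_thresholds(site))
```

Under these settings a site where only about half of the reads carry the variant is not reported as a variant, which is why mixed sites need to be inspected at the read-depth level.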

All genome assemblies were deposited in NCBI and are available under BioProject “Whole-genome sequence variations in Russian Helicobacter pylori isolates” (Accession: PRJNA1011037; https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1011037).

Statistical analysis

All statistics and data visualization were performed in the R environment (version 4.3.1) with the following packages (tidyverse v. 2.0.0, epitools v. 0.5-10.1, psych v. 2.3.6)25,26.

The association between genotypic and phenotypic groups was screened using Chi-square and Fisher’s exact tests. The significance level of the differences was set at α = 0.05. We used two-by-two tables to calculate the odds ratio (OR).
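As a worked illustration of the two-by-two odds ratio, the Python sketch below computes the simple cross-product OR with a Woolf (log-scale) 95% confidence interval. It is a minimal sketch rather than the study's R code. The cell counts (12 of 23 resistant and 4 of 21 susceptible isolates carrying the SpaB G98A variant) are inferred here from the reported 52.2% and 19.0% and should be treated as an assumption. The cross-product estimate need not coincide exactly with the reported OR, since R packages such as epitools may use other estimators (for example, a median-unbiased one).

```python
from math import exp, log, sqrt


def odds_ratio(a, b, c, d, z=1.96):
    """Cross-product odds ratio with a Woolf (log-scale) confidence interval.

    a, b: variant present / absent in the resistant group
    c, d: variant present / absent in the susceptible group
    All four cells must be non-zero for this simple formula.
    """
    if 0 in (a, b, c, d):
        raise ValueError("zero cell: use a corrected or exact estimator")
    estimate = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    return estimate, exp(log(estimate) - z * se), exp(log(estimate) + z * se)


# Counts inferred from the reported percentages (assumption).
or_, lower, upper = odds_ratio(12, 11, 4, 17)
print(f"OR = {or_:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```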