Introduction

Clostridioides difficile is a spore-forming, anaerobic, gram-positive, rod-shaped bacterium commonly found in the animal gut and in soil1. It is one of the commensal bacteria in the human gut but can cause antibiotic-associated diarrhea or pseudomembranous colitis, referred to as C. difficile infections (CDIs). Transmission among the human population occurs primarily through the fecal-oral route. C. difficile has attracted attention because it causes outbreaks in healthcare facilities, resulting in a significant socioeconomic burden. From a healthcare-associated infection (HAI) standpoint, implementing contact precautions is crucial for preventing nosocomial CDI outbreaks2. The principal virulence factors of C. difficile include toxins A and B, encoded by tcdA and tcdB, respectively, in the pathogenicity locus (PaLoc)3. Additionally, the C. difficile transferase (CDT), also known as binary toxin (BT), is another potent virulence factor, encoded by cdtA and cdtB4. The prevalence of hypervirulent strains, particularly in North America and Europe, exemplified by PCR ribotypes (RT) 027 and 078, which carry all three C. difficile toxins, poses a significant global threat4,5,6,7,8,9.

Molecular epidemiology studies on C. difficile clinical isolates have been conducted to analyze epidemic dynamics within and between countries, as well as to assess CDI outbreaks in healthcare facilities. Various genotyping methods have been adopted for these studies, including ribotyping10,11,12, multilocus sequence typing (MLST/ST)13,14, toxinotyping10, surface-layer protein A (slpA) sequence typing15,16,17, pulsed-field gel electrophoresis (PFGE) typing18, and PCR-based open reading frame typing (POT)19.

The POT method offers a cost-effective and user-friendly alternative to other genotyping protocols. Utilizing multiplex PCR, this method has proven effective in determining the bacterial species and in genotyping various bacterial pathogens associated with HAIs, such as S. aureus20,21. It has been employed to investigate not only HAIs but also the regional spread of methicillin-resistant S. aureus22. In the case of C. difficile, the POT method involves two reactions, POT1 and POT2, comprising 11 and 10 multiplex PCR reactions, respectively. Binary (positive or negative) PCR results are converted to decimal numbers, and the POT genotype is represented by hyphenated POT1 and POT2 numbers, akin to a postal code (e.g., 000-000)23. POT1 correlates with the sequence type (ST), while POT2 targets toxin genes (i.e., tcdA, tcdB, cdtA) and others, enabling further classification of the isolate. Consequently, the POT method has inherently finer genetic resolution than PCR ribotyping and MLST, as demonstrated in previous studies19,24. This high resolution enhances its utility in molecular epidemiological studies of C. difficile.
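The conversion from band pattern to POT number can be illustrated with a minimal sketch. It assumes that each reaction yields one positive/negative call per target and that the calls are read most-significant-bit first. The actual band order and weighting are defined by the kit manufacturer and are not given here, so the function names, the bit order, and the band calls below are hypothetical.

```python
def bands_to_decimal(bands):
    """Read a sequence of 0/1 band calls as a binary number (first call = highest bit)."""
    value = 0
    for present in bands:
        value = value * 2 + (1 if present else 0)
    return value


def pot_genotype(pot1_bands, pot2_bands):
    """Join the decimal values of the two reactions with a hyphen."""
    # Assumed lengths: 11 calls for POT1 and 10 calls for POT2, as described in the text.
    if len(pot1_bands) != 11 or len(pot2_bands) != 10:
        raise ValueError("expected 11 POT1 calls and 10 POT2 calls")
    return f"{bands_to_decimal(pot1_bands)}-{bands_to_decimal(pot2_bands)}"


# Hypothetical band calls, chosen only to illustrate the arithmetic.
pot1 = [0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0]
pot2 = [0, 1, 0, 0, 0, 1, 0, 1, 1, 1]
print(pot_genotype(pot1, pot2))
```

Under this assumed bit order the example pattern evaluates to 826-279; the real band pattern underlying that genotype may differ.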

Historical molecular epidemiological studies on C. difficile have focused largely on global and virulent strains, often utilizing genotyping methods such as PCR ribotyping or MLST. However, in this retrospective multi-institutional study, we opted for the POT genotyping method because of its superior genetic resolution. Our analysis examines the molecular epidemiology of C. difficile across Japan, with a specific focus on clinical isolates not only of commonly found genotypes but also of minor genotypes and of those lacking toxin genes. This approach aims to provide unique and valuable insights into the molecular epidemiology of C. difficile.

Results

Overview of the POT genotypes

The 14 hospitals spanned from the north to the south of Japan (Fig. 1a). POT genotype data on 982 clinical isolates were collected. Each hospital contributed a dataset containing 10 to 177 isolates (Table 1 and Supplementary Information S1). The total number of unique POT genotypes reached 294. The most prevalent genotype was POT 826-279/ST8/RT002, accounting for 6.0% of all isolates (59/982 isolates) and identified in 11 of the 14 hospitals. This was followed by POT 691-387/ST17/RT018-like (5.2%, 51/982 isolates) and POT 700-501/ST81/RT369 (4.6%, 45/982 isolates). The toxin gene genotypes of POT 826-279 and POT 691-387 were both tcdA+/tcdB+/cdtA-. The toxin gene genotype of POT 700-501 was tcdA-/tcdB+/cdtA-. The share of the dominant POT genotype in each hospital ranged from 5.2% to 37.5%, demonstrating variability among hospitals. Overall, 190 isolates (19.3%) were toxin-negative. The most prevalent toxin gene-negative genotypes were POT 732-272 and 732-308, occupying the 9th and 10th positions, respectively.

Fig. 1

A map of Japan depicting the locations of the hospitals included in this study and the POT genotype profile. (a) The 14 hospitals are illustrated on the density map of Japan, reflecting the number of passengers using railways. The pie graph displays the profile of POT genotypes for clinical isolates from all 14 hospitals. (b) Comparison of POT genotypes in C. difficile clinical isolates from 3 hospitals in Osaka prefecture to those from 8 hospitals in non-Osaka areas. Cross-regional genotypes are underlined. Region-specific genotypes (beige) are presented within other genotypes (gray) at the top-left. An asterisk denotes statistical significance in the isolation rate of the genotype in the Osaka area relative to non-Osaka areas (P = 0.00002 by Fisher’s exact test, two-sided).

Table 1 The characterization and the genotype analysis of 14 hospitals involved in this study.

Cross-regional and region-specific genotypes

To assess the epidemic dynamics, our focus centered on the 11 hospitals that provided a POT genotype dataset of more than 40 clinical isolates. Cross-regional POT genotypes were defined as those detected in more than 8 out of the 11 hospitals, signifying their presence in remote geographic regions. Six cross-regional POT genotypes were identified, including 826-279, 691-387, 700-501, 732-272, 485-439, and 485-275. Notably, one of them, namely 732-272, was negative for toxin genes. The prevalence of cross-regional genotypes averaged 22.4%, with variations observed across individual hospitals, ranging from 8.3 to 45.8%.

Each of the 11 hospitals had unique POT genotypes, which were defined as region-specific genotypes (250 isolates, 218 distinct POT genotypes). This uniqueness extended even to hospitals in close proximity, such as hospitals E, F, and G in Osaka prefecture, or J and K in Fukuoka prefecture. The prevalence of region-specific genotypes averaged 26.5%, with variations observed across individual hospitals, ranging from 6.3 to 52.0%. The number of region-specific genotypes varied among hospitals, ranging from 2 to 38 per hospital. Each region-specific genotype was isolated 1–4 times. Of particular interest, 26.0% of the region-specific genotypes were toxin gene-negative.
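The two definitions can be expressed as a short classification sketch. It interprets "unique" as detected in exactly one of the hospitals and "more than 8 out of the 11 hospitals" as at least 9; the function names, hospital labels, and genotype labels are hypothetical and the data are toy values, not study data.

```python
from collections import defaultdict


def classify_genotypes(isolates, cross_regional_min=9):
    """Label each genotype by the number of distinct hospitals it was found in.

    isolates: iterable of (hospital, genotype) pairs, one per clinical isolate.
    cross_regional_min=9 encodes "more than 8 of the 11 hospitals".
    """
    hospitals_by_genotype = defaultdict(set)
    for hospital, genotype in isolates:
        hospitals_by_genotype[genotype].add(hospital)

    labels = {}
    for genotype, hospitals in hospitals_by_genotype.items():
        if len(hospitals) >= cross_regional_min:
            labels[genotype] = "cross-regional"
        elif len(hospitals) == 1:
            labels[genotype] = "region-specific"
        else:
            labels[genotype] = "other"
    return labels


def isolation_rate(isolates, labels, hospital, label):
    """Percentage of one hospital's isolates whose genotype carries the given label."""
    own = [g for h, g in isolates if h == hospital]
    hits = sum(1 for g in own if labels[g] == label)
    return 100.0 * hits / len(own)


# Toy data: "X-1" occurs in all 11 hospitals, "Y-2" only in A, "Z-3" in A and B.
isolates = [(h, "X-1") for h in "ABCDEFGHIJK"]
isolates += [("A", "Y-2"), ("A", "Z-3"), ("B", "Z-3")]
labels = classify_genotypes(isolates)
print(labels)
print(isolation_rate(isolates, labels, "A", "region-specific"))
```

Counting distinct hospitals rather than isolates keeps a single large outbreak in one hospital from being mistaken for wide geographic spread.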

A regional analysis was conducted on a broader geographic scale. Specifically, clinical isolates from three hospitals in Osaka prefecture (369 isolates from hospitals E, F, and G) were compared with the isolates from the other 8 hospitals (561 isolates, Fig. 1b). The most prevalent POT genotype in the Osaka area was 827-3 (tcdA+/tcdB+/cdtA-), constituting 6.8% (25/369 isolates); it was not one of the cross-regional genotypes. This genotype accounted for only 1.4% of isolates (8/561) in hospitals located outside Osaka, indicating a significant accumulation of this genotype in the Osaka area (P < 0.00002 by Fisher’s exact test, two-sided).
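The comparison above is a 2x2 contingency test. The following is a minimal standard-library sketch of a two-sided Fisher's exact test applied to the counts given in the text (25 of 369 versus 8 of 561). The two-sided rule used here, summing all tables that are no more probable than the observed one, is one common convention and is an assumption; the software and convention used in the study are not stated.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(k):
        # Probability that the top-left cell equals k with the margins fixed.
        return Fraction(comb(col1, k) * comb(n - col1, row1 - k), denom)

    k_min = max(0, row1 + col1 - n)
    k_max = min(row1, col1)
    p_obs = prob(a)
    total = sum(p for p in map(prob, range(k_min, k_max + 1)) if p <= p_obs)
    return float(total)


# Counts from the text: genotype 827-3 in Osaka (25/369) vs outside Osaka (8/561).
p = fisher_exact_two_sided(25, 369 - 25, 8, 561 - 8)
print(f"two-sided P = {p:.2g}")
```

Exact rational arithmetic is used so that the comparison of table probabilities is not affected by floating-point rounding.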

Analysis of the isolation rates of region-specific and cross-regional genotypes

The isolation rate of region-specific genotypes was negatively correlated with the occupancy of the predominant POT genotype (P = 0.009, Student’s t-test, two-sided). This finding suggests that the presence of divergent genotypes in a hospital reduces the occupancy of the predominant genotype. Hospital I in Shimane prefecture exhibited the highest isolation rate of region-specific genotypes at 52.0% (P = 0.022 by Smirnov–Grubbs test). This area is characterized by low population density and limited human mobility across prefectural borders25, which may be related to the high isolation rate of region-specific genotypes, as addressed below.

Conversely, hospital J in Fukuoka reported the lowest isolation rate of region-specific genotypes at 6.3% (P = 0.0495 by Smirnov–Grubbs test). Hospital J also showed a high percentage of the predominant POT genotype and a high isolation rate of cross-regional genotypes (37.5% and 45.8%; P = 0.016 and 0.034, respectively, by Smirnov–Grubbs test). These findings suggest that cross-regional strains may be endemic in this geographic area or that nosocomial spread of this genotype occurred within hospital J.

The potential correlation between C. difficile spread and human mobility was investigated. Cross-regional strains may have become more widespread than region-specific strains because of higher transmissibility. Horizontal transmission of C. difficile is anticipated to be more prevalent in mobile human populations. To test this hypothesis, we analyzed the correlation between the percentage of cross-regional genotypes and indicators reflecting human mobility flow. Hospital J was excluded for the reasons discussed above. We utilized the number of passengers at the train station nearest to each hospital in 2018 or 2019 as an indicator of human mobility, given the significant reliance on public transportation in Japan (Fig. 2 and Supplementary Information S2). A positive correlation was observed between these variables (r = 0.792, N = 10, P = 0.008 by Student’s t-test, two-sided). Notably, no such correlation was identified with the human population or population density in the respective municipalities where the hospitals were located.
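A correlation test of this kind can be sketched as follows, assuming a Pearson correlation coefficient whose significance is assessed with a t statistic on N - 2 degrees of freedom. The exact procedure and software used in the study are not stated, and the passenger counts and isolation rates below are hypothetical placeholders rather than the study data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null hypothesis of no correlation (n - 2 degrees of freedom)."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Hypothetical values for 10 hospitals: daily passengers (thousands) at the
# nearest station and isolation rate of cross-regional genotypes (%).
passengers = [5, 12, 20, 33, 47, 60, 85, 110, 150, 210]
cross_regional_rate = [10, 14, 12, 19, 22, 18, 27, 25, 33, 38]

r = pearson_r(passengers, cross_regional_rate)
t = t_statistic(r, len(passengers))
print(f"r = {r:.3f}, t = {t:.2f}, df = {len(passengers) - 2}")
```

The two-sided P value is then obtained by comparing the t statistic with Student's t distribution with N - 2 degrees of freedom, for example from a statistical table or package.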

Fig. 2

The correlation between the isolation rate of cross-regional genotypes of C. difficile and the number of passengers utilizing the railroad station closest to each hospital. The two parameters exhibit a positive correlation with a correlation coefficient of r = 0.792 (N = 10, P < 0.008, by Student’s t-test, two-sided).

The analysis of toxin genes

Analysis of toxin genes revealed that among 982 clinical isolates, 794 (80.9%) were positive for at least one toxin gene using the POT method. The predominant genotype was tcdA+/tcdB+/cdtA- (75.6%, 600/794 isolates), followed by tcdA-/tcdB+/cdtA- (19.1%, 152/794 isolates), and tcdA+/tcdB-/cdtA- (0.3%, 2/794 isolates: POT 944-242 and 945-434). Notably, the genotype tcdA+/tcdB- had not been previously reported in Japan.

Out of the 982 isolates, 50 (5.1%) were positive for the cdtA gene. All cdtA-positive isolates belonged to the genotype tcdA+/tcdB+/cdtA+, with the exception of one isolate (tcdA-/tcdB+/cdtA+). Isolates positive for all three toxins are considered highly virulent26. Such isolates were identified across all geographic areas in Japan. For instance, genotype 4-315 was found in locations including Tokyo, Osaka, and Ehime, while 4-347 was isolated in Aichi, Hyogo, and Ehime (Supplementary Information S3). These findings underscore that individuals within the human population may unknowingly carry and spread highly virulent C. difficile strains asymptomatically.

Temporal profiling at university hospital E

Temporal profiling was conducted at hospital E from 2019 to 2021 to longitudinally analyze the C. difficile POT genotypes. The 3-year period was divided into six intervals (Fig. 3 and Supplementary Information S4). The isolation trends of the six most frequently identified genotypes in the hospital were assessed, namely 827-3, 700-501 (15 isolates each), 700-437, 732-308 (10 isolates each), 826-279 (9 isolates), and 691-387 (6 isolates). Most genotypes exhibited continuous isolation at similar frequencies throughout the study period. Notably, 700-501 was more frequently isolated in 2019, and 700-437 showed higher isolation rates in late 2021 than in other study intervals (asterisks in Fig. 3). Both 700-501 and 700-437 were identified as tcdA-/tcdB+/cdtA- and ST81/clade 4. Without the POT method, it would not have been possible to discern that these two peaks were attributable to two independent C. difficile strains.

Fig. 3

Temporal profiling of POT genotypes of clinical isolates in hospital E. The six most frequently isolated genotypes are presented across six intervals from 2019 to 2021. *, P = 0.0001; **, P = 0.004 by Fisher’s exact test, two-sided.

From a clinical perspective, a nosocomial outbreak of CDI was suspected when patients who exhibited GI symptoms were hospitalized on the same floor of a ward, with onset within 4 weeks after the initial case. The infection control team (ICT) of hospital E declared a CDI outbreak at the first peak in 2019 because 6 of the 8 C. difficile POT 700-501-positive patients were hospitalized on the same floor of the ward, although 2 of the 6 patients received medical care in distinct departments (single asterisk, Fig. 3). In addition, two of the patients showed no apparent GI symptoms. The rooms of 2 of the 8 patients were on a different floor, and these patients were therefore excluded from the outbreak control measures. Retrospective genotype data confirmed the nosocomial CDI outbreak. Following the initial outbreak, 5 patients tested positive for C. difficile POT 700-501 in the latter half of 2019, and all of them were on a different floor of the same ward (double asterisk, Fig. 3). In 2021, 6 patients tested positive for C. difficile POT 700-437 (double asterisk, Fig. 3). However, the ICT did not take special action in response to these latter two episodes, primarily because the patients' rooms were on different floors and no apparent link between the C. difficile-positive patients was identified.

Discussion

Here we provide evidence of genotype diversity among clinical C. difficile isolates and show that every hospital had both region-specific and cross-regional genotypes. Notably, region-specific genotypes were found in every hospital even though some of the hospitals were located in close geographic proximity. These findings can be attributed to the utilization of a high-resolution genotyping method, the inclusion of hospitals located across Japan, and the analysis of clinical isolates encompassing minor and toxin gene-negative genotypes. This is the first report suggesting the existence of region-specific, indigenous strains of C. difficile.

It is widely accepted that the human population primarily acquires C. difficile through environmental sources or contaminated food1. The cross-regional strains may inhabit soil nationwide. The modern food chain could contribute to the widespread distribution of food contaminated with cross-regional strains. On the other hand, the correlation observed between the isolation rate of cross-regional strains and human mobility flow suggests that these strains might be spreading among the general population, extending beyond healthcare settings into common living environments. In this way, the cross-regional strains might have come to occupy a significant portion of the clinical isolates. The spread of C. difficile in the community is difficult to stop, partly because commonly used disinfectants, such as ethanol, have limited efficacy in preventing transmission owing to the spore-forming nature of C. difficile. The cross-regional strains may have a higher capacity to form spores. The identification of rarely isolated tcdA+/tcdB+/cdtA+ strains in geographically remote hospitals further suggests community-wide spread of C. difficile.

Currently, cross-regional strains are negative for the CDT. The isolation rate of CDT-producing strains in Japan has been relatively low (0 to 6.8%)11,27,28,29,30,31,32,33,34,35,36. It could be a concern if a CDT-positive strain were to emerge as a cross-regional strain. Hence, the implementation of molecular surveillance is imperative to monitor and detect such potential changes.

Our data reveal distinct genotype dominance in each hospital, emphasizing the need for caution when selecting an institution to represent a country in molecular epidemiological studies comparing C. difficile genotypes among countries. The variability in genotype distribution across hospitals underscores the importance of choosing a representative medical institution that accurately reflects the genetic landscape of C. difficile within a specific geographic region.

Indigenous strains isolated from CDI patients demonstrated a potential virulence to cause antibiotic-associated acute gastroenteritis similar to that of cross-regional strains. However, it is crucial to emphasize that the virulence of indigenous strains remains to be assessed in future studies. These indigenous strains would be expected to be found in the soil of each region. Considering Japan's diverse climates and geological characteristics, C. difficile may be adapting to local environments, leading to the selection of indigenous strains with diverse genotypes. From this perspective, cross-regional strains might be adapted to a wide range of environments for survival. Another plausible scenario is that cross-regional strains may have been introduced to the region through bioaerosols from the Eurasian continent, transported by the westerlies, a phenomenon observed in other Clostridia species37,38,39,40,41.

Genotyping is a valuable tool for nosocomial outbreak surveillance. Our findings underscore the importance of continuous monitoring of clinical C. difficile genotypes to promptly identify nosocomial outbreaks of CDI. This continuous monitoring is crucial because establishing the baseline prevalence of genotypes is essential for the appropriate evaluation of molecular epidemiological data.

The statistically significant detection frequency of C. difficile POT 700-501 and 700-437 in 2019 and 2021, respectively, is noteworthy. The C. difficile-positive patients were hospitalized on different floors of the ward (Fig. 3). Contact transmission is the primary nosocomial transmission route for C. difficile. Thus, these events were not considered nosocomial CDI outbreaks. Nevertheless, it is likely that nosocomial spread of C. difficile occurred in these two episodes, possibly not through direct contact transmission. The possible route of nosocomial transmission will be addressed in detail in future studies to better understand the transmission dynamics of C. difficile.

A limitation of this study is its retrospective nature. The genotyping process was conducted individually by each hospital, and there was variability in the number of isolates and the criteria for patient selection. Some hospitals initiated genotyping studies in response to reported CDI outbreaks, while others applied the genotyping approach universally to all submitted specimens that tested positive in C. difficile rapid testing and proceeded to isolation culture. This might have influenced the observed dominance rate and genotype diversity. Despite this limitation, the central conclusion of our study remains robust: each hospital exhibited its predominant C. difficile genotype and a unique set of region-specific genotypes. Another limitation of this study is that the phenotype of the clinical isolates, including toxin production and antimicrobial resistance, was not assessed. This aspect will be addressed in future studies.

The POT genotyping method is PCR-based, characterized by its relative affordability and ease of execution. The typing results are strengthened by the inclusion of a control reaction performed in parallel with the specimens, enhancing the reliability of the POT data even when conducted in independent facilities. However, it is important to acknowledge a potential caveat associated with POT genotyping—the occurrence of amplification errors due to mutations in the C. difficile genome. This concern is particularly relevant when assessing toxin genes. It is crucial to note that the presence of toxin genes, as identified by POT genotyping, does not necessarily imply the actual production of the corresponding toxins as proteins. Therefore, estimation of the bacterial virulence solely based on POT genotypes should be interpreted with caution. Complementary assessments, such as functional studies or toxin expression analyses, may be necessary to more accurately infer the virulence potential of the identified genotypes. The microbial diagnostic method using matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry (MS), which has become increasingly widespread, is a viable option. This method can rapidly discriminate C. difficile strains based on virulence, though it requires an initial investment in equipment42,43.

Materials and methods

Retrospective multi-institutional study

A literature search was conducted through the PubMed service of the National Center for Biotechnology Information (NCBI) and web search engines such as Yahoo and Google to identify healthcare facilities in Japan that utilized the POT method for molecular typing of C. difficile. Genotypic data for C. difficile were compiled from 14 hospitals across Japan (hospitals A to N), covering the period from 2005 to 2022 (N = 982 clinical isolates). Table 1 and Fig. 1 provide an overview of the characteristics, geographic locations, and the number of clinical isolates analyzed using the POT method. Each hospital conducted the molecular typing studies either because of suspected CDI outbreaks or as part of universal screening. In the latter case, stool specimens were collected from patients both with and without gastrointestinal (GI) symptoms. This approach aimed to capture a comprehensive representation of C. difficile prevalence among hospitalized individuals. C. difficile strains were isolated by spreading ethanol-treated stool specimens onto C. difficile selective agar plates. One of two types of agar plate, cycloserine-cefoxitin mannitol agar (CCMA, Nissui, Tokyo, Japan) or cycloserine-cefoxitin fructose agar (CCFA, Eiken Chemical, Tokyo, Japan, or Becton Dickinson, Tokyo, Japan), was used for this purpose. POT genotyping was conducted using the Cica geneus C. diff POT kit, following the manufacturer’s protocol (Kanto Chemicals, Tokyo, Japan). Genotyping was performed exclusively on toxin-positive isolates in hospitals C, J, and M. Minority genotype information was unavailable for hospitals L and N.

Data regarding the number of passengers at train stations were sourced from the website https://statresearch.jp, which aggregates information published by the Japanese Ministry of Land, Infrastructure, Transport, and Tourism. Population and population density data were acquired from https://www.e-stat.go.jp, utilizing information published by the Statistics Bureau, Ministry of Internal Affairs and Communications.

Longitudinal analysis in a university hospital

A longitudinal analysis was conducted on 177 C. difficile isolates collected at hospital E from 2019 to 2021. Stool specimens were obtained from a total of 177 individuals, consisting of 9 outpatients and 168 inpatients. These individuals either exhibited gastrointestinal (GI) symptoms suspected of CDI or were investigated as part of screening during CDI outbreaks. The GDH rapid test (C.DIFF QUIK CHK Complete, Alere Medical, Tokyo, Japan) was employed for the initial diagnosis of CDI. Positive specimens underwent further cultivation on the selective agar plate, CCMA. Additionally, clinical specimens were processed using the GeneXpert system to detect tcdB, cdt, and a mutant form of tcdC. Bacterial DNA was isolated using the Cica geneus DNA extraction kit (Kanto Chemicals), followed by POT genotyping. In cases where C. difficile isolation was performed multiple times from a single patient, only one genotype was included in the analysis. When a patient was readmitted and the isolation of C. difficile was performed again, the event was considered independently. The date of specimen collection was used as the basis for temporal profiling analysis. The ST genotype was determined either by the protocol previously described13 or by referencing the ST-POT correspondence table provided by Kanto Chemicals and a previous report24.
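The inclusion rules above can be sketched as a small preprocessing step. The sketch assumes that the retained isolate is the earliest one per admission and that the six intervals are calendar half-years; neither detail is specified in the text, and the record fields and values are hypothetical.

```python
from collections import Counter
from datetime import date


def first_isolate_per_admission(records):
    """Keep the earliest isolate for each (patient, admission) pair.

    A readmission carries a new admission number, so it is counted independently.
    """
    earliest = {}
    for rec in records:
        key = (rec["patient"], rec["admission"])
        if key not in earliest or rec["collected"] < earliest[key]["collected"]:
            earliest[key] = rec
    return list(earliest.values())


def half_year_interval(collected):
    """Label a collection date as, e.g., '2019-H1' or '2019-H2' (assumed interval width)."""
    return f"{collected.year}-H{1 if collected.month <= 6 else 2}"


# Hypothetical records; field names and values are illustrative only.
records = [
    {"patient": "P1", "admission": 1, "collected": date(2019, 2, 3), "pot": "700-501"},
    {"patient": "P1", "admission": 1, "collected": date(2019, 2, 20), "pot": "700-501"},
    {"patient": "P1", "admission": 2, "collected": date(2021, 11, 5), "pot": "700-437"},
    {"patient": "P2", "admission": 1, "collected": date(2019, 8, 14), "pot": "827-3"},
]

kept = first_isolate_per_admission(records)
counts = Counter((half_year_interval(r["collected"]), r["pot"]) for r in kept)
print(counts)
```

The resulting per-interval counts are the kind of table on which an interval-versus-rest comparison, such as Fisher's exact test, could then be run for each genotype.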