1 Introduction

Medullary thyroid carcinoma (MTC) is a malignant neuroendocrine tumor that arises from the parafollicular C cells of the thyroid gland. Although it accounts for less than 5% of all thyroid malignancies, MTC shows relatively more aggressive biological behavior, including cervical lymph node involvement and distant metastasis, and a higher death rate than differentiated thyroid carcinomas [1]. Approximately 25% of MTC cases are caused by activating germline RET mutations associated with the autosomal dominant hereditary cancer syndrome multiple endocrine neoplasia type 2 (MEN2); the majority of cases (about 75%), however, are sporadic and of unknown etiology [2]. With the development of tumor next-generation sequencing (NGS) technology, the relationship between specific mutations (RET, RAS) and clinical prognosis has been well established [3]. Selective RET inhibitors such as pralsetinib and selpercatinib have been approved for the treatment of RET M918T-mutated MTC and have shown promising results [4]. Although somatic RET and RAS mutations have been identified in over half of sporadic MTC cases, no known actionable mutations or biomarkers have been found in the remaining patients. The lack of genetic markers complicates both prognosis prediction and the development of targeted therapies for this group of patients, which calls for innovative research strategies.

Single-cell RNA sequencing (scRNA-seq) is a powerful tool that provides a comprehensive gene expression profile for individual cells, revealing the complexity of cell populations within a tissue sample [5]. It overcomes a key limitation of traditional sequencing, which struggles to discern distinct cell types and their functions [6]. By measuring gene expression at the level of the individual cell, scRNA-seq addresses the cellular heterogeneity of tumors, which conventional high-throughput sequencing struggles to resolve [7]. In recent years, scRNA-seq has been increasingly used as a next-generation sequencing (NGS) method in heterogeneity studies of multiple cancers and has become an important tool for cancer diagnosis and treatment [6]. Nevertheless, reports on the application of scRNA-seq in sporadic medullary thyroid carcinoma remain scarce [8].

In this study, we analyzed the single-cell transcriptomic landscape of MTC in formalin-fixed, paraffin-embedded (FFPE) samples using scRNA-seq. We characterize the different cell populations and the possible tumorigenic evolutionary processes in MTC tumors, which may contribute to the development of novel biomarkers and clinical strategies for the treatment of MTC.

2 Materials and methods

2.1 Patients and clinical samples

Tumor tissues and adjacent non-cancerous tissue samples were collected from three male patients with primary MTC pathologically diagnosed at Chongqing University Cancer Hospital between May 2022 and March 2023. All three patients had early-stage sporadic MTC (pT1N0M0), were followed up for up to 1 year after surgery, and showed a good prognosis. Targeted NGS panel testing of 18 genes (including BRAF, RAS, and RET) was performed on tumor samples from the three patients. One patient carried a RET exon 10 missense mutation, c.1832G>A (p.C611Y), at a frequency of 49.2%, while no mutations were detected in the other two patients. The median age was 46 years (43, 46, and 49 years), and the median preoperative serum calcitonin level was 630.3 pg/mL (range, 352.5 to 777.5 pg/mL). None of the patients had a significant history of infection or other malignancies, and none had undergone radiotherapy or chemotherapy prior to surgery (Table 1). The study was approved by the Ethics Committee of the Cancer Hospital affiliated with Chongqing University (CZLS2023085-A-1), and informed consent was obtained from all participants.

Table 1 Clinical and pathological characteristics of medullary thyroid cancer patients

2.2 Single‑cell preparations

After surgery, MTC samples were promptly fixed in 4% neutral formaldehyde for 6 to 24 h, processed, and embedded in paraffin to prepare FFPE tissue blocks. Two consecutive 50 μm sections were cut from the same paraffin block to create a single-cell suspension, following the protocol Isolating Cells from FFPE Tissue Sections for Chromium Fixed RNA Profiling (10x Genomics).

2.3 Library preparation and RNA sequencing

The Chromium™ microfluidic system (10x Genomics) was used to partition the hybridized mRNA of individual cells into nanoliter-scale, barcoded Gel Beads-in-emulsion (GEMs) for subsequent ligation and extension. This was followed by GEM breaking, amplification, and library construction. Quality assessment was performed using Qubit 4.0 and Agilent 2100 instruments to evaluate the insert size. The concentration of the cDNA library was determined by qPCR, and the library was adjusted to a final concentration of 2 nM for sequencing on the Illumina NovaSeq 6000 platform, as described in the protocol of the Single-cell Gene Expression 3′ kit (10x Genomics, USA).

2.4 Computational data analysis

Sequence data in FASTQ files were processed with Cell Ranger v3.1.0 (10x Genomics) for read quality trimming (q < 20), read alignment against the human reference genome GRCh38, transcript reconstruction and annotation, and abundance quantification. All data analyses were performed in R version 4.0.4 (RStudio). Cell clustering was conducted with the R package Seurat version 4. The CopyKAT tool was used to assess the landscape of copy number variations (CNVs) in single tumor cells, facilitating the distinction between malignant and non-malignant cells. GO function and KEGG pathway enrichment analyses of the principal cell subgroups were performed to investigate the biological processes potentially involving these genes. The “FindAllMarkers()” function was used to compare gene transcription among cell clusters and identify candidate marker genes, with a p-value below 0.05 denoting statistical significance. Survival analysis was conducted using the “Survival Analysis” module of the GEPIA2 web server (http://gepia2.cancer-pku.cn). The optimal cutoff was determined as the point with the most significant split, and the survival curve was then generated with the “Plot” function.
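
The idea of choosing the cutoff "with the most significant split" can be illustrated with a minimal Python sketch. It is not the GEPIA2 implementation: it assumes a two-group log-rank test is the significance criterion, the function names (logrank_chi2, best_cutoff, chi2_pvalue_1df) and the min_group parameter are hypothetical, and the patient values are invented for illustration.

```python
import math


def logrank_chi2(times, events, in_high):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    observed_minus_expected = 0.0
    variance = 0.0
    # Loop over the distinct times at which at least one event occurred.
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n_high = sum(1 for i in at_risk if in_high[i])
        d = sum(1 for i in at_risk if times[i] == t and events[i])
        d_high = sum(
            1 for i in at_risk if times[i] == t and events[i] and in_high[i]
        )
        observed_minus_expected += d_high - d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def best_cutoff(expression, times, events, min_group=2):
    """Scan candidate cutoffs; keep the one with the largest log-rank statistic."""
    best = None
    for cutoff in sorted(set(expression)):
        in_high = [x > cutoff for x in expression]
        n_high = sum(in_high)
        # Skip splits that leave either group too small (assumed safeguard).
        if n_high < min_group or len(in_high) - n_high < min_group:
            continue
        chi2 = logrank_chi2(times, events, in_high)
        if best is None or chi2 > best[1]:
            best = (cutoff, chi2)
    return best


def chi2_pvalue_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical patients: expression, follow-up (months), event flag (1 = death).
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
times = [60, 55, 58, 50, 20, 15, 12, 10]
events = [0, 0, 1, 0, 1, 1, 1, 1]

cutoff, chi2 = best_cutoff(expression, times, events)
print(cutoff, round(chi2, 3), round(chi2_pvalue_1df(chi2), 4))
```

Because the cutoff is selected after scanning several candidate splits, the resulting p-value tends to be optimistic and is best read as exploratory.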

2.5 Immunohistochemistry and fluorescence in situ hybridization assays

Tumor samples were embedded in paraffin and cut into 4 μm sections for immunohistochemistry (IHC) and fluorescence in situ hybridization (FISH) assays. IHC staining for MEKK4 (PA5-99324; ThermoFisher) was performed on the automated Ventana BenchMark XT system (Roche, Ventana Medical Systems Inc., Tucson) using standard procedures. The amplification status of the RICTOR and AKT1 genes in the tumor was evaluated by FISH with probes provided by Guangzhou Anbiping Pharmaceutical Technology Co., Ltd. (China). Representative areas were selected and evaluated by counting 30–30 nuclei in at least two regions of the tumor using an OLYMPUS BX63 fluorescence microscope according to the manufacturer’s guidelines.

3 Results

3.1 MTC transcriptome and cell subgrouping

Using the multi module of Cell Ranger, low-quality cells (nFeature_RNA less than 200, nCount_RNA less than 400, and percent.mt greater than 20%) were excluded. After filtering, 32,544 cells from the 3 patients were obtained from the tumor sample (B-10XFIX), with a median depth of 4,349 unique molecular identifiers (UMIs) and 2,250 genes per cell; in the adjacent non-tumorous sample (A-10XFIX), 11,751 cells were profiled, with a median depth of 1,274 UMIs and 967 genes per cell. A sequencing saturation of 80.25% indicated thorough coverage of the available transcripts (Fig. 1A).
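
The quality filter described above can be sketched in a few lines of Python. This is an illustration only, not the Cell Ranger or Seurat code used in the study: the three thresholds come from the text, whereas the cell records and the rule that a cell is dropped as soon as it fails any one criterion are assumptions.

```python
# Thresholds taken from the text; everything else is illustrative.
MIN_GENES = 200      # nFeature_RNA: detected genes per cell
MIN_UMIS = 400       # nCount_RNA: UMIs per cell
MAX_PCT_MT = 20.0    # percent.mt: percentage of mitochondrial reads


def passes_qc(cell):
    """Return True if a cell meets all three quality thresholds.

    Assumption: a cell is excluded as soon as it fails any one criterion.
    """
    return (
        cell["nFeature_RNA"] >= MIN_GENES
        and cell["nCount_RNA"] >= MIN_UMIS
        and cell["percent_mt"] <= MAX_PCT_MT
    )


# Hypothetical cells, not data from the study.
cells = [
    {"id": "cell_a", "nFeature_RNA": 2250, "nCount_RNA": 4349, "percent_mt": 3.1},
    {"id": "cell_b", "nFeature_RNA": 150, "nCount_RNA": 900, "percent_mt": 2.0},
    {"id": "cell_c", "nFeature_RNA": 967, "nCount_RNA": 1274, "percent_mt": 35.0},
    {"id": "cell_d", "nFeature_RNA": 500, "nCount_RNA": 300, "percent_mt": 5.0},
]

kept = [c["id"] for c in cells if passes_qc(c)]
print(kept)
```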

Fig. 1

Single-cell transcriptome profiles of MTC (B-10XFIX) and adjacent paracancerous samples (A-10XFIX). A Distribution of the number of genes, UMIs, and percentage of mitochondrial genes used as quality metrics to filter cells in the scRNA-seq data. B Bubble plot showing differential expression of representative marker genes in different cell types of MTC. C Plot of the seven cell types in MTC tumor tissue and paracancerous tissue after dimensionality reduction and clustering. D Distribution of the cell types in MTC tumor tissue and paracancerous tissue. E Violin plot showing differential expression of the CALCA gene in each cell subpopulation

The Seurat R package was used to explore heterogeneity in the MTC single-cell transcriptomic data. Nineteen cell subgroups were identified by clustering and differential gene expression analysis, and the highly expressed genes of each group were obtained by calculating the average log fold change. By integrating classic marker genes, the subgroups were classified into seven principal cell types (Fig. 1B): conventional dendritic cells (5.44%, marked by LAPTM5 and APOE), endothelial cells (6.60%, marked by PECAM1 and VWF), fibroblasts (6.94%, marked by IGHG1 and LUM), neutrophils (0.84%, marked by S100A9 and G0S2), smooth muscle cells (7.38%, marked by TAGLN and ACTA2), T cells (2.96%, marked by CD3D and TRBC1), and thyroid parafollicular cells (69.83%, marked by CALCA and CHGB) (Fig. 1C).
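
Marker-based annotation of this kind can be sketched as follows. The marker pairs are those listed above; the scoring rule (average expression of a cell type's markers, highest score wins), the function name annotate_cluster, and the cluster averages are assumptions made for illustration and do not reproduce the Seurat workflow.

```python
# Marker pairs as listed in the text for the seven cell types.
MARKERS = {
    "Conventional dendritic cells": ["LAPTM5", "APOE"],
    "Endothelial cells": ["PECAM1", "VWF"],
    "Fibroblasts": ["IGHG1", "LUM"],
    "Neutrophils": ["S100A9", "G0S2"],
    "Smooth muscle cells": ["TAGLN", "ACTA2"],
    "T cells": ["CD3D", "TRBC1"],
    "Thyroid parafollicular cells": ["CALCA", "CHGB"],
}


def annotate_cluster(mean_expr):
    """Pick the cell type whose markers have the highest average expression.

    mean_expr maps gene symbol -> average expression in one cluster;
    genes that are missing are treated as not expressed (0.0).
    """
    scores = {
        cell_type: sum(mean_expr.get(g, 0.0) for g in genes) / len(genes)
        for cell_type, genes in MARKERS.items()
    }
    return max(scores, key=scores.get)


# Hypothetical cluster averages, invented for illustration only.
clusters = {
    "cluster_0": {"CALCA": 5.2, "CHGB": 3.9, "VWF": 0.1},
    "cluster_7": {"PECAM1": 2.8, "VWF": 3.1, "CALCA": 0.4},
    "cluster_12": {"CD3D": 2.2, "TRBC1": 1.9, "LUM": 0.2},
}

labels = {name: annotate_cluster(expr) for name, expr in clusters.items()}
for name, label in labels.items():
    print(name, "->", label)
```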

3.2 Heterogeneity analysis of cell subpopulations in MTC microenvironment

We next compared the proportions of cell subgroups in MTC tumor samples with those in adjacent non-tumorous tissue samples. Parafollicular cell subpopulations were found mainly in the tumor samples, whereas T cells, neutrophils, endothelial cells, and fibroblasts were found predominantly in the adjacent non-tumorous samples, consistent with the histomorphology under the microscope (Fig. 1D). The transcription level of CALCA, which encodes calcitonin and is a marker associated with medullary thyroid carcinoma, was significantly elevated in the parafollicular cell subgroup (Fig. 1E). This is also consistent with the significantly increased serum calcitonin detected clinically in MTC patients.

Analysis of the MTC single-cell transcriptome atlas revealed seven distinct clusters of parafollicular cells (Fig. 2A). The proportions of the subtypes varied between tumor and adjacent non-tumor tissues, with subtypes C0, C1, C3, C5, and C6 predominantly identified in tumor tissues, while C2 and C4 were mainly present in the adjacent non-tumor tissues (Fig. 2B). There was considerable heterogeneity in gene expression among these cell clusters (Fig. 2C). CopyKAT was used to evaluate the CNV landscape and distinguish malignant from non-malignant cells in the single-cell analysis. The arm-level CNVs in tumor tissues were markedly elevated compared with those in adjacent non-tumor samples, with aneuploid cells being predominantly enriched in subtypes C2 and C4 (Fig. 2D).
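
One way to decide where a subtype predominates is to compare its fraction within each sample, which corrects for the different numbers of cells profiled in tumor and adjacent tissue. The Python sketch below illustrates this under stated assumptions: the comparison rule, the function names, and the per-cell labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def subtype_fractions(labels):
    """Return the fraction of cells assigned to each subtype in one sample."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {subtype: n / total for subtype, n in counts.items()}


def predominant_tissue(tumor_labels, adjacent_labels):
    """For each subtype, report the tissue where its within-sample fraction is larger.

    Ties are assigned to the adjacent tissue (an arbitrary choice).
    """
    tumor = subtype_fractions(tumor_labels)
    adjacent = subtype_fractions(adjacent_labels)
    result = {}
    for subtype in sorted(set(tumor) | set(adjacent)):
        t = tumor.get(subtype, 0.0)
        a = adjacent.get(subtype, 0.0)
        result[subtype] = "tumor" if t > a else "adjacent"
    return result


# Hypothetical subtype labels per cell, not the study's data.
tumor_cells = ["C0"] * 50 + ["C1"] * 30 + ["C2"] * 5 + ["C3"] * 15
adjacent_cells = ["C0"] * 2 + ["C2"] * 12 + ["C4"] * 6

calls = predominant_tissue(tumor_cells, adjacent_cells)
print(calls)
```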

Fig. 2

Seven subclusters of parafollicular cells in MTC and their features. A Thyroid parafollicular cell subpopulations were further integrated, dimensionality-reduced, and clustered to obtain seven subtypes. B The subtypes are present in varying proportions in tumor and paracancerous tissues (C0, C1, C3, C5, and C6 predominantly in tumor tissue; C2 and C4 predominantly in paracancerous tissue). C Heatmap of heterogeneity in gene expression across the subtypes. D Single-cell CNV landscapes were assessed using CopyKAT to distinguish malignant from non-malignant cells. Tumor tissues had significantly higher CNVs at the chromosome-arm level than paracancerous samples, with aneuploid cells concentrated in the C2 and C4 subtypes

3.3 Cell type evolutionary construction of MTC tumor cell subtypes

We used the Monocle toolkit to order single cells in pseudotime for trajectory analysis and reconstructed the evolutionary pathway corresponding to the malignant transformation of thyroid parafollicular cells. The analysis suggested that tumor cells may progress along a developmental trajectory from the C2 subtype to the C3 subtype. The C3 subtype then appears to continue its evolution by differentiating into four major malignant subpopulations: the C0, C1, C5, and C6 subtypes. The C6 cluster might be associated with the end of the evolutionary trajectory of tumorigenesis (Fig. 3A). Within the C2 subtype, a notable increase in the transcription of the CHGA, SCG2, and PROM1 genes was observed. These genes are important markers of cancer stem cell and neuroendocrine features, suggesting that C2 cells in medullary thyroid carcinoma may retain normal neuroendocrine functions and stem cell traits, which supports the view that these cells might be in the early stages of cellular differentiation. In subtype C4 cells, the transcription of the EPCAM and KRT19 genes was significantly elevated, suggesting the presence of a minor population of thyroid follicular cells within the MTC sample, which was also seen in H&E-stained tumor sections under the microscope. It is conceivable that, in the late stages of tumor development, these follicular cells are compressed by rapidly proliferating tumor cells, leading to eventual apoptosis, with only a small number surviving in the tumor samples.

Fig. 3

Life process and differentiation of malignant cells in parafollicular cell subgroups. A The proposed pseudotime trajectory plot shows C2 as the early phase of tumor transformation, with progression from C2 to the early cancerous cell population C3, which continues to evolve into four major malignant cell subpopulations: C0, C1, C5, and C6. B IHC showed strong positive expression of MAP3K4 in tumor tissue and negative expression in paracancerous tissue. Scale bars, 100 μm. C GO function and KEGG pathway enrichment analysis of differentially expressed genes in the various cell types of medullary thyroid cancer. D PROM1, which is normally expressed in adult stem cells and is thought to maintain stem cell properties by inhibiting differentiation, decreases in tumor cells as tumors develop, whereas RICTOR and AKT1 increase in tumor cells as tumors develop. E Images of FISH analysis for RICTOR and AKT1 gene amplification in MTC tissue samples

The transcription profile of the C3 subtype of parafollicular cells supported an intermediate or borderline state leaning toward transformed tumor cells. This subgroup exhibited high-level transcription of the RAF1 and MAP3K4 genes, and GO analysis indicated that it is enriched in biological processes such as epithelial-mesenchymal transition (EMT) and angiogenesis. In addition, immunohistochemical staining of MTC demonstrated high expression of MAP3K4 in tumor tissues (Fig. 3B). In the C5 subtype, we observed high expression of the PGK1 and SDHA genes, and GO functional enrichment analysis indicated the participation of SDHA in promoting angiogenesis and hypoxic signaling. The C1 subtype was characterized by high expression of genes such as RET and CEACAM5, which encode proteins involved in the intracellular signaling that dictates cell division or differentiation, thereby promoting the onset, progression, and metastasis of tumors. Functional enrichment analysis also indicated the involvement of RET and CEACAM5 in intercellular adhesion, biogenic amine metabolism, and biosynthesis of thyroid hormones. The PPIH gene, which encodes a member of the peptidyl-prolyl cis-trans isomerase (PPIase) family, was elevated in the C6 subtype, and GO functional enrichment analysis suggested that it could act as a chaperone within the spliceosome and mediate interactions between various proteins. Subtype C0 showed notably high expression of the BCL2 gene, which might contribute to resistance to apoptosis through suppression of the apoptotic pathway (Fig. 3C). Activation of the RICTOR gene, through its participation in the mTORC2 signaling pathway (for example, AKT1 phosphorylation), mediates the survival, proliferation, and metabolism of tumor cells. Tumorigenesis frequently arises from the inactivation of tumor suppressor genes or the activation of proto-oncogenes in normal tissue stem cells. This implies that stem cells in normal tissue have the potential to evolve into tumor cells or cancer stem cells as a result of genetic changes during tumorigenesis. Expression of PROM1, which is commonly found in adult stem cells and is thought to maintain stem cell properties by inhibiting differentiation, diminished in tumor cells during tumor formation. Conversely, the levels of RICTOR and AKT1 rose in tumor cells as tumors advanced (Fig. 3D). Fluorescence in situ hybridization also confirmed the amplification of these two genes in tumor samples (Fig. 3E).

Survival analysis of cancer cell-associated genes was conducted using the thyroid carcinoma cohort of the TCGA dataset (TCGA-THCA). Patients with THCA showing high expression of the C0-associated gene BCL2 had worse overall survival than those with low expression. However, the expression levels of the marker genes of the other subgroups did not show a significant association with patients’ overall survival after surgery (Fig. 4).

Fig. 4

Overall survival of patients with THCA according to expression of the C0-associated gene BCL2. High expression of the C0 subcluster-associated gene was associated with worse overall survival than low expression in a thyroid carcinoma cohort containing 510 samples. In contrast, high or low expression of the marker genes of the other subgroups was not significantly associated with patients’ overall survival after surgery

4 Discussion

Tumor heterogeneity, along with features such as sustained proliferation, evasion of growth inhibition, and resistance to cell death, presents a challenge to the precise diagnosis and treatment of cancer. It is increasingly recognized as a primary cause of therapy failure and cancer relapse [9]. As a powerful technology, scRNA-seq enables the identification of unique cell subpopulations and their distinct gene expression profiles, which helps to explore potential targeted treatment strategies and to refine the understanding of resistant clones contributing to disease relapse [10]. The application of scRNA-seq to uncommon tumors such as medullary thyroid cancer remains an under-explored area of substantial interest. However, traditional transcriptomic assays often require fresh or cryopreserved tissue or cells and are unsuitable for FFPE samples because of the difficulty of extracting high-quality RNA [11]. In recent years, significant advances have been made in the transcriptome analysis of cell nuclei from FFPE samples [12]. Compared with traditional methods for fresh tissue samples, these methods have been shown to be superior in preserving cell state and in batching samples to optimize workflow and reduce variability [13]. In this study, we also demonstrated successful scRNA-seq analysis of clinical FFPE specimens using a commercially available platform and products, which can greatly aid extensive exploration in biomedical research, particularly in clinical oncology.

MTC is a rare malignant disease derived from the small population (less than 1%) of neuroendocrine cells of the thyroid. RET mutations have been detected in both sporadic and hereditary MTC and are believed to be a predominant driver of tumorigenesis [14]. Because most cases are diagnosed at an advanced stage, the prognosis has remained dismal, although small-molecule inhibitors targeting RET M918T-bearing MTC cells, such as pralsetinib and selpercatinib, have been developed for clinical treatment [15]. However, comprehensive knowledge of the molecular features of MTC has been elusive owing to the rarity of this disease. By performing scRNA-seq analysis of samples obtained from both tumor and adjacent non-tumor tissues, the present work identified seven distinct parafollicular cell subtypes within FFPE samples from MTC patients, with significant heterogeneity among the subtypes. Using pseudotime information for trajectory inference, we constructed a dynamic network model of the transition states associated with MTC tumorigenesis based on the expression of different sets of genes. This provides a novel viewpoint on the possible evolutionary relationships among the heterogeneous cellular components of MTC without the need to purify cells into discrete states.

Understanding the evolutionary patterns of cancer cells can enhance our knowledge of tumor biology, prognostication, and the development of personalized therapeutics [16]. However, relevant findings remain scarce in the context of MTC [17]. In this research, we proposed that the malignant transformation of tumor cells might initiate with the transcriptome profile of the C2 subtype of parafollicular cells and subsequently proceed to the C3 subtype. The C3 subtype might be in a transitional state evolving into the four major malignant cell subpopulations C0, C1, C5, and C6, as suggested by the high expression of the RAF1 and MAP3K4 genes in this cell cluster. Both RAF1 and MAP3K4 are overexpressed in mammalian cells to activate the MAPK pathways during tumorigenesis [18]. Studies have indicated that knockout of the MAP3K4 gene in cholangiocarcinoma cells can enhance in vitro cellular proliferation and invasiveness as well as tumor progression, accompanied by a typical EMT process, whereas overexpression of MAP3K4 reversed EMT and inhibited tumor cell invasion [19]. In contrast, elevated expression of MAP3K4 in triple-negative breast cancer (TNBC) cells has been shown to promote tumor growth both in vivo and in vitro [20]. These contradictory findings highlight the potential of MAP3K4 as a possible target for cancer treatment. Immunohistochemistry in our study revealed that MAP3K4 (MEKK4) is highly expressed in tumor tissues, with lower expression in adjacent non-tumorous tissues. Furthermore, Gene Ontology (GO) analysis demonstrated that this subtype is enriched in biological processes such as EMT and angiogenesis.

In the C0 subtype of parafollicular cells, a significant increase in BCL2 and RICTOR gene transcription was observed. BCL2 is a well-known anti-apoptotic gene that is overexpressed in most thyroid cancers and is a poor prognostic factor in MTC [21]; it might function as a brake on programmed cell death during tumor initiation and progression [22]. The RICTOR gene, by participating in the mTORC2 signaling pathway, mediates the survival and proliferation of medullary and aggressive papillary thyroid carcinomas [23]. A previously published study indicated that RICTOR mutations in circulating tumor DNA are associated with the prognosis of patients with non-small cell lung cancer [24]. RICTOR is a critical component for the function and stability of the mTORC2 complex, and its overexpression can activate multiple downstream phosphorylated substrates, such as the protein kinase families PKA, PKG, and PKC, thereby regulating cell survival and proliferation [25]. It has also been reported that RICTOR is inappropriately overexpressed in various types of cancer, including lung, gastrointestinal, brain, head and neck, and breast cancers, and that downregulation of RICTOR can inhibit cellular proliferation and promote apoptosis, which suggests a promising future for RICTOR as a prognostic biomarker and therapeutic target [26].

We further confirmed the amplification of the RICTOR and AKT1 genes in MTC tumor tissue by fluorescence in situ hybridization. RICTOR is a key element of the mTORC2 complex, and its amplification has been linked to poor prognosis and resistance to receptor tyrosine kinase-targeted therapy. Consequently, targeting RICTOR could serve as an additional strategy to overcome treatment failure in cancer [27]. The prognostic significance of RICTOR and the use of specific RICTOR/mTORC2 inhibitors as targeted therapy in MTC therefore merit further validation.

However, due to the rarity of MTC, survival analysis was only possible using the thyroid carcinoma cohort from the TCGA dataset (TCGA-THCA). We found that high expression of the C0 subcluster-associated gene predicted poor survival for patients with THCA, suggesting that this subcluster is more aggressive and that its core gene signature could serve as a prognostic biomarker and therapeutic target.

This study has several limitations. First, only 3 cases of MTC were included, partly because of the high cost of scRNA-seq assays; larger, multi-center studies are necessary to address the small sample size. Second, the single-cell nature of scRNA-seq might complicate the accurate identification of tumor cells because of inherent technical challenges such as amplification bias, noise, dimensionality, and data interpretation. Lastly, because bulk transcriptome data for adult sporadic MTC are absent from public sources such as The Cancer Genome Atlas and Gene Expression Omnibus, a further comparison of transcriptome profiles between the bulk and single-cell levels remained unfeasible.

In summary, we have mapped the single-cell transcriptomic landscape of MTC by conducting scRNA-seq on FFPE samples from MTC patients, revealing the tumoral heterogeneity of MTC. By ordering individual cells for evolutionary trajectory evaluation, we proposed a possible developmental pathway among the heterogeneous cellular components in MTC tumorigenesis. The biological functions of annotated tumor cell subpopulations, such as the C0 and C3 subtypes of parafollicular cells, suggest the possibility of discovering novel therapeutic targets and biomarkers for MTC. However, our scRNA-seq findings still require validation by further in vivo and in vitro experiments.