1 Introduction

Recent global epidemiological studies have identified oral cancer as the 16th most prevalent malignancy worldwide, accounting for an estimated 350,000 new cases annually [1]. Over 90% of oral cancer cases are oral squamous cell carcinomas (OSCC), and a significant majority occur in developing nations, prominently in China. The risk of OSCC increases with age, and the disease predominantly affects individuals over 50 years old [2]. Studies of early-stage OSCC treatment outcomes have reported encouraging prognoses, with enhanced survival rates and improved quality of life for patients [3]. However, early-stage OSCC is often asymptomatic and can resemble benign conditions, which reduces patients' inclination to seek prompt medical attention. Early diagnosis and intervention are therefore pivotal and have the potential to substantially improve patient prognoses [4]. An in-depth exploration of the mechanisms underlying OSCC development has thus become a research priority in contemporary oncology.

Topoisomerase II (TOP2) is a crucial nuclear protein essential for DNA replication and cell growth. Its primary function is the decatenation of intertwined DNA during the later stages of the cell cycle, facilitating proper chromosome segregation before cell division [5]. Given this pivotal role, TOP2 has emerged as a prime target for various chemotherapeutic agents used extensively in clinical settings, such as adriamycin, podophyllotoxin glycosides, and mitoxantrone [6].

TOP2 exhibits two distinct gene isoforms, each possessing unique expression patterns and regulatory functions in tissue cells. Notably, TOP2A, one of these isoforms, is prominently expressed in rapidly proliferating cells and is intricately regulated by the cell cycle, peaking during the G2/M phase [7]. Elevated levels of TOP2A expression have been observed in various human malignant tumors, including adrenocortical, nasopharyngeal, gallbladder, ovarian, and esophageal cancers [8,9,10,11]. In these cancers, heightened TOP2A expression correlates with aberrant cell proliferation, formation of aneuploid chromosomes, aggressive tumor phenotypes, advanced disease stages, tumor recurrence, and diminished overall survival rates [12,13,14].

Notably, in breast cancer, TOP2A has been strongly linked to tumor proliferation and invasiveness [15]. Additionally, in prostate cancer, TOP2A plays a role in epigenetic regulation through enhancer of zeste homolog 2 (EZH2) [16]. Despite these findings, the epigenetic regulation of TOP2A has received little attention in the literature, especially in relation to OSCC. The association between TOP2A and OSCC, including the expression levels of TOP2A in OSCC and the molecular pathways underlying its involvement, remains poorly understood and warrants further investigation.

In this study, we embarked on an initial investigation into the expression profile and potential biological implications of TOP2A in the context of OSCC. Our approach integrated comprehensive big data analysis utilizing public databases with cutting-edge techniques, including tissue immunohistochemistry, gene microarray analysis, second-generation sequencing, CRISPR-Screen, and single-cell RNA sequencing (scRNA-seq).

2 Materials and methods

2.1 Immunohistochemistry

A total of 20 OSCC patients treated by oral and maxillofacial surgery at the People's Hospital of Guangxi Zhuang Autonomous Region were enrolled in the IHC experiments. The inclusion criteria for eligible samples were as follows: (1) tumors in OSCC patients were located in the lip, tongue, palate, floor of the mouth, gingiva, or buccal mucosa; (2) all OSCC cases were first onset, with patients not having received radiotherapy or chemotherapy prior to surgery; (3) the postoperative pathological diagnosis for all cases was “squamous cell carcinoma”; (4) patients had no history of systemic immune diseases, and no lesions other than OSCC were detected in their oral mucosa during clinical examination. This study was approved by the ethics committee of the People's Hospital of Guangxi Zhuang Autonomous Region. Surgical samples were fixed in a 10% formaldehyde solution. After embedding, sectioning, deparaffinization, hydration, antigen retrieval, and serum blocking, sections were incubated with the primary antibody (Proteintech, 24641-1-AP) at 4 °C overnight, washed with PBS, and then incubated with HRP-polymer-conjugated secondary antibody at 37 °C for 20 min. We stained sections with 3,3′-diaminobenzidine solution for 1 min and counterstained the nuclei with haematoxylin. The judgment criteria combined staining intensity and the percentage of positive cells. Staining intensity was scored as 'not detected', 'low intensity', 'moderate intensity', or 'high intensity', corresponding to scores of 0–3. Staining density, reflecting the percentage of positive cells, was similarly scored as 'no density', 'low density', 'moderate density', or 'high density', with corresponding scores of 0–3. The product of the staining intensity and density scores was used as the overall evaluation of TOP2A protein expression levels.
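The composite scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (intensity score multiplied by density score, giving a value between 0 and 9); the function name `ihc_score` and the shortened category labels are assumptions made for this example and are not taken from the study.

```python
# Minimal sketch of the composite IHC score: intensity (0-3) x density (0-3).
# The short labels and the function name are illustrative assumptions.
INTENSITY = {"not detected": 0, "low": 1, "moderate": 2, "high": 3}
DENSITY = {"none": 0, "low": 1, "moderate": 2, "high": 3}


def ihc_score(intensity: str, density: str) -> int:
    """Return the product of the intensity and density scores (range 0-9)."""
    return INTENSITY[intensity] * DENSITY[density]


# A section with moderate intensity and high density scores 2 * 3 = 6.
print(ihc_score("moderate", "high"))  # 6
```

Because the two components are multiplied, a section with no detectable staining in either dimension receives an overall score of 0 regardless of the other component.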

2.2 Collection of OSCC GeneChip and high-throughput sequencing data

In this study, we formulated a search strategy utilizing the MeSH subject term: "OSCC OR Oral squamous cell carcinoma." Subsequently, we systematically screened global OSCC mRNA expression profiles from various databases, including Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/), ArrayExpress (https://ngdc.cncb.ac.cn/databasecommons/database/), The Cancer Genome Atlas (TCGA) (https://www.cancer.gov/ccg/research/genome-sequencing/tcga), Sequence Read Archive (SRA) (https://www.ncbi.nlm.nih.gov/sra), and ONCOMINE (http://www.oncomine.org/). The inclusion criteria for studies were as follows: (1) participants diagnosed with OSCC through surgical specimens; (2) tissue specimens tested; (3) a specified number of OSCC cases. The selected datasets were then consolidated by platform accession number (GPL), and batch effects across datasets sharing a platform were mitigated using the R package 'sva', yielding one unified expression matrix per platform.

2.3 Identification of differentially expressed genes in OSCC tissues

The standardized mean difference (SMD) of TOP2A gene expression values was calculated using the integrated OSCC platform matrices to characterize its expression pattern in OSCC tissues. To probe the potential clinical significance of elevated TOP2A mRNA expression in OSCC, this study first assessed its discriminatory capacity within each dataset by calculating sensitivity and specificity. Subsequently, the overall discriminatory ability of TOP2A across OSCC tissues was appraised through the construction of summarized receiver operating characteristic (sROC) curves and likelihood ratio Fagan plots.
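To make the SMD calculation concrete, the following Python sketch computes a per-dataset standardized mean difference between tumor and normal expression values. The text does not state which SMD estimator was used, so the choice of Hedges' g (Cohen's d with a small-sample correction) is an assumption, as are the function name `smd_hedges_g` and the toy values.

```python
import math
from statistics import mean, variance


def smd_hedges_g(case, control):
    """Return (g, var_g): Hedges' g and its approximate sampling variance.

    Assumption: Hedges' g is used as the SMD estimator; the source does
    not specify the estimator.
    """
    n1, n2 = len(case), len(control)
    # Pooled standard deviation from the two sample variances.
    s_pooled = math.sqrt(
        ((n1 - 1) * variance(case) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    )
    d = (mean(case) - mean(control)) / s_pooled
    # Small-sample correction factor (common approximation).
    j = 1 - 3 / (4 * (n1 + n2) - 9)
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j ** 2 * var_d


# Toy expression values (hypothetical, not from any dataset in the study).
g, var_g = smd_hedges_g([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
print(round(g, 3), round(var_g, 3))
```

A positive value indicates higher expression in the first group; the variance is what a meta-analysis would use to weight each dataset.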

2.4 Identification of genes essential for OSCC cell growth

The Achilles Project [17], a genome-wide CRISPR-Cas9 screening effort, identifies genes crucial for the survival of various cancer cell types through gene knockout. Utilizing this project, we identified key genes contributing to OSCC development. Genome-wide CRISPR data were obtained from the DepMap website (https://depmap.org/portal/download/) to screen for genes essential to OSCC cell growth. The impact of 17,000 candidate genes on OSCC cell growth was evaluated using the CERES algorithm. Candidate genes with CERES scores below −1 were designated as key genes significantly influencing OSCC cell growth. Subsequently, the identified key gene set underwent functional enrichment analysis using the Metascape online database (https://metascape.org/gp/index.html).
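The screening step can be illustrated with a short Python sketch. It assumes the usual CERES convention, in which a score near 0 indicates a non-essential gene and more negative scores indicate stronger dependency, with −1 corresponding to the median of common essential genes. The threshold is therefore exposed as a parameter. How scores were aggregated across cell lines is not stated in the text, so averaging is an assumption, and the gene names and values below are hypothetical.

```python
from statistics import mean


def essential_genes(ceres_scores, threshold=-1.0):
    """Return genes whose mean CERES score across cell lines is below threshold.

    ceres_scores: dict mapping gene name -> list of per-cell-line scores.
    Assumption: scores are averaged across cell lines; the source does not
    state the aggregation rule.
    """
    return sorted(
        gene
        for gene, scores in ceres_scores.items()
        if mean(scores) < threshold
    )


# Hypothetical scores for three placeholder genes in three cell lines.
toy_scores = {
    "GENE_A": [-1.4, -1.2, -1.6],   # strongly depleted in every line
    "GENE_B": [-0.1, 0.05, -0.2],   # no growth effect
    "GENE_C": [-1.1, -0.7, -0.9],   # mean just above the cutoff
}
print(essential_genes(toy_scores))  # ['GENE_A']
```

A stricter rule, such as requiring the score to fall below the threshold in every cell line, could be substituted by replacing `mean(scores)` with `max(scores)`.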

2.5 Analysis of TOP2A expression at the OSCC cell level using single-cell RNA sequencing (scRNA-seq) technology

The GSE163872 dataset contains scRNA-seq data from 13,903 cells from OSCC tissues. In this study, the expression matrix was subjected to principal component analysis (PCA) using the "runPCA" function, and uniform manifold approximation and projection (UMAP) was used for further dimensionality reduction and visualization of the cell clusters. Differentially expressed genes in cell subclusters were analyzed and screened using the Wilcoxon rank-sum test. The differentially up-regulated genes of the cell subpopulations obtained from this dataset were compared with the marker genes collected in the CellMarker (http://bio-bigdata.hrbmu.edu.cn/CellMarker/) database. Single-cell developmental trajectory maps were constructed using the R package Monocle 2, and the results of the analysis were visualized as two-dimensional images.
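For readers unfamiliar with the rank-based test used to screen cluster marker genes, the sketch below implements a two-sided Wilcoxon rank-sum test in plain Python using the normal approximation with tie correction and no continuity correction. It is a simplified stand-in for what single-cell toolkits do internally, not the implementation used in this study; the function name and toy values are assumptions.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (u, z, p), where u is the Mann-Whitney U statistic for x.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        for k in range(i, j):
            ranks[k] = (i + 1 + j) / 2
        t = j - i
        tie_term += t ** 3 - t
        i = j
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = r1 - n1 * (n1 + 1) / 2
    n = n1 + n2
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:  # all values identical: no evidence of a difference
        return u, 0.0, 1.0
    z = (u - n1 * n2 / 2) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p


# Hypothetical expression of one gene inside vs. outside a cluster.
u, z, p = rank_sum_test([5.0, 6.0, 7.0, 8.0], [1.0, 2.0, 3.0, 4.0])
print(u, round(z, 3), round(p, 4))
```

For small samples an exact test is preferable to the normal approximation; with thousands of cells per cluster, as in scRNA-seq data, the approximation is generally adequate.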

2.6 Statistical analysis

In this study, the Wilcoxon test was employed to compare disparities in TOP2A mRNA expression levels between OSCC and normal oral tissues. Given the substantial heterogeneity observed across the included datasets (I² > 50%), a random-effects model was applied to assess the standardized mean difference (SMD) of TOP2A. Additionally, Egger's test was utilized to evaluate the potential presence of publication bias in the SMD results. Statistical significance in the analysis was defined as a P value less than 0.05.
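The pooling step can be sketched as follows. The text specifies a random-effects model but not the between-study variance estimator, so the DerSimonian-Laird estimator used here is an assumption, as are the function name and the toy inputs. The sketch also reports the I² statistic that motivates the choice of a random-effects model.

```python
import math


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    Returns (pooled, (ci_low, ci_high), i2_percent, tau2).
    """
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures dispersion of study effects around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add the between-study variance to each study.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2, tau2


# Two hypothetical studies with SMDs of 0 and 2 and unit variances.
pooled, ci, i2, tau2 = random_effects_pool([0.0, 2.0], [1.0, 1.0])
print(pooled, ci, i2, tau2)
```

When the studies agree closely, the estimated between-study variance shrinks to zero and the result coincides with the fixed-effect estimate.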

3 Results

3.1 There is a trend of up-regulation of TOP2A protein expression in oral cancer tissues

Compared with normal oral tissue, OSCC tissues exhibited an up-regulation trend in the expression of TOP2A protein, as depicted in Fig. 1. Notably, positive staining signals for TOP2A protein were predominantly observed within the nuclei of OSCC cells. This observation underscores the potential of TOP2A protein to serve as a distinguishing marker for OSCC tissues compared to normal oral tissues.

Fig. 1

Expression level of TOP2A protein in normal oral and OSCC tissues. a, b Immunohistochemistry of squamous cell carcinoma tissue. c, d Immunohistochemistry of adjacent oral tissue

3.2 Inclusion of high-throughput data

To confirm the expression pattern of TOP2A mRNA in OSCC tissues, we collected high-throughput data comprising 1668 samples (1240 OSCC and 428 normal tissues) from 19 data platforms (Table 1). The study centers spanned nine countries (China, the United States, Japan, Canada, Italy, India, Australia, New Zealand, and Sweden), so the evidence base of this study draws on data from around the globe.

Table 1 Details of high-throughput datasets included in public databases

3.3 TOP2A mRNA expression is significantly upregulated in OSCC tissues

Among the 19 high-throughput platform matrices for OSCC, TOP2A mRNA expression was lower in OSCC tissues than in normal human oral tissues in only 2 microarrays, while the remaining datasets consistently indicated upregulation in OSCC tissues (Fig. 2a). Integrated analysis revealed a standardized mean difference (SMD) of 1.51 (95% CI 0.94–2.07, Fig. 2b), indicating a significant upregulation of TOP2A expression in OSCC tissues compared to normal tissues. Egger's test indicated that the included datasets did not exhibit statistically significant publication bias (Fig. 2c). According to the diagnostic test results (Fig. 3a–c), the sensitivity of using high TOP2A expression as a reference indicator to identify OSCC tissues was 0.89 (95% CI 0.71–0.96), the specificity was 0.92 (95% CI 0.78–0.97), the positive likelihood ratio was 11, the negative likelihood ratio was 0.11, and the summarized receiver operating characteristic area under the curve (sROC AUC) was 0.96 (95% CI 0.94–0.98). These findings underscore the robust discriminatory efficacy of TOP2A for identifying OSCC tissues.
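A Fagan plot converts a pre-test probability into a post-test probability through a likelihood ratio. The Python sketch below shows that conversion using the positive and negative likelihood ratios reported above (11 and 0.11). The 50% pre-test probability is a hypothetical value chosen only to illustrate the arithmetic and is not taken from this study.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability.

    Uses the odds form of Bayes' theorem:
    post-test odds = pre-test odds x likelihood ratio.
    """
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


PRE_TEST = 0.5  # hypothetical pre-test probability, for illustration only

# Reported likelihood ratios: LR+ = 11, LR- = 0.11.
print(round(post_test_probability(PRE_TEST, 11), 3))    # 0.917
print(round(post_test_probability(PRE_TEST, 0.11), 3))  # 0.099
```

Under this assumed pre-test probability, a high-TOP2A result would raise the probability of OSCC to roughly 92%, while a low-TOP2A result would lower it to roughly 10%.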

Fig. 2

Expression levels of TOP2A in OSCC tissues and normal tissues. a Comparison of TOP2A mRNA expression levels in the two groups of samples; b comprehensive assessment of TOP2A mRNA expression levels in OSCC tissues; c publication bias test of the included datasets

Fig. 3

Potential clinical significance of high expression of TOP2A in OSCC tissues. a Forest plot of sensitivity and specificity; b Summary receiver operating characteristic curve plot; c Fagan plot showing the positive and negative likelihood ratios of high TOP2A mRNA expression in OSCC tissues

3.4 TOP2A is an essential gene for OSCC cell growth

To identify pivotal candidate genes influencing OSCC cell growth, we examined genome-wide CRISPR screen data from DepMap. The analysis across 19 OSCC cell lines revealed that TOP2A is indispensable for sustaining OSCC cell survival. Additionally, our study examined the potential pathways associated with 676 genes that act together with TOP2A to foster OSCC growth. These genes were predominantly enriched in pathways such as ribonucleoprotein complex biogenesis, ribosomal large subunit biogenesis, mitotic cell cycle, mRNA metabolic process, cytoplasmic translation, and other related processes (Fig. 4). This observation implies that this gene set is closely linked to the fundamental development of the cells.

Fig. 4

Biological pathways involved in genes essential for oral squamous cell carcinoma cell growth

3.5 TOP2A expression in OSCC tissues at the single cell level

In the analysis of scRNA-seq data for OSCC, unsupervised clustering was used to categorize cells into distinct subpopulations (Fig. 5a–c). Subsequently, genes with significantly up-regulated expression in these cellular subpopulations were identified through differential analysis and cross-referenced with the CellMarker database. Eight cell types were ultimately identified in OSCC tissues: fibroblasts (FI), oral squamous carcinoma cells (OSCC cells), basal cells (BC), epithelial cells (EC), endothelial cells (EOC), smooth muscle cells (SMC), natural killer cells (NKC), and T cells. Notably, TOP2A expression was concentrated in four cell types: OSCC cells, EC, SMC, and NKC (Fig. 5d, e). Importantly, the expression of TOP2A in OSCC cells was significantly higher than in the remaining three cell types (Fig. 5f), consistent with the aforementioned analyses of immunohistochemistry, gene chip, and RNA-seq data. Motif signature genes defining the developmental process of OSCC cells were selected based on high dispersion (mean_expression ≥ 3.5 & dispersion_empirical ≥ 1 * dispersion_fit). The reversed graph embedding (DDRTree) algorithm was then employed to reduce the dimensionality of the data, and OSCC cell trajectories were constructed based on the clusters obtained from UMAP (Fig. 6b) together with pseudotime data (Fig. 6c), aligning with the expression trend of the motif signature genes (Fig. 6a). The differentialGeneTest function identified genes whose expression varied along pseudotime (Fig. 6d). During OSCC cell differentiation, TOP2A expression first increased and then decreased (Fig. 6e) and showed a differential expression distribution across cell states (Fig. 6f). These findings suggest that TOP2A may influence the tumor heterogeneity that arises during OSCC cell differentiation.
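The gene selection criterion quoted above (mean_expression ≥ 3.5 and dispersion_empirical ≥ 1 × dispersion_fit) amounts to a simple filter over a per-gene dispersion table. The Python sketch below mirrors that filter on a list of dictionaries; the column names follow the criterion as written, while the function name, the `gene_id` key, and the toy rows are assumptions for illustration.

```python
def select_ordering_genes(dispersion_table, min_mean=3.5, fit_multiple=1.0):
    """Return IDs of genes passing the mean-expression and dispersion cutoffs.

    dispersion_table: iterable of dicts with keys 'gene_id',
    'mean_expression', 'dispersion_empirical', and 'dispersion_fit'.
    """
    return [
        row["gene_id"]
        for row in dispersion_table
        if row["mean_expression"] >= min_mean
        and row["dispersion_empirical"] >= fit_multiple * row["dispersion_fit"]
    ]


# Hypothetical rows; values are invented to exercise each branch of the filter.
toy_table = [
    {"gene_id": "G1", "mean_expression": 4.2, "dispersion_empirical": 1.8, "dispersion_fit": 1.1},
    {"gene_id": "G2", "mean_expression": 2.0, "dispersion_empirical": 3.0, "dispersion_fit": 1.0},
    {"gene_id": "G3", "mean_expression": 5.0, "dispersion_empirical": 0.4, "dispersion_fit": 0.9},
]
print(select_ordering_genes(toy_table))  # ['G1']
```

Only genes that are both well expressed and more variable than the fitted mean-dispersion trend are retained for ordering cells along the trajectory.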

Fig. 5

Expression of TOP2A at the level of oral squamous cell carcinoma cells. a Eigengenes of PC1 and PC2; b t-distributed stochastic neighbor embedding (t-SNE) distribution of 16 cell clusters by unsupervised clustering; c Eigengenes of 16 cell subsets; d TOP2A expression distribution in 8 types of cells

Fig. 6

Analysis of the developmental trajectories of oral squamous cell carcinoma cells. a The developmental process of oral squamous cell carcinoma cells defined according to the expression patterns of motif signature genes; b the distribution of different oral squamous cell carcinoma cell subsets along the developmental trajectory; c the pseudotime score of oral squamous cell carcinoma cell differentiation calculated from the developmental trajectory; d pseudo-temporal differential gene heatmap (number of clusters = number of cell states = 5); e TOP2A expression distribution along the trajectory of oral squamous cell carcinoma cells; f dynamic changes of TOP2A expression levels in different cell states during OSCC cell differentiation

4 Discussion

The clinical significance of TOP2A expression in OSCC tissues remains largely unexplored. This study, anchored by a comprehensive chain of evidence, including IHC, differential expression analysis, CRISPR screen data, and scRNA-seq data from public databases, showed that TOP2A was significantly up-regulated in OSCC tissues and cells. Moreover, it emerged as a marker that distinguishes well between OSCC and non-tumor tissues. We first identified TOP2A upregulation in OSCC tissues by IHC of clinical samples. High-throughput datasets spanning 19 data platforms and 1668 samples then confirmed the consistent trend of elevated TOP2A mRNA expression, reinforcing the notion that abnormal up-regulation of TOP2A in OSCC tissues is a robust pattern. Based on these findings, we propose that TOP2A could serve as a diagnostic marker for OSCC.

Our findings are consistent with previous studies showing the upregulation of TOP2A expression in various tumor tissues, including pancreatic, breast, prostate, colon, gastroesophageal, and esophageal cancers, as well as hepatoblastoma and malignant peripheral nerve sheath tumors [18,19,20]. In terms of biological pathways, TOP2A has been implicated in activating the β-catenin pathway in pancreatic cancer [21], and up-regulated TOP2A promotes ovarian cancer cell proliferation through the AKT/mTOR pathway [22], while its downregulation has been linked to inhibition of ERK and AKT activity in colon cancer [23]. In our study, we used CRISPR screen data to identify TOP2A and other key growth genes in OSCC cells. These genes are primarily enriched in ribonucleoprotein complex biogenesis, which is crucial for cancer progression because the growth of cancer cells relies on the high level of protein synthesis provided by ribosomes [24]. Other enriched pathways, such as ribosomal large subunit biogenesis and cytoplasmic translation, modulate protein synthesis [25] and enhance the translation of mRNAs to accelerate tumor cell proliferation [26]. The functional role of TOP2A and its related pathways in OSCC is not completely clear, and our study lays a foundation for further investigations in this regard.

By constructing a clinical diagnostic model, we found that the inclusion of TOP2A in the diagnosis of OSCC holds promise for enhancing the likelihood of patients receiving timely and effective interventions. Notably, researchers have identified TOP2A as a prognostic factor and driver gene associated with survival in breast and prostate cancer [15, 16].

Additionally, lymphatic metastasis poses a substantial threat to OSCC patients [27], so there is an urgent need for reliable molecular markers indicative of metastatic risk. The scRNA-seq analysis uncovered high TOP2A expression in OSCC cells, but not in other microenvironmental cells. Meanwhile, we found that TOP2A expression first increased and then decreased during the differentiation of OSCC cells, with different expression levels in different cell states. This finding suggests a potential role for TOP2A in promoting OSCC metastasis and opens avenues for further investigation into the metastatic mechanisms associated with TOP2A in OSCC.

In the intricate landscape of cancer, cancer stem cells (CSCs) play a pivotal role in mediating chemoresistance and cancer recurrence [28]. These cells, representing a small fraction of cancer cells in tumors, form at the early differentiation stage, possess the ability of multidirectional differentiation, contribute to tumor heterogeneity, and are closely linked to the malignant phenotype and poor prognosis of tumors [29, 30]. Our single-cell trajectory analysis revealed significant overexpression of TOP2A in the early differentiated subpopulation of OSCC, indicating a close association between TOP2A and the stem cell properties of OSCC. This suggests that TOP2A could serve as a potential target for the development of OSCC-specific targeted therapeutic agents.

Despite the promising findings presented in this study, there are notable limitations that warrant acknowledgment. It is noteworthy that the number of cases included in the current analysis of TOP2A protein expression is relatively small, necessitating further expansion of the sample size for validation in future studies. While our study underscores the biological significance of elevated TOP2A expression in OSCC, further validation through serial ex vivo experiments is imperative to elucidate the specific biological role of TOP2A in OSCC and explore its potential upstream regulatory mechanisms. The clinical utility of non-invasive oncogene marker screening holds promise for enhancing early diagnosis and therapeutic monitoring of neoplastic diseases. Although our study employed various methods to assess TOP2A expression in OSCC, we have not delved into the status of TOP2A in the peripheral blood of OSCC patients. Consequently, whether the expression pattern of TOP2A in peripheral blood mirrors that in OSCC tissues, and the clinical significance of such expression, remains an area requiring clarification. Therefore, we need to further explore the value of TOP2A molecular markers in the clinical application of OSCC in subsequent studies.

5 Conclusion

Collectively, by combining protein-level experiments on clinical samples with analysis of high-throughput datasets from public databases, we found that TOP2A was significantly upregulated in OSCC tissues, and we assessed its ability to identify OSCC tissues using diagnostic analysis. Using scRNA-seq, we determined that TOP2A is highly expressed specifically in OSCC cells and exhibits differential expression levels in various cellular states. Our analysis indicates that TOP2A up-regulation plays a crucial role in OSCC progression, potentially providing new insights for clinical treatment and a theoretical basis for future studies.