Abstract
Background
Bladder cancer is a prevalent malignant tumor with high heterogeneity. Current treatments, such as transurethral resection of bladder tumor (TURBT) and intravesical Bacillus Calmette-Guérin (BCG) therapy, still have limitations, with approximately 30% of non-muscle-invasive bladder cancer (NMIBC) progressing to muscle-invasive bladder cancer (MIBC), and a substantial number of MIBC patients experiencing recurrence after surgery. Immunotherapy has shown potential benefits, but accurate prediction of its prognostic effects remains challenging.
Methods
We analyzed bladder cancer RNA-seq data and clinical information from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases, and used various machine learning algorithms to screen for feature RNAs related to tumor-infiltrating immune cells (TIICs) from single-cell data. Based on these RNAs, we established a TIIC signature score and evaluated its relationship with overall survival (OS) and immunotherapy response in bladder cancer patients.
Results
The study identified 171 TIIC-RNAs and selected 11 TIIC-RNAs with prognostic value through survival analysis. The TIIC signature score, established using a machine learning ensemble method, was significantly associated with OS and showed good predictive performance across different datasets. Additionally, the signature score was negatively correlated with immunotherapy response, with patients with low TIIC signature scores showing better survival outcomes after immunotherapy. Further biological functional analysis revealed a close association between the TIIC signature score and immune regulation processes, cellular metabolism, and genetic variations.
Conclusion
This study successfully constructed and validated an RNA signature scoring system based on tumor-infiltrating immune cell (TIIC) features, which can effectively predict OS and the effectiveness of immunotherapy in bladder cancer patients.
1 Introduction
Urological malignancies are commonly regarded as a significant threat to the health of middle-aged and elderly men. Among them, bladder cancer (BLCA), the tenth most common malignant tumor worldwide, has shown rising incidence and mortality over the years [1]. Clinically, BLCA can be categorized into non-muscle-invasive BLCA and muscle-invasive BLCA [2]. Approximately 75% of BLCA cases are of the non-muscle-invasive type, for which transurethral resection of bladder tumor (TURBT), intravesical Bacillus Calmette-Guérin (BCG) therapy, and chemotherapy have demonstrated favorable therapeutic effects [3]. However, the 5-year recurrence rate of non-muscle-invasive BLCA remains high at 50–70%, leading to challenges such as frequent recurrence and high metastasis risk with limited subsequent treatment options [4]. With the deepening understanding of BLCA, new approaches such as immunotherapy, targeted therapy, and antibody–drug conjugates have shown promising efficacy. In particular, the advent of immunotherapy has revolutionized the treatment landscape of BLCA, with immune checkpoint inhibitors (ICIs) such as PD-1/PD-L1 and CTLA-4 monoclonal antibodies ushering in a new era of BLCA immunotherapy [5]. As positive data from immunotherapy clinical trials have accumulated, immunotherapy has been shown to significantly reduce the risk of disease progression or death in patients with advanced or metastatic urothelial carcinoma, resulting in a marked improvement in overall survival (OS) [6]. However, primary and acquired resistance remain a challenge, and some patients with advanced BLCA fail to derive significant benefit from immunotherapy.
The tumor microenvironment (TME) is a complex system composed of tumor cells, tumor-infiltrating immune cells (TIICs), and extracellular matrix components, playing a crucial role in tumor development. It not only provides nutrients and growth conditions for tumor cells but also regulates immune escape and drug resistance of tumor cells [7,8,9]. TIICs are a key component of the TME, possessing tumor recognition and attack capabilities, and playing important roles in immunotherapy. TIICs are heterogeneous, exhibiting functional and phenotypic plasticity, and may exert both pro-tumor and anti-tumor effects. TIICs enhance the efficacy of immunotherapy and reduce tumor recurrence and metastasis through mechanisms such as direct attack on tumor cells, formation of immune memory, regulation of immune cell subpopulations, and blockade of immune inhibitory factors [10]. However, the presence of regulatory T cells (Tregs) and tumor-associated macrophages may release inhibitory cytokines such as TGF-β and IL-10, attenuating the activity of tumor-infiltrating T cells [11]. Additionally, tumor cells may evade immune surveillance by reducing tumor antigen expression or altering antigen structure, as well as by the accumulation of myeloid-derived suppressor cells (MDSCs), thereby promoting the occurrence of immunotherapy resistance [12].
With the rapid development of bioinformatics technologies, genome sequencing and high-throughput sequencing technologies have not only reduced the cost of TME detection but also enabled in-depth understanding of the TME [13]. Meanwhile, machine learning (ML) algorithms and artificial intelligence have become important tools for analyzing complex biomedical data, discovering therapeutic targets, and predicting clinical outcomes [14]. Based on the development of these technologies, multiple evaluation models for diagnosis, prognosis, or therapeutic sensitivity have been constructed and applied in clinical practice, including MSINet, a transfer learning model based on MobileNetV2 architecture, which accurately classifies the microsatellite instability status in colorectal cancer tissue slice images with an AUC of 0.93 [15]. DeepPATH, based on the Inception-v3 architecture, is a deep learning model used for three-class classification of lung tissue WSI (normal, lung adenocarcinoma, and lung squamous cell carcinoma), with a high AUC of 0.97 [16].
Therefore, in this study, we developed a signature related to TIICs based on analysis of single-cell RNA sequencing (scRNAseq) data and bulk RNA sequencing data using 20 ML algorithms. We systematically explored the performance of the TIIC signature in predicting prognosis and immunotherapy response in BLCA patients.
2 Methods
2.1 Collection and processing of the BLCA transcriptome data
In constructing the model, we opted for BLCA RNA expression profiles and corresponding clinical data from the TCGA database (https://portal.gdc.cancer.gov/, n = 406). To validate the model's stability and accuracy, we utilized datasets from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) with sample sizes exceeding 50, including GSE160693 [17] (n = 52), Xiangya_Cohort (GSE188715, n = 56) [18, 19], and GSE176307 (n = 85) [20] as the validation set for RNA-seq. All data were transformed into TPM format, except for GSE160693, which was converted to log2 format for subsequent analysis. Additionally, we employed BLCA microarray data from GEO datasets with sample sizes > 50, including GSE13507 (n = 165) [21, 22], GSE31684 (n = 93) [23], GSE32894 (n = 224) [24], and GSE48075 (n = 73) [25], as the validation set. We conducted data normalization on the microarray data using the normalizeBetweenArrays function from the "limma" package [26]. Furthermore, immunotherapy data were obtained from IMvigor210 (along with data predicted by TIDE online), sourced from the "IMvigor210CoreBiologies" [27] R package.
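The TPM conversion and log2 transformation mentioned above can be sketched as follows. This is a minimal illustration in Python rather than the R workflow used in the study; the toy counts, the gene lengths, and the pseudocount of 1 are assumptions made for the example and are not taken from the paper.

```python
import math


def counts_to_tpm(counts, lengths_kb):
    """Convert raw read counts for one sample to transcripts per million (TPM)."""
    # Reads per kilobase: normalise each gene's count by its length.
    rpk = [c / l for c, l in zip(counts, lengths_kb)]
    scale = sum(rpk)
    if scale == 0:
        raise ValueError("sample has no mapped reads")
    # Rescale so that the values of one sample sum to one million.
    return [r / scale * 1e6 for r in rpk]


def log2_transform(values, pseudocount=1.0):
    """log2(x + pseudocount); the pseudocount is an assumed convention."""
    return [math.log2(v + pseudocount) for v in values]


# Toy data (hypothetical): three genes in one sample.
counts = [100, 300, 600]
lengths_kb = [1.0, 3.0, 2.0]
tpm = counts_to_tpm(counts, lengths_kb)
log_tpm = log2_transform(tpm)
```

Because TPM values sum to the same total in every sample, they are comparable across samples in a way that raw counts are not.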
2.2 Collection and processing of the BLCA scRNA-seq data
The single-cell dataset was obtained from GEO accession GSE169379 and comprises 25 BLCA tumor samples. Data analysis was performed using R software (version 4.1.3) with the "Seurat" [28] package. During data quality control, cells with mitochondrial content exceeding 10% were excluded, and restrictions were applied to UMI counts (200–20,000) and gene counts (200–8000) per cell. Data processing included normalization, selection of highly variable genes (2000 genes), and removal of cell cycle effects (using the parameter vars.to.regress = c("S.Score", "G2M.Score")), corresponding to the functions NormalizeData, FindVariableFeatures, and ScaleData in the "Seurat" package, respectively. To address batch effects, the harmony method was employed. Subsequent analysis involved dimensionality reduction with UMAP and t-SNE and clustering with the Louvain algorithm, all implemented within the "Seurat" package. Differential gene expression between clusters or cell types was calculated using the FindAllMarkers function, with thresholds of p-value less than 0.05, log2 fold change greater than 0.25, and expression fraction greater than 0.1.
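The quality-control thresholds described above can be expressed as a simple per-cell filter. The sketch below is a Python illustration, not the Seurat code used in the study; the field names (pct_mito, n_umi, n_genes) and the toy cells are hypothetical, and treating the range limits as inclusive is an assumption.

```python
def passes_qc(cell, max_mito_pct=10.0, umi_range=(200, 20000), gene_range=(200, 8000)):
    """Return True if a cell satisfies the QC thresholds stated in the text."""
    return (
        cell["pct_mito"] <= max_mito_pct                      # drop cells above 10% mitochondrial content
        and umi_range[0] <= cell["n_umi"] <= umi_range[1]     # UMI counts within 200-20,000
        and gene_range[0] <= cell["n_genes"] <= gene_range[1] # detected genes within 200-8000
    )


# Toy cells (hypothetical values).
cells = [
    {"id": "c1", "pct_mito": 3.2, "n_umi": 5000, "n_genes": 1800},
    {"id": "c2", "pct_mito": 14.0, "n_umi": 5000, "n_genes": 1800},   # fails: mitochondrial content
    {"id": "c3", "pct_mito": 2.0, "n_umi": 150, "n_genes": 120},      # fails: too few UMIs and genes
    {"id": "c4", "pct_mito": 5.0, "n_umi": 25000, "n_genes": 7000},   # fails: too many UMIs
]
kept = [c["id"] for c in cells if passes_qc(c)]
```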
2.3 Obtaining TIIC-related genes
Through comprehensive analysis at the scRNAseq level of immune cells, tumor cells, and bulk level of BLCA cancer tissues, we have developed a computational framework based on multiple algorithms for identifying TIIC scores. The specific steps are as follows:
Determining Potential Immune-Related RNAs: Identifying the top 15% of RNAs based on expression values as potential immune-related RNAs.
Using Tissue Specificity Index (TSI) [29] to Determine Potential Immune-Related RNAs: TSI is calculated using the formula:
$$\mathrm{TSI} = \frac{\sum_{i=1}^{N} \left(1 - x_{RNA,i}\right)}{N - 1}$$
where N represents the number of immune cell types and x_{RNA,i} represents the expression intensity of the RNA in immune cell type i, normalized by the maximal expression value of that RNA across cell types. TSI ranges from 0 to 1, where TSI = 0 defines immune cell-general RNA and TSI = 1 defines immune cell-specific RNA. RNAs with high expression across all immune cell types are classified as immune-related universal RNAs (iuRNA).
Defining TIIC-RNA: iuRNAs significantly upregulated in immune cell types and downregulated in tumor cells are defined as TIIC-RNA.
Utilizing ML for Classification: Employing ML algorithms including Boruta, eXtreme gradient boosting (Xgboost), least absolute shrinkage and selection operator regularized logistic regression (LassoLR), support vector machine (SVM), random forest (RF), and prediction analysis for microarrays (Pamr) to screen for the most valuable TIIC-RNAs.
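The TSI step can be illustrated with a short sketch. It assumes that the TSI takes the standard tau form, i.e. the sum of (1 - x_i) over the N cell types divided by (N - 1), with x_i the expression normalized by the maximum across cell types; this matches the definitions given above but should be checked against [29]. The 0.25 cutoff is the one reported in the Results; the function names and toy values are hypothetical.

```python
def tsi(expr_by_cell_type):
    """Tau-type specificity index for one RNA across N immune cell types."""
    n = len(expr_by_cell_type)
    if n < 2:
        raise ValueError("need at least two cell types")
    peak = max(expr_by_cell_type)
    if peak <= 0:
        raise ValueError("RNA is not expressed in any cell type")
    # Normalise by the maximal expression so every value lies in [0, 1].
    normalised = [x / peak for x in expr_by_cell_type]
    return sum(1.0 - x for x in normalised) / (n - 1)


def is_universal(expr_by_cell_type, cutoff=0.25):
    """Flag an RNA as broadly expressed (iuRNA candidate) when TSI < cutoff."""
    return tsi(expr_by_cell_type) < cutoff


# Toy expression values (hypothetical) across four immune cell types.
uniform = [5.0, 5.0, 5.0, 5.0]    # same level everywhere
specific = [8.0, 0.0, 0.0, 0.0]   # expressed in a single cell type
mixed = [4.0, 2.0, 2.0, 2.0]
```

An RNA expressed equally in every cell type scores 0, and one expressed in a single cell type scores 1, which is why a low cutoff selects broadly expressed RNAs.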
2.4 Construction of TIIC-signature by machine learning-based ensemble methods
The generation of TIIC-signature using machine learning-based ensemble methods involves the following steps:
Screening Candidate Prognostic TIIC-RNAs through Univariate Cox Proportional Hazards Regression Analysis.
Further Evaluating the Importance of Prognostic TIIC-RNAs using Three ML Classification Algorithms, including Random Survival Forest (RSF), Lasso Regularized Cox Regression (LassoCox), and Likelihood-Based Boosting (CoxBoost) Cox Models.
Processing the Scores using 20 ML Algorithms, which include RSF, Conditional Random Forests (CForest), LassoCox, Elastic Net Regression (Enet), Ridge Regression, Gradient Boosting with Regression Trees (BlackBoost), Regression for Parametric Survival Model (SurvReg), Conditional Inference Trees (CTree), Cox Proportional Hazards Model (CoxPH), Oblique Random Survival Forest (ObliqueRSF), Stepwise Cox Model (StepwiseCox), Survival Support Vector Machine (SurvivalSVM), Generalized Boosted Regression Models (GBM), Ranger, Partial Least Squares Regression for Cox Models (PlsRcox), Gradient Boosting with Component-Wise Linear Models (GlmBoost), Supervised Principal Components (SuperPC), Akritas Conditional Non-Parametric Survival Estimator (Akritas), CoxBoost, and Recursive Partitioning and Regression Trees (Rpart), aiming to determine the most reliable model based on comprehensive Concordance index (C-index).
Development of TIIC Signature Score based on Prognostic TIIC-RNAs using superPC Algorithm. We then plotted receiver operating characteristic (ROC) curves to analyze the efficacy of the model; an area under the curve (AUC) greater than 0.6 was considered to indicate good efficacy.
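The model-selection criterion in the steps above rests on the concordance index. The sketch below shows Harrell's C-index for right-censored data and a selection rule that averages it across cohorts. It is an illustration in Python, not the study's R pipeline; averaging as the "comprehensive" C-index is an assumption, and the model names and numbers in the toy dictionary are invented placeholders, not results from the paper.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0      # higher risk failed earlier
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5      # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def best_model(cindex_by_model):
    """Pick the model with the highest mean C-index across cohorts."""
    return max(cindex_by_model, key=lambda m: sum(cindex_by_model[m]) / len(cindex_by_model[m]))


# Toy survival data (hypothetical): time, event indicator (1 = death), risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.5, 0.1]
c_perfect = concordance_index(times, events, risk)
c_reversed = concordance_index(times, events, risk[::-1])

# Placeholder per-cohort C-indexes for two hypothetical models.
toy_cindex = {"model_a": [0.70, 0.66, 0.64], "model_b": [0.72, 0.60, 0.58]}
chosen = best_model(toy_cindex)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so averaging across validation cohorts favors models that generalize rather than those that fit a single dataset.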
2.5 Cell annotation analysis
We first annotated cells using the following cell markers: Epithelial Cell Markers ("EPCAM", "KRT18", "KRT19", "CDH1"), Fibroblast Cell Markers ("DCN", "THY1", "COL1A1", "COL1A2"), Endothelial Cell Markers ("PECAM1", "CLDN5", "FLT1", "RAMP2"), T Cell Markers ("CD3D", "CD3E", "CD3G", "TRAC"), NK Cell Markers ("NKG7", "GNLY", "NCAM1", "KLRD1"), B Cell Markers ("CD79A", "IGHM", "IGHG3", "IGHA2"), Myeloid Cell Markers ("LYZ", "MARCO", "CD68", "FCGR3A"), Mast Cell Markers ("KIT", "MS4A2", "GATA2"). Based on these cell markers, we conducted a series of graphical analyses, including t-SNE plots and violin plots of cell markers.
We then segregated the clustering of immune cells and performed cell annotation analysis separately. For automated annotation, we employed the Sc-Type software.
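The marker-based annotation described in this section can be sketched as a simple scoring rule: each cluster is assigned the cell type whose markers have the highest mean expression. This is an illustrative simplification in Python, not the scoring used by Sc-Type or by the authors; the marker lists are taken from the text above, while the cluster expression values are hypothetical.

```python
# Marker genes as listed in the text.
MARKERS = {
    "Epithelial": ["EPCAM", "KRT18", "KRT19", "CDH1"],
    "Fibroblast": ["DCN", "THY1", "COL1A1", "COL1A2"],
    "Endothelial": ["PECAM1", "CLDN5", "FLT1", "RAMP2"],
    "T cell": ["CD3D", "CD3E", "CD3G", "TRAC"],
    "NK cell": ["NKG7", "GNLY", "NCAM1", "KLRD1"],
    "B cell": ["CD79A", "IGHM", "IGHG3", "IGHA2"],
    "Myeloid": ["LYZ", "MARCO", "CD68", "FCGR3A"],
    "Mast cell": ["KIT", "MS4A2", "GATA2"],
}


def annotate_cluster(mean_expr):
    """Assign the cell type whose markers have the highest mean expression.

    mean_expr maps gene symbol -> average expression in the cluster;
    genes missing from the dictionary are treated as not expressed.
    """
    scores = {
        cell_type: sum(mean_expr.get(g, 0.0) for g in genes) / len(genes)
        for cell_type, genes in MARKERS.items()
    }
    return max(scores, key=scores.get)


# Toy cluster averages (hypothetical values).
cluster_1 = {"CD3D": 5.0, "CD3E": 4.0, "TRAC": 3.0, "EPCAM": 0.1}
cluster_2 = {"EPCAM": 6.0, "KRT18": 5.5, "KRT19": 5.0, "CD68": 0.2}
label_1 = annotate_cluster(cluster_1)
label_2 = annotate_cluster(cluster_2)
```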
2.6 Functional annotation of TIIC signature
We utilized multiple algorithms to quantify immune infiltrates in tumors, including: 6 immune cell types via Tumor Immune Estimation Resource (TIMER) algorithm [30], 22 immune cell types via single-sample gene set enrichment analysis (ssGSEA) algorithm [31], 10 immune cell types via Microenvironment Cell Populations-counter (MCPcounter) algorithm [32], and the Estimation of Stromal and Immune cells in Malignant Tumors using Expression data (ESTIMATE) algorithm [33]. Additionally, we conducted Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis using Gene Set Variation Analysis (GSVA) [34] and Gene Set Enrichment Analysis (GSEA), as well as enrichment analysis using Metascape [35]. Furthermore, we used GSVA to quantify 114 metabolic pathways from previous literature as well as metabolic pathways from the KEGG database.
2.7 Prediction of immunotherapy response
For the prediction of immunotherapy response, we collected multiple datasets, including Nathanson (melanoma) [36], GSE35640 (melanoma) [37], GSE91061 (melanoma) [38], GSE78220 (melanoma) [39], IMvigor210 (urothelial carcinoma, UC) [27], Braun (renal cell carcinoma, RCC) [40], GSE179351 (colorectal adenocarcinoma and pancreatic adenocarcinoma, COAD and PAAD) [41], GSE165252 (esophageal adenocarcinoma, ESCA) [42], GSE103668 (triple-negative breast cancer, TNBC) [43], and GSE126044 (non-small cell lung cancer, NSCLC) [39], among others. In each dataset, we computed the TIIC signature score to predict the response to immunotherapy. Additionally, we utilized the Tumor Immune Dysfunction and Exclusion (TIDE) online analysis tool (http://tide.dfci.harvard.edu/) to predict the immune response and scores in TCGA dataset.
2.8 Gene differential analysis and GSEA analysis
To assess the functional characteristics of different TIIC groups, we conducted gene differential analysis using the "limma" package for two groups. The criteria for selecting differentially expressed genes were a p-value less than 0.05 and an absolute value of log2FC greater than 0.5. Subsequently, we performed GSEA analysis on the upregulated genes in both TIIC groups using the "clusterProfiler" package [44]. In the GSEA analysis, we utilized relevant gene sets from the KEGG and GO Biological Process (GOBP) pathways from the MSigDB database [45]. Enrichment was considered significant if the p-value was less than 0.05 after Benjamini–Hochberg correction. Finally, we visualized the results using the "enrichplot" package.
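The two filters described above, the differential-expression thresholds and the Benjamini–Hochberg correction, can be sketched as follows. This is a plain-Python illustration of the arithmetic, not the limma or clusterProfiler implementation; the toy p-values are hypothetical.

```python
def is_deg(p_value, log2_fc, p_cut=0.05, lfc_cut=0.5):
    """Differential-expression filter stated in the text: p < 0.05 and |log2FC| > 0.5."""
    return p_value < p_cut and abs(log2_fc) > lfc_cut


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values (hypothetical) for four gene sets.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

Each raw p-value is multiplied by m / rank, and the running minimum ensures that a gene set with a smaller raw p-value never ends up with a larger adjusted value.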
2.9 Comparison of genetic variants landscape
We utilized the "maftools" package [46] in R to process mutation data and examined the differences in mutation burden between the high-risk and low-risk groups. Through maftools, we presented waterfall plots of the top 30 genes in both high-risk and low-risk groups and conducted chi-square tests to analyze differences in gene mutation frequencies between the two groups. Copy number variation (CNV) data were processed using the GISTIC 2.0 software. We first identified significantly amplified and deleted chromosomal segments and evaluated differences in CNVs across chromosomes. We also calculated the fraction of genome alteration (FGA), the fraction of genome gained (FGG), and the fraction of genome lost (FGL). Finally, we visualized these CNV results using the "ggplot2" package.
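The genome-fraction metrics can be illustrated with a small sketch. The paper does not state how FGA, FGG, and FGL were computed, so the definitions here are assumptions: a segment counts as gained or lost when its log2 copy ratio exceeds +0.2 or falls below -0.2, and each fraction is expressed relative to the total length of all profiled segments. The segment values are hypothetical.

```python
def genome_fractions(segments, threshold=0.2):
    """Compute FGA, FGG and FGL from (length_bp, log2_ratio) segments.

    The +/- threshold and the use of total profiled length as the
    denominator are assumed conventions, not taken from the paper.
    """
    total = sum(length for length, _ in segments)
    if total == 0:
        raise ValueError("no segments supplied")
    gained = sum(length for length, ratio in segments if ratio > threshold)
    lost = sum(length for length, ratio in segments if ratio < -threshold)
    return {
        "FGA": (gained + lost) / total,  # fraction of genome altered
        "FGG": gained / total,           # fraction of genome gained
        "FGL": lost / total,             # fraction of genome lost
    }


# Toy segments (hypothetical): length in bp and log2 copy ratio.
segments = [(100, 0.0), (50, 0.5), (50, -0.6)]
fractions = genome_fractions(segments)
```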
2.10 Statistical analysis
All data processing, statistical analysis, and plotting were performed using R 4.1.3 software. We assessed the correlation between two continuous variables using Pearson correlation coefficient. For categorical variables, we compared them using the chi-square test, while for continuous variables, we used either the Wilcoxon rank-sum test or the t-test for comparison. We used the "survminer" package to determine the optimal cutoff value and conducted Cox regression and Kaplan–Meier analysis using the "survival" package. In this study, we defined p < 0.05 as statistically significant.
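For readers unfamiliar with the Kaplan–Meier analysis mentioned above, the estimator can be sketched in a few lines. This is a plain-Python illustration of the product-limit formula, not the "survival" package implementation; the toy follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 for an observed death and 0 for a censored observation.
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - deaths / at_risk         # product-limit update
        curve.append((t, survival))
    return curve


# Toy cohort (hypothetical): the second patient is censored at time 2.
times = [1, 2, 3, 4]
events = [1, 0, 1, 1]
curve = kaplan_meier(times, events)
```

Censored patients contribute to the at-risk count until their last follow-up but never trigger a drop in the curve, which is what distinguishes this estimate from a simple proportion of survivors.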
3 Results
3.1 Identification of TIIC-RNA at single-cell level
Based on the BLCA scRNAseq dataset, we identified BLCA cells and six types of microenvironmental cells (Fig. 1A). We further selected BLCA cells and four types of immune cells for analysis (Fig. 1B) and displayed the t-SNE plot of immune cells and BLCA cells (Fig. 1C). The top 15% of RNAs by expression level in each immune cell type were identified as potential immune-related RNAs for that cell type, totaling 4,743 RNAs. Based on TSI scores (TSI < 0.25), we further determined 2,807 of these immune-related RNAs to be iuRNAs. The differentially expressed genes (DEGs) in immune cells are shown in Fig. 1D, which were used to validate the definition of cell populations. We calculated the DEGs between immune cells and BLCA cells, as shown in Fig. 1E. Compared to BLCA cells, 309 significantly upregulated DEGs in immune cells were defined as TIIC-RNAs. We applied six ML algorithms for classification, including Boruta, Xgboost, LassoLR, SVM, RF, and Pamr, to identify the 171 most valuable TIIC-RNAs among those previously defined (Fig. 1F).
3.2 Construction of the TIIC signature
We conducted univariate Cox proportional hazards regression analysis to investigate the prognostic value of TIIC-RNAs for OS in BLCA patients. The results identified 29 prognostic TIIC-RNAs in the TCGA dataset (Fig. 2A). Furthermore, we employed three ML algorithms for survival analysis, including CoxBoost (Fig. 2B), LassoCox (Fig. 2C, D), and random survival forest (Fig. 2E, F), which together determined 11 prognostic TIIC-RNAs (Fig. 2G). Subsequently, based on these 11 prognostic TIIC-RNAs, we used 20 ML algorithms for scoring and determined the most reliable model based on the comprehensive C-index of external validation datasets; the superPC algorithm performed best among the 20 ML algorithms (Fig. 2H). TCGA-BLCA patients with high TIIC signature scores exhibited poorer survival outcomes (p < 0.05, Fig. 2I). Time-dependent ROC curves for 1- to 5-year OS demonstrated the prognostic value of TIIC signature scores in TCGA-BLCA (AUC = 0.719, 0.706, 0.713, 0.702, and 0.693, respectively). AUC values for other datasets are shown in the figure (Fig. 2J).
3.3 Comparison of prognostic value between TIIC signature and previous features
In the TCGA dataset, we observed significant correlations between TIIC signature scores and survival status, tumor stage, and TNM staging system (Fig. 3A). Furthermore, in the TCGA dataset, the TIIC signature score achieved a higher C-index than age, gender, tumor stage, and the TNM staging system (Fig. 3B). To further evaluate the prognostic performance of TIIC signature scores, we included 52 prognostic models reported in other literature and compared the C-indexes in TCGA-BLCA and other validation datasets (Fig. 3C). These 52 prognostic models were associated with various biological features. Our TIIC model outperformed most of the other published models in both TCGA-BLCA and the validation datasets.
3.4 Biological mechanisms associated with prediction of TIIC signature score
Given the upregulation of immune-related features observed in the low-scoring group, we aimed to delve deeper into potential biological mechanisms. The TIIC signature score exhibited strong positive correlations with most pathways, such as adherens junction, TGF-beta signaling, and Hippo signaling (Fig. 4A). We selected eight pathways with significant differences between the two groups from the GOBP and KEGG databases and show their sample t-SNE plots along with the corresponding ssGSEA scores (Fig. 4B). Enrichment analysis of upregulated genes in the high TIIC group in Metascape revealed associations with inflammatory response, cell–cell adhesion, and other functions (Fig. 4C). GSEA results for predominant genes in both high and low TIIC groups indicated that the high TIIC group was associated with processes such as collagen fibril organization, external encapsulating structure organization, peptide encapsulating structure organization, and tissue development, while the low TIIC group was associated with growth, regulation of hormone levels, secretion, and signal release (Fig. 4D, E).
3.5 Significant correlation between TIIC signature and immune-related features
We employed the TIMER algorithm for six immune cell types, ssGSEA for 28 immune cell types, the MCPcounter algorithm for 10 immune cell types, and the ESTIMATE algorithm to quantify immune infiltrating cells. We observed a decreasing trend in the activity of most immune cells as the TIIC score increased (Fig. 5A). Additionally, we compared the correlation of the TIIC signature score with immune regulatory factors, as shown in Fig. 5B.
3.6 Validating the predictive value of TIIC signature score for immunotherapy response across multiple datasets
Given the predictive ability of the TIIC signature score for the benefits of immunotherapy, we proceeded to validate its effectiveness in multiple immunotherapy datasets. In the IMvigor cohort, patients with high TIIC signature scores exhibited better survival outcomes (p = 0.00087, Fig. 6A). However, there was no significant difference in TIIC signature scores between the response and progression groups post-immunotherapy (p = 0.77, Fig. 6B). In the COAD and PAAD cohorts, the TIIC signature scores in the effective immunotherapy group were significantly higher than those in the progression group (p = 0.011, Fig. 6C). In the GSE35640 melanoma cohort, there was no significant difference in TIIC signature scores between the response and progression groups post-immunotherapy (p = 0.43, Fig. 6D). In the Braun RCC cohort, patients with high TIIC signature scores exhibited better survival outcomes (p = 0.038, Fig. 6E). However, there was no significant difference in TIIC signature scores between the response and progression groups post-immunotherapy (p = 0.8, Fig. 6F). In the GSE91061 melanoma cohort, there was no significant difference in TIIC signature scores between the response and progression groups post-immunotherapy (p = 0.52, Fig. 6G). In the GSE103668 TNBC cohort, there was no significant difference in TIIC signature scores between the response and progression groups post-immunotherapy (p = 0.69, Fig. 6H). In the Nathanson melanoma cohort, there was no significant difference in survival outcomes between the high TIIC signature score group and the low TIIC signature score group (p = 0.25, Fig. 6I), and no significant difference in TIIC signature scores between the response and progression groups post-immunotherapy (p = 0.7, Fig. 6J). In the GSE78220 cohort, patients with high TIIC signature scores exhibited better survival outcomes (p = 0.03, Fig. 6K), but there was no significant difference in TIIC signature scores between the response and progression groups post-immunotherapy (p = 0.93, Fig. 6L). In the GSE126044 NSCLC cohort, there was no significant difference in TIIC signature scores between the response and progression groups post-immunotherapy (p = 0.44, Fig. 6M). In the GSE165252 ESCA cohort, there was no significant difference in TIIC signature scores between the response and progression groups post-immunotherapy (p = 0.83, Fig. 6N). In the TCGA-BLCA dataset, a higher proportion of non-responders was observed in the high TIIC grouping (p = 6.32e−08, Fig. 6O).
In summary, the association of the TIIC score with prognosis and with immunotherapy response varied across tumor types. Among immunotherapy-treated patients, high TIIC scores were associated with better survival in the urothelial carcinoma (IMvigor) and RCC (Braun) cohorts. In melanoma, the effect of high TIIC scores on prognosis remains undetermined, as the cohorts gave inconsistent results. In COAD and PAAD, patients with higher TIIC signature scores exhibited a better response to immunotherapy. Conversely, in the TCGA-BLCA dataset, the high TIIC group contained a higher proportion of non-responders. For other tumor types, no significant correlation between TIIC scores and immunotherapy efficacy was observed.
3.7 Predicting metabolic features associated with TIIC signature scores
To investigate the extensive metabolic features between the two TIIC signature score groups, we conducted GSVA for metabolic pathways from the KEGG database. TIIC signature scores showed significant correlations with numerous metabolic pathways (Fig. 7A). Notably, Biosynthesis of unsaturated fatty acids and Sphingolipid metabolism exhibited significantly higher activation rates in the high TIIC signature score group, while Glycerophospholipid metabolism and Fatty acid degradation showed significantly higher activation rates in the low TIIC signature score group (p < 0.001, Fig. 7B). Additionally, TIIC signature scores were negatively correlated with pathways such as Fatty acid biosynthesis, Glycerophospholipid metabolism, and Sulfur metabolism, while positively correlated with Sphingolipid metabolism (p < 0.001, Fig. 7C).
3.8 SNV mutation differential analysis and CNV differential analysis
We observed different frequencies of chromosomal alterations between the two TIIC signature score groups (Fig. 8A). The waterfall plot illustrates the mutation status of the top 30 genes in both risk groups, revealing TP53 (49%), TTN (43%), and KMT2D (25.9%) as the most frequently mutated genes (Fig. 8B). The high TIIC signature score group exhibited higher chromosomal instability, as measured by FGA, FGG, and FGL, although the differences were not statistically significant (Fig. 8C–E). Significant differences in CNVs on chr3 were observed between the two groups (Fig. 8F).
4 Discussion
BLCA is a common and highly heterogeneous malignant tumor. According to the World Health Organization (WHO), hundreds of thousands of people are diagnosed with BLCA every year, with some patients missing the optimal treatment window due to late diagnosis, leading to disease progression and deterioration [1]. Despite advances in TURBT or BCG instillation therapy, about 30% of non-muscle-invasive BLCA (NMIBC) patients will eventually progress to muscle-invasive BLCA (MIBC) [3]. Approximately 50% of MIBC patients will experience local or systemic recurrence after radical cystectomy, and some may even succumb to the disease [4]. In recent years, immunotherapy has garnered attention as a novel treatment strategy, particularly PD-1/PD-L1 inhibitors showing promise in the treatment of advanced BLCA [47]. However, many patients still experience immunotherapy failure, making accurate prediction of immunotherapy outcomes in BLCA patients a focus of current research.
Studies have shown that various factors influence the efficacy of immunotherapy, including tumor immunogenicity, TME, immune inhibitory factors, tumor mutation burden, patient immune status, and tumor genotype [48,49,50,51]. Among these factors, the TME plays a significant role in the efficacy of tumor immunotherapy. TIICs, as crucial components of the TME, play roles in immune surveillance, direct killing of tumor cells, regulation of immune responses, and involvement in complex immune regulatory networks [52,53,54]. Therefore, it is necessary to explore the success or failure of tumor immunotherapy from the perspective of TIICs.
In this study, we identified TIICs from BLCA scRNA-seq data and utilized optimized algorithms to filter out immune-related RNAs. BLCA TIIC-RNAs were determined based on calculated TSI scores. Subsequently, we employed various ML algorithms to screen and establish a TIIC signature consisting of 11 TIIC-RNAs (COL6A1, DOCK8, TNFRSF14, MAN2A1, ORMDL1, ADCY7, MAML2, SND1, SLFN11, UBAC2, FAM111A). Higher TIIC signature scores were associated with poorer OS, and the prognostic value of the TIIC signature was superior to that of clinical parameters such as age, gender, and the TNM staging system. We also confirmed the application value of the TIIC signature at multiple omics levels. Firstly, in terms of biological mechanisms, the TIIC signature showed strong correlations with various biological processes. Secondly, in immune cell scoring, the TIIC signature score reflected the level of immune infiltration. Additionally, the TIIC signature score was significantly correlated with the activity of metabolic pathways. Finally, the TIIC signature was associated with the genetic alteration status of patients.
In this study, we employed 20 machine learning algorithms categorized into four groups: decision trees and random forests (RSF, CForest, CTree, Ranger, ObliqueRSF), linear models (LassoCox, Enet, Ridge Regression, CoxPH, StepwiseCox, PlsRcox), boosting methods (BlackBoost, GBM, CoxBoost, GlmBoost), and other approaches (SurvReg, SurvivalSVM, SuperPC, Akritas, Rpart). Decision trees and random forests excel in handling nonlinear and high-dimensional data but suffer from poor interpretability and a tendency to overfit. Linear models offer high computational efficiency and ease of interpretation but assume linear relationships and are sensitive to outliers. Boosting methods provide high predictive accuracy and robustness against overfitting, albeit with high computational complexity and challenging parameter tuning. Other methods, such as SurvReg and SurvivalSVM, bring diversity and specific advantages but have limited applicability and are more difficult to understand and apply. By integrating these algorithms, we can leverage their respective strengths to develop more robust and powerful models.
Our TIIC signature score is calculated from the expression levels of COL6A1, DOCK8, TNFRSF14, MAN2A1, ORMDL1, ADCY7, MAML2, SND1, SLFN11, UBAC2, and FAM111A. It performed well in predicting survival outcomes, indicating the potential future utility of the TIIC signature score. Existing studies have shown the following. TNFRSF14 suppresses BLCA proliferation by promoting apoptosis to inhibit tumor cells from escaping immune surveillance [55]. The MECT1-MAML2 fusion protein activates HES1 transcription to disrupt Notch signaling, and activates CREB, thus mimicking constitutive activation of cAMP signaling [56]. SND1, an endoplasmic reticulum-associated protein, promotes immune escape of tumor cells by directing the MHC-I heavy chain to the ER-associated degradation pathway, interfering with antigen presentation [57]. FAM111A may influence the immune response through the interferon-γ response pathway and is associated with macrophages, CD8 T cells, and other immune cells, thereby playing a significant role in the tumor immune response [58]. UBAC2 binds to BCRC-3 and subsequently affects the interaction between BCRC-3 and miR-182-5p, leading to inhibition of p27 expression and promoting BLCA proliferation [59]. COL6A1 activates multiple downstream signaling pathways, including the JAK-STAT, MAPK, NOTCH, p53, VEGF, and WNT pathways, thereby promoting tumor cell proliferation. DOCK8 positively regulates IL-2-induced STAT5 phosphorylation, promoting regulatory T cell suppression [60]. SLFN11, a key regulator of the DNA damage response, predicts response to various DNA-damaging agents and PARP inhibitors in multiple tumors, serving as a predictive biomarker and prognostic marker for various cancer types [61], [62]. Knockout of the Man2a1 gene enhances the response to anti-PD-L1 therapy and promotes higher levels of cytotoxic T cell infiltration in tumors under anti-PD-L1 therapy [63]. ADCY7 is closely associated with immune cell infiltration and immune checkpoint gene (ICG) expression in various tumors [64]. ORMDL1 promotes colorectal cancer cell migration in multiple ways and inhibits invasion and proliferation [65].
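To make the idea of a gene-expression signature score concrete, the following Python sketch computes a score as a weighted sum over the 11 TIIC-RNAs and splits patients into high and low groups at the cohort median. This is only one common form such a score can take: the actual scoring formula of this study is not given in this section, and the weights, the median cutoff, the function names (signature_score, split_by_median), and the expression values below are hypothetical placeholders, not results.

```python
from statistics import median

# The 11 TIIC-RNAs named in the text.
GENES = [
    "COL6A1", "DOCK8", "TNFRSF14", "MAN2A1", "ORMDL1", "ADCY7",
    "MAML2", "SND1", "SLFN11", "UBAC2", "FAM111A",
]


def signature_score(expression, weights):
    """Weighted sum of expression values over the genes in `weights`."""
    return sum(weights[g] * expression[g] for g in weights)


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical weights chosen only to keep the arithmetic easy to follow;
# they do NOT reflect fitted coefficients or the direction of any gene.
toy_weights = dict.fromkeys(GENES, 0.0)
toy_weights["COL6A1"] = 1.0
toy_weights["TNFRSF14"] = -1.0


def toy_patient(col6a1, tnfrsf14):
    """Build a hypothetical expression profile (all other genes set to 1.0)."""
    profile = dict.fromkeys(GENES, 1.0)
    profile["COL6A1"] = col6a1
    profile["TNFRSF14"] = tnfrsf14
    return profile


cohort = {
    "P1": toy_patient(3.0, 1.0),
    "P2": toy_patient(1.0, 2.0),
    "P3": toy_patient(2.0, 2.0),
}

scores = {pid: signature_score(expr, toy_weights) for pid, expr in cohort.items()}
groups = split_by_median(scores)
```

In practice the weights would come from a fitted survival model and the cutoff would be fixed in the training cohort before being applied to validation cohorts.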
However, this study has several limitations. First, because tumor tissues are highly heterogeneous, scRNA-seq may not accurately distinguish immune cells from BLCA cells, which may introduce some inconsistency in the expression patterns of these 11 TIIC-RNAs. Second, the specific biological functions and mechanisms of these 11 TIIC-RNAs in immune cell activity require further validation both in vitro and in vivo. Last, this study is based on datasets from public databases and is therefore retrospective in nature. Multicenter cohort studies are required to verify the prognostic predictive ability of the model, and more BLCA immunotherapy data are needed to validate the potential of the TIIC signature score in predicting immunotherapy response.
Overall, through comprehensive analysis of the transcriptome data of immune cells and BLCA cells, we established a TIIC signature score that can predict the prognosis of BLCA patients and the efficacy of immunotherapy. As a candidate predictive biomarker, the TIIC signature score may enable more accurate identification of BLCA patients who benefit from immunotherapy, and it warrants prompt clinical validation.
Data availability
The data can be obtained from the corresponding author.
References
Siegel RL, Miller KD, Wagle NS, Jemal A. Cancer statistics, 2023. CA Cancer J Clin. 2023;73:17–48.
Lopez-Beltran A, Cookson MS, Guercio BJ, Cheng L. Advances in diagnosis and treatment of bladder cancer. BMJ. 2024;384: e076743.
Flaig TW, et al. NCCN Guidelines® Insights: bladder cancer, Version 2.2022. J Natl Compr Canc Netw. 2022;20:866–78.
Sanli O, et al. Bladder cancer. Nat Rev Dis Primers. 2017;3:17022.
Patel VG, Oh WK, Galsky MD. Treatment of muscle-invasive and advanced bladder cancer in 2020. CA Cancer J Clin. 2020;70:404–23.
Massari F, et al. Immune checkpoint inhibitors for metastatic bladder cancer. Cancer Treat Rev. 2018;64:11–20.
Wang L, Wang Y, Bi J. In silico development and experimental validation of a novel 7-gene signature based on PI3K pathway-related genes in bladder cancer. Funct Integr Genomics. 2022;22:797–811. https://doi.org/10.1007/s10142-022-00884-2.
Wang L, Wang Y, Wang J, Li L, Bi J. Identification of a prognosis-related risk signature for bladder cancer to predict survival and immune landscapes. J Immunol Res. 2021. https://doi.org/10.1155/2021/3236384.
Xiao Y, Yu D. Tumor microenvironment as a therapeutic target in cancer. Pharmacol Ther. 2021;221: 107753.
Zhang Y, Zhang Z. The history and advances in cancer immunotherapy: understanding the characteristics of tumor-infiltrating immune cells and their therapeutic implications. Cell Mol Immunol. 2020;17:807–21.
Proto JD, et al. Regulatory T cells promote macrophage efferocytosis during inflammation resolution. Immunity. 2018;49:666-677.e6.
Wu Y, Yi M, Niu M, Mei Q, Wu K. Myeloid-derived suppressor cells: an emerging target for anticancer immunotherapy. Mol Cancer. 2022;21:184.
Sun S, Wang Y, Wang J, et al. Wnt pathway-related three-mRNA clinical outcome signature in bladder urothelial carcinoma: computational biology and experimental analyses. J Transl Med. 2021;19:409. https://doi.org/10.1186/s12967-021-03061-4.
Vamathevan J, et al. Applications of machine learning in drug discovery and development. Nat Rev Drug Discov. 2019;18:463–77.
Yamashita R, et al. Deep learning model for the prediction of microsatellite instability in colorectal cancer: a diagnostic study. Lancet Oncol. 2021;22:132–41.
Coudray N, et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24:1559–67.
Kardos J, et al. Development and validation of a NanoString BASE47 bladder cancer gene classifier. PLoS ONE. 2020;15: e0243935.
Qin Y, et al. A cancer-associated fibroblast subtypes-based signature enables the evaluation of immunotherapy response and prognosis in bladder cancer. iScience. 2023;26: 107722.
Li H, et al. TNF family-based signature predicts prognosis, tumor microenvironment, and molecular subtypes in bladder carcinoma. Front Cell Dev Biol. 2021;9: 800967.
Rose TL, et al. Fibroblast growth factor receptor 3 alterations and response to immune checkpoint inhibition in metastatic urothelial cancer: a real world experience. Br J Cancer. 2021;125:1251–60.
Kim W-J, et al. Predictive value of progression-related gene classifier in primary non-muscle invasive bladder cancer. Mol Cancer. 2010;9:3.
Lee J-S, et al. Expression signature of E2F1 and its associated genes predict superficial to invasive progression of bladder tumors. J Clin Oncol. 2010;28:2660–7.
Riester M, et al. Integrative analysis of 1q23.3 copy-number gain in metastatic urothelial carcinoma. Clin Cancer Res. 2014;20:1873–83.
Sjödahl G, et al. A molecular taxonomy for urothelial carcinoma. Clin Cancer Res. 2012;18:3377–86.
Guo CC, et al. Assessment of luminal and basal phenotypes in bladder cancer. Sci Rep. 2020;10:9743.
Ritchie ME, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43: e47.
Mariathasan S, et al. TGF-β attenuates tumour response to PD-L1 blockade by contributing to exclusion of T cells. Nature. 2018;554:544–8.
Butler A, Hoffman P, Smibert P, Papalexi E, Satija R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat Biotechnol. 2018;36:411–20.
Yanai I, et al. Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics. 2005;21:650–9.
TIMER: a web server for comprehensive analysis of tumor-infiltrating immune cells. https://pubmed.ncbi.nlm.nih.gov/29092952/.
Charoentong P, et al. Pan-cancer immunogenomic analyses reveal genotype-immunophenotype relationships and predictors of response to checkpoint blockade. Cell Rep. 2017;18:248–62.
Becht E, et al. Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression. Genome Biol. 2016;17:218.
Yoshihara K, et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat Commun. 2013;4:2612.
Rosario SR, et al. Pan-cancer analysis of transcriptional metabolic dysregulation using The Cancer Genome Atlas. Nat Commun. 2018;9:5330.
Zhou Y, et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun. 2019;10:1523.
Nathanson T, et al. Somatic mutations and neoepitope homology in melanomas treated with CTLA-4 blockade. Cancer Immunol Res. 2017;5:84–91.
Ulloa-Montoya F, et al. Predictive gene signature in MAGE-A3 antigen-specific cancer immunotherapy. J Clin Oncol. 2013;31:2388–95.
Riaz N, et al. Tumor and microenvironment evolution during immunotherapy with nivolumab. Cell. 2017;171:934-949.e16.
Hugo W, et al. Genomic and transcriptomic features of response to anti-PD-1 therapy in metastatic melanoma. Cell. 2016;165:35–44.
Interplay of somatic alterations and immune infiltration modulates response to PD-1 blockade in advanced clear cell renal cell carcinoma. https://pubmed.ncbi.nlm.nih.gov/32472114/.
Parikh AR, et al. Radiation therapy enhances immunotherapy response in microsatellite stable colorectal and pancreatic adenocarcinoma in a phase II trial. Nat Cancer. 2021;2:1124–35.
van den Ende T, et al. Neoadjuvant chemoradiotherapy combined with atezolizumab for resectable esophageal adenocarcinoma: a single-arm phase II feasibility trial (PERFECT). Clin Cancer Res. 2021;27:3351–9.
Birkbak NJ, et al. Overexpression of BLM promotes DNA damage and increased sensitivity to platinum salts in triple-negative breast and serous ovarian cancers. Ann Oncol. 2018;29:903–9.
Wu T, et al. clusterProfiler 4.0: a universal enrichment tool for interpreting omics data. Innovation (Camb). 2021;2: 100141.
Liberzon A, et al. The molecular signatures database (MSigDB) hallmark gene set collection. Cell Syst. 2015;1:417–25.
Mayakonda A, Lin D-C, Assenov Y, Plass C, Koeffler HP. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res. 2018;28:1747–56.
Stenehjem DD, Tran D, Nkrumah MA, Gupta S. PD1/PDL1 inhibitors for the treatment of advanced urothelial bladder cancer. Onco Targets Ther. 2018;11:5973–89.
Escobar G, et al. Tumor immunogenicity dictates reliance on TCF1 in CD8+ T cells for response to immunotherapy. Cancer Cell. 2023;41:1662-1679.e7.
Pitt JM, et al. Targeting the tumor microenvironment: removing obstruction to anticancer immune responses and immunotherapy. Ann Oncol. 2016;27:1482–92.
Chan TA, et al. Development of tumor mutation burden as an immunotherapy biomarker: utility for the oncology clinic. Ann Oncol. 2019;30:44–56.
Mellman I, Chen DS, Powles T, Turley SJ. The cancer-immunity cycle: indication, genotype, and immunotype. Immunity. 2023;56:2188–205.
Tay C, Tanaka A, Sakaguchi S. Tumor-infiltrating regulatory T cells as targets of cancer immunotherapy. Cancer Cell. 2023;41:450–65.
Cózar B, et al. Tumor-infiltrating natural killer cells. Cancer Discov. 2021;11:34–44.
Farhood B, Najafi M, Mortezaee K. CD8+ cytotoxic T lymphocytes in cancer immunotherapy: a review. J Cell Physiol. 2019;234:8509–21.
Zhu Y-D, Lu M-Y. Increased expression of TNFRSF14 indicates good prognosis and inhibits bladder cancer proliferation by promoting apoptosis. Mol Med Rep. 2018;18:3403–10.
Wu L, et al. Transforming activity of MECT1-MAML2 fusion oncoprotein is mediated by constitutive CREB activation. EMBO J. 2005;24:2391–402.
Wang Y, et al. Oncoprotein SND1 hijacks nascent MHC-I heavy chain to ER-associated degradation, leading to impaired CD8+ T cell response in tumor. Sci Adv. 2020;6:eaba5412.
Ji X, et al. Molecular and clinical characterization of a novel prognostic and immunologic biomarker FAM111A in diffuse lower-grade glioma. Front Oncol. 2020;10: 573800.
Gu C, et al. UBAC2 promotes bladder cancer proliferation through BCRC-3/miRNA-182-5p/p27 axis. Cell Death Dis. 2020;11:733.
Janssen E, et al. DOCK8 enforces immunological tolerance by promoting IL-2 signaling and immune synapse formation in Tregs. JCI Insight. 2017;2: e94298.
Raynaud CM, et al. Modulation of SLFN11 induces changes in DNA damage response in breast cancer. Cancer Cell Int. 2023;23:291.
Onji H, Murai J. Reconsidering the mechanisms of action of PARP inhibitors based on clinical outcomes. Cancer Sci. 2022;113:2943–51.
Shi S, et al. Inhibition of man2a1 enhances the immune response to anti-PD-L1 in human tumors. Clin Cancer Res. 2020;26:5990–6002.
Zeng Y, et al. A pan-cancer analysis of the prognostic value and expression of adenylate cyclase 7 (ADCY7) in human tumors. Int J Gen Med. 2021;14:5415–29.
Wang Q, et al. ORMDL1 is upregulated and associated with favorable outcomes in colorectal cancer. Transl Oncol. 2021;14: 101171.
Author information
Contributions
All authors contributed to the study conception and design. Conceptualization: Xiangju Zeng; Methodology: Zhijie Lu; Formal analysis and investigation: Ziqi Liu and Caixia Dai; Writing (original draft preparation): Xiangju Zeng and Hao Su; Writing (review and editing): Shunhua Cheng; Funding acquisition: none; Supervision: Shunhua Cheng. All authors commented on previous versions of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zeng, X., Lu, Z., Dai, C. et al. Establish TIIC signature score based the machine learning fusion in bladder cancer. Discov Onc 15, 368 (2024). https://doi.org/10.1007/s12672-024-01187-7