Abstract
Pancreatic ductal adenocarcinoma (PDAC) has the worst prognosis of all common cancers. However, divergent outcomes exist between patients, suggesting distinct underlying tumor biology. Here, we delineated this heterogeneity, compared interconnectivity between classification systems, and experimentally addressed the tumor biology that drives poor outcome. RNA-sequencing of 90 resected specimens and unsupervised classification revealed four subgroups associated with distinct outcomes. The worst-prognosis subtype was characterized by mesenchymal gene signatures. Comparative (network) analysis showed high interconnectivity with previously identified classification schemes and high robustness of the mesenchymal subtype. From species-specific transcript analysis of matching patient-derived xenografts we constructed dedicated classifiers for experimental models. Detailed assessments of tumor growth in subtyped experimental models revealed that a highly invasive growth pattern of mesenchymal subtype tumor cells is responsible for its poor outcome. In conclusion, by developing a classification system tailored to experimental models, we have uncovered subtype-specific biology that should be further explored to improve treatment of a group of PDAC patients who currently derive little therapeutic benefit from surgical treatment.
Introduction
Pancreatic ductal adenocarcinoma (PDAC) is the most lethal of all common solid tumors with reported 5-year survival rates below 8%1. This poor prognosis can be largely attributed to diagnosis at late disease stage, when surgical resection is no longer possible. In patients eligible for surgery, systemic chemotherapy, like gemcitabine and more recently FOLFIRINOX, only marginally prolongs survival2,3,4.
Clinical trials have typically failed in unselected PDAC cohorts, demonstrating that patient classification is essential for the efficacy of novel therapeutic approaches5. The expectation was that mutational profiling would better identify surgical candidates or predict favorable responses to (neo)adjuvant or palliative systemic treatment6. Although this approach identified many genetic alterations in PDAC7,8,9,10,11, clinical applicability remains limited12. Instead, capturing the intertumor heterogeneity that underlies poor outcomes or responsiveness is arguably best achieved by transcriptome analysis. Transcriptomic analyses capture both the intrinsic and extrinsic factors that drive tumor cell growth. We speculate that the interplay between these two variables drives the response to treatment. For pancreatic cancer, gene expression profiling and supervised class discovery studies have been performed13,14,15,16,17,18,19. In addition, effective unsupervised RNA-based classification studies, also in PDAC, have been completed20,21,22,23,24,25,26,27. In spite of the proposed molecular classifications thus far, the cellular and molecular mechanisms that cause the observed disease outcomes or therapeutic responses remain largely unknown. We still lack critical information to study the underlying disease-relevant mechanisms, which will allow the development of much needed new treatments.
To address this need, we first performed unsupervised classification on a unique single-center single-platform set of histopathologically revised PDAC-only samples from patients who underwent surgery. This revealed the existence of subgroups of PDAC with highly divergent outcomes, which bear high similarity to previously identified molecular subtypes as revealed by network analysis comparisons. Next, using these data and matching patient-derived xenografts, we established a molecular classification of experimental models for PDAC by constructing dedicated PDX- and cell line-optimized classifiers. We were able to experimentally demonstrate that the worst-outcome subgroup is defined by highly invasive tumor growth and that these mesenchymal features are tumor cell-intrinsic. The ability to identify those patients that will likely not benefit from surgery alone will dramatically aid clinical decision-making. In addition, the identification of experimental models for mesenchymal PDAC could be used to reveal subtype-specific vulnerabilities that could ultimately increase survival rates of those patients that do not benefit from resection of the tumor alone.
Results
Unsupervised class discovery in PDAC reveals four distinct subgroups
Previous classification studies in pancreatic cancer identified clinically relevant subtypes. However, functional assessment of the tumor biology that underlies subtype-specific outcomes in experimental models has lagged behind. This critical information should lead to more effective patient stratification and development of new therapeutic targets. To address this, we first assembled a large, unique single-center single-platform PDAC-only set with matching model systems. We selected tumor specimens from 230 retrospectively and 115 prospectively collected samples. After assessing tumor cellularity and histopathological confirmation of PDAC diagnosis, 90 samples were found to be appropriate for RNA-Seq. Following RNA-Seq, we performed unsupervised consensus clustering and determined the optimal number of clusters to be four (Fig. 1a,b and Supplementary Fig. 1). We then established a 159-gene classifier using PAM28 (Supplementary Table 1) and classified approximately equal patient numbers to the four PDAC subtypes (PDACS; Fig. 1c).
Gene set analysis revealed that each of the distinct PDAC subtypes was characterized by specific biological features (Fig. 2 and Supplementary Fig. 2a). Based on these analyses, we named the identified subtypes secretory, epithelial, compound pancreatic and mesenchymal. The secretory subtype was enriched for endocrine and exocrine functions of the pancreas, as indicated by Moffitt’s endocrine and exocrine factors and β-cell and pancreatic secretion signatures (Fig. 2 and Supplementary Fig. 2). As neuroendocrine transdifferentiation has recently been associated with poor outcome in PDAC29, these enrichments could indicate high lineage plasticity. Epithelial subtype tumors were characterized by Myc signaling, high expression of mitochondrial components and ribosome signatures. The mesenchymal subtype was enriched for signatures for epithelial-to-mesenchymal transition, and stroma and TGF-β signaling. The compound pancreatic subtype featured signatures similar to the mesenchymal subtype but with endocrine characteristics.
PDACS groups associate with outcome
Overall, clinical parameters, including age and histologic type, were similar among the subtypes (Fig. 1d and Table 1). Targeted mutation analysis for common PDAC driver genes (TP53, SMAD4, KRAS) revealed no correlation with the PDACS groups (P = 0.52, P = 0.15 and P = 0.58, respectively; not shown in table). However, we found significant differences in radicality of resection (i.e. whether the resection margins were free of tumor) and in tumor cell percentage. Patients with epithelial-subtype tumors more often had radical resections (R0; Pearson χ2 test P = 0.013). Assessments of tumor purity and stromal and immune cell content by ESTIMATE30 revealed that the epithelial subtype (PDACS2) was relatively enriched in tumor cells (Fig. 1e). We confirmed this finding by histopathological assessment of tumor cellularity (Fig. 1f and Table 1). We also found that the mesenchymal subtype had the highest KRAS transcript levels (Fig. 1g), which is in line with recent work showing that mesenchymal, high-grade PDAC features genetic gains in oncogenic KRAS31.
Survival analysis with a 10-year follow-up revealed that the secretory and mesenchymal subtypes were associated with much poorer outcomes (median overall survival (OS) 14.7 and 14.0 months, respectively; Fig. 3a), compared with the epithelial and compound pancreatic subtypes (median OS 31.8 and 21.5 months). Pairwise comparison of OS between the subtypes showed that the epithelial and the compound pancreatic subtypes differed significantly from the mesenchymal subtype (Supplementary Table 2). Kaplan-Meier survival analysis revealed that radicality of resection (P = 0.007), lymph node status (P = 0.003) and degree of differentiation (poor vs well P = 0.011; poor vs moderate P = 0.002) were also associated with outcome (Supplementary Fig. 3). We performed multivariate Cox proportional hazard regression analysis to test the statistical significance of the association of PDACS with OS, adjusting for potential confounders including sex, age at diagnosis, radicality of resection, lymph node metastasis and differentiation grade (Supplementary Table 3). The association of the secretory and mesenchymal subtypes with poor outcome remained significant (P = 0.018 and P = 0.036, respectively).
Of note, the correlation of PDAC subtypes with outcome was most apparent in patients with confounding variables associated with favorable outcomes: In the small group of patients with complete radicality of resection (R0), the mesenchymal subtype associated more strongly with poor outcome than any other subtype (median OS 17.2 months, P = 0.004 pairwise comparison with secretory subtype; Fig. 3b). In patients with no evidence for lymph node metastasis (N0), the mesenchymal subtype was also strongly correlated with poor outcome (Fig. 3c; median OS 14.0 months, P = 0.010 pairwise comparison with secretory subtype).
To assess whether our PDACS classifier could identify subgroups with poor survival in other cohorts, we merged published expression datasets and classified samples as mesenchymal (PDACS4) or non-mesenchymal (PDACS1–3). Survival analysis showed that the mesenchymal subtype was also associated with poor outcome in this merged validation cohort (Supplementary Fig. 4a,b)20,32.
Existing classification systems are interconnected
To further validate our classifier, we compared it to existing classifiers by applying it to samples pooled from our cohort and previous studies (Supplementary Table 4). We summarized the statistical significance of association between subtypes in heatmaps (Fig. 4a–c). This approach revealed key correlations among secretory, epithelial, and mesenchymal PDACS with published subtypes. For example, the secretory subtype significantly correlated with subtypes identified in two other studies that share the exo- and endocrine features of the pancreas. The epithelial PDACS also strongly correlated with subtypes from two other studies that represented classical, epithelial tumors (Figs. 4a and 4b, respectively). We also tested all four pancreatic cancer classifications on the pooled samples in a combined network analysis. This analysis showed that our PDAC subtypes and those previously published are highly interconnected and revealed the existence of three major clusters (Fig. 4d). Given its strong association with poor outcome and consistent interconnection with similar identified subtypes, we focused on the mesenchymal subtype and its comparison to the other (non-mesenchymal) subtypes for further experimentation and analysis.
Classification of models for PDAC reveals that mesenchymal features are tumor-cell intrinsic
To study the molecular mechanisms that underlie the poor outcome of the mesenchymal tumor subtype, and discover potential vulnerabilities, we attempted to use the PDACS classifier to identify experimental models that mimic this specific subtype. We found that the patient PDACS classifier did not perform well on non-patient samples, as PDX samples and cell lines were not consistently assigned to subtypes, and that the classifier required modification for use on such models. We had expected this given the extrinsic factors that affect gene expression, especially in PDAC where extrinsic stroma factors have a large impact on tumor biology and substantially differ in these models from the host.
To generate a PDX classifier, we assembled a tumor cell-specific and a stroma-specific classifier using 14 PDXs derived from the patient cohort: RNA-Seq reads from these PDXs were mapped to the human and mouse genomes to allocate the expression of genes to the tumor or stromal compartment (Supplementary Fig. 5a). The fraction of reads mapped to either genome is shown in Fig. 5a and Supplementary Fig. 5b. Correlations of gene expression in unmatched tumor-PDX samples as well as matched donor-PDX pairs are shown in Supplementary Fig. 5c,d. As expected, genes associated with stroma were almost exclusively found in the mouse reads (Fap, Acta2; Fig. 5b), and tumor marker expression was found in reads mapped to the human genome. GSEA with the Moffitt et al. and ESTIMATE stromal gene sets30 revealed strong enrichment in expression assigned to the mouse genome (Fig. 5c,d), providing further support for the validity of our species-specific transcript analysis.
We then used the human reads from the PDXs, in conjunction with the consensus clustered patient data, to train new epithelial classifiers and identify mesenchymal PDXs versus non-mesenchymal (secretory, epithelial, and compound pancreatic grouped; workflow shown in Supplementary Fig. 5a, classifier genes shown in Supplementary Table 5). We generated probability scores for the mesenchymal subtype and a ranking of PDXs, shown in Supplementary Fig. 5e. A heatmap comparison of PDX classification and donor subtype is shown in Supplementary Fig. 5f. Of note, this revealed a degree of incongruence between patient and PDX classification. We take this to imply a high, context-dependent plasticity in pancreatic cancer tissue and cell states, and it further underscores the need to apply experimental model-specific classification methods.
For the cell lines, we used a similar strategy and assembled a classifier for previously established cell lines33. We classified Hs766T, Panc89, PANC-1, and PSN-1 as mesenchymal, and HPAF-II, BxPC3, AsPC-1, Capan-2, and Capan-1 as non-mesenchymal (probability scores shown as heatmap in Fig. 5e, see also Supplementary Fig. 6 for classification of all cell lines). Mesenchymal classification strongly correlated with the expression of mesenchymal identity markers (Vimentin) and invasive growth (CXCR4) as determined by gene expression (Fig. 5e) and flow cytometry (Fig. 5f). Conversely, non-mesenchymal cells highly expressed E-cadherin, EpCAM and ERBB3/HER3.
To determine the functional phenotype of mesenchymal classification, we grew classified cell lines in organotypic cocultures with pancreatic stellate cells34. Importantly, we observed invasive growth patterns for all mesenchymal cell lines (Fig. 5g). In organotypic cultures with non-mesenchymal cell lines, we observed a relatively well-demarcated and differentiated non-invasive epithelial layer. Moreover, tumors grown in vivo from mesenchymal subtype cell lines exhibited poor differentiation compared to non-mesenchymal tumors (Fig. 5h). Using Transwell migration assays, we verified that this invasive growth was not due to enhanced chemotactic capacity (Fig. 5i). Since we showed that invasive growth is a key feature of mesenchymal PDAC cells, our experiments provide important evidence for the functional relevance of our gene expression-based classification. In addition, we did not find differences in the sensitivity to gemcitabine or paclitaxel between mesenchymal and non-mesenchymal cell lines (Fig. 5j), arguing that the invasive growth is responsible for poor outcome, rather than differential sensitivity to commonly used chemotherapeutics against PDAC.
Discussion
We have described gene expression analysis and unsupervised class discovery on a well annotated single-center PDAC-only RNA-Seq expression dataset. These analyses revealed the existence of four subgroups of PDAC that correlate with distinct clinical manifestations. By characterizing the associated tumor biology in silico, as well as in matching patient-derived models and classical cell lines in vitro and in vivo, we found that a highly invasive growth pattern, intrinsic to the tumor cell compartment, is associated with poor-prognosis PDAC.
Despite the known contributions of the mutational spectrum to tumor heterogeneity, we did not uncover enrichments for specific mutations in the PDAC subgroups. Given the large influence that the stroma exerts on tumor cell behavior, it is likely that tumor cell-extrinsic factors contribute significantly to gene expression-based analyses of PDAC, outweighing the contributions of tumor cell-intrinsic mutations25. Furthermore, a recent publication describing the epigenetic determinants of PDAC subtypes suggests that a certain degree of plasticity exists between these molecular subgroups35. This plasticity was further supported by a limitation of our study: the discordance that we observed between PDXs and their donors at the gene expression level. We hypothesize that this is indeed due to the large stromal impact on tumor cell gene expression. We have previously observed that, already within the first passage after grafting, nearly all stromal cells are of mouse origin36. The composition of this mouse stroma, and the influence it exerts on the grafted tumor cells, is different from the original human stroma. This affects the accuracy with which the gene expression profiles of tumor cells, grown in different host species, reflect those from patients. This is in line with a recent study showing that specific macrophage populations in the tumor microenvironment drive squamous subtype tumor biology37. In addition, it is possible that the subcutaneous site of PDX growth hampers comparison to the original site of growth in the human pancreas. Nonetheless, the use of species-specific transcript analysis of PDXs has allowed successful classification of experimental models of PDAC and revealed strong correlations of the mesenchymal subtype with invasive potential in vitro and high-grade tumor growth in vivo.
Additionally, the question arises why the secretory subtype is associated with poor outcome in our full cohort (including R1 and N1 patients). Based on previously published subtyping studies, the mesenchymal subtype would be expected to unequivocally predict poor outcome, but in our cohort this was only apparent when considering the R0 and N0 patients20,21,25. Network analysis revealed the secretory subtype to correlate with the previously described ADEX/exocrine-like subtypes, and these are not known to associate with relatively poor prognosis. A tentative explanation is that poor outcome in our mesenchymal subtype becomes most apparent when clinical confounders for poor outcome are considered in depth, and that in these selected patients, the tumor cell-intrinsic properties of the ADEX/exocrine-like subtypes do not contribute to poor outcome. Tumors that grow in the tail of the pancreas have been suggested to be of mesenchymal/squamous subtype relatively often38. However, of the 90 samples available in our cohort, only two originated from the pancreas tail, precluding meaningful analysis of clinical variables and their associations with our molecular subtypes.
Our compound pancreatic subtype appeared to identify similar samples as did the previously published classical and ADEX/exocrine-like subtypes. However, the biology of this subtype, as revealed by GSEA for KEGG and GO signatures, was decidedly shared with the mesenchymal subtype (Fig. 2 and Supplementary Fig. 2). We propose that the compound subtype does not result from inappropriate classification, but rather from intratumor heterogeneity: the presence of more than one PDACS subtype (one of which is mesenchymal) within the analyzed tissue results in a subtype characterized by features that differentially affect its biology and classification.
The parallels between the subtypes identified by our own as well as previously published classifiers suggest that a unifying transcriptome-based classification for PDAC can be accomplished. Whether this classification should be based solely on the transcriptome of tumor cells, or also include signatures derived from cells within the tumor microenvironment, is currently still a matter of debate. Achieving consensus on the number and the identity of gene expression-based subtypes will aid future clinical implementation39. Such an effort in PDAC could also be extended to include other periampullary tumors, and could even cover the full breadth of gastrointestinal cancers40. This would greatly simplify the application of a molecular classifier and, with careful design, would come at a minimal cost for the detection of rare or organ-specific subtypes41.
We identified PDAC subtypes defined by distinct tumor biology and clinical manifestations. Our subtypes show high interconnectivity with previously published classifications. Our data indicate that the very limited surgical benefit for patients bearing the most aggressive tumor subtype argues against direct surgical resection of such tumors even if otherwise favorable clinical features are present. This poor prognosis following resection is caused by the infiltrative growth pattern rather than a difference in response to chemotherapy.
Methods
Clinical data, tissue collection, and ethical approval
Tumor tissue of patients who underwent a pancreaticoduodenectomy (PD) for PDAC at the Amsterdam UMC, location Academic Medical Centre Amsterdam (AMC) between 1993 and 2015 was retrospectively collected from the fresh frozen tissue archive of the Department of Pathology (n = 230), and from the prospectively collected cohort of the Laboratory for Experimental Oncology and Radiobiology (BioPAN36; n = 115). Retrospective collection was conducted in accordance with ethical guidelines ‘Code for Proper Secondary Use of Human Tissue in The Netherlands’ (Dutch Federation of Medical Scientific Societies), approved by the Academic Medical Center’s institutional review board (Medisch Ethische Toetsingscommissie AMC) under METC_A1 15.0122. For prospectively collected material, informed consent was obtained from all patients in accordance with our hospital’s ethical guidelines (IRB code METC 2018_181). All specimens were snap-frozen in liquid nitrogen and stored at −80 °C. Clinicopathological data were obtained through the departments of Surgery and Pathology and expanded with parameters to include age, sex, type of surgery, chemo(radio)therapy regimen, radicality of surgery, size and differentiation grade of the tumor, and overall survival (Table 1). Total follow-up was over 120 months and median follow-up for living patients was 52 months (range 19–120), and 16 months (range 2–95) for deceased patients. For histopathological revision, selection, and processing of PDAC samples see Supplementary Methods.
Preparation of libraries and processing for RNA-seq
For RNA isolation, 30 sections of 20 μm were cut, and RNA was isolated using RNABee (Bio-Connect, Huissen, the Netherlands) and the RNeasy Mini kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. In most samples the RNA Integrity Number (RIN) was 7 or higher as evaluated by BioAnalyzer (Agilent, Santa Clara, CA; median RIN = 8). In 19 samples the RIN was below 7, but these samples did not stand out in principal component analysis (PCA) of the gene expression profiles. Samples were DNAse-treated. RNA was amplified using the Total Prep RNA Amplification kit (Illumina, San Diego, CA). Poly-A enriched libraries were synthesized using the TruSeq RNA Library Prep kit and sequenced in three batches (Illumina HiSeq2500). All sequencing data were quality-controlled using FastQC42 and found to be of high quality. RNA-Seq reads were aligned to the human reference genome (GRCh38) using Tophat2 (V2.1.0)43 with default parameters, retaining only uniquely mapped reads. Gene expression levels were estimated using Cufflinks (V2.2.1), with default parameters and Gencode V19 for gene annotation, masking rRNAs, tRNAs and chromosome M. The resulting gene expression profiles, measured by RPKM (reads per kilobase of transcript per million mapped reads), were log2-transformed. Non-biological batch effects were inspected using PCA, and corrections were made using Combat44. Subsequent analyses were done on the batch-corrected dataset.
Identification of PDAC subtypes
In order to identify PDAC subtypes, genes with average log2 (RPKM) > 1 and a median absolute deviation (MAD) > 0.5 across samples were retained and median-centered. Hierarchical clustering with agglomerative average linkage was used for unsupervised classification. To evaluate the stability of clustering, consensus clustering was employed45, with 1000 iterations and 95% subsampling ratio. A significant increase in clustering stability was observed from k = 2–4, but not for k > 4 (Supplementary Fig. 1a–c). Gap statistics46 were calculated for k = 1–8, and a peak was found at k = 4, confirming four robust clusters (Supplementary Fig. 1d).
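The filtering, clustering and resampling steps described above can be sketched in Python as follows. This is an illustration only and not the code used in this study: the function names are hypothetical, the MAD is computed without a scaling constant, and the use of 1 minus Pearson correlation as the distance between samples is an assumption, since the distance metric is not stated here.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def filter_and_center(log_rpkm, min_mean=1.0, min_mad=0.5):
    """Keep genes (rows) with mean > min_mean and MAD > min_mad, then median-center them."""
    med = np.median(log_rpkm, axis=1, keepdims=True)
    mad = np.median(np.abs(log_rpkm - med), axis=1)  # unscaled MAD (assumption)
    keep = (log_rpkm.mean(axis=1) > min_mean) & (mad > min_mad)
    return log_rpkm[keep] - med[keep], keep


def cluster_samples(expr, k):
    """Agglomerative average-linkage clustering of samples (columns) into at most k groups."""
    dist = pdist(expr.T, metric="correlation")  # 1 - Pearson r (assumption)
    tree = linkage(dist, method="average")
    return fcluster(tree, t=k, criterion="maxclust")


def consensus_matrix(expr, k, n_iter=1000, frac=0.95, seed=0):
    """Fraction of subsampling runs in which each pair of samples lands in the same cluster."""
    rng = np.random.default_rng(seed)
    n = expr.shape[1]
    together = np.zeros((n, n))
    sampled = np.zeros((n, n))
    size = int(round(frac * n))
    for _ in range(n_iter):
        idx = rng.choice(n, size=size, replace=False)
        labels = cluster_samples(expr[:, idx], k)
        sampled[np.ix_(idx, idx)] += 1
        together[np.ix_(idx, idx)] += labels[:, None] == labels[None, :]
    # Pairs that were never drawn together keep a consensus value of 0.
    return np.divide(together, sampled, out=np.zeros_like(together), where=sampled > 0)
```

Consensus values close to 0 or 1 for most sample pairs indicate stable clustering at a given k; comparing the matrices for increasing k is one way to judge where stability stops improving.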
Generation of the PDACS classifier
To build the PDAC Subtype (PDACS) classifier, we applied two filtering steps to select the most differential and discriminative genes. First, we identified genes significantly differentially expressed (false discovery rate, FDR < 0.01) between each PDAC subtype and the other three using significance analysis of microarrays (SAM; R package ‘siggenes’, V1.44.0)47. Second, for each subtype we selected the top 40 discriminative genes based on the area under the receiver operating characteristic (ROC) curve (AUC; R package ‘ROCR’, V1.0–7)48. The resulting 159 genes in total were used to train a classifier by prediction analysis for microarrays (PAM)28. The classifier was used for classification of samples in the other data sets (Bailey, PACA-AU and TCGA), where we regarded the subtype with the highest posterior probability as being indicative of association with that group. To facilitate classification of gene expression data sets generated from other platforms, we filtered gene expression data by taking the commonly annotated genes between the sets analyzed. For analysis of public patient data sets, cross comparisons between PDAC classification systems, gene set analysis see Supplementary Methods.
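The second filtering step, ranking genes by how well they separate one subtype from the rest, could look roughly like the sketch below. It is not the ‘ROCR’-based code used in this study. The function names are hypothetical, and ranking by the distance of the AUC from 0.5 (so that both up- and down-regulated genes can be selected) is an assumption.

```python
import numpy as np


def auc_one_vs_rest(values, in_class):
    """AUC of one gene for separating in-class samples from the rest (Mann-Whitney form)."""
    values = np.asarray(values, dtype=float)
    in_class = np.asarray(in_class, dtype=bool)
    pos, neg = values[in_class], values[~in_class]
    # Count sample pairs in which the in-class value is higher; ties count one half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


def top_discriminative_genes(expr, labels, subtype, n_top=40):
    """Indices of the n_top genes (rows) whose AUC for `subtype` is furthest from 0.5."""
    in_class = np.asarray(labels) == subtype
    aucs = np.array([auc_one_vs_rest(row, in_class) for row in expr])
    order = np.argsort(-np.abs(aucs - 0.5), kind="stable")
    return order[:n_top], aucs
```

Running this once per subtype and taking the union of the selected genes gives at most 4 × 40 genes; a gene selected for more than one subtype would be counted once.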
Cross comparisons between PDAC classification systems
The strength of pairwise associations between subtypes of different PDAC classification systems was statistically assessed. We first reproduced the classifier used by each subtyping system based on corresponding signature genes and the discovery data set. Subsequently, we performed cross-classifications, i.e. to use each classifier on the discovery data sets of all reported subtyping systems. For each two classification systems, sample enrichment analysis was performed using hypergeometric tests to compare their corresponding classification results. Benjamini-Hochberg (BH) corrected P-values were derived from these tests, indicating the strength of association between the studied two classification systems.
To further systematically elucidate the interrelations between all PDAC classification systems, we employed a network-based meta-analysis approach that was established by us previously39. The network encodes on nodes the information of subtype prevalence and on edges their association calculated by Jaccard similarity coefficient, which is defined by the size of the intersection between two sample sets over the size of their union. To quantify the statistical significance of subtype associations, we performed hypergeometric tests for overrepresentation of samples classified to one subtype in another. The resulting P-values were adjusted for multiple hypotheses testing using the BH method. Using this approach, we built a network consisting of subtypes defined in all subtyping systems, interconnected by statistically significant (BH-corrected, P < 0.001) edges.
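For readers who want to see the edge statistics concretely, the following minimal sketch computes the Jaccard coefficient, a one-sided hypergeometric P-value for the overlap of two sample sets, and BH-adjusted P-values. It uses only the Python standard library; the function names are hypothetical and this is not the implementation used for the network in this study.

```python
from math import comb


def jaccard(a, b):
    """Size of the intersection of two sample sets over the size of their union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hypergeom_enrichment_p(a, b, n_total):
    """P(overlap >= observed) if the |b| samples were drawn at random from n_total samples."""
    a, b = set(a), set(b)
    observed = len(a & b)
    upper = min(len(a), len(b))
    tail = sum(
        comb(len(a), i) * comb(n_total - len(a), len(b) - i)
        for i in range(observed, upper + 1)
    )
    return tail / comb(n_total, len(b))


def benjamini_hochberg(pvals):
    """BH-adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In a network of this kind, an edge between two subtypes would be drawn only if its adjusted P-value falls below the chosen threshold, with the Jaccard coefficient serving as the edge weight.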
Statistical analysis
Clinicopathological parameters were collected and analyzed in SPSS V24 (IBM, Armonk, NY). Statistical testing on continuous variables (>2 groups) was done by ANOVA, and on categorical variables by Pearson Chi-square (χ2) tests. Hypergeometric tests were used to quantify the statistical significance of association between subtypes of different classification systems. To analyze OS, we used the Kaplan-Meier (KM) method, with log-rank tests for calculation of P-values. Multivariate Cox proportional hazards regression analysis was used to test the statistical significance of the association of PDACS subtypes with survival, adjusting for potential confounders (STATA, StataCorp, College Station, TX). Statistical testing on in vitro data was performed using Graphpad Prism 7. Continuous variables were compared between mesenchymal and non-mesenchymal groups by t-test for Gaussian-distributed values or by Mann-Whitney U test for non-Gaussian-distributed values.
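As a reminder of how a median OS is read from a Kaplan-Meier curve, a minimal product-limit estimator is sketched below. It is a plain-Python illustration with hypothetical function names, not the SPSS or STATA procedures used in this study, and it omits confidence intervals and the log-rank test.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each distinct death time.

    events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all subjects that share the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which estimated survival is 0.5 or lower; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

If the curve never drops to 0.5 within the available follow-up, the median is not reached and the function returns None.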
PDX sequence analysis
PDX samples (n = 14, from 13 patients) were processed and sequenced in the same way as the patient samples. Sequence reads from PDX samples were mapped to the mouse (mm10) and human (hg38) genomes and assigned to either one using XenofilteR49. Xenograft-donor matching was confirmed by short tandem repeat (STR) profiling (Promega, Madison, WI) of the donor patients’ gDNA (isolated from blood) and gDNA isolated from the PDX. Gene set enrichment analysis on merged mouse and human sets (Fig. 5c,d) was performed using the software available through the Broad Institute website, using 1000 permutations on the phenotype50. For generation of patient-derived xenografts and cell lines see Supplementary Methods.
Donor-PDX expression comparison
To be able to compare expression profiles from PDX models (which are a mix of expression originating from the transplanted human tumor and murine stromal infiltration from the host) to donor tumors we combined the expression originating from both compartments. First, using the ‘biomaRt’ R package we mapped human genes to their best mouse homologs. Secondly, using this mapping, we summed reads from mouse and human origin to obtain a combined expression value per gene. Finally, these expression values were normalized to reads per million and log2 transformed. For the calculation of the donor-PDX correlation coefficients, genes were selected that showed similar expression behavior in patient tumors and PDX models. For this we used a correlation of correlations approach, only selecting genes with a coefficient above 0.25 (n = 5039)39. This selection was further refined by only including sufficiently variable genes with a standard deviation above 1.7 in both the donor and the PDX dataset (n = 113).
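The merging of human and mouse reads into one expression value per gene can be illustrated as follows. This is a minimal sketch and not the ‘biomaRt’-based R code used in this study: the function name and the dictionary inputs are hypothetical, and the pseudocount of 1 added before the log2 transform is an assumption.

```python
import math


def combined_log_rpm(human_counts, mouse_counts, human_to_mouse, pseudocount=1.0):
    """Sum human and homologous mouse reads per human gene, scale to reads per million, log2."""
    combined = {
        # Genes without a mouse homolog in the mapping contribute human reads only.
        gene: count + mouse_counts.get(human_to_mouse.get(gene), 0)
        for gene, count in human_counts.items()
    }
    total = sum(combined.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    return {
        gene: math.log2(count / total * 1e6 + pseudocount)  # pseudocount is an assumption
        for gene, count in combined.items()
    }
```

For example, a stromal gene such as ACTA2 may have almost no human reads in a PDX but many reads from its mouse homolog Acta2; summing the two restores a value that is comparable to the donor tumor, in which tumor and stroma are both human.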
Epithelial and stromal PDX classifiers
Linnekamp et al. showed that they were able to subtype PDX models derived from colorectal tumors using an epithelial classifier51. Using a similar approach we built an epithelial and a stromal PDACS classifier. To identify tumor-specific genes that were expressed in the epithelial compartment, but not or much less in the stromal compartment (and vice versa for stroma-specific genes), we used the RNA-Seq profiles of our PDAC PDX models (using the human-mouse homology mapping described in the previous paragraph). We calculated the minimum and maximum expression in the epithelial compartment (min.ep and max.ep) and stromal compartment (min.str and max.str) across all samples for each gene and selected genes based on the following criteria: epithelial genes = (min.ep > 1.5 & max.str < 1.5 & min.ep - max.str > 1.25) or min.ep - max.str > 5; stromal genes = (min.str > 1.5 & max.ep < 1.5 & min.str - max.ep > 1.25) or min.str - max.ep > 8.
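The selection criteria can be written out as boolean masks, as in the sketch below. The function and parameter names are hypothetical, and the stromal rule is assumed to mirror the epithelial one (that is, its override term is taken as min.str - max.ep > 8), in keeping with the "vice versa" description above.

```python
import numpy as np


def select_compartment_genes(ep, stroma, floor=1.5, gap=1.25,
                             ep_override=5.0, str_override=8.0):
    """Boolean masks (epithelial-specific, stroma-specific) over genes.

    ep and stroma are genes x samples matrices of expression assigned to the
    human (epithelial) and mouse (stromal) compartment, respectively.
    """
    min_ep, max_ep = ep.min(axis=1), ep.max(axis=1)
    min_str, max_str = stroma.min(axis=1), stroma.max(axis=1)
    ep_gap = min_ep - max_str
    str_gap = min_str - max_ep  # mirrored stromal rule (assumption)
    epithelial = ((min_ep > floor) & (max_str < floor) & (ep_gap > gap)) | (ep_gap > ep_override)
    stromal = ((min_str > floor) & (max_ep < floor) & (str_gap > gap)) | (str_gap > str_override)
    return epithelial, stromal
```

Because each rule compares the minimum in one compartment with the maximum in the other across all samples, a single PDX with atypical expression of a gene is enough to exclude that gene, which makes the selection conservative.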
These two sets of genes were subsequently used to train an epithelial and a stromal PDACS classifier, able to distinguish mesenchymal samples (PDACS4) from the other subtypes (PDACS1–3) (Supplementary Table 5). Input for this procedure was the AMC patient dataset, excluding PDX-donor patients to avoid bias. Before classifier construction the expression was normalized using the method described in Linnekamp et al. to counteract the tumor and stromal dilution effects, after which a support vector machine (SVM) classifier was constructed with the e1071 R package using a linear kernel. The PDX models were then classified using the human and mouse expression components for the epithelial and stromal classifier respectively. Of note, a 4-tier classification of PDAC cell lines was not successful. For cell line classification, reagents, in vitro assays, flow cytometry, organotypic cocultures and migration assays, see Supplementary Methods.
Data availability
RNA-Seq data have been deposited at EMBL-EBI ArrayExpress (E-MTAB-6830).
References
Siegel, R. L., Miller, K. D. & Jemal, A. Cancer Statistics, 2017. CA Cancer J Clin 67, 7–30, https://doi.org/10.3322/caac.21387 (2017).
Burris, H. A. 3rd et al. Improvements in survival and clinical benefit with gemcitabine as first-line therapy for patients with advanced pancreas cancer: a randomized trial. J Clin Oncol 15, 2403–2413 (1997).
Conroy, T. et al. FOLFIRINOX versus gemcitabine for metastatic pancreatic cancer. N Engl J Med 364, 1817–1825 (2011).
Goldstein, D. et al. nab-Paclitaxel plus gemcitabine for metastatic pancreatic cancer: long-term survival from a phase III trial. J Natl Cancer Inst 107 (2015).
Bijlsma, M. F. & van Laarhoven, H. W. The conflicting roles of tumor stroma in pancreatic cancer and their contribution to the failure of clinical trials: a systematic review and critical appraisal. Cancer Metastasis Rev 34, 97–114 (2015).
Pishvaian, M. J. & Brody, J. R. Therapeutic Implications of Molecular Subtyping for Pancreatic Cancer. Oncology (Williston Park) 31, 159–166, 168 (2017).
Biankin, A. V. et al. Pancreatic cancer genomes reveal aberrations in axon guidance pathway genes. Nature 491, 399–405 (2012).
Connor, A. A. et al. Association of Distinct Mutational Signatures With Correlates of Increased Immune Activity in Pancreatic Ductal Adenocarcinoma. JAMA Oncol 3, 774–783 (2017).
Jones, S. et al. Core signaling pathways in human pancreatic cancers revealed by global genomic analyses. Science 321, 1801–1806 (2008).
Waddell, N. et al. Whole genomes redefine the mutational landscape of pancreatic cancer. Nature 518, 495–501 (2015).
Witkiewicz, A. K. et al. Whole-exome sequencing of pancreatic cancer defines genetic diversity and therapeutic targets. Nat Commun 6, 6744 (2015).
Chantrill, L. A. et al. Precision Medicine for Advanced Pancreas Cancer: The Individualized Molecular Pancreatic Cancer Therapy (IMPaCT) Trial. Clin Cancer Res 21, 2029–2037, https://doi.org/10.1158/1078-0432.CCR-15-0426 (2015).
Donahue, T. R. et al. Integrative survival-based molecular profiling of human pancreatic cancer. Clin Cancer Res 18, 1352–1363 (2012).
Haider, S. et al. A multi-gene signature predicts outcome in patients with pancreatic ductal adenocarcinoma. Genome Med 6, 105 (2014).
Kirby, M. K. et al. RNA sequencing of pancreatic adenocarcinoma tumors yields novel expression patterns associated with long-term survival and reveals a role for ANGPTL4. Mol Oncol (2016).
Perez-Mancera, P. A. et al. The deubiquitinase USP9X suppresses pancreatic ductal adenocarcinoma. Nature 486, 266–270 (2012).
Stratford, J. K. et al. A six-gene signature predicts survival of patients with localized pancreatic ductal adenocarcinoma. PLoS Med 7, e1000307 (2010).
Zhang, G. et al. Integration of metabolomics and transcriptomics revealed a fatty acid network exerting growth inhibitory effects in human pancreatic cancer. Clin Cancer Res 19, 4983–4993 (2013).
Badea, L., Herlea, V., Dima, S. O., Dumitrascu, T. & Popescu, I. Combined gene expression analysis of whole-tissue and microdissected pancreatic ductal adenocarcinoma identifies genes specifically overexpressed in tumor epithelia. Hepatogastroenterology 55, 2016–2027 (2008).
Bailey, P. et al. Genomic analyses identify molecular subtypes of pancreatic cancer. Nature 531, 47–52 (2016).
Collisson, E. A. et al. Subtypes of pancreatic ductal adenocarcinoma and their differing responses to therapy. Nat Med 17, 500–503 (2011).
Gutierrez, M. L. et al. Identification and characterization of the gene expression profiles for protein coding and non-coding RNAs of pancreatic ductal adenocarcinomas. Oncotarget 6, 19070–19086 (2015).
Janky, R. et al. Prognostic relevance of molecular subtypes and master regulators in pancreatic ductal adenocarcinoma. BMC Cancer 16, 632 (2016).
Kim, S. et al. Identifying molecular subtypes related to clinicopathologic factors in pancreatic cancer. Biomed Eng Online 13(Suppl 2), S5 (2014).
Moffitt, R. A. et al. Virtual microdissection identifies distinct tumor- and stroma-specific subtypes of pancreatic ductal adenocarcinoma. Nat Genet 47, 1168–1178 (2015).
Puleo, F. et al. Stratification of pancreatic ductal adenocarcinomas based on tumor and microenvironment features. Gastroenterology 155, 1999–2013.e3 (2018).
Maurer, C. et al. Experimental microdissection enables functional harmonisation of pancreatic cancer subtypes. Gut, https://doi.org/10.1136/gutjnl-2018-317706 (2019).
Tibshirani, R., Hastie, T., Narasimhan, B. & Chu, G. Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci USA 99, 6567–6572, https://doi.org/10.1073/pnas.082099299 (2002).
Farrell, A. S. et al. MYC regulates ductal-neuroendocrine lineage plasticity in pancreatic ductal adenocarcinoma associated with poor outcome and chemoresistance. Nat Commun 8, 1728, https://doi.org/10.1038/s41467-017-01967-6 (2017).
Yoshihara, K. et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat Commun 4, 2612 (2013).
Mueller, S. et al. Evolutionary routes and KRAS dosage define pancreatic cancer phenotypes. Nature 554, 62–68, https://doi.org/10.1038/nature25459 (2018).
Cancer Genome Atlas Research Network. Integrated Genomic Characterization of Pancreatic Ductal Adenocarcinoma. Cancer Cell 32, 185–203.e13 (2017).
Deer, E. L. et al. Phenotype and genotype of pancreatic cancer cell lines. Pancreas 39, 425–435 (2010).
Kadaba, R. et al. Imbalance of desmoplastic stromal cell numbers drives aggressive cancer processes. J Pathol 230, 107–117 (2013).
Lomberk, G. et al. Distinct epigenetic landscapes underlie the pathobiology of pancreatic cancer subtypes. Nat Commun 9, 1978, https://doi.org/10.1038/s41467-018-04383-6 (2018).
Damhofer, H. et al. Establishment of patient-derived xenograft models and cell lines for malignancies of the upper gastrointestinal tract. J Transl Med 13, 115 (2015).
Candido, J. B. et al. CSF1R(+) Macrophages Sustain Pancreatic Tumor Growth through T Cell Suppression and Maintenance of Key Gene Programs that Define the Squamous Subtype. Cell Rep 23, 1448–1460, https://doi.org/10.1016/j.celrep.2018.03.131 (2018).
Dreyer, S. et al. Defining the molecular pathology of pancreatic body and tail adenocarcinoma. Br J Surg 105, e183–e191 (2018).
Guinney, J. et al. The consensus molecular subtypes of colorectal cancer. Nat Med 21, 1350–1356 (2015).
Liu, Y. et al. Comparative Molecular Analysis of Gastrointestinal Adenocarcinomas. Cancer Cell 33, 721–735.e8, https://doi.org/10.1016/j.ccell.2018.03.010 (2018).
Bijlsma, M. F., Sadanandam, A., Tan, P. & Vermeulen, L. Molecular subtypes in cancers of the gastrointestinal tract. Nat Rev Gastroenterol Hepatol 14, 333–342 (2017).
Andrews, S. FastQC: A Quality Control Tool for High Throughput Sequence Data (2010).
Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol 14, R36, https://doi.org/10.1186/gb-2013-14-4-r36 (2013).
Johnson, W. E., Li, C. & Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127, https://doi.org/10.1093/biostatistics/kxj037 (2007).
Monti, S., Tamayo, P., Mesirov, J. & Golub, T. Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data. Mach Learn 52, 91–118 (2003).
Yan, M. & Ye, K. Determining the number of clusters using the weighted gap statistic. Biometrics 63, 1031–1037, https://doi.org/10.1111/j.1541-0420.2007.00784.x (2007).
Tusher, V. G., Tibshirani, R. & Chu, G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA 98, 5116–5121, https://doi.org/10.1073/pnas.091062498 (2001).
Sing, T., Sander, O., Beerenwinkel, N. & Lengauer, T. ROCR: visualizing classifier performance in R. Bioinformatics 21, 3940–3941, https://doi.org/10.1093/bioinformatics/bti623 (2005).
Krijgsman, O., Kluin, R. & Peeper, D. XenofilteR. GitHub.
Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA 102, 15545–15550 (2005).
Linnekamp, J. F. et al. Consensus molecular subtypes of colorectal cancer are recapitulated in in vitro and in vivo models. Cell Death Differ 25, 616–633 (2018).
Acknowledgements
The authors sincerely thank the patients for participating in the study. Furthermore, they thank Drs Roel Kluin and Iris de Rink from the Netherlands Cancer Institute for technical support. This work was supported by KWF Dutch Cancer Society grants to M.J.V., O.R.B., and H.L. (UVA 2014-6803) and to M.F.B. and H.L. (UVA 2012-5607, UVA 2013-5932), as well as grants from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CityU 11102317, 11103718), and a grant supported by the Young Scientists Fund of the National Natural Science Foundation of China (81802384) awarded to X.W. None of the funders were involved in the study design or drafting of the manuscript. We thank Life Science Editors for editorial assistance.
Author information
Contributions
F.D., H.D., C.W., O.R.B., M.G.B., J.A.T., L.W., L.B.R., H.W.W. provided biological materials and/or follow-up information. F.D., V.L.V., E.C.S., L.Z., J.B.H., S.R.v.H., J.K., X.W. produced and analyzed RNA-Seq data. F.D., V.L.V., E.C.S., J.B.H., G.K.H., S.K., L.V., H.L., S.R.v.H., J.K., X.W. contributed to the data analysis. V.L.V., M.P.D., M.M., A.S., C.W., M.H., M.F.B. performed experiments. J.V., M.J.V. provided pathological expertise. F.D., H.L., J.P.M., M.J.V., X.W., M.F.B. supervised the project.
Ethics declarations
Competing interests
The authors declare no competing interests. H.L. has acted as a consultant for BMS, Eli Lilly and Company, and Nordic Pharma Group, and has received unrestricted research grants from Bayer Schering Pharma AG, BMS, Celgene, Eli Lilly and Company, Nordic Pharma Group, Philips, and Roche Pharmaceuticals. M.F.B. has received research funding from Celgene, and has acted as a consultant for Servier. None of these were involved in drafting of the manuscript.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Dijk, F., Veenstra, V.L., Soer, E.C. et al. Unsupervised class discovery in pancreatic ductal adenocarcinoma reveals cell-intrinsic mesenchymal features and high concordance between existing classification systems. Sci Rep 10, 337 (2020). https://doi.org/10.1038/s41598-019-56826-9
DOI: https://doi.org/10.1038/s41598-019-56826-9