Introduction

Lung cancer is the second most common cancer and the leading cause of cancer deaths worldwide. Smoking-related non-small cell lung cancer (NSCLC) accounts for over 80% of all lung cancers. NSCLC types include lung adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC), which arise from cells of distinct origin and are characterized by different morphological and molecular properties [1].

Basal cells (BCs) are a class of stem/progenitor cells of the tracheal and bronchial airways that can replenish and repair injured or denuded epithelium [2, 3]. Although BCs are a fairly heterogeneous subset [4, 5], canonical BC lineage markers include putative stem cell, squamous, and epithelial proteins, including epithelial cell adhesion molecule (EpCAM/CD326), TP63, cytokeratins, integrin α6 (ITGA6/CD49f), nerve growth factor receptor (NGFR), and podoplanin (PDPN), with a reported population of clonogenic and unrestricted LUSC and LUAD cancer stem cells co-expressing the CD24 protein [6,7,8,9]. Notch pathway specificity and downstream signaling, which involve genes such as Hairy/enhancer-of-split related with YRPW motif 1 (HEY1) and secretoglobin family 1A member 1 (SCGB1A1), have also been shown to play important roles in BC differentiation, proliferation, and carcinogenesis [10, 11]. In NSCLC, NOTCH1 is considered to promote, and NOTCH2-mediated transduction to inhibit, tumor growth and progression [11, 12].

With age and cigarette smoke exposure, healthy BCs can acquire progressive cellular and genomic aberrations in a step-wise fashion and transform into LUSC and LUAD tumor-initiating cells [13, 14]. Genomic and cellular mutations in “epithelial field” BCs, far removed from any primary tumor, may explain the development of secondary synchronous or metachronous lesions in situ and may display progressive programs of lung cancer development [15,16,17]. To date, understanding of the earliest events that drive carcinogenesis in niches of the broad epithelial field remains incomplete.

In this study, we set out to evaluate long-term smoking influences on cytologically normal BCs in the cancer field, considering potential contributions of age and smoking dose. Morphological, proliferative, and Notch-related gene expressional changes that precede lung cancer were investigated. As BCs may ostensibly transform into tumors, understanding early incremental changes and a role for Notch signaling in these (cancer) stem cells may help pave the way to improved lung cancer risk prediction, detection, and a next generation of preventive therapies.

Materials and methods

Donor recruitment and biopsies

Endobronchial brushings were collected from sites contralateral to, or remote (> 5 cm) from, any suspected cancer lesion or other known pathology, under an Einstein-Montefiore IRB-approved protocol. Patient data including age, tobacco smoking history (pack-years), and other information were collected by standard face-to-face research coordinator interview pre-procedure and by electronic medical record chart verification. The final pathologic, bronchoscopic, and, if relevant, subsequent surgical procedure diagnoses were available after operation, per clinical routine and IRB approval. NSCLC types studied included LUAD and LUSC.

Cell culture

BCs were harvested from brush-exfoliated bronchial epithelium and cultured according to a previously reported culture/selection protocol [18]. In brief, bronchial cytologic brushes taken from white-light normal areas were immersed in BEGM media supplemented with growth factors (Lonza, Morris Township NJ) and incubated at 37 °C in a 5% CO2 incubator, with media changed every other day. This method was reported to result in a pure culture of airway basal cells by day 7 in culture [19], with studied cells expressing the BC lineage markers TP63, KRT14, KRT5, podoplanin (PDPN), and nerve growth factor receptor (NGFR) (Supplemental Fig. 1). All experiments were conducted on patient cells of low passage (2–4), with individual figures representing donor subgroups from each category.

Gene expression

Quantitative reverse transcription polymerase chain reaction (qRT-PCR) was performed as previously reported [20]. In brief, RNA was purified using RNeasy (Qiagen, Valencia, CA) and first-strand cDNA synthesis was performed using SuperScript IV (Life Technologies). Real-time PCR reactions were performed using SYBR green in a QuantStudio 3 thermocycler system (Life Technologies). qRT-PCR primer sequences can be found in Supplemental Table 1. Relative changes in gene expression (normalized to glyceraldehyde 3-phosphate dehydrogenase, GAPDH) are provided as -dCt (directly correlating with the observed expression changes) or 2^(−ΔΔCt) (fold difference relative to never smokers) [18].
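The relative quantification described above can be illustrated with a minimal sketch. It assumes that ΔCt is the target Ct minus the GAPDH Ct and that ΔΔCt is taken between group means, with never smokers as the reference group; the function names and Ct values are hypothetical and are not taken from the study data.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """dCt: target-gene Ct minus reference-gene (e.g. GAPDH) Ct."""
    return ct_target - ct_reference


def fold_change(sample_dcts, control_dcts):
    """2^(-ddCt), where ddCt = mean dCt(sample) - mean dCt(control).

    Assumption: group means are compared; other designs pair samples instead.
    """
    ddct = mean(sample_dcts) - mean(control_dcts)
    return 2 ** (-ddct)


# Hypothetical Ct values (target, GAPDH) for two donors per group.
never = [delta_ct(24.0, 18.0), delta_ct(25.0, 19.0)]   # dCt = 6.0 each
smoker = [delta_ct(26.0, 18.0), delta_ct(27.0, 19.0)]  # dCt = 8.0 each

# A 2-cycle higher dCt corresponds to a 4-fold lower transcript level.
print(fold_change(smoker, never))  # 0.25
```

Because each PCR cycle ideally doubles the product, a difference of one dCt unit corresponds to a two-fold difference in transcript level, which is why -dCt tracks expression directly.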

RNA sequencing: RNA-seq expression data were extracted from 39 donors at dbGaP (accession number: phs003317.v1.p1). Initial fastq files were trimmed of flanking adapter sequences using Trim Galore (https://github.com/FelixKrueger/TrimGalore/issues/25); the resulting fastq files were aligned to the human genome (hg38) using the splice-aware aligner STAR (https://www.ncbi.nlm.nih.gov/pubmed/23104886). Directional read counts were obtained using htseq-count with the stranded parameter set to reverse (https://htseq.readthedocs.io/en/release_0.11.1/count.html). Read counts for individual genes were normalized by the total read count.
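As an illustration of the final normalization step, the sketch below divides each gene's count by the sample's total count. The scaling to counts per million is an assumption made here for readability (the text does not state a scale factor), and the function name and example counts are hypothetical.

```python
def normalize_counts(gene_counts, scale=1_000_000):
    """Normalize raw per-gene read counts by the sample's total read count.

    gene_counts: dict mapping gene name -> raw count for one sample.
    scale: assumed scaling factor (counts per million); not stated in the text.
    """
    total = sum(gene_counts.values())
    if total == 0:
        raise ValueError("sample has no counted reads")
    return {gene: count * scale / total for gene, count in gene_counts.items()}


# Hypothetical raw counts for a single donor sample.
sample = {"VIM": 500, "EPCAM": 300, "NKX2-1": 200}
normalized = normalize_counts(sample)
print(normalized["VIM"])  # 500000.0
```

Normalizing by the total makes samples with different sequencing depths comparable, but it does not correct for gene length or for composition effects.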

Immunofluorescence

Cells were grown on a coverslip, rinsed, fixed in 1% paraformaldehyde, and labeled with mouse anti-keratin 14, keratin 5, E-cadherin/CDH1, N-cadherin/CDH2, EpCAM (Invitrogen, Eugene OR); rabbit anti-TP63 (Santa Cruz Biotechnology, Santa Cruz CA), CCND1, NKX2-1 (Invitrogen); and/or goat anti-vimentin (Sigma-Aldrich, St Louis MO). Cells were then washed and treated with either goat anti-mouse or goat anti-rabbit Alexa Fluor 488 or 568, and/or donkey anti-goat Alexa Fluor 488 (Invitrogen). DAPI (4′,6-diamidino-2-phenylindole; Sigma-Aldrich) was applied and cells were covered with Fluoromount-G (Southern Biotech, Birmingham AL). Cell nuclear morphology/morphometry and the fraction of spindle-shaped or Click-iT EdU Alexa Fluor 488 (Invitrogen) positive cells were captured under identical exposures using a motorized Axio Imager M2 with ApoTome system (Zeiss, Germany) and analyzed in Fiji [21]. Counts were performed by two independent observers blinded to patient diagnosis.

Immunoblot

Western blots were performed as previously reported [22]. In brief, cells were rinsed, solubilized, and sheared, and protein concentrations were determined (DC Protein Assay, Bio-Rad, Hercules, CA). Then, 30 µg of protein was loaded and electrophoresed in SDS-polyacrylamide gels (Pierce, Rockford, IL), transferred onto polyvinylidene fluoride membranes (Millipore), and probed overnight at 4 °C. Primary antibodies included mouse anti-human ACTB (1:1,000) and EpCAM (1:500), rabbit anti-human NKX2-1 (1:1,000), and goat anti-VIM (1:500). Goat anti-mouse, donkey anti-rabbit, or anti-goat horseradish peroxidase secondary antibodies were used (1:10,000; Bio-Rad). Protein detection was performed using a ChemiDoc imaging system (Bio-Rad). Signal intensities were normalized to ACTB.

Flow cytometry and FACS analyses

All experiments were performed as previously published by our laboratory [23]. Briefly, cells were dissociated, washed, and treated with Alexa Fluor 488-conjugated rat anti-human/mouse CD49f/ITGA6 (BioLegend, San Diego CA) and CD271 (eBioscience), phycoerythrin (PE)-conjugated mouse anti-human CD326/EpCAM (BioLegend) and podoplanin (BD Biosciences), and allophycocyanin (APC)- or peridinin-chlorophyll-protein (PerCP)-conjugated mouse anti-human CD24 (Invitrogen). For cell cycling analysis we used the Click-iT Plus EdU Alexa Fluor 488 Flow Cytometry Assay Kit (Invitrogen) according to the manufacturer’s instructions. Acquisition was performed on an Attune NxT flow cytometer (Thermo Fisher) or a FACSAria III (BD Biosciences). All analyses were performed using FlowJo software (BD Biosciences, Franklin Lakes NJ).

Statistical analysis

Experimental data were examined for normal (Gaussian) distribution by normality tests. T-tests and ANOVA were performed on normally distributed data, and Mann-Whitney U and Kruskal-Wallis tests on skewed (non-normally distributed) data, using Jamovi [24] with a p-value cutoff of 0.05. At least two technical/biological replicates were performed on each studied sample. Unless indicated otherwise, data are presented as median and quartiles.

Results

Patient study information and BC characteristics

General patient data were collected concomitant with each bronchoscopic sample (Table 1). The following groups were included in the study: (1) never-smoking/non-cancer controls; (2) current or former smokers/non-cancer controls; (3) LUAD cases; and (4) LUSC cases. The average age (± SD) of the total never-smoking population was 58.0 ± 15.6 years, significantly lower than that of the other groups (smoker 66.0 ± 12.6; LUAD 68.5 ± 10.6; LUSC 69.8 ± 10.1; n ≥ 21; P < 0.05 for all). The fraction of smokers (former and current) among cancer cases was 89% (24/27) in LUAD and 95% (20/21) in patients diagnosed with LUSC (P < 0.001). Initial cytopathological analysis was performed to confirm the BC lineage and non-malignant phenotype of the acquired cells, with 86% (12/14) of collected BCs characterized as benign and 14% (2/14) as atypical. Morphologically, none of the assessed cells from cases could be classified as malignant.

Table 1 Descriptive statistics of the enrolled subject population. Smoking status encompasses patients who never smoked (never), former smokers who had quit smoking for more than one year (former), and current smokers (current). Pack years refers to packs per day multiplied by years smoked, and quit years to the number of years since quitting smoking. Never – never smoker; LUAD – lung adenocarcinoma; LUSC – lung squamous cell carcinoma. Data are shown as mean ± SD. Never smokers were significantly younger than the other groups, and the fraction of smokers within the cancer cases was statistically higher than chance (*P < 0.05 and **P < 0.01). #Total number of donors per category aggregating all experiments, with individual figures in the text representing subgroups from each donor category. &Self-reported; ^Medical records/physician/chart verified
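The pack-year definition in the Table 1 legend can be written out as a short calculation. This is only a sketch of the stated definition (packs per day multiplied by years smoked); the function name and the example donor are hypothetical.

```python
def pack_years(packs_per_day, years_smoked):
    """Smoking dose in pack-years: packs per day multiplied by years smoked."""
    if packs_per_day < 0 or years_smoked < 0:
        raise ValueError("packs_per_day and years_smoked must be non-negative")
    return packs_per_day * years_smoked


# Hypothetical donor: 1.5 packs per day for 20 years.
print(pack_years(1.5, 20))  # 30.0
```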

Contrasting proliferative rates and S-phase transition among donor patient BCs

Because BCs are stem cells of the lung, we initially followed BC growth over a period of two weeks in culture (Fig. 1A). In this experiment, we observed accelerated LUAD donor cell growth, manifested as an over 3-fold higher cell count at day 10 than in all other groups: 0.9 ± 0.5 × 10⁵ (never smoker), 0.5 ± 0.4 × 10⁵ (smoker), 3.2 ± 1.0 × 10⁵ (LUAD), and 0.6 ± 0.2 × 10⁵ (LUSC). At day 14, never smoker and LUSC donor cell numbers reached those of LUAD, while the number of procured smoker cells remained significantly lower (n ≥ 4; P < 0.05).

Fig. 1

Reduced smoker and enhanced LUAD donor BC proliferation. (A) BC counts over time in culture. LUAD cell numbers were significantly higher at day 10, and smoker cells lower at day 14. Data are presented as mean ± SEM (n ≥ 5; P < 0.05 for both by ANOVA). (B) Box and whisker plots showing differences in the fraction of EdU incorporating nuclei comparing control (never smoker and smoker) and cases (LUAD and LUSC) cells at day-7 (n ≥ 5, **P < 0.01 by T-test); (C) Representative immunofluorescent micrographs depicting cycling EdU-positive (green) and negative (DAPI-stained; blue) nuclei in BCs from the four groups. (D) Box and whisker plots showing percent of EdU incorporating nuclei from the studied groups in C (median and quartiles; n ≥ 5; *P < 0.05 and **P < 0.01 by ANOVA and T-test comparing controls and cases, respectively). (E) Representative flow cytometry dot-plots of cells treated with EdU for 1.5 h and counterstained with 7-AAD. Cells in the G0/G1-phase can be seen within the bottom left box; S-phase cells in the top elongated box, and G2/M in the bottom right box. (F) Median and quartiles of cells in the S-phase of the cell cycle from panel E by group (n ≥ 5; *P < 0.05 and **P < 0.01 by ANOVA). Passage 2 never-smoker (Never) and smoker controls, and lung adenocarcinoma (LUAD), and squamous cell carcinoma (LUSC) cells were used in these experiments. Scale bar = 20 μm

To validate cell growth differences, we compared DNA replication by measuring uptake of the thymidine analogue 5-ethynyl-2′-deoxyuridine (EdU) across the groups. Immunofluorescence of EdU uptake at day 7 in vitro demonstrated over a two-fold increase of labeled cells in lung cancer-case donor BCs (LUAD + LUSC; 33.1 ± 22.5%) as compared to non-cancer donor BCs (never smoker + smoker; 14.0 ± 12.9%) (n ≥ 5, P < 0.01; Fig. 1B). As expected, this effect seemed to be driven by a > 2-fold increase in replicating LUAD donor cells compared to non-cancer donors (never smoker: 15.8 ± 12.8%; smoker: 12.1 ± 13.9%; LUAD: 40.4.1 ± 29.8%; LUSC: 24.9 ± 15.9%; Fig. 1C and D; n ≥ 5; P < 0.05). We next analyzed the cell cycle distribution of the groups by flow cytometry. While no major differences were observed in the sub-G1, G0/G1, G2, and M phases of the cell cycle, a significantly increased proportion of LUAD donor cells was found in the S phase (never smoker: 14.1 ± 10.5%; smoker: 17.3 ± 12.2%; LUAD: 29.9 ± 8.0%; LUSC: 13.7 ± 12.5%; Fig. 1E and F; n ≥ 5; P < 0.05). These findings indicate reduced BC growth in smokers and accelerated proliferation of LUAD donor cells, which may correspond with a defective G1/S-phase checkpoint.

ITGA6 positivity and reduction of the EpCAMpos/ITGA6pos/CD24pos stem cell fraction in smoker BCs

To identify clonogenic cells of the selected population, we sorted BCs for EpCAM, ITGA6, and CD24 surface markers. Within the BC population, two epithelial subgroups were identified: those positive for EpCAM (over 57%) and the remainder, negative for this pan-epithelial marker (Fig. 2A). Within this population, the percentage of EpCAMpos/ITGA6pos/CD24pos (triple-positive) clonogenic cells displayed a marked decrease from 32.4 ± 30.0% in never smokers to 12.3 ± 12.0% in ever smokers (n ≥ 6; P < 0.05; Fig. 2A and B). The percentage of ITGA6-decorated BCs was higher in never smokers than in the former and current smoker groups combined (n ≥ 6; P < 0.05), driven, intriguingly, by lower surface ITGA6 expression in former smokers (43.6 ± 27.8%) as compared to never and current smokers (73.8 ± 18.7% and 80.9 ± 18.7%, respectively; n ≥ 5; P < 0.05; Fig. 2C and D). In former smokers, ITGA6 cell positivity also declined with patient smoking pack years, implying a distinct cellular identity in this group (n ≥ 6; P < 0.05; Fig. 2E). No major differences in the overall proportion of EpCAM, ITGA6, and CD24 triple-positive cells were observed between the non-cancer and cancer groups (Supplemental Fig. 2).

Fig. 2

Reduction in EpCAMpos/ITGA6pos/CD24pos stem cell and ITGA6pos fractions of donor smoker BC groups. (A) Smoking reduces the fraction of EpCAM/ITGA6/CD24 triple-positive stem cells in the BC population. Representative dot plots demonstrating data from never-smokers (top) and smokers (bottom). From left to right: isotype control, representative EpCAM and ITGA6 staining, and histogram depicting the CD24pos subset from the EpCAMpos/ITGA6pos (upper right) population quadrant. Note that the homogeneous cellular population differs by EpCAM (epithelial) marker expression. (B) Box and whisker plots depicting triple-positive, EpCAMpos/ITGA6pos/CD24pos fractions in the BC population of never and ever smokers. Differences between never and ever smokers are statistically significant for the triple-positive population (n ≥ 6; *P < 0.05 by T-test). (C) Representative histograms illustrating the percent of ITGA6-positive cells in never, former, and current smokers. Transparent – unstained; light grey – isotype control; dark grey – ITGA6-labeled cells. (D) Box and whisker plots depicting median and quartiles of the percentage of ITGA6-labeled donor BCs with, intriguingly, never and current smokers exhibiting higher ITGA6 membrane expression than former smokers (n ≥ 6; *P < 0.05 by ANOVA). (E) Scatterplot depicting a significant reduction in the percentage of ITGA6-expressing donor BCs from former smokers by patient pack-years. No differences were found in the fraction of EpCAM-expressing cells. Color-coded regression lines ± SD (center) and boxplots for each parameter are shown on the margins (n ≥ 6; *P < 0.05 by ANOVA). Passage 2 and 3 never smoker (Never), smoker, lung adenocarcinoma (LUAD), and squamous cell carcinoma (LUSC) cells were used in these experiments

Varying morphological, morphometric, and mesenchymal properties among cancer BCs

To better understand distinct BC fractions, we studied morphological differences among the groups and found cells with an elongated/spindle shape to be most evident among donor-LUSC BCs. As compared to never smokers and smokers, LUSC donors displayed a > 2.5-fold increase in spindle-shaped BCs, with a more modest increase compared to LUAD cells, which reached statistical significance when accounting for patient smoking pack years (n ≥ 3; P < 0.05; Fig. 3A). The percentages of elongated cells by group were 5.4 ± 1.7% (never smoker), 5.1 ± 1.4% (smoker), 10.0 ± 2.3% (LUAD), and 14.3 ± 2.9% (LUSC). To test whether the spindled phenotype might represent epithelial-to-mesenchymal transition (EMT), we stained for the intermediate filament protein vimentin (VIM), a canonical marker of EMT, and the mesenchymal lineage protein CDH2. VIM and, to a lesser extent, CDH2 labeling were detected in a larger proportion of LUSC-donor cells, suggesting an increased mesenchymal tendency for this group (Fig. 3B and Fig. 4). Labeling cells for cytokeratin-14 (KRT14) clearly defined the spindle morphology, while the proliferative protein cyclin D1 (CCND1) demonstrated both nuclear and cytoplasmic localization in LUSC-donor cells, a pattern previously reported to indicate a cellular migratory and invasive state [25]. Nuclear morphometry using the DNA minor groove-binding dye DAPI revealed no differences in chromatin compaction (derived from the mean nuclear gray intensity of the same images [26]); however, the average nuclear area of LUAD donor cells was ≥ 24% larger than that of all other groups: 111.0 ± 38.0 (never smoker), 123.0 ± 31.5 (smoker), 153 ± 21.8 (LUAD), and 92.1 ± 42.4 µm² (LUSC) (n ≥ 4 for each group; P < 0.05; Supplemental Fig. 3). Previously, nuclear size was reported to increase with transition from benign to carcinoma cells [27]. These data establish distinct cancer-related BC phenotypes manifested as mesenchymal (donor-LUSC) and proliferative (donor-LUAD) subset properties.

Fig. 3

Elevated presence of dysmorphic cells with spindle phenotype in BCs from donor LUSC patients. (A) Median and quartiles showing the percentage of spindled cells in all four groups as quantified by two independent double blinded individuals. Data from passage 2 never-smoker (Never), smoker, lung adenocarcinoma (LUAD), and squamous cell carcinoma (LUSC) cells are shown (n ≥ 3; *P < 0.05 by ANOVA). (B) Immunofluorescent micrographs from never smoker and lung squamous cell carcinoma (LUSC) patient cultured cells, labeled from left to right with; Top: DAPI nuclear counterstain (blue), vimentin (VIM; green) and N-cadherin (CDH2; red), and merge. Below: BCs from separate never and LUSC patients labeled from left to right with; DAPI (blue), cyclin D1 (CCND1; green) and cytokeratin 14 (KRT14; red), and merge. Labeling in cells from donor smokers and LUAD did not differ significantly from never smokers

Fig. 4

Epithelial and mesenchymal gene expression properties differ among BC groups. (A) Normalized RNA count (nRNAc) and (B) relative protein levels (rPL) of the intermediate filament gene vimentin (VIM), the epithelial cellular adhesion molecule (EpCAM), and the stem cell gene thyroid transcription factor 1 (NKX2-1) between BCs from the different groups (normalized to β-actin; ACTB). Violin plots are shown (n ≥ 3). (C) Representative immunoblots depicting protein expression within the BC groups. Note the presence of multiple VIM bands in cells from smokers and cases (LUAD, LUSC). (D) Representative micrographs of cells harvested, cultured, and prepared for immunofluorescence demonstrating abnormal cytoplasmic NKX2-1 expression in smokers and the presence of nuclear EpCAM expression in LUAD BCs. Distinct labeling patterns between individual cells of the same group may emphasize populational diversity. From left to right: DAPI nuclear counterstain (grey), VIM (green), EpCAM (red), NKX2-1 (blue), and merge. Passage 2 never-smoker (Never), smoker, lung adenocarcinoma (LUAD), and lung squamous cell carcinoma (LUSC) cells were tested (n ≥ 3)

Differential gene expression and protein localization correspond with differing BC growth characteristics

To better understand BC growth and transitional phases, we tested the expression and distribution of EpCAM and VIM in reference to the homeobox-containing thyroid transcription factor 1 (TTF1/NKX2-1), a regulator of stem cell proliferation and differentiation and a pathological biomarker of LUAD [28]. As demonstrated in Fig. 4A, EPCAM and NKX2-1 expression differed between the groups, with LUSC donor cells displaying an over 2.8-fold reduction in normalized NKX2-1 transcript levels (8.9 ± 6.2) as compared to 56.4 ± 12.1, 25.0 ± 14.4, and 54.7 ± 18.5 in never smoker, smoker, and LUAD cells, respectively (mean ± SEM; n ≥ 3; P < 0.05). While immunoblot did not show significant differences in relative protein quantities (normalized to ACTB; n ≥ 3; Fig. 4B), VIM expression demonstrated distinct molecular bands in donor smoker and cancer cells, suggesting the presence of additional variants and/or modifications (n ≥ 3; Fig. 4C). Immunofluorescence experiments corroborated expression differences among the groups, with enhanced EpCAM nuclear localization in LUAD donor cells (n = 3; Fig. 4D). Of note, while most cells were positive for two of the above three markers by immunofluorescence, a select cellular fraction was observed to express all three proteins, supporting the transitional properties of the studied cells.

Distinct Notch-related gene expression patterns in cancer, ageing, and smoker BCs

We next set out to determine whether the observed differences in gene expression could be influenced by age and smoking pack years. As shown in Fig. 5A, normalized gene counts of CDH1, TP63, SCGB1A1, and the Notch downstream mediator HEY1 all significantly declined with age (n ≥ 3; P < 0.05). To determine whether the reduction in gene expression with age was dependent on group, we repeated this experiment on individual qRT-PCR samples and plotted the results by group. While TP63, SCGB1A1, and HEY1 transcripts tended to decrease with age independent of group, the age-related reduction in CDH1 levels was perturbed in cells from smoker and cancer-patient donors (Fig. 5B; n ≥ 4; P < 0.05).

Fig. 5

BC gene expression level changes with age can be disrupted by smoking and cancer. Scatterplots depicting the relationship between gene expression, age, and patient group in Passage 3 cells. (A) CDH1, TP63, SCGB1A1, and HEY1 transcript levels decrease with age when pooling normalized gene counts of all groups. (B) Correcting for examined groups, in comparison to a decline in never smoker levels of CDH1, expression remains flat in smokers and cancer cases as indicated by patient BC cycle threshold (portrayed as negative, -Ct; normalized to GAPDH). Patient data points, regression line, and standard error of the mean (shaded and in respective colors), as well as parameter boxplots are shown. Never-smokers (Never), smokers, lung adenocarcinoma (LUAD), and lung squamous cell carcinoma (LUSC); n ≥ 3;*P < 0.05 and **P < 0.01

When selected gene expression was plotted against smoking behavior among the pooled groups, only normalized NKX2-1 expression levels declined with pack years. In contrast, between never and ever smokers, expression of SCGB1A1, the immune checkpoint gene CD274, and the Notch pathway genes NOTCH2 and HEY1 significantly decreased, while normalized KRT14 transcript levels increased in ever smokers with pack years by an average difference of ≥ 2 dCt values (Fig. 6A and B; n ≥ 5; P < 0.05 for all). When comparing expression in never, former, and current smokers with pack years, KRT14 was significantly higher in the current smoker group, whereas NOTCH1 and HES1 gene levels were reduced in current smokers by an average difference of ≥ 1.5 and ≥ 2.5 dCt values, respectively (Fig. 6A and B; n ≥ 5; P < 0.05). Differences in Notch pathway gene expression are not surprising, as Notch signaling was shown to regulate BC differentiation and proliferation and promote EMT during oncogenic transformation [29,30,31]. Indeed, in pairwise correlation analyses between the pooled groups for selected Notch signaling transcripts with a candidate S-phase cell division cyclin, cancer proto-oncogenes, and BC lineage markers, we found that while NOTCH1 and NOTCH3 expression were significantly associated with HEY1 in never smoker donor cells, NOTCH1 correlated negatively with NOTCH2, NOTCH3, and CCND1 in smoker donor cells (Supplemental Fig. 4). In comparison, a strong positive correlation between EPCAM, ITGA6, and KRAS was observed in both cancer case-donor groups; NOTCH2 and NOTCH3 were positively associated with MYC expression and further correlated with CCND1 in LUAD, whereas the association of Notch signaling with ITGA6 and CCND1 was lost in LUSC donor cells (n ≥ 9; P < 0.05; Supplemental Fig. 4). These results indicate distinctive Notch pathway activity among the cell groups, emphasized by an intense positive correlation with epithelial and proto-oncogenes in LUAD-donor BCs.

Fig. 6

Epithelial and Notch pathway-related gene expressional changes with smoking dose and smoking status. Scatterplot depicting relative patient transcript levels by smoking pack years and where indicated grouped by smoking status (Ct normalized to GAPDH; negative value). (A) Pooling all groups, only the stem cell protein NKX2-1 demonstrated a decline in BC expression with smoking pack years. Among never and ever smokers, KRT14 transcript levels are higher and SCGB1A1, and CD274 are lower in smokers as compared to that of never smokers. KRT14 levels significantly increase in current smokers with smoking dose. (B) NOTCH2 and HEY1 expression is lower in ever smokers, while NOTCH1 and HES1 levels are specifically reduced in current smokers. Color-coded regression lines ± SEM (shaded) and boxplots representing each group are shown (n ≥ 5; *P < 0.05 and **P < 0.01). Results from passage 3 never-smoker (Never), smoker, lung adenocarcinoma (LUAD), and lung squamous cell carcinoma (LUSC) cells are shown

Discussion

In this study we set out to identify plausible changes of early lung carcinogenesis by examining BCs of smokers, and those with extant but anatomically remote tumors. We further correlated differences with patient age and smoking pack years. Our findings are consistent with reported molecular and functional changes in BCs from smokers [3, 32, 33], and expand on these changes with age- and smoking pack-years and in smoking-related cancers. We also identify Notch signaling trends that presumably precede retarded BC growth in smokers, dysplasia in LUSC, and hyperproliferation in LUAD donor cells. These findings may help reveal distinct biological processes active in early smoking-related NSCLC carcinogenesis, perhaps allowing future risk assessment, earlier cancer identification, and targeted engagement of preventive therapies.

Our results indicate that the cultured BC population, distinguished by EpCAM cell surface marker expression, exists in a transitional epithelial/mesenchymal state, with the fraction of ITGA6pos and clonogenic EpCAMpos/ITGA6pos/CD24pos cells (which increase in never smokers with donor-age and smoking pack-years) largely reduced in smoking donors. We posit that smoking impairs BC proliferation and differentiation and disrupts populational steady-state, a view further supported by a decrease in NOTCH1, HES1, and SCGB1A1 transcript levels, the latter previously reported in smoker BCs [33]. We also identified a smoking-specific populational elevation in BCs (cancer cases included) of KRT14 expression, which is correlated with airway epithelial injury and metaplasia [34], suggesting the use of this transcript as a potential biomarker of smoking-related damage and cancer.

EpCAM (epithelial) and VIM (mesenchymal) protein expression levels and cellular localization patterns, which changed in our study, were previously shown to serve as important cancer prognostic markers: EpCAM has been reported to induce target genes that include CCND1 and the MYC proto-oncogene, and VIM to destabilize Notch-mediated pathway signaling [35,36,37]. Indeed, defective Notch signaling coupled with a large dysplastic spindled subset in LUSC donor BCs may indicate acquisition of neoplastic properties, as spindled cells have been linked to cancer cell stemness and the pleomorphy seen in lung carcinomas in situ [38, 39]. Thus, our findings may elucidate early molecular patterns that confer cancer-related growth properties upon smoker lung epithelial progenitors.

LUAD is also strongly associated with tobacco smoke exposure, and LUAD donor BCs exhibited robust cellular hyperproliferation, with lines of evidence pointing toward a defect in G1/S checkpoint regulation, premature entry into the S-phase, and an average increase in nuclear size, a phenomenon reported during the transition of benign breast disease cells to carcinoma [27]. As LUAD is considered to arise in the distal lung, the reason for these findings in cells procured from the proximal lung remains obscure but may be explained by confounding observations that include aberrant BC transition and/or Notch programming changes, shown to occur as early as the atypical adenomatous hyperplasia stage of LUAD carcinogenesis [34, 40]. While others indicate NOTCH1 to be active in smoker BCs and carcinogenesis [2, 41], we identify an association of Notch and its mediator with the MYC and KRAS proto-oncogenes and with CCND1, a key regulator of S-phase entry (and second most frequently amplified gene in solid cancers). In fact, KRT14 and SCGB1A1 expressional defects, reported in adenosquamous cancer progenitor cells, can also be directly related to Notch signaling [37, 42, 43]. Taken together, atypical Notch and cancer gene expression, select lineage anomalies, and growth abnormalities support our hypothesis that “cancer field” BCs exhibit pre-cancer or cancer-related properties. This assumption and the precise signals and sequence of events that can promote BC oncogenic transformation warrant future studies.

These findings should be interpreted in the context of the study design, which selects for cytologically normal bronchial BCs in smoking and cancer patients and compares them to never-smoking controls. One limitation inherent in invasive human studies such as this one is the cross-sectional, single-timepoint design, which captures events that, in actuality, unfold over time. Another is the small sample size, coupled to clinically relevant covariates inherent in such a study (age, smoking, cancer subtype, etc.), which unfortunately did not permit robust multivariate analyses. As for bias, the average age of the control group was lower than that of case individuals, and the smoking dose (pack years) was found to be significantly higher in cancer cases, introducing possible age- and smoking dose-related bias. Similarly, within the studied cancer groups, patient cancer stages from IA to IV were pooled, perhaps weakening our findings of cancer stem cell development.

Because BC cultures have been shown to accumulate mutations and select for an activated state of injured or airway-repairing cell derivatives [44], we worked to minimize driver gene somatic mutations and clonal selection in vitro by performing experiments on low-passage cells in short-term cultures, which our group had previously demonstrated to reduce confounding factors in the single cell [17]. Moreover, as BCs were procured remote from any tumor, cytologically classified as non-malignant, and detected in non-cancer groups, these cells do not plausibly represent tumor or tumor-disseminating cell lineages. Next, while we refer to the EpCAMneg BC subset as a product of EMT, we cannot rule out that this population is a result of changes in protein stability/folding associated with the cultivation or cell dissociation methods used. Alternatively, these cells may represent a contaminating population of stromal or hematopoietic lineage, previously shown by our laboratory to accompany lung stem cells in primary culture [45]. While in such a case our results would not directly reflect on BC traits, these findings could help illuminate properties of the immediate (stem cell) niche. Finally, it is possible that the EpCAMneg fraction of the BC population represents a previously reported dormant stem cell population, protected from smoking-triggered mutations, that upon smoking cessation can revert to (and outcompete mutated) epithelial BCs to repair the lung [2]. Future studies focusing and expanding on selective pressures, immunophenotype, and malignant potential, which include a viable equilibrium between EpCAMpos and EpCAMneg lineages, changes in nerve growth factor receptor pathway activity, and BC clonogenicity and invasion, are warranted to facilitate our understanding of smoking-related lung injury and “low mutational” progenitor cell carcinogenesis.

In summary, our findings uncover smoking-, age-, and cancer-related phenotypic and molecular footprints of broad field BCs, some of which may be driven by early changes in Notch signaling. Understanding transitional developments in smoker BCs can potentially be leveraged into strategies leading to earlier lung cancer detection and the development of prevention therapies.