Introduction

There is good evidence that genetic factors strongly influence the risk of asthma, and associations between numerous genes and asthma have been evaluated over the past decades [1, 2]. Recent genome-wide association studies (GWAS) of asthma have identified several additional asthma susceptibility genes [3-10]. However, little is known about the role of most asthma susceptibility genes during human lung development.

The "developmental origins" hypothesis [11] proposes that specific in utero events at critical periods during organogenesis and maturation result in long-term physiological or metabolic changes, ultimately contributing to disease in later life [12, 13]. Our group previously showed that Wnt signaling genes that were differentially expressed during fetal lung development were associated with impaired lung function in two cohorts of school-aged asthmatic children [14]. These results suggest the importance of early life events in determining lung function. They also highlight the benefit of integrating gene expression and genetic association data to connect transcriptomic events in the early developing lung to genetic associations of lung function in later life.

Asthma is a disease characterized by both airway inflammation and smooth muscle contraction, leading to airway obstruction. Dendritic cells, mast cells, and T-lymphocytes, as well as airway smooth muscle cells, all begin to appear within the lung parenchyma during the pseudoglandular stage of lung development. We therefore hypothesized that genes associated with asthma would be over-represented among genes influencing normal airway development, especially during the branching morphogenesis stage of human lung development. To test this hypothesis, we investigated the role of a well-defined set of asthma susceptibility genes during human and murine lung development. First, 96 asthma genes were identified via a comprehensive search of the current literature. Next, we analyzed their expression patterns in the developing human lung during the pseudoglandular (gestational age, 7-16 weeks) and canalicular (17-27 weeks) stages of development, and in the complete developing lung time series of three mouse strains: A/J, SW and C57BL6.

We show that overall, there was no over-representation of the asthma genes among genes differentially expressed during lung development, which may reflect the diverse ontological contexts of the asthma genes. However, some genes showed a consistent pattern of differential expression in all developing lung data sets, e.g. NOD1, EDN1, RORA, CCL5 and HLA-G, which suggests that these genes play a fundamental role in normal lung development.

Methods

Tissue samples

The human fetal lung tissues were obtained from National Institute of Child Health and Human Development supported tissue databases and microarray profiled as previously described [14, 15]. Creation of the tissue repository was approved by the University of Missouri-Kansas City Pediatric Institutional Review Board. Thirty-eight RNA samples from 38 subjects (estimated gestational age 7-22 weeks or 53-154 days post conception) were included in the analysis (Table 1). The murine data have previously been described and their microarray data are available at NCBI Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo): A/J [16], n = 24 samples; SW [17], n = 11; and C57BL6 mice [18], n = 5 (Table 1).

Table 1 Summary characteristics of included human and murine lung data sets

Microarray analysis

The developing human lung time series data are available at NCBI Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo), GSE14334 (Affymetrix Human Genome GeneChip U133 Plus 2.0 microarray platform). Expression values were extracted and normalized from .CEL files using the Affy package and the Robust Multi-array Average (RMA) method in R/BioConductor (http://www.bioconductor.org), which returns the measured expression signal of each microarray gene probe in logarithmic base 2 scale. Validation of the human microarray analysis by qPCR for genes differentially expressed during lung development was performed previously and demonstrated that 83% of individual gene expression trajectories could be replicated [15]. The developing whole mouse lung transcriptome data from three different mouse strains were extracted and normalized, separately, using RMA in R/BioConductor: 24 samples from A/J (Affymetrix Mu74Av2 platform); 11 samples from SW (Affymetrix Mu11K A and B platforms); and 5 samples from C57BL6 (Affymetrix Mouse 430 Plus 2.0 platform).

Literature search

A PubMed (http://www.ncbi.nlm.nih.gov/pubmed) search was performed on March 8, 2010 using the terms 1) "asthma" together with 2) "genetic association" or "case control" to cover all papers published between July 1, 2008 and December 31, 2009. We applied the following inclusion criteria for an asthma gene: 1) significant association with asthma affection status in at least two populations and 2) at least one significant association study with no fewer than 150 cases and 150 controls or 150 trios. Genes identified through three earlier literature searches based on papers published before July 1, 2008 were also included if they met our two predefined criteria [1, 19, 20]. In addition, all GWAS of asthma published through September 2010 were evaluated, and asthma genes were included if our criteria were met. See the Supplemental data for details about the asthma genes included in our analyses. Mouse orthologues of human genes were identified using NCBI's HomoloGene database (http://www.ncbi.nlm.nih.gov/homologene).
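The two inclusion criteria can be expressed as a simple filter. The following Python sketch is only an illustration of that logic: the record layout (`AssociationStudy`) and the function names are hypothetical and are not taken from the study, which does not describe how the criteria were applied in practice.

```python
from dataclasses import dataclass
from typing import List

# Minimum group size stated in the inclusion criteria.
MIN_GROUP_SIZE = 150


@dataclass
class AssociationStudy:
    """One significant asthma association study for a gene (hypothetical layout)."""
    population: str
    n_cases: int = 0
    n_controls: int = 0
    n_trios: int = 0


def is_large_study(study: AssociationStudy) -> bool:
    """Criterion 2: at least 150 cases and 150 controls, or at least 150 trios."""
    case_control = (study.n_cases >= MIN_GROUP_SIZE
                    and study.n_controls >= MIN_GROUP_SIZE)
    return case_control or study.n_trios >= MIN_GROUP_SIZE


def meets_inclusion_criteria(studies: List[AssociationStudy]) -> bool:
    """Criterion 1 (two or more distinct populations) and criterion 2 together."""
    populations = {s.population for s in studies}
    return len(populations) >= 2 and any(is_large_study(s) for s in studies)
```

Here each record is assumed to represent a study that already reported a significant association; deciding what counts as a distinct population is left to the curator.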

Statistical analysis

Differential gene expression analysis relative to gestational age was performed using a linear regression model (lmFit) as implemented in the Limma package in R/BioConductor. Each microarray gene probe's logarithmic base 2 expression signal was regressed against the gestational age as a continuous variable representing days of the developing lung. We adjusted for multiple testing using the Benjamini and Hochberg method, which controls the false discovery rate (i.e. the expected proportion of false discoveries amongst the rejected hypotheses), and the adjusted p-values were used to declare a significant gene expression pattern over age [21]. "Differentially expressed" refers to an adjusted p-value of <0.05 in the linear regression model. Fisher's exact test was next performed in Stata Statistical Software (College Station, TX) to test whether microarray probes representing predefined asthma genes were over-represented among differentially expressed probes relative to probes representing "non-asthma genes". This analysis was restricted to microarray probes that were gene annotated because the asthma gene probes were all annotated. The same analysis steps were performed in human and murine data sets. Gene ontology (GO) enrichment analysis was performed using DAVID (The Database for Annotation, Visualization and Integrated Discovery) [22, 23].
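As an illustration of two of these steps, the Python sketch below computes an ordinary least-squares slope of log2 expression on age and applies the Benjamini and Hochberg adjustment to a list of p-values. It is a minimal sketch, not the pipeline used in the study: the analysis itself was run with Limma in R/BioConductor, the function names here are hypothetical, and the computation of per-probe p-values is deliberately omitted.

```python
from typing import List, Sequence


def ols_slope(age_days: Sequence[float], log2_expr: Sequence[float]) -> float:
    """Least-squares slope of log2 expression regressed on age (per day)."""
    n = len(age_days)
    mean_x = sum(age_days) / n
    mean_y = sum(log2_expr) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(age_days, log2_expr))
    sxx = sum((x - mean_x) ** 2 for x in age_days)
    return sxy / sxx


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def differentially_expressed(p_values: Sequence[float],
                             alpha: float = 0.05) -> List[bool]:
    """Flag probes whose adjusted p-value is below alpha (0.05 in the text)."""
    return [p < alpha for p in benjamini_hochberg(p_values)]
```

A positive slope corresponds to expression increasing with gestational age and a negative slope to decreasing expression.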

Results

In total, 96 asthma susceptibility genes were identified in the literature (Additional file 1, Table E1 [1, 3-10, 24-96]). All genes show significant association with asthma in at least two human populations, one of which has no fewer than 150 cases and 150 controls or 150 trios. The 96 genes were represented by 220 probes on the human microarray (Table 1). Because not all human genes have a mouse orthologue, the mouse microarray data sets contain slightly fewer asthma genes and corresponding microarray probes.

We found that 28% of all microarray probes in the human data set were differentially expressed during the analyzed lung development period (human estimated gestational age 7-22 weeks) (Table 2). A similar figure was seen in the A/J mouse and somewhat lower figures in the SW and C57BL6 mouse strains. Gene ontology (GO) enrichment analysis using DAVID of the human list of differentially expressed genes returned 879 significant GO terms, of which 6 terms pertain directly to lung development. Among the asthma gene probes, 32% were differentially expressed during early human lung development. While there was a trend towards over-representation (Odds ratio, OR 1.22, CI 0.90-1.62), this was not statistically significant in comparison to the non-asthma gene probes (28%). In agreement with the human data, no over-representation of asthma gene probes was found among probes differentially expressed during lung development in the mouse strains, although there was a trend in the C57BL6 strain (OR 1.41, CI 0.92-2.11) (Table 2).
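The reported odds ratio can be checked approximately from the two rounded percentages given above (32% versus 28%). The short Python sketch below does this; the function names are illustrative. It works from rounded proportions rather than the underlying probe counts, which are not reproduced here, so it cannot recover the confidence interval.

```python
def odds(proportion: float) -> float:
    """Convert a proportion strictly between 0 and 1 into odds."""
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must be strictly between 0 and 1")
    return proportion / (1.0 - proportion)


def odds_ratio(p_group: float, p_reference: float) -> float:
    """Odds ratio of a group proportion relative to a reference proportion."""
    return odds(p_group) / odds(p_reference)


# 32% of asthma gene probes vs 28% of non-asthma gene probes.
print(round(odds_ratio(0.32, 0.28), 2))  # prints 1.21
```

The result of about 1.21 is in line with the reported OR of 1.22; the small difference is presumably due to rounding of the percentages.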

Table 2 Proportion of the asthma gene probes among probes differentially expressed during lung development in human and mouse data sets

Although asthma genes as a group were not differentially expressed more than non-asthma genes during early lung development, some genes were consistently differentially expressed, as listed in Table 3 (see full list in Additional file 1, Table E2). Expression of NOD1, EDN1 and IL4R was positively correlated with gestational age in the human data, whereas expression of ROBO1 and PLAUR was negatively correlated (i.e. lower expression levels at higher gestational age). Among the asthma genes identified in GWAS, ROBO1, RORA, HLA-DQB1, IL2RB and PDE10A showed the most significant evidence of involvement in lung development (all adjusted p < 0.001 for differential expression). Analyses were also done comparing gene expression patterns between the pseudoglandular (primary branching morphogenesis stage) and canalicular stages (with 112 days post conception as the dividing time point between the 2 stages). The list of top genes differentially expressed between these two stages (Additional file 1, Table E3 and Figure E1) corresponds well with the list of top genes using time as a continuous variable (Table 3).
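The stage comparison amounts to dichotomizing each sample at 112 days post conception. A minimal Python sketch of that grouping follows; the function name is hypothetical, and assigning day 112 itself to the canalicular stage is an assumption, since the text does not say on which side the boundary falls.

```python
# Dividing time point between the two stages, as stated in the text.
STAGE_CUTOFF_DAYS = 112


def lung_stage(days_post_conception: float,
               cutoff: float = STAGE_CUTOFF_DAYS) -> str:
    """Assign a sample to a developmental stage by days post conception.

    Assumption: samples at exactly the cutoff are labeled canalicular.
    """
    if days_post_conception < cutoff:
        return "pseudoglandular"
    return "canalicular"


# The human samples span 53-154 days post conception.
print(lung_stage(53), lung_stage(154))  # prints: pseudoglandular canalicular
```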

Table 3 Gene expression analysis of specific asthma genes and evidence for differential expression during human lung development (adjusted p < 0.001 cut off)

Next, we evaluated all differentially expressed asthma genes in the human data set to see which genes showed a consistent expression pattern across human and murine data sets. Table 4 shows all genes with at least one significant probe per gene in the human data and at least one significant probe in a mouse data set (n = 19 with adjusted p-value <0.05). Eight genes had one or more significant probes in all data sets, with NOD1, EDN1, CCL5, RORA and HLA-G showing the most consistent expression patterns across human and mouse (see detailed EDN1/Edn1 expression over time in human and mouse lung tissue; Figures 1 and 2). In terms of bio-ontologic enrichment, the 19 asthma genes consistently differentially expressed in human and mouse lung development were enriched for ontological attributes "Regulation of cytokine production" (IRAK3, CD86, NOD1, TNF, IL18, SCGB1A1) and "Regulation of cell activation" (STAT6, CD86, IL18, IL4R, RORA, SCGB1A1) (Additional file 1, Table E4). In terms of gene product characteristics, "Disulfide bond", "Secreted" and "Signal peptide" are attributes of a majority of the genes. Fifteen of the 19 genes in Table 4 have been extensively studied in human and murine experiments that support their involvement in asthma pathogenesis (Additional file 1, Table E5).
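The cross-species selection described here is a set operation on per-probe results. The Python sketch below illustrates it under stated assumptions: the function names and input layout are hypothetical, mouse probes are assumed to be already mapped to human gene symbols via their orthologues, and the gene names in the usage note are placeholders, not study results.

```python
from typing import Dict, Iterable, Set, Tuple

ProbeResult = Tuple[str, float]  # (gene symbol, adjusted p-value) for one probe


def significant_genes(probes: Iterable[ProbeResult],
                      alpha: float = 0.05) -> Set[str]:
    """Genes with at least one probe whose adjusted p-value is below alpha."""
    return {gene for gene, adj_p in probes if adj_p < alpha}


def consistent_genes(human: Iterable[ProbeResult],
                     mouse: Dict[str, Iterable[ProbeResult]],
                     alpha: float = 0.05) -> Tuple[Set[str], Set[str]]:
    """Return genes significant in human and (a) any or (b) all mouse data sets."""
    human_sig = significant_genes(human, alpha)
    mouse_sig = [significant_genes(p, alpha) for p in mouse.values()]
    in_any = human_sig & set().union(*mouse_sig)
    in_all = human_sig.intersection(*mouse_sig)
    return in_any, in_all
```

For example, with placeholder inputs `human = [("GENE_A", 0.001), ("GENE_B", 0.03)]` and `mouse = {"strain_1": [("GENE_A", 0.01)], "strain_2": [("GENE_A", 0.04), ("GENE_B", 0.01)]}`, the first returned set contains both genes and the second contains only `GENE_A`.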

Table 4 Genes with at least one significant probe per gene in the human data and in at least one mouse data set (adjusted p-value <0.05)

Figure 1 Expression of EDN1 in human lung tissue in relation to time (days post conception), p = 1.6E-6 for differential expression. The fitted line through the data represents the beta coefficient from linear regression analysis.

Figure 2 Expression of Edn1 in mouse whole lung tissue in relation to time (days post conception). Solid circles represent the A/J data (p = 0.007 for differential expression), open squares represent the C57BL6 data (p = 0.02) and solid triangles represent the SW data (p = 0.001). The fitted line through each data set represents the beta coefficient from linear regression analysis.

To disentangle pre- and postnatal expression patterns in the murine data sets, separate pre- and postnatal analyses were attempted. However, this subgroup analysis was not meaningful for the SW and C57BL6 data sets because of substantially reduced sample size. The A/J data contain two prenatal time points (days 11 and 17), each with 4 unique samples; Table E6 shows overlapping results for human and prenatal A/J data. Eight of the previously identified 19 genes with consistent expression pattern across human and murine data sets (Table 4) were also identified when prenatal A/J data were used (including Edn1).

Discussion

Little is known about the role of most asthma susceptibility genes during human lung development. Here we present a thorough evaluation of gene expression patterns of currently published asthma genes in the developing human and murine lung. While there was no general over-representation of asthma genes among differentially expressed genes, some asthma genes were consistently differentially expressed in multiple developing lung transcriptomes, e.g. NOD1, EDN1, CCL5, RORA and HLA-G, suggesting key functional roles in lung development.

Determinants of normal lung development are critical not only early in life, but also for later lung function. Longitudinal studies have shown that infants with reduced lung function have an increased risk of developing asthma and respiratory illness later in life [97, 98]. Shared genetic factors for reduced lung function in children with asthma and adults who smoke (e.g. MMP12 variants) emphasize the role of genetics in long-term lung function [99]. Wnt signaling genes (e.g. Wif1, Wisp1) were not identified as asthma genes in our literature search, and were thus not included in our analyses. In our previous article by Sharma et al, Wif1 and Wisp1 were differentially expressed during fetal lung development and polymorphisms in these genes also showed association with lung function measured as FEV1 and FVC, but association with asthma per se was not tested [14].

The transcriptional control of lung morphogenesis is key for normal development from primordium to a fully differentiated, functioning organ [100, 101]. Human lung growth has historically been categorised into five stages based on histological and anatomical characteristics: embryonic (26 days to 5 weeks), pseudoglandular (5-16 weeks), canalicular (16-26 weeks), saccular (26 weeks to birth), and alveolar (birth to 6 months) [100]. Additional "molecular" phases within the pseudoglandular stage have been observed, which extends our knowledge of lung development beyond traditional embryology [15].

GWAS have contributed important knowledge about underlying functional genetics in many complex diseases [102]. The majority of trait-associated SNPs show weak to moderate effect sizes, which supports earlier evidence that complex diseases result from several genetic and, often, environmental factors. Evidence of a functional role is also lacking for most identified genes. To increase our understanding of the mechanism and potential function of asthma susceptibility genes identified in published GWAS and "classic" asthma candidate genes, we evaluated their gene expression patterns in the developing human lung. Comparative analyses also showed that many of the differentially expressed genes in the human data set were also differentially expressed during murine lung development. Among the GWAS asthma genes, ROBO1, RORA, HLA-DQB1, IL2RB and PDE10A were differentially expressed in the human data. These genes represent a wide range of structural and ontological families with different assumed functions, but their potential involvement in lung development has not previously been thoroughly evaluated. Regulation of cytokine production and cell activation were the most significant bio-ontologic attributes of genes differentially expressed during lung development.

Using the murine data sets for comparative analyses, RORA, which encodes a nuclear hormone receptor, showed the most consistent expression pattern (expression positively correlated with gestational age in all data sets). ROBO1 expression, on the other hand, was negatively correlated with gestational age in all tested data sets (albeit significant in only 2/3 sets), which indicates an important effect early in the developing lung and then a diminishing effect over time. The ROBO1 protein is involved in axon guidance and neuronal precursor cell migration. PTGDR, WDR36, PRNP, DENND1B, PDE4D, TLE4 and TSLP also showed weak evidence of differential expression in the human data using adjusted p < 0.05 as cut off (Additional file 1, Table E2), but none showed consistent gene expression patterns in the murine data sets.

NOD1 showed the strongest evidence for differential expression in the human data and this pattern was consistent in the C57BL6 strain. However, Nod1 was not represented on the platforms used for analyses on the A/J and SW strains and could thus not be evaluated in these data sets (also true for another asthma gene with consistent expression patterns, PCDH1 [52]). NOD1 encodes a cytosolic protein which contains an N-terminal caspase recruitment domain (CARD) and plays an important role in recognition of bacterial compounds and initiation of the innate immune response [103]. Little is known about the role of NOD1 during lung development and our findings indicate that NOD1 could make an important contribution.

EDN1 was the second most differentially expressed asthma gene in the human data set and very consistent expression patterns were found in all murine data sets. EDN1 was also among the most highly differentially expressed genes in the embryonic stage analyses (pseudoglandular vs canalicular). In general, embryonic stage results were very similar to the results using time as a continuous variable. EDN1 belongs to a family of secreted peptides produced by vascular endothelial cells with multiple effects on cardiovascular, neural, pulmonary and renal physiology [104, 105]. EDN1 shows involvement in pulmonary hypertension, fibrosis, obstructive diseases and acute lung injury, and is also required for the normal development of several tissues. Mice lacking the Edn1 gene die of respiratory failure at birth and show severe craniofacial abnormalities, as well as cardiovascular defects [106, 107]. Transgenic mice with lung-specific over-expression of the human EDN1 gene, on the other hand, develop chronic lung inflammation and fibrosis [108]. Edn1 heterozygous knockout mice also show increased bronchial responsiveness, and these results link EDN1 functionally to asthma and obstructive diseases [72]. To date, three studies report significant association between EDN1 and asthma [41, 109, 110]. Our data, as well as previous studies, point to an important role for EDN1 in normal lung development, which warrants further studies.

Our study has several limitations. Our 38 human lung tissue samples were restricted to the pseudoglandular and canalicular stages. Information about key exposures that could influence gene expression patterns, such as maternal smoking, residential area, and parental allergy, is not available. Thirty-eight samples is a relatively small sample size for expression analyses given human biological variation, and fetal lung tissue from the later stages of gestation was not available. It is possible that some asthma genes are important for human lung development during the later stages of gestation, but we were not able to evaluate this with our current data set. To complement the human data, we analysed expression patterns from early gestational to postnatal stages of lung development in three different murine strains. We used these murine data to replicate, in silico, the human results in the early stages and to infer human gene expression patterns in the later stages of the developing lung. Also, the microarray platforms used in the included data sets do not entirely cover the human (and murine) transcriptome and important genes may have been missed (e.g. GPRA/NPSR1 [111] is not represented on the U133 Plus 2.0 microarray chip and could not be evaluated). Protein analyses could provide a better understanding of specific gene functions and of post-transcriptional regulation, but such data were not available in our study. Our asthma gene list represents genes that met our predefined criteria for asthma association, and some genes may have been missed (e.g. those only captured by the search terms "family based study" AND "asthma"). Given the rapid rate at which novel asthma susceptibility loci are being discovered, some of the most recent asthma genes may also have been missed. These limitations may introduce a potential null bias in the analysis.

Conclusions

We have evaluated gene expression patterns of asthma susceptibility genes identified via a comprehensive literature search of candidate gene studies and GWAS published to date. We found strong and consistent evidence of differential expression of several asthma genes in the developing human and murine lung. Among genes identified in asthma GWAS, ROBO1, RORA, HLA-DQB1, IL2RB and PDE10A showed the most consistent expression patterns, and among asthma candidate genes, NOD1, EDN1, CCL5 and HLA-G were among those identified. Our analyses provide functional insight into asthma susceptibility genes during normal lung development, which improves our understanding of normal and pathological processes related to respiratory diseases in children and adults.