Background

Skeletal muscle and hepatic insulin resistance are key elements in the pathogenesis of type 2 diabetes mellitus (T2D) [1]. However, T2D is not caused by insulin resistance alone [2]; it is a heterogeneous cluster of conditions rather than a uniform entity [3]. Because of this environmental and hereditary heterogeneity, gene expression profiling alone is limited in its ability to reveal the molecular mechanisms of T2D [4, 5].

Plasma free fatty acids, as a comprehensive indicator, have been assumed to mediate insulin resistance. Lipid profiling has already been applied in T2D studies [6, 7]; for example, free fatty acids have been shown to link insulin resistance with obesity [8]. However, the relationship between lipids and glucose disposal remains to be demonstrated across liver, skeletal muscle, and blood [9, 10]. Here, we integrated lipidomic analysis with gene expression profiling to discover the relationships between diverse lipid species and the biological processes associated with T2D. Using our model, we retrieved statistically significant biological pathways. The findings provide a new strategy for linking blood lipid species with gene expression and illuminate the lipid-associated mechanism of insulin resistance in blood.

Results

Study subjects

The study subjects were balanced in gender and race. Among the 60 controls, 28 were African American (AA; 14 females and 14 males) and 32 were Caucasian (CAU; 14 females and 18 males). Among the 84 patients with T2D, 44 were AA (22 females and 22 males) and 40 were CAU (23 females and 17 males). Compared with AA, CAU had significantly higher blood triglyceride (TG) levels in both the controls (106 ± 54.3 mg/dl in AA vs. 153 ± 77.8 mg/dl in CAU, p = 0.0009) and the patients (157 ± 128 mg/dl in AA vs. 207 ± 98.3 mg/dl in CAU, p = 0.037). There were no significant differences between the two races in the other clinical parameters studied (data not shown). Compared with all controls (both races combined), the patient group was 4.5 years older, had significantly higher body mass index (BMI), blood TG, and fasting glucose, and had lower high-density lipoprotein (HDL). There were no differences in low-density lipoprotein (LDL) or total cholesterol between controls and T2D patients (Table 1).

Table 1 The clinical characteristics of the study subjects

Plasma lipid profile reveals phenotype factors

Plasma lipid profiles are associated with various diseases and phenotypes. To illustrate the relationship between lipid species and gene expression in peripheral blood, we performed unsupervised exploratory factor analysis on the lipid profile and found significant links between the lipid profile and three phenotypes: race, sex, and diabetes (p = 1.87e-6, 9.28e-4, and 3.17e-3, respectively; Wilcoxon rank-sum test). As shown in Figure 1, three CE species (C23:2CE, C23:3CE, C23:4CE) were positively correlated with diabetes, while three ePE species were negatively correlated. For sex and race, at least five and six lipid species, respectively, were correlated. For sex, PE40:5, PE36:4, and PE34:1 tended to be higher in female samples, while LPC18:2 and LPC18:1 were slightly higher in male samples. For race, two SM species (SM22:1 and SM22:0) were slightly higher in black samples, while PE (PE34:2, PE36:3) and PI (PI36:1, PI38:3) species were higher in white samples.
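The link between a factor and a two-level phenotype can be checked with a rank-sum test on the factor scores. The following is a minimal sketch in standard-library Python, not the authors' R code. It uses the normal approximation without tie or continuity correction, and the function names and the example scores are hypothetical.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Assumption: no tie correction and no continuity correction,
    so p-values differ slightly from exact implementations.
    """
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2      # Mann-Whitney U for group_a
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))            # two-sided normal tail
    return u, z, p

# Hypothetical scores on one factor for patients and controls.
scores_t2d = [0.9, 1.4, 0.7, 1.1, 1.8, 1.2]
scores_control = [-0.3, 0.2, -0.8, 0.1, -0.5, 0.4]
u_stat, z_stat, p_value = rank_sum_test(scores_t2d, scores_control)
print(f"U = {u_stat}, z = {z_stat:.3f}, p = {p_value:.4f}")
```

With real data, the two lists would hold the scores of one rotated factor split by the phenotype being tested (race, sex, or diabetes).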

Figure 1

Factor analysis of the lipidomic profile. The upper panel is a heatmap of the factor loadings from the factor analysis, where the three factors correspond to race, sex, and disease. Color depth represents the factor loadings of 71 different lipid indicators; positive loadings are shown in red and negative loadings in green. The lower panel is a boxplot of the lipids with important loadings for the three known factors (diabetes, sex, and race). Lipid levels were scaled to the range 0 to 1, and each lipid corresponds to two boxes, one per factor level.
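Factor loadings are easier to read after a varimax rotation, which the Methods name as the rotation used. The sketch below shows the classical pairwise (Kaiser) form of raw varimax in standard-library Python. It is an illustration under stated assumptions, not the authors' R code: no Kaiser row normalization is applied, and the loading matrix is a made-up two-factor example.

```python
import math

def varimax_criterion(loadings):
    """Raw varimax criterion: summed variance of squared loadings per factor."""
    n = len(loadings)
    total = 0.0
    for c in range(len(loadings[0])):
        squared = [row[c] ** 2 for row in loadings]
        total += sum(s * s for s in squared) - sum(squared) ** 2 / n
    return total

def varimax(loadings, sweeps=20):
    """Rotate factor pairs in their plane until the criterion is maximized."""
    rotated = [row[:] for row in loadings]
    n, k = len(rotated), len(rotated[0])
    for _ in range(sweeps):
        for p in range(k):
            for q in range(p + 1, k):
                u = [row[p] ** 2 - row[q] ** 2 for row in rotated]
                v = [2 * row[p] * row[q] for row in rotated]
                a, b = sum(u), sum(v)
                c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
                d = 2 * sum(ui * vi for ui, vi in zip(u, v))
                # Angle that maximizes the criterion for this pair of factors.
                phi = math.atan2(d - 2 * a * b / n, c - (a * a - b * b) / n) / 4
                cos_phi, sin_phi = math.cos(phi), math.sin(phi)
                for row in rotated:
                    x, y = row[p], row[q]
                    row[p] = x * cos_phi + y * sin_phi
                    row[q] = -x * sin_phi + y * cos_phi
    return rotated

# Hypothetical simple structure (each lipid loads on one factor), mixed by 30 degrees.
simple = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.9], [0.0, 0.6]]
alpha = math.radians(30)
mixed = [[x * math.cos(alpha) - y * math.sin(alpha),
          x * math.sin(alpha) + y * math.cos(alpha)] for x, y in simple]
recovered = varimax(mixed)
for row in recovered:
    print([round(value, 3) for value in row])
```

In this toy case the rotation undoes the 30 degree mixing, so each lipid again loads on a single factor, which is the property that makes a loading heatmap interpretable.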

Phenotype factors have a lesser effect on the gene expression profile

Unlike the lipid profile, the gene expression profile (GEP) did not show a direct correlation with the phenotype indicators in either hierarchical clustering (Figure 2) or principal component analysis (PCA). In the clustering, the samples divided into four main classes, but none of the factors (sex, diabetes, age, and race) was significantly correlated with these classes. Race and sex were nevertheless non-randomly distributed in the dendrogram, which implies an underlying correlation with the GEP. Consistent with this, correlation tests on the PCA scores identified significant correlations between the GEP and the phenotype factors. Race was the factor most strongly correlated with the GEP, which suggests that many genes may be differentially expressed between black and white samples: it correlated with the third component (p = 7.20e-4, Kruskal-Wallis test), which accounts for 5.8% of the variance. Diabetes correlated with the fifth component (p = 8.67e-3, Kruskal-Wallis test), which accounts for 4.7% of the variance, and sex with the tenth component (p = 2.02e-2, Kruskal-Wallis test), which accounts for 2.1% of the variance. The remaining 87.4% of the variance in the GEP is therefore not explained by these three factors. Genes selected directly by differential expression were difficult to interpret biologically, because they were enriched in seemingly unrelated pathways (Table 2) such as ECM-receptor interaction and riboflavin metabolism.
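The component-wise tests and the variance bookkeeping can be sketched as follows. This is a minimal standard-library Python illustration, not the authors' R code: the Kruskal-Wallis statistic is computed without tie correction, the helper names are assumptions, and the component scores are invented. Only the three variance percentages come from the text.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def chi2_sf(x, df):
    """Upper tail of the chi-square distribution for integer df."""
    if x <= 0:
        return 1.0
    half = x / 2
    # Start from Q(1/2, x/2) or Q(1, x/2), then step the shape up by one.
    shape, tail = (1.0, math.exp(-half)) if df % 2 == 0 else (0.5, math.erfc(math.sqrt(half)))
    while shape < df / 2:
        tail += half ** shape * math.exp(-half) / math.gamma(shape + 1)
        shape += 1
    return tail

def kruskal_wallis(groups):
    """H statistic and chi-square p-value (assumption: no tie correction)."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    return h, chi2_sf(h, len(groups) - 1)

# Invented scores on one principal component, split by a two-level phenotype.
pc_scores_by_group = [[1.2, 0.8, 1.5, 0.9, 1.1], [-0.7, -1.0, 0.2, -0.4, -0.9]]
h_stat, p_value = kruskal_wallis(pc_scores_by_group)
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")

# Variance shares reported in the text for the three phenotype-linked components.
explained = {"PC3 (race)": 5.8, "PC5 (diabetes)": 4.7, "PC10 (sex)": 2.1}
unexplained = round(100 - sum(explained.values()), 1)
print(f"Variance not tied to these factors: {unexplained}%")
```

The last two lines reproduce the 87.4% figure as 100 minus the sum of the three reported shares.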

Figure 2

Hierarchical clustering of all samples in the filtered data set. Factors such as sex and disease type are represented by black or white blocks: female in white, male in black; diabetes in black, non-diabetes in white. Race is coded numerically: Asian as 2, black as 3, Indian as 4, Mexican as 5 and 6, and white as 7.

Table 2 Enriched pathways of differentially expressed genes.

Significant biological pathways link gene expression profile with lipid profile and diabetes

To overcome the limitation posed by the unexplained variance in the gene expression profile, and to recover the relationship between the gene expression profile and the lipid profile, we adopted a partial least squares (PLS) regression model. A list of significant pathways from the gene expression profile was found to explain the lipid profile and the lipid profile-associated T2D (Table 3). Six of the top ten pathways have a direct link with diabetes: one carbon pool by folate, arachidonic acid metabolism, insulin signaling pathway, amino sugar and nucleotide sugar metabolism, propanoate metabolism, and starch and sucrose metabolism. None of them could be retrieved by selecting differentially expressed genes.
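To make the modelling step concrete, the sketch below fits a single-response PLS model with the NIPALS algorithm and scores it by leave-one-out cross-validation, the validation scheme named in the Methods. It is a simplified illustration in standard-library Python, not the authors' model: one lipid is used as the response, and the pathway scores, lipid values, and function names are all hypothetical.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit single-response PLS by NIPALS; returns means and components."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]   # X residual
    f = [value - y_mean for value in y]                         # y residual
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(value * value for value in w))
        if norm < 1e-12:
            break  # no covariance left between X and y
        w = [value / norm for value in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(value * value for value in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate both blocks before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict the response for one new sample by replaying the deflation."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * load[j] for j in range(len(e))]
    return y_hat

def loo_press(X, y, n_components):
    """Leave-one-out predicted residual sum of squares (PRESS)."""
    press = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        press += (y[i] - pls1_predict(model, X[i])) ** 2
    return press

# Hypothetical pathway-level expression scores (rows = subjects) and one lipid level.
pathway_scores = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8], [6, 4], [7, 9]]
lipid_level = [1 + 2 * a - b for a, b in pathway_scores]   # noiseless toy response
for k in (1, 2):
    print(f"{k} component(s): PRESS = {loo_press(pathway_scores, lipid_level, k):.6f}")
```

Comparing PRESS across component counts is one common way to choose model size; whether the authors selected components this way is not stated in the text.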

Table 3 Enriched pathways of differentially expressed genes

Discussion

Gene expression profiling has generally been applied to diabetes at the level of cell lines and drug response [11, 12]. Given the environmental and hereditary heterogeneity, common features are not easy to identify from a snapshot of the transcriptome across a broad cohort. We therefore used the lipid profile as a guide for exploring the gene-level mechanism of insulin resistance associated with lipids and gene expression in blood.

As expected, a major finding of our study is that only a very limited share of the transcriptome variance can be explained by the known phenotype factors. In contrast, the lipid profile showed an unexpected capacity to reveal these phenotype factors. Through lipid-guided exploration, a set of significant biological pathways and candidate genes was identified as insulin resistance-associated, including one carbon pool by folate, arachidonic acid metabolism, and insulin signaling pathway, which cannot be found directly from the gene expression profile. Our findings may advance the understanding of the lipid-associated, gene-level mechanism of insulin resistance in T2D in blood.

Materials and methods

Subjects and clinical laboratory data

The study was approved by the Institutional Review Board of Tougaloo College. All subjects provided written informed consent. T2D was diagnosed according to the American Diabetes Association (ADA) criteria [5], based on characteristic symptoms of diabetes, a higher BMI, and a fasting plasma glucose > 126 mg/dl or a 2 h plasma glucose > 200 mg/dl during an oral glucose tolerance test. A total of 144 blood samples were collected from healthy controls (n = 60; 32 Caucasians and 28 African Americans) and patients with T2D (n = 84; 40 Caucasians and 44 African Americans). All subjects were evaluated for age, sex, race, body mass index (BMI), triacylglycerol (TG), high-density lipoprotein (HDL), low-density lipoprotein (LDL), total cholesterol (TC), and glucose levels.

Microarray experiments

Total RNA was obtained from the white blood cells (WBCs) of 8-10 ml of peripheral blood using the LeukoLock™ Total RNA system (Ambion Inc, Austin, TX) according to the manufacturer's instructions. The quantity and quality of the isolated RNA were evaluated by Nanodrop spectrophotometry and an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). Gene expression profiling was performed using Agilent Whole Human Genome (4 × 44K) Oligo arrays with ~20,000 genes represented (Agilent Technologies, Palo Alto, CA). Each sample was hybridized with a human universal RNA control (Stratagene, La Jolla, CA). For each sample, 500 ng of total RNA was amplified and labeled using the Agilent Low RNA Input Fluorescent Linear Amplification Kit according to the manufacturer's protocol. For each two-color array, 850 ng each of Cy5-labeled (universal control) and Cy3-labeled (sample) cRNA were mixed and fragmented using the Agilent In Situ Hybridization Kit protocol. Hybridizations were performed for 17 hours in a rotating hybridization oven according to the Agilent 60-mer oligo microarray processing protocol, followed by washing and scanning with an Agilent Scanner (G2565AA, Agilent Technologies, Wilmington, DE). Arrays were processed and background-corrected with the Agilent Feature Extraction software (v.9.5.3.1) using default settings for all parameters.

Microarray data analysis

Microarray data were analyzed with GeneSpring versions 7.0 and 10.0. Sample quality control was based on the Pearson correlation of each sample with the other samples in the experiment: if a sample's average Pearson correlation with the other samples was less than 0.80 (80%), the sample was excluded from further analysis. Further analysis followed the procedure described previously [13].
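The quality-control rule can be expressed in a few lines. The sketch below is a standard-library Python illustration of the rule as worded above, not the GeneSpring procedure itself. The sample names, expression values, and function names are hypothetical; only the 0.80 threshold comes from the text.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def failing_samples(profiles, threshold=0.80):
    """Return samples whose mean correlation with all other samples is below threshold."""
    flagged = []
    for name, profile in profiles.items():
        others = [pearson(profile, other)
                  for other_name, other in profiles.items() if other_name != name]
        if mean(others) < threshold:
            flagged.append(name)
    return flagged

# Hypothetical expression profiles (six probes per sample); S5 is an outlier.
profiles = {
    "S1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "S2": [1.2, 1.9, 3.1, 4.0, 5.2, 5.9],
    "S3": [0.8, 2.1, 2.9, 4.1, 4.9, 6.2],
    "S4": [1.1, 2.2, 2.8, 3.9, 5.1, 6.0],
    "S5": [2.0, 5.0, 1.0, 6.0, 3.0, 5.0],
}
print("Excluded from further analysis:", failing_samples(profiles))
```

Because every sample's average includes its correlation with any outlier, one poor array also lowers the averages of the good arrays, so the rule behaves best when most samples are of good quality.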

ESI-MS/MS lipid profiling

The same subjects used for the microarray experiments were also used for lipid profiling. Plasma was used directly for lipid profiling, which was conducted as described previously [14].

Statistical analyses

To evaluate the correlation between the various types of data and the phenotypes, two-sided Kruskal-Wallis tests were performed in R [15]. Pathway analysis of the expression data was performed by Fisher's exact test with the GOstats package [16]. Factor analysis of the lipid profile was also performed in R, with varimax rotation used to seek a basis that most economically represents each individual. Feature selection and the cSVM classifier were implemented with CMA [17]. PLS regression models were built [18] with leave-one-out cross-validation.
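For the pathway analysis, a one-sided Fisher's exact test for over-representation equals the upper tail of a hypergeometric distribution. The sketch below computes that tail in standard-library Python. It is not the GOstats code: the function name and the gene counts are hypothetical, and a one-sided over-representation test is an assumption about how the test was applied.

```python
from math import comb

def enrichment_p(universe, selected, pathway_size, overlap):
    """P(X >= overlap) when `selected` genes are drawn from `universe`
    genes, of which `pathway_size` belong to the pathway."""
    upper = min(selected, pathway_size)
    tail = sum(comb(pathway_size, k) * comb(universe - pathway_size, selected - k)
               for k in range(overlap, upper + 1))
    return tail / comb(universe, selected)

# Hypothetical counts: 20 genes measured, 5 selected, 4 in the pathway, 3 in both.
p_value = enrichment_p(universe=20, selected=5, pathway_size=4, overlap=3)
print(f"Enrichment p-value: {p_value:.4f}")
```

In practice the test is repeated for every pathway, so the resulting p-values are usually adjusted for multiple testing; the text does not state which adjustment, if any, was used.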