Integration of machine learning to identify diagnostic genes in leukocytes for acute myocardial infarction patients

Zhang, Lin; Liu, Yue; Wang, Kaiyue; Ou, Xiangqin; Zhou, Jiashun; Zhang, Houliang; Huang, Min; Du, Zhenfang; Qiang, Sheng

doi:10.1186/s12967-023-04573-x

Research
Open access
Published: 27 October 2023

Volume 21, article number 761, (2023)

Journal of Translational Medicine

Lin Zhang ORCID: orcid.org/0000-0003-3064-7975¹^na1,
Yue Liu²^na1,
Kaiyue Wang¹,
Xiangqin Ou³,
Jiashun Zhou⁴,
Houliang Zhang⁴,
Min Huang²,
Zhenfang Du² &
…
Sheng Qiang²

1505 Accesses
3 Citations

Abstract

Background

Acute myocardial infarction (AMI) has two clinical characteristics: a high rate of missed diagnosis and leukocyte dysfunction. RNA transcribed in leukocytes is closely related to the course of AMI. We hypothesized that transcriptional RNA in leukocytes might have diagnostic value for AMI. Integration machine learning (IML) was first used to explore genes that discriminate AMI, and a clinical study was then performed to validate the results.

Methods

Four AMI microarray datasets from the Gene Expression Omnibus (220 samples in total) were included in the bioinformatic analysis. Clinical validation was then performed with 20 AMI patients and 20 stable coronary artery disease (SCAD) patients. At a ratio of 5:2, GSE59867 formed the training set, while GSE60993, GSE62646, and GSE48060 formed the testing set. IML, proposed explicitly in this research, is composed of six machine learning algorithms: support vector machine (SVM), neural network (NN), random forest (RF), gradient boosting machine (GBM), decision trees (DT), and least absolute shrinkage and selection operator (LASSO). IML had two functions in this research: filtering optimized variables and predicting the class. Finally, the RNA of the recruited patients was analyzed to verify the results of IML.

Results

Thirty-nine differentially expressed genes (DEGs) were identified between controls and AMI individuals in the training set. From these thirty-nine DEGs, IML was used to build the classification models and to identify candidate genes with overall normalized weights > 1. Finally, two genes (AQP9 and SOCS3) showed diagnostic value, with an area under the curve (AUC) > 0.9 in both the training and testing sets. The clinical study verified the significance of AQP9 and SOCS3. Notably, more stenotic coronary arteries or a more severe Killip classification indicated higher levels of these two genes, especially SOCS3. These two genes correlated with two immune cell types, monocytes and neutrophils.

Conclusion

AQP9 and SOCS3 in leukocytes may help distinguish AMI patients from SCAD patients. AQP9 and SOCS3 are closely associated with monocytes and neutrophils, which might contribute to advancing AMI diagnosis and shed light on novel genetic markers. Multicenter, large-sample trials covering multiple clinical characteristics are still needed to confirm their clinical value.

Introduction

Acute myocardial infarction (AMI), the most severe form of cardiovascular disease [1, 2], is associated with millions of deaths annually around the world [3, 4]. Generally, the diagnosis of AMI relies on clinical symptoms, the electrocardiogram, and changes in serum enzyme levels [5]. However, AMI is easily misdiagnosed for three reasons: nonclassic clinical symptoms [6, 7], atypical underappreciation [8], and an untimely serum peak. Because of these three problems, a previous study [9] reported a missed diagnosis rate of AMI higher than 0.9%. The diagnosis and treatment of AMI must be prompt; otherwise, the consequences may be irreversible. Therefore, exploring new markers of AMI to decrease missed diagnoses is essential and urgent.

Leukocytes play an important and varied role in the entire evolution of AMI. During the acute injury phase of AMI, leukocytes promote a severe inflammatory cascade response through the polarization of M1 macrophages [10]. During the repair phase of AMI, M2 macrophages in leukocytes suppress inflammation and mediate the repair of injured myocardium [11]. Furthermore, leukocyte alteration positively correlates with AMI severity and, inversely, with patient survival [12, 13].

RNAs are involved in the evolution of AMI. For example, miR-155 correlated positively with the concentration of inflammatory cytokines, such as IL-6 and TNF-α [14], in AMI. Neutrophil-derived S100A8/A9 amplify granulopoiesis and cardiac injury in AMI mice [15]. Conversely, M2 macrophage-derived exosomes carry miR-1271-5p [16] to alleviate AMI-related cardiac injury. In short, RNA in leukocytes plays different roles in the evolution of AMI, possibly depending on the leukocyte subtype. However, numerous studies have focused on integrating target interventions [12, 17] and leukocyte complications [17, 18], and few have focused on the diagnostic value of leukocyte RNA. Because leukocyte RNA is involved in the evolution of AMI, it might have diagnostic value for AMI patients, and this diagnostic value might be related to the various leukocyte subtypes.

Machine learning (ML) helps humans learn patterns from complex data to predict future behavioural outcomes and trends. ML is widely used for variable filtering. Previous studies used a single ML algorithm or two integrated ML algorithms (e.g., support vector machine [18] or least absolute shrinkage and selection operator [19]) to optimize variables, but these approaches may have missed potential genes [20]. Compared with a single ML algorithm, the integrated ML (IML) approach [21,22,23] we developed is more advantageous in variable screening and model building. IML helps recover potential genes that a single ML algorithm would mistakenly discard and finds more meaningful variables [21]. IML combines the advantages of the individual ML algorithms, and its predictive classification value is better [23]. Given its favourable filtering value in transcriptomics, IML might be used to comprehensively explore diagnostic value in AMI patients.

In summary, we aim to explore the potential diagnostic value of the leukocyte transcriptome for identifying AMI patients. Because of its good variable screening and predictive value, IML was first used to mine diagnostic genes in AMI leukocytes across multiple microarrays. A single microarray dataset might carry inherent biases in capturing the entire transcriptomic landscape, so multiple microarrays were integrated after resolving batch effects to reduce bias and to validate each other. Clinical validation was added to confirm the result. Because the relationship between the transcriptome and leukocyte subtypes was unclear, the correlation between immune cells and the target transcripts was subsequently analyzed. We expect to explore the functional roles of the identified genes in AMI pathophysiology, investigating their potential as therapeutic targets.

Methods

Data acquisition

The raw data were obtained from the Gene Expression Omnibus (GEO, March 27, 2022). Because AMI patients have symptoms similar to those of patients with stable coronary artery disease (SCAD), SCAD patients were set as the controls. Other cardiovascular diseases (e.g., stroke [24, 25] and heart failure [26]) may also raise leukocyte levels and influence the result, so they were excluded. Based on the above, the following inclusion and exclusion criteria were set: (I) inclusion criteria: (i) diagnosed as AMI patients on admission; (ii) transcriptome obtained from leukocytes in blood; (iii) initial data free and accessible; and (iv) control individuals who were healthy or diagnosed with SCAD; and (II) exclusion criteria: (i) other cardiovascular diseases suspected and (ii) blood taken more than one day after hospitalization.

Data processing

To ensure the reliability of the data, the R package sva (version 3.46.0) was applied during data integration to minimize batch effects, using the ComBat function with parametric adjustments. Following the training-to-testing ratios used in previous literature (1.64:1 [27] to 5:1 [28]) and to minimize the batch effect, the samples were assigned to the training and testing sets at a ratio of 5:2. GSE59867 formed the training set, whereas GSE60993, GSE62646, and GSE48060 formed the testing set. In brief, the training set was used to explore candidate diagnostic genes, and the testing set was used for validation. Based on the DEGs, three functional enrichment analyses were performed: Kyoto Encyclopedia of Genes and Genomes Gene Set Enrichment Analysis (KEGG-GSEA), Gene Ontology (GO), and Disease Ontology (DO). The GO terms included three branches: molecular function (MF), biological process (BP), and cellular components (CC). Notably, the novel IML served two functions: developing the ML classification models and exploring the candidate variables. Finally, the candidate genes were verified in the testing set and the clinical study, and an immune analysis of the candidate genes was performed. CIBERSORT was used for the immune analysis, and the correlations were plotted with the corrplot R package (version 0.92). The primary code is available at https://github.com/Linzhang-BiuBiuBiu/ML-for-diagnosis-genes.

Searching for DEGs

Because the same gene may have multiple sequences, the transcriptome may contain several expression values for the same gene. For such genes, the expression values were averaged, and limma (version 3.54.0) was employed to identify the DEGs. Two thresholds were established: an absolute log fold change (|logFC|) > 0.7 (previous studies used 0.5 [29] to 1 [23]) and a false discovery rate [30], adjusted by the Benjamini and Hochberg method, < 0.05.
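The filtering step above can be sketched in a few lines. The following Python is only an illustration and not the authors' limma pipeline: it assumes that per-gene log fold changes and raw p-values are already available (limma derives them from moderated t-statistics, which are not reproduced here), and the function names and gene labels are hypothetical.

```python
from statistics import mean


def average_duplicate_probes(rows):
    """Collapse repeated measurements of one gene into the per-sample mean.

    rows: iterable of (gene, [expression per sample]) pairs.
    """
    grouped = {}
    for gene, values in rows:
        grouped.setdefault(gene, []).append(values)
    return {gene: [mean(col) for col in zip(*probes)]
            for gene, probes in grouped.items()}


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Walk from the largest p-value down so the running minimum
    # enforces monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log_fc, p_values, lfc_cut=0.7, fdr_cut=0.05):
    """Keep genes with |logFC| > lfc_cut and adjusted p-value < fdr_cut."""
    fdr = benjamini_hochberg(p_values)
    return [g for g, lfc, q in zip(genes, log_fc, fdr)
            if abs(lfc) > lfc_cut and q < fdr_cut]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(select_degs(["A", "B", "C", "D"],
                      [1.2, -0.9, 0.3, 2.0],
                      [0.01, 0.04, 0.03, 0.5]))
```

Both thresholds must hold at once, so a gene with a large fold change but an adjusted p-value at or above 0.05 is still discarded.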

IML of six ML algorithms

Classification models of IML, composed of six ML algorithms, were processed, covering support vector machine (SVM), neural network (NN), random forest (RF), gradient boosting machine (GBM), decision trees (DT), and least absolute shrinkage and selection operator (LASSO). In brief, IML was used to identify candidate genes with the overall normalized weights. The six ML algorithms were developed to optimize parameter settings, model development in the training sets, and validation in the testing sets. For stability, all ML algorithms were tenfold cross-validated. Notably, an accuracy value was applied to evaluate the predictive classification value, and a higher accuracy value showed a better classification value of the six ML algorithms.
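As a rough illustration of how tenfold cross-validation and the accuracy metric fit together, the sketch below splits sample indices into k folds and averages the held-out accuracy. It is written in Python for compactness and is not the authors' R code; `fit_predict` is a hypothetical callback standing in for any of the six classifiers, and the shuffling seed is an arbitrary assumption.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def cross_validated_accuracy(features, labels, fit_predict, k=10, seed=0):
    """Mean held-out accuracy over k folds.

    fit_predict(train_x, train_y, test_x) is a hypothetical callback that
    trains a model on the training part and returns predicted labels for
    test_x.
    """
    scores = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        predictions = fit_predict([features[i] for i in train],
                                  [labels[i] for i in train],
                                  [features[i] for i in held_out])
        scores.append(accuracy([labels[i] for i in held_out], predictions))
    return sum(scores) / len(scores)
```

Each sample is held out exactly once, so the averaged score reflects performance on data the model did not see during fitting.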

LASSO was fitted with the glmnet R package (version 4.1-6). cv.glmnet was used to optimize lambda. For the parameters, the range of "lambda" was set between 0 and 100 with "binomial" and "class". Based on the minimum lambda, glmnet fitted the LASSO model with alpha and the "binomial" method in the training set.

SVM was developed with the e1071 R package (version 1.7-12). tune.svm was used to optimize the parameter settings with a "linear" kernel and a cost between 1 and 20. Next, based on the optimized number of support vectors in the training set, the SVM model was completed.

DT was processed with rpart (version 4.1.19) and rpart.plot (version 3.1.1). Based on the "class" method and a cp value of 0.001, the rpart function was adopted for the DT model.

RF was completed with the R package randomForest (version 4.7-1.1). First, the tuneRF function was adopted to optimize 0-700 trees with a step size of one. The RF model was then built with the number of trees that gave the minimum error rate.

NN was processed with neuralnet (version 1.44.2) with neuralnet function, five layers (an input, an output, and three hidden layers), err.fct of "sse", and the linear.

Compared with the other five ML algorithms, GBM involved more steps and was more error-prone. GBM was developed with h2o (version 3.38.0.1). First, the Java runtime environment, in which h2o runs, was installed. Through the memory setting in h2o.init, the memory available to the GBM model was set to 8 GB. Because GBM in h2o requires the h2o data format, the as.h2o function was used to convert the data. Next, h2o.gbm tuned the parameters and developed the model with the "Bernoulli" distribution, 200 trees, a learning rate of 0.001, and a sample rate of 90%.

Furthermore, from the weights that the six ML algorithms assigned to the DEGs, the normalized sum weight of IML was calculated as follows: overall weights = abs(RF)/abs(RFmax) + abs(SVM)/abs(SVMmax) + abs(LASSO)/abs(LASSOmax) + abs(NN)/abs(NNmax) + abs(GBM)/abs(GBMmax) + abs(DT)/abs(DTmax). For instance, suppose the weights of interleukin-6 in the six ML algorithms were 30, -22, 20, -2, 320, and -8, and the maximum absolute weights in the six ML algorithms were 60, 88, 80, 8, 640, and 16. The overall weight of interleukin-6 would then be |30|/60 + |-22|/88 + |20|/80 + |-2|/8 + |320|/640 + |-8|/16 = 2.25. Candidate genes with normalized overall weights > 1 were then evaluated by the area under the curve (AUC).
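The normalized sum can be reproduced with a short script. The Python sketch below is an illustration of the formula and not the authors' code; it assumes the six example weights are listed in the same order as the terms of the formula (RF, SVM, LASSO, NN, GBM, DT), and "GENE_MAX" and "GENE_LOW" are hypothetical genes added so that each model has a maximum to normalize against.

```python
def overall_weights(weights_by_model):
    """Sum, per gene, of |weight| / max |weight| across all models.

    weights_by_model: {model name: {gene: raw weight}}.
    """
    totals = {}
    for gene_weights in weights_by_model.values():
        max_abs = max(abs(w) for w in gene_weights.values())
        if max_abs == 0:
            continue  # a model with all-zero weights contributes nothing
        for gene, weight in gene_weights.items():
            totals[gene] = totals.get(gene, 0.0) + abs(weight) / max_abs
    return totals


def candidate_genes(totals, threshold=1.0):
    """Genes whose overall weight exceeds the threshold, highest first."""
    kept = [gene for gene, score in totals.items() if score > threshold]
    return sorted(kept, key=lambda gene: -totals[gene])


# Worked example from the text; the model order is an assumption.
EXAMPLE = {
    "RF":    {"IL6": 30,  "GENE_MAX": 60,  "GENE_LOW": 6},
    "SVM":   {"IL6": -22, "GENE_MAX": 88,  "GENE_LOW": 8.8},
    "LASSO": {"IL6": 20,  "GENE_MAX": 80,  "GENE_LOW": 8},
    "NN":    {"IL6": -2,  "GENE_MAX": 8,   "GENE_LOW": 0.8},
    "GBM":   {"IL6": 320, "GENE_MAX": 640, "GENE_LOW": 64},
    "DT":    {"IL6": -8,  "GENE_MAX": 16,  "GENE_LOW": 1.6},
}

if __name__ == "__main__":
    totals = overall_weights(EXAMPLE)
    print(totals["IL6"])            # 2.25, matching the worked example
    print(candidate_genes(totals))
```

Because each term is bounded by 1, the overall weight lies between 0 and 6, and a threshold of 1 can be passed either by one dominant model or by moderate support from several.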

Clinical validation

The clinical trial was performed according to the Declaration of Helsinki guidelines. All AMI and SCAD patients, enrolled from October 10, 2022, to December 31, 2022, provided individual written informed consent, and the Ethics Review Committee of Jinghai District Hospital approved the study. The study did not increase the patients' treatment costs or add any intervention to their treatment, and the blood samples were taken from blood discarded after the patients' routine blood tests on the same day. Patients who did not have a routine blood test on that day were excluded. All AMI patients underwent coronary angiography, and blood samples were collected in anticoagulant tubes on admission. Density gradient centrifugation [31, 32] was performed for leukocyte isolation (mainly monocytes and lymphocytes). In brief, 8 mL of Ficoll solution was added to 8 mL of anticoagulated blood, and the upper plasma layer was discarded after centrifugation. The white cell layer at the interface was aspirated, added to 10 mL of saline, and centrifuged; the bottom layer contained the leukocytes (mainly monocytes and lymphocytes). RNA isolated from leukocytes was reverse transcribed with reverse transcription kits (Takara, Shiga, Japan). Quantitative reverse transcription PCR was executed on an ABI7900HI (Thermo Fisher Scientific). According to previous literature, the relative content of the candidate genes was scaled to the reference gene (GAPDH [33]), and Table 1 lists the primer sequences.
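Scaling a target gene to a reference gene is commonly done with the comparative Ct calculation (2 to the power of minus delta-delta Ct). The text does not state which formula was used, so the Python sketch below is an assumption that only illustrates the general idea; the Ct values in the usage line are invented.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_control, ct_reference_control):
    """Comparative Ct estimate of relative expression.

    Assumes roughly 100% amplification efficiency for both primer pairs.
    The reference gene would be GAPDH in this study; the calibrator
    ("control") sample is a hypothetical choice.
    """
    delta_sample = ct_target - ct_reference
    delta_control = ct_target_control - ct_reference_control
    return 2 ** -(delta_sample - delta_control)


if __name__ == "__main__":
    # Invented Ct values: the target crosses threshold two cycles earlier
    # (relative to the reference) in the sample than in the calibrator.
    print(relative_expression(24.0, 18.0, 26.0, 18.0))  # 4.0
```

A value above 1 means the target transcript is more abundant in the sample than in the calibrator after normalization to the reference gene.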

Table 1 List of primers for real-time PCR analysis in GAPDH, AQP9, and SOCS3

Results

Included datasets

A total of 4 datasets (Table 2) with 220 samples, namely, GSE59867, GSE60993, GSE62646, and GSE48060, were integrated for this study. The training set was GSE59867 (46 controls and 111 AMI patients), giving a training-to-testing ratio of about 5:2. The testing set combined the other three datasets (28 controls and 35 AMI patients), namely, GSE60993, GSE62646, and GSE48060. The subsequent analysis workflow is presented in Fig. 1.

Table 2 Fundamental information in the 4 datasets

DEG identification

Thirty-nine DEGs were identified (Additional file 1: Table S1) in a training set from 17,049 RNAs. Compared to the control group (SCAD), 28 genes were upregulated (SOCS3, HP, ECRP, AQP9, FAM20A, CES1, STAB1, NRG1.1, NRG1, DYSF, RNASE1, RNASE2, ASGR2, CYP1B1, MERTK, FCGR1A.2, MIR21, FCGR1A.1, TCN2, VSIG4, PPARG, FCGR1A, SLED1, S100A9, FMN1.1, CD163, TMEM176A, and SERPINB2) and 11 genes were downregulated (KLRC3, KLRD1, KLRA1P, DTHD1, KLRC4, MYBL1, CLC, KLRC2, KLRC4-KLRK1, SNORD20, and SNORD45B) in AMI individuals (Fig. 2).

Functional analysis

Based on the above DEGs, 45 GSEA terms (Additional file 1: Table S2) were identified, and the top 5 are shown in Fig. 3A, B; 160 GO terms (Additional file 1: Table S3) were identified, and the top 5 are shown in Fig. 3C; and the top 15 of 57 DO terms (Additional file 1: Table S4) are shown in Fig. 3D. In GSEA-KEGG of AMI, the top 3 were Fc gamma R-mediated phagocytosis, Huntington disease, and Leishmania infection. In GO, the top 3 in BP were the stimulatory C-type lectin receptor signalling pathway, response to lectin, and cellular response to lectin. In DO terms, the top 3 were atherosclerosis, arteriosclerotic cardiovascular disease, and arteriosclerosis.

IML of six ML algorithms

Six ML algorithms (Fig. 4) and their accuracies (Table 3) were assessed. Eight genes were identified by LASSO (Fig. 4A), and the accuracy in the training and testing sets was 70.70% (Table 3). In SVM, 13 genes were filtered (Fig. 4B), and the accuracies in the training and testing sets were 88.46% and 91.84%, respectively. The error rate of RF (Fig. 4C) decreased with an increasing number of trees and reached its minimum at 161 trees; the accuracies in the two sets were 98.09% and 100%. In DT (Fig. 4D), an AQP9 expression value of 9.8 could discriminate the control and AMI groups, but the accuracies were unstable (94.27% and 75.52%). In GBM (Fig. 4E), sixfold methods were established to optimize the diagnosis genes, but the accuracies were likewise unstable (93.30% and 85.71%). The NN (Fig. 4F), with three hidden layers, could discriminate the controls from AMI patients, but its accuracies were only 83.74% and 71.43%. Across these ML algorithms, the primary weights of the 39 DEGs were identified (Additional file 1: Table S5). Interestingly, RF had the highest and most stable accuracy among all ML algorithms. The normalized overall weights (Table 4) were calculated to filter the candidate variables. Twenty-six genes (ASGR2, SOCS3, AQP9, PPARG, RNASE1, DYSF, S100A9, FCGR1A, VSIG4, STAB1, MYBL1, KLRD1, ECRP, TCN2, FAM20A, MERTK, HP, RNASE2, DTHD1, CLC, SNORD20, CD163, NRG1, SNORD45B, CYP1B1, and KLRC2) were identified because their overall weights were > 1 (Table 4).

Table 3 Accuracy of six MLs based on 39 DEGs in the training and test sets

Table 4 Overall weights of six classification models were constructed to optimize the candidate diagnostic genes

Based on overall normalized weights > 1, 26 candidate genes were carried forward and compared between the AMI and control groups in the training and testing sets. Among the 26 genes, 10 were excluded because they showed no significant difference in the testing set. Sixteen genes were significant in both sets (Fig. 5).

Diagnosis value of candidate genes

Sixteen candidate genes were included in the following ROC analysis. The AUC values of SOCS3, AQP9, and ASGR2 were greater than 0.85 in both the training and testing sets. In particular, the AUC values of two genes, SOCS3 and AQP9, were greater than 0.9 (Fig. 6), indicating a potential diagnostic value in AMI.
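For a single gene, the AUC has a simple interpretation: it is the probability that a randomly chosen AMI sample has a higher expression value than a randomly chosen control, with ties counted as one half. The Python sketch below computes it by direct pairwise comparison; it is an illustration and not the authors' ROC code, and the expression values in the usage lines are invented.

```python
def auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison.

    Equivalent to the Mann-Whitney U statistic divided by the number of
    positive-negative pairs; ties contribute 0.5.
    """
    wins = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


if __name__ == "__main__":
    # Invented expression values for one gene.
    ami = [9.9, 10.4, 11.0, 9.1]
    control = [8.7, 9.0, 9.3, 8.5]
    print(auc(ami, control))
```

An AUC of 0.5 corresponds to no discrimination, and values above 0.9, as reported for SOCS3 and AQP9, mean that more than nine in ten case-control pairs are ordered correctly.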

Correlation analysis

Immune correlation analysis was performed with the 220 samples (Fig. 7). The infiltration landscape (Fig. 7A) showed the distribution of 22 immune cell types in the control and AMI groups. Nine types of immune cells (T cells CD8, T cells CD4 naive, T cells regulatory (Tregs), NK cells resting, monocytes, macrophages M0, macrophages M2, eosinophils, and neutrophils) differed significantly between the control and AMI groups (Additional file 1: Fig. S1). Moreover, Spearman analysis of the correlations between the 22 immune cell types and the two diagnostic genes, AQP9 and SOCS3 (Fig. 7B, C), showed significant correlations with 9 immune cell types (monocytes, neutrophils, T cells CD8, NK cells resting, T cells CD4 naive, eosinophils, macrophages M2, dendritic cells activated, and B cells memory). More importantly, two immune cell types (monocytes and neutrophils) had higher correlation coefficients (Fig. 7B, C) than the other 7 immune cell types (Additional file 1: Figs. S2, S3). In particular, the correlation coefficients of monocytes (Fig. 7B, C) were the highest for the two genes (0.56 for SOCS3 and 0.76 for AQP9).
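Spearman correlation is the Pearson correlation of the ranks, which makes it robust to the skewed distributions typical of estimated cell fractions. The Python sketch below shows the calculation with average ranks for ties; it is an illustration and not the authors' R code, and the inputs in the usage lines are invented.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Invented values: gene expression versus an estimated cell fraction.
    expression = [8.1, 9.4, 9.0, 10.2, 10.9]
    fraction = [0.05, 0.11, 0.12, 0.16, 0.21]
    print(spearman(expression, fraction))
```

Only the ordering of the values matters, so any monotone transformation of either variable leaves the coefficient unchanged.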

Clinical validation

Finally, 40 individuals (20 SCAD and 20 AMI patients) were recruited. The general information of these individuals is shown in Table 5. Thirty-nine clinical characteristics were summarized, of which 13 differed significantly between the SCAD and AMI patients: WBC, NeP, MonP, Lym, GAT, D-dimer, CRP, SOCS3, AQP9, LDH, cTnT, CK-MB, and Albumin.

Table 5 The general characteristics of the 40 patients

The relative RNA levels (Fig. 8A) of AQP9 and SOCS3 both differed significantly. The SOCS3 level differed by the number of coronary artery lesions (Fig. 8B): patients with three lesions showed significantly higher SOCS3 than those with two or one. In Fig. 8C, two AMI patients were excluded because of unclear Killip grading. Patients with Killip class III-IV had higher SOCS3 than those with class I-II (Fig. 8C). Although more stenotic coronary arteries were associated with higher levels of AQP9, the difference was less significant than for SOCS3 (Fig. 8B). In addition, AQP9 did not differ significantly between Killip classifications (Fig. 8C). Furthermore, the 9 significant clinical features were analysed with the Pearson correlation test (Additional file 1: Fig. S4). SOCS3 correlated positively with AQP9, and both genes correlated negatively with Albumin.

Discussion

To our knowledge, our work is the first to filter AMI diagnosis genes based on the overall normalized weights of IML. Four microarrays with 220 samples were adopted for data analysis, and a further clinical study was performed to validate the results. Two genes, AQP9 and SOCS3, showed an AUC > 0.9 in both the training set and testing set (Fig. 6). Both genes showed their highest correlation coefficient (Fig. 7) with monocytes. The clinical study verified the significant difference between AMI patients and SCAD controls, indicating a potential diagnostic value of AQP9 and SOCS3. In line with previous studies, we found that AQP9 has diagnostic value for AMI [34, 35], and we further explored the immune correlation of AQP9. Additionally, Prof. Zhu [36] identified SOCS3 as an immune-related gene in AMI, and we extended this finding to its diagnostic value. More importantly, this study is the first to reveal the correlation of AQP9 and SOCS3 RNA levels, especially SOCS3, with the number of stenotic coronary arteries and the Killip classification.

AQP9, a cell membrane protein, transports water down the concentration gradient. In AMI rats, silencing AQP9 can reverse ERK1/2, attenuating the inflammatory response and apoptosis of cardiomyocytes and improving cardiac function [37]. This research indicated the crucial role of AQP9 in the pathogenesis of AMI. In human polymorphonuclear leukocytes, AQP9-related inflammation may result from NF-κB [38] and F-actin polymerization [39]. In our work, the AUC of AQP9 was > 0.9. Therefore, AQP9 might be a potential genetic marker for distinguishing AMI from SCAD.

SOCS3 is increased in AMI mice [29] and regulates the T-cell repertoire with STAT3/SOCS3 signalling [40]. More importantly, cardiac-specific silencing of SOCS3 triggers sustained STAT3 and decreases myocardial apoptosis [41]. Therefore, SOCS3 is the dominant negative modulator [42] of Th17 via STAT3 [43]. Apoptosis regulates the pathophysiological evaluation of AMI [44]. In vitro, SOCS3 can trigger the apoptosis of mammary cells [45], and knocking out SOCS3 regulates the expression of apoptosis in 3T3-L1 preadipocytes [46]. The above research emphasized the immune regulation of SOCS3 and the regulation of apoptosis with STAT3. In our work, the AUC of SOCS3 was > 0.9. Therefore, SOCS3 might be an effective genetic marker for diagnosing AMI.

Additionally, the CIBERSORT algorithm showed that the proportion of neutrophils and monocytes in the AMI group was higher than in the control group. The progression of AMI is correlated with immune disorder. For example, the white blood cell count correlates highly with in-hospital mortality after AMI [47]. Neutrophils are increased in peripheral blood, and researchers have emphasized that neutrophils-lymphocytes [48, 49] and monocytes/macrophages [50] can be easily acquired factors for the prognosis of AMI. Macrophages were dominant in infarcted myocardium, especially over the first week of AMI [51]. However, NK cells have diminished cytotoxic function [52], and the targeted regulation of NK cells may indicate a dominant role in the cure of AMI. At the beginning of AMI, inflammation deteriorates with increased neutrophils and monocytes [53], and inflammation decreases over time with the reduced function of NK cells. Innate immunity is a vital regulatory factor in the inflammatory, proliferative, and maturation phases [3, 54, 55]. AMI leads to a deteriorated inflammatory process. Currently, novel therapeutic interventions targeting the immune system may regulate slant inflammation, which is conducive to resolving pathological conditions. In a previous clinical trial of 182 NSTEMI patients (a subtype of AMI), the patient's intake of IL-1 blockers decreased acute inflammation [56]. Another immune study showed that short-term blockade of S100A9 downregulates inflammation [57] in permanent coronary ischemia mice. However, the above immune interventions are still experimental and not in the clinic. In summary, regulating immune cells along with the progression of AMI and immune intervention in AMI might be a potential target.

AQP9 expression was highest in human polymorphonuclear leukocytes [39] compared with the spleen and liver, suggesting a possible correlation between AQP9 and immunity or inflammation. AQP9 regulates water flow in leukocytes [58], which regulates cellular morphology and motility, a change that facilitates the migration of leukocytes to inflammatory sites. Similar to our result, Hawang [59] indicated a correlation between AQP9 and neutrophil granulocytes. Research [29, 60, 61] emphasizes the correlation between SOCS3 and neutrophils in inflammation. In our research, both genes had a higher correlation with two immune cell types, neutrophils and monocytes. This immune cell correlation suggests that gene therapy targeting immune cells may benefit the course of AMI and that AQP9 and SOCS3 could feasibly serve as therapeutic targets or predictors of treatment response.

ML algorithms are widely applied to various cardiovascular diseases for tasks such as variable optimization, classification, and regression. For variable filtering, numerous studies use only one or two ML algorithms (e.g., weighted gene coexpression network analysis [60], LASSO, and SVM). However, using only one or two ML algorithms might inadvertently discard potential genes. For example, AQP9 would be ignored if we used only DT because the weight of AQP9 was zero in DT (Table 4). Likewise, although LASSO can detect candidate genes in big data, when highly correlated features exist the LASSO regression method tends to select one of them and ignore all the others, leading to unstable results [61]. In pigmented skin lesions [62], SVM and NN showed good classification value. For predicting postsurgical mortality preoperatively [63], GBM outperformed DT, RF, and SVM. Various ML algorithms may assign different weights even to the same variable (Table 4). Therefore, the overall normalized weights of IML were used to filter genes. Notably, IML identified two potential, unreported diagnostic genes in AMI. In our study, IML showed good value in both variable screening and model prediction.

Four limitations exist in this work, although the best efforts were taken to reduce them. First, verification with a small sample size might introduce bias, so multicentre collaborations or larger external datasets are crucial for further verification. Although testing sets and clinical validation were used to assess the stability of the diagnostic value, the bias of single-centre validation might remain. Further confirmation, clinical trials, and animal experiments are indispensable for solid verification. Second, the ML algorithms have limitations (e.g., the black box phenomenon [64]), especially NN, which has numerous layers [65]. Even if a human knows precisely what the model computes at each step of the decision-making process, the set of operations an ML model performs in making a prediction cannot be described in human-understandable semantics. Interpretability techniques for ML models, which enhance the transparency and reliability of ML, therefore continue to attract the attention of developers. Third, because of the limitations of our laboratory extraction process, the leukocytes acquired for clinical validation were predominantly lymphocytes and monocytes. Finally, limited clinical features were obtained (e.g., age [66], ethnicity, and race [67]). Additional clinical features could potentially enhance the predictive accuracy of the diagnostic model and provide a more comprehensive understanding of AMI. For example, various combinations of clinical variables [68] (e.g., sex, smoking status, and laboratory indicators) can be calibrated to analyze the relationship between the target variable and the outcome.

Conclusion

Based on the overall normalized weights of IML, this research merged four microarrays and uncovered two hidden diagnostic genes, AQP9 and SOCS3, in the leukocytes of AMI patients. AQP9 and SOCS3 are closely associated with monocytes and neutrophils, which might contribute to advancing AMI diagnosis and shed light on novel genetic markers, AMI pathogenesis, targeted therapies, and potential precision medicine. Although clinical validation reproduced the result, multicenter, large-sample trials covering multiple clinical characteristics are still needed to confirm its clinical value.

Availability of data and materials

The datasets presented in this study can be found online. The names of the repositories and GEO numbers can be found below: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE59867; https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE60993; https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE62646; https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE48060.

Abbreviations

AUC: Area under the curve
AMI: Acute myocardial infarction
SCAD: Stable coronary artery disease
IML: Integration machine learning
DEGs: Differentially expressed genes
KEGG-GSEA: Kyoto Encyclopedia of Genes and Genomes-Gene Set Enrichment Analysis
GO: Gene Ontology
DO: Disease Ontology
MF: Molecular function
BP: Biological process
CC: Cellular components
SVM: Support vector machine
ML: Machine learning
LASSO: Least absolute shrinkage and selection operator
RF: Random forest
OOB: Out of bag
GBM: Gradient boosting machine
DT: Decision trees
NN: Neural network
RBC: Red blood cell count
WBC: White blood cell count
NeP: Neutrophils percentage
MonP: Monocyte percentage
Mon: Monocyte count
Lym: Lymphocyte count
RDW: Red blood cell distribution width
PDW: Platelet distribution width
Pla: Platelet count
MCHC: Mean corpuscular haemoglobin concentration
Hg: Haemoglobin
GAT: Glutamic transaminase
CRP: C-reactive protein
LDH: Lactate dehydrogenase
cTnT: Cardiac troponin T
CK-MB: Creatine kinase isoenzymes
LDL: Low-density lipoprotein
HDL: High-density lipoprotein
TC: Total cholesterol
TG: Total triglycerides
Cys: Homocysteine
GGT: Gamma-glutamyl transferase
IBIL: Indirect bilirubin
DBIL: Direct bilirubin
TBIL: Total bilirubin
ALP: Alkaline phosphatase

References

GBD 2017 DALYs and HALE Collaborators. Global, regional, and national disability-adjusted life-years (DALYs) for 359 diseases and injuries and healthy life expectancy (HALE) for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet. 2018;392:1859–922.
Murray CJ, Barber RM, Foreman KJ, Abbasoglu Ozgoren A, Abd-Allah F, Abera SF, Aboyans V, Abraham JP, Abubakar I, Abu-Raddad LJ, et al. Global, regional, and national disability-adjusted life years (DALYs) for 306 diseases and injuries and healthy life expectancy (HALE) for 188 countries, 1990–2013: quantifying the epidemiological transition. Lancet. 2015;386:2145–91.
Reed GW, Rossi JE, Cannon CP. Acute myocardial infarction. Lancet. 2017;389:197–210.
Anderson JL, Morrow DA. Acute myocardial infarction. N Engl J Med. 2017;376:2053–64.
Levine GN, Bates ER, Bittl JA, Brindis RG, Fihn SD, Fleisher LA, Granger CB, Lange RA, Mack MJ, Mauri L, et al. 2016 ACC/AHA guideline focused update on duration of dual antiplatelet therapy in patients with coronary artery disease: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines: an update of the 2011 ACCF/AHA/SCAI Guideline for Percutaneous Coronary Intervention, 2011 ACCF/AHA Guideline for Coronary Artery Bypass Graft Surgery, 2012 ACC/AHA/ACP/AATS/PCNA/SCAI/STS Guideline for the Diagnosis and Management of Patients With Stable Ischemic Heart Disease, 2013 ACCF/AHA Guideline for the Management of ST-Elevation Myocardial Infarction, 2014 AHA/ACC Guideline for the Management of Patients With Non-ST-Elevation Acute Coronary Syndromes, and 2014 ACC/AHA Guideline on Perioperative Cardiovascular Evaluation and Management of Patients Undergoing Noncardiac Surgery. Circulation. 2016;134:e123-155.
Huang CC, Liao PC. Heart attack causes head-ache - cardiac cephalalgia. Acta Cardiol Sin. 2016;32:239–42.
Di Stefano R, Di Bello V, Barsotti MC, Grigoratos C, Armani C, Dell’Omodarme M, Carpi A, Balbarini A. Inflammatory markers and cardiac function in acute coronary syndrome: difference in ST-segment elevation myocardial infarction (STEMI) and in non-STEMI models. Biomed Pharmacother. 2009;63:773–80.
Wei EY, Hira RS, Huang HD, Wilson JM, Elayda MA, Sherron SR, Birnbaum Y. Pitfalls in diagnosing ST elevation among patients with acute myocardial infarction. J Electrocardiol. 2013;46:653–9.
Moy E, Barrett M, Coffey R, Hines AL, Newman-Toker DE. Missed diagnoses of acute myocardial infarction in the emergency department: variation by patient and facility characteristics. Diagnosis (Berl). 2015;2:29–40.
Peet C, Ivetic A, Bromage DI, Shah AM. Cardiac monocytes and macrophages after myocardial infarction. Cardiovasc Res. 2020;116:1101–12.
Heidt T, Courties G, Dutta P, Sager HB, Sebas M, Iwamoto Y, Sun Y, Da Silva N, Panizzi P, van der Laan AM, et al. Differential contribution of monocytes to heart macrophages in steady-state and after myocardial infarction. Circ Res. 2014;115:284–95.
Jiang K, Tu Z, Chen K, Xu Y, Chen F, Xu S, Shi T, Qian J, Shen L, Hwa J, et al. Gasdermin D inhibition confers antineutrophil-mediated cardioprotection in acute myocardial infarction. J Clin Invest. 2022;132: e151268.
Liang Y, Chen H, Wang P. Correlation of leukocyte and coronary lesion severity of acute myocardial infarction. Angiology. 2018;69:591–9.
Kazimierczyk E, Eljaszewicz A, Zembko P, Tarasiuk E, Rusak M, Kulczynska-Przybik A, Lukaszewicz-Zajac M, Kaminski K, Mroczko B, Szmitkowski M, et al. The relationships among monocyte subsets, miRNAs and inflammatory cytokines in patients with acute myocardial infarction. Pharmacol Rep. 2019;71:73–81.
Sreejit G, Abdel-Latif A, Athmanathan B, Annabathula R, Dhyani A, Noothi SK, Quaife-Ryan GA, Al-Sharea A, Pernes G, Dragoljevic D, et al. Neutrophil-derived S100A8/A9 amplify granulopoiesis after myocardial infarction. Circulation. 2020;141:1080–94.
Long R, Gao L, Li Y, Li G, Qin P, Wei Z, Li D, Qian C, Li J, Yang G. M2 macrophage-derived exosomes carry miR-1271-5p to alleviate cardiac injury in acute myocardial infarction through down-regulating SOX6. Mol Immunol. 2021;136:26–35.
Li R, Jin J, Liu E, Zhang J. A novel circulating biomarker lnc-MALAT1 for acute myocardial infarction: its relationship with disease risk, features, cytokines, and major adverse cardiovascular events. J Clin Lab Anal. 2022;36: e24771.
GBD 2017 Disease and Injury Incidence and Prevalence Collaborators. Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet. 2018;392:1789–858.
Lu Y, Wu Q, Liao J, Zhang S, Lu K, Yang S, Wu Y, Dong Q, Yuan J, Zhao N, Du Y. Identification of the distinctive role of DPT in dilated cardiomyopathy: a study based on bulk and single-cell transcriptomic analysis. Ann Transl Med. 2021;9:1401.
Di Z, Di M, Fu W, Tang Q, Liu Y, Lei P, Gu X, Liu T, Sun M. Integrated analysis identifies a nine-microrna signature biomarker for diagnosis and prognosis in colorectal cancer. Front Genet. 2020;11:192.
Wang K, Zhang L, Li L, Wang Y, Zhong X, Hou C, Zhang Y, Sun C, Zhou Q, Wang X. Identification of Drug-Induced Liver Injury Biomarkers from Multiple Microarrays Based on Machine Learning and Bioinformatics Analysis. Int J Mol Sci. 2022;23:11945.
Zhang L, Lin Y, Wang K, Han L, Zhang X, Gao X, Li Z, Zhang H, Zhou J, Yu H, Fu X. Multiple-model machine learning identifies potential functional genes in dilated cardiomyopathy. Front Cardiovasc Med. 2022;9:1044443.
Zhang L, Mao R, Lau CT, Chung WC, Chan JCP, Liang F, Zhao C, Zhang X, Bian Z. Identification of useful genes from multiple microarrays for ulcerative colitis diagnosis based on machine learning methods. Sci Rep. 2022;12:9962.
DeLong JH, Ohashi SN, O’Connor KC, Sansing LH. Inflammatory Responses After Ischemic Stroke. Semin Immunopathol. 2022;44:625–48.
Denorme F, Portier I, Rustad JL, Cody MJ, de Araujo CV, Hoki C, Alexander MD, Grandhi R, Dyer MR, Neal MD, et al. Neutrophil extracellular traps regulate ischemic stroke brain injury. J Clin Invest. 2022;132.
Swirski FK, Nahrendorf M. Leukocyte behavior in atherosclerosis, myocardial infarction, and heart failure. Science. 2013;339:161–6.
Hiremath A, Shiradkar R, Fu P, Mahran A, Rastinehad AR, Tewari A, Tirumani SH, Purysko A, Ponsky L, Madabhushi A. An integrated nomogram combining deep learning, Prostate Imaging-Reporting and Data System (PI-RADS) scoring, and clinical variables for identification of clinically significant prostate cancer on biparametric MRI: a retrospective multicentre study. Lancet Digit Health. 2021;3:e445–54.
Wang Y, Guan Q, Lao I, Wang L, Wu Y, Li D, Ji Q, Wang Y, Zhu Y, Lu H, Xiang J. Using deep convolutional neural networks for multi-classification of thyroid tumor by histopathology: a large-scale pilot study. Ann Transl Med. 2019;7:468.
Zhu X, Yin T, Zhang T, Zhu Q, Lu X, Wang L, Liao S, Yao W, Zhou Y, Zhang H, Li X. Identification of immune-related genes in patients with acute myocardial infarction using machine learning methods. J Inflamm Res. 2022;15:3305–21.
Ein-Dor L, Kela I, Getz G, Givol D, Domany E. Outcome signature genes in breast cancer: is there a unique set? Bioinformatics. 2005;21:171–8.
Jaatinen T, Laine J. Isolation of mononuclear cells from human cord blood by Ficoll-Paque density gradient. Curr Protoc Stem Cell Biol. 2007;Chapter 2:Unit 2A.1.
Tan YS, Lei YL. Isolation of tumor-infiltrating lymphocytes by ficoll-paque density gradient centrifugation. Methods Mol Biol. 2019;1960:93–9.
Sugiyama Y, Yamazaki K, Kusaka-Kikushima A, Nakahigashi K, Hagiwara H, Miyachi Y. Analysis of aquaporin 9 expression in human epidermis and cultured keratinocytes. FEBS Open Bio. 2014;4:611–6.
Chen J, Yu L, Zhang S, Chen X. Network analysis-based approach for exploring the potential diagnostic biomarkers of acute myocardial infarction. Front Physiol. 2016;7:615.
Shao G. Integrated RNA gene expression analysis identified potential immune-related biomarkers and RNA regulatory pathways of acute myocardial infarction. PLoS ONE. 2022;17: e0264362.
Yang Y, Liu P, Teng R, Liu F, Zhang C, Lu X, Ding Y. Integrative bioinformatics analysis of potential therapeutic targets and immune infiltration characteristics in dilated cardiomyopathy. Ann Transl Med. 2022;10:348.
Huang X, Yu X, Li H, Han L, Yang X. Regulation mechanism of aquaporin 9 gene on inflammatory response and cardiac function in rats with myocardial infarction through extracellular signal-regulated kinase1/2 pathway. Heart Vessels. 2019;34:2041–51.
Takeuchi K, Hayashi S, Matumoto T, Hashimoto S, Takayama K, Chinzei N, Kihara S, Haneda M, Kirizuki S, Kuroda Y, et al. Downregulation of aquaporin 9 decreases catabolic factor expression through nuclear factor-κB signaling in chondrocytes. Int J Mol Med. 2018;42:1548–58.
Matsushima A, Ogura H, Koh T, Shimazu T, Sugimoto H. Enhanced expression of aquaporin 9 in activated polymorphonuclear leukocytes in patients with systemic inflammatory response syndrome. Shock. 2014;42:322–6.
Baker BJ, Akhtar LN, Benveniste EN. SOCS1 and SOCS3 in the control of CNS immunity. Trends Immunol. 2009;30:392–400.
Negoro S, Kunisada K, Fujio Y, Funamoto M, Darville MI, Eizirik DL, Osugi T, Izumi M, Oshima Y, Nakaoka Y, et al. Activation of signal transducer and activator of transcription 3 protects cardiomyocytes from hypoxia/reoxygenation-induced oxidative stress through the upregulation of manganese superoxide dismutase. Circulation. 2001;104:979–81.
Yoshimura A, Naka T, Kubo M. SOCS proteins, cytokine signalling and immune regulation. Nat Rev Immunol. 2007;7:454–65.
Chen Z, Laurence A, Kanno Y, Pacher-Zavisin M, Zhu BM, Tato C, Yoshimura A, Hennighausen L, O’Shea JJ. Selective regulatory function of Socs3 in the formation of IL-17-secreting T cells. Proc Natl Acad Sci U S A. 2006;103:8137–42.
Scarabelli TM, Stephanou A, Pasini E, Comini L, Raddino R, Knight RA, Latchman DS. Different signaling pathways induce apoptosis in endothelial cells and cardiac myocytes during ischemia/reperfusion injury. Circ Res. 2002;90:745–8.
Le Provost F, Miyoshi K, Vilotte JL, Bierie B, Robinson GW, Hennighausen L. SOCS3 promotes apoptosis of mammary differentiated cells. Biochem Biophys Res Commun. 2005;338:1696–701.
Chhabra JK, Chattopadhyay B, Paul BN. SOCS3 dictates the transition of divergent time-phased events in granulocyte TNF-α signaling. Cell Mol Immunol. 2014;11:105–6.
Dutta P, Nahrendorf M. Monocytes in myocardial infarction. Arterioscler Thromb Vasc Biol. 2015;35:1066–70.
Lin G, Dai C, Xu K, Wu M. Predictive value of neutrophil to lymphocyte ratio and red cell distribution width on death for ST segment elevation myocardial infarction. Sci Rep. 2021;11:11506.
Sasmita BR, Zhu Y, Gan H, Hu X, Xue Y, Xiang Z, Huang B, Luo S. Prognostic value of neutrophil-lymphocyte ratio in cardiogenic shock complicating acute myocardial infarction: a cohort study. Int J Clin Pract. 2021;75: e14655.
Kervinen H, Mänttäri M, Kaartinen M, Mäkynen H, Palosuo T, Pulkki K, Kovanen PT. Prognostic usefulness of plasma monocyte/macrophage and T-lymphocyte activation markers in patients with acute coronary syndromes. Am J Cardiol. 2004;94:993–6.
Yan X, Anzai A, Katsumata Y, Matsuhashi T, Ito K, Endo J, Yamamoto T, Takeshima A, Shinmura K, Shen W, et al. Temporal dynamics of cardiac immune cell accumulation following acute myocardial infarction. J Mol Cell Cardiol. 2013;62:24–35.
Ortega-Rodríguez AC, Marín-Jáuregui LS, Martínez-Shio E, Hernández Castro B, González-Amaro R, Escobedo-Uribe CD, Monsiváis-Urenda AE. Altered NK cell receptor repertoire and function of natural killer cells in patients with acute myocardial infarction: a three-month follow-up study. Immunobiology. 2020;225: 151909.
Leuschner F, Rauch PJ, Ueno T, Gorbatov R, Marinelli B, Lee WW, Dutta P, Wei Y, Robbins C, Iwamoto Y, et al. Rapid monocyte kinetics in acute myocardial infarction are sustained by extramedullary monocytopoiesis. J Exp Med. 2012;209:123–37.
Nahrendorf M. Myeloid cell contributions to cardiovascular health and disease. Nat Med. 2018;24:711–20.
Swirski FK, Nahrendorf M. Cardioimmunology: the immune system in cardiac homeostasis and disease. Nat Rev Immunol. 2018;18:733–44.
Yellon DM, Hausenloy DJ. Myocardial reperfusion injury. N Engl J Med. 2007;357:1121–35.
Ridker PM, Everett BM, Thuren T, MacFadyen JG, Chang WH, Ballantyne C, Fonseca F, Nicolau J, Koenig W, Anker SD, et al. Antiinflammatory therapy with canakinumab for atherosclerotic disease. N Engl J Med. 2017;377:1119–31.
Moniaga CS, Watanabe S, Honda T, Nielsen S, Hara-Chikuma M. Aquaporin-9-expressing neutrophils are required for the establishment of contact hypersensitivity. Sci Rep. 2015;5:15319.
Wang H, Dou S, Wang C, Gao W, Cheng B, Yan F. Identification and experimental validation of Parkinson’s disease with major depressive disorder common genes. Mol Neurobiol. 2023;60:6092–108.
Radulescu E, Jaffe AE, Straub RE, Chen Q, Shin JH, Hyde TM, Kleinman JE, Weinberger DR. Identification and prioritization of gene sets associated with schizophrenia risk by co-expression network analysis in human brain. Mol Psychiatry. 2020;25:791–804.
Choi BY, Bair E, Lee JW. Nearest shrunken centroids via alternative genewise shrinkages. PLoS ONE. 2017;12: e0171068.
Dreiseitl S, Ohno-Machado L, Kittler H, Vinterbo S, Billhardt H, Binder M. A comparison of machine learning methods for the diagnosis of pigmented skin lesions. J Biomed Inform. 2001;34:28–36.
Chiew CJ, Liu N, Wong TH, Sim YE, Abdullah HR. Utilizing machine learning methods for preoperative prediction of postsurgical mortality and intensive care unit admission. Ann Surg. 2020;272:1133–9.
Regazzoni F, Chapelle D, Moireau P. Combining data assimilation and machine learning to build data-driven models for unknown long time dynamics-Applications in cardiovascular modeling. Int J Numer Method Biomed Eng. 2021;37: e3471.
Peng JC, Ran ZH, Shen J. Seasonal variation in onset and relapse of IBD and a model to predict the frequency of onset, relapse, and severity of IBD based on artificial neural network. Int J Colorectal Dis. 2015;30:1267–73.
Kalkan IH, Dağli U, Oztaş E, Tunç B, Ulker A. Comparison of demographic and clinical characteristics of patients with early vs. adult vs. late onset ulcerative colitis. Eur J Intern Med. 2013;24:273–7.
Jiang L, Xia B, Li J, Ye M, Deng C, Ding Y, Luo H, Ren H, Hou X, Liu H, et al. Risk factors for ulcerative colitis in a Chinese population: an age-matched and sex-matched case-control study. J Clin Gastroenterol. 2007;41:280–4.
Adler ED, Voors AA, Klein L, Macheret F, Braun OO, Urey MA, Zhu W, Sama I, Tadel M, Campagnari C, et al. Improving risk prediction in heart failure using machine learning. Eur J Heart Fail. 2020;22:139–47.

Acknowledgements

We thank the Suzhou Science & Technology Development Plan.

Funding

The research was funded by the Suzhou Science & Technology Development Plan (SYSD2019222), the Zhangjiagang Science and Technology Plan Project (ZKS2135), and the Youth Science and Technology Project of Zhangjiagang Municipal Health Commission (ZJGQNKJ202211).

Author information

Lin Zhang and Yue Liu contributed equally.

Authors and Affiliations

State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, 10 Poyanghu Road, Jinghai, Tianjin, 301617, People’s Republic of China
Lin Zhang & Kaiyue Wang
Department of Nephropathy, Zhangjiagang TCM Hospital Affiliated to Nanjing University of Chinese Medicine, Zhangjiagang, 215600, Jiangsu, People’s Republic of China
Yue Liu, Min Huang, Zhenfang Du & Sheng Qiang
The First Affiliated Hospital of Guizhou, University of Traditional Chinese Medicine, Guiyang, 550025, Guizhou, People’s Republic of China
Xiangqin Ou
Tianjin Jinghai District Hospital, 14 Shengli Road, Jinghai, Tianjin, 301699, People’s Republic of China
Jiashun Zhou & Houliang Zhang

Contributions

LZ and YL wrote the original draft. LZ, KW, YL, JZ, and HZ performed the research. LZ, YL, XO, and KW analyzed the data. SQ, MH, ZD, JZ, and HZ designed the experiment and revised the manuscript.

Corresponding authors

Correspondence to Zhenfang Du or Sheng Qiang.

Ethics declarations

Ethics approval and consent to participate

The clinical trial part was approved by the Ethics Review Committee of Jinghai District Hospital (No. JHYYLL-2022-0307).

Consent for publication

This study has not been published before, and this publication has been approved by all authors.

Competing interests

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1

Table S1. The 39 DEGs in healthy controls and AMI patients.
Table S2. GSEA enrichment of 45 terms.
Table S3. GO enrichment of 160 terms.
Table S4. DO enrichment of 41 terms.
Table S5. Primary weight of DEGs in the six classification ML algorithms.
Fig. S1. Difference between the 22 immune cells.
Fig. S2. Correlation analysis of SOCS3 in 7 immune cell types.
Fig. S3. Correlation analysis of AQP9 in 7 immune cell types.
Fig. S4. The correlation analysis of 9 clinical variables.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Zhang, L., Liu, Y., Wang, K. et al. Integration of machine learning to identify diagnostic genes in leukocytes for acute myocardial infarction patients. J Transl Med 21, 761 (2023). https://doi.org/10.1186/s12967-023-04573-x

Received: 07 February 2023
Accepted: 25 September 2023
Published: 27 October 2023
DOI: https://doi.org/10.1186/s12967-023-04573-x
