Introduction

In December 2019, a cluster of pneumonia cases of unknown origin was reported in Wuhan [1, 2]. The pathogen was identified as a novel β-coronavirus, now named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which is phylogenetically close to SARS-CoV [3]. The disease caused by SARS-CoV-2 infection has been named coronavirus disease 2019 (COVID-19) by the World Health Organization (WHO) [4].

COVID-19 has now become a worldwide health concern. The estimated overall case-fatality rate for COVID-19 is approximately 1–2.3%, which is similar to that of Spanish influenza (2–3%) and much higher than that of seasonal influenza (0.1%) [5,6,7]. The severity of COVID-19 has been classified as mild, moderate, severe and critical per the WHO-China Joint Mission [8]. Among a total of 72 314 case records from the Chinese Center for Disease Control and Prevention, approximately 81% of COVID-19 cases were defined as mild, 14% of COVID-19 cases were severe, and 5% were critical [5]. The overall hospital mortality of COVID-19 cases is approximately 15–20%, but it is up to 40–49% among critical cases requiring intensive care unit admission [2, 5]. Therefore, it is important to evaluate the risk factors for disease progression in COVID-19 patients. Early identification of COVID-19 patients with possible progression of the disease is particularly important for optimal treatment choice and reducing mortality.

Studies have revealed changes in haematologic and immunologic tests and have investigated risk factors for mortality and outcomes in patients with COVID-19 [2, 7, 9,10,11,12,13]. Older age, high Sequential Organ Failure Assessment (SOFA) score, D-dimer greater than 1 µg/mL, lymphocytopenia and history of coronary vascular disease were reported to increase the risk of death in patients with COVID-19 [7, 9,10,11].

Moreover, score prediction models, such as the prognostic nutritional index (PNI) and the systemic immune-inflammatory index (SII), have been used to predict prognosis in COVID-19 patients and in other diseases [14,15,16,17,18]. However, evidence on the risk factors related to progression of COVID-19 from mild/moderate to severe disease is still limited and needs to be assessed. The immune response and inflammatory cytokines are important for elucidating the mechanisms of host responses and the pathogenesis of COVID-19 [10, 19,20,21]. In addition to decreased T cells, natural killer (NK) cell immunotypes were recently reported to be related to COVID-19 disease severity [19, 22]. We enrolled patients with COVID-19 from two medical centres in China and aimed to evaluate the risk factors for progression to severe-stage disease using a new combined score that includes NK cell information.

Materials and methods

Study design and participants

This study was a retrospective, multicentre study. Admitted cases of laboratory-confirmed COVID-19 from Renmin Hospital of Wuhan University and West China Hospital of Sichuan University were reviewed. We included adult patients (aged ≥ 18 years) admitted between February 6 and April 6, 2020, with SARS-CoV-2 infection confirmed by RT-PCR. The requirement for informed consent was waived because the data are reported in aggregate with no identifying information.

The diagnosis of COVID-19 was made according to the WHO interim guidance [23]. The severity of COVID-19 was assessed according to the Seventh Version of the Novel Coronavirus Pneumonia Diagnosis and Treatment Guidance from the National Health Commission of China [24]. According to the guidelines, patients were categorized into the mild, moderate, severe, or critical group upon admission [24].

Data collection

The clinical data, laboratory data and radiological data of all COVID-19 patients were obtained from the electronic medical records of the treating hospital. Data were reviewed and verified by a team of physicians (Dongbo Wu, Wei Jiang, Changhai Liu, Ming Wang and Lang Bai). Any missing or unclear records were collated and clarified through communication with local medical staff or patients and their families.

Detailed demographic information, comorbidities, symptoms, and disease severity of all patients were recorded or diagnosed on admission. The recorded clinical and virological characteristics included age, sex and past medical history, together with laboratory findings: white blood cell (WBC) count, neutrophil count (NEU), lymphocyte count (LYM), haemoglobin (HGB), platelet count (PLT), prothrombin time (PT), D-dimer, alanine aminotransferase (ALT), aspartate aminotransferase (AST), γ-glutamyl transpeptidase (GGT), albumin (ALB), total bilirubin (TBIL), direct bilirubin (DBIL), uric acid (UA), creatinine (Cr), creatine kinase (CK), lactate dehydrogenase (LDH), brain natriuretic peptide (BNP), procalcitonin (PCT), C-reactive protein (CRP), neutrophil/lymphocyte ratio (NLR), other biochemical parameters and SARS-CoV-2 RNA test results. There were no cases lost to follow-up in this study.

Statistical analysis

The baseline clinical characteristics and outcomes of the patients were assessed. Categorical data are presented as frequencies (percentages); continuous variables are presented as medians (range, minimum–maximum). The demographic, clinical and laboratory characteristics of COVID-19 patients were compared among groups. The Mann–Whitney U test was used to compare continuous variables. The χ2 test with Yates’ correction was used for 2 × 2 contingency data, and Pearson’s χ2 test was used for contingency data for variables with more than two categories.

To explore risk factors, or their interactions, associated with COVID-19 severity, univariable and multivariable logistic regression models were used to estimate odds ratios (ORs) and 95% confidence intervals (CIs). A Cox proportional hazards model was constructed by sequentially introducing variables, and a significance level of p > 0.05 was used to remove variables from the model. Final model selection was performed by backward selection of all factors. Schoenfeld and martingale residuals were used to check the proportional hazards assumption and nonlinearity, respectively.

Survival curves were estimated using the Kaplan–Meier method and compared using the log-rank test, since time-to-mortality and time-to-event are crucial in interpreting the results. Estimates of adjusted hazard ratios (HRs), 95% CIs, and p values are displayed. Harrell’s concordance index (C-index) was used to assess the score’s discrimination ability. C-index values and the corresponding 95% CIs were estimated for each main study time point. In addition, bootstrapping, calibration curves, decision curves and clinical impact curves were applied to verify the nomogram. A two-sided p < 0.05 was considered statistically significant. All statistical analyses were performed using R software (version 3.5.2, http://CRAN.R-project.org, R Foundation, Vienna, Austria).
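As an illustration of the discrimination measure named above, Harrell's C-index can be sketched in a few lines of Python. This is a minimal sketch and not the R code used in the study: the function name and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def harrell_c_index(times, events, scores):
    """Harrell's concordance index for right-censored data.

    times  : follow-up time of each patient
    events : 1 if the event (e.g. progression) was observed, 0 if censored
    scores : risk score, where a higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # Order the pair so that `first` has the shorter follow-up time.
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if scores[first] > scores[second]:
            concordant += 1.0
        elif scores[first] == scores[second]:
            concordant += 0.5  # tied scores count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from the study cohort.
toy_times = [5, 8, 12, 20, 28, 28]
toy_events = [1, 1, 0, 1, 0, 0]
toy_scores = [6.8, 3.2, 4.0, 5.1, 1.5, 0.0]
toy_c = harrell_c_index(toy_times, toy_events, toy_scores)
print(f"C-index on toy data: {toy_c:.3f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of patients by risk.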

Ethics approval and consent to participate

This study was approved by the Ethics Committee of West China Hospital of Sichuan University (No. 2020-444) and was allowed exemption from the requirement of informed consent. All research was conducted in accordance with the Declaration of Helsinki. Raw data generated or analysed during this study are provided in Additional file 8.

Results

Demographic and clinical characteristics of the study population

A total of 239 COVID-19 patients were included in this study. In the present cohort, 216 (90.38%) patients had mild/moderate disease, and 23 (9.62%) patients experienced progression to severe disease. The demographic and clinical characteristics of the study population are presented in Table 1. The median age was 58 years (range 26–90 years), and 58.20% (139/239) were male. The median maximum temperature was 38.0 °C (range 36.0–41.0 °C).

Table 1 Demographic and clinical characteristics of the study population

When comparing demographic data at admission, patients with progression to severe disease were more likely to be male and older (> 75 years) than patients without progression (p < 0.05, Table 1). The clinical manifestations were mainly as follows (Table 1): fever 77.8% (186/239), cough 60.3% (144/239), expectoration 23.8% (57/239), dyspnoea 10.5% (25/239), chest pain 4.6% (11/239), angina 4.2% (11/239), fatigue 28.5% (68/239), myalgia 9.2% (22/239), headache 5.0% (12/239), vomiting 1.7% (4/239) and diarrhoea 18.0% (43/239).

The clinical characteristics of the study population are summarized in Table 2. When comparing the biochemical indexes of patients with and without progression, we found significant differences in lymphocytes, NLR, CRP, AST, TBIL, DBIL, Cr, urea, glucose, sodium, PT, CD3+ T cells, CD4+ T cells, CD8+ T cells, CD19+ B cells and CD16+/CD56+ NK cells (all p < 0.05); detailed information is listed in Table 2.

Table 2 Laboratory test results and clinical characteristics of the study population

Univariable and multivariable Cox regression models for progression from mild/moderate to severe disease

When exploring risk factors for progression from mild/moderate to severe COVID-19, we compared the demographic and clinical data of moderate cases and cases with progression to severe disease. In univariable and multivariable Cox regression models, pulmonary disease (HR 11.20, 95% CI 2.50–49.70, p = 0.001), age over 75 (HR 3.92, 95% CI 1.61–9.73, p = 0.003), IgM (HR 6.31, 95% CI 1.99–19.60, p = 0.002), CD16+/CD56+ NK cells (HR 3.40, 95% CI 1.31–9.13, p = 0.014) and AST (HR 4.60, 95% CI 1.31–16.00, p = 0.018) were significant (Table 3) and were the 5 independent risk factors for progression from mild/moderate to severe disease (Fig. 1). The other variables had no significant impact in our study population (see Additional file 6: Table S1). We also evaluated the model with these five independent risk factors using the global Schoenfeld test, Cox deviance diagnostics and the Cox proportional hazards model fit, which suggested good performance (see Additional file 1: Fig. S1, Additional file 2: Fig. S2, Additional file 3: Fig. S3).
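In a Cox model the hazard ratio is the exponential of the regression coefficient, so the estimates above can be moved back to the log scale. The sketch below does this for the five reported factors and approximates each standard error from the width of the 95% CI. It assumes the intervals are Wald-type (symmetric on the log scale), which the text does not state; the dictionary and function names are hypothetical.

```python
import math

# Hazard ratios and 95% CIs as reported in the text above.
reported = {
    "pulmonary disease": (11.20, 2.50, 49.70),
    "age > 75": (3.92, 1.61, 9.73),
    "IgM": (6.31, 1.99, 19.60),
    "CD16+/CD56+ NK cells": (3.40, 1.31, 9.13),
    "AST": (4.60, 1.31, 16.00),
}


def log_hr_and_se(hr, lower, upper, z=1.96):
    """Return (ln HR, approximate SE), assuming a Wald-type 95% CI."""
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


log_scale = {name: log_hr_and_se(*vals) for name, vals in reported.items()}
for name, (beta, se) in log_scale.items():
    print(f"{name}: ln(HR) = {beta:.3f}, SE ~ {se:.3f}")
```

The resulting log hazard ratios are close to the weights used in the PAINT score formula (for example, ln 11.20 ≈ 2.416 against a weight of 2.4174), which suggests, though the text does not say so explicitly, that the weights are the Cox coefficients.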

Table 3 Univariable and multivariable Cox regression models for progression from mild/moderate to severe disease

Fig. 1 Forest plot of significant factors in the Cox proportional hazards regression model. Shown in the figure are the HR and the 95% CI associated with the end point

Moreover, Kaplan–Meier survival curve analysis and the log-rank test showed significant differences in the survival curves of COVID-19 patients categorized by pulmonary disease, age, IgM, CD16+/CD56+ NK cells and AST (see Additional file 4: Fig. S4a–e).

Development of a predictive score for progression from moderate to severe disease

The predictors pulmonary disease, age, IgM, CD16+/CD56+ NK cells and AST were included in the development of a predictive score for progression of COVID-19 patients from mild/moderate to severe disease. The new predictive score (pulmonary disease, age, IgM, CD16+/CD56+ NK cell, AST; PAINT score) = (pulmonary disease) × 2.4174 + (age > 75) × 1.3594 + (IgM < 0.84) × 1.8399 + (CD16+/CD56+ NK cell < 116.5) × 1.2246 + (AST > 25) × 1.5182.
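The formula above is a weighted sum of five yes/no indicators, which can be sketched as follows. The weights and thresholds are copied from the formula; the function name, argument names and the example patient are hypothetical. Measurement units are not given in this section, so inputs are assumed to be in the same units as the stated thresholds.

```python
# Weights copied from the PAINT score formula above.
PAINT_WEIGHTS = {
    "pulmonary_disease": 2.4174,
    "age_over_75": 1.3594,
    "igm_below_0_84": 1.8399,
    "nk_below_116_5": 1.2246,
    "ast_above_25": 1.5182,
}


def paint_score(pulmonary_disease, age, igm, nk_cells, ast):
    """Weighted sum of the five binary PAINT indicators.

    Inputs are assumed to use the same units as the thresholds in the
    formula (units are not stated in this section).
    """
    indicators = {
        "pulmonary_disease": bool(pulmonary_disease),
        "age_over_75": age > 75,
        "igm_below_0_84": igm < 0.84,
        "nk_below_116_5": nk_cells < 116.5,
        "ast_above_25": ast > 25,
    }
    return sum(PAINT_WEIGHTS[key] * int(flag) for key, flag in indicators.items())


# Hypothetical patient: no pulmonary disease, all other indicators positive.
example = paint_score(pulmonary_disease=False, age=80, igm=0.70, nk_cells=100, ast=40)
max_score = sum(PAINT_WEIGHTS.values())
print(f"example score = {example:.4f}, maximum possible = {max_score:.4f}")
```

The maximum of this weighted sum is about 8.36, so the 14.687-point cut-off reported in the text is presumably expressed on the nomogram point scale (Additional file 5: Fig. S5) rather than on this sum; that reading is an inference, and the sketch therefore does not apply the cut-off.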

The points contributed by each variable are shown in Additional file 5: Fig. S5. To demonstrate the ability of the new predictive score to identify more severe patients for early clinical treatment, Kaplan–Meier survival curve analysis was used to find the best cut-off value; a value of 14.687 points was found to divide the patients into mild/moderate and progression to severe disease groups (p = 0.001, Fig. 2).

Fig. 2 Kaplan–Meier survival curve analysis of the PAINT score

We performed ROC analysis to evaluate the efficacy of the PAINT score model for predicting COVID-19 patients’ progression from mild/moderate to severe disease. We compared the PAINT score with the quick SOFA (qSOFA) and CURB-65 (confusion, uraemia, respiratory rate, blood pressure, age ≥ 65 years) scores. As demonstrated in Fig. 3, the C-index of the new predictive progression model for predicting progression from mild/moderate to severe disease was 0.902 ± 0.021, whereas the C-indexes of the qSOFA and CURB-65 scores were 0.534 ± 0.027 and 0.561 ± 0.058, respectively. We also compared the new predictive progression model with each of the 5 independent risk factors (pulmonary disease, age, IgM, CD16+/CD56+ NK cells and AST), for which the C-indexes were 0.5432 ± 0.034, 0.639 ± 0.052, 0.683 ± 0.044, 0.647 ± 0.050 and 0.716 ± 0.036, respectively (see Additional file 7: Table S2). Moreover, we evaluated the predictive value of the PNI and SII scores in our study population; the C-indexes were 0.814 ± 0.042 and 0.769 ± 0.039, respectively (see Additional file 7: Table S2). These findings suggested that the PAINT score might be suitable for predicting progression from mild/moderate to severe disease.

Fig. 3 ROC analysis was used to evaluate the efficacy of the PAINT score model for predicting COVID-19 patients’ progression from mild/moderate to severe disease. C-index values and the corresponding 95% CIs were estimated for each of the main study time points to assess the score’s discrimination ability. P values represent the statistical significance of the differences between the new score and the other prognostic score or factor

For internal validation of the ability of the new predictive progression model, we performed concordance index analysis to evaluate the discrimination of the PAINT score. Better discrimination was observed with our PAINT score than with the qSOFA and CURB-65 scores (Fig. 4a). Moreover, we performed 1000 bootstrap internal validations, and our new predictive PAINT score also showed better discrimination (Fig. 4b).

Fig. 4 For internal validation of the discriminability of the PAINT score model, we performed concordance index analysis (A) and 1000 bootstrap replicates (B)

Nomogram, calibration, decision curve and clinical impact curve for progression from mild/moderate to severe disease

In our study population, we used 5 variables (pulmonary disease, age, IgM, CD16+/CD56+ NK cells and AST) to predict 28-day progression from mild/moderate to severe disease. According to the principles of nomogram score construction, each variable is given different points and weights. The nomogram score is shown in Fig. 5a. We evaluated the score of each variable in turn according to its clinical characteristics and examination results and then summarized the score according to the total score of the 5 variables. Based on the total score, the probability of progression to severe COVID-19 can be determined. The calibration curves for 28-day progression were also well defined in the internal validation set (Fig. 5b). Nomogram and decision curve analyses also indicated good performance of the PAINT score (Fig. 5c). Clinical impact curves were used to assess the clinical usefulness of the risk prediction nomogram (Fig. 5d).

Fig. 5 Nomogram, calibration curve, decision curves and clinical impact curves for progression from mild/moderate to severe disease. a Nomogram. To use the nomogram, the value of an individual patient is located on each variable axis, and a line is drawn upward to determine the number of points received for each variable value. The sum of these numbers is located on the total point axis, and a line is drawn downward to the survival axes to determine the likelihood of 28-day progression to severe disease. b Calibration. The nomogram-predicted probability of nonsevere survival is plotted on the x-axis, and that of actual nonsevere survival is plotted on the y-axis. c Decision curve. The abscissa of this graph is the threshold probability, and the ordinate is the net benefit. d Clinical impact curve. The red curve (number of high-risk individuals) indicates the number of people who are classified as positive (high risk) by the model at each threshold probability; the blue curve (number of high-risk individuals with outcome) is the number of true positives at each threshold probability
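The quantities plotted in panels c and d can be computed directly from predicted probabilities and observed outcomes. The sketch below uses the standard decision-curve definition of net benefit, TP/n − (FP/n) × pt/(1 − pt) at threshold probability pt, together with the two counts shown in a clinical impact curve. It is a minimal illustration: the function names and the toy data are hypothetical and are not taken from the study.

```python
def classify_counts(y_true, y_prob, threshold):
    """Return (true positives, false positives) when p >= threshold is high risk."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp, fp


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of the model at one threshold probability (decision curve)."""
    n = len(y_true)
    tp, fp = classify_counts(y_true, y_prob, threshold)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy in which every patient is treated as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


def clinical_impact(y_true, y_prob, threshold):
    """Return (number classified high risk, number of those with the outcome)."""
    tp, fp = classify_counts(y_true, y_prob, threshold)
    return tp + fp, tp


# Hypothetical toy data: 1 = progressed to severe disease, 0 = did not.
toy_outcome = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
toy_prob = [0.9, 0.2, 0.1, 0.7, 0.4, 0.05, 0.3, 0.6, 0.15, 0.5]
nb_model = net_benefit(toy_outcome, toy_prob, 0.5)
nb_all = net_benefit_treat_all(toy_outcome, 0.5)
high_risk, high_risk_with_outcome = clinical_impact(toy_outcome, toy_prob, 0.5)
```

Evaluating these functions over a grid of thresholds gives the curves; a model is useful at a given threshold when its net benefit exceeds both the treat-all reference and zero (treat none).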

Discussion

Using univariable and multivariable Cox regression models, we identified five independent risk factors (pulmonary disease, age, IgM, CD16+/CD56+ NK cell and AST) for progression to severe COVID-19 in the present study. We developed a new predictive score, the PAINT score, for progression to severe disease and found that a value of 14.687 points divided the patients into mild/moderate and progression to severe disease groups. We also established a new nomogram score to predict 28-day progression from mild/moderate to severe disease. These results may be important to predict progression of moderate COVID-19 to severe disease and may be helpful in identifying cases of potential progression in a timely manner to improve the prognosis.

SARS-CoV-2 is a single-stranded RNA virus that infects cells through binding of its structural spike (S) protein to the angiotensin-converting enzyme 2 (ACE2) receptor [25]. The type 2 transmembrane serine protease (TMPRSS2) then cleaves ACE2 and activates the S protein, which promotes virus uptake and mediates SARS-CoV-2 entry into host cells [2, 25]. Both ACE2 and TMPRSS2 are expressed in host cells, particularly the alveolar epithelial type II cells of COVID-19 patients [20]. COVID-19 has various clinical manifestations, and the common symptoms in hospitalized patients include fever (70–90%), dry cough (60–86%), shortness of breath (53–80%), fatigue (38%), myalgias (15–44%), nausea/vomiting or diarrhoea (15–39%), headache, weakness (25%), and rhinorrhoea (7%) [2]. In a retrospective study of 548 patients with COVID-19 in China, patients with severe/critical or fatal disease presented with sputum and dyspnoea on admission much more often than survivors with mild/moderate disease [4]. Eighty-one percent of patients had mild manifestations, 14% had severe manifestations, and 5% had critical manifestations (defined as respiratory failure, septic shock, and/or multiple organ dysfunction). A study of 20,133 hospitalized patients in the UK reported that the most common major comorbidities were chronic cardiac disease (30.9%), diabetes (20.7%), chronic pulmonary disease excluding asthma (17.7%), and chronic kidney disease (16.2%) [26]. In our study population, the most common major comorbidity was pulmonary disease. Moreover, we found that pulmonary disease was an independent risk factor for progression to severe COVID-19. The limited sample size may be responsible for this difference from the previous UK report [26].

We used Cox regression methods to explore the risk factors related to progression to severe COVID-19. Risk factors related to progression have been reported in nonsevere COVID-19 patients, such as lymphocyte count, neutrophil count, CD4+ and CD8+ T cell counts, CRP, D-dimer, interleukin-6, interleukin-8, lactate dehydrogenase, age, dyspnoea on admission, and hypertension [10, 27,28,29]. In the present study, there were significant differences in lymphocytes, NLR, CRP, AST, Cr, CD3+ T cells, CD4+ T cells, CD8+ T cells, CD19+ B cells, and CD16+/CD56+ NK cells between mild/moderate COVID-19 cases with and without progression. We also found that pulmonary comorbidities had an impact on the risk of progression. In another study, four variables (comorbidity, dyspnoea on admission, lactate dehydrogenase and lymphocyte count) were included in a predictive model, and a total score of 6 points was used to divide patients into high-risk and low-risk groups [27]. Moreover, the risk factors for progression to severe illness in COVID-19 patients with cancer included not only the previously reported variables of older age, interleukin 6, procalcitonin, D-dimer, and lymphocytes but also tumour stage, tumour necrosis factor α, N-terminal pro-B-type natriuretic peptide, CD4+ T cells and albumin–globulin ratio [12, 13, 30]. Many reports have linked diabetes and obesity to more severe COVID-19 illness and worse outcomes [31]. In our study cohort, 10.5% of patients had diabetes mellitus (9.3% of patients without progression and 21.7% of patients with progression). However, the prevalence of diabetes did not differ significantly between patients with and without progression. This may be because of our sample size, and future studies with larger samples are needed.
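The diabetes comparison above can be reproduced approximately with the χ2 test with Yates’ correction described in the statistical analysis section. The counts used here (5 of 23 patients with progression and 20 of 216 without) are reconstructed from the reported percentages and group sizes and are therefore an assumption; the function name is hypothetical. For one degree of freedom, the p value is obtained from the complementary error function.

```python
import math


def yates_chi_square(a, b, c, d):
    """Chi-square test with Yates' continuity correction for a 2 x 2 table.

    Table layout:        exposed   not exposed
        group 1             a           b
        group 2             c           d
    Returns (chi-square statistic, two-sided p value for 1 degree of freedom).
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts reconstructed from the reported percentages (an assumption):
# progression: 5 with diabetes, 18 without; no progression: 20 with, 196 without.
chi2_diabetes, p_diabetes = yates_chi_square(5, 18, 20, 196)
print(f"chi-square = {chi2_diabetes:.2f}, p = {p_diabetes:.3f}")
```

With these reconstructed counts the p value is above 0.05, which agrees with the statement that diabetes did not differ significantly between the two groups.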

Many predictive score models for disease severity in COVID-19 patients have been reported, including the early warning score, National Early Warning Score 2, q-COVID score, prognostic nutritional index score, Brescia-COVID Respiratory Severity Scale score, systemic immune-inflammatory index score, COVID-GRAM score, etc. [18, 32,33,34]. However, these models do not incorporate immune cell information. To predict the risk of progression, we developed the new predictive PAINT score, which includes NK cells. Using a value of 14.687 points, we could divide the patients into mild/moderate and progression to severe disease groups, with a higher C-index (0.902 ± 0.021) than that obtained with the qSOFA (0.534 ± 0.027), CURB-65 (0.561 ± 0.058), PNI (0.814 ± 0.042) and SII (0.769 ± 0.039) scores. The internal validation of discrimination and 1000 bootstrap replicates showed the good ability of our new predictive progression model. Moreover, other risk models for COVID-19 mortality have been reported and evaluated, such as APACHE II, SIRS, SOFA, qSOFA, COVID-19 score, COVID-PIRO score, the COVID-19 Risk of Complications Score, Pneumonia Severity Index, etc. [35,36,37,38]. The specification and validation of COVID-19 scoring systems should be verified in large real-world cohort studies in the future.

The immune response to SARS-CoV-2 is key for the control and resolution of COVID-19 infection. T cells also play important roles in the immune response to SARS-CoV-2 infection. Lymphocytopenia was found to be one of the most common features in laboratory tests of COVID-19 patients, and reduced CD4+ and CD8+ T cell counts were predictive of disease progression [10, 11]. In addition to decreased levels of CD3+/CD4+ T lymphocytes, CD3+/CD8+ T lymphocytes and CD19+ B lymphocytes, CD16+/CD56+ NK cells were also decreased in the peripheral blood of COVID-19 patients, and these cells may play critical roles in the inflammatory cytokine storm [21]. NK immunotypes are related to COVID-19 disease severity, and high expression of perforin, NKG2C, and Ksp37 in NK cells may reflect the increased presence of adaptive NK cells in the circulation of patients with severe disease [19]. This may be the mechanism of NK cell activation in COVID-19 and the potential role of NK cells in host protection and immunopathology [19]. Compared with mild cases, significantly lower levels of immune cells, including CD3+ T cells, CD4+ T cells, CD8+ T cells, B cells and NK cells, were found in severe cases [39,40,41,42]. In our cohort, CD3+ T cells, CD4+ T cells, CD8+ T cells, CD19+ B cells, and CD16+/CD56+ NK cells showed significant differences between mild/moderate COVID-19 cases with and without progression. Moreover, CD16+/CD56+ NK cells were also an independent risk factor for progression from moderate to severe disease. In a multicentre study, Benjamin Kramer et al. [42] reported that NK cell dysfunction not only affects antiviral immune responses but may also be related to the development of fibrotic lung disease in severe COVID-19 cases. In terms of pathologic mechanism, untimely early production of TGFβ and associated NK cell dysfunction is a hallmark of severe COVID-19 [43]. TGFβ-mediated impairment of NK cell function may reduce virus control and be detrimental in severe COVID-19 cases [43]. A detailed map of the NK cell activation landscape in COVID-19 disease might be a meaningful indicator of progression.

Our study has several limitations. First, the study population only included patients from Renmin Hospital of Wuhan University (Central China region) and West China Hospital of Sichuan University (Southwest China region). The study sample size was relatively small, so the findings must still be considered preliminary. We plan to apply the PAINT score in future studies to further validate its predictive value. Second, the data were obtained from the electronic medical databases of the two hospitals. Some cases had incomplete records for the exposure history and laboratory examinations, and some patients were diagnosed in the outpatient department, with incomplete medical records and laboratory testing that was only briefly documented. Third, many patients remained in the hospital, and the outcomes were unknown at the time of data collection. Fourth, detailed follow-up information was not included in our study results. Therefore, bias may have affected our assessment. Further evaluation may be needed to validate our predictive model.

Conclusion

In conclusion, pulmonary disease, age, IgM, CD16+/CD56+ NK cells and AST were independent predictors of progression for patients with COVID-19 in the present study. A predictive model for progression to severe COVID-19 based on the PAINT score might be helpful to identify patients at risk of progression. Moreover, more intensive surveillance and appropriate therapy should be considered in patients at high risk of progression to improve their prognosis in clinical practice. Future studies with larger numbers of patients will be useful for updating and validating this PAINT score to improve identification of patients at risk of progression to severe COVID-19.