Introduction

Lung cancer is the leading cause of cancer deaths worldwide1,2. The imaging evaluation of a solitary pulmonary nodule (SPN) is complex; however, patient outcomes can be improved by early detection and prompt treatment3,4,5,6,7,8. The 2011 American National Lung Screening Trial (NLST) showed that CT screening for lung cancer reduced mortality by 20% compared with chest X-rays9. Although plain CT is prominent in lung nodule detection, it is limited in differentiating benign from malignant nodules10,11,12. To facilitate timely and personalized patient treatment, it is crucial to accurately characterize the nature of lung lesions. Tissue biopsy is an invasive procedure and is particularly difficult for smaller nodules or those in hard-to-reach areas of the lung13,14. PET-CT plays a crucial role in the diagnosis of pulmonary nodules. Nevertheless, it is associated with a notable incidence of false negatives, for example lung adenocarcinoma presenting as subsolid nodules, and of false positives, for example lesions that pathology shows to be inflammatory pseudotumors or tuberculosis. In addition, PET-CT requires expensive equipment and increases patients' financial burdens. In contrast to PET-CT, contrast-enhanced CT is relatively low-cost and remains the primary preoperative examination for most patients with lung nodules in developing countries. Contrast-enhanced CT helps to highlight blood vessels and other structures, making it easier to identify abnormalities such as tumors15. The contrast enhancement of a lung nodule on CT reflects its blood supply. A region without enhancement is strongly predictive of a benign hypovascular lesion, whereas a region with a rich blood supply may reflect underlying angiogenesis and indicate malignancy16,17,18. Thus, a model for diagnosing lung nodules using contrast-enhanced CT is needed.

A study of contrast-enhanced CT reported high sensitivity (98%) for differentiating benign from malignant nodules when an enhancement of 15 HU was used as the cut-off value; however, the specificity for malignancy was only 50–60%17. These results indicate that the enhancement value alone is not sufficient to differentiate benign from malignant nodules effectively. In clinical practice, radiologists comprehensively consider imaging features such as nodule size, margin, and location, and especially the heterogeneity of enhancement. This suggests that a model should incorporate more features rather than a single enhancement value.

This study aims to establish and validate two models, based on plain CT and on contrast-enhanced CT, to predict the malignancy of solitary solid pulmonary nodules in a dual-center investigation. By comparing the diagnostic performance of the two models, the study seeks to identify the optimal preoperative diagnostic approach for solitary solid pulmonary nodules.

Materials and methods

Ethics approval

The study was conducted in compliance with the principles outlined in the Declaration of Helsinki and received approval from the Ethics Committees of the Tianjin Medical University Cancer Institute and Hospital (TMUCIH) and the Second Hospital of Shanxi Medical University (SHSMU) (No. bc2021327). All procedures adhered to pertinent guidelines and regulations. Informed consent was obtained from all participants.

This study was a retrospective analysis of data from two centers. Between January 2012 and July 2021, 392 patients with pathologically confirmed solitary pulmonary nodules were recruited from center A, and 135 patients from center B. Both centers applied the same inclusion and exclusion criteria. Patients were included if they had a primary solitary solid lung nodule with a diameter of less than 30 mm on CT, had received both plain CT and contrast-enhanced CT within one month before surgery, had a clear histologic type as indicated by the postoperative pathology report, and had no metastasis. Patients were excluded for any of four conditions: (1) a solitary solid lung nodule with a diameter of 30 mm or greater, (2) preoperative therapy such as neoadjuvant chemotherapy or radiotherapy, (3) an unclear pathology result, and (4) unavailable contrast-enhanced CT images.

Clinical and radiological data

We retrospectively collected and analyzed radiological data from the two hospitals. A total of 392 patients at Center A and 135 patients at Center B met the inclusion and exclusion criteria and were enrolled. Preoperative contrast-enhanced and plain CT scans were acquired using the Siemens SOMATOM Sensation 64, GE Discovery CT 750 HD, GE Revolution, and Philips IQon systems. RadiAnt DICOM Viewer (version 2021.1) was used for image evaluation, with a lung window setting (width, 1450 HU; level, -500 HU) and a mediastinal window setting (width, 350 HU; level, -500 HU). Two radiologists (C.X.N., with 10 years of experience, and Z.W.J., with 9 years of experience) assessed the CT characteristics, and discrepancies between their readings were resolved by mutual consultation. The following characteristics were recorded: two clinical characteristics, (1) age and (2) gender; fifteen plain CT characteristics, (3) diameter; (4) nodule location_1 (left upper, left lower, right upper, right middle, right lower); (5) nodule location_2 (peripheral, central); (6) shape (round/oval, irregular); (7) margin (smooth, lobulated, spiculated); (8) calcification; (9) fat; (10) necrosis; (11) cavitation; (12) air bronchogram; (13) pleural indentation; (14) vascular invasion; (15) post-obstructive pneumonia; (16) satellite nodules; (17) plain CT value; and four enhanced CT characteristics, (18) subjective enhancement (uniform, heterogeneous, no); (19) enhanced CT value; (20) enhancement difference; (21) enhancement rate (enhancement difference / plain CT value).
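As an illustration of the two derived enhancement characteristics, the sketch below computes them from a pair of mean CT values. The enhancement rate follows the definition given in the list above; the enhancement difference is assumed here to be the enhanced CT value minus the plain CT value, which the text does not state explicitly. The function name `enhancement_features` is hypothetical, and the inputs are the mean densities reported for the two example cases in the figure legends.

```python
from typing import Tuple


def enhancement_features(plain_hu: float, enhanced_hu: float) -> Tuple[float, float]:
    """Return (enhancement difference in HU, enhancement rate)."""
    # Assumption: difference = enhanced CT value minus plain CT value.
    difference = enhanced_hu - plain_hu
    if plain_hu == 0:
        raise ValueError("enhancement rate is undefined for a plain CT value of 0 HU")
    # Rate as defined in the text: enhancement difference / plain CT value.
    rate = difference / plain_hu
    return difference, rate


# Mean densities of the malignant and benign example cases (plain, enhanced).
malignant_case = enhancement_features(plain_hu=41, enhanced_hu=64)
benign_case = enhancement_features(plain_hu=36, enhanced_hu=100)
print(malignant_case, benign_case)
```

Note that the benign example enhances more strongly than the malignant one, which is consistent with the argument that a single enhancement value is not sufficient on its own.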

Computed tomography examination

CT examinations were performed at both centers, using the GE Discovery CT 750 HD and Siemens Somatom Sensation 64 CT systems at Center-A, and the GE Revolution CT, GE Discovery CT 750 HD, and Philips IQon spectral CT systems at Center-B. Each examination consisted of acquisitions both with and without iodine contrast. Inspiratory scans were performed with the patient in the supine position, from the apex to the base of the lungs. The scanning protocol was as follows. At Center-A, the tube voltage was 120 kVp with automatic tube current modulation. The iodine contrast agent Visipaque (iodixanol, 270 mg/mL) was administered at a dose of 1.5 mL/kg and an injection rate of 2.5 mL/s, and scanning was performed 70 s after the start of the injection. At Center-B, the tube voltage was 120 kVp with automatic tube current modulation. The iodine contrast agent Visipaque (iodixanol, 320 mg/mL) was administered at a volume of 60 mL and an injection rate of 3.5 mL/s, and scanning was performed 60 s after the start of the injection. At both centers, contrast agents were administered intravenously through the upper extremity. The pitch, acquired slice thickness, and reconstructed slice thickness varied among the GE, Siemens, and Philips CT systems. Specifically, the GE CT system had a pitch of 0.984 and an acquired and reconstructed slice thickness of 1.25 mm, the Siemens CT system had a pitch of 0.95 and an acquired and reconstructed slice thickness of 1.5 mm, and the Philips CT system had a pitch of 1.23 and an acquired and reconstructed slice thickness of 1.0 mm.

Model construction and evaluation

Our two models for lung nodule malignancy classification were built using 392 nodules from Center-A as the training cohort and 135 nodules from Center-B as the external validation cohort. Clinical and CT characteristics were normalized using a feature standardization method19. Using the training cohort, model 1 (plain CT only) was built by logistic regression from the two clinical and fifteen plain CT characteristics (P < 0.05 in univariable analysis). Model 2 (plain + contrast CT) was built from the two clinical, fifteen plain CT, and four enhanced CT characteristics. The classification performance of the two models was then verified in the external validation cohort.
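The workflow described above (standardize features, fit a logistic regression on the training cohort, apply it unchanged to the external cohort) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the z-score form of standardization, the gradient-descent fitting routine, all function names, and the toy data are hypothetical, and the univariable selection step is omitted.

```python
import math
from statistics import mean, pstdev


def fit_standardizer(rows):
    """Learn column means and standard deviations from the training cohort only."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant columns fall back to 1.0
    return mus, sds


def standardize(rows, mus, sds):
    """Apply z-score standardization with previously learned statistics."""
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the logistic log-loss; returns (intercept, weights)."""
    n, p = len(X), len(X[0])
    b, w = 0.0, [0.0] * p
    for _ in range(epochs):
        grad_b, grad_w = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return b, w


def predict_proba(X, b, w):
    """Predicted probability of malignancy for each standardized row."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


# Toy, made-up rows: [diameter in mm, enhancement difference in HU]; 1 = malignant.
train_X = [[10, 5], [12, 8], [14, 10], [22, 30], [25, 35], [27, 40]]
train_y = [0, 0, 0, 1, 1, 1]
valid_X = [[11, 6], [26, 38]]

mus, sds = fit_standardizer(train_X)
b, w = fit_logistic(standardize(train_X, mus, sds), train_y)
valid_p = predict_proba(standardize(valid_X, mus, sds), b, w)
print(valid_p)
```

The external cohort is standardized with the training cohort's means and standard deviations, so that no information from the validation data leaks into the model.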

Statistical analysis

Student's t-test was used to compare continuous data, which were expressed as means with standard deviations (SD); categorical data were presented as percentages (%). The Kolmogorov–Smirnov test was employed to compare non-normally distributed continuous variables, which were presented as medians with interquartile ranges (IQR). A two-sided p-value of less than 0.05 was considered statistically significant. Model performance in the two cohorts was assessed using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). SPSS software version 20.0 (IBM Corp.) was used for the statistical analyses.
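For reference, the performance metrics listed above can be computed from true labels and predicted probabilities as in the sketch below. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions for illustration; the study's analyses were run in SPSS, and the threshold it used is not stated.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count true/false positives and negatives at a given probability threshold."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Standard definitions; assumes every denominator is non-zero."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc(y_true, y_prob):
    """AUC as the chance that a malignant case outscores a benign one (ties count 0.5)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up labels (1 = malignant) and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1]

counts = confusion_counts(y_true, y_prob)
metrics = classification_metrics(*counts)
area = auc(y_true, y_prob)
print(counts, metrics, area)
```

The pairwise AUC used here is equivalent to the area under the ROC curve and is adequate for cohorts of a few hundred cases.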

Results

Participants

A total of 527 preoperative patients with a solitary pulmonary solid nodule were gathered from two independent institutions in China, with a median size of 20.0 (IQR, 15.0–24.0) mm and a mean age of 55.9 (SD, 9.8) years, of which 52.9% were male. The training cohort comprised 392 patients from Center-A, with a median size of 20.0 (IQR, 15.0–24.0) mm, a mean age of 55.8 (SD, 9.9) years, and a male representation of 53.3%. The external validation cohort consisted of 135 patients from Center-B, with a median size of 20.0 (IQR, 16.0–24.0) mm, a mean age of 56.4 (SD, 9.6) years, and 51.9% male. All clinical and CT characteristics are summarized in Table 1.

Table 1 Baseline characteristics of patients in the cohorts.

The development of two models for predicting pulmonary nodule malignancy

The training cohort comprised 392 nodules: 153 malignant (adenocarcinoma, 115 [75.2%]; squamous cell carcinoma, 23 [15.0%]; large cell carcinoma, 15 [9.8%]) and 239 benign (pulmonary hamartoma, 87 [36.4%]; sclerosing pneumocytoma, 27 [11.3%]; tuberculosis, 74 [31.0%]; inflammatory pseudonodule, 51 [21.3%]). To compare the diagnostic performance of plain CT and enhanced CT for solid pulmonary nodules, we constructed two logistic regression models, one without and one with contrast-enhanced CT characteristics. In the plain CT model, a total of twenty-one variables were generated, with vascular invasion (1.0877) and pleural indentation (0.5985) being the two most important variables for predicting malignancy, and fat (−0.9334) and smooth margin (−0.7732) being the two most important variables for predicting benignity. In the contrast-enhanced CT model, twenty-five variables were generated, with heterogeneous enhancement (1.8129) and vascular invasion (0.9249) being the two most important variables for predicting malignancy, and fat (−0.9425) and round/oval shape (−0.7041) being the two most important variables for predicting benignity. Detailed information on the variables of the two models is shown in Tables 2 and 3.
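One way to read the weights quoted above is to convert them to odds ratios. The sketch below does so for the four model 2 values given in the text, under the assumption that the reported relative weights are logistic regression coefficients on standardized features. Under that assumption, each odds ratio describes the change in odds per one standard deviation of the feature rather than per raw unit. The dictionary and function names are hypothetical.

```python
import math

# Weights quoted in the text for model 2 (plain + contrast CT).
# Assumption: these are logistic regression coefficients on standardized features.
model2_coefficients = {
    "heterogeneous enhancement": 1.8129,
    "vascular invasion": 0.9249,
    "fat": -0.9425,
    "round/oval shape": -0.7041,
}


def odds_ratio(coefficient: float) -> float:
    """Odds ratio for a one-unit increase in the (standardized) feature."""
    return math.exp(coefficient)


odds_ratios = {name: odds_ratio(beta) for name, beta in model2_coefficients.items()}

# Print from the strongest malignancy predictor to the strongest benignity predictor.
for name, value in sorted(odds_ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.2f}")
```

Values above 1 push the prediction toward malignancy and values below 1 toward benignity, which matches the sign pattern described in the paragraph.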

Table 2 Relative weights of the CT characteristics in model 1 (plain CT only) for predicting malignancy.
Table 3 Relative weights of the CT characteristics in model 2 (plain + enhanced CT) for predicting malignancy.

Classification performance of the two models

In the training cohort, model 1 (plain CT only) demonstrated a mean sensitivity, specificity, accuracy, PPV, NPV, and AUC of 0.85, 0.84, 0.84, 0.77, 0.90, and 0.92 (95%CI, 0.89–0.95), respectively. Model 2 (plain + contrast CT) exhibited a mean sensitivity, specificity, accuracy, PPV, NPV, and AUC of 0.91, 0.87, 0.88, 0.81, 0.94, and 0.95 (95%CI, 0.93–0.97), respectively. In the external validation cohort, the mean sensitivity, specificity, accuracy, PPV, NPV, and AUC of model 1 (plain CT only) were 0.79, 0.78, 0.79, 0.67, 0.87, and 0.88 (95%CI, 0.82–0.93), respectively, and those of model 2 (plain + contrast CT) were 0.88, 0.91, 0.90, 0.84, 0.93, and 0.93 (95%CI, 0.88–0.98), respectively. The detailed prediction performance of model 1 and model 2 in the two cohorts is shown in Table 4. The ROC curves for the two models in the two cohorts are shown in Fig. 1. Because model 2 (plain + contrast CT) showed the highest diagnostic performance, we expressed it as a logistic function; the equation is given in Supplementary Materials 1. Example cases are shown in Figs. 2 and 3.
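For orientation, a logistic regression model of this kind has the generic form below. This is the standard textbook expression, not the fitted equation from Supplementary Materials 1: the symbols $\beta_0$ (intercept), $\beta_i$ (weight of the $i$-th variable), $x_i$ (standardized value of the $i$-th variable), and $k$ (number of variables, twenty-five for model 2) are notation assumed here.

```latex
P(\text{malignant} \mid x_1, \dots, x_k)
  = \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \sum_{i=1}^{k} \beta_i x_i\right)\right)}
```

The output lies between 0 and 1 and corresponds to the malignancy score reported as a percentage for the example cases.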

Table 4 Classification performance of plain CT-based model and plain & enhanced CT-based model for nodule malignancy in the two cohorts.
Fig. 1

The ROC curves of model 1 and model 2 in the two cohorts.

Fig. 2

Example of a malignant case. A 58-year-old female patient with a 25 mm peripheral, irregular, lobulated lesion with cavitation and pleural indentation, located in the left lower lobe. The mean density of the nodule was 41 HU on plain CT and 64 HU after contrast (heterogeneous enhancement). Model malignancy classification scores: Malignant (73.52%); Benign (26.48%). Histology: Adenocarcinoma.

Fig. 3

Example of a benign case. A 50-year-old female patient with a 16 mm peripheral, round, smooth lesion located in the right lower lobe. The mean density of the nodule was 36 HU on plain CT and 100 HU after contrast (uniform enhancement). Model malignancy classification scores: Malignant (0.08%); Benign (99.92%). Histology: Sclerosing pneumocytoma.

Discussion

Our dual-center study demonstrated that, for solitary solid pulmonary nodules, model 2 (plain + contrast CT), with twenty-five CT variables including four contrast-enhanced variables (enhanced CT value, enhancement rate, uniform enhancement, heterogeneous enhancement), had better prediction performance in the external validation cohort (AUC 0.93 [95%CI, 0.88–0.98]) than model 1 (plain CT only) (AUC 0.88 [95%CI, 0.82–0.93]).

Various models, including the Mayo Clinic model, the Veterans Affairs (VA) model, and the Brock model (PanCan model), have been developed using clinical and CT characteristics to assess the malignancy of lung nodules20,21,22. The Mayo Clinic model identified age, smoking history, cancer history, nodule diameter, spiculation, and upper lobe location as predictors of malignant nodules20. The Brock model was developed to detect malignancy in nodules found on low-dose CT screening, using predictors such as age, sex, family history of lung cancer, nodule location, emphysema, nodule size, and spiculation21. The VA model used logistic regression and was designed specifically for solitary nodules, estimating the likelihood of malignancy from factors such as age, nodule diameter, smoking history, and time since quitting smoking22. However, prior research has demonstrated that while these models perform strongly on their respective datasets, their utility for detecting large lung nodules is limited, so their characteristics need to be optimized before clinical application23,24,25,26,27. Our models included similar relevant variables for predicting nodule malignancy, such as nodule location, nodule diameter, shape, age, and gender. Furthermore, our model incorporates a greater number of semantic features, such as air bronchogram, pleural indentation, vascular invasion, post-obstructive pneumonia, cavitation, necrosis, calcification, satellite nodules, and fat, as well as enhancement characteristics such as enhanced CT value, enhancement rate, uniform enhancement, and heterogeneous enhancement. The importance of semantic features has already been demonstrated by a previous study28. Xiang et al. showed that six radiological characteristics (diameter, lobulation, calcification, spiculation, pleural indentation, and vascular invasion) were important predictors in their SVM model for the diagnosis of solid solitary pulmonary nodules, which achieved an AUC of 0.89. Our model 2 (plain + contrast CT) showed a higher AUC of 0.93, which may be because we included more semantic features and enhancement characteristics (enhanced CT value, enhancement rate, uniform enhancement, heterogeneous enhancement).

The significance of the CT enhancement level in determining malignancy in solid lung nodules has been established18,29. A lack of significant enhancement on contrast-enhanced CT (< 15 HU) is indicative of a benign nodule. Consequently, contrast-enhanced CT has been widely used as the primary imaging examination before surgery, particularly in less developed nations12. In our study, the logistic regression model based on plain CT only predicted malignancy of solitary solid pulmonary nodules with a sensitivity of 0.85, specificity of 0.84, and diagnostic accuracy of 0.84 in the training cohort and 0.79, 0.78, and 0.79 in the external validation cohort. Adding contrast-enhanced CT features to the model improved the diagnostic performance, with a sensitivity of 0.91, specificity of 0.87, and diagnostic accuracy of 0.88 in the training cohort and 0.88, 0.91, and 0.90 in the external validation cohort. One study reported that the diagnostic sensitivity, specificity, and accuracy of radiologists were approximately 0.76, 0.73, and 0.88 in a dedicated cancer hospital in China24. Compared with the subjective experience of radiologists, our model 2 therefore had a sensitivity 12 percentage points higher and a specificity 18 percentage points higher in the external validation cohort (0.88 vs. 0.76 and 0.91 vs. 0.73). This can reduce missed diagnoses of lung cancer and unnecessary surgery caused by misdiagnosis. It also suggests that enhanced CT could be the basis for the preoperative diagnosis of solitary solid pulmonary nodules, especially when preoperative biopsy and PET-CT are not applicable. At the same time, the use of iodinated contrast agents increases the risk of allergic reactions and contrast-induced nephropathy. Developing safer and more lung cancer-specific contrast agents is one direction for improving the application of enhanced CT in the future.

The present study has several limitations. Firstly, model 2 (plain + contrast CT) includes only basic clinical information such as age and gender, while other factors such as smoking history, cancer history, and family history of cancer are worth considering for inclusion. Secondly, the enhanced CT values used in the model are affected by scanning parameters, contrast agent concentration, and scanning time. Therefore, the differences in enhancement characteristics between different CT equipment need to be studied in the future. Thirdly, the study data comprised clinical patients only, so the efficacy of the model in lung cancer screening populations requires further verification. Finally, the model was evaluated on only two datasets, and additional validation at other centers is necessary before clinical application.

To conclude, a logistic regression model built from plain and contrast-enhanced CT characteristics assessed malignancy in solitary solid lung nodules better than a model based on plain CT alone. Model 2 (plain + contrast CT) can help radiologists make recommendations on follow-up or surgical intervention for preoperative patients presenting with solid lung nodules.