Individualized prediction of non-sentinel lymph node metastasis in Chinese breast cancer patients with ≥ 3 positive sentinel lymph nodes based on machine-learning algorithms

Xie, Xiangli; Fang, Yutong; He, Lifang; Chen, Zexiao; Chen, Chunfa; Zeng, Huancheng; Chen, Bingfeng; Huang, Guangsheng; Guo, Cuiping; Zhang, Qunchen; Wu, Jundong

doi:10.1186/s12885-024-12870-x


Research
Open access
Published: 02 September 2024

Volume 24, article number 1090, (2024)

BMC Cancer




Abstract

Background

Axillary lymph node dissection (ALND) is a standard procedure for early-stage breast cancer (BC) patients with three or more positive sentinel lymph nodes (SLNs). However, ALND can lead to significant postoperative complications without always providing additional clinical benefits. This study aims to develop machine-learning (ML) models to predict non-sentinel lymph node (non-SLN) metastasis in Chinese BC patients with three or more positive SLNs, potentially allowing the omission of ALND.

Methods

Data from 2217 BC patients who underwent SLN biopsy at Shantou University Medical College were analyzed, with 634 having positive SLNs. Patients were categorized into those with ≤ 2 positive SLNs and those with ≥ 3 positive SLNs. We applied nine ML algorithms to predict non-SLN metastasis. Model performance was evaluated using ROC curves, precision-recall curves, and calibration curves. Decision Curve Analysis (DCA) assessed the clinical utility of the models.

Results

The RF model showed superior predictive performance, achieving an AUC of 0.987 in the training set and 0.828 in the validation set. Key predictive features included size of positive SLNs, tumor size, number of SLNs, and ER status. In external validation, the RF model achieved an AUC of 0.870, demonstrating robust predictive capabilities.

Conclusion

The developed RF model accurately predicts non-SLN metastasis in BC patients with ≥ 3 positive SLNs, suggesting that ALND might be avoided in selected patients by applying additional axillary radiotherapy. This approach could reduce the incidence of postoperative complications and improve patient quality of life. Further validation in prospective clinical trials is warranted.

Introduction

At present, for patients with early-stage breast cancer (BC) and clinically negative axillary lymph nodes (ALNs), axillary sentinel lymph node biopsy (SLNB), as opposed to axillary lymph node dissection (ALND), demonstrates no significant difference in local disease control, disease-free survival (DFS), and overall survival (OS). However, SLNB can significantly mitigate numbness, sensory loss, shoulder joint dysfunction, and the incidence of upper limb lymphedema associated with ALND, as confirmed by several international multi-center clinical trials [1,2,3].

For patients with clinically negative ALNs and a low metastatic burden in the sentinel lymph nodes (SLNs), defined as ≤ 2 positive SLNs, evidence from multiple clinical trials, including ACOSOG Z0011, IBCSG 23–01, AMAROS, and OTOASOR, indicates that ALND can be safely avoided when high-tangential whole-breast irradiation (WBI) is added after breast-conserving surgery or when additional axillary regional nodal irradiation (RNI) is included after total mastectomy, without affecting regional recurrence rates and OS. This approach has been widely accepted and applied in clinical practice to reduce the incidence of complications such as lymphedema caused by ALND [4,5,6,7,8]. However, ALND is still performed in some breast centers under these circumstances. Studies show that only about 23–34% of these patients have non-SLN metastasis, meaning that for the remaining 66–77%, ALND provides no benefit while increasing complications when combined with RNI [4,5,6,7]. As a result, several models have been developed to predict non-SLN metastasis status from SLN and clinicopathological features. Patients accurately predicted as non-SLN negative by these models could potentially even be spared RNI. These prediction models have been validated in clinical applications across multiple centers [9,10,11,12,13,14].

However, in patients with early-stage BC and clinically negative ALNs who have ≥ 3 SLNs with metastasis, the incidence of non-SLN metastasis is considered to be significantly increased. Current guidelines and clinical practice recommend ALND in these cases, although real-world data from dedicated studies of these patients are limited. Subgroup data reported in studies show that patients with ≥ 3 positive SLNs account for about 10% of SLN-positive cases, and that in more than 30% of them the non-SLNs show no metastasis at ALND [15]. For these patients, ALND neither alters the postoperative treatment plan nor provides additional benefits, suggesting that further ALND can be omitted. Alternatively, WBI and RNI without ALND may also reduce the complications affecting the upper limb. It remains clinically significant to establish a prediction model for patients with ≥ 3 positive SLNs to predict the metastasis status of non-SLNs as a means to exclude the necessity of ALND and to evaluate prognosis. The MonarchE trial demonstrated that early-stage patients with hormone receptor-positive, human epidermal growth factor receptor 2 (HER2)-negative status, having ≥ 4 positive LNs or 1–3 positive LNs alongside other high-risk factors, experience sustained survival benefits from abemaciclib combined with standard adjuvant endocrine therapy compared to endocrine therapy alone [16]. Machine learning (ML) is an emerging field in medicine, encompassing a robust set of algorithms designed for data representation, adaptation, learning, prediction, and analysis. To date, these algorithms have not been employed to construct predictive models for non-SLN metastasis.

In this study, we examined DFS and OS, and performed univariate and multivariate Cox regression analyses in patients with ≥ 3 positive SLNs compared to those with ≤ 2 positive SLNs. We utilized ML algorithms to develop predictive models for non-SLN metastasis in patients with three or more positive SLNs and assessed the feasibility of adding WBI or RNI without the need for ALND.

Patients and methods

Patients

This was a retrospective study. From January 2010 to January 2023, a total of 2217 consecutive female patients diagnosed with primary invasive BC underwent SLNB at the Breast Center, Cancer Hospital of Shantou University Medical College (CHSU). Among these patients, 634 were found to have positive SLNs and met the following inclusion criteria: (1) Negative clinical and imaging examinations, or negative histopathological results for suspicious ALNs on hollow-core needle biopsy, with tumors staged as cT1-3N0M0 according to the eighth edition of the American Joint Committee on Cancer (AJCC) staging manual. (2) No prior neoadjuvant therapy. (3) Positive SLNs, including tumor micrometastases or macrometastases, identified after SLNB. (4) SLNB performed by an experienced surgical team. (5) Patients accepted further ALND. (6) Patients had no history of previous malignancy. (7) Complete follow-up data. Patients were excluded if they met any of the following criteria: (1) BC in situ. (2) Stage IV BC. (3) Isolated tumor cells (ITC) in SLNs. (4) Necessary clinical information unavailable. Patients were divided into the ≤ 2 positive SLN group and the ≥ 3 positive SLN group according to the number of positive SLNs for subsequent analyses. Additionally, we recruited 42 patients who met the aforementioned inclusion criteria and had ≥ 3 positive SLNs from Jieyang People's Hospital (JPH) as the external validation cohort. This study was approved by the Ethics Committees of CHSU (No. 2024038) and JPH (No. 2024054), and was conducted in accordance with the 1964 Helsinki Declaration and its subsequent amendments, or comparable ethical standards. Our Ethics Committees granted a waiver of informed consent.

Surgery and pathology

SLNB was performed using methylene blue (MB) injection (Jumpcan Pharmaceutical Group Co., Ltd., Jiangsu, China) and indocyanine green (ICG) solution (Dandong Yichuang Pharmaceutical Co., Ltd., Jilin, China). First, 2 mL MB was injected subcutaneously into the periareolar area near the outer upper quadrant, and 5 min later 1 mL ICG solution (0.5 mg/mL) was injected subcutaneously in the same area. Then, the fluorescence detector (Mingde Pharmaceutical Co., Ltd., Jiangsu, China) was used to trace the lymphatic vessels, and the point where the fluorescence disappeared was marked as the SLNB incision site. Palpable and/or fluorescent lymph nodes (ICG positive) and/or blue-stained lymph nodes (MB positive) were excised as SLNs. SLN metastasis was diagnosed by intraoperative frozen section or by postoperative paraffin section. If macrometastasis or micrometastasis was found in more than 2 SLNs, or in 1 to 2 SLNs in patients who were not willing to receive additional axillary RNI, we routinely performed level I or II ALND. If lymph nodes in level II displayed metastases, we also performed level III ALND. After the operation, all specimens were paraffin-embedded for immunohistochemistry.

Patient clinicopathological characteristics

The clinicopathological variables included age, tumor location, tumor size, multifocality, histological type, lymphovascular invasion, extracapsular extension (ECE), histological grade, estrogen receptor (ER), progesterone receptor (PR), HER2 status, Ki-67, molecular subtype, number of SLNs, number of negative SLNs, number of positive SLNs, size of the SLN metastasis, surgery, chemotherapy, radiotherapy, and endocrinotherapy. ER and PR were judged as positive if ≥ 1% of tumor cells showed nuclear staining. HER2-positive status was defined as a 3+ score by immunohistochemistry or HER2 gene amplification by fluorescent in situ hybridization (FISH). The Ki-67 assay followed the 2011 'International Ki-67 in BC Working Group Recommendations': a level ≤ 14% was considered low expression and a level > 14% was considered high expression. Macrometastasis was defined as metastatic lesions larger than 2 mm in diameter, and micrometastasis was defined as metastatic lesions larger than 0.2 mm and no larger than 2.0 mm in diameter or more than 200 tumor cells in the slice.
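The cutoffs above can be written down as small rule functions. The following is a minimal sketch, not the authors' code; the function names are hypothetical, and the "isolated tumor cells" fallback for deposits at or below both cutoffs is an assumption based on the exclusion criteria rather than a definition given in this section.

```python
def classify_sln_deposit(size_mm, tumor_cells=None):
    """Classify a nodal deposit using the size and cell-count cutoffs in the text."""
    if size_mm > 2.0:
        return "macrometastasis"
    # > 0.2 mm and <= 2.0 mm, or more than 200 tumor cells in the slice
    if size_mm > 0.2 or (tumor_cells is not None and tumor_cells > 200):
        return "micrometastasis"
    # Assumption: anything below both cutoffs is treated as ITC (excluded from the study)
    return "isolated tumor cells"


def receptor_positive(percent_nuclear_staining):
    """ER/PR are positive when >= 1% of tumor cells show nuclear staining."""
    return percent_nuclear_staining >= 1.0


def her2_positive(ihc_score, fish_amplified=False):
    """HER2 is positive with an IHC score of 3+ or gene amplification on FISH."""
    return ihc_score == 3 or fish_amplified


def ki67_level(percent):
    """Ki-67 <= 14% is low expression; > 14% is high expression."""
    return "high" if percent > 14.0 else "low"
```

For example, `classify_sln_deposit(1.5)` falls in the micrometastasis band, while `ki67_level(14)` is still "low" because the cutoff is inclusive on the low side.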

Survival analysis

We utilized the Kaplan–Meier (K-M) method to illustrate survival curves between the two cohorts. Univariate and multivariate Cox regression analyses were performed to ascertain independent prognostic factors for survival. The study endpoints encompassed OS and DFS. The survival interval was delineated as the duration from the date of BC diagnosis to the date of disease progression or recurrence, death, or the last follow-up.
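As an illustration of the K-M method named above, the product-limit estimate can be sketched in a few lines. This is a toy implementation on hypothetical follow-up data, not the R routine used in the study; `times` and `events` are assumed inputs, with `events` equal to 1 for an observed event and 0 for censoring.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving this time point
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set
        at_risk -= removed
    return curve


# Hypothetical follow-up in months; the second patient is censored
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored patients do not lower the curve themselves, but they shrink the risk set, which is why the drop at month 3 in the toy data is larger than the drop at month 1.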

Feature selection and model construction

The Boruta algorithm, a feature selection methodology rooted in Random Forest (RF), was employed to identify pivotal features by comparing authentic features against randomly generated "shadow features". For this purpose, Boruta version 8.0.0 was utilized. To predict non-SLN metastasis in patients with ≥ 3 positive SLNs, nine prevalent ML algorithms were deployed: RF, logistic regression, extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), adaptive boosting (AdaBoost), decision tree (DT), gradient boosting decision tree (GBDT), complement naive Bayes (CNB), and Support Vector Machine (SVM). To enhance the models' robustness, iterative testing and tuning were conducted through ten-fold cross-validation and grid search to ascertain the optimal hyperparameter settings. Patients from the CHSU were randomly divided into training and validation sets in a 7:3 ratio to select the most effective ML model. Additionally, patients from JPH served as an external validation cohort to further verify the generalizability of the optimal model.
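The data partitioning described above can be sketched as follows. This is only an illustration: the seed, the absence of stratification, and the helper names are assumptions, since the text does not report how the random split or the folds were generated.

```python
import random


def split_train_validation(ids, train_fraction=0.7, seed=0):
    """Randomly split patient identifiers into training and validation sets (7:3 by default)."""
    rng = random.Random(seed)  # fixed seed is an assumption, used here for reproducibility
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def k_fold_indices(n_samples, k=10):
    """Yield (training_indices, held_out_indices) for k roughly equal folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        training = [j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold]
        yield training, held_out


# 112 patients with >= 3 positive SLNs, as reported in the Results
train_ids, validation_ids = split_train_validation(range(112))
print(len(train_ids), len(validation_ids))
```

In a grid search, each candidate hyperparameter setting would be scored on the ten held-out folds of the training set only, so that the validation set stays untouched until model comparison.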

Evaluation of ML models

The performance of the ML models was evaluated using a variety of metrics, including the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F1-score, and Kappa. Performance was further assessed through precision-recall curves, demonstrating precision-recall relationships at various thresholds, and calibration curves, which compared the models' predicted probabilities against actual observed probabilities to evaluate bias and accuracy. Additionally, Decision Curve Analysis (DCA) was employed to ascertain the clinical utility of these models. Kolmogorov–Smirnov (K-S) curves, based on the cumulative distribution function, and confusion matrices were utilized to analyze model performance across different thresholds and to visualize the classification accuracy of the optimal model, respectively. Feature contributions were explained using SHapley Additive exPlanations (SHAP) values, calculated via the "SHAP" software package. Statistical analyses were conducted using R software version 4.2.1 (r-project.org) and Python version 3.8 (Python Software Foundation), with a significance threshold set at p < 0.05.
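Most of the threshold-dependent metrics listed above follow directly from the four cells of a confusion matrix. The sketch below shows the standard formulas; the function name and the example counts are hypothetical and are not taken from the study's tables.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    n = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    accuracy = (tp + tn) / n
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    # Cohen's kappa: observed agreement corrected for chance agreement
    expected = ((tp + fp) / n) * ((tp + fn) / n) + ((tn + fn) / n) * ((tn + fp) / n)
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only
print(classification_metrics(tp=40, fp=10, tn=45, fn=5))
```

In the setting of this study, the NPV is the quantity of most direct clinical interest, because it describes how often a patient predicted to be non-SLN negative truly is.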

Results

Clinicopathologic characteristics

This study evaluated 634 BC patients with positive SLNs, divided into two groups: 522 patients with SLNs ≤ 2 positive and 112 patients with SLNs ≥ 3 positive. Analysis from Table 1 indicated that the mean tumor size for the group with SLNs ≥ 3 positive was 3.57 cm, markedly larger than the 3.10 cm observed in the SLNs ≤ 2 positive group (P < 0.001). Moreover, the mean number of SLNs was 4.79 for patients with SLNs ≥ 3 positive, compared to 3.13 for those with SLNs ≤ 2 positive (P < 0.001). Patients with SLNs ≥ 3 positive also had fewer negative SLNs, averaging 1.28, versus 1.86 in the SLNs ≤ 2 positive group (P < 0.001). Furthermore, a significantly higher proportion of ER and PR positivity was observed in the SLNs ≤ 2 positive group, while the rates of HER2 and Ki-67 positivity were significantly lower compared to the SLNs ≥ 3 positive group. In terms of treatment, the SLNs ≤ 2 positive group had a higher rate of breast-conserving surgery, while the SLNs ≥ 3 positive group received more chemotherapy, radiotherapy, and endocrinotherapy.

Table 1 Clinical characteristics of SLN-positive patients

Survival analysis

To evaluate the survival disparities between patients with SLNs ≤ 2 positive and those with SLNs ≥ 3 positive, Kaplan–Meier survival analysis was conducted. This analysis revealed no significant differences in OS and DFS between the two cohorts (OS: P = 0.129; DFS: P = 0.228, Fig. 1A and B). Additionally, univariate and multivariate Cox regression analyses were performed to identify independent predictors influencing survival disparities among these groups. Initial multicollinearity tests indicated high collinearity, as the generalized variance inflation factors (GVIFs) for molecular subtype, ER, and endocrinotherapy exceeded 5 (Table S1). Therefore, molecular subtype and endocrinotherapy were excluded from further analysis. Further univariate and multivariate Cox regression analyses identified tumor size as an independent risk factor for OS, while G2 was an independent protective factor for OS. Regarding DFS, the number of SLNs was an independent risk factor, whereas G2, the number of negative SLNs, and radiotherapy were independent protective factors (Table 2). Crucially, these analyses confirmed the absence of statistically significant survival differences between patients with SLNs ≤ 2 positive and those with SLNs ≥ 3 positive. However, forest plots showed survival differences within receptor subgroups between patients with SLNs ≤ 2 positive and those with SLNs ≥ 3 positive (Fig. 1C and D). OS for the SLNs ≤ 2 positive group was superior to that of the SLNs ≥ 3 positive group in the ER-positive and PR-positive subgroups. Furthermore, DFS for the SLNs ≤ 2 positive group was better than that for the SLNs ≥ 3 positive group in the ER-positive, PR-positive, and HER2-negative subgroups.

Table 2 Univariate and multivariate Cox analyses of SLN-positive patients

Clinical characteristics and selection of features in patients with SLNs ≥ 3 positive

Among the 112 patients with ≥ 3 positive SLNs, 25% (28/112) did not have non-SLN metastases (Table 3). The incidence of micrometastases in SLNs was significantly higher in patients without non-SLN metastases compared to those with such metastases (P < 0.001). To enhance the identification of patients with ≥ 3 positive SLNs who did not have non-SLN metastases, we developed predictive models using nine ML algorithms. Initial feature correlation analysis revealed that the number of negative SLNs had a correlation coefficient above 0.7, indicating significant multicollinearity, as did the relationship between ER and PR (Fig. 2A). Consequently, the number of negative SLNs and PR were excluded from further analysis. Boruta's algorithm identified four critical features as significant predictors: size of positive SLNs, tumor size, number of SLNs, and ER status (Fig. 2B).
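The correlation screen described above can be illustrated with a small pairwise filter. This is a sketch under stated assumptions: the text does not say which correlation coefficient was used, so Pearson's r is assumed here, and the feature names and values are hypothetical toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def highly_correlated_pairs(features, threshold=0.7):
    """Return (feature_a, feature_b, r) for every pair with |r| above the threshold."""
    names = sorted(features)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            if abs(r) > threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical values for five patients
toy = {
    "n_sln": [3, 4, 5, 6, 7],
    "n_negative_sln": [0, 1, 2, 3, 4],
    "tumor_size_cm": [2.0, 3.5, 1.0, 4.0, 2.5],
}
print(highly_correlated_pairs(toy))
```

When a pair is flagged, one member is dropped before feature selection so that redundant variables do not split importance between them.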

Table 3 Clinical characteristics of patients with ≥3 positive SLN

Construction and evaluation of models

We integrated key features into the construction of ML models to predict the risk of non-SLN metastasis in patients with ≥ 3 positive SLNs. Figure 3A and Table 4 presented the performance of the nine ML models in predicting non-SLN metastasis within the training group. The RF model demonstrated superior predictive ability, with an AUC of 0.987, accuracy of 0.955, F1-score of 0.977, and Kappa statistic of 0.855. At the optimal cutoff, the RF model achieved a sensitivity of 0.966, specificity of 0.964, PPV of 0.988, and NPV of 0.871. The AUCs for the Logistic, XGBoost, LightGBM, AdaBoost, DT, GBDT, CNB, and SVM models were 0.648, 0.743, 0.669, 0.910, 0.856, 0.694, 0.648, and 0.757 respectively, with corresponding accuracies of 0.745, 0.640, 0.412, 0.754, 0.677, 0.448, 0.521, and 0.743. The F1-scores for these models were 0.823, 0.785, 0.402, 0.857, 0.608, 0.358, 0.576, and 0.827, respectively. Figure 3B and Table 4 illustrated the performance of the nine ML models in the validation group. Again, the RF model outperformed the others, achieving an AUC of 0.828, accuracy of 0.832, F1-score of 0.882, and Kappa of 0.569. The Logistic, XGBoost, LightGBM, AdaBoost, DT, GBDT, CNB, and SVM models yielded AUCs of 0.559, 0.619, 0.592, 0.793, 0.677, 0.638, 0.592, and accuracies of 0.668, 0.558, 0.367, 0.698, 0.574, 0.414, 0.428, 0.609, respectively. Their F1-scores were 0.659, 0.521, 0.230, 0.838, 0.398, 0.320, 0.204, and 0.707, respectively.

Table 4 Performance of machine learning prediction models for non-SLN metastasis in the training and validation groups

PR curves were utilized to assess the precision and recall of the models at various thresholds. Figure 3C and D depict the average precision (AP) scores of the nine models in both the training and validation groups. Notably, the RF model exhibited the highest performance, achieving AP scores of 0.995 and 0.918 in the training and validation groups, respectively. Additionally, the calibration curves demonstrated a strong concordance between the predicted probabilities and the actual observations for the RF model (Fig. 3E). DCA was employed to evaluate the clinical utility of the models. The results indicated that the RF model provided significant net clinical benefits in predicting non-SLN metastasis (Fig. 3F). Consequently, the RF model emerged as the optimal choice for predicting non-SLN metastasis in patients with ≥ 3 positive SLNs.
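The quantity plotted in a decision curve is the net benefit at a given threshold probability. A minimal sketch of the standard formula, net benefit = TP/n − (FP/n) × p_t/(1 − p_t), is shown below; the function names and the toy labels and probabilities are hypothetical and do not come from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted probability is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = non-SLN metastasis) and predicted probabilities
y_true = [1, 1, 1, 0]
y_prob = [0.9, 0.8, 0.7, 0.2]
print(net_benefit(y_true, y_prob, 0.5), net_benefit_treat_all(y_true, 0.5))
```

A model is considered clinically useful over the range of thresholds where its curve lies above both the treat-all and the treat-none (net benefit of zero) strategies.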

Performance and interpretability of RF model

The K-S curve was instrumental in identifying the optimal classification threshold that maximized the difference between the true positive and false positive rates. By selecting the threshold with the highest K-S statistic, we enhanced classification performance. The results revealed that the maximum K-S statistic was 0.762 at a threshold value of 0.236 (Fig. 4A). In clinical practice, accurately identifying patients who are unlikely to have non-SLN metastasis can prevent unnecessary ALND. The RF model correctly predicted non-SLN-negative status in 86.4% (19/22) of cases in the training group (Fig. 4B). Furthermore, the RF model successfully predicted the non-SLN-negative status in 83.3% (5/6) of the cases in the validation group (Fig. 4C).
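The K-S statistic is conventionally the largest gap between the true positive rate and the false positive rate across candidate thresholds. The sketch below scans the observed probabilities as thresholds; it assumes both classes are present, and the function name and toy data are hypothetical.

```python
def ks_statistic(y_true, y_prob):
    """Return (max K-S statistic, threshold at which it occurs)."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best_ks, best_threshold = 0.0, None
    for threshold in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
        fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
        ks = tp / positives - fp / negatives  # TPR minus FPR at this threshold
        if ks > best_ks:
            best_ks, best_threshold = ks, threshold
    return best_ks, best_threshold


# Hypothetical outcomes (1 = non-SLN metastasis) and predicted probabilities
print(ks_statistic([1, 1, 1, 0, 0], [0.9, 0.8, 0.3, 0.4, 0.1]))
```

Maximizing TPR − FPR is equivalent to maximizing Youden's J, so the selected cutoff balances sensitivity and specificity rather than favoring either one.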

The parallel coordinates plot visualized the distribution and trends among various features, facilitating a comprehensive comparison (Fig. 4D). Subsequently, SHAP analysis elucidated the predictive mechanisms of the RF model for non-SLN metastasis by quantifying the importance of each feature. The SHAP values for each feature varied across different levels; features with increasing values turned progressively redder, while decreasing values shifted towards blue (Fig. 4E). Notably, a feature point positioned to the right of the axis signified an increased risk of non-SLN metastasis, whereas a point on the left indicated a reduced risk. Furthermore, the features were ranked based on their importance to the model (Fig. 4E). Higher-ranked features played a more crucial role in the model’s decision-making process. Notably, tumor size and the number of SLNs were the most valued by the RF model for predicting non-SLN metastasis.
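To make the idea behind SHAP values concrete, exact Shapley values can be computed for a tiny model by averaging each feature's marginal contribution over all feature orderings. This is an illustrative sketch only: it uses a single hypothetical baseline patient and a made-up linear risk function (`toy_risk`), and it is not the algorithm the SHAP package applies to tree ensembles.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `instance` relative to `baseline` for a small feature set."""
    names = list(instance)
    contributions = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            # Switch one feature from its baseline value to the patient's value
            current[name] = instance[name]
            updated = predict(current)
            contributions[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in contributions.items()}


def toy_risk(features):
    """Hypothetical risk score; coefficients are invented for illustration."""
    return 0.10 * features["tumor_size_cm"] + 0.05 * features["n_sln"]


patient = {"tumor_size_cm": 4.0, "n_sln": 5}
reference = {"tumor_size_cm": 2.0, "n_sln": 3}
print(shapley_values(toy_risk, patient, reference))
```

The contributions always sum to the difference between the patient's prediction and the baseline prediction, which is the property that lets a SHAP plot decompose an individual risk estimate feature by feature.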

External validation of RF model

To further assess the robustness and applicability of the RF model, we enrolled 42 patients with ≥ 3 positive SLNs from JPH as an external validation cohort (Table S2). The RF model demonstrated strong performance, achieving an AUC of 0.870 in this cohort (Fig. 5A). Additionally, DCA confirmed the clinical utility of the RF model in the external validation cohort (Fig. 5B).
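The AUC reported for each cohort can be read as the probability that a randomly chosen patient with non-SLN metastasis receives a higher predicted score than a randomly chosen patient without it. A minimal sketch of that pairwise definition follows; the function name and toy scores are hypothetical.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count half)."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes and predicted scores
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

With a cohort of 42 patients the number of positive-negative pairs is small, so an AUC estimated this way carries a wide confidence interval.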

Discussion

In our study, patients with SLNs ≥ 3 positive constituted 17.7% of SLN-positive BC patients, a higher proportion compared to the 5.6–10.7% reported in the literature [14, 15, 17, 18]. This discrepancy might be attributed to our use of MB in combination with ICG during SLNB, which typically achieves a higher detection rate than using either dye alone, thereby reducing the rate of missed diagnoses. Meanwhile, the removal of palpable nodes during SLNB also increases the detection rate of SLNs. The NSABP B-32 trial demonstrated that the false-negative rate was 10.0% when two SLNs were found, 6.9% when three were identified, and 5.5% when four were detected [19]. This may also reflect variations in clinicians' understanding and performance of SLNB. Research has demonstrated that preoperative imaging for ALN identification and the development of a nomogram can aid in predicting the likelihood of involvement of three or more lymph nodes [20]. This approach enhances the accurate assessment of ALN metastasis and reduces the proportion of patients who are clinically negative for ALNs but have SLNs ≥ 3 positive intraoperatively. Some studies have reported that the expression of ER, PR, HER2, and Ki-67 can be used as predictors of non-SLN metastasis in patients with SLNs ≤ 2 positive. However, neither ER/PR, HER2, nor Ki-67 status independently predicted non-SLN metastasis [8, 10,11,12]. In our study, although the statuses of ER, PR, HER2, and Ki-67 in patients with SLNs ≥ 3 positive were not significantly associated with non-SLN metastasis, the Boruta algorithm's feature selection suggests that ER status should be included in the construction of ML models.

In this study, 75% of patients with SLNs ≥ 3 positive exhibited non-SLN metastasis. Other studies have reported that 55.5–67.7% of patients with SLNs ≥ 3 positive experience non-SLN metastasis [15, 18]. This discrepancy may be related to the false-negative rate of hollow-core needle biopsies of suspicious ALNs in our study. Among patients with SLNs ≤ 2 positive, approximately 30% have non-SLN metastasis, while 70% do not [8, 21]. In recent years, various study designs and the use of graphical and numerical models have been employed to predict non-SLN metastatic status in early BC. Several nomograms and scoring systems based on clinicopathological variables have been developed to estimate the probability of non-SLN metastasis in patients with early BC and SLNs ≤ 2 positive. Notable examples include the Memorial Sloan Kettering Cancer Center (MSKCC) nomogram [9], the Cambridge nomogram [10], the Stanford nomogram [11], the Tenon score [22], the MD Anderson Cancer Center Score [13], and the Shanghai Cancer Hospital (SCH) nomogram [23]. These models have been validated and provide improved predictions of non-SLN metastasis status. For patients predicted by the model to have no non-SLN metastasis, avoiding further ALND and additional axillary radiotherapy can reduce the incidence of lymphedema and shoulder joint complications. However, for patients with SLNs ≥ 3 positive, it remains necessary to develop and validate models for clinical application, which would be valuable for further axillary management. For patients with a very low probability of non-SLN metastasis predicted by the model, ALND can be avoided while adding axillary radiotherapy, thus reducing surgical trauma and the occurrence of upper limb lymphedema. Therefore, in patients with SLNs ≥ 3 positive, the model could be useful for identifying the presence of non-SLN metastasis in early BC.

In this study, we employed machine-learning models to predict the risk of non-SLN metastasis in patients with SLNs ≥ 3 positive. Among the nine ML models used in the training set, the RF model exhibited the best performance in predicting non-SLN metastasis, achieving an AUC of 0.987, an accuracy of 0.955, an F1-score of 0.977, and a Kappa of 0.855. The RF model also demonstrated superior performance in the validation cohort, with an AUC of 0.828. These results suggest that the RF model is an excellent tool for evaluating whether patients with SLNs ≥ 3 positive can be exempted from ALND. The model demonstrates superior performance compared to previous logistic models. Reports indicate that the average accuracy, specificity, sensitivity, and AUC of the deep ML model TabNet are significantly better than those of the logistic regression model [24]. Our model's AUC also surpasses that of models predicting the risk of non-SLN metastasis in patients with SLNs ≤ 2 positive, such as the MSKCC and other prediction nomograms, where the AUC varies from 0.6 to 0.8 due to regional and patient population differences [9, 25, 26]. The SCH nomogram, the first model established using a population of Chinese individuals diagnosed with BC, had an original AUC of 0.779 [27]. Recently, Yang et al. also developed a new nomogram to predict the non-SLN status including early patients with SLNs ≥ 3 positive, with an AUC of 0.701–0.813 [14].

In clinical applications, it is crucial that the model accurately predicts patients without non-SLN metastasis to avoid unnecessary ALND. The RF model identified non-SLN-negative patients with an accuracy of 87.1% in the training cohort. Furthermore, the RF model successfully predicted 71.7% of non-SLN-negative patients in the validation cohort, surpassing the accuracy of previous prediction models for SLNs ≤ 2 positive [10, 11, 23, 25, 26, 28]. A significant advantage of the RF model is its ability to rank the features based on their importance, with higher-ranked features contributing more to the model. Our study revealed that the RF model prioritized tumor size and the number of SLNs, which were therefore pivotal when considering exemption from ALND. The prediction accuracy and predictive weight of the proposed model were superior to those of previous models for SLNs ≤ 2 positive [10, 11, 23, 25, 26, 28].

The AUC of the RF model in this study was 0.870 in the external validation cohort, indicating high validation efficiency. Its prediction accuracy, clinical applicability, and robustness were also excellent. Notably, a prediction model for non-SLN metastasis tailored to individuals diagnosed with BC in the Chinese population is needed [23]. Correspondingly, this RF model is particularly suitable for female BC patients in this region, and the model predicting non-SLN metastasis status when SLNs ≥ 3 positive is appropriate for regional promotion and application.

The limitations of this study are as follows: (1) The small sample size reduces the overall reliability and generalizability of the results. (2) The prediction model requires prospective clinical comparative study data to further validate its predictive efficiency. (3) Additional follow-up data are necessary to confirm local disease control, DFS, OS, and complications in the affected upper limbs of patients exempt from ALND.

Conclusion

Our study developed ML models to predict non-SLN metastatic status in patients with SLNs ≥ 3 positive, based on the Chinese BC population. The results demonstrate that when clinical ALNs are negative and SLNs ≥ 3 positive, it is feasible to construct RF prediction models using clinicopathological characteristics of the patients. The prediction accuracy and efficiency are excellent, making it applicable to regional populations. For cases with a very low rate of non-SLN metastasis as predicted by the model, ALND can be avoided by incorporating axillary radiotherapy.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Abbreviations

BC: Breast cancer
SLNB: Sentinel lymph node biopsy
ALND: Axillary lymph node dissection
DFS: Disease-free survival
OS: Overall survival
SLN: Sentinel lymph node
non-SLN: Non-sentinel lymph node
WBI: Whole-breast irradiation
RNI: Regional nodal irradiation
ML: Machine learning
CHSU: Cancer Hospital of Shantou University Medical College
AJCC: American Joint Committee on Cancer
ITC: Isolated tumor cells
JPH: Jieyang People's Hospital
ECE: Extracapsular extension
ER: Estrogen receptor
PR: Progesterone receptor
HER2: Human epidermal growth factor receptor 2
FISH: Fluorescent in situ hybridization
K-M: Kaplan–Meier
RF: Random Forest
XGBoost: Extreme gradient boosting
LightGBM: Light gradient boosting machine
AdaBoost: Adaptive boosting
DT: Decision tree
GBDT: Gradient boosting decision tree
CNB: Complement naive Bayes
SVM: Support Vector Machine
ROC: Receiver operating characteristic
PPV: Positive predictive value
NPV: Negative predictive value
DCA: Decision Curve Analysis
K-S: Kolmogorov–Smirnov
SHAP: SHapley Additive exPlanations
GVIF: Generalized variance inflation factor
AP: Average precision
MSKCC: Memorial Sloan Kettering Cancer Center
SCH: Shanghai Cancer Hospital

Acknowledgements

Not applicable.

Funding

This work was supported by the Foundation of Basic and Applied Basic Research of Guangdong Province, China (Nos. 2022A1515220202 and 2023A1515220231) and by the 2023 Science and Technology Innovation Strategy Project of Guangdong Province (Big Project + Task List), China (No. STKJ2023009,20230403).

Author information

Xiangli Xie, Yutong Fang and Lifang He contributed equally to this work.

Authors and Affiliations

The Breast Center, Cancer Hospital of Shantou University Medical College, Shantou, Guangdong, 515041, People’s Republic of China
Yutong Fang, Lifang He, Zexiao Chen, Chunfa Chen, Huancheng Zeng, Bingfeng Chen, Guangsheng Huang, Cuiping Guo & Jundong Wu
Department of Breast, Jiangmen Central Hospital, Jiangmen, Guangdong, 529030, People’s Republic of China
Qunchen Zhang
The Breast Center, Jieyang People’s Hospital, Jieyang, Guangdong, 522000, People’s Republic of China
Xiangli Xie

Authors

Xiangli Xie
Yutong Fang
Lifang He
Zexiao Chen
Chunfa Chen
Huancheng Zeng
Bingfeng Chen
Guangsheng Huang
Cuiping Guo
Qunchen Zhang
Jundong Wu

Contributions

Xiangli Xie, Yutong Fang, and Lifang He participated in the data analysis. Qunchen Zhang organized the article writing. Jundong Wu critically modified the manuscript. Zexiao Chen modified the manuscript. Chunfa Chen and Huancheng Zeng drafted the manuscript. Bingfeng Chen was responsible for the acquisition of data. Guangsheng Huang contributed to the literature search. Cuiping Guo corrected language expression. All authors read and approved the manuscript and agree to be accountable for all aspects of the research in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Corresponding authors

Correspondence to Qunchen Zhang or Jundong Wu.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Ethics Committees of Cancer Hospital of Shantou University Medical College (No. 2024038) and Jieyang People's Hospital (No. 2024054), and a waiver of informed consent was granted.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Xie, X., Fang, Y., He, L. et al. Individualized prediction of non-sentinel lymph node metastasis in Chinese breast cancer patients with ≥ 3 positive sentinel lymph nodes based on machine-learning algorithms. BMC Cancer 24, 1090 (2024). https://doi.org/10.1186/s12885-024-12870-x

Received: 09 June 2024
Accepted: 28 August 2024
Published: 02 September 2024
DOI: https://doi.org/10.1186/s12885-024-12870-x
