Background

Idiopathic membranous nephropathy (IMN) is the most common pathologic type of adult nephrotic syndrome (NS) [1]. IMN occurs mainly in middle-aged and elderly people; in China its incidence has gradually increased in recent years, with a trend toward younger age at onset [2]. Among primary glomerulonephritides, IMN is the second or third leading cause of end-stage renal disease (ESRD) in the USA and Europe [3], and approximately one third of IMN patients have a progressive disease course. The most serious long-term consequence of IMN is progressive loss of kidney function. Among 60% of untreated patients, about 35% eventually progress to ESRD within 10 years [4,5,6,7].

In recent decades, several clinical [8, 9], pathological [10, 11], and genetic [12, 13] parameters have been identified as biomarkers for predicting the prognosis of IMN. The Kidney Disease: Improving Global Outcomes (KDIGO) 2021 clinical practice guideline has taken 24 h urinary protein (UTP), estimated glomerular filtration rate (eGFR), serum anti-phospholipase A2 receptor antibody (sPLA2R-Ab), serum albumin (ALB) and others as indicators that may be used to divide patients into categories of low, moderate, high, and very high risk of progressive loss of kidney function [14]. However, these indicators are often inconsistent in clinical practice, for example high-level proteinuria in patients with a low sPLA2R-Ab titer, or high-level proteinuria with normal serum ALB, and there is currently no model that combines all of these clinical considerations. There is therefore an urgent need for a model that takes these clinical factors into account and whose cut-off value can identify high-risk patients with a poor prognosis, which would facilitate clinical application. In addition, when a patient starts immunosuppressive therapy according to the KDIGO 2021 clinical practice guideline, the outcome of treatment remains unclear. A model that predicts patient outcomes and response to treatment may help clinicians assess prognosis in advance and discuss the options fully with patients, so as to choose the treatment that maximizes patient benefit.

The nomogram is a useful and accessible tool that helps physicians predict disease progression, plan individualized treatment, and decide the follow-up interval [15, 16]. Nomograms have previously been developed for IMN [17,18,19,20], but most of them lack external validation [17, 19], and to our knowledge no dynamic online nomogram for IMN prognosis is currently available. Machine learning has recently been used to produce prediction models for clinical practice. It can help model information based on statistics, potentially revealing hidden dependencies between predictors and diseases. Previous studies have used machine learning algorithms such as extreme gradient boosting (XGBoost), random forest (RF) and logistic regression (LR) to predict or identify kidney disease [21, 22]. The purpose of this study was to establish a dynamic online nomogram based on a machine learning model, in order to accurately predict the prognosis and treatment response of IMN patients and to help clinicians formulate personalized treatment plans.

Methods

Study cohorts

This was a retrospective, multicenter study conducted in 3 hospitals in Liaoning Province, northeast China, which included 232 cases from September 2019 to December 2020. The training cohort included 130 patients from Shengjing Hospital of China Medical University, and the validation cohort included 102 patients from the First Affiliated Hospital of Jinzhou Medical University and the General Hospital of Angang Group (Fig. 1). The inclusion criteria for IMN were as follows: (i) MN diagnosed by renal biopsy, or a positive anti-PLA2R antibody test with NS; (ii) age 18 to 75 years; (iii) a follow-up time ≥ 24 months with complete data obtained in our institution. The exclusion criteria were as follows: (i) secondary membranous nephropathy (SMN), including MN related to autoimmune disease, infection, malignancy, drugs or heavy metal poisoning; (ii) use of corticosteroids or immunosuppressants before the start of the study; (iii) a follow-up period of less than 24 months, or missing data; (iv) serious mental illness that made cooperation with treatment difficult, or pregnancy or lactation. This study was approved by the Ethics Committee of Shengjing Hospital affiliated to China Medical University (ethics number: 2023PS847K), and informed consent was waived because it was a retrospective study.

Fig. 1

Flowchart of inclusion and exclusion in the training and validation cohorts. A: In the training cohort; B: In the validation cohort

Clinical data collection

The baseline and follow-up data were extracted from patients’ records in the hospitals’ electronic medical systems, including demographic characteristics, clinical variables, and laboratory results. The sPLA2R-Ab titer was measured by ELISA (E200908BU, Euroimmun, Germany) according to the manufacturer’s recommendation, and a value ≥ 20 RU/ml was considered positive. Renal biopsy was performed, and the biopsy sample was examined by light microscopy, immunofluorescence, and electron microscopy. Membranous lesions from IMN cases were classified into four stages based on the criteria of Ehrenreich and Churg [23].

Treatment options

The treatment strategy was based on the KDIGO 2021 clinical practice guideline [14]. Renin-angiotensin-aldosterone system (RAAS) inhibitors consisted primarily of angiotensin-converting enzyme inhibitors (ACEi) and angiotensin-II receptor blockers (ARB). Immunosuppressant therapy included cyclophosphamide (CTX) or a calcineurin inhibitor (CNI). Targeted therapy refers to CD20 monoclonal antibody therapy, mainly rituximab and obinutuzumab. Other immunosuppressant treatments included mycophenolate mofetil, leflunomide, Tripterygium wilfordii and others.

Outcome

The clinical endpoint was non-remission of proteinuria at 24 months. Complete remission (CR) was defined as proteinuria of no more than 0.3 g per 24 h with a stable eGFR. Partial remission (PR) was defined as proteinuria between 0.3 g and 3.5 g per 24 h, or a reduction in proteinuria of at least 50% compared with baseline, with a stable eGFR [24, 25]. Patients who did not meet any of those criteria were categorized as non-remission (NR). A stable eGFR was defined as an eGFR that remained unchanged or declined by less than 15% during the period of follow-up.
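The remission categories above can be written as a small decision rule. The following Python sketch is illustrative only and is not the code used in this study: the names `Visit`, `egfr_is_stable` and `classify_response` are hypothetical, the handling of values exactly at 0.3 g and 3.5 g is an assumption, and an eGFR that rises during follow-up is assumed to count as stable.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """One measurement time point (hypothetical container for this sketch)."""
    proteinuria_g_per_24h: float
    egfr: float


def egfr_is_stable(baseline: Visit, followup: Visit) -> bool:
    # Stable eGFR: unchanged or declined by less than 15% from baseline.
    # Assumption: an increase in eGFR also counts as stable.
    decline = (baseline.egfr - followup.egfr) / baseline.egfr
    return decline < 0.15


def classify_response(baseline: Visit, followup: Visit) -> str:
    """Return 'CR', 'PR' or 'NR' following the definitions in the text."""
    if not egfr_is_stable(baseline, followup):
        return "NR"
    p = followup.proteinuria_g_per_24h
    if p <= 0.3:
        return "CR"
    reduction = (baseline.proteinuria_g_per_24h - p) / baseline.proteinuria_g_per_24h
    # PR as worded in the text: proteinuria up to 3.5 g per 24 h,
    # OR a reduction of at least 50% from baseline (boundary handling assumed).
    if p <= 3.5 or reduction >= 0.5:
        return "PR"
    return "NR"


if __name__ == "__main__":
    base = Visit(proteinuria_g_per_24h=7.0, egfr=90.0)
    print(classify_response(base, Visit(0.2, 88.0)))
    print(classify_response(base, Visit(5.0, 90.0)))
```

Under this reading, the study endpoint corresponds to a return value of "NR" at the 24-month visit.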

Model construction and performance evaluation

In the training cohort, we used univariate and multivariate logistic regression to screen for major risk factors for proteinuria non-remission based on the patients’ baseline measurements, and constructed a nomogram based on the XGBoost, RF and LR machine learning algorithms. Five key metrics were used to assess the effectiveness of the models: area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, accuracy and F1-score.
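Apart from the AUROC, these metrics all derive from the four cells of a confusion matrix. The sketch below shows the standard formulas in Python; the function name `classification_metrics` is hypothetical and the counts in the example are made up for illustration, not taken from the study data.

```python
def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Threshold-based metrics for a binary classifier.

    tp/fn: non-remission patients predicted correctly/incorrectly.
    tn/fp: remission patients predicted correctly/incorrectly.
    """
    sensitivity = _ratio(tp, tp + fn)   # recall for the positive class
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


if __name__ == "__main__":
    # Illustrative counts only.
    for name, value in classification_metrics(tp=30, fp=10, tn=80, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Because the F1-score is the harmonic mean of precision and sensitivity, it equals both whenever the two coincide.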

Internal and external validation of the model

The nomogram was subjected to 1000 bootstrap resamples for internal validation to assess its predictive accuracy, and calibration was assessed with a visual calibration plot. The discriminative ability of the model was determined by the AUROC, which ranges from 0.5 to 1, and AUROCs were compared using the Z test. Finally, to estimate the clinical utility of the model, decision curve analysis (DCA) was performed by calculating the net benefit over a range of threshold probabilities. The external validity of the model was evaluated by the AUROC, calibration and decision curve analysis in an independent cohort.
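The net benefit plotted in a decision curve follows the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold probability. The Python sketch below illustrates that formula under stated assumptions; it is not the R code used in this study, and the function names and toy data are hypothetical.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # odds of the threshold probability
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: assume every patient will have the outcome."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Toy data for illustration only (1 = non-remission).
    outcomes = [1, 1, 0, 0, 0]
    risks = [0.9, 0.6, 0.4, 0.2, 0.1]
    for t in (0.1, 0.3, 0.5):
        print(t, net_benefit(outcomes, risks, t), net_benefit_treat_all(outcomes, t))
```

The "treat none" strategy has a net benefit of zero at every threshold, so a model is considered useful over the range of thresholds where its curve lies above both reference strategies.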

Statistical analysis

All statistical analyses were performed with SPSS 26.0 and R 4.2.1. Normally distributed continuous variables were expressed as means and standard deviations, whereas non-normally distributed continuous variables were expressed as medians and interquartile ranges (IQR). Categorical variables were expressed as frequencies and percentages. Differences between the two cohorts were tested with the t test or the Wilcoxon rank sum test for continuous variables and the Chi-square test for categorical variables. Results with P < 0.05 were considered statistically significant.

Results

Patient characteristics

A total of 232 IMN patients were enrolled in this study; their characteristics are presented in Table 1. The training cohort included 85 males, with a median age of 48 years and a median proteinuria of 7.0 g/d. Comparison of the training and validation cohorts showed significant differences in age, urinary red blood cells (URBC), ALB, serum creatinine (Scr), blood urea nitrogen (BUN), and total cholesterol (TCHO). In the training cohort, 26 patients did not undergo renal biopsy because of personal preference or physical condition, and most patients who underwent renal biopsy presented with stage 2 or 3 lesions. Treatment plans were based on the KDIGO 2021 clinical practice guideline; more than 60% of patients in the training and validation cohorts received RAAS inhibitors or immunosuppressant therapy, and there were no significant differences in treatment regimens between the two cohorts. The follow-up time was 24 months, and the incidence of the IMN progression endpoint was 31.5% and 31.4% in the training and validation cohorts, respectively. In addition, a retrospective analysis of treatment adjustments during follow-up in the training cohort found that 30 patients did not achieve remission even after changing the immunosuppressant regimen; 7 of them had two adjustments to the treatment regimen and still had persistent proteinuria.

Table 1 Characteristics of the overall population in the training and validation cohorts

Feature selection

As shown in Table 2, the multivariable analysis identified four major risk factors: course ≥ 6 months, UTP, D-dimer, and sPLA2R-Ab. These four variables were used to construct XGBoost, RF and LR machine learning models to predict the prognosis of IMN patients. The performance of the three models is shown in Table 3, and ROC curves and confusion matrices were used to evaluate model discrimination (Fig. 2). Performance differed significantly between the models, and the RF model performed best, with the highest AUROC (0.869), sensitivity (0.700), specificity (0.897), precision (0.700), accuracy (0.769) and F1-score (0.700).

Table 2 Variables associated with treatment response of IMN in the univariable and multivariable analyses
Table 3 Performance of the prediction models generated by the three machine learning models
Fig. 2

Evaluation of the predictive models. A: The ROC curves from three models. B: The confusion matrix from three models

Model construction and comparison

To make the model more practical and easier to visualize, we developed a nomogram using the four predictors (course ≥ 6 months, UTP, D-dimer, and sPLA2R-Ab) (Fig. 3). For each predictor in the nomogram, the points were read by drawing a line straight upward from the predictor value to the points axis. The total points were calculated by summing the points of all predictors and located on the total points axis, which was then converted to a probability. Furthermore, a dynamic online nomogram is available via an internet interface at https://progression.shinyapps.io/DynNomapp/ (Fig. 4).
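The point-reading procedure is a graphical form of a regression calculation: each predictor contributes points proportional to its coefficient, and the total maps to a probability through the logistic function. The Python sketch below illustrates this mechanism only. It assumes a logistic regression model underlies the nomogram, and the coefficients, intercept and predictor ranges are placeholder values chosen so the arithmetic is easy to follow; they are not estimates from this study, and all function and variable names are hypothetical.

```python
import math
from typing import Callable, Dict, Tuple


def predicted_probability(intercept: float,
                          coefficients: Dict[str, float],
                          values: Dict[str, float]) -> float:
    """Logistic model: probability = 1 / (1 + exp(-linear predictor))."""
    lp = intercept + sum(coefficients[k] * values[k] for k in coefficients)
    return 1.0 / (1.0 + math.exp(-lp))


def make_points_scale(coefficients: Dict[str, float],
                      ranges: Dict[str, Tuple[float, float]]
                      ) -> Callable[[str, float], float]:
    """Build a nomogram-style points function.

    The predictor whose effect spans the widest range of the linear
    predictor is scaled to 0-100 points; the others are proportional.
    """
    spans = {k: abs(coefficients[k]) * (hi - lo) for k, (lo, hi) in ranges.items()}
    max_span = max(spans.values())

    def points(name: str, value: float) -> float:
        lo, hi = ranges[name]
        beta = coefficients[name]
        reference = lo if beta >= 0 else hi  # value that scores zero points
        return 100.0 * beta * (value - reference) / max_span

    return points


if __name__ == "__main__":
    # Placeholder coefficients and ranges, NOT the study's fitted model.
    coefs = {"utp_g_per_day": 0.2, "course_ge_6_months": 1.0}
    ranges = {"utp_g_per_day": (0.0, 20.0), "course_ge_6_months": (0.0, 1.0)}
    points = make_points_scale(coefs, ranges)
    patient = {"utp_g_per_day": 8.0, "course_ge_6_months": 0.0}
    total = sum(points(name, value) for name, value in patient.items())
    print("total points:", total)
    print("probability:", predicted_probability(-3.0, coefs, patient))
```

The online tool performs the equivalent calculation when the sliders are moved, so no manual reading of the axes is needed.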

Fig. 3

A constructed nomogram for predicting urine protein non-remission at 2 years in patients with IMN

Fig. 4

A dynamic online nomogram for predicting prognosis and response to treatment in patients with IMN. In this simulated case, the patient had a course of less than 6 months, UTP 8 g/d, D-dimer 446 μg/L and sPLA2R-Ab 106 RU/ml; the probability of proteinuria non-remission at 2 years was 20.4%

The internal validation of the model

In the training cohort, the C-index for the nomogram was 0.835 (95% CI 0.762–0.908); the ROC curve is displayed in Fig. 5A. The Z test showed that the discriminative ability of the nomogram was significantly higher than that of the individual predictors (course ≥ 6 months, UTP and D-dimer; Table 4). The calibration plot of the nomogram (Fig. 6A) demonstrated good agreement between observed and predicted progression, with a mean absolute error of 0.047. The DCA of the nomogram (Fig. 7A) showed that if the threshold probability was between 10 and 88% or greater than 90%, using the nomogram to predict IMN progression added more net benefit.
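For a binary endpoint such as non-remission at 24 months, the C-index equals the AUROC: it is the proportion of (non-remission, remission) patient pairs in which the non-remission patient received the higher predicted risk, with ties counted as one half. The Python sketch below computes it directly from that definition; the function name `c_index` and the toy data are hypothetical and the study's own estimates were obtained in R.

```python
from typing import Sequence


def c_index(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Concordance index for a binary outcome (equivalent to the AUROC)."""
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for p_pos in positives:
        for p_neg in negatives:
            if p_pos > p_neg:
                concordant += 1.0   # correctly ranked pair
            elif p_pos == p_neg:
                concordant += 0.5   # tie counts as half
    return concordant / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy data for illustration only (1 = non-remission).
    print(c_index([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking.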

Fig. 5

The ROC curves of the nomogram. The ability of the nomogram was measured and compared according to area under the curve values for the training (A) and the validation (B) cohorts

Table 4 Z test in the training cohort and validation cohort
Fig. 6

The calibration curves for the nomogram. A perfectly accurate prediction model generates a plot in which the observed and predicted probabilities correspond completely and fall along the 45° line. The apparent calibration curve represents the calibration of the model in the development data set, while the bias-corrected curve is the calibration result after correcting for optimism with 1000 bootstrap resamples. The closer the apparent calibration curve is to the bias-corrected curve, the more accurately the model predicts prognosis. A: In the training cohort; B: In the validation cohort

Fig. 7

The DCA curves for the IMN prognosis nomogram in (A) the training and (B) the validation cohorts. The y-axis shows the net benefit. The thin gray line indicates that all patients with IMN are assumed to have non-remission of proteinuria at 2 years, while the thick black line indicates that all patients with IMN are assumed to have remission of proteinuria. The thick red line represents the risk nomogram. In the training cohort, the decision curve showed that if the threshold probability of a patient is between 0.01 and 0.88 or greater than 0.9, using the nomogram in the present study to predict IMN prognosis adds more benefit

The external validation of the nomogram

In the validation cohort, the C-index was 0.874 (95% CI 0.801–0.946, Fig. 5B), and the Z test showed that the discrimination of the nomogram was better than that of the individual indicators (course ≥ 6 months, D-dimer, and sPLA2R-Ab). In addition, to further evaluate the performance of the nomogram, we calculated evaluation metrics beyond the AUROC at both the model cut-off value and a threshold of 0.5, including sensitivity, specificity, precision, and F1-score; the results suggested that it may be better to classify patients based on the cut-off value (Table 5). The calibration curve (Fig. 6B) also showed high consistency between the predicted probability and the observed proportion of the outcome. The DCA curve showed that use of the nomogram increased the net benefit and had strong clinical utility in predicting IMN prognosis (Fig. 7B).

Table 5 The other evaluation metrics beyond the AUROC

Discussion

As a quantitative tool for risk and benefit assessment, a clinical prediction model can provide more intuitive and rational information for doctors, patients and medical policy makers. In recent years, a number of nomograms for IMN have been established [17,18,19,20] to predict progression or relapse in patients with IMN and to distinguish malignancy-associated membranous nephropathy from IMN. In contrast to those studies, the endpoint of this study was non-remission of proteinuria at 2-year follow-up, chosen in order to evaluate the patient’s response to treatment. Furthermore, we constructed a dynamic online nomogram that combines multiple indicators and is simple to operate, without cumbersome formulas and calculations, and external validation supported the universality and applicability of the model. The user only needs to slide and select the value of each variable to obtain the probability of proteinuria non-remission for the patient. Most importantly, no manual calculation is involved in the process, which avoids accidental errors.

To our knowledge, this is the first dynamic online nomogram based on baseline parameters to predict treatment response in patients with IMN. Internal and external validation showed that the nomogram has good discrimination, calibration and clinical net benefit. Based on the nomogram, the clinician can make a preliminary judgment of the patient’s prognosis and response to treatment after 2 years, communicate fully with the patient, and choose the most suitable personalized treatment plan.

The results of this retrospective study indicated that course ≥ 6 months, UTP, D-dimer, and sPLA2R-Ab were significant independent predictors of poor treatment response in patients with IMN. What makes our study unique is that it links D-dimer, a marker of thromboembolic complications in IMN, to prognosis and indicates that D-dimer is an independent risk factor for proteinuria non-remission in IMN. D-dimer is a specific product of cross-linked fibrin under the action of plasmin [26], and can be used as an important molecular marker to reflect the plasma hypercoagulable state and the activation of the fibrinolytic system in vivo [27, 28]. IMN is an immune-mediated inflammatory disease with a high risk of thromboembolic complications due to damage to vascular endothelial cells, activation of the coagulation system, and weakening of the fibrinolytic system [29, 30]. Persistent proteinuria in patients with IMN presenting with nephrotic syndrome may lead to secondary venous thrombosis, increasing the risk of infection and acute kidney injury, and thus to a poor prognosis [29, 31]. Therefore, a high D-dimer level in IMN patients may indicate a high risk of thrombotic events and a critical condition, and such patients need to initiate anticoagulation and immunosuppressive therapy as soon as possible [32, 33].

IMN is a slowly progressive immune and inflammation-associated renal disease [34]. We also found that patients with a long disease course had a poor response to treatment. This may be due to persistent chronic inflammation, which increases the deposition of immune complexes on the outside of the glomerular basement membrane and leads to massive formation of basement membrane “spikes” and thickening of the basement membrane, thereby aggravating renal injury and leading to a poor prognosis [35, 36]. Moreover, studies have shown that the immune-inflammation index and the monocyte-lymphocyte ratio are reliable markers that might be used to predict prognosis in IMN patients [37, 38].

Previous studies agree that the prognosis of IMN patients is closely related to UTP and sPLA2R-Ab levels: the higher the UTP and the sPLA2R-Ab level, the worse the prognosis. A higher proteinuria level is significantly associated with a higher risk of reduction in renal function [39, 40], and persistent proteinuria is an independent risk factor for progression of IMN to ESRD. In the present study cohort, the 24-h proteinuria level was an independent predictor of a poor renal outcome, which is consistent with previous reports.

It is well documented that PLA2R and its autoantibodies are closely related to the prognosis of IMN [41,42,43]. Compared with glomerular PLA2R deposition, serum anti-PLA2R antibody levels are more closely correlated with renal outcome [44]. The KDIGO 2021 glomerular disease management guidelines state that longitudinal monitoring of sPLA2R-Ab levels at 6 months after the start of treatment may be useful for evaluating treatment response in patients with MN and can be used to adjust the treatment strategy [14]. Consistent with these findings, we also found a significant association between baseline sPLA2R-Ab levels and renal outcome in IMN patients.

Unfortunately, we did not find satisfactory results for common prognostic markers of IMN, such as serum albumin [14, 45]. First, most patients predicted by the nomogram to have non-remission will have refractory MN, and our endpoint was a time-based outcome, whereas previous studies mostly used event-based outcomes, which may explain the different results. Second, we suspect that this may be due to the liver’s strong capacity to synthesize albumin; our small sample size may also have contributed, and we hope to conduct further studies with larger samples to clarify the association. Furthermore, we did not find that treatment regimens had a significant effect on proteinuria outcomes. However, RAAS inhibitors, immunosuppressants, and CD20 monoclonal antibody therapy all showed low odds ratios in univariable logistic regression, implying that certain patients might benefit from these treatments. Therefore, based on our model, we recommend that when the predicted probability of proteinuria non-remission is high and disease progression is likely, clinicians communicate actively with the patient and intervene to achieve a good prognosis.

The present study developed a dynamic online nomogram for the early prediction of poor treatment response in IMN patients, which can help clinicians formulate individualized treatment and management plans and determine whether it is appropriate to initiate immunosuppressive therapy to reduce the risk of progression to ESRD. However, the present study has several limitations. First, the data came from three study centers, and the lack of a unified testing platform resulted in differences in baseline data between the validation and training cohorts. On the other hand, this also supports the universality and applicability of the prognostic model. Second, recent studies showed that chronic tubulointerstitial inflammation is a risk factor for poor renal prognosis in patients with IMN [19, 46]. This retrospective study had a small sample size and did not include urinary α1/β2-microglobulin, so the association between chronic tubulointerstitial inflammation and poor prognosis of IMN was not studied. Therefore, a prospective, multicenter, large-scale cohort is needed to explore this correlation.

Conclusions

In conclusion, we developed a dynamic online model for assessing prognosis and treatment response in patients with IMN and validated the model using independent patient cohorts. The nomogram is easy to use and can identify patients with IMN who are at high risk of poor response to treatment and a poor prognosis. It may help clinicians formulate an individualized treatment plan and discuss with patients when to start immunosuppressive therapy to achieve a good prognosis.