Background

Osteoporotic vertebral compression fracture (OVCF) substantially affects the health of the global elderly population. Untreated vertebral compression fractures can lead to chronic pain, progressive spinal deformity, and neurological deficits, severely impacting patients’ quality of life [1]. Percutaneous kyphoplasty (PKP), which demonstrates remarkable benefits in pain alleviation and function restoration, represents one of the primary treatments for OVCF. However, serious challenges associated with postoperative complications related to the vertebrae are still being encountered. The most common complication is extravertebral leakage of polymethylmethacrylate, which can lead to severe pain, neurological impairment, or embolism [1]. Other less common complications include vertebral recompression, cement displacement, and cement nonunion. Additionally, kyphoplasty can sometimes lead to secondary fractures of adjacent vertebrae, which is a particularly feared complication [2].

In vertebral recompression, the treated vertebra recollapses postoperatively, leading to recurrent pain and functional impairment [3]. Multiple factors influence this condition, including cement distribution [4, 5], the severity of osteoporosis [4], and surgical techniques [4]. Cement nonunion refers to the failure of bone cement to integrate fully with the vertebral bone, resulting in gas or liquid interfaces postoperatively, a topic scarcely studied [6]. In cement displacement, the injected bone cement migrates within the vertebra during follow-up, potentially extruding from the vertebra and causing severe pain and neurological damage [7]. The limited research and small sample sizes in previous studies highlight the need for comprehensive risk assessments. Most studies focus on single factors, and systematic research on the combined effects of multiple factors is lacking. This study defines adverse events of the fractured vertebra (AEFV) as bone-related adverse events occurring in the treated vertebra following surgery, including vertebral recompression, cement displacement, and cement nonunion. We also systematically introduce several novel predictive variables, such as vertebral body computed tomography value (CT value), kissing spine, pre- and postoperative Cobb angle changes, and vertebral height recovery rate (VHRR), to investigate AEFV.

In spine surgery, machine learning models have been increasingly applied to predict outcomes and identify risk factors, offering enhanced precision and the ability to process large, complex datasets. For example, these models have been used to predict the risk of postoperative complications, such as adjacent segment disease, by analyzing patient-specific data and surgical factors [8]. Such applications demonstrate the potential of machine learning to improve decision-making and patient outcomes in spine surgery.

This study aims to systematically analyze the risk factors associated with AEFV following PKP and to develop predictive models that can assist clinicians in identifying high-risk patients. By improving risk stratification and guiding clinical decision-making, this study seeks to contribute valuable insights to the management of OVCF patients and enhance patient outcomes.

Materials and methods

Study subjects

This retrospective study, reported in line with the STROCSS criteria [9], included 383 primary OVCF patients who underwent PKP at Beijing Luhe Hospital from January 2018 to March 2023. Patients were divided into two groups: those without AEFV (168 cases) and those with AEFV (215 cases), including vertebral recompression, cement displacement, and cement nonunion.

Inclusion criteria

(1) Fragility fracture mechanism.
(2) Clear diagnosis of OVCF with relevant symptoms.
(3) Complete preoperative imaging (X-ray, CT, MRI).
(4) Follow-up time greater than one year with complete follow-up data.
(5) Single OVCF.

Exclusion criteria

(1) High-violence injuries.
(2) Incomplete clinical data during follow-up.
(3) Secondary osteoporosis.
(4) Aggravated comorbidities including but not limited to malignant tumors, severe cardiovascular and cerebrovascular diseases, and liver and kidney insufficiency.
(5) Multiple OVCFs.

Data collection

Collected data included gender, age, diagnosis, vertebral body CT value, kissing spine, Cobb change, VHRR, endplate integrity, anterior cortex integrity, cement leakage, and paravertebral muscle fat infiltration grade (PMFIG). These were measured by three experienced spine surgeons, and the average was used.

Variable observation methods

Cement Displacement: Identified via X-ray or CT showing anterior cortical rupture and cement displacement [7].

Vertebral Recompression: Detected through a reduction in anterior vertebral height on lateral X-rays, with a decrease > 5 mm [10].

Cement Nonunion: This condition is indicated by the presence of air or fluid around the cement mass on MRI or CT [6].

Vertebral Body CT Value: Measured at L1 using the region of interest in the central cancellous bone on axial CT (or adjacent vertebra if L1 was unmeasurable). The average CT value (HU) was calculated using three repeated measurements [11].

Kissing Spine: Also known as Baastrup disease, this condition is characterized by thickening, sclerosis, and osteophyte formation in the spinous processes on sagittal CT [12].

Cobb Angle: This angle is measured on the lateral X-ray of the fractured vertebra, between the upper endplate of the vertebra above the injury and the lower endplate of the vertebra below the injury [13].

Cobb Change: Computed as the postoperative Cobb angle minus the preoperative Cobb angle [13].

VHRR: Measured on lateral X-ray or the sagittal CT bone window. VHRR = (height of the anterior margin of the injured vertebra after surgery − height of the anterior margin of the injured vertebra before surgery) / (average height of the upper and lower anterior margins of the injured vertebra − height of the anterior margin of the injured vertebra before surgery) [14].

PMFIG: This parameter is assessed at the L3 intervertebral disc level on T2-weighted MRI. The fat content in the paravertebral muscles is graded qualitatively as follows: 0, no intramuscular fat; 1, some fatty streaks; 2, significant fat but less than muscle; 3, equal fat and muscle; 4, more fat than muscle [15].
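The Cobb change and VHRR definitions above reduce to simple arithmetic. The following minimal sketch shows that arithmetic under stated assumptions: the function and parameter names are illustrative, `reference_height_mm` stands for the "average height of the upper and lower anterior margins" term, and the numeric inputs are hypothetical measurements rather than study data.

```python
def cobb_change(preop_deg: float, postop_deg: float) -> float:
    """Cobb change = postoperative Cobb angle minus preoperative Cobb angle (degrees)."""
    return postop_deg - preop_deg


def vhrr(preop_height_mm: float, postop_height_mm: float, reference_height_mm: float) -> float:
    """Vertebral height recovery rate.

    Numerator: anterior height gained by surgery.
    Denominator: anterior height lost before surgery relative to the reference
    height (an assumed name for the averaged upper/lower anterior margin height).
    """
    lost_height = reference_height_mm - preop_height_mm
    if lost_height <= 0:
        raise ValueError("reference height must exceed the preoperative height")
    return (postop_height_mm - preop_height_mm) / lost_height


# Hypothetical measurements, for illustration only.
example_cobb_change = cobb_change(preop_deg=20.0, postop_deg=22.5)
example_vhrr = vhrr(preop_height_mm=15.0, postop_height_mm=21.0, reference_height_mm=25.0)
print(example_cobb_change, round(example_vhrr, 2))
```

A positive Cobb change under this sign convention means the postoperative angle is larger than the preoperative angle.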

Statistical analysis

Descriptive statistics were calculated for all variables. Normality tests were applied to continuous variables, followed by Mann–Whitney U and chi-square tests for group comparisons. Spearman's rank correlation coefficient was used to assess the relationship between each variable and AEFV, vertebral recompression, cement displacement, and cement nonunion and determine the relationship direction (positive or negative).
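To make the group comparison and correlation steps concrete, the sketch below implements a Pearson chi-square test for a 2×2 table and Spearman's rank correlation using only the Python standard library. It is not the SPSS procedure used in the study. The kissing spine counts (184 of 215 and 37 of 168) are reconstructed from the percentages reported in the Results and should be treated as an assumption; no continuity correction is applied.

```python
import math


def pearson_chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y) -> float:
    """Spearman's rho computed as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((u - mx) * (v - my) for u, v in zip(rx, ry))
    var_x = sum((u - mx) ** 2 for u in rx)
    var_y = sum((v - my) ** 2 for v in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Rows: AEFV / non-AEFV; columns: kissing spine present / absent (reconstructed counts).
chi2, p_value = pearson_chi2_2x2(184, 31, 37, 131)
print(f"chi2 = {chi2:.2f}, p < 0.001: {p_value < 0.001}")
```

The same `spearman_rho` function accepts a binary outcome (0/1 for AEFV) against a continuous or ordinal predictor, and the sign of the result gives the direction of the relationship.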

The models included as predictors the variables that met the following criteria: (1) significant differences observed in both the group comparisons and the Spearman correlation analysis; and (2) a consistent positive or negative correlation with all the target events (AEFV, vertebral recompression, cement displacement, and cement nonunion), which ensured that no offsetting effects occurred between variables and target events.
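The direction-consistency criterion can be written as a small rule. The sketch below is one possible reading, not the authors' code: it assumes that only significant coefficients (P < 0.05) count toward the direction check, which mirrors how a small, nonsignificant coefficient is handled for cement leakage in the Results. All coefficients and P values in the example are hypothetical placeholders, not values from Table 2.

```python
ALPHA = 0.05  # significance threshold used in the univariate analysis


def has_consistent_direction(results: dict, alpha: float = ALPHA) -> bool:
    """Return True when all significant correlations share one sign.

    `results` maps a target event name to a (rho, p_value) pair.
    Nonsignificant coefficients are ignored (an assumption of this sketch).
    A variable with no significant correlation at all is rejected.
    """
    signs = {rho > 0 for rho, p in results.values() if p < alpha}
    return len(signs) == 1


# Hypothetical coefficients for illustration only.
candidates = {
    "kissing_spine": {
        "AEFV": (0.60, 0.001), "recompression": (0.30, 0.001),
        "displacement": (0.20, 0.010), "nonunion": (0.25, 0.002),
    },
    "cement_leakage": {
        "AEFV": (0.20, 0.001), "recompression": (-0.03, 0.600),
        "displacement": (0.15, 0.010), "nonunion": (0.18, 0.004),
    },
    "hypothetical_mixed": {
        "AEFV": (0.20, 0.010), "recompression": (-0.25, 0.002),
        "displacement": (0.10, 0.300), "nonunion": (0.12, 0.200),
    },
}

selected = [name for name, res in candidates.items() if has_consistent_direction(res)]
print(selected)
```

Under this rule a variable with opposite significant effects on two component events is excluded, because its contributions to the composite AEFV outcome could cancel out.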

The correlation coefficients between variables related to AEFV were visualized in a heatmap. A multivariate logistic regression and multiple machine learning models (logistic regression, support vector machine (SVM), decision tree, gradient boosting, and random forest) were constructed. Feature importance scores were calculated in the random forest model (Supplemental Methods).
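A comparison of these five model families could be set up with scikit-learn roughly as follows. This is a sketch under explicit assumptions: the data are synthetic (383 rows and 9 predictors generated by `make_classification`, not the study cohort), and the hyperparameters, the RBF kernel, the feature scaling step, and the ROC AUC scoring choice are illustrative rather than taken from the Supplemental Methods.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the study data (NOT the real cohort).
X, y = make_classification(
    n_samples=383, n_features=9, n_informative=4, n_redundant=2, random_state=0
)

# Scale-sensitive models are wrapped in a pipeline with standardization.
models = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Ten-fold stratified cross-validation, scored by ROC AUC.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
cv_auc = {
    name: float(cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean())
    for name, model in models.items()
}

# Impurity-based feature importances from the random forest (they sum to 1).
forest = models["random_forest"].fit(X, y)
importances = forest.feature_importances_

for name, auc in cv_auc.items():
    print(f"{name}: mean ROC AUC = {auc:.3f}")
```

Stratified folds keep the proportion of AEFV cases similar in every fold, which matters when the two groups are of unequal size.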

Basic statistical analyses were performed using SPSS (version 26.0), with R (version 4.0.5) and Python (version 3.8) used for complex data processing and model evaluation.

Results

The patients were divided into those without AEFV (168 cases) and those with AEFV (215 cases).

Categorical Variables: The AEFV group included a significantly higher proportion of patients with kissing spine (85.58% vs. 22.02%) and significantly lower proportions of patients with an intact endplate (25.58% vs. 55.36%) and an intact anterior cortex (32.56% vs. 75.60%). Cement leakage was more common in the AEFV group (54.42% vs. 33.93%), and PMFIG was higher (Table 1).

Table 1 Descriptive Statistics and Univariate Analysis of Variables

Continuous Variables: Compared with the non-AEFV group, the AEFV group had a higher median age (73 vs. 67 years), a lower median vertebral body CT value (51.20 vs. 98.70), a larger median Cobb change (2.50 vs. -1.60), and a higher median VHRR (0.58 vs. 0.38) (Table 1).

Univariate Analysis: Significant differences were observed between the groups in terms of age, vertebral body CT value, Cobb change, VHRR, the presence of kissing spine, endplate integrity, anterior cortex integrity, cement leakage, and PMFIG (P < 0.05). However, no significant differences were detected in terms of gender and diagnosis (Table 1).

Spearman correlation analysis revealed significant positive correlations of kissing spine, PMFIG, age, Cobb change, and VHRR with AEFV (Fig. 1), vertebral recompression, cement displacement, and cement nonunion. Conversely, endplate integrity, anterior cortex integrity, and vertebral body CT value showed significant negative correlations with these events. Because the direction of each relationship was consistent across the target events (AEFV, vertebral recompression, cement displacement, and cement nonunion), these variables could be included in the AEFV prediction model without offsetting effects. Cement leakage exhibited a significant positive correlation with AEFV, cement displacement, and cement nonunion but a small, nonsignificant negative correlation with vertebral recompression. Its relationship with the remaining target events (AEFV, cement displacement, and cement nonunion) was therefore also consistent, which supported its inclusion in the AEFV prediction model without offsetting effects. Combined with the findings of the group comparisons, the models included kissing spine, endplate integrity, anterior cortex integrity, cement leakage, PMFIG, age, vertebral body CT value, Cobb change, and VHRR as the final predictive variables (Table 2).

Fig. 1 Spearman correlation matrix Heatmap of variables related to AEFV

Table 2 Spearman correlation analysis

Figure 1 illustrates the Spearman correlation coefficient matrix for the variables (gender, diagnosis, kissing spine, endplate integrity, anterior cortex integrity, cement leakage, PMFIG, age, vertebral body CT value, Cobb change, VHRR) and AEFV. The color range spans from deep blue (negative correlation) to deep red (positive correlation), with color intensity indicating the strength of the correlation. Numeric values represent the specific correlation coefficients.

Multivariate Logistic Regression Analysis: The independent risk factors for postoperative AEFV comprised kissing spine (odds ratio (OR) = 8.47, 95% confidence interval (CI) 1.46–49.02), high PMFIG (OR = 29.19, 95% CI 4.83–176.04), low vertebral body CT value (OR = 0.02, 95% CI 0.003–0.13, P < 0.001), and large Cobb change (OR = 5.31, 95% CI 1.77–15.77) (Table 3).
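The odds ratios and confidence intervals above relate to the underlying logistic regression coefficients through exponentiation. The sketch below assumes Wald-type intervals (OR = exp(β), 95% CI = exp(β ± 1.96·SE)), which the text does not state explicitly, and uses hypothetical function names. Under that assumption, the coefficient and its standard error can be recovered from a reported interval, shown here for kissing spine.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def odds_ratio_with_ci(beta: float, se: float, z: float = Z_95):
    """Convert a logistic coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coefficient_from_reported_ci(lower: float, upper: float, z: float = Z_95):
    """Recover (beta, se) from a reported Wald interval on the odds-ratio scale."""
    beta = (math.log(lower) + math.log(upper)) / 2.0
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Reported interval for kissing spine: 95% CI 1.46 to 49.02.
beta, se = coefficient_from_reported_ci(1.46, 49.02)
odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(beta, se)
print(f"beta = {beta:.3f}, SE = {se:.3f}, OR = {odds_ratio:.2f}")
```

The recovered odds ratio is the geometric mean of the interval bounds and lands close to the reported 8.47, which is consistent with a Wald interval up to rounding. An OR below 1, as for vertebral body CT value, corresponds to a negative coefficient, meaning higher values are associated with lower odds of AEFV.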

Table 3 Multivariate logistic regression analysis of adverse events of the fractured vertebra

Model Training and Evaluation: Training and prediction were performed using logistic regression, SVM, decision tree, gradient boosting, and random forest models. Confusion matrices were used to evaluate model performance and showed high classification accuracy and predictive capability. Model stability and reliability were assessed via ten-fold cross-validation. The SVM model presented the best performance and was thus selected as the optimal model (Figs. 2 and 3). Feature importance analysis of the random forest model revealed vertebral body CT value, PMFIG, kissing spine, and Cobb change as the most critical factors for the prediction of postoperative AEFV (Table 4).
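The performance metrics compared across models all derive from the four cells of a confusion matrix. The sketch below shows those formulas with hypothetical counts. The counts do not come from Table 4; they were chosen only so that the accuracy equals 109/115, which rounds to the 94.78% figure quoted in this article.

```python
def safe_ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification metrics from confusion matrix counts."""
    precision = safe_ratio(tp, tp + fp)    # of predicted AEFV, how many were AEFV
    recall = safe_ratio(tp, tp + fn)       # sensitivity: AEFV cases that were found
    specificity = safe_ratio(tn, tn + fp)  # non-AEFV cases correctly ruled out
    return {
        "accuracy": safe_ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": safe_ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical confusion matrix for illustration only.
metrics = classification_metrics(tp=62, fp=3, tn=47, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting recall and specificity alongside accuracy matters here because a missed high-risk patient and a false alarm carry different clinical costs.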

Fig. 2 Line Plot Comparing Performance Metrics of Models. The different colored lines represent different models, with the x-axis showing various performance metrics and the y-axis showing the values

Fig. 3 Receiver Operating Characteristic (ROC) Curve of the Support Vector Machine Model. AUC Area Under the Curve

Table 4 Confusion matrix, performance metrics, and cross-validation results of the models

The relative importance of each predictive feature was also evaluated in the random forest model. The findings indicate vertebral body CT value, PMFIG, kissing spine, and Cobb change as the most critical factors for the prediction of the postoperative AEFV, with importance scores of 0.34, 0.33, 0.13, and 0.12, respectively (Fig. 4).

Fig. 4 Feature Importance in the Random Forest Model. Each bar represents the importance of a feature in predicting AEFV. The length of the bars indicates the importance value, and the features are ranked from highest to lowest importance

Discussion

The logistic regression model showed excellent data fitting and predictive performance, with an accuracy of 94.78% and a receiver operating characteristic (ROC) AUC of 99.46%. This study provides valuable insights into the risk factors associated with adverse events of the fractured vertebra (AEFV) following percutaneous kyphoplasty (PKP). By expanding the sample size and employing multiple machine learning models, we were able to identify key independent risk factors, namely kissing spine, high paravertebral muscle fat infiltration grade, low vertebral body CT value, and substantial Cobb change. The support vector machine (SVM) model, in particular, demonstrated superior predictive accuracy and generalization capability, making it a valuable tool for clinical decision-making. These findings add to the existing knowledge and offer a foundation for improving patient outcomes after OVCF surgery.

In kissing spine, the spinous processes of adjacent vertebrae come into contact with or erode each other, as commonly observed in patients with degenerative spinal changes [12]. Our study results indicate that this condition substantially affects the occurrence of AEFV. The possible mechanisms include the following: (1) Reduced spinal stability: Kissing spine alters local biomechanics, which leads to abnormal load distribution and mechanical disruption of the spine and increases the risk of recompression [16]. Changes in spinal mechanics may result in uneven stress on the bone cement within the vertebrae, which increases the risk of cement nonunion [17]. (2) Local inflammatory response: Kissing spine induces soft tissue inflammation, which results in the release of various inflammatory mediators that damage bone structure and increase the risk of recompression [18]. Inflammatory factors (e.g., tumor necrosis factor-α, interleukin (IL)-1, IL-6, and C-reactive protein) inhibit osteoblast function, weaken the bonding strength between bone cement and vertebral bone, and promote osteoclast activity, leading to bone loss, cement nonunion, and displacement [19, 20].

Paravertebral muscle fat infiltration grading contributes greatly to the assessment of spinal health and surgical prognosis [21]. Our study identified high PMFIG as an independent risk factor for AEFV, possibly acting through multiple mechanisms: (1) Biomechanical environment alteration: High fat infiltration reduces the number of muscle fibers, decreases strength, and causes abnormal spinal load distribution [22]. The decreased buffering capacity of the paravertebral muscles concentrates stress in certain vertebral areas, which increases the risk of recompression and cement micromovement or displacement [22, 23]. (2) Reduced blood supply: Fat infiltration decreases blood supply to muscle tissues, which impairs the nutrition and metabolism of bone tissue surrounding the cement, causes bone loss and a reduction in the bonding strength between the cement and the vertebral bone, increases the risk of cement nonunion, and prolongs recovery [24, 25].

Vertebral body CT value serves as a surrogate indicator for bone density and has become increasingly important in the treatment of OVCF patients. Whereas traditional dual-energy X-ray absorptiometry primarily assesses overall bone density, vertebral body CT value provides more detailed information on bone structure and is easier to measure without positional influence [26]. Our study identified a low vertebral body CT value as an independent risk factor for AEFV. Patients with low CT values have more severe osteoporosis, which increases vertebral structural fragility, reduces the effectiveness and stability of bone cement fixation, and increases the risk of cement displacement [27]. Osteoporosis also reduces the compressive strength and toughness of vertebrae, making them more susceptible to postoperative deformation and fracture and further increasing the risk of AEFV [28].

Cobb change is a vital indicator for assessing spinal kyphosis in OVCF patients [29]. This work identified a substantial Cobb change as an independent risk factor for AEFV. An increased postoperative Cobb angle suggests overcorrection during surgery, which leads to uneven vertebral stress distribution, increased load on the adjacent and injured vertebrae, and ultimately recompression [20, 31]. In addition, stress concentration affects the stability of bone cement within the vertebrae, causing micromovement and cement displacement; this condition potentially damages the surrounding bone tissue and leads to cement nonunion [27, 32].

In our study, the SVM model outperformed all other models in terms of all the performance metrics and was thus selected as the optimal model. The SVM model also offers several advantages [33]: (1) a strong capability to handle nonlinear relationships; (2) high classification accuracy with small sample sizes; (3) effective processing of high-dimensional spaces and complex tasks; and (4) good generalization capability, which effectively avoids overfitting. The potential multicollinearity issues and reliance on linear relationships in the logistic regression model [34] prompted us to compare its performance with those of other models. The decision tree, a tree-based classification model, recursively selects optimal split points to divide data into subsets [35]. It is intuitive and interpretable but susceptible to overfitting [35]. In this work, the decision tree model attained an accuracy of 90.43% and an ROC AUC of 94.02%. Despite decent specificity, its precision, recall, and F1 score were lower than those of the other models. Gradient boosting is an ensemble learning method that constructs multiple weak classifiers and iteratively optimizes them to improve overall model performance [36]. The gradient boosting model performed well, with an accuracy of 94.78% and an ROC AUC of 98.64%, although its recall and F1 score were slightly lower than those of the SVM model. Notably, gradient boosting excels in handling nonlinear relationships and high-dimensional data. Random forest, another ensemble learning method, offers strong overfitting resistance and high stability and is suitable for complex nonlinear relationships and high-dimensional data [37]. The random forest model also performed well, with an accuracy of 95.65% and an ROC AUC of 98.97%, but was slightly inferior to the SVM model. Feature importance analysis of the random forest model further validated the effect of different variables on the risk of AEFV and provided good interpretability.

Our study, while insightful, has limitations. As a retrospective analysis, selection bias and limited data representativeness are concerns. Due to incomplete follow-up, we included only 383 patients with complete data, possibly underestimating the true incidence of AEFV. The short follow-up period might have restricted the observation of long-term complications, and potential confounders, like postoperative activity levels, were not fully controlled. Future large-scale, multicenter prospective studies with extended follow-up are needed to validate and generalize these findings.

Conclusion

We identified four independent risk factors for AEFV and developed five predictive models to aid clinicians in identifying high-risk patients. These findings highlight the importance of thorough preoperative assessments and proper vertebral realignment during surgery, minimizing changes in Cobb angle to reduce AEFV incidence. Postoperatively, personalized rehabilitation and enhanced follow-up, including anti-osteoporosis treatments, are recommended to improve outcomes.