Introduction

Pyogenic spondylitis refers to the infection of the spine, which encompasses a broad range of clinical entities (pyogenic spondylodiscitis, vertebral osteomyelitis and epidural abscess) [1, 2]. This devastating disorder accounts for 2–7% of all cases of musculoskeletal infection [2], and along with the aging of the population, the prevalence of pyogenic spondylitis is increasing [3, 4]. Meanwhile, postoperative spondylitis constitutes an important complication of spine surgery, which often leads to hospitalization/reoperation, increased morbidity and high economic burden [5]. Most patients with pyogenic spondylitis can achieve a satisfactory prognosis with conservative treatment at the early stage, but for patients with spinal cord or cauda equina compression and progressive neurological deficits, surgical intervention is indicated [5]. Early and accurate identification of pyogenic spondylitis is necessary to initiate timely management and improve the long-term outcome. However, the symptoms and signs of pyogenic spondylitis are often unspecific, diffuse and treacherous (e.g., fever, mild back pain), resulting in diagnostic difficulty.

A combination of clinical symptoms, laboratory tests and imaging investigations are necessary for assessing suspected pyogenic spondylitis in the clinic. The basic diagnostic examinations to establish a diagnosis of spondylitis include conventional X-ray, magnetic resonance imaging (MRI), computed tomography (CT), biopsy, blood culture and multiple nuclear imaging techniques. Changes in inflammatory markers such as C-reactive protein (CRP) and erythrocyte sedimentation rate (ESR) are evident on examination only after 4–8 weeks from the onset of the symptom [6, 7]. A substantial number of patients with chronic infection have normal or near-normal inflammatory markers [6]. Biopsy and blood culture are only successful in about half of the suspected patients to obtain a definite diagnosis and therefore are more suitable in the acute phase of spondylitis [8]. X-ray imaging and CT are relatively insensitive to diagnosing early spondylitis and predominantly used to guide biopsy [5]. Whereas MRI has high diagnostic accuracy for infection in the non-operated spine (sensitivity and specificity of > 90%) and therefore was ranked as the golden standard, detecting postoperative spondylitis by MRI remains challenging [9, 10]. Other shortcomings of MRI for assessing spondylitis are artifacts caused by metallic implants, similarities among pyogenic spondylitis, fracture and degenerative spinal diseases, and compromised sensitivity in patients with short-term symptoms [9, 10]. Therefore, additional practical tools are needed for diagnosing pyogenic spondylitis.

Fluorine-18 fluorodeoxyglucose positron emission tomography (18F-FDG PET) and PET/CT can be used to appraise increased glucose uptake and therefore are usually recommended to diagnose, stage, and detect the recurrence or progression of the malignant tumors [11]. Studies using this nuclear imaging technique to assess infection/inflammation are growing in recent years, especially for musculoskeletal infections [12]. Compared to CT and MRI, 18F-FDG PET and PET/CT provide the advantages of whole-body coverage to detect unintended metastasis of infection, and fewer artifacts due to metallic implants. Meanwhile, it can differentiate infections in bone or soft tissue. Thus, 18F-FDG PET and PET/CT have been suggested for assessing spinal infection when other imaging modalities fail to provide a definitive diagnosis or patients do not respond to antibiotic therapy as expected. A series of studies investigated the diagnostic accuracy of 18F-FDG PET and PET/CT for suspected spondylitis; however, these studies have relatively small sample sizes and limited power individually [3, 5, 9, 10]. Meanwhile, their reported results were controversial.

Several systematic reviews qualitatively appraised these studies but did not calculate the pooled diagnostic accuracy. A meta-analysis published in 2019 by Kim et al. further compared the diagnostic accuracy of 18F-FDG PET and MRI for spondylodiscitis [9]. However, this investigation only included seven studies published before 2018. Multiple large-scale studies specifically investigating the diagnostic efficiency of 18F-FDG PET for spondylitis were ignored. A literature search revealed no previous meta-analysis that provided a general overview of the 18F-FDG PET and PET/CT for diagnosing suspected spondylitis. With this in mind, this study aims to quantitatively clarify this issue by performing a systematic review and meta-analysis. Meanwhile, the diagnostic accuracy of this nuclear technique for suspected spondylitis with and without previous spine surgery was investigated separately.

Materials and methods

The methodological approach to evidence searching and synthesis described in this article was following the Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies (PRISMA-DTA) statement [13]. Two investigators (QY Zhang and HT Feng) independently conducted electronic database searching, eligibility assessment, data extraction and methodological quality appraising. Any disagreement was settled through discussion and consensus. Informed consent and ethical approval were not needed because all data were retrieved from public databases.

Search strategy

We systematically searched three electronic databases including PubMed, Embase, and Cochrane Library to retrieve entries published before March 15, 2022. Vocabulary and syntax were specifically adapted according to the database. We used (“spondylodiscitis” OR “discitis” OR “spondylitis” OR “vertebral osteomyelitis” OR “epidural abscess”) as the diagnosis of interest and (“fluorine-18 fluorodeoxyglucose positron emission tomography” OR “18F-FDG PET”) as the index tests. No language limitation or other search filters were applied. Bibliographies of related articles were also screened manually to identify additional eligible records.

Inclusion and exclusion criteria

Studies eligible for this meta-analysis needed to meet all the following criteria: (1) study design, diagnostic accuracy study, (2) population, patients with suspected pyogenic spondylitis, (3) 18F-FDG PET or PET/CT was performed, (4) the final diagnosis of pyogenic spondylitis was confirmed by predesigned reference criteria; and (5) enough data could be extracted to construct a 2 × 2 contingency table to determine the diagnostic performance of index test.

Exclusion criteria were (1) case–control studies; (2) studies not specifically investigating suspected pyogenic spondylitis; (3) studies with fewer than five participants; and (4) expert opinions, comments, letters and editorials.

The titles and abstracts were first assessed to exclude apparently ineligible studies, and the final decision on inclusion was based on downloading and scrutinizing the full articles. For studies with overlapping participants, the latest one was included. Inter-reviewer agreement was assessed at each stage by using the Kappa (κ) statistic: a k > 0.6, substantial agreement, a k between 0.21 and 0.6, moderate agreement, and a k < 0.2, poor agreement.

Data extraction

The following information was extracted and recorded into a standardized excel file: first author’s last name, year and region of publication, study design, inclusion interval of participants, number and demographic information of participants, standardized references, number of patients with suspected spondylitis, type and methodological data of applied nuclear imaging modality, history of spine surgery, presence or absence of implants, as well as the raw diagnostic data (number of false/true-positive [FP/TP] and false/true-negative [FN/TN] cases).

Quality assessment

The methodological quality of included studies was appraised according to the QUADAS (Quality Assessment of Diagnostic Accuracy Studies)-2 tool [14], which consisted of eleven questions. These questions were answered with “yes” for a low risk of bias/concerns, “no” for a high risk of bias/concerns or “unclear” when the relevant information was not clearly provided.

Statistical analyses

The pooled positive incidence was computed by generating the proportion of the yield (true-positive) and associated 95% confidence intervals (CI) using a random effect model. For the diagnostic modalities, TP, FP, TN and FN results were derived from a two-by-two contingency table. The pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), summarized receiver operating characteristic curve (sROC) and the area under sROC (AUC) were constructed utilizing the original diagnostic data. Heterogeneity across included studies was assessed using the I2 statistic. An I2 value of 0% implied no heterogeneity, and values of > 50% indicated substantial heterogeneity. Publication bias was assessed using Deeks' Funnel Plot Asymmetry. All statistical analyses were finished using STATA (Version 16.0, StataCorp, College Station, TX).

Results

Included studies

A total of 931 unique records were identified by searching electronic databases and screening reference lists of relevant articles. After the initial screening of titles and abstracts, 356 articles were further assessed by scrutinizing the full texts against the predesigned criteria, and the eligibility of each article was established by a group discussion until consensus was reached. Eventually, 18 [5,6,7, 15,16,17,18,19,20,21,22,23,24,25,26,27,28,29] articles published during the period from 2001 to 2021 were selected for quantitative analysis. Selection processes for eligible studies are summarized in Fig. 1. There was a substantial inter-reviewer agreement as to the title (κ = 0.94) abstract (κ = 0.87) and full-text screening (κ = 0.97) stages.

Fig. 1
figure 1

Selection process of included studies

Study characteristics

The sample size of included studies ranged from 9 to 149 with a total of 660 patients, of which 332 ones were identified with pyogenic spondylitis. Nine [5, 18, 19, 21, 22, 25,26,27,28] studies used prospective study design, whereas nine [6, 7, 15,16,17, 20, 23, 24, 29] were retrospective ones. Twelve [5, 7, 15, 19, 21,22,23, 25,26,27,28,29] studies claimed that the nuclear medicine specialists were blinded to the final diagnosis of reference tests, while six [5, 15, 18, 20, 24, 29] studies did not clarify this fact. Among included studies, seven [21, 24,25,26,27,28,29] focused on 18F-FDG PET and 11 [5,6,7, 15,16,17,18,19,20, 22, 23] used 18F-FDG PET/CT. A schematic overview of the included studies is presented in Table 1, and the technical details are listed in Table 2. As for the methodological quality of included studies, fifteen studies were identified to have high risk of bias in the term “differentially verification avoided” because the reference standard was not uniform (Fig. 2).

Table 1 Main characteristics of the included studies
Table 2 Technical aspects of the included studies
Fig. 2
figure 2

Methodological quality summary of the included studies. Red indicates a high risk of bias, yellow an unclear risk of bias and green a low risk of bias

Meta-analysis

Incidence of a positive result

The overall incidence of positivity of 18F-FDG PET as generated from seven [21, 24,25,26,27,28,29] datasets was 56% (95% CI: 33–79%) for suspected pyogenic spondylitis (Fig. 3). For 18F-FDG PET/CT, the overall incidence of positivity as generated from 11 [5,6,7, 15,16,17,18,19,20, 22, 23] datasets was 59% (95% CI: 49–68%), demonstrating a similarly positive result with 18F-FDG PET (Fig. 3).

Fig. 3
figure 3

Overall incidence of a positive result of 18F-FDG PET and PET/CT in patients with suspected pyogenic spondylitis

Overall diagnostic value of 18F-FDG PET or PET/CT for pyogenic spondylitis

The pooled sensitivity and specificity of 18F-FDG PET or PET/CT for diagnosing pyogenic spondylitis were 0.91 (95% CI, 0.84 to 0.95) and 0.90 (95% CI, 0.79 to 0.95), respectively (Fig. 4A and B). The pooled PLR, NLR, DOR and AUC were 8.9 (95% CI, 4.2 to 18.9), 0.10 (95% CI, 0.06 to 0.18), 86.00 (95% CI, 31.00 to 240.00) and 0.96 (95% CI, 0.94 to 0.97), respectively (Fig. 4C). Deeks' funnel plot asymmetry test revealed no existences of publication bias (p = 0.93, Fig. 4D).

Fig. 4
figure 4

Diagnostic performance of 18F-FDG PET and PET/CT for pyogenic spondylitis: A pooled sensitivity B pooled specificity C summary receiver operating characteristic curve (sROC) with the Q*-index and D publication bias

18F-FDG PET for pyogenic spondylitis

The pooled sensitivity and specificity of 18F-FDG PET for diagnosing pyogenic spondylitis were 0.98 (95% CI, 0.88 to 1.00) and 0.88 (95% CI, 0.69 to 0.96), respectively (Additional file 1: Figure S1A and B). The pooled PLR, NLR, DOR and AUC were 8.5 (95% CI, 2.8 to 26.1), 0.02 (95% CI, 0.00 to 0.16), 414 (95% CI, 30 to 5800) and 0.99 (95% CI, 0.98 to 1.00), respectively (Additional file 1: Figure S1C). No publication bias was identified, either (p = 0.93, Additional file 1: Figure S1D).

18F-FDG PET/CT for pyogenic spondylitis

The pooled sensitivity and specificity of 18F-FDG PET/CT for diagnosing pyogenic spondylitis were 0.86 (95% CI, 0.78 to 0.91) and 0.91 (95% CI, 0.76 to 0.97), respectively (Additional file 2 and 1: Figures S2A and S1B). The pooled PLR, NLR, DOR and AUC were 9.6 (95% CI, 3.2 to 28.4), 0.16 (95% CI, 0.09 to 0.25), 62 (95% CI, 17 to 231) and 0.97 (95% CI, 0.95 to 0.98), respectively (Additional file 2: Figure S2C). Deeks' funnel plot asymmetry test did not reveal publication bias (p = 0.71, Additional file 2: Figure S2D).

18F-FDG PET or PET/CT for pyogenic spondylitis without previous spine surgery

Twelve [16, 18,19,20,21,22,23,24,25, 27,28,29] studies investigated the accuracy of 18F-FDG PET or PET/CT for diagnosing pyogenic spondylitis without previous spine surgery. The pooled sensitivity and specificity were 0.93 (95% CI, 0.85 to 0.97) and 0.91 (95% CI, 0.77 to 0.97), respectively (Fig. 5A and B). The pooled PLR, NLR, DOR and AUC were 10.5 (95% CI, 3.90 to 28.60), 0.08 (95% CI, 0.04 to 0.17), 136 (95% CI, 35 to 530) and 0.97 (95% CI, 0.95 to 0.98), respectively (Fig. 5C). No publication bias was identified (p = 0.10, Fig. 5D).

Fig. 5
figure 5

Diagnostic performance of 18F-FDG PET and PET/CT for spondylitis without spine surgery: A pooled sensitivity B pooled specificity C summary receiver operating characteristic curve (sROC) with the Q*-index and D publication bias

18F-FDG PET or PET/CT for postoperative spondylitis

Six [5,6,7, 15, 17, 26] studies investigated the accuracy of 18F-FDG PET for diagnosing postoperative spondylitis. The pooled sensitivity and specificity were 0.85 (95% CI, 0.71 to 0.93) and 0.87 (95% CI, 0.66 to 0.96), respectively (Fig. 6A and B). The pooled PLR, NLR, DOR and AUC were 6.60 (95% CI, 2.20 to 19.4), 0.17 (95% CI, 0.08 to 0.36), 38 (95% CI, 9 to 167) and 0.92 (95% CI, 0.89 to 0.94), respectively (Fig. 6C). Deeks' funnel plot asymmetry test revealed no existence of publication bias (p = 0.24) (Fig. 6D).

Fig. 6
figure 6

Diagnostic performance of 18F-FDG PET and PET/CT for postoperative spondylitis: A pooled sensitivity B pooled specificity C summary receiver operating characteristic curve (sROC) with the Q*-index and D publication bias

Qualitative analysis

Cut-off value of SUVmax values for the diagnosis of pyogenic spondylitis

Three [15, 21, 22] studies provided the cut-off values of 18F-FDG uptake for the diagnosis of spondylitis. Segawa et al. [15] revealed that all patients with a cut-off SUVmax of ≥ 5.0 had infection (specificity = 100%) and the sensitivity is 90%. Another study [22] applied the SUVmax of 5.324 as the threshold value, however, the sensitivity is only 56%. Fahner et al. [21] and Fuster et al. [22] adapted SUVmax ratios (SUVmax corrected by normal tissue) of 2.1 and 2.2, respectively. The reported specificities were 100% and 88%, respectively.

Presence of implants for the diagnostic yield

Studies reported by Frenkel et al. [17] and Follenfant et al. [7] only investigated the postoperative spondylitis with implants. These two studies found similar diagnostic values to the pooled performance of 18F-FDG PET and PET/CT for postoperative spinal infections. 18F-FDG PET and PET/CT could effect tools to narrow the surgical field for a successful revision surgery.

Discussion

As noninvasive modalities, nuclear imaging techniques have become increasingly important for diagnosing infection/inflammation by identifying elevated glycolysis. Multiple studies specifically investigated the diagnostic accuracy of 18F-FDG PET or PET/CT for pyogenic spondylitis but no definitive result was obtained. To address this question scientifically, an updated pairwise meta-analysis was performed and the pooled results demonstrated that 18F-FDG PET and PET/CT had satisfactory accuracy (sensitivity = 0.91, 95% CI: 0.84–0.95; specificity = 0.90, 95% CI: 0.79–0.95) in identifying or excluding suspected pyogenic spondylitis. These results are partly consistent with the previously reported accuracy of this imaging technique for diagnosing suspected fracture-related infection and periprosthetic joint infection [12].

Currently, MRI is the most commonly used diagnostic procedure for pyogenic spondylitis. The manifestations include intervertebral disk injury, subsequent intervertebral stenosis, blurring of the endplates, height loss of the invaded vertebral bodies, possible epidural involvement and increased contrast enhancement in the spine [30, 31]. However, MRI is not a suitable examination for all patients. First, clinical evaluation by MRI is negatively affected by embedded implants due to the existence of artifacts [32]. Metallic hardware could induce perturbations to the static B0 magnetic field, disrupting the spatial encoding mechanism used in conventional MRI. Besides, for patients susceptible to multiple infections, whole-spine or whole-body examination using MRI needs high costs and a long run time. Last but not least, MRI cannot be performed in patients with implantable cardiac electronic devices, cochlear implants or other foreign metallic bodies [33]. Higher SUVs of 18F-FDG reflect an elevated risk for pyogenic spondylitis, providing valuable information to complement the morphological and structural data provided by MRI and CT [34]. Considering the high diagnostic accuracy, 18F-FDG PET and PET/CT can overcome the shortcoming of MRI mentioned above, fit for patients with internal implants or with suspected multiple infections. Meanwhile, compared with other alternative nuclear imaging methods (for instance, SPECT/CT), 18F-FDG PET and PET/CT are convenient and cost-effective by preventing unnecessary examinations and reducing duration of hospitalization, widely used in clinical practice [35].

Only three studies defined specific cut-off values of maximum standardized uptake value (SUVmax) for positivity [15, 21, 22]. From these results, a constant value of SUVmax to distinguish pyogenic spondylitis from non-infected cases is not realistic. A SUVmax ratio based on normal tissue is a preferable choice. Establishing criteria for identifying pyogenic spondylitis according to the dose of 18F-FDG administrated, and the characteristic of each center is imperative. In other studies, pattern-based diagnosis is employed. Through the joint application of semi-quantitative data (SUV) and qualitative assessment (distribution patterns and grades), 18F-FDG PET and PET/CT revealed considerable sensitivity in the diagnosis and appraisal of spondylitis foci throughout the body.

Postoperative spondylitis is a severe complication after spine surgery with and without instruments. The reported incidence varies from 1 to 15% with posterior surgery having a higher rate than anterior one [5, 6, 15]. It could be noticed that the most commonly cultured organism in spondylitis was Mycobacterium tuberculosis [5,6,7, 15, 17]. This infection could lead to osteolysis and the subsequent loosening of orthopedic implants [36]. Besides additional surgery and pain, postoperative spondylitis is associated with a high financial cost, averaging about 38, 000 dollars per patient [5]. Although not influenced by artifacts, 18F-FDG uptake will also be increased in postoperative patients without infection due to the local inflammation. In this study, we noticed that 18F-FDG PET and PET/CT revealed excellent sensitivity (0.85, 95% CI, 0.71–0.93) and specificity (0.87, (95% CI, 0.66 to 0.96) for diagnosing postoperative spondylitis, which seems to be lower than the diagnostic effect of 18F-FDG PET and PET/CT for pyogenic spondylitis without previous spine surgery. Further appraising using DOR, a parameter compromising the sensitivity and specificity, confirmed this conclusion.

Depending on the clinical scenario where 18F-FDG PET or PET/CT is employed, one might want to maximize the sensitivity to allow the scan to effectively rule out infection or to enhance specificity to ascertain the presence of pyogenic spondylitis. The reasons for the misdiagnoses are multifactorial. Increased 18F-FDG uptake is not pathognomonic for infectious diseases. Some benign lesions such as sterile inflammatory disease, granulomatous tissue and fractures also exhibit high levels of tracer accumulation. Therefore, the findings of 18F-FDG PET and PET/CT should be finally confirmed by histopathology examination or follow-up. Meanwhile, false-negative cases are inevitable. The first cause is the nonspecific 18F-FDG uptake and asymmetric 18F-FDG distribution in infectious diseases. Second, since the limited spatial resolution of 18F-FDG PET, occult or small lesions could not be identified. Third, some false-negative results are related to low-virulent bacteria.

Although this meta-analysis only investigated the diagnostic value of 18F-FDG PET and PET-CT, in comparison with the former two meta-analyses [9, 10], this study has three strengths. First, this investigation directly assessed the diagnostic accuracies of 18F-FDG PET and PET/CT in suspected spondylitis using a pairwise meta-analysis. Unlike the recent review [9] that only retrieved seven studies and focused on the comparison of 18F-FDG PET and MRI, we performed a thorough literature search and eventually eighteen studies were included. This is the most comprehensive analysis on this topic. Second, a subgroup meta-analysis was performed according to the type of pyogenic spondylitis, and main diagnostic parameters were obtained in each analysis. Distinct characteristics existed between spondylitis with spine surgery and those without surgical history, and in this study, these two groups of entities were analyzed separately. Last but not least, multiple statistical indicators as well as qualitative analysis were used to appraise the performance of 18F-FDG PET and PET/CT for spondylitis.

This study is not without limitations. A general shortcoming of included studies is the lack of a uniform reference standard for identifying spondylitis. We could not restrict inclusion to studies using biopsy or microbiological examination because these techniques only revealed limited sensitivity. Therefore, clinical follow-up sometimes was imperative to reconfirm the final diagnosis. Second, evidence of heterogeneity in diagnostic data existed throughout enrolled articles. Another major limitation is that subgroup analyses were not conducted during data merging based on essential variables such as whether metallic implants were used. Other minor shortcomings also merit consideration. For instance, the number of eligible studies is relatively small, and whether the results of nuclear imaging modalities were interpreted with blinding to the findings of reference test was not mentioned in nine studies, which could reduce the reliability of the result.

Conclusions

Based on the results of the current meta-analysis, 18F-FDG PET or PET/CT presents satisfactory accuracy for the diagnosis of pyogenic spondylitis. The diagnostic effect of this nuclear imaging method for spondylitis without previous spine surgery seems to be better than that for the postoperative ones. However, whether 18F-FDG PET or PET/CT could become a routine in patients with suspected spondylitis remains to be confirmed.