Introduction

Rationale

Colon cancer is among the most common malignancies worldwide, with recent data suggesting a steadily increasing incidence in younger patient subgroups1. Despite recent advances in perioperative therapies and refinements in surgical technique, mortality remains considerably high, with current 5-year survival rates ranging from 32 to 64%2.

Surgical treatment is considered the gold standard for resectable, non-metastatic colon cancer3. Adherence to the oncological principles of complete mesocolic excision (CME), namely complete dissection along the mesocolic plane and central ligation of the vessels at their origin, ensures extended lymph node clearance, which has a positive impact on survival outcomes3.

However, recent evidence has shown a beneficial role of neoadjuvant chemotherapy (NAC) in locally advanced tumors with specific characteristics4,5. More specifically, in the FOxTROT trial, patients with computed tomography (CT)-predicted T3-4 tumors with extramural extension ≥ 1 mm were randomized to receive either NAC or adjuvant chemotherapy4. Initial results showed that six weeks of NAC resulted in significant tumor downstaging, increased R0 resection rates, and improved 2-year disease control, especially for mismatch repair-proficient (pMMR) cancers4. Similarly, the OPTICAL study5 reported substantial pathological downstaging in T3 (with extramural spread ≥ 5 mm) or T4 tumors that received NAC.

The potential implementation of neoadjuvant modalities in colon cancer highlights the pivotal role of accurate preoperative staging6. Colon cancer staging is based on the evaluation of colonic wall invasion, lymph node involvement, and local or remote disease dissemination6. In addition, the significance of novel predictors, including extraperitoneal resection margin, tumor budding, and extramural vascular invasion (EMVI), has also been highlighted7,8,9,10. Although CT is the current baseline imaging modality11, its accuracy in several key aspects of colon cancer staging (e.g. T overstaging, unreliable N status assessment) has been repeatedly questioned5,12,13,14.

Magnetic resonance imaging (MRI) has been proposed as a valid alternative to CT, since it provides optimal spatial resolution and higher tissue differentiation15,16,17,18. In contrast to CT, MRI allows differentiation between desmoplastic and neoplastic pericolic invasion, thus resulting in more accurate local staging17. Additionally, the application of diffusion-weighted imaging sequences could, theoretically, further contribute to preoperative staging accuracy18. Issues regarding bowel peristalsis have been addressed with the introduction of faster protocols18.

Several studies have compared the diagnostic performance of MRI and CT in preoperative colon cancer staging, with most of them reporting favorable results for MRI in terms of T3ab vs T3cd differentiation and the detection of serosal involvement and extramural invasion15,18. In some cohorts, however, MRI was associated with moderate diagnostic accuracy in T3 identification and comparable results in diagnosing N and EMVI positive disease, thus adding to the uncertainty regarding its current role in local colon cancer staging19.

Aim

Therefore, the aim of the present study was to compare the diagnostic performance of MRI and CT in various aspects of local staging in colon cancer patients.

Materials and methods

Study design

This study was designed as a prospective single-arm diagnostic accuracy study. A local ethics committee clearance was granted, and the study protocol was deposited in an electronic registry (ClinicalTrials.gov ID: NCT05727007) prior to patient screening. All participants provided signed informed consent prior to inclusion, and all the scan examinations (both MRI and CT) were performed at the Radiology Department of the University Hospital of Larissa. The study report adhered to the STARD 2015 guidelines20.

Outcomes

With the assumption that T1 and T2 represent an “early” T stage, as opposed to the “locally advanced” T3 and T4, the comparison of the diagnostic accuracy of CT and MRI regarding the distinction between these two T stage groups (T1/T2 vs T3/T4) was considered the primary outcome of the study. The pathology report was considered the gold standard for local staging. Secondary endpoints included the evaluation of the diagnostic accuracy in terms of T3ab vs T3cd differentiation, nodal infiltration, EMVI, serosal invasion, and retroperitoneal surgical margin (RSM) invasion. A comparison of the tumor size measurements was also performed. More specifically, the maximum tumor diameter in each of the three dimensions (craniocaudal, transverse, and anteroposterior) was recorded.

Participants

All consecutive adult patients evaluated during the inclusion period (January 2019–March 2023) with confirmed colon cancer that met the current criteria for surgical resection were considered eligible. The following exclusion criteria were applied: (1) allergy or contraindication to intravenous contrast, (2) claustrophobia, (3) MRI-incompatible implants, (4) active sepsis, (5) untreated physical or mental disability, (6) lack of compliance with the research protocol, and (7) refusal to provide signed informed consent.

Diagnostic examinations

All abdominal MRI scans were performed on a 3.0 T MRI system (Signa HDX, General Electric, Wisconsin, USA) with a standard body coil and the patient in the supine position. Our protocol covered the whole abdomen and consisted of the following sequences: coronal T2-SE, axial T2-SE, axial T2 FS, DWI with b values of 0 and 800, and axial pre- and post-contrast T1-SE. For lesions located in the sigmoid colon, a sagittal T2-SE sequence was acquired first, and the axial and coronal T2-SE sequences were then planned along the lesion axis.

The CT scans were performed using a 128-row MDCT scanner (Philips Ingenuity CT 128) with intravenous dynamic contrast medium injection. Abdominal MDCT was performed with the patient in the supine position, covering the area from the dome of the diaphragm to the inferior pelvic margin. In each case, standardized imaging parameters (detector collimation: 16 × 10 mm, tube voltage: 120 kVp, tube current: 200–400 mA, pitch: 1, slice thickness: 1 mm, matrix: 512 × 512) were applied.

Study group

Imaging evaluation was performed in duplicate by two independent radiologists (V.A., F.I.). Both assessors were senior radiologists with over 10 years of experience in CT and MRI evaluation. A structured report of the imaging assessments was used by the involved radiologists. There was no blinding at the level of the patient, the treating physicians (surgeon, oncologist, radiologist), or the researcher who recorded the data. Radiologists were blinded to the specimen pathology report. Tumor staging was based on the American Joint Committee on Cancer 8th Edition TNM classification21. Both CT and MRI staging were performed upon colorectal cancer diagnosis and within one week of each other. All cases were discussed in the institutional colorectal multidisciplinary team meeting (MDT), where staging examinations were reviewed and therapeutic options discussed. All operations (open or laparoscopic) were performed within 1 week of the MDT and adhered to the respective oncological principles of complete mesocolic excision (CME) and central vascular ligation (CVL)22. A structured pathology report was applied to all specimens23. Morphological criteria were used for the imaging evaluation of lymph node invasion, including irregular outer borders and internal heterogeneity24. EMVI was identified by a characteristic serpiginous extension of the tumor signal into the pericolic fat25. All members of the research group (surgeons, pathologists) were experts in the field of colorectal cancer and had completed their respective learning curves.

Statistical analysis

The estimation of the required sample size was based on the primary endpoint. According to the literature, the T1/T2 vs T3/T4 differentiation specificity of the MRI is 84%15. Subsequently, for a prospective protocol with Type I error: 5%, Type II error: 20% (Power: 80%), d: 10% and dropout rate: 30%, the estimated sample size was 120 patients.

Prior to any statistical analyses, all data were submitted to the Shapiro–Wilk normality test26. Since normality was not confirmed (Supplementary Material Tables), a non-parametric approach was implemented. Continuous and categorical data were reported as median (interquartile range, IQR) and N (percentage), respectively. Comparisons of tumor size measurements were based on the related-samples Wilcoxon signed-rank test. Interobserver agreement for tumor size measurements was estimated via the intraclass correlation coefficient.

To evaluate the performance of the diagnostic studies, contingency tables were created. Pathologic data were considered reference data. Sensitivity (SE), specificity (SP), and diagnostic odds ratio (DOR) were calculated for both observers. Further performance evaluation was based on the estimation of the area under the curve (AUC) for each receiver operating characteristic (ROC) curve27. AUC values of 0.7 to 0.8 are considered acceptable, values of 0.8 to 0.9 excellent, and values above 0.9 outstanding28. AUC comparisons were performed via DeLong’s test. The applied method was metric maximization with a 0.05 tolerance metric. All results were provided with the respective 95% confidence interval (95% CI). Subgroup analyses evaluated the effect of tumor location (right vs left colon) and tumor size (based on the median pathology tumor size) on the AUC values of the imaging modalities.
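The contingency-table metrics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPSS procedure used in the study: the names `Table2x2`, `contingency`, `diagnostic_odds_ratio`, and `single_cutoff_auc` are hypothetical, the 0.5 continuity correction for empty cells is an assumption, and the example counts are illustrative values chosen only so that the ratios match the CT percentages reported for T1/T2 vs T3/T4; they are not taken from Tables 2 to 4. For a dichotomous imaging call, the ROC curve has a single operating point, so its trapezoidal AUC reduces to (SE + SP) / 2.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Table2x2:
    """Imaging calls cross-tabulated against pathology (the reference standard)."""
    tp: int  # imaging positive, pathology positive
    fp: int  # imaging positive, pathology negative
    fn: int  # imaging negative, pathology positive
    tn: int  # imaging negative, pathology negative


def contingency(imaging: Sequence[bool], pathology: Sequence[bool]) -> Table2x2:
    """Build the 2x2 table from paired per-patient calls (True = positive)."""
    if len(imaging) != len(pathology):
        raise ValueError("imaging and pathology must have the same length")
    pairs = list(zip(imaging, pathology))
    return Table2x2(
        tp=sum(1 for i, p in pairs if i and p),
        fp=sum(1 for i, p in pairs if i and not p),
        fn=sum(1 for i, p in pairs if not i and p),
        tn=sum(1 for i, p in pairs if not i and not p),
    )


def sensitivity(t: Table2x2) -> float:
    return t.tp / (t.tp + t.fn)


def specificity(t: Table2x2) -> float:
    return t.tn / (t.tn + t.fp)


def diagnostic_odds_ratio(t: Table2x2) -> float:
    """DOR = (TP * TN) / (FP * FN).

    Assumption: if any cell is zero, 0.5 is added to every cell so that the
    ratio stays finite. The study does not state how empty cells were handled.
    """
    cells = [t.tp, t.fp, t.fn, t.tn]
    if 0 in cells:
        cells = [c + 0.5 for c in cells]
    tp, fp, fn, tn = cells
    return (tp * tn) / (fp * fn)


def single_cutoff_auc(t: Table2x2) -> float:
    """Trapezoidal AUC of a ROC curve with one operating point."""
    return (sensitivity(t) + specificity(t)) / 2


# Illustrative counts only (not from the study tables): 62 pathology-positive
# and 58 pathology-negative cases out of 120.
ct_example = Table2x2(tp=52, fp=11, fn=10, tn=47)

print(f"SE  = {sensitivity(ct_example):.3f}")
print(f"SP  = {specificity(ct_example):.3f}")
print(f"DOR = {diagnostic_odds_ratio(ct_example):.2f}")
print(f"AUC = {single_cutoff_auc(ct_example):.3f}")
```

With these illustrative counts the sketch gives SE ≈ 0.839, SP ≈ 0.810, and AUC ≈ 0.825. Confidence intervals and DeLong's test are deliberately left out, since they need the paired per-patient data.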

The unweighted Cohen’s kappa coefficient (k) was used for the interrater agreement assessment. More specifically, the level of interrater agreement was categorized based on the kappa values (poor: < 0.2, fair: 0.2–0.4, moderate: 0.4–0.6, good: 0.6–0.8, very good: > 0.8).
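As a minimal sketch of the agreement measure described above, the unweighted kappa can be computed as (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement and p_e the agreement expected by chance from each reader's marginal frequencies. The function names `cohen_kappa` and `agreement_label` and the reader calls in the example are hypothetical. Because the published bands share their boundaries, the sketch assumes that a boundary value falls into the higher band, except 0.8, which is labelled "good" because "very good" is defined as > 0.8.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(reader1: Sequence[Hashable], reader2: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two readers rating the same cases."""
    n = len(reader1)
    if n == 0 or n != len(reader2):
        raise ValueError("both readers must rate the same non-empty set of cases")
    # Observed agreement: share of cases where both readers gave the same call.
    observed = sum(1 for a, b in zip(reader1, reader2) if a == b) / n
    # Chance agreement: product of the readers' marginal proportions per category.
    counts1, counts2 = Counter(reader1), Counter(reader2)
    categories = set(counts1) | set(counts2)
    expected = sum(counts1[c] * counts2[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both readers use a single category")
    return (observed - expected) / (1 - expected)


def agreement_label(kappa: float) -> str:
    """Map kappa to the bands used in the text (boundary handling is an assumption)."""
    if kappa < 0.2:
        return "poor"
    if kappa < 0.4:
        return "fair"
    if kappa < 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "good"
    return "very good"


# Hypothetical calls for ten cases (1 = positive, 0 = negative).
reader_1 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
reader_2 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.3f} ({agreement_label(kappa)})")
```

In this made-up example the readers agree on 8 of 10 cases (p_o = 0.8) while chance agreement is 0.52, giving a kappa of about 0.583, which falls in the "moderate" band.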

Statistical significance was considered at the level of p < 0.05. All statistical analyses were performed in SPSS Statistics v29 software (SPSS Inc. Chicago, IL, USA).

Ethics approval

This study was performed in line with the principles of the Declaration of Helsinki. Approval from the local ethics committee was granted prior to patient screening (Approval Number: 26-2/13.03.2019, Institutional Ethics Committee, University of Thessaly, Phone: 2410685703, Email: g-med@med.uth.gr).

Informed consent to participate

Informed consent was obtained from all individual participants included in the study.

Results

Patient characteristics

Overall, 134 colon cancer patients were screened during the study period (Fig. 1). Of those, 14 patients could not undergo MRI (incompatible implants) and were therefore excluded. The remaining 120 patients underwent the prespecified CT and MRI scans.

Figure 1 Study flow diagram.

Patient characteristics are summarized in Table 1. The median age of the cohort was 76 years. In total, 81 males and 39 females were included. Most operations were performed via an open rather than a laparoscopic approach (65.8% vs 34.2%). Data regarding tumor localization and staging are also provided in Table 1. Overall, 67 right colectomies, 12 extended right colectomies, 39 left colectomies, and 2 subtotal colectomies were performed. Perforation was documented in 16 cases. Histopathology reports confirmed a diagnosis of adenocarcinoma not otherwise specified (NOS) in 117 cases. EMVI, serosa, and parietal invasion were confirmed in 43.3%, 14.2%, and 7.5% of all patients, respectively. The median lymph node harvest was 22.

Table 1 Patient characteristics; reported as median (interquartile range) or N (percentage).

Tumor size

CT systematically overestimated the tumor size in all three dimensions (craniocaudal, transverse, and anteroposterior) and in both observer evaluations (Supplementary Material). The largest difference was noted in the anteroposterior diameter assessments (1st Observer: 0.845 cm, 2nd Observer: 0.821 cm). Interrater scan measurements were largely consistent, although significant differences between observers were noted in the craniocaudal (− 0.482 cm, p < 0.001) and anteroposterior axes (− 0.367 cm, p = 0.001).

T status

For T1/T2 vs T3/T4 differentiation, both readers (Tables 2, 3) displayed higher diagnostic performance with MRI (SE MRIReader 1&2: 100%, SP MRIReader 1&2: 96.6%) than with CT (SE CTReader 1&2: 83.9%, SP CTReader 1&2: 81%). Overall, MRI had a significantly higher AUC (Table 4, Fig. 2) than CT (CTReader 1&2: 0.825 vs MRIReader 1&2: 0.983, p < 0.001). These results were reproduced in all subgroups (Supplementary Material). Interrater agreement (Supplementary Material Tables) for both techniques was very good (kappa: 1, p < 0.01).

Table 2 Diagnostic performance of computed tomography (CT).
Table 3 Diagnostic performance of magnetic resonance imaging (MRI).
Table 4 ROC curve analysis.
Figure 2 ROC analysis. (a) T1/T2 vs T3/T4, (b) ≤ T3ab vs ≥ T3cd, (c) N positive, (d) serosa invasion, (e) retroperitoneal surgical margin, (f) EMVI.

For T3ab vs T3cd diagnosis (Tables 2, 3), MRI displayed higher sensitivity (SE CTReader 1: 50%, SE CTReader 2: 33.3%, SE MRIReader 1&2: 83.3%) and specificity values (SP CTReader 1&2: 77.2%, SP MRIReader 1: 82.5%, SP MRIReader 2: 86%) compared to CT. A significant overall difference (p = 0.01) between the MRI and CT AUC values was noted (Table 4, Fig. 2). This was also confirmed in left colon tumors (Supplementary Material). Interrater agreement (Supplementary Material) was very good (CT kappa: 0.885, p < 0.01; MRI kappa: 0.893, p < 0.01).

N status

Similarly, higher sensitivity (SE CTReader 1: 75.7%, SE CTReader 2: 64.9%, SE MRIReader 1&2: 100%) and specificity values (SP CTReader 1: 77.1%, SP CTReader 2: 85.5%, SP MRIReader 1: 90.2%, SP MRIReader 2: 95.2%) were estimated for MRI in the diagnosis of N positive disease (Tables 2, 3). MRI (Table 4, Fig. 2) was consistently associated with significantly higher AUC values (CTReader 1: 0.763, CTReader 2: 0.751, MRIReader 1: 0.951, MRIReader 2: 0.976, p < 0.001). Interrater agreement (Supplementary Material) for both techniques was good (CT kappa: 0.799, p < 0.01; MRI kappa: 0.782, p < 0.001).

Serosal invasion

Sensitivity for serosa invasion (Tables 2, 3) was low in both modalities (SE CTReader 1: 23.5%, SE CTReader 2: 17.6%, SE MRIReader 1: 35.3%, SE MRIReader 2: 29.4%). In contrast, high specificity was identified in both CT (SP CTReader 1: 97.1%, SP CTReader 2: 96.1%) and MRI (SP MRIReader 1&2: 100%). Overall AUC values (Table 4, Fig. 2) were comparable (p = 0.174). Interrater agreement (Supplementary Material) was very good (CT kappa: 0.848, p < 0.01; MRI kappa: 0.905, p < 0.01). Respective clinical examples are provided in Fig. 3.

Figure 3 Clinical examples (white arrow). (a) An axial CT scan image showing infiltration of serosa, (b) an axial CT scan image showing infiltration of retroperitoneal surgical margin, (c) a coronal CT scan image showing extramural vascular invasion, (d) an axial MRI scan image (T1 FSE FS) showing infiltration of serosa, (e) an axial MRI scan image (T1 FSPGR) showing infiltration of retroperitoneal surgical margin, (f) a coronal MRI scan image (T1) showing extramural vascular invasion.

Retroperitoneal surgical margin

Both CT and MRI (Table 4, Fig. 2) displayed optimal diagnostic performance (AUC CTReader 1&2: 0.817, AUC MRIReader 1: 0.944, AUC MRIReader 2: 0.889, p = 0.06) in the assessment of RSM invasion. The sensitivity of MRI was 88.9% for Reader 1 and 77.8% for Reader 2, while the specificity was 100% for both readers (Table 3). MRI interrater agreement (Supplementary Material) was good (kappa: 0.787, p < 0.01).

EMVI

Although both modalities (Tables 2, 3) displayed low sensitivity values for EMVI diagnosis (SE CTReader 1: 32.7%, SE CTReader 2: 34.6%, SE MRIReader 1: 36.5%, SE MRIReader 2: 30.8%), specificity was optimal (SP CTReader 1: 91.2%, SP CTReader 2: 89.7%, SP MRIReader 1&2: 100%). A significant overall difference in the AUC values was noted (p = 0.023), which was not confirmed in either subgroup analysis (Supplementary Material). Cohen’s kappa coefficient (Supplementary Material) for CT and MRI was 0.792 (p < 0.001) and 0.9 (p < 0.01), respectively.

Discussion

This study showed that the diagnostic performance of MRI surpasses that of CT in several colon cancer staging parameters, such as T staging, N staging, and EMVI assessment. Additionally, MRI had an acceptable performance in identifying high-risk T3 tumors that may be amenable to a neoadjuvant approach. Although serosa invasion results were moderate, MRI displayed an outstanding performance in confirming RSM invasion. These results further highlight the diagnostic efficacy of MRI in colon cancer staging, suggesting that it has the potential to become the new gold standard, especially when considered in combination with its additional value in differentiating equivocal liver lesions.

The determination of tumor growth through the bowel wall is the first step in colon cancer staging. We validated the superiority of MRI over CT in terms of T1/T2 versus T3/T4 diagnosis. In our study, MRI achieved 100% sensitivity and 96.6% specificity for this endpoint. Despite multiple measurements, the interobserver agreement reached 100%. The optimal performance of MRI in T stage assessment was also shown in several other series15,19. The theoretical advantage of MRI is its higher soft tissue discrimination ability, which enables it to identify tumor invasion through the various layers of the bowel wall more accurately15,19. However, these findings were challenged by a recent publication, in which CT and MRI displayed comparable diagnostic accuracy for T stage diagnosis18.

The FOxTROT study is the first randomized controlled trial to highlight the benefits of neoadjuvant chemotherapy in locally advanced, resectable colon cancer14. Initial results showed that neoadjuvant treatment significantly downstaged the primary tumor, increased R0 resection rates, and decreased recurrence risk, especially when dMMR cancers were excluded4. Additionally, a meta-analysis by Cheong et al.29 confirmed the overall survival and disease-free survival benefits of neoadjuvant therapy. For the successful implementation of a neoadjuvant protocol, the accurate identification of high-risk patients during staging is crucial13. More specifically, T3ab versus T3cd differentiation directly affects the management pathway, since the latter group has a higher chance of benefiting from neoadjuvant chemotherapy30. Pooled analyses showed that CT is associated with a 77% sensitivity and 70% specificity for T3ab versus T3cd diagnosis30. A subsequent trial by Nerad et al.15 suggested that MRI may underestimate the depth of extramural invasion, thus understaging a significant proportion of T3cd tumors15. Rafaelsen et al.18, though, reported a significant diagnostic superiority of MRI in T3cd diagnostic accuracy. In our cohort, MRI showed excellent diagnostic performance, with an 83.3% sensitivity and a specificity ranging from 82.5% to 86%. A possible explanation for these discrepancies could be the difficulty of correctly differentiating the desmoplastic reaction of surrounding tissues from true tumor cell infiltration15.

Lymph node metastasis is a significant negative prognostic factor for colon cancer2. Due to the direct impact on survival, initial staging should identify stage III disease with the highest diagnostic accuracy available15. Pooled data analyses show that CT has a moderate sensitivity (71%) and specificity (67%) for nodal involvement30. Furthermore, these suboptimal results are accompanied by considerable variation in diagnostic accuracy30. A similar heterogeneity is also noted in the recorded sensitivity (26–86%) and specificity (64–86%) of MRI15. The etiology of this discrepancy is multifactorial and includes differences in both technical parameters (CT protocol and slice thickness, MRI field strength and sequences) and diagnostic criteria30. The latter are still not standardized, with the size, shape, and border contour being the most important30. Criteria combinations have also been described30; for example, node length, node cluster, and node density30. Alterations in the cut-off points of these parameters significantly affect the diagnostic performance of the respective examinations and result in the reported outcome heterogeneity30. In our study, the combined presence of internal heterogeneity and an irregular outer border was considered a predictor of node infiltration24. Interestingly, although CT sensitivity and specificity were comparable to the respective literature results30, MRI achieved almost optimal performance. This contradicts recent publications where MRI was shown to be equivalent to CT in terms of node staging18.

MRI did not perform well in identifying serosa involvement. Although specificity was 100% for both readers, sensitivity was low. In addition, MRI had comparable diagnostic performance to CT. These findings contradict previous trials, where MRI displayed a considerably high sensitivity (76–87%) for serosa invasion. CT is also associated with acceptable diagnostic accuracy for serosa involvement31. More specifically, Hennedige et al.31 reported that the diagnostic accuracy of multidetector CT for visceral peritoneum infiltration ranged from 63.6 to 66.7%. Besides divergences in imaging protocols, the histopathologic evaluation is another important factor that may affect diagnostic performance results. There are three distinct morphological patterns at the advancing tumor edge, including continuous invasion through an inflammatory focus, pushing border, and infiltrating cell clusters with serosa reaction32. According to Panarelli et al.32, interobserver agreement for serosa infiltration classification was notably low, even between specialized gastrointestinal pathologists, thus confirming the need for further clarification of the diagnostic criteria.

Despite this, MRI displayed excellent diagnostic performance in identifying tumors that infiltrated the RSM. To be more specific, the sensitivity of MRI was 88.9% and 77.8%, while the specificity was consistently 100%. In comparison, the estimated sensitivity and specificity of CT were 77.8% and 85.6%, respectively. Despite these differences, MRI did not manage to achieve significantly higher AUC values.

In colorectal cancer, the presence of EMVI is associated with an increased risk of cancer-related mortality and tumor recurrence, mainly through distant disease33. In our cohort, MRI had a moderate diagnostic performance in terms of EMVI diagnosis. Despite outstanding specificity (100%), sensitivity was considerably low. This contrasts with the results by Nerad et al.15, where MRI displayed a very high sensitivity and a moderate specificity for EMVI. The authors of that study suggested that the difference between these findings was due to vessel traction, the presence of thrombus, and altered hemodynamics caused by local inflammation15. Hunter et al.19, using a 1.5 T field strength, reported significant interobserver heterogeneity in the MRI sensitivity for EMVI (Reader 1: 63% versus Reader 2: 26%). Similar differences, but within the acceptable range, were noted in the specificity results (Reader 1: 80% versus Reader 2: 91%). In a recent meta-analysis by Kim et al.34, pooled MRI sensitivity for EMVI diagnosis was 61%. However, these results were subject to significant heterogeneity, with the sensitivity of individual trials ranging from 28 to 94%34. Further explanatory analyses identified locally advanced tumors, high-resolution MRI, and the administration of antispasmodic drugs as factors explaining this heterogeneity34. Interestingly, the authors reported that characteristics such as tumor side, administration of neoadjuvant therapy, definition standard, field strength, and functional MRI protocols did not result in inconsistent outcomes34.

Limitations

Prior to the appraisal of the results of our study, several limitations should be acknowledged. First, although the study sample was based on a predefined power analysis, a larger pool of eligible patients might have yielded a more precise estimate of the diagnostic accuracy of the imaging techniques. Moreover, the inherent heterogeneity in terms of patient demographics and tumor characteristics further increased the overall risk of bias in our study. Finally, the fact that both CT and MRI were performed according to a standardized protocol does not guarantee that our findings can be safely extrapolated to a wider clinical setting, with different radiologic units, image acquisition protocols, and technical expertise.

Conclusion

MRI displayed a higher diagnostic performance compared to CT in terms of T1/T2 vs T3/T4, N positive, and EMVI assessment. Optimal performance of MRI was noted in the T3ab vs T3cd and RSM diagnoses. In contrast, MRI did not perform well in the identification of serosa invasion. Given the limitations of this study, further large-scale studies are required to confirm these results.