Effects of Interobserver Variability on 2D and 3D CT- and MRI-Based Texture Feature Reproducibility of Cartilaginous Bone Tumors

Gitto, Salvatore; Cuocolo, Renato; Emili, Ilaria; Tofanelli, Laura; Chianca, Vito; Albano, Domenico; Messina, Carmelo; Imbriaco, Massimo; Sconfienza, Luca Maria

doi:10.1007/s10278-021-00498-3

Open access
Published: 17 August 2021

Volume 34, pages 820–832, (2021)

Journal of Digital Imaging

Salvatore Gitto ORCID: orcid.org/0000-0002-3623-7822¹,
Renato Cuocolo^2,3,
Ilaria Emili⁴,
Laura Tofanelli⁵,
Vito Chianca^6,7,
Domenico Albano^8,9,
Carmelo Messina⁸,
Massimo Imbriaco¹⁰ &
Luca Maria Sconfienza^1,8

Abstract

This study aims to investigate the influence of interobserver manual segmentation variability on the reproducibility of 2D and 3D unenhanced computed tomography (CT)- and magnetic resonance imaging (MRI)-based texture analysis. Thirty patients with cartilaginous bone tumors (10 enchondromas, 10 atypical cartilaginous tumors, 10 chondrosarcomas) were retrospectively included. Three radiologists independently performed manual contour-focused segmentation on unenhanced CT and T1-weighted and T2-weighted MRI by drawing both a 2D region of interest (ROI) on the slice showing the largest tumor area and a 3D ROI including the whole tumor volume. Additionally, a marginal erosion was applied to both 2D and 3D segmentations to evaluate the influence of segmentation margins. A total of 783 and 1132 features were extracted from original and filtered 2D and 3D images, respectively. Intraclass correlation coefficient ≥ 0.75 defined feature stability. In 2D vs. 3D contour-focused segmentation, the rates of stable features were 74.71% vs. 86.57% (p < 0.001), 77.14% vs. 80.04% (p = 0.142), and 95.66% vs. 94.97% (p = 0.554) for CT and T1-weighted and T2-weighted images, respectively. Margin shrinkage did not improve 2D (p = 0.343) and performed worse than 3D (p < 0.001) contour-focused segmentation in terms of feature stability. In 2D vs. 3D contour-focused segmentation, matching stable features derived from CT and MRI were 65.8% vs. 68.7% (p = 0.191), and those derived from T1-weighted and T2-weighted images were 76.0% vs. 78.2% (p = 0.285). 2D and 3D radiomic features of cartilaginous bone tumors extracted from unenhanced CT and MRI are reproducible, although some degree of interobserver segmentation variability highlights the need for reliability analysis in future studies.

Introduction

Cartilaginous tumors of the bone include a broad spectrum of lesions that range from benign to malignant entities [1, 2]. Reliable identification and grading are crucial, as clinical management varies widely. Specifically, asymptomatic benign enchondromas do not require any treatment, appendicular atypical cartilaginous tumors are managed with intralesional curettage or even watchful waiting, and appendicular higher grade lesions and axial skeleton chondrosarcomas are resected with free margins [3]. The diagnosis relies on a combination of clinical presentation, imaging, and biopsy [3, 4]. Imaging, and particularly magnetic resonance imaging (MRI), has good accuracy in discriminating atypical cartilaginous tumors from higher grade lesions [5] but is less reliable in differentiating the former from enchondromas [6]. Biopsy is considered the reference standard but has the disadvantages of sampling errors [7] and discrepancies even among specialized bone pathologists due to overlapping histological findings [8]. Additionally, the risk of biopsy-tract contamination remains a concern. Thus, the need for cutting-edge imaging-based tools, such as radiomics, is advocated to safely diagnose and grade cartilaginous bone tumors non-invasively [9].

Texture analysis is a post-processing method for quantification of tumor heterogeneity, which reflects adverse tumor biology but cannot be captured using conventional imaging modalities or sampling biopsies [10]. It belongs to the growing field of radiomics, which includes extraction, analysis, and interpretation of large amounts of quantitative parameters from medical images [11, 12]. To date, texture analysis has been used to discriminate tumor grades and types before treatment, monitor response to therapy, and predict outcome [13]. The resulting quantitative parameters, known as texture or radiomic features, may, however, suffer from interobserver variability, particularly with regard to tumor delineation during manual segmentation [14,15,16]. The influence of segmentation margins is also critical because of textural details of the peritumoral area, which may affect the reproducibility of texture features and therefore their diagnostic performance [17]. In the literature, the intraclass correlation coefficient (ICC) is commonly employed to assess radiomic feature reproducibility [17,18,19,20,21].

The aim of this study is to investigate the influence of interobserver manual segmentation variability on the reproducibility of bidimensional (2D) and volumetric (3D) unenhanced computed tomography (CT)- and MRI-based texture analysis in cartilaginous bone tumors.

Materials and Methods

Design and Population

The local Institutional Review Board approved this retrospective study and waived the need for informed consent. According to the ICC guidelines by Koo et al. [22], we designed our study to meet the numerical requirements of a reliability analysis in terms of both patients and observers involved, namely 30 lesions and 3 different readers [22]. A search of the radiology information system was performed and 30 patients with cartilaginous bone tumors were recruited (median age 52 [range, 28–72] years), including 10 enchondromas, 10 atypical cartilaginous tumors, and 10 chondrosarcomas. Inclusion criteria were as follows: (i) enchondromas proven either by histology or minimum follow-up of 6 years without alteration in shape or size and typical imaging findings of lobulated morphology and T2-weighted hyperintensity on MRI; (ii) histology-proven atypical cartilaginous tumors; (iii) histology-proven primary conventional grades II–III or dedifferentiated chondrosarcomas; (iv) 1.5-T MRI including turbo spin echo T1-weighted and T2-weighted sequences and 64-slice CT performed within 1 month before biopsy, intralesional curettage, or surgical resection for tumors diagnosed by histology. Exclusion criteria were the presence of pathological fracture and ambiguous histology report.

Enchondromas were located in the femur (n = 5), fibula (n = 2), foot phalanx (n = 1), humerus (n = 1), and radius (n = 1); atypical cartilaginous tumors in the femur (n = 2), fibula (n = 2), and humerus (n = 6); chondrosarcomas in the calcaneus (n = 1), femur (n = 2), humerus (n = 1), pelvis (n = 2), spine (n = 3), and tibia (n = 1).

Image Segmentation

A musculoskeletal radiologist (S.G.) and two last-year radiology residents trained in musculoskeletal and oncologic imaging (I.E. and L.T.) independently performed manual image segmentation using the open-source software ITK-SNAP (v3.6) [23]. The readers knew the study would deal with cartilaginous bone tumors, but they were blinded to any other information regarding histological grade, disease course, and additional imaging studies. All tumors were segmented on axial CT scans and on axial MRI sequences as first choice and coronal or sagittal sequences as second choice. Manual contour-focused segmentation was performed on unenhanced bone-window CT and T1-weighted and T2-weighted MRI by drawing both a 2D region of interest (ROI) on the slice showing the largest tumor area and a 3D ROI including the whole tumor volume. The “polygon mode” ITK-SNAP tool was used for all segmentations. While segmenting the tumors on CT, the readers used the MRI sequences to aid contour identification of each tumor. Thereafter, margin shrinkage segmentation was computed by applying a marginal erosion to both 2D and 3D segmentations in order to evaluate the influence of segmentation margins on feature reproducibility (Fig. 1). In detail, ROI shrinkage was performed using the fslmaths erosion function of the FMRIB Software Library [24]. The default 2D and 3D kernels, which are 3 × 3 × 1 and 3 × 3 × 3 boxes centered at the target voxel, were employed as appropriate. During the erosion process, each voxel in the ROI is targeted sequentially, and its value is changed to 0 (i.e., removed from the ROI) if a zero-value voxel is found within the kernel. Therefore, the shrinkage was usually more extensive for 3D ROIs compared to 2D ones.
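The erosion rule described above can be illustrated with a minimal Python sketch. It is not the fslmaths implementation used in the study: it swaps in scipy.ndimage.binary_erosion, and the function name shrink_roi, the (x, y, slice) axis order, and the treatment of voxels outside the image as background are assumptions made for illustration.

```python
import numpy as np
from scipy import ndimage


def shrink_roi(mask, volumetric):
    """Erode a binary ROI with a 3x3x3 (3D) or 3x3x1 (in-plane, 2D) box kernel.

    Assumption: the array axes are ordered (x, y, slice).
    """
    mask = np.asarray(mask, dtype=bool)
    kernel = np.ones((3, 3, 3) if volumetric else (3, 3, 1), dtype=bool)
    # A voxel survives only if every voxel under the kernel belongs to the ROI.
    # border_value=0 (an assumption) treats the outside of the image as background.
    return ndimage.binary_erosion(mask, structure=kernel, border_value=0)


# Toy ROI: a 5x5x5 cube inside a 7x7x7 volume.
roi = np.zeros((7, 7, 7), dtype=bool)
roi[1:6, 1:6, 1:6] = True
print(roi.sum(), shrink_roi(roi, volumetric=False).sum(), shrink_roi(roi, volumetric=True).sum())
```

On this toy cube the in-plane kernel only peels the border of each slice, whereas the 3 × 3 × 3 kernel also removes the first and last slices, which is consistent with the more extensive shrinkage reported for 3D ROIs.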

Texture Analysis

Image pre-processing consisted of resampling to a 2 × 2 isotropic pixel or 2 × 2 × 2 isotropic voxel, whole-image intensity normalization (mean value of 300 and standard deviation of 100), and discretization with a fixed bin width of 5. Features were extracted from the original CT and MRI images within the 2D and 3D ROIs using PyRadiomics (v2.2.0) [25], an open-source Python package. The extracted features were grouped according to the official PyRadiomics documentation (https://pyradiomics.readthedocs.io/en/latest/features.html), as follows:

18 first-order features, which describe the distribution of pixel or voxel gray-level values;
9 shape-based 2D and 14 shape-based 3D features, which respectively describe the 2D and 3D size and shape of the ROI;
22 gray-level cooccurrence matrix (GLCM) features, which quantify how often pairs of pixels or voxels with certain values occur in a specified spatial range;
16 gray-level size zone matrix (GLSZM) features, which quantify gray-level zones, i.e., the number of connected pixels or voxels sharing the same gray-level value;
16 gray-level run length matrix (GLRLM) features, which quantify gray-level runs, i.e., the length in number of consecutive pixels or voxels having the same gray-level value;
14 gray-level dependence matrix (GLDM) features, which quantify gray-level dependencies, i.e., the number of connected pixels or voxels within a set distance that are dependent on the center pixel or voxel.
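The intensity normalization and discretization steps that precede feature extraction can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the PyRadiomics code path: the function names are hypothetical, the binning is assumed to follow the fixed-bin-width formula given in the PyRadiomics documentation, and resampling is not shown.

```python
import numpy as np


def normalize_intensity(image, target_mean=300.0, target_sd=100.0):
    """Rescale the whole image to the stated mean (300) and standard deviation (100)."""
    image = np.asarray(image, dtype=float)
    z_scores = (image - image.mean()) / image.std()
    return z_scores * target_sd + target_mean


def discretize(values, bin_width=5.0):
    """Fixed-bin-width discretization; bin indices start at 1 for the lowest value.

    Assumption: bin = floor(x / W) - floor(min(x) / W) + 1.
    """
    values = np.asarray(values, dtype=float)
    bins = np.floor(values / bin_width) - np.floor(values.min() / bin_width) + 1
    return bins.astype(int)


rng = np.random.default_rng(0)
image = rng.normal(loc=40.0, scale=12.0, size=(16, 16))  # synthetic intensities
normalized = normalize_intensity(image)
print(normalized.mean(), normalized.std(), discretize(normalized).max())
```

With a fixed bin width, the number of gray levels depends on the intensity range inside the ROI, which is one reason normalization is applied before discretization.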

In addition to the original CT and MRI, Laplacian of Gaussian (LoG)-filtered (sigma = 2, 3, 4, 5) and wavelet-transformed 2D and 3D images (all possible low- and high-pass filter combinations) were obtained for extraction of first-order and matrix features. Shape-based features are independent from gray-level value distribution and therefore were only computed on the original images. A total of 783 and 1132 features were extracted from original, LoG-filtered, and wavelet-transformed 2D and 3D images, respectively.
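The reported totals follow from the per-class feature counts and the number of image types. The short sketch below reproduces the arithmetic, assuming a single-level wavelet decomposition so that "all possible low- and high-pass filter combinations" yields 2² = 4 images in 2D and 2³ = 8 images in 3D; the variable and function names are illustrative.

```python
# First-order and matrix features extracted from every image type.
PER_IMAGE = {"first-order": 18, "GLCM": 22, "GLSZM": 16, "GLRLM": 16, "GLDM": 14}
# Shape-based features, computed on the original images only.
SHAPE = {2: 9, 3: 14}
LOG_SIGMAS = (2, 3, 4, 5)


def total_features(ndim):
    """Total number of features for 2D (ndim=2) or 3D (ndim=3) extraction."""
    wavelet_images = 2 ** ndim  # one low- or high-pass filter choice per axis (assumed single level)
    n_images = 1 + len(LOG_SIGMAS) + wavelet_images  # original + LoG + wavelet
    return n_images * sum(PER_IMAGE.values()) + SHAPE[ndim]


print(total_features(2), total_features(3))  # 783 1132
```

That is, 9 image types × 86 features + 9 shape features = 783 in 2D, and 13 image types × 86 features + 14 shape features = 1132 in 3D.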

Statistical Analysis

Texture feature interobserver reliability was assessed using a two-way, random-effects, single-rater, absolute agreement ICC. Features were considered stable when achieving good (0.75 ≤ ICC < 0.9) to excellent (ICC ≥ 0.9) interobserver reliability [22]. Differences among variables were evaluated using the Chi-square test. A 2-sided p-value < 0.05 indicated statistical significance [26]. Data analysis was performed using the pandas and NumPy Python packages and the “irr” R package [27, 28].
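For readers who want to see how this ICC form is computed, the following NumPy sketch derives the two-way, random-effects, single-rater, absolute agreement ICC from the mean squares of a lesions × readers matrix. The study used the “irr” R package; this hand-written version, its function names, and the toy ratings are illustrative assumptions only.

```python
import numpy as np


def icc_two_way_absolute_single(ratings):
    """ICC for a (n_subjects, n_raters) matrix of one feature's values."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand_mean) ** 2)  # between lesions
    ss_cols = n * np.sum((x.mean(axis=0) - grand_mean) ** 2)  # between readers
    ss_error = np.sum((x - grand_mean) ** 2) - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def reliability_label(icc):
    """Stability categories as used in this study."""
    if icc >= 0.9:
        return "excellent"
    if icc >= 0.75:
        return "good"
    return "not stable"


# Toy example: one feature measured by 3 readers on 5 lesions (made-up values).
toy = [[10.1, 10.4, 9.8], [20.3, 19.9, 20.8], [15.0, 15.6, 14.7],
       [30.2, 29.5, 30.9], [25.1, 24.8, 25.5]]
icc = icc_two_way_absolute_single(toy)
print(round(icc, 3), reliability_label(icc))
```

Because absolute agreement is requested, systematic offsets between readers (the between-reader mean square) lower the coefficient, unlike a consistency-type ICC.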

Machine Learning Analysis

To assess the potential value of CT and MRI texture features extracted from 2D and 3D annotations, an exploratory data analysis was performed with an Extra Trees (ET) ensemble model. The same pipeline was employed on all available datasets, consisting of feature selection through cross-validated recursive feature elimination (RFE) and random search hyperparameter tuning nested within a leave-one-out cross-validation on the entire dataset. RFE was conducted using tenfold cross-validation and an ET estimator with default hyperparameters. Then, in the training folds of the leave-one-out cross-validation, the synthetic oversampling technique was applied to balance the 3 classes (i.e., creating a synthetic instance to substitute the lesion in the test fold), followed by 100 iterations of ET hyperparameter random search. Given the presence of 3 classes with balanced cases, accuracy was used as the reference score for both RFE and ET tuning. The hyperparameter search space was as follows:

1. Number of trees = 100–1000
2. Criterion = entropy or Gini
3. Max depth = 1–10
4. Bootstrap = True or False
5. Max samples = 0–100%
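The search space listed above could be expressed with scikit-learn roughly as follows. This is a simplified sketch, not the authors' pipeline: recursive feature elimination and oversampling are omitted, the number of inner folds is an assumption, the function name is hypothetical, and the lower bound of the max-samples range is set slightly above 0 because scikit-learn does not accept 0. Max samples is only sampled when bootstrap is enabled, since scikit-learn rejects the combination otherwise.

```python
import numpy as np
from scipy.stats import randint, uniform
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import LeaveOneOut, RandomizedSearchCV, StratifiedKFold

COMMON = {
    "n_estimators": randint(100, 1001),  # 100-1000 trees
    "criterion": ["gini", "entropy"],
    "max_depth": randint(1, 11),         # depth 1-10
}
SEARCH_SPACE = [
    {**COMMON, "bootstrap": [True], "max_samples": uniform(0.01, 0.99)},
    {**COMMON, "bootstrap": [False]},
]


def loo_accuracy(X, y, search_space=SEARCH_SPACE, n_iter=100, inner_folds=3, seed=0):
    """Leave-one-out accuracy with a random hyperparameter search in each training fold."""
    X, y = np.asarray(X), np.asarray(y)
    hits = []
    for train_idx, test_idx in LeaveOneOut().split(X):
        search = RandomizedSearchCV(
            ExtraTreesClassifier(random_state=seed),
            search_space,
            n_iter=n_iter,
            scoring="accuracy",
            cv=StratifiedKFold(inner_folds, shuffle=True, random_state=seed),
            random_state=seed,
        )
        search.fit(X[train_idx], y[train_idx])  # tuning never sees the held-out lesion
        hits.append(search.predict(X[test_idx])[0] == y[test_idx][0])
    return float(np.mean(hits))
```

Calling loo_accuracy(features, labels) on a lesions × features array would run 100 random-search iterations per held-out lesion, so a smaller n_iter is advisable for a quick trial.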

Results

In 2D contour-focused vs. margin shrinkage segmentation, the stable feature rates were 74.71% (n = 585) vs. 71.65% (n = 561), 77.14% (n = 604) vs. 76.12% (n = 596), and 95.66% (n = 749) vs. 96.42% (n = 755) for CT and T1-weighted and T2-weighted images, respectively. The number of stable features derived from 2D contour-focused segmentation showed no difference in comparison with 2D margin shrinkage segmentation (p = 0.343). Table 1 details the number and percentage of stable features that were obtained with 2D contour-focused segmentation, grouped according to feature class and image type.

Table 1 2D contour-focused segmentation. Number and percentage of stable features with good (0.75 ≤ ICC < 0.9) and excellent (ICC ≥ 0.9) interobserver reliability grouped according to feature class and image type. GLCM, gray-level cooccurrence matrix; GLDM, gray-level dependence matrix; GLRLM, gray-level run length matrix; GLSZM, gray-level size zone matrix; ICC, intraclass correlation coefficient; LoG, Laplacian of Gaussian

In 3D contour-focused vs. margin shrinkage segmentation, the stable feature rates were 86.57% (n = 980) vs. 83.66% (n = 947), 80.04% (n = 906) vs. 71.47% (n = 809), and 94.97% (n = 1075) vs. 65.72% (n = 744) for CT and T1-weighted and T2-weighted images, respectively. The number of stable features derived from 3D contour-focused segmentation was higher compared to 3D margin shrinkage segmentation (p < 0.001). Table 2 details the number and percentage of stable features that were obtained with 3D contour-focused segmentation, grouped according to feature class and image type.

Table 2 3D contour-focused segmentation. Number and percentage of stable features with good (0.75 ≤ ICC < 0.9) and excellent (ICC ≥ 0.9) interobserver reliability grouped according to feature class and image type. GLCM, gray-level cooccurrence matrix; GLDM, gray-level dependence matrix; GLRLM, gray-level run length matrix; GLSZM, gray-level size zone matrix; ICC, intraclass correlation coefficient; LoG, Laplacian of Gaussian

The rate of stable features derived from CT was higher for 3D compared to 2D contour-focused segmentation (p < 0.001), while no difference was found for features derived from T1-weighted and T2-weighted MRI between 3D and 2D contour-focused segmentation (p = 0.142 and 0.554, respectively). In Fig. 2, box and whisker plots show the interobserver reliability of feature classes derived from 3D and 2D contour-focused segmentation, grouped according to image type.
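The 2D vs. 3D comparisons can be checked from the reported counts with a Chi-square test on a 2 × 2 table of stable and unstable features. The sketch below uses scipy.stats.chi2_contingency, which applies Yates' continuity correction to 2 × 2 tables by default; whether the original analysis applied this correction is an assumption, and the function name is illustrative.

```python
from scipy.stats import chi2_contingency

N_2D, N_3D = 783, 1132  # total features per segmentation type


def compare_stable_rates(stable_2d, stable_3d, total_2d=N_2D, total_3d=N_3D):
    """Chi-square test on stable vs. unstable feature counts (2D vs. 3D)."""
    table = [
        [stable_2d, total_2d - stable_2d],
        [stable_3d, total_3d - stable_3d],
    ]
    chi2, p_value, _, _ = chi2_contingency(table)  # Yates' correction by default
    return chi2, p_value


# Stable feature counts for contour-focused segmentation, as reported in the Results.
for name, s2d, s3d in [("CT", 585, 980), ("T1-weighted", 604, 906), ("T2-weighted", 749, 1075)]:
    chi2, p_value = compare_stable_rates(s2d, s3d)
    print(f"{name}: chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

Under this assumption, the T1-weighted and T2-weighted comparisons come out close to the reported p = 0.142 and p = 0.554, and the CT comparison falls well below 0.001.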

In 2D vs. 3D contour-focused segmentation, matching stable features derived from CT and MRI were 65.77% (n = 515) vs. 68.73% (n = 778), and those derived from T1-weighted and T2-weighted images were 75.99% (n = 595) vs. 78.18% (n = 885), respectively (p = 0.191 and 0.285). Tables 3 and 4 respectively detail the number and percentage of matching stable features obtained with 2D and 3D contour-focused segmentation, as well as overall interobserver reliability across different imaging modalities and MRI sequences, grouped according to feature class and image type. In Fig. 3, box and whisker plots show the overall interobserver reliability of matching feature classes derived from 3D and 2D contour-focused segmentation of CT and MRI, as well as MRI including T1-weighted and T2-weighted sequences, grouped according to image type. Most shape-based 2D and 3D features were stable even across different imaging modalities and MRI sequences.

Table 3 2D matching features. Number and percentage of matching stable features obtained with 2D contour-focused segmentation, as well as number and percentage of matching stable features with good (ICC ≥ 0.75) overall interobserver reliability across different imaging modalities and MRI sequences, grouped according to feature class and image type. GLCM, gray-level cooccurrence matrix; GLDM, gray-level dependence matrix; GLRLM, gray-level run length matrix; GLSZM, gray-level size zone matrix; ICC, intraclass correlation coefficient; LoG, Laplacian of Gaussian

Table 4 3D matching features. Number and percentage of matching stable features obtained with 3D contour-focused segmentation, as well as number and percentage of matching stable features with good (ICC ≥ 0.75) overall interobserver reliability across different imaging modalities and MRI sequences, grouped according to feature class and image type. GLCM, gray-level cooccurrence matrix; GLDM, gray-level dependence matrix; GLRLM, gray-level run length matrix; GLSZM, gray-level size zone matrix; ICC, intraclass correlation coefficient; LoG, Laplacian of Gaussian

Regarding the machine learning pipeline, the number of selected features ranged from 1 (from 2D annotations on T2-weighted images) to 236 (2D annotations on CT images). The accuracy of the ET models was fair to good, ranging between 77% (2D annotations on CT images) and 90% (3D annotations on T2-weighted images). Table 5 reports the results of each annotation and image type combination.

Table 5 Feature selection process and exploratory machine learning pipeline in the reproducible feature datasets. The results of each annotation and image type combination are reported

Discussion

The main finding of our study is that the rates of stable radiomic features extracted from unenhanced CT and MRI were 75% or higher for 2D and 80% or higher for 3D contour-focused segmentation. 3D CT-based texture analysis provided more stable features than the 2D approach, while no difference in feature stability rates was found between 2D and 3D MRI-based texture analyses. Overall, a certain degree of segmentation variability highlighted the need to include a reliability analysis in future studies.

Despite its great potential as a non-invasive biomarker to quantify several tumor characteristics, radiomics still faces challenges to clinical implementation, both standalone and paired to machine learning [13, 29]. A great variability in radiomic features has emerged as a major issue across studies, and segmentation is the most critical step [12]. Image segmentation represents the basis of radiomic image analysis pipelines and can be time-consuming if performed manually. Therefore, methodological analyses are advisable prior to conducting radiomic studies in order to assess the robustness of different segmentation approaches and avoid biases due to non-reproducible, noisy features. These analyses have been previously performed in kidney [30, 31], lung, and head and neck [15] lesions. With regard to cartilaginous bone tumors, radiomic studies to date have focused on discriminating among benign, atypical, and malignant lesions [32,33,34,35], differentiating chondrosarcoma from other entities such as skull chordoma [36], or predicting recurrence of chondrosarcoma [37]. To our knowledge, our work is the first comprehensively addressing the influence of interobserver manual segmentation variability on the reproducibility of 2D and 3D CT- and MRI-based texture analysis in cartilaginous bone tumors. Nonetheless, Fritz et al. [33] and Gitto et al. [34] performed an interobserver reliability assessment as a feature-reduction method in their radiomic analysis, which provided a model for prediction of tumor grade. In particular, Fritz et al. found that most 2D features derived from unenhanced (15 out of 19) and contrast-enhanced (18 out of 19) T1-weighted MRI had at least good agreement between two observers, using an ICC cutoff of 0.6 [33]. In this study, however, the number of extracted features was only 19 per sequence, the impact of different feature classes was not analyzed, and filtered and transformed images were not used. 
Despite these issues, a common conclusion that can be drawn from that study and ours is that most MRI radiomic features of cartilaginous bone tumors have good reproducibility, even though a certain degree of segmentation variability exists. In a more recent study by Gitto et al., stability was assessed as a feature-reduction method, and CT radiomic features were considered stable if the lower bound of the ICC 95% confidence interval was 0.75 or higher. This resulted in a lower feature stability rate (30%) [34] compared to our current study.

In our study, all imaging modalities demonstrated good reproducibility with both 2D and 3D annotations, with the percentage of robust features ranging from 75 to 96% for the former and from 80 to 95% for the latter. Stable features also proved quite informative for predictive modeling in our preliminary analysis, with accuracies of 77–90%. Given the limited sample size and the presence of 3 class labels, this result is promising and supports the use of radiomic data in this research domain. These findings are encouraging for future radiomic analyses, even though they confirm the need for a preliminary assessment of feature stability, and are in line with recent literature emphasizing the importance of reproducibility in artificial intelligence and radiology [38]. The higher spatial resolution of CT did not seem to influence feature reproducibility and was probably offset by the better contrast resolution of T1-weighted and T2-weighted images. Furthermore, margin shrinkage did not lead to improvements in terms of feature reproducibility, contrary to a previous investigation on renal cell carcinoma CT images [17]. It should be noted, however, that the authors of that investigation reported that margin shrinkage produced less informative features even with improved reproducibility [17].

We found higher rates of stable features derived from CT for 3D compared to 2D segmentation, but no difference in the rates of 2D and 3D MRI-derived stable features. This finding favors a 2D approach in future radiomic studies dealing with MRI-based texture analysis of cartilaginous bone tumors, as it is less time-consuming and easier to employ in clinical practice, particularly in large atypical cartilaginous tumors and chondrosarcomas. Furthermore, most 2D (66–76%) and 3D (69–78%) stable features matched between CT and MRI, as well as between T1-weighted and T2-weighted images. Finally, shape-based features were stable even across different imaging modalities and MRI sequences, and were thus reproducible and independent descriptors of tumor size and shape. On the other hand, overall interobserver reliability of other feature classes was unsurprisingly low across different imaging modalities and MRI sequences, indicating that their quantitative values depend on the specific image used.

Some limitations of our study should be acknowledged. First, it has a retrospective design, as a prospective analysis is not strictly necessary for radiomic studies [13]. The retrospective design also accounts for the exclusion of contrast-enhanced images, as they were not acquired for all enchondromas. Contrast-enhanced and dynamic contrast-enhanced MRI improve the accuracy of cartilaginous bone tumor assessment [39,40,41], and future radiomic studies focusing on these sequences are warranted. Finally, due to its scope, this was a single-institution study, and the generalizability of our findings needs to be confirmed on more varied datasets.

Conclusions

In conclusion, radiomic features of cartilaginous bone tumors extracted from 2D and 3D segmentations on CT and MRI examinations are reproducible, although some degree of segmentation variability highlights the need to perform a preliminary reliability analysis in radiomic studies. 3D and 2D MRI-based texture analyses provide similar rates of stable features. Thus, a 2D approach can be favored in future studies, as this is easier to implement in clinical practice.

Abbreviations

2D: Bidimensional
3D: Volumetric
CT: Computed tomography
ET: Extra Trees
GLCM: Gray-level cooccurrence matrix
GLDM: Gray-level dependence matrix
GLRLM: Gray-level run length matrix
GLSZM: Gray-level size zone matrix
ICC: Intraclass correlation coefficient
LoG: Laplacian of Gaussian
MRI: Magnetic resonance imaging
RFE: Recursive feature elimination
ROI: Region of interest

References

Murphey MD, Walker EA, Wilson AJ, Kransdorf MJ, Temple HT, Gannon FH: From the archives of the AFIP: imaging of primary chondrosarcoma: radiologic-pathologic correlation. Radiographics 23:1245–1278, 2003
Albano D, Messina C, Gitto S, Papakonstantinou O, Sconfienza L: Differential Diagnosis of Spine Tumors: My Favorite Mistake. Semin Musculoskelet Radiol 23:26–35, 2019
Casali PG, Bielack S, Abecassis N, Aro HT, Bauer S, Biagini R, Bonvalot S, Boukovinas I, Bovee JVMG, Brennan B, Brodowicz T, Broto JM, Brugières L, Buonadonna A, De Álava E, Dei Tos AP, Del Muro XG, Dileo P, Dhooge C, Eriksson M, Fagioli F, Fedenko A, Ferraresi V, Ferrari A, Ferrari S, Frezza AM, Gaspar N, Gasperoni S, Gelderblom H, Gil T, Grignani G, Gronchi A, Haas RL, Hassan B, Hecker-Nolting S, Hohenberger P, Issels R, Joensuu H, Jones RL, Judson I, Jutte P, Kaal S, Kager L, Kasper B, Kopeckova K, Krákorová DA, Ladenstein R, Le Cesne A, Lugowska I, Merimsky O, Montemurro M, Morland B, Pantaleo MA, Piana R, Picci P, Piperno-Neumann S, Pousa AL, Reichardt P, Robinson MH, Rutkowski P, Safwat AA, Schöffski P, Sleijfer S, Stacchiotti S, Strauss SJ, Sundby Hall K, Unk M, Van Coevorden F, van der Graaf WTA, Whelan J, Wardelmann E, Zaikova O, Blay JY: Bone sarcomas: ESMO–PaedCan–EURACAN Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol 29:iv79–iv95, 2018
Cannavò L, Albano D, Messina C, Corazza A, Rapisarda S, Pozzi G, Di Bernardo A, Parafioriti A, Scotto G, Perrucchini G, Luzzati A, Sconfienza LM: Accuracy of CT and MRI to assess resection margins in primary malignant bone tumours having histology as the reference standard. Clin Radiol 74:736.e13-736.e21, 2019
Douis H, Singh L, Saifuddin A: MRI differentiation of low-grade from high-grade appendicular chondrosarcoma. Eur Radiol 24:232–240, 2014
Crim J, Schmidt R, Layfield L, Hanrahan C, Manaster BJ: Can imaging criteria distinguish enchondroma from grade 1 chondrosarcoma? Eur J Radiol 84:2222–2230, 2015
Hodel S, Laux C, Farei-Campagna J, Götschi T, Bode-Lesniewska B, Müller DA: The impact of biopsy sampling errors and the quality of surgical margins on local recurrence and survival in chondrosarcoma. Cancer Manag Res 10:3765–3771, 2018
Eefting D, Schrage YM, Geirnaerdt MJA, Le Cessie S, Taminiau AHM, Bovée JVMG, Hogendoorn PCW: Assessment of Interobserver Variability and Histologic Parameters to Improve Reliability in Classification and Grading of Central Cartilaginous Tumors. Am J Surg Pathol 33:50–57, 2009
van de Sande MAJ, van der Wal RJP, Navas Cañete A, van Rijswijk CSP, Kroon HM, Dijkstra PDS, Bloem JL: Radiologic differentiation of enchondromas, atypical cartilaginous tumors, and high‐grade chondrosarcomas—Improving tumor‐specific treatment: A paradigm in transit? Cancer 125:3288–3291, 2019
Davnall F, Yip CSP, Ljungqvist G, Selmi M, Ng F, Sanghera B, Ganeshan B, Miles KA, Cook GJ, Goh V: Assessment of tumor heterogeneity: an emerging imaging tool for clinical practice? Insights Imaging 3:573–589, 2012
Codari M, Melazzini L, Morozov SP, van Kuijk CC, Sconfienza LM, Sardanelli F: Impact of artificial intelligence on radiology: a EuroAIM survey among members of the European Society of Radiology. Insights Imaging 10:105, 2019
Gillies RJ, Kinahan PE, Hricak H: Radiomics: Images Are More than Pictures, They Are Data. Radiology 278:563–577, 2016
Lubner MG, Smith AD, Sandrasegaran K, Sahani DV, Pickhardt PJ: CT Texture Analysis: Definitions, Applications, Biologic Correlates, and Challenges. Radiographics 37:1483–1503, 2017
Berenguer R, Pastor-Juan M, Canales-Vázquez J, Castro-García M, Villas MV, Mansilla Legorburo F, Sabater S: Radiomics of CT Features May Be Nonreproducible and Redundant: Influence of CT Acquisition Parameters. Radiology 288:407–415, 2018
Pavic M, Bogowicz M, Würms X, Glatz S, Finazzi T, Riesterer O, Roesch J, Rudofsky L, Friess M, Veit-Haibach P, Huellner M, Opitz I, Weder W, Frauenfelder T, Guckenberger M, Tanadini-Lang S: Influence of inter-observer delineation variability on radiomics stability in different tumor sites. Acta Oncol 57:1070–1074, 2018
Bologna M, Corino VDA, Montin E, Messina A, Calareso G, Greco FG, Sdao S, Mainardi LT: Assessment of Stability and Discrimination Capacity of Radiomic Features on Apparent Diffusion Coefficient Images. J Digit Imaging 31:879–894, 2018
Kocak B, Ates E, Durmaz ES, Ulusan MB, Kilickesmez O: Influence of segmentation margin on machine learning–based high-dimensional quantitative CT texture analysis: a reproducibility study on renal clear cell carcinomas. Eur Radiol 29:4765–4775, 2019
Gitto S, Cuocolo R, Albano D, Morelli F, Pescatori LC, Messina C, Imbriaco M, Sconfienza LM: CT and MRI radiomics of bone and soft-tissue sarcomas: a systematic review of reproducibility and validation strategies. Insights Imaging 12:68, 2021
Schwier M, van Griethuysen J, Vangel MG, Pieper S, Peled S, Tempany C, Aerts HJWL, Kikinis R, Fennessy FM, Fedorov A: Repeatability of Multiparametric Prostate MRI Radiomics Features. Sci Rep 9:9441, 2019
Ugga L, Cuocolo R, Solari D, Guadagno E, D’Amico A, Somma T, Cappabianca P, del Basso de Caro ML, Cavallo LM, Brunetti A: Prediction of high proliferative index in pituitary macroadenomas using MRI-based radiomics and machine learning. Neuroradiology 61:1365–1373, 2019
Zwanenburg A, Vallières M, Abdalah MA, Aerts HJWL, Andrearczyk V, Apte A, Ashrafinia S, Bakas S, Beukinga RJ, Boellaard R, Bogowicz M, Boldrini L, Buvat I, Cook GJR, Davatzikos C, Depeursinge A, Desseroit M, Dinapoli N, Dinh CV, Echegaray S, El Naqa I, Fedorov AY, Gatta R, Gillies RJ, Goh V, Götz M, Guckenberger M, Ha SM, Hatt M, Isensee F, Lambin P, Leger S, Leijenaar RTH, Lenkowicz J, Lippert F, Losnegård A, Maier-Hein KH, Morin O, Müller H, Napel S, Nioche C, Orlhac F, Pati S, Pfaehler EAG, Rahmim A, Rao AUK, Scherer J, Siddique MM, Sijtsema NM, Socarras Fernandez J, Spezi E, Steenbakkers RJHM, Tanadini-Lang S, Thorwarth D, Troost EGC, Upadhaya T, Valentini V, van Dijk LV, van Griethuysen J, van Velden FHP, Whybra P, Richter C, Löck S: The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-based Phenotyping. Radiology 295:328–338, 2020
Koo TK, Li MY: A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. J Chiropr Med 15:155–163, 2016
Yushkevich PA, Piven J, Hazlett HC, Smith RG, Ho S, Gee JC, Gerig G: User-guided 3D active contour segmentation of anatomical structures: Significantly improved efficiency and reliability. Neuroimage 31:1116–1128, 2006
Jenkinson M, Beckmann CF, Behrens TEJ, Woolrich MW, Smith SM: FSL. Neuroimage 62:782–790, 2012
van Griethuysen JJM, Fedorov A, Parmar C, Hosny A, Aucoin N, Narayan V, Beets-Tan RGH, Fillion-Robin J, Pieper S, Aerts HJWL: Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res 77:e104–e107, 2017
Di Leo G, Sardanelli F: Statistical significance: p value, 0.05 threshold, and applications to radiomics—reasons for a conservative approach. Eur Radiol Exp 4:18, 2020
R Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, 2020
van der Walt S, Colbert SC, Varoquaux G: The NumPy Array: A Structure for Efficient Numerical Computation. Comput Sci Eng 13:22–30, 2011
Cuocolo R, Caruso M, Perillo T, Ugga L, Petretta M: Machine Learning in oncology: A clinical appraisal. Cancer Lett 481:55–62, 2020
Kocak B, Durmaz ES, Kaya OK, Ates E, Kilickesmez O: Reliability of Single-Slice–Based 2D CT Texture Analysis of Renal Masses: Influence of Intra- and Interobserver Manual Segmentation Variability on Radiomic Feature Reproducibility. AJR Am J Roentgenol 213:377–383, 2019
Kocak B, Durmaz ES, Erdim C, Ates E, Kaya OK, Kilickesmez O: Radiomics of Renal Masses: Systematic Review of Reproducibility and Validation Strategies. AJR Am J Roentgenol 214:129–136, 2020
Gitto S, Cuocolo R, Albano D, Chianca V, Messina C, Gambino A, Ugga L, Cortese MC, Lazzara A, Ricci D, Spairani R, Zanchetta E, Luzzati A, Brunetti A, Parafioriti A, Sconfienza LM: MRI radiomics-based machine-learning classification of bone chondrosarcoma. Eur J Radiol 128:109043, 2020
Fritz B, Müller DA, Sutter R, Wurnig MC, Wagner MW, Pfirrmann CWA, Fischer MA: Magnetic Resonance Imaging–Based Grading of Cartilaginous Bone Tumors. Invest Radiol 53:663–672, 2018
Gitto S, Cuocolo R, Annovazzi A, Anelli V, Acquasanta M, Cincotta A, Albano D, Chianca V, Ferraresi V, Messina C, Zoccali C, Armiraglio E, Parafioriti A, Sciuto R, Luzzati A, Biagini R, Imbriaco M, Sconfienza LM: CT radiomics-based machine learning classification of atypical cartilaginous tumours and appendicular chondrosarcomas. EBioMedicine 68:103407, 2021
Lisson CS, Lisson CG, Flosdorf K, Mayer-Steinacker R, Schultheiss M, von Baer A, Barth TFE, Beer AJ, Baumhauer M, Meier R, Beer M, Schmidt SA: Diagnostic value of MRI-based 3D texture analysis for tissue characterisation and discrimination of low-grade chondrosarcoma from enchondroma: a pilot study. Eur Radiol 28:468–477, 2018
Li L, Wang K, Ma X, Liu Z, Wang S, Du J, Tian K, Zhou X, Wei W, Sun K, Lin Y, Wu Z, Tian J: Radiomic analysis of multiparametric magnetic resonance imaging for differentiating skull base chordoma and chondrosarcoma. Eur J Radiol 118:81–87, 2019
Yin P, Mao N, Liu X, Sun C, Wang S, Chen L, Hong N: Can clinical radiomics nomogram based on 3D multiparametric MRI features and clinical characteristics estimate early recurrence of pelvic chondrosarcoma? J Magn Reson Imaging 51:435–445, 2020
Mongan J, Moy L, Kahn CE: Checklist for Artificial Intelligence in Medical Imaging (CLAIM): A Guide for Authors and Reviewers. Radiol Artif Intell 2:e200029, 2020
De Coninck T, Jans L, Sys G, Huysse W, Verstraeten T, Forsyth R, Poffyn B, Verstraete K: Dynamic contrast-enhanced MR imaging for differentiation between enchondroma and chondrosarcoma. Eur Radiol 23:3140–3152, 2013
Geirnaerdt MJA, Hogendoorn PCW, Bloem JL, Taminiau AHM, van der Woude H-J: Cartilaginous Tumors: Fast Contrast-enhanced MR Imaging. Radiology 214:539–546, 2000
Yoo HJ, Hong SH, Choi J, Moon KC, Kim H, Choi J, Kang HS: Differentiating high-grade from low-grade chondrosarcoma with MR imaging. Eur Radiol 19:3008–3014, 2009

Funding

Open access funding provided by Università degli Studi di Milano within the CRUI-CARE Agreement.

Author information

Authors and Affiliations

Dipartimento Di Scienze Biomediche Per La Salute, Università Degli Studi Di Milano, Via Luigi Mangiagalli 31, 20133, Milan, Italy
Salvatore Gitto & Luca Maria Sconfienza
Dipartimento Di Medicina Clinica E Chirurgia, Università Degli Studi Di Napoli “Federico II”, Naples, Italy
Renato Cuocolo
Laboratory of Augmented Reality for Health Monitoring (ARHeMLab), Dipartimento Di Ingegneria Elettrica E Delle Tecnologie Dell’Informazione, Università Degli Studi Di Napoli “Federico II”, Naples, Italy
Renato Cuocolo
Unità di Radiodiagnostica, Presidio CTO, ASST Pini-CTO, Milan, Italy
Ilaria Emili
Dipartimento di Radiologia Diagnostica ed Interventistica, Università degli Studi di Milano, Ospedale San Paolo, Milan, Italy
Laura Tofanelli
Ospedale Evangelico Betania, Naples, Italy
Vito Chianca
Clinica Di Radiologia, Istituto Imaging Della Svizzera Italiana - Ente Ospedaliero Cantonale, Lugano, Switzerland
Vito Chianca
IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
Domenico Albano, Carmelo Messina & Luca Maria Sconfienza
Sezione Di Scienze Radiologiche, Dipartimento Di Biomedicina, Neuroscienze E Diagnostica Avanzata, Università Degli Studi Di Palermo, Palermo, Italy
Domenico Albano
Dipartimento Di Scienze Biomediche Avanzate, Università Degli Studi Di Napoli “Federico II”, Naples, Italy
Massimo Imbriaco

Authors

Salvatore Gitto
Renato Cuocolo
Ilaria Emili
Laura Tofanelli
Vito Chianca
Domenico Albano
Carmelo Messina
Massimo Imbriaco
Luca Maria Sconfienza

Corresponding author

Correspondence to Salvatore Gitto.

Ethics declarations

Compliance with ethical standards

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The local Institutional Review Board approved this retrospective study and waived the need for informed consent.

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Salvatore Gitto and Renato Cuocolo contributed equally to this work.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Gitto, S., Cuocolo, R., Emili, I. et al. Effects of Interobserver Variability on 2D and 3D CT- and MRI-Based Texture Feature Reproducibility of Cartilaginous Bone Tumors. J Digit Imaging 34, 820–832 (2021). https://doi.org/10.1007/s10278-021-00498-3

Received: 12 August 2020
Revised: 27 May 2021
Accepted: 19 July 2021
Published: 17 August 2021
Issue Date: August 2021
DOI: https://doi.org/10.1007/s10278-021-00498-3

Introduction