Introduction

Optical coherence tomography (OCT) is an indispensable ophthalmic imaging technology that effectively identifies retinal structural alterations. OCT technologies have evolved from time-domain OCT to spectral-domain OCT (SD-OCT) and swept-source OCT (SS-OCT). The recently developed SS-OCT uses a tunable light source with a central wavelength of 1,050 nm, and a photodiode detector with a semiconductor camera for light detection. These features permit a high scanning speed and a deep imaging range with uniform sensitivity. In glaucoma cases, these technological advances in OCT devices have eased the measurement of changes in the peripapillary retinal nerve fibre layer (PP-RNFL) and ganglion cell-inner plexiform layer (GC-IPL) thickness. Both layers are critical for evaluating the extent of glaucomatous optic nerve damage.

Diagnostic precision is of utmost importance when diagnosing a disease or monitoring its progression using OCT. Both image quality and repeatability/reproducibility of an OCT measurement affect its overall diagnostic precision. Segmentation error and misalignment of measurement area generate artefacts that affect the image quality and, ultimately, the OCT measurement values1,2,3. Repeatability and reproducibility relate to the scatter of measured values and indicate whether a constant value is obtained when the same object is measured repeatedly. These parameters are helpful to monitor disease progression because repeated measurements are performed over time at the same anatomical region of an individual patient. Both SD-OCT and SS-OCT have demonstrated repeatability and reproducibility for clinical use4,5,6,7, which is an important reason for the widespread use of OCT in the diagnosis and management of various ocular conditions, including glaucoma.

Although image quality and repeatability/reproducibility of OCT images are important factors when interpreting the results, only a few studies have previously investigated the effect of image quality fluctuations on repeatability or reproducibility8,9. Moreover, these studies were conducted using time-domain OCT or SD-OCT at the peripapillary area. Therefore, this study aimed to evaluate the effect of image quality fluctuations on the repeatability of SS-OCT measurement values in both the macular and peripapillary areas. The results of this study indicate the importance of maintaining image quality in SS-OCT while performing repeated measurements.

Results

Of the 58 healthy subjects who were selected for OCT imaging, two were excluded based on their image quality scores. SS-OCT data of 56 subjects (25 men and 31 women), comprising 168 results from the three consecutive OCT examinations, were analysed. Based on the tertile values of the mean absolute difference of image quality score, the subjects were stratified into three groups—low image quality score difference group (LIQD; with scores ranging between 0.06 and 0.86 in PP-RNFL, and between 0.067 and 0.747 in GC-IPL), moderate image quality score difference group (MIQD; with scores ranging between 0.873 and 1.927 in PP-RNFL, and between 0.753 and 1.227 in GC-IPL), and high image quality score difference group (HIQD; with scores ranging between 1.947 and 10.053 in PP-RNFL, and between 1.253 and 8.3 in GC-IPL). The three groups showed no significant differences in their demographic or clinical characteristics (Table 1).

Table 1 Comparison of demographics and clinical characteristics among groups.

Comparison of PP-RNFL and GC-IPL thicknesses among the three groups

Table 2 shows results for the comparison of PP-RNFL and GC-IPL thicknesses among the three groups at each measurement sector. The linear mixed model showed no significant differences in PP-RNFL and GC-IPL thicknesses of different sectors among the three groups (Table 2). However, when the difference in image quality between OCT examinations was large, GC-IPL tended to be thick; this tendency was not seen in the peripapillary sectors.

Table 2 Comparison of thickness values measured using SS-OCT among the three groups.

Correlations between image quality and SS-OCT results at each measurement sector

Correlation analyses between image quality and OCT results at each measurement sector were performed for repeated measurements (Table 3). After adjusting for age and sex, five sectors showed significant negative correlations between image quality and PP-RNFL (average PP-RNFL, superotemporal, superior, inferior, and temporoinferior sectors) or GC-IPL (average GC-IPL, temporosuperior, nasoinferior, inferior, and temporoinferior sectors).

Table 3 Correlations and partial correlations between image quality and SS-OCT results.

Comparisons of repeatability among the three groups at each measurement sector

The ICC of three consecutive measurement values was calculated and compared among the groups (Table 4). The overall repeatability was high in all sectors for all groups (ICC > 0.8). The ICC values were the lowest for the HIQD group in every measurement sector. Figure 1 shows representative results for the difference in thickness at each PP-RNFL measurement sector by image quality difference. As the image quality difference increased, the difference between the measured values increased accordingly. Results of between-group comparisons showed significant differences in repeatability at only two sectors (temporoinferior for PP-RNFL; inferior for GC-IPL) in the LIQD and MIQD groups. In addition, results of comparisons between LIQD and HIQD groups, and between MIQD and HIQD groups, showed significant differences in repeatability at most sectors for PP-RNFL, except at the superior, nasal, superonasal, and nasoinferior sectors. On comparison of repeatability in GC-IPL sectors, significant differences were seen at the temporosuperior, inferior, and temporoinferior sectors between LIQD and HIQD groups, and at the average GC-IPL, nasoinferior, and inferior sectors between MIQD and HIQD groups. No sector showed significant differences in repeatability when compared between LIQD and MIQD groups. The proportion of sectors affected by image quality fluctuations was higher in PP-RNFL than in GC-IPL.

Table 4 Comparison of repeatability among the three groups.
Figure 1

Representative results of differences between the measured values by image quality difference at each sector of PP-RNFL. The image quality difference for a, b, c, and d was 0.107, 0.88, 5.233, and 8.087, respectively. The X-axis indicates the measurement sectors, and the Y-axis indicates the difference between the measured values. The solid line indicates the difference between the first and second measurements. The thick dotted line indicates the difference between the second and third measurements. The thin dotted line indicates the difference between the first and third measurements. PP Aver average PP-RNFL thickness, T temporal, S superior, N nasal, I inferior, TS temporosuperior, ST superotemporal, SN superonasal, NS nasosuperior, NI nasoinferior, IN inferonasal, IT inferotemporal, TI temporoinferior, PP-RNFL peripapillary retinal nerve fibre layer.

Discussion

The results of this study, which investigated the association between image quality fluctuations and repeatability of SS-OCT measurements, showed that repeatability decreases with an increase in image quality fluctuation in several sectors of PP-RNFL and GC-IPL. These observations were made in healthy subjects with an OCT image quality > 60, the threshold recommended by the manufacturer for clinical use. Therefore, our study was conducted under settings wherein the factors affecting OCT results, such as low image quality (image quality score < 60) and structural alteration by ocular disease, were controlled. In addition, when the study groups were compared based on the mean absolute difference among three consecutive OCT measurements, no significant differences were noted in the measured thickness at any of the measurement sectors (Table 2). This result also indicates that there was no large deviation in the measured values of our data set. Nevertheless, even with good image quality (recommended for clinical use) and high repeatability (based on ICC), the measurement repeatability was affected by image quality fluctuations in several sectors, especially in comparisons involving the HIQD group. Moreover, this phenomenon affected sectors that are considered important in glaucoma management. Thus, it is crucial to maintain not only a high level of image quality but also a constant value of image quality for the clinical application of SS-OCT.

Interestingly, although the HIQD group had the lowest ICC value in each measurement sector among the three groups, not all sectors showed significant differences on comparison with the LIQD or MIQD groups. In addition, only five sectors of the clock-hour map for PP-RNFL (superotemporal, nasal, inferonasal, inferotemporal, and temporoinferior sectors) showed ICC values under 0.9. If repeatability were determined exclusively by image quality, the repeatability of the OCT results obtained from subjects in the HIQD group should be lower regardless of the location of the measurement sector. Segmentation is important for analysing the thickness of the retinal layer using OCT results. Although image quality is a critical factor for segmentation, ocular structural factors such as axial length, shape of the optic disc, or tortuosity of retinal vessels also affect segmentation3,10,11. The superotemporal, inferonasal, inferotemporal, and temporoinferior sectors contain retinal blood vessels, which contribute to the structural variation of the parapapillary area. Thus, the anatomic structure around the optic disc, which varies largely even in healthy eyes, could have influenced the repeatability.

Inter-individual diversity in the optic disc shape and peripapillary structures contributes to inaccuracies in the measurement of PP-RNFL thickness by OCT. In contrast, the macular area is well-known for its inter-individual similarities12,13,14. Such inaccuracies might influence clinical decision-making in glaucoma management. Therefore, several studies have emphasised the usefulness of GC-IPL parameters for the diagnosis of glaucoma in myopic eyes15,16,17. In the present study, the repeatability of GC-IPL sectors was relatively less affected by image quality fluctuations as compared to PP-RNFL sectors. This result further supports the usefulness of macular GC-IPL thickness evaluation for estimating glaucoma status, although further studies on patients with glaucoma are required to confirm this finding. Previous studies have shown a positive correlation between image quality and OCT-based measurement of macular or PP-RNFL thickness18,19,20,21, i.e., a reduction in image quality decreases the macular or PP-RNFL thickness, thereby leading to incorrect OCT interpretations of glaucoma progression. In this study, image quality correlated significantly in several sectors for both PP-RNFL and GC-IPL thickness, and this result did not change even after adjusting for age and sex. Therefore, image quality remains an essential factor in the interpretation of SS-OCT results. Unlike the correlation results reported previously, the negative correlation between the thickness values and image quality may be due to repeated measurements, small sample size, or unknown intrinsic characteristics of SS-OCT. It is possible that a study on patients with glaucoma may yield negative correlation between the thickness values and image quality.

Studies on the relationship between image quality fluctuations and repeatability of OCT measurements are limited. Lee et al. reported the effect of signal strength difference on the repeatability of PP-RNFL thickness in time-domain OCT8, and Kim et al. reported the effect of signal strength on PP-RNFL thickness and colour-coded classification in SD-OCT9. Both studies inferred that substantial differences in the signal strength lower the repeatability. Our study presents similar results using SS-OCT. Compared to previous studies, the use of three consecutive measurements for statistical analysis lends more reliability to this study, and this strategy is more appropriate for identifying the impact of image quality fluctuation on OCT results.

This study has several limitations. First, although the data were collected prospectively, the number of subjects included was relatively small. Second, the effect of image quality fluctuation on repeatability was studied in healthy subjects. A similar study on patients with glaucoma will help to understand the clinical significance of image quality fluctuations on SS-OCT results. Third, the results of our study cannot be applied directly to other studies focused on other types of OCT. This is because the image quality score which was used for calculating image quality fluctuation in the present study was developed by the manufacturer of DRI OCT, although it is not difficult to predict that the accuracy of segmentation of the OCT will be lowered if the quality of the image deteriorates. Further studies involving other types of OCT seem necessary to clarify the effect of image quality fluctuation on repeatability in each type of OCT. Despite these limitations, our findings are meaningful because this is the first study to investigate the effect of image quality fluctuation on repeatability in SS-OCT using prospectively collected data.

In conclusion, this study reported that higher image quality fluctuation leads to lower repeatability of SS-OCT results in several sectors of PP-RNFL and GC-IPL. Interestingly, the identified sectors were clinically important for glaucoma management. In addition, the repeatability of GC-IPL sectors was relatively less affected than that of PP-RNFL sectors by image quality fluctuations. Thus, maintaining a high-quality image status is vital to enhance the reliability of SS-OCT for PP-RNFL and GC-IPL measurements, more so in the PP-RNFL region.

Methods

This study collected raw data retrospectively from the dataset used in a previous study to compare the repeatability and agreement between SD-OCT and SS-OCT in healthy eyes5. The institutional review board of Yonsei University Severance Hospital, Seoul, Korea, approved this study (1-2019-0043), and the need for written informed consent was waived because of the retrospective study design. The study adhered to the tenets of the Declaration of Helsinki. The detailed characteristics of the subjects in the dataset have been described previously5. Normal subjects who had visited the glaucoma clinic at our hospital between August 2014 and December 2014 were enrolled. Medical history, Snellen best-corrected visual acuity (BCVA), slit-lamp biomicroscopy findings, intraocular pressure (IOP; Goldmann applanation tonometry), and indirect ophthalmoscopy findings were obtained. In addition, the following data were acquired: axial length estimated using the IOL Master (Carl Zeiss Meditec AG, Jena, Germany); central corneal thickness calculated using ultrasound pachymetry (DGH-1000; DGH Technology Inc., Frazer, PA, USA); optic disc and RNFL thickness measurements performed using a +90 diopter (D) lens, colour disc, and red-free photography (VISUCAM200, Carl Zeiss Meditec AG, Jena, Germany). Optic nerve function had been estimated using a Humphrey Visual Field analyser (24-2 Swedish Interactive Threshold Algorithm; Carl Zeiss Meditec, Inc., Dublin, CA, USA).

Healthy subjects aged > 19 years with a BCVA ≥ 20/25, an IOP < 21 mmHg, and no evidence of glaucomatous optic disc changes, RNFL defects, or visual field changes were included retrospectively. The eye that was analysed in each subject was selected randomly. Exclusion criteria were a cataract graded > 3 on the Lens Opacities Classification System III, axial length > 24.5 mm, refractive error with a spherical equivalent beyond ±5 D or a cylindrical error beyond ±3 D, and any medical or ophthalmic conditions that influenced the optic disc, RNFL, and visual field measurements.

Thickness measurement using SS-OCT for repeatability

In this study, we used the DRI OCT-1 system (Topcon, Tokyo, Japan, analysis software version 9.1.2.28693), which had a high-speed wavelength tuning laser source with a central wavelength of 1,050 nm. This SS-OCT system had an image acquisition speed of 100,000 A-scans/second, with axial and transverse resolutions of 8 and 20 µm, respectively. Three consecutive SS-OCT scans were acquired on the same day with an interval of at least 5 min between the scans. A single technician performed all scans using an internal fixation target. Pupillary dilation was performed in all subjects. Three-dimensional (3D) optic disc and 3D wide scan protocols were used to measure PP-RNFL and GC-IPL thicknesses, respectively. The 3D optic disc scan covered a 6 × 6-mm area on the optic disc and comprised 512 A-scans × 256 B-scans. PP-RNFL thickness was measured in a 3.4-mm-diameter scan circle centred on the optic disc. The 3D wide scan protocol covered a 12 × 9-mm rectangular area centred between the optic disc and fovea and comprised 512 A-scans × 256 B-scans. PP-RNFL thickness was measured in each quadrant (evenly spaced 4 sectors), 12 clock-hour sectors (evenly spaced 12 sectors), and as an average. The quadrant PP-RNFL sector names started with the number 4, while the clock-hour sector names started with the number 12. The average GC-IPL thickness and measurement in each of six sectors (evenly configured sectors centred on the fovea) were collected. Built-in automated segmentation algorithms were used to distinguish each retinal layer. Two investigators (S.Y.L. and Y.H.) independently reconfirmed the image quality, segmentation, and alignment of the measurement window. SS-OCT images with image quality scores > 60 were selected for analysis according to the manufacturer’s recommendation.

The mean absolute difference among three consecutive OCT measurements was calculated as follows:

$$\begin{aligned} & \text{Mean absolute difference of image quality score:} \\ & \quad \quad \left( \left| IQ1 - IQ2 \right| + \left| IQ2 - IQ3 \right| + \left| IQ1 - IQ3 \right| \right)/3 \\ \end{aligned}$$

where IQn is the image quality score at the nth measurement.
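As a minimal sketch of this calculation, the Python function below averages the three absolute pairwise differences among the image quality scores of one subject. The function name and the example scores are illustrative assumptions and do not come from the study data.

```python
from itertools import combinations


def mean_abs_iq_difference(iq_scores):
    """Mean of the absolute pairwise differences among three image quality scores."""
    if len(iq_scores) != 3:
        raise ValueError("expected exactly three image quality scores")
    # Three pairs: (IQ1, IQ2), (IQ1, IQ3), (IQ2, IQ3)
    pairs = list(combinations(iq_scores, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


# Hypothetical scores for one subject (not taken from the study).
example = mean_abs_iq_difference([66.0, 64.0, 65.0])
print(round(example, 3))
```

A subject whose three scans have identical scores yields a value of 0, and larger values indicate greater fluctuation between scans.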

The subjects were stratified into three groups based on the tertile values of the mean absolute difference of image quality score: LIQD (n = 18), MIQD (n = 19), and HIQD (n = 19). Subjects in the LIQD group formed the first third when the mean absolute differences of image quality score were listed in ascending order; therefore, they had similar image quality scores among the three consecutive OCT results. In contrast, subjects in the HIQD group formed the last third and showed substantial variation among the three image quality scores.
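The stratification can be sketched in Python as a rank-based split into thirds. The exact cut-point and tie-handling rules are not stated in the text, so the integer-division boundaries used here are an assumption; they happen to give group sizes of 18, 19, and 19 for 56 subjects. The subject identifiers and values are hypothetical.

```python
from collections import Counter


def stratify_by_tertile(mad_by_subject):
    """Assign each subject to LIQD, MIQD, or HIQD by rank of mean absolute difference."""
    # Sort subject identifiers by their mean absolute difference, ascending.
    ordered = sorted(mad_by_subject, key=mad_by_subject.get)
    n = len(ordered)
    low_end, mid_end = n // 3, (2 * n) // 3  # assumed rank cut points
    groups = {}
    for rank, subject in enumerate(ordered):
        if rank < low_end:
            groups[subject] = "LIQD"
        elif rank < mid_end:
            groups[subject] = "MIQD"
        else:
            groups[subject] = "HIQD"
    return groups


# Hypothetical data: 56 subjects with made-up mean absolute differences.
fake_values = {f"S{i:02d}": 0.1 * i for i in range(56)}
group_sizes = Counter(stratify_by_tertile(fake_values).values())
print(group_sizes)
```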

Statistical analyses

Analyses of variance and chi-square tests were performed for the comparison of continuous and categorical variables between the groups. A linear mixed model was used to compare the thickness values among the three groups. To determine the repeatability of three consecutive measurements, intraclass correlation coefficients (ICCs) were used. The degree of repeatability was classified according to the ICC value as almost perfect (0.81–1), substantial (0.61–0.8), moderate (0.41–0.6), fair (0.21–0.4), or slight (0–0.2)22. To compare the between-group ICC values, the z-score test was used22,23,24. Pearson’s correlation coefficients with and without adjustment for age and sex were used to investigate the correlation between the image quality and thickness value. Correlation coefficients were estimated using a linear mixed-effects model to account for the three datasets from each individual. All statistical analyses were performed using SAS version 9.4 software (SAS Institute Inc., Cary, NC, USA) by a statistician (H.S.L.). Statistical significance was defined as p value < 0.05.
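To illustrate how an ICC for repeated measurements can be computed and mapped to the repeatability categories listed above, a minimal Python sketch follows. The text does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form is an assumption, as are the function names and the handling of values that fall between the published bands (for example, 0.805). The study itself used SAS, not this code.

```python
def icc_two_way_absolute(measurements):
    """ICC (two-way random effects, absolute agreement, single measurement).

    `measurements` is a list of rows, one row per subject, each holding
    the k repeated values for that subject.
    """
    n = len(measurements)
    k = len(measurements[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("need at least 2 subjects with the same number (>= 2) of repeats")
    grand = sum(sum(row) for row in measurements) / (n * k)
    row_means = [sum(row) / k for row in measurements]
    col_means = [sum(row[j] for row in measurements) / n for j in range(k)]
    # Sums of squares for subjects (rows), repeats (columns), and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in measurements for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all values are identical")
    return (ms_rows - ms_error) / denominator


def repeatability_label(icc):
    """Map an ICC value to the categories given in the text (band edges assumed)."""
    if icc <= 0.2:
        return "slight"
    if icc <= 0.4:
        return "fair"
    if icc <= 0.6:
        return "moderate"
    if icc <= 0.8:
        return "substantial"
    return "almost perfect"


# Hypothetical thickness values (µm) for three subjects, three scans each.
icc = icc_two_way_absolute([[100.0, 101.0, 99.0], [90.0, 91.0, 90.0], [110.0, 109.0, 111.0]])
print(round(icc, 3), repeatability_label(icc))
```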