Background

Assessing changes in clinical status is important for patients with chronic illness because it gives clinicians and caretakers a way to gauge treatment response and deterioration in condition. A patient’s health status may not always be apparent to clinicians, and self-reported health-related quality of life (HRQoL) scores provide a subjective assessment of patient health. One chronic illness for which this is particularly relevant is spondyloarthritis (SpA). SpA encompasses a group of interrelated rheumatic conditions including ankylosing spondylitis (AS), psoriatic arthritis (PsA), spondyloarthritis associated with inflammatory bowel disease (IBD) and reactive arthritis. AS, regarded as the prototype of SpA, has been shown to be associated with greater work disability (WD) compared to the general population, with WD rates varying from 3-50% in western countries [1,2,3]. Patients with AS are 3.1 times more likely to withdraw from work than expected in the general population and they are also more likely to experience a lower quality of life (QoL) [4, 5]. This in turn results in loss of work productivity and increased socioeconomic burden. Studies have also shown that patients with axial SpA report a lower HRQoL than do healthy controls and that this reduction in HRQoL is associated with fatigue, pain, increased disease activity, and decreased daily activity and exercise [6,7,8]. In addition, a lower HRQoL in SpA patients is associated with adverse psychological outcomes and a higher prevalence of anxiety and depression [9].

There are two main types of HRQoL instruments for assessing patients with chronic diseases: disease-specific and generic. For axial SpA, disease-specific tools for assessing functional disability include the Bath Ankylosing Spondylitis Functional Index (BASFI), the Leeds Disability Questionnaire (LDQ) and the Dougados Functional Index (DFI). Generic instruments are more useful for assessing disease impact because they allow comparisons between different disease populations.

The EuroQoL 5-dimension (EQ-5D) is a generic health measure instrument developed by the EuroQoL group, which allows a quantitative expression of the individual’s perception of their overall health status [10]. It serves as an important utility measure for clinical and economic appraisal, particularly in the cost-utility analysis of various health care interventions, and the calculation of quality-adjusted life-years (QALYs). It has been applied to the Chinese population previously [11] and has been shown to be useful in assessing QoL in patients with SpA [12]. However, the responsiveness of EQ-5D to changes in disease status over time in patients with SpA is unclear. Responsiveness refers to the ability of a score to capture underlying changes in a patient’s health status over time. It is essential for clinicians to assess whether the treatment provided has improved the patient’s QoL and whether further escalation of treatment is required. EQ-5D is also a valuable tool as it allows cross comparison with other rheumatological diseases. Hence, the aim of this study is to test the responsiveness of the EQ-5D in patients with SpA.

Methods

A total of 151 consecutive patients of Chinese ethnicity were prospectively recruited from two rheumatology specialist clinics between May and December 2017 and reassessed at follow-up 6 months later (November 2017 to June 2018). All recruited patients were diagnosed with either axial SpA or peripheral SpA by rheumatologists based on the Assessment of Spondyloarthritis international Society (ASAS) criteria [13,14,15] and by expert opinion. All recruited patients were 18 years old or above. Patients who did not consent to participate, were non-Chinese, were illiterate or were unable to comprehend the instruments were excluded. Subjects who consented were interviewed for a panel of sociodemographic and disease-associated parameters, disease activity and severity factors, and HRQoL scores that highlight functional and mental health status. Both baseline and follow-up interviews were conducted in person at the consultation clinic. At the follow-up interview, subjects were reassessed by the same research personnel using the same study questionnaires as well as the global rating of change scale. A sample size of at least 100 is recommended by Terwee et al. to provide good-quality psychometric evidence [16]. Ethics approval was obtained from the local institutional review board. All methods were carried out in accordance with relevant guidelines and regulations.

Sociodemographic and disease-associated data

Patients’ smoking and drinking habits, education level, income and occupation were recorded. Disease-associated data including disease duration, presence of back pain and/or peripheral arthritis, dactylitis, enthesitis, and extra-articular manifestations such as uveitis, psoriasis, and IBD were collected. Baseline treatment, including the use of non-steroidal anti-inflammatory drugs (NSAIDs) or cyclooxygenase-2 (cox-2) inhibitors, disease modifying anti-rheumatic drugs (DMARDs) and biologics, and any subsequent change in treatment after 6 months were documented. Physical examination was performed to determine the tender joint count, the swollen joint count, and the dactylitis and enthesitis scores. An antero-posterior radiograph of the lumbosacral spine was used for grading of sacroiliitis according to the modified New York criteria [17] by a rheumatologist (HYC) who was blinded to the clinical data. Radiological sacroiliitis was graded as: 0, normal; 1, suspicious; 2, minimal sclerosis with some erosions; 3, erosion with widening of joint space and possible partial ankylosis; 4, complete ankylosis. Bilateral sacroiliitis of grade 2 or above, or unilateral sacroiliitis of grade 3 or above, was defined as AS. Patients were treated by the attending rheumatologist according to their disease activity and severity.
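The radiographic rule in the paragraph above can be written as a short check. This is a minimal sketch of that rule only; the function name and the separate left/right grade arguments are illustrative assumptions, not part of the study protocol.

```python
def meets_radiographic_as_criterion(left_grade: int, right_grade: int) -> bool:
    """Apply the rule stated in the text to a pair of sacroiliitis grades (0-4).

    Returns True for bilateral grade >= 2 or unilateral grade >= 3.
    The function name and argument layout are hypothetical.
    """
    for grade in (left_grade, right_grade):
        if grade not in range(5):
            raise ValueError(f"sacroiliitis grade must be 0-4, got {grade}")
    bilateral = left_grade >= 2 and right_grade >= 2
    unilateral = left_grade >= 3 or right_grade >= 3
    return bilateral or unilateral
```

For example, grades (2, 2) and (3, 0) satisfy the rule, whereas (2, 1) does not.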

Disease activity and severity scores

All recruited patients filled in the Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) [18] and BASFI [19] to determine the disease activity and functional disability respectively. Spinal mobility was assessed clinically to determine the Bath Ankylosing Spondylitis Metrology Index (BASMI) score [20]. The Bath Ankylosing Spondylitis Global Index (BASGI) [21] and C-reactive protein (CRP) were measured for calculation of the Ankylosing Spondylitis Disease Activity Score-CRP (ASDAS-CRP) [22], which is a composite disease activity measure of SpA. Human leucocyte antigen (HLA) B27 status was also checked as a poor prognostic marker. BASDAI and ASDAS are more often used for patients with axial disease. However, both tools have demonstrated good discriminatory ability in patients with peripheral SpA as well [23].
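The text does not spell out how the BASDAI is scored. As a hedged illustration, the sketch below uses the commonly published scoring of the six-item questionnaire (each item on a 0-10 scale, with the two morning-stiffness items averaged before the overall mean is taken). That formula is an assumption brought in from the general literature rather than from this paper; the cut-off of 4 for low disease activity is the one quoted in the Results.

```python
def basdai(q1: float, q2: float, q3: float, q4: float, q5: float, q6: float) -> float:
    """Commonly published BASDAI scoring (assumed here, not stated in the text).

    q1-q4: fatigue, spinal pain, peripheral joint pain/swelling, enthesitis.
    q5-q6: severity and duration of morning stiffness (averaged together).
    """
    answers = (q1, q2, q3, q4, q5, q6)
    if any(not 0 <= a <= 10 for a in answers):
        raise ValueError("each BASDAI item must lie between 0 and 10")
    return (q1 + q2 + q3 + q4 + (q5 + q6) / 2) / 5


def has_low_disease_activity(score: float) -> bool:
    """BASDAI < 4 is the low-disease-activity cut-off used in the Results."""
    return score < 4
```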

Functional and mental health status

The SF-36 [24,25,26] was used for assessment of mental and physical health and as a comparable generic questionnaire marker of EQ-5D changes. The Hospital Anxiety and Depression Scale (HADS) [27] is a fourteen-item scale with seven items each for the anxiety and depression subscales. It has been validated in Chinese axial SpA patients and has been found to be useful in screening for depressive and anxiety disorders in SpA [28].

The main study parameter was the EQ-5D, a standardized measure of health status developed by the EuroQoL group that allows a generic assessment of health status for clinical and economic appraisal [10]. It has been useful in assessing the HRQoL in patients with musculoskeletal problems [29,30,31,32,33]. It consists of a 2-page questionnaire comprising the EQ-5D descriptive system and the EQ visual analogue scale (EQ-VAS). The descriptive system comprises 5 domains: mobility, self-care, usual activities, pain/discomfort and anxiety/depression. There are 2 versions of EQ-5D, namely the EQ-5D-3 level (EQ-5D-3L) and the EQ-5D-5 level (EQ-5D-5L) versions. In the EQ-5D-3L, each domain is scored on 3 levels (no problem, some problem and extreme problem). We used the EQ-5D-5L version for this study, in which each domain is scored on 5 levels, with 1 representing no problem and 5 representing extreme problem. Previous studies published by the EuroQoL group have shown that the 5-level version could significantly increase reliability and sensitivity while maintaining the feasibility of the test, and that it could potentially reduce ceiling effects [10]. The scores of the 5 domains are combined into a 5-digit number, which is converted into a single index value. The EQ-VAS allows patients to self-report their own perceived quality of life on a scale of 0 (worst) to 100 (best). To estimate the EQ-5D score, we applied the Chinese-specific EQ-5D-5L value set, which ranges from -0.391 for the worst health status (‘55555’) to 1 for the best health status (‘11111’) [34].
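To make the 5-digit health state concrete, the sketch below builds the profile string from the five domain levels. The domain keys and function name are illustrative assumptions. The full Chinese value set [34] is not reproduced here, so the lookup table contains only the two anchor values quoted in the text.

```python
# Domain order follows the descriptive system listed in the text.
DOMAINS = (
    "mobility",
    "self_care",
    "usual_activities",
    "pain_discomfort",
    "anxiety_depression",
)

# Only the two index values quoted in the text; every other state would
# need the published value set [34], which is deliberately not guessed here.
KNOWN_INDEX_VALUES = {"11111": 1.0, "55555": -0.391}


def health_state_profile(levels: dict) -> str:
    """Combine the five domain levels (1-5) into a profile such as '21321'."""
    digits = []
    for domain in DOMAINS:
        level = levels[domain]
        if level not in (1, 2, 3, 4, 5):
            raise ValueError(f"{domain} level must be 1-5, got {level}")
        digits.append(str(level))
    return "".join(digits)
```

A respondent reporting no problems in any domain maps to '11111', which the quoted value set scores as 1.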

Generic and Clinical Anchors

It was necessary to include an external anchor to act as a reference for indicating patient improvement or deterioration. To test the responsiveness of EQ-5D, this anchor represented the patient-reported assessment of health change over time and thus indicated in whom a change in health had occurred [35]. The global rating of change (GRC) scale is a single-item outcome measure for independent, retrospective scoring of a patient’s self-perceived improvement and has been used in musculoskeletal research [36]. All subjects answered the question “Compared to the previous visit, how would you rate your overall health now?” [36]. The response scale was a seven-point Likert scale ranging from -3 to 3, corresponding to the ‘much worse’ to the ‘much better’ options, with 0 for ‘no change’. Three groups were defined using this scale: ‘worse’ (-3 to -1), ‘unchanged’ (0) and ‘improved’ (1 to 3). Such re-grouping has been applied in previous studies to evaluate responsiveness [30, 37].
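The three-group recoding of the GRC scale described above is simple enough to state as code. The function name is an illustrative assumption; the cut points are those given in the text.

```python
def grc_group(score: int) -> str:
    """Collapse a seven-point GRC response (-3 to 3) into the three groups used in the text."""
    if score not in range(-3, 4):
        raise ValueError(f"GRC score must be an integer from -3 to 3, got {score}")
    if score < 0:
        return "worse"
    if score == 0:
        return "unchanged"
    return "improved"
```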

The GRC was a generic anchor used to test overall patient improvement or deterioration. Clinical anchors, namely BASDAI and ASDAS-CRP, were also applied to assess changes in disease activity. These differences were more representative of actual improvement or deterioration in the disease than the GRC scale, which may be subject to mental and psychological influences.

Statistical analysis

Overall descriptive characteristics were reported as mean ± standard deviation (SD). Differences between baseline and follow-up were compared using the independent t-test and the Chi-squared test where appropriate. The responsiveness of the EQ-5D was assessed using effect size statistics. Differences in the utility score between baseline and follow-up were evaluated by standardized effect size (SES) and standardized response mean (SRM) separately for GRC, BASDAI, BASFI and ASDAS-CRP. We adopted the minimum clinically important improvement (MCII) of 1.1 for BASDAI and 0.6 for BASFI [38]. Changes in BASDAI and BASFI defined by the MCII were correlated with change in EQ-5D. ASDAS-CRP was categorized as inactive disease (<1.3), moderate disease activity (1.3-<2.1), high disease activity (2.1-<3.5), and very high disease activity (>3.5). A change of 1.1 was considered a clinically significant change [39]. The SES and SRM results were interpreted as trivial for values <0.2, small for values ≥0.2 to <0.5, moderate for values ≥0.5 to <0.8, and large for values ≥0.8 [40]. Differences in mean change at follow-up by disease activity assessment (BASDAI and BASFI) and by GRC were analyzed along with the area under the curve (AUC).
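The text names the SES and SRM without defining them. The sketch below assumes the conventional definitions (SES: mean change divided by the SD of baseline scores; SRM: mean change divided by the SD of the change scores) and applies the interpretation bands quoted above. Taking the absolute value before interpreting is a further assumption, made so that negative values in worsened groups fall into the same bands.

```python
from statistics import mean, stdev


def standardized_effect_size(baseline: list, follow_up: list) -> float:
    """SES under the conventional definition: mean change / SD of baseline."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def standardized_response_mean(baseline: list, follow_up: list) -> float:
    """SRM under the conventional definition: mean change / SD of change."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def interpret_effect(value: float) -> str:
    """Map an SES or SRM onto the bands quoted in the text, by magnitude."""
    magnitude = abs(value)
    if magnitude < 0.2:
        return "trivial"
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "moderate"
    return "large"
```

Applied to the values reported in the Results, `interpret_effect(0.84)` returns "large" and `interpret_effect(-0.70)` returns "moderate".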

Spearman’s correlation was performed to assess the relationship between changes in EQ-5D scores and erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), ASDAS-CRP, ASDAS-ESR, BASDAI, BASFI, SF-36, and HADS. Spearman’s correlation was used because the data were not normally distributed according to the Shapiro-Wilk normality test. The correlation coefficient was considered weak at 0.3, moderate at 0.5 and strong at 0.7. All statistical analyses were conducted using STATA version 13.0. A p-value of <0.05 was considered statistically significant and 95% confidence intervals (CIs) were listed as appropriate.
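The analyses in this study were run in STATA. Purely as an illustration of what the rank correlation computes, the sketch below implements Spearman’s coefficient in plain Python as the Pearson correlation of average ranks. The strength labels treat the quoted values of 0.3, 0.5 and 0.7 as lower bounds on the magnitude, which is an assumption about how those thresholds were applied.

```python
from statistics import mean


def average_ranks(values: list) -> list:
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x: list, y: list) -> float:
    """Spearman's rho: Pearson correlation computed on the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sx * sy)


def correlation_strength(rho: float) -> str:
    """Label |rho| using 0.3 / 0.5 / 0.7 as assumed lower bounds."""
    magnitude = abs(rho)
    if magnitude >= 0.7:
        return "strong"
    if magnitude >= 0.5:
        return "moderate"
    if magnitude >= 0.3:
        return "weak"
    return "below weak"
```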

Results

From a total of 151 Chinese patients with SpA recruited consecutively at baseline, 113 (74.8%) completed the follow-up assessments. The baseline demographics are listed in Table 1. The mean age of subjects who completed all assessments was 44.7±13.0 years, and 66.4% of them were male patients. Most patients (61.6%) had low disease activity with BASDAI of <4 and 39.7% of patients had inactive disease by ASDAS-CRP. For the baseline treatment, 75.5% of the patients were on NSAIDs or cox-2 inhibitors, 31.8% were on DMARDs (including sulphasalazine, methotrexate and/or leflunomide) and 25.8% were on biologics (including tumour necrosis factor inhibitors, secukinumab or ustekinumab).

Table 1 Demographic and clinical characteristics of patients

The mean changes in EQ-5D and EQ-VAS scores by disease activity and GRC are shown in Tables 2 and 3. Improved and worsened EQ-5D scores discriminated well between changes in disease activity level measured by BASDAI (improved: p=0.012, SES=0.84, SRM=0.87, RS=1.05; worsened: p=0.004, SES=-0.70, SRM=1.00, RS=-0.74). Using the MCII, the EQ-5D scores discriminated well with BASDAI (p=0.001, SES=1.07, SRM=1.19, RS=1.03) and with BASFI (p=0.001, SES=0.79, SRM=1.12, RS=0.73). Post-hoc power analysis showed that a sample size of 13 in the group with worsened disease activity measured by BASDAI achieved 96% power to detect a difference of -0.08 with an estimated SD of 0.08 and a significance level of 0.05 using a one-sided one-sample t-test, and a sample size of 12 in the improved group achieved 86% power to detect a difference of 0.11 assuming an estimated SD of 0.13 using a one-sided one-sample t-test. Up to 88 patients did not have a change in disease activity level based on BASDAI. For BASDAI and BASFI MCII, the mean difference detected was 0.16 and 0.15 respectively. The effect size (1.36 and 1.29) and AUC (0.85 and 0.83) were acceptable. There were no patients listed as clinically improved with the ASDAS-CRP. No significant findings were observed for the GRC.

When comparing the EQ-5D scores at baseline and follow-up, no significant ceiling or floor effects were observed (Table 4). Comparing the differences in EQ-5D-5L scores from baseline to follow-up (Fig. 1), there was overall improvement in various domains: mobility (31.4% with one level reduction), usual activities (22.9% with one level reduction), pain/discomfort (22.9% with one level reduction), depression/anxiety (17.1% with one level reduction) and self-care (17.1% with one level reduction). Change in EQ-5D score correlated with changes in the SF-36 domains of physical function (r=-0.202; p=0.036), role limitation due to physical function (r=-0.205; p=0.033) and role limitation due to emotional problems (r=-0.247; p=0.009). Similarly, change in EQ-5D scores significantly correlated with both the anxiety and depression domains of HADS. There was no correlation between the change in EQ-5D scores and change in treatment at 6 months, but the addition of NSAIDs/cox-2 inhibitor was significantly associated with improvement in EQ-VAS score (Table 5).
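The paper does not state how the AUC values were obtained. As a hedged illustration of what such a value represents, the sketch below uses the standard rank-based identity: the AUC equals the probability that a randomly chosen patient from the anchor-positive group (for example, those meeting the MCII) shows a larger EQ-5D change than a randomly chosen patient from the other group, with ties counted as one half. The function and argument names are illustrative.

```python
def auc_from_changes(positive_changes: list, negative_changes: list) -> float:
    """Rank-based AUC: P(change in positive group > change in negative group).

    Ties contribute 0.5. This is an assumed computation for illustration,
    not necessarily the procedure used in the study.
    """
    if not positive_changes or not negative_changes:
        raise ValueError("both groups must contain at least one observation")
    wins = 0.0
    for p in positive_changes:
        for n in negative_changes:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_changes) * len(negative_changes))
```

An AUC of 0.5 would mean the EQ-5D change carries no information about the anchor, and a value of 1.0 would mean perfect separation of the two groups.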

Table 2 Mean Change, Standardized Effect Size, Standardized Response Mean and Responsiveness Statistic of EQ-5D Score and EQ-VAS by Disease Activity and GRS
Table 3 Difference in Mean Change at follow-up by Disease activity (BASDAI), Clinical Improvement (Minimum Clinically Important Improvement (MCII)) and Global Rating of Change Scale
Table 4 Descriptive statistics of EQ-5D-5L utility score and EQ-VAS at baseline and follow-up
Fig. 1 Distribution of EQ-5D-5L responses in the study cohort

Table 5 Change of treatment status at time of follow-up in relation to EQ-5D

Discussion

SpA is a chronic debilitating disease that significantly reduces a patient’s QoL. The disease cannot be eradicated and thus patients require prolonged treatment to control the disease process and reduce symptomatology. Constant monitoring is necessary as symptoms and disease activity may fluctuate and warrant prompt adjustment of medications. This takes a heavy toll on patients’ physical and mental wellness as they face changing treatment outcomes, for better or worse, along with new concerns and complications. Given the high cost of various disease-modifying drugs, it is important for patients and medical practitioners to design the most cost-effective strategies. Determining QALYs aids understanding of the disease burden on the healthcare system, which will in turn drive various institutional policies based on cost-utility analyses. The EQ-5D has been shown to be an effective utility score for SpA. We have found the EQ-5D to discriminate improved and worsened disease activity levels well in patients with SpA.

The EQ-5D instrument is a good measure of disease activity change, as shown by the SES and SRM for clinically significant changes in BASDAI and BASFI scores. The SES was near 0 in the unchanged group, which supports its accuracy in detecting change. The SES of EQ-5D was 0.84 for the improved group and -0.70 for the worsened group. These results were similar to those for other chronic musculoskeletal disorders such as scoliosis deformities [37]. Higher disease activity, indicated by an increased BASDAI score, was reflected in a reduction in EQ-5D. Similarly, reduced disease activity, shown by a reduced BASDAI score, was matched by an increased EQ-5D score. No change in disease activity was also accompanied by no change in EQ-5D scores. Despite a small percentage of individuals with a ceiling effect at baseline and follow-up, the scores are representative of disease status changes. A ceiling effect refers to clustering of scores at the highest possible value of the instrument. Here it corresponded to patients with low disease activity scores, who are unlikely to experience further improvement in health at follow-up. Conversely, there was no floor effect, indicating that the instrument is sensitive to deteriorations in disease status that warrant changes in treatment regimen. Hence, the EQ-5D is an appropriate tool for studying patients with SpA.
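Ceiling and floor effects as discussed above can be quantified as the share of respondents at each extreme of the scale. The sketch below is a minimal illustration; the function name is an assumption, and the bounds used in the example are the range of the Chinese value set quoted in the Methods (-0.391 to 1).

```python
def floor_and_ceiling_proportions(scores: list, lowest: float, highest: float) -> tuple:
    """Return (floor, ceiling): the fractions of scores at the lowest and highest possible values."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor = sum(1 for s in scores if s == lowest) / n
    ceiling = sum(1 for s in scores if s == highest) / n
    return floor, ceiling


# Illustrative (made-up) utility scores, not study data.
example_scores = [1.0, 1.0, 0.8, 0.5, -0.391]
floor_share, ceiling_share = floor_and_ceiling_proportions(example_scores, -0.391, 1.0)
```

The same function applies to the EQ-VAS with bounds of 0 and 100.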

Due to the lack of clinically improved patients by ASDAS-CRP, we were unable to draw any useful conclusions for this anchor. This may reflect a limitation of the ASDAS-CRP in detecting patient-perceived QoL. Although we followed the clearly established cut-off value of ASDAS-CRP to determine improved or worsened scores [39], we were unable to identify any individuals with improved ASDAS-CRP, even though some patients were categorized as improved by BASDAI. Despite ASDAS-CRP being a more objective assessment of SpA disease activity, it may not reflect the patient’s perceived health as well. The components of BASDAI describe more subjective, self-perceived aspects of pain, discomfort, and other disease manifestations. Hence, BASDAI and EQ-5D are expected to match better since they are both patient-perceived HRQoL scores.

It is also interesting to see the GRC scale as an unsatisfactory anchor for EQ-5D changes. This is not an unusual finding. Some HRQoL measures may be more responsive to a clinical anchor rather than GRC [41]. Moreover, the GRC may be affected by many other factors whereas BASDAI is more targeted to the various facets of the disease. Various external factors such as the rapport with the doctor and mental status of the patient may influence the reporting of GRC. Comparatively, BASDAI is more appropriate for the disease status and it appears that the EQ-5D is able to capture changes in disease status as well.

We also found a significant correlation between EQ-5D and HADS. HADS is a tool commonly used by psychiatrists for assessing risks of depression and anxiety. Various studies have demonstrated a prevalence of depression ranging from 11 to 31% in patients with SpA [28, 42, 43]. By correlating EQ-5D with the HADS scores, we can associate a worse HRQoL with the presence of anxiety and depression. The EQ-5D score can also help identify patients with a higher risk of developing depression and anxiety. We found that the addition of NSAIDs or cox-2 inhibitors correlated with an improvement in EQ-VAS but did not correlate with the change in EQ-5D scores. This suggests that pain reduction is not the sole determinant of HRQoL for patients with SpA.

The main limitation of this study is the incomplete follow-up, with about 25% of recruited patients not completing the follow-up assessments. Nevertheless, the proportions of disease activity categories and the patient profiles remained similar. There was also sample heterogeneity, with variable presentations of axial or peripheral involvement. Nevertheless, the EQ-5D results generated a reasonable effect size. It is also important to note that the disease-specific Assessment of SpondyloArthritis international Society (ASAS) health index [44] was not used in this study. This instrument should be compared with EQ-5D in future studies.

Conclusion

The EQ-5D-5L demonstrates satisfactory responsiveness properties for assessment of changes in health status in patients with SpA. It appears to represent the patient-reported HRQoL better than more objective assessments. Future studies should assess the versatility of the utility score in comparing different treatment regimens and their cost-utility with that of other chronic diseases.