1.1 The EQ-5D as an Instrument for Measuring and Valuing Health

Since the 1990s, the EQ-5D instrument has held a pivotal role in the measurement of self-reported health status and health-related quality of life (HRQoL) (Devlin and Brooks 2017). The availability of a concise generic instrument for measuring patient and population self-reported health meant that it could be included, with minimal respondent burden, in clinical trials, observational studies, and population health surveys. More recently, it has become the cornerstone of routine outcomes measurement in health care systems, for example in the English NHS PROMs programme and Sweden’s national quality registers. The ability of the EQ-5D to measure HRQoL in a generic manner has the important advantage of yielding data that can readily be compared across disease areas, between patient and population sub-groups, and against population norms. This broad comparability of EQ-5D data is particularly crucial in providing evidence that quantifies health benefits in a standardised and transparent manner to inform decisions regarding alternative ways of using health care resources.

The EQ-5D was developed by the EuroQol Group, then a small group of academics which has now grown into an international network of multidisciplinary researchers with more than 100 members worldwide (Devlin and Brooks 2017). The development of the EQ-5D was motivated in part by the specific goal of providing evidence on the outcomes of health care programmes in a manner that would facilitate economic evaluation. One of the considerations underpinning the development of the instrument was that it would be accompanied by the ‘values’ (sometimes also referred to as ‘utilities’, ‘quality of life weights’, the ‘EQ-5D Index’ or ‘EQ Index weights’) that would enable the quality adjustment of life years as required for the estimation of quality-adjusted life-years (QALYs) used in cost effectiveness analysis (Drummond et al. 2015). The availability of value sets for this purpose has been a notable part of the success and uptake of EQ-5D instruments.

The value sets that accompany EQ-5D instruments provide a means of summarising, via a single number, how good or bad health status is as described by the EQ-5D. The responses to the EQ-5D instrument – that is, the particular combination of levels indicated on each of the five dimensions (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression) by those completing it – can be described as EQ-5D ‘profiles’ (see Box 1.1). The original version of the EQ-5D, the EQ-5D-3L, has three response levels for each of the five dimensions, describing a total of 3⁵ = 243 possible profiles (Brooks 1996). The focus of this book is on the later five-level version, the EQ-5D-5L, which describes a total of 5⁵ = 3125 profiles (Herdman et al. 2011). Its development is described in more detail in the following section.
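The profile counts quoted above follow directly from combining every level with every dimension. The short Python sketch below is an illustration only: the function name `all_profiles` and the choice to write each profile as a five-digit string (one digit per dimension, in the order the dimensions are listed above) are assumptions made for this example and are not part of the instrument itself.

```python
from itertools import product

# The five EQ-5D dimensions, in the order used to write a profile.
DIMENSIONS = (
    "mobility",
    "self-care",
    "usual activities",
    "pain/discomfort",
    "anxiety/depression",
)


def all_profiles(n_levels):
    """Return every possible profile as a string with one digit per dimension.

    n_levels is 3 for the EQ-5D-3L and 5 for the EQ-5D-5L.
    """
    levels = range(1, n_levels + 1)
    return [
        "".join(str(level) for level in combo)
        for combo in product(levels, repeat=len(DIMENSIONS))
    ]


profiles_3l = all_profiles(3)
profiles_5l = all_profiles(5)

print(len(profiles_3l))  # 243  (3 to the power 5)
print(len(profiles_5l))  # 3125 (5 to the power 5)
print(profiles_5l[0], profiles_5l[-1])  # 11111 55555
```

Under this convention, profile 11111 denotes no problems on any dimension and 55555 denotes the most severe level on every dimension.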

The value sets for these instruments provide a single value for each of the possible profiles described by them. These values lie on a scale anchored at 1 (full health) and 0 (dead), as is required for the estimation of QALYs. The values are built up from a set of sub-weights which represent the relative importance of each level of problem in each dimension, and indicate how good or bad these are overall, when combined in EQ-5D-5L profiles. The term value set refers to a set of values for all possible profiles defined by a particular EQ-5D instrument, and is occasionally also referred to by other names, such as an EQ-5D ‘tariff’ or ‘social values.’ For the purposes of this book, we will use the terms value and value set.
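The idea of building a value up from sub-weights can be sketched in a few lines of Python. The sketch assumes the simplest possible structure, in which a value is 1 minus the sum of one decrement per dimension. Both that additive structure and every number in `HYPOTHETICAL_DECREMENTS` are assumptions invented for this illustration; they do not reproduce any published value set, and actual value sets may include additional terms.

```python
# HYPOTHETICAL decrements, invented for illustration only.
# They are NOT taken from any published EQ-5D-5L value set.
# Each list holds the decrement for levels 1 to 5 of one dimension.
HYPOTHETICAL_DECREMENTS = {
    "mobility":           [0.0, 0.05, 0.07, 0.20, 0.30],
    "self-care":          [0.0, 0.04, 0.06, 0.17, 0.22],
    "usual activities":   [0.0, 0.04, 0.06, 0.16, 0.20],
    "pain/discomfort":    [0.0, 0.06, 0.08, 0.28, 0.35],
    "anxiety/depression": [0.0, 0.07, 0.10, 0.29, 0.30],
}


def profile_value(profile, decrements=HYPOTHETICAL_DECREMENTS):
    """Value a 5-digit EQ-5D-5L profile under an assumed additive model.

    The value is 1 (full health) minus the sum of the decrements for the
    level reported on each dimension.
    """
    if len(profile) != 5 or any(digit not in "12345" for digit in profile):
        raise ValueError(f"not a valid EQ-5D-5L profile: {profile!r}")
    total = sum(
        decrements[dimension][int(digit) - 1]
        for dimension, digit in zip(decrements, profile)
    )
    return round(1.0 - total, 3)


print(profile_value("11111"))  # 1.0: no problems on any dimension
print(profile_value("21325"))  # 0.53 with these made-up decrements
print(profile_value("55555"))  # -0.37: below 0 with these made-up decrements
```

With these invented numbers the most severe profile receives a value below 0, which shows how a scale anchored at 1 and 0 can still produce negative values.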

These values are usually based on the average preferences of the relevant adult general population, obtained using stated preference methods such as the Time Trade-Off (TTO). These stated preference methods aim to elicit values which have the desired properties for estimating QALYs (see Box 5.1 in Chap. 5). Indeed, the availability of EQ-5D values which are suitable for this purpose has led to the EQ-5D being the most widely recommended questionnaire for use in the cost effectiveness evidence submitted to Health Technology Appraisal (HTA) bodies. The EQ-5D is recommended in 85% of HTA guidelines (Kennedy-Martin et al. 2020), including those of the UK’s National Institute for Health and Care Excellence (NICE 2013).

The EuroQol Group was and continues to be a pioneer in the development of local/national value sets. The development of EQ-5D-3L value sets was, as an international research effort, unparalleled in the availability of country-specific values (Szende et al. 2007). There are currently EQ-5D-3L value sets available for 35 countries and, for the EQ-5D-5L, 25 countries, with still further value set studies underway or planned. Both the EQ-5D-3L and the EQ-5D-5L, and the value sets which accompany them, continue to be used alongside each other in many countries. The value sets facilitate the use of data from EQ-5D instruments in the estimation of QALYs based on local preferences, as well as in other, ‘non-economic’ applications where EQ-5D profile data are summarised in a way that reflects the relative importance of the different dimensions.

Box 1.1: EQ-5D Questionnaires, EQ-5D Profiles and Values

EQ-5D questionnaires comprise two key parts:

  (i) the EQ-5D descriptive system, as shown below for the EQ-5D-5L. Respondents are asked to indicate the level of problem they experience on each of the five dimensions today. The combination of these ticks describes that person’s EQ-5D self-reported health state, referred to as an ‘EQ-5D profile’

  (ii) the EQ VAS, a vertical visual analogue scale capturing respondents’ overall assessment of their health on a scale from 0 (worst possible health you can imagine) to 100 (best possible health you can imagine) (not shown here).

[Figure: The EQ-5D-5L questionnaire]

© EuroQol Research Foundation. EQ-5D™ is a trademark of the EuroQol Research Foundation. Reproduced by permission of EuroQol Research Foundation. Reproduction of this version is not allowed. For reproduction, use or modification of the EQ-5D (any version), please register your study by using the online EQ registration page: www.euroqol.org.

The EQ VAS is an important part of the questionnaire and provides patients’ overall assessment of their own health on a visual analogue scale. However, many applications of EQ-5D data, including the estimation of QALYs for economic evaluation, focus instead on the use of EQ-5D profile data. The profile data, and the use of value sets to summarise those data, are the focus of this book, and therefore the EQ VAS is not discussed further. There are in fact many ways of analysing EQ-5D profile data, as detailed in Devlin et al. (2020). One of these is to weight the profile using value sets, which is the most common way of using EQ-5D data in cost effectiveness analysis. This book focuses on the value sets available for the EQ-5D-5L and their use in weighting EQ-5D-5L profile data.

The value sets provide a way of converting the profiles into a single number that reflects how good or bad people consider the health states described by those profiles to be. The values are usually obtained using stated preference methods, and lie on a scale anchored by the value of 1 for full health and 0 for dead. EQ-5D values cannot be higher than 1, but values <0 are possible, and indicate health states considered on average to be worse than dead (WTD). Value sets are generally intended to represent the average preferences of local/national populations – so EQ-5D value sets differ between countries. See Chap. 4 for a summary of the available value sets for EQ-5D-5L, and Chap. 6 for information about the differences and similarities between them.

1.2 The Development of the EQ-5D-5L

In 2005 the EuroQol Group initiated efforts to develop an expanded-level version of the EQ-5D-3L. This was motivated by concerns among some stakeholders about limitations of the original instrument, particularly ceiling effects and changes in health that were too small to be detected by the three-level version. Studies which had been undertaken by EuroQol Group members prior to 2005 had shown that various experimental five-level versions of EQ-5D could reduce ceiling effects while at the same time increasing reliability and sensitivity (discriminatory power) and maintaining feasibility (Janssen et al. 2008a, b; Pickard et al. 2007a, b).

The development and testing of the EQ-5D-5L is reported in Herdman et al. (2011). A decision was made early in the new instrument’s development to retain the same five dimensions as the EQ-5D-3L, but to expand the number of response levels. This could in principle have been achieved simply by adding two ‘unlabelled’ intermediate levels between the existing three. However, in order to arrive at values for the EQ-5D-5L profiles, each health state to be evaluated by respondents needed to be capable of being described by five sentences. This in turn required a label for each level. An example of an EQ-5D-5L health state, displayed in the manner it might be presented in a stated preference task, is shown in Fig. 1.1. The state described in Fig. 1.1 is the same combination of levels and dimensions as the example in Box 1.1 i.e., it is EQ-5D-5L profile 21325.

Fig. 1.1 An example of an EQ-5D-5L health ‘state’ described by five sentences

Herdman et al. (2011) describe the process by which these labels were established, using both English and Spanish as root languages in order to support further translation and adaptation of the new instrument. Severity labels for 5 levels in each dimension were identified using response scaling. Selecting labels at approximately the 25th, 50th, and 75th centiles produced two alternative 5-level versions. Focus groups were used to investigate the face and content validity of the two versions, including hypothetical health states generated from those versions. This showed evidence in favour of the wording ‘slight-moderate-severe’ problems, with level one described as ‘no problems’ in each dimension, and level five being ‘unable to’ in the EQ-5D functional dimensions (mobility, self-care, usual activities) and ‘extreme problems’ in the pain/discomfort and anxiety/depression dimensions.

The final version of the five-level instrument which emerged from this work is described in the EQ-5D-5L User Guide (EuroQol Group 2019). Besides the increased number of levels in each dimension, the five-level version of the EQ-5D has other notable features which represent improvements on the EQ-5D-3L. Most importantly, the wording of the mobility dimension is improved: the most severe level of the mobility dimension of the EQ-5D-3L is ‘confined to bed’, which means that it cannot capture severe problems with mobility that do not involve being confined to bed. This limits its usefulness both in detecting problems with mobility and in capturing improvements in mobility resulting from treatment (Oppe et al. 2011). In the EQ-5D-5L, the most severe level of mobility has been changed to ‘unable to walk about’.

These improvements have yielded a number of advantages for the EQ-5D-5L over the EQ-5D-3L. These are summarised by Devlin et al. (2018) and include:

  (a) A reduction in the ceiling effect: Using the EQ-5D-5L, compared to the EQ-5D-3L, fewer respondents report no problems on any dimension (e.g., see Feng et al. 2015 and Craig et al. 2014).

  (b) Reduced clustering on just a few states: The lack of granularity in the EQ-5D-3L descriptive system imposes constraints on the self-report of health. Observations tend to cluster on a few health states (Devlin et al. 2020). The EQ-5D-5L consistently produces considerably more unique health states than the EQ-5D-3L, as shown by Buchholz et al. (2018).

  (c) Improved ability to discriminate between patient groups/subgroups: The EQ-5D-5L has better discriminative ability, as demonstrated by improved ability to detect differences between subgroups defined by severity at a given sample size (Janssen et al. 2018). EQ-5D-5L users thus benefit from lower sample size requirements within samples of patients (Pickard et al. 2007b). The EQ-5D-5L also measures health more accurately at the top of the scale, and therefore captures finer differences between mild states of ill health and full health, whereas the EQ-5D-3L has much larger steps between levels 2 and 1.

  (d) Improvements in the EQ-5D-5L with respect to problems with mobility: As noted above, changing the EQ-5D-3L level 3 descriptor ‘confined to bed’ constitutes an important improvement in the EQ-5D-5L. Level 3 problems on mobility are rarely observed in EQ-5D-3L data. For example, among patients about to receive hip replacement surgery in the English National Health Service, none reported a level 3 problem (Devlin et al. 2010). In effect, in most settings, the EQ-5D-3L only has two levels on mobility: no problems and some problems. Consequently, the EQ-5D-3L will underestimate the benefits of treatments that improve severe problems with mobility (Oppe et al. 2011).

Overall, the evidence suggests that the EQ-5D-5L retains the principal benefits of EQ-5D-3L—its brevity and validity in a wide range of conditions—and produces a more accurate measurement of patient health than the EQ-5D-3L (Devlin et al. 2018). These advantages have been recognised by users and use of the EQ-5D-5L has rapidly increased. There are now more than 130 language versions of the EQ-5D-5L available.

1.3 The Need for EQ-5D-5L Values

The availability of the EQ-5D-5L, and the supporting evidence of its improved measurement system, generated a demand for values to accompany it, to allow use of its data in the estimation of QALYs and any other applications where EQ-5D-5L profile data need to be summarised by a preference-weighted single number.

In anticipation of the need to provide EQ-5D-5L value sets, the EuroQol Group initiated an ambitious programme of methodological research, running in parallel with the development of the EQ-5D-5L instrument, and aimed at producing an internationally standardised state-of-the-art valuation protocol. This was timely, as most of the EQ-5D-3L value sets were based on the so-called ‘MVH protocol’ developed in the early 1990s (Dolan 1997). There was a lack of consistency in the design and implementation of that protocol between value set studies. Furthermore, limitations of the MVH protocol had been recognised, suggesting improved methods were required for valuation of the EQ-5D-5L.

The aim was therefore not just to improve on the instrument, but to also ensure that the valuation of EQ-5D-5L profiles would be based on the best possible stated preference methods – and to provide a well-described, standard valuation study protocol which could be fielded in a consistent way across different countries. This would ensure that the value sets generated for the new instrument would, as far as possible, be comparable across countries. That is, that differences between the EQ-5D-5L value sets which are observed would reflect the local variations in preferences and opinions which they are intended to capture, rather than being confounded by differences in methods.

As it was anticipated that value sets would take several years to be developed and disseminated, an interim solution was to map EQ-5D-5L data to the EQ-5D-3L instrument by linking descriptive systems, and to use the value sets that already existed for the EQ-5D-3L (van Hout et al. 2011) (further explanation of mapping is provided in Chap. 5). While this provided a practical stop-gap means of summarising EQ-5D-5L data, these mapped values were recognised to be suitable only as a temporary solution, as these indirect methods introduce additional error variance and would still rely upon old and non-standardised value sets. Further, one might question whether value sets for the EQ-5D-3L, developed in the 1990s, would be an adequate representation of the average preferences of today’s societies. There are numerous reasons to consider the need to update value sets, including changes in the underlying preferences of populations; improvements in the methods available to value health; changes in the distribution of population demographics; and concerns about potential bias in previous studies (Pickard 2015) – these issues are discussed further in Chap. 7.

In order to arrive at an improved and standardised valuation protocol, the EuroQol Group therefore commissioned a substantial programme of research, initiated while the descriptive system was under development, to develop and test methods suitable for creating new value sets for the EQ-5D-5L. The programme of research – which is detailed in Chap. 2 – was started with the intention of providing investigators around the world with the tools to conduct a valuation study that would follow a standardised protocol and produce high quality data based on validated methods that supported comparisons between countries. These efforts culminated in an international protocol for conducting EQ-5D-5L valuation studies, which has been used to produce the 25 country-specific value sets which are summarised in Chap. 4 of this book (Fig. 1.2).

Fig. 1.2 Countries/regions that have published their EQ-5D-5L value set study in a peer-reviewed journal. Other valuation studies for the EQ-5D-5L (Belgium and India) were not yet published at the time of writing this book. Furthermore, other countries/regions have completed the data collection for their EQ-5D-5L valuation study, and may publish their value set in a peer-reviewed journal in the future.

This endeavour is unique in scale and ambition in the field of HRQoL valuation and represents a significant body of work with direct relevance to decision makers and impact on health care policy internationally.

1.4 The Aims of this Book

The book draws together and summarises, for the first time, the body of evidence on EQ-5D-5L value sets that has been produced internationally from the EuroQol Group’s programme of research and protocol development.

The primary aim of the book is to provide an accessible source of information and guidance to support users of EQ-5D-5L and its value sets. Specifically, we aim to improve users’ understanding of how value sets are generated; raise awareness of the characteristics and properties of value sets; and inform users’ choice of which value set to select for a particular application, and how that choice may affect their analysis and conclusions. Moreover, the book will also be useful to health economics and outcomes researchers specialising in HRQoL who want to obtain information on the research practices and protocols developed by the EuroQol Group to support EQ-5D-5L valuation.

We begin in Chap. 2 by detailing the process of developing the research protocol underpinning EQ-5D-5L valuation studies. This included a methodological programme of work and international pilot testing; development of a protocol; the first wave of studies and the conclusions drawn from those early studies; modification and strengthening of the protocol and quality assurance processes; and use of the revised protocol in subsequent waves of value set studies. The chapter indicates the considerable learning and progress that was made through this journey of designing and refining the protocol.

Chapter 3 sets out the various aspects of the study design and the basis on which methodological choices were made with respect to the stated preference methods to use; the sub-set of states to value using these methods; and the minimum sample size needed.

Chapter 4 provides a reference source and ‘thumbnail overview’ of the characteristics of the value set in each of 25 countries. In each case, we provide a summary of the value set itself and its characteristics; a worked example of the calculation of values from it; information on the sample from which values were obtained; the methods used in analysing the data and modelling the value set; and the uptake by local HTA bodies and other health care decision makers.

Chapter 5 provides guidance to those who have collected EQ-5D-5L data and want to know how to choose between the value sets reported in Chap. 4. This includes consideration of the purpose of using value sets; how to proceed when there is no value set for a specific country of interest or where there is more than one value set for a given country; and when it is appropriate to use mapping to obtain EQ-5D-5L values.

Chapter 6 draws together the value sets summarised in Chap. 4, and compares and contrasts their characteristics, reporting original comparative analysis undertaken specifically for this book. To what extent are there similarities between EQ-5D-5L value sets across countries – and are there important differences between them? Our intention in Chap. 6 is to encourage users to be aware of the specific properties of the value sets they select to use.

We conclude in Chap. 7 by reflecting on the value sets produced to date and considering a number of questions about future directions for this body of work. For example, what is the ‘shelf-life’ of a value set – and what factors should prompt an update, in order to ensure that value sets remain an adequate representation of the average preferences of a society? What methodological questions remain – and how are improvements or variations in methods reconciled with the need for consistency in the evidence presented to HTA bodies and other users?

The book includes a glossary of terms for those unfamiliar with the EQ-5D and the valuation of the EQ-5D-5L.

The stated vision of the EuroQol Group is the aim “to improve decisions about health and health care throughout the world by developing, promoting and supporting the use of instruments with the widest possible applicability for the measurement and valuation of health” (EuroQol Group 2021). We hope this book contributes to that aim, and that it supports your use of EQ-5D-5L to provide evidence for better health care decision making.