Introduction

Brain tumour patients may suffer from various physical, neurological, and neurocognitive impairments. These impairments can have a substantial negative impact on a patient’s ability to function in everyday life. Everyday life, or daily functioning, can be measured on two levels: basic activities of daily living (BADL), which are related to self-maintenance (e.g. eating or dressing), and instrumental activities of daily living (IADL), which are related to autonomous functioning in society (e.g. household activities or using a computer) [1]. IADL rely more heavily on higher order cognitive functioning. Deterioration of cognition is associated with worse performance on IADL in the general [2] and elderly population [1], and in patients prone to cognitive impairments, such as dementia [3].

Preserving the ability to function independently for as long as possible is particularly important to brain tumour patients because the disease is incurable, and cognitive decline is frequently observed in this patient group [4]. Nevertheless, IADL is almost never measured in brain tumour trials or used in clinical practice because no reliable and validated measures are available for this patient group [5].

One study evaluated whether a reliable and valid IADL questionnaire developed for dementia patients, the Amsterdam IADL Questionnaire© (A-IADL-Q) [3, 6, 7, 8], was applicable to the brain tumour population [9], since both patient groups experience similar cognitive problems [10, 11]. However, this instrument did not appear to be entirely relevant to brain tumour patients, warranting a brain tumour-specific IADL questionnaire.

The general consensus is that patients are the best source to rate their functioning and well-being [12]. The A-IADL-Q, however, was developed as a proxy-based questionnaire, as it was hypothesized that cognitive deficits could potentially limit the dementia patients’ ability to rate their own level of daily functioning. Brain tumour patients could be similarly limited in their ability to rate their level of functioning [13]. Therefore, both a patient-based and a proxy-based version are being developed, analysed, and compared, after which it can be decided which version is most appropriate in the brain tumour setting.

The aim of this study is to develop a reliable and valid IADL questionnaire that can be implemented in both brain tumour trials and clinical practice. Here, the first three phases of the development of the questionnaire are described.

Methods

Study design

The European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group (QLG) guidelines for module development [14] were followed, consisting of four phases: (I) Generation of items, (II) Construction of the item list, (III) Pre-testing, and (IV) Field testing. This paper reports the results of phases I–III of the developmental process (for more details, see Supplemental File 1; Fig. 1).

Fig. 1 Flowchart item selection phases I–III

Study population

The study population consisted of patients with a primary or metastatic brain tumour, their informal caregivers as proxies, and centre-affiliated health care professionals (HCPs) in the field of neuro-oncology. Patients were eligible if they had either a histologically confirmed glioma (based on WHO 2007 criteria) or brain metastases and a histologically confirmed primary tumour. Further inclusion criteria were age (≥ 18 years) and contact frequency between the patient and their proxy (daily or weekly) to ensure a reliable assessment of the patient’s daily functioning. The exclusion criterion was an insufficient understanding of the official language of the country of residence to complete study procedures.

Participant recruitment for phase I took place consecutively in both academic and non-academic outpatient clinics in the Netherlands, Italy, and the United Kingdom, and for phase III in three European regions, namely Northern Europe (the Netherlands and Austria), Southern Europe (Italy), and an English-speaking region (the United Kingdom, i.e. England and Scotland), as well as in a non-European country (Japan). Patients meeting the inclusion and exclusion criteria, who were also determined by their treating physicians to be physically (i.e. performance status) and mentally (i.e. able to consent to research and have the mental capacity to complete the study procedures) fit to participate, were approached to participate in the study. Following EORTC QLG module development guidelines, patient recruitment targeted an even distribution across the relevant variables, i.e. tumour type (high-grade glioma (HGG)/low-grade glioma (LGG)/1–3 brain metastases/> 3 brain metastases) in phase I and, additionally, the presence of cognitive deficits (present/not present) in phase III. Both patients and proxies provided written informed consent before participation.

Phases I & II

Phase I aimed to compile an extensive list of IADL relevant to brain tumour patients. Five sources of information were used: (1) the relevant A-IADL-Q [7] activities from a previous study (referred to as ‘pilot study’) in glioma patients [9], and in accordance with EORTC QLG module development guidelines, (2) the literature, (3) the EORTC Item library [15], and semi-structured interviews with (4) patients and their proxies and (5) HCPs.

During the pilot study (source 1), HCPs (N = 6) judged if the activities from the A-IADL-Q could be considered as ‘IADL’ using the definition: ‘IADL are complex activities with little automated skills for which multiple cognitive processes are necessary [6]?’. If ≥ 2 HCPs rated an activity as not being IADL, it was excluded from further analysis. Subsequently, HCPs as well as patients (N = 12) and proxies (N = 12) had to rate if the proposed activities were ‘likely to be affected in brain tumour patients’. Activities were excluded if more than half of all participant groups rated the activity as not affected. Finally, patients, proxies, and HCPs evaluated if the activities from the A-IADL-Q were ‘clearly formulated’. In this case, if an activity was rated unclear by > 4 HCPs or > 10 patients or > 10 proxies, the formulation was reviewed and the item rephrased.

From a literature review of the electronic databases PubMed, Embase, Cochrane, PsycINFO and CINAHL conducted up to April 2017 (source 2) and a review of the items in the EORTC Item Library (source 3), additional potentially relevant IADL were extracted.

Subsequently, a new group of patients (N = 28), proxies (N = 27) and HCPs (N = 18) participated in the semi-structured interviews. Each IADL activity generated from sources 1–3 was rated on both relevance and importance (4‐point Likert‐scale) by patients and proxies, and missing IADL were identified. HCPs also rated the activities on relevance, and in addition provided a top 10 of the most important IADL activities. Items with an average score of < 2.0 on both relevance and importance by either the patients or proxies, or with ≥ 6 HCPs rating them as not relevant, were excluded, except if they were in ≥ 2 HCPs top 10 most important activities.
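The interview-based exclusion rule described above can be written out as a small decision function. The following is a minimal sketch, not the authors' implementation: the class and field names are hypothetical, and it assumes that the top 10 exception overrides both exclusion conditions, which is one possible reading of the rule.

```python
from dataclasses import dataclass


@dataclass
class ItemRatings:
    """Hypothetical container for the interview summary of one activity."""
    patient_relevance: float   # mean patient rating, 4-point scale
    patient_importance: float
    proxy_relevance: float     # mean proxy rating, 4-point scale
    proxy_importance: float
    hcp_not_relevant: int      # number of HCPs rating the activity as not relevant
    hcp_top10: int             # number of HCPs listing it in their top 10


def exclude_item(r: ItemRatings) -> bool:
    """Return True if the activity would be excluded after the interviews."""
    # Mean below 2.0 on BOTH relevance and importance, for EITHER group
    low_patients = r.patient_relevance < 2.0 and r.patient_importance < 2.0
    low_proxies = r.proxy_relevance < 2.0 and r.proxy_importance < 2.0
    flagged = low_patients or low_proxies or r.hcp_not_relevant >= 6
    # Assumption: being in the top 10 of at least two HCPs rescues the activity
    rescued = r.hcp_top10 >= 2
    return flagged and not rescued


# Hypothetical ratings, for illustration only
print(exclude_item(ItemRatings(1.8, 1.9, 2.5, 2.6, 0, 0)))
print(exclude_item(ItemRatings(1.8, 1.9, 2.5, 2.6, 0, 2)))
```

Expressing the rule as a single boolean makes explicit that low ratings on only one of the two dimensions (relevance or importance) do not trigger exclusion.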

Finally, another group of patients (N = 2), proxies (N = 2), and HCPs (N = 2) cognitively debriefed the items remaining after the item generation and selection process (sources 1–5), and evaluated the appropriateness of the wording of the items and any potential redundancies (for more details, see Supplemental File 1).

In Phase II, the activities were converted into questionnaire items and a preliminary IADL questionnaire was constructed. Both a patient‐based and proxy‐based version of the IADL questionnaire were constructed and subsequently translated by the EORTC Translation Unit [16] into the languages of the countries participating in phase III (English, Dutch, Italian, German, and Japanese). The proxy-based version consists of the same items but refers to the patient (e.g. ‘has he/she had difficulties’).

Phase III

Phase III aimed to pre-test the preliminary IADL questionnaire by means of semi-structured interviews and neuropsychological testing. The interview consisted of four parts: (1) completion of the IADL questionnaire (4‐point Likert‐scale [‘not at all’ to ‘very much’], and not applicable), (2) rating each activity on relevance and importance (4‐point Likert‐scale) and acceptability (e.g. not too difficult/confusing/annoying/upsetting), (3) identification of the 10 most important activities, and (4) identification of missing activities. For the known-group comparison analysis, patients were classified as with or without cognitive impairments using the standard neuropsychological test battery used in EORTC trials, which comprises three objective tests (six outcomes) and a subjective cognitive complaints questionnaire (one outcome) (for more details, see Supplemental File 1). Patients scoring > 2 standard deviations (SD) below the mean of the official norm scores on ≥ 2 out of 7 outcomes were classified as cognitively impaired.
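The classification rule can be illustrated with a short sketch. It assumes that each of the seven outcomes has already been converted to a z-score against the norm data and oriented so that lower values indicate worse performance, and it reads the cut-off as more than 2 SD below the norm mean (as in the caption of Table 3). The function name and the example values are hypothetical.

```python
def is_cognitively_impaired(z_scores, cutoff=-2.0, min_outcomes=2):
    """Classify one patient from the z-scores of the seven outcomes.

    Assumption: z-scores are oriented so that lower means worse performance.
    """
    # Count the outcomes that fall more than 2 SD below the norm mean
    n_deviant = sum(1 for z in z_scores if z < cutoff)
    return n_deviant >= min_outcomes


# Hypothetical patient with two deviant outcomes out of seven
print(is_cognitively_impaired([-2.5, -2.1, 0.0, 0.3, -1.0, 0.5, -0.2]))
```

How outcomes that could not be assessed are handled is not specified in the text, so missing values are deliberately left out of this sketch.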

Item selection

The item selection decision rules in phase III were based on EORTC QLG module development guidelines ensuring content validity. Cross-cultural validity was preserved by using a 3-round stepwise item selection procedure to ensure that potential skewed geographical patient population distributions would not influence item selection (see Table 1 for all decision rules).

Table 1 Predetermined decision rules for item inclusion, exclusion, and revision for phases I–III

Item retention/omission was based on patient data only. However, a sensitivity analysis was performed by repeating these steps with proxy data.

Statistical analysis

Descriptive statistics were used to describe the sociodemographic and clinical characteristics of the participants, as well as the quantitative data in phases I and III.

A preliminary evaluation of several psychometric properties was performed. Content validity was ensured by using multiple sources of information to identify IADL (phase I), and by the subsequent evaluation of relevance, importance, acceptability, and completeness of the item list in phase III (Table 1). Cross-cultural validity was ensured by including participants from different geographical regions and by the 3-round stepwise item selection procedure. After the item selection procedure, structural validity was assessed by performing an exploratory factor analysis (EFA; principal component analysis with orthogonal rotation (varimax)), chosen because of the small sample and the absence of an a priori known scale structure. This included analysis of eigenvalues, with values > 1 considered an indication of a factor that should be retained, and inspection of a scree plot to determine the number of factors to retain in the EFA. Multiple imputation techniques were used for 7% of observations because of the large number of patients (78%) responding ‘not applicable’ to one or more items. As ‘not applicable’ responses are coded as missing data, and standard EFA uses listwise deletion, this technique prevented the analysis from being restricted to a limited number of patients.
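The eigenvalue criterion can be sketched as follows. The sketch assumes that the eigenvalues of the item correlation matrix are already available (for a correlation matrix they sum to the number of items); the function names and the eigenvalues are hypothetical and are not taken from the study data.

```python
def kaiser_retained(eigenvalues):
    """Return the eigenvalues that exceed Kaiser's criterion of 1."""
    return [ev for ev in eigenvalues if ev > 1.0]


def cumulative_variance_explained(eigenvalues, n_retained):
    """Percentage of total variance explained by the n largest components."""
    ordered = sorted(eigenvalues, reverse=True)
    return 100.0 * sum(ordered[:n_retained]) / sum(eigenvalues)


# Hypothetical eigenvalues for a five-item example
eigenvalues = [2.4, 1.3, 0.8, 0.3, 0.2]
retained = kaiser_retained(eigenvalues)
print(len(retained), round(cumulative_variance_explained(eigenvalues, len(retained)), 1))
```

The scree plot inspection is a visual judgement of where the ordered eigenvalues level off and is therefore not reproduced here.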

The EFA resulted in single- and multi-item scales that were used in further analyses. The internal consistency of the multi-item scales was determined by calculating Cronbach’s alpha, with values of 0.7 ≤ α < 0.8 classified as acceptable, 0.8 ≤ α < 0.9 as good, and α ≥ 0.9 as excellent [17]. Since there is no ‘gold standard’ to measure IADL in brain tumour patients, criterion validity could not be assessed. Instead, construct validity was examined by means of known-groups validity. An a priori hypothesis was constructed stating that patients classified as cognitively impaired would have higher scores (indicating more problems) on the IADL scales/items than patients who were classified as cognitively unimpaired. To test this hypothesis, scale scores were calculated based on linear transformation as described in the EORTC Scoring Manual [18], and differences between the groups were compared using Mann–Whitney U tests. Congruency between patients and proxies was assessed per item (mean difference patient–proxy rating) and per dyad (inter-rater reliability) with sub-analyses including patients’ cognitive impairment classification (for more details, see Supplemental File 1). IBM SPSS version 26.0 was used to carry out all statistical analyses [19], and a p value < 0.05 was considered statistically significant.
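As an illustration of the linear transformation, the sketch below rescales the mean of the 4-point item responses to a 0–100 score on which higher values indicate more problems, in line with the hypothesis above. It is a sketch under assumptions rather than the study's scoring syntax: the function name is hypothetical, ‘not applicable’ is represented as `None`, and the requirement that at least half of the items be answered is an assumption.

```python
def scale_score(item_responses, min_resp=1, max_resp=4):
    """Linearly transform item responses to a 0-100 scale score.

    item_responses: list of responses (1-4), with None for missing/not applicable.
    Returns None when fewer than half of the items were answered (assumed rule).
    """
    answered = [r for r in item_responses if r is not None]
    if not answered or 2 * len(answered) < len(item_responses):
        return None
    raw_score = sum(answered) / len(answered)          # mean of answered items
    item_range = max_resp - min_resp                   # 3 for a 4-point scale
    return (raw_score - min_resp) / item_range * 100.0


# Hypothetical responses to a four-item scale with one missing answer
print(scale_score([2, 3, None, 2]))
```

A single-item scale is handled by the same function, since the mean of one response is the response itself.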

Results

Sociodemographic and clinical characteristics of patients and their proxies included in phases I and III are described in Table 2. In phase I, a total of 44 patients, 43 proxies, and 26 HCPs were included, and in phase III, 85 dyads. Patients with and without cognitive impairments were fairly equally distributed among the tumour types (Table 3). Four patients could not complete the neuropsychological testing due to health issues.

Table 2 Sociodemographic and clinical characteristics of patients in both phase I and III, as well as the characteristics of their proxies
Table 3 Percentage of patients with cognitive impairments (defined as z-score more than 2 SD below the control group on at least two domains), separately per tumour type

Phase I & II

The review of the pilot study’s [9] activities (n = 32) reconfirmed that all items were considered IADL, affected in brain tumour patients and clearly defined. The literature search of 342 unique records resulted in 103 relevant records which described 54 unique questionnaires comprising a total of 1376 items. Out of these 1376 items, 310 were related to IADL. Furthermore, 23 IADL were extracted from qualitative studies. The review of the items in the EORTC Item Library identified 526 unique items of which 12 reflected IADL. The 345 IADL were clustered and merged based on content, and IADL similar to the pilot study items were excluded. On top of the 32 items from the pilot study (source 1), an additional 30 IADL were identified with sources 2 and 3, resulting in 62 activities. The semi-structured interviews generated two new activities (i.e. ‘being independent’ and ‘doing calculations’) and excluded two activities based on HCPs’ assessment of relevance (i.e. ‘arts and crafts’ and ‘following an instruction manual’). The remaining 62 activities were subsequently cognitively debriefed, resulting in the exclusion of two redundant activities (i.e. ‘putting ideas into words’ and ‘engaging socially with other people’) and one unclear activity (i.e. ‘getting started with a task without prompting’) (Fig. 1) (for more details, see Supplemental File 2). In phase II, the remaining 59 activities were formulated as items.

Phase III

Missing activities

Nine activities suggested by participants could be considered IADL and were not covered by the 59-item list; however, none were mentioned by ≥ 2 participants, indicating that the item list had sufficient coverage.

Item selection

Following the item selection procedure, a total of 10 items were excluded in the first round, six items in the second round, and an additional 11 items in the third round (Fig. 1; for more details, see Supplemental File 3; Fig. 1). This item selection procedure resulted in a final list of 32 items.

Proxy ratings (sensitivity analysis)

Although item selection was based solely on patient data, the item selection procedure was repeated with proxy data. The 59 items from phase II were reduced to 33 items, 25 of which were the same as those selected based on patient data (data not shown).

Preliminary psychometric properties

Content validity

Most of the 32 items were deemed relevant and important; on average, items were rated as ‘quite a bit’ or ‘very much’ relevant by 57.2%, and important by 68.2% of patients. Following the acceptability criteria as described in Table 1, N = 24 items needed reviewing (for more details, see Supplemental File 3; Table 1). For twelve items, ≥ 2 dyads raised the same concern, and these items were rephrased accordingly.

Structural validity

An EFA was conducted on the 32 IADL items. The Kaiser–Meyer–Olkin measure verified the adequacy of the pooled data (KMO = 0.816), and Bartlett’s test of sphericity was significant, χ²(496) = 1858.5, p < 0.001. Seven factors had an eigenvalue over Kaiser’s criterion of 1, which cumulatively explained 70.1% of the variance. The slope of the scree plot (Fig. 2), however, flattened after three factors. As this is an EFA, we decided to maintain the seven factors for further analyses. This resulted in five multi-item factors and two single-item factors (Table 4).

Fig. 2 Scree plot

Table 4 IADL-BN32 preliminary scale structure: factor loadings of each item within each single- or multi-item scale

Internal consistency

The five multi-item factors, or preliminary scales, were evaluated for their internal consistency, which was excellent to acceptable, with Cronbach’s α of 0.94, 0.88, 0.83, 0.76, and 0.73, respectively.
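For reference, Cronbach's α for a scale of k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below computes it and applies the classification thresholds stated in the Methods. The function names, the label for values below 0.7, and the example data are hypothetical; complete data per respondent are assumed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    # Sample variance of each item across respondents
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def classify_alpha(alpha):
    """Apply the thresholds used in this paper; the last label is an assumption."""
    if alpha >= 0.9:
        return "excellent"
    if alpha >= 0.8:
        return "good"
    if alpha >= 0.7:
        return "acceptable"
    return "below acceptable"


# The five reported values, classified with the thresholds above
print([classify_alpha(a) for a in (0.94, 0.88, 0.83, 0.76, 0.73)])
```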

Known-group validity

The known-group comparisons showed significant differences between cognitively impaired and unimpaired patients for scales 1 (ranked mean (RM) = 48.8 vs. RM = 34.4, p < 0.01), 3 (RM = 49.6 vs. RM = 33.1, p < 0.01), and 4 (RM = 46.9 vs. RM = 35.0, p = 0.02), with worse performance in cognitively impaired patients. No significant differences between cognitively impaired and unimpaired patients were observed for scales 2 (RM = 43.8 vs. RM = 37.8, p = 0.24) and 5 (RM = 42.5 vs. RM = 35.6, p = 0.17), or the two single items 14 (RM = 18.6 vs. RM = 17.6, p = 0.78) and 45 (RM = 40.2 vs. RM = 35.2, p = 0.22).
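The ranked means compared in a Mann–Whitney U test can be obtained by ranking the pooled scores of both groups and averaging the ranks per group. The sketch below reads RM as the mean rank and assigns tied values the mean of the ranks they occupy; the function name and the example scores are hypothetical, and the p value calculation is not included.

```python
def mean_ranks(group_a, group_b):
    """Return the mean rank of each group after ranking the pooled scores."""
    pooled = sorted(group_a + group_b)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        # Find the end of the run of tied values starting at position i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share the mean of those ranks
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    mean_a = sum(rank_of[v] for v in group_a) / len(group_a)
    mean_b = sum(rank_of[v] for v in group_b) / len(group_b)
    return mean_a, mean_b


# Hypothetical scale scores for impaired and unimpaired patients
print(mean_ranks([66.7, 50.0, 33.3], [33.3, 16.7, 0.0]))
```

A higher mean rank in the impaired group corresponds to higher scale scores, i.e. more reported problems.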

Congruency

On average, proxies reported more problems than the patients more often than the reverse (proxies: 26/32 items vs. patients: 4/32 items), with an average mean difference of M = − 0.09 (SD = 0.11) [range M = − 0.27 to 0.26]. Patients with cognitive impairments had proxies reporting, on average, M = − 0.28 (SD = 0.20) more problems than the patients, whereas patients without cognitive impairments reported, on average, slightly more problems than their proxies, M = 0.03 (SD = 0.10) (for more details, see Supplemental File 3; Table 2). Furthermore, exact agreement between patients and proxies was on average 56.7% [range 0–100%], with inter-rater agreement between κ = − 0.16 and κ = 1.00 (for more details, see Supplemental File 3; Table 3) (N/A excluded). For patients with and without cognitive impairments, this was 47.9% [κ = − 0.16 to 1.00] and 64.2% [κ = − 0.16 to 1.00], respectively.
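The two dyad-level agreement measures can be sketched as follows. The sketch assumes unweighted Cohen's κ (the text does not state whether a weighted variant was used) and that item pairs with a ‘not applicable’ response have already been removed; the function names and example ratings are hypothetical.

```python
from collections import Counter


def exact_agreement(patient, proxy):
    """Percentage of items on which patient and proxy gave the same rating."""
    matches = sum(p == q for p, q in zip(patient, proxy))
    return 100.0 * matches / len(patient)


def cohens_kappa(patient, proxy):
    """Unweighted Cohen's kappa for one patient-proxy dyad."""
    n = len(patient)
    p_observed = sum(p == q for p, q in zip(patient, proxy)) / n
    patient_counts, proxy_counts = Counter(patient), Counter(proxy)
    # Chance agreement from the marginal distribution of each rater
    p_expected = sum(patient_counts[c] * proxy_counts[c] for c in patient_counts) / n ** 2
    if p_expected == 1.0:
        return None  # kappa is undefined when both raters use a single category
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings of four items by one dyad
patient = [1, 1, 2, 2]
proxy = [1, 2, 1, 2]
print(exact_agreement(patient, proxy), cohens_kappa(patient, proxy))
```

In this example half of the ratings match, yet κ is zero because that level of agreement is exactly what the marginal distributions would produce by chance, which illustrates why both measures are reported.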

Discussion

The aim of this study was to develop a reliable and valid IADL measure for brain tumour patients. Following the procedures of phases I–III of the EORTC module development guidelines, ensuring content and cross-cultural validity, resulted in the construction of the EORTC IADL-BN32 questionnaire. Preliminary psychometric property analyses showed that the questionnaire has a multidimensional scale structure with acceptable to excellent internal consistency. Moreover, the current scale structure showed known-groups validity regarding cognitive status for three out of five scales in this relatively small sample. Congruency between patients and proxies varied considerably between dyads. Compared to cognitively unimpaired patients, agreement between cognitively impaired patients and their proxies was on average lower, with their proxies rating IADL issues as slightly more severe. Therefore, the patient and proxy versions of the EORTC IADL-BN32 will be further assessed in phase IV, as it is still unclear if the proxy-based questionnaire is more accurate and preferable in situations where patients are cognitively impaired or in poor health.

Limitations of this study were the relatively small sample and the skewed number of participants per geographical region due to participant recruitment issues at some sites. Results should therefore be interpreted as preliminary, and further validation in a larger sample in phase IV is warranted. A predetermined 3-round stepwise item selection procedure was implemented to compensate for this imbalance and ensure cross-cultural validity. Furthermore, many patients and proxies reported ‘not applicable’ on one or more items, resulting in a large proportion of missing data and hampering the EFA. This was addressed by means of multiple imputation. To facilitate the confirmatory factor analysis (CFA) in the phase IV validation, as well as the calculation of the scale/item scores of the final questionnaire, this response option will be omitted in further versions. CFAs in phase IV will confirm if the preliminary scale structure is more accurate than, for example, a single factor model. Finally, item selection in phase III was based on patient data only, while cognitive impairment may result in poorer self-awareness of IADL issues [21]. In our sample, however, the sensitivity analyses with proxy data showed that 25 of the 32 items would also have been selected.

In conclusion, the preliminary EORTC IADL-BN32 questionnaire has reasonable preliminary psychometric properties; however, further validation in a larger international sample is warranted. The phase IV validation is currently ongoing in ten countries in different global regions. Additional European countries (i.e. Germany, Norway, Portugal, and Croatia) as well as an additional non-European country (i.e. Jordan) are participating in the phase IV validation, further enhancing the generalizability of our results. The focus of phase IV will be on evaluating the scale structure, responsiveness over time, and acceptability of the questionnaire. If the EORTC IADL-BN32 is valid and reliable, it may be a valuable tool in brain tumour trials and clinical practice to monitor levels of functioning in daily life and may be helpful in evaluating the day-to-day impact of changes in cognitive function.