Key messages regarding feasibility

1) What uncertainties existed regarding the feasibility?

The Heel2Toe sensor had been used in clinical research as an assessment tool and in two small proof-of-concept studies with short-term supervised use to detect change and gather user feedback. There was a need to test the sensor for home use and to include a control group, because the attention and exercise recommendations alone might be beneficial. Therefore, we designed this pilot and feasibility study.

2) What are the key feasibility findings?

Dropouts from the trial were mainly related to the COVID situation. There were no adverse events in either group. Challenges with the functionality of the app were addressed immediately; hardware challenges, including ease of charging and Bluetooth connectivity, were addressed in revisions; and some people had difficulty using the smartphone app optimally. A revised version has removed the need for the smartphone and will be used in future studies. The results also showed that people were able to use the sensor on their own at home with some technical support (average 22 min per person), which diminished over time, and that, despite technical challenges, the majority of people were satisfied with their experience with the technology, some very much so. There was a strong response in the Heel2Toe group and a near-nil response in the control group, demonstrating efficacy potential.

3) What are the implications of the feasibility findings for the design of the main study?

The main study will use the revised version of the Heel2Toe sensor which has eliminated the challenges with connectivity and smartphone skills. A waitlist control group will serve as the comparison group for the between-group comparison and all will contribute pre-post-data. Using the 6MWT as the outcome and based on conservative estimates of effect size (0.5), a sample size of 64 per group would be supported. This sample size would also be sufficient for estimating effects on other explanatory and downstream outcomes. Participants would keep the sensor after the study.

Background

The disruption of the dopaminergic system in Parkinson’s disease (PD) has a profound impact on motor networks needed to control movements [1]. Notably in people with PD, the automatic movements that typify normal walking activity are lost [2, 3], and a deteriorating gait pattern develops, characterized by quick, short, shuffling steps, narrow base of support, stooped posture, rigid trunk, and reduced arm swing. The short stride length often causes the foot to scuff the ground, causing trips and falls [4,5,6]. Starting, stopping, and changing direction are more difficult, gait pattern is inconsistent [7], and freezing is common [8]. As gait impairments progress, asymmetries develop and people have difficulty adapting their walking to new or complex environments or to increased task burden [9, 10]. Walking is perceived as harder, and, eventually, walking for enjoyment and health promotion abates and then ceases.

One solution to improve gait is to emphasize a heel-to-toe gait pattern [6], something typically done during physical therapy to change posture and stride length. This strategy provides the walker with feedback and encouragement for this, usually automatic, movement. Relearning the pattern requires repeated practice, and, once the therapist ceases this verbal cueing, the walker returns to their typical sub-optimal gait pattern.

Gait training is predominantly carried out by physical therapists with one-on-one interactions; however, there are not enough therapists for the number of people with gait vulnerabilities. Technology is poised to bridge the gap between supply and demand facilitating self-management of gait vulnerabilities. Some technologies are more successful than others, but many gaps remain in technology readiness, usability, access, training needs, and efficacy potential.

Researchers at McGill University have developed, and commercialized through PhysioBiometrics Inc., a device that automates this verbal cueing by providing real-time auditory feedback when the heel strikes first when stepping. The Heel2Toe™ sensor, shown in Fig. 1, consists of a sensor that runs a real-time algorithm that discriminates good from poor steps with 94% accuracy [11, 12] and generates appropriate feedback. It is classified by Health Canada as a Class I medical device (#167,654). The sensor has a gyroscope, an accelerometer, and a magnetometer providing 9 degrees of freedom.

Fig. 1 Heel2Toe™ sensor

The gait cycle has been studied and described since the advent of bipedal gait [13,14,15]. Figure 2 presents a graphic of the normal gait cycle when tracked from the ankle joint using the gyroscope. Normal gait is characterized by two troughs and one peak. The first trough is when the ankle moves clockwise from initial contact to foot flat then there is no ankle movement allowing for weight transfer from the heel to the ball of the foot. The second trough is when the ankle again moves clockwise to push the foot off the ground to propel the body forward. Typically, the ratio of push off to heel strike is estimated at 2:1 [16, 17]. The peak represents the swing phase of the gait cycle when the foot leaves the ground and swings forward (counter-clockwise) to initiate another step.

Fig. 2 Typical gait cycle

The sensor detects the velocity at which the ankle moves clockwise during the initial contact of the foot during a step (angular velocity: AV). When the AV crosses a threshold for a “good step,” a signal is sent via Bluetooth to a smartphone, and a sound is emitted. This external positive feedback drives motor learning, retraining gait patterns to be more normal, fluid, safe, and sustainable. To normalize walking, people must relearn motor sequences and develop needed adjuncts to efficient walking: strength, power, core stability, balance, etc. Physical therapy targets adjuncts but motor learning requires instruction, repetition, and practice [18]. At least some of the neural mechanisms underlying this learning are likely aberrant in PD. Motor learning via feedback involves neuroplasticity in corticostriatal and striato-cerebellar circuits in a partially dopamine-dependent manner [18,19,20,21].

In two proof-of-concept studies of 6 people with PD [22] and 6 pre-frail seniors [23] receiving 5 training sessions with Heel2Toe™ over 2 weeks, every person made at least one clinically meaningful change on one gait parameter after training. The potential mechanism of action is a dopamine-driven reward and feedback loop [24]. Here, we set out to estimate the extent to which training with the Heel2Toe™ over a longer period of time (3 months) was feasible and acceptable to participants and to estimate changes in walking capacity and gait pattern among people training with feedback from the sensor and among those training without feedback.

The hypotheses for which the pilot trial will provide supporting data are that people in the group training with feedback from the Heel2Toe sensor will make greater gains in walking capacity and motivation and will show more optimal changes in parameters of gait quality than will be observed in the control group.

Design

A two-group, 2:1 randomized, feasibility trial was carried out with repeated measures of gait parameters and walking outcomes. The randomization sequence was generated by an independent statistician. The trial was prospectively registered on April 3, 2020, under the name “Improving Walking With Heel-To-Toe Device” on ClinicalTrials.gov (NCT04300348) https://register.clinicaltrials.gov/prs/app/action/SelectProtocol?sid=S0009NRV&selectaction=Edit&uid=U0000572&ts=2&cx=-nba3sj; the project was approved by the Research Ethics Board of the McGill University Health Center on Feb 17, 2020 (File # 2020–5842). All participants provided written informed consent.

The feasibility phase followed the recommendations from the CONSORT extension to randomized pilot and feasibility trials (PAFS) [25, 26]. PAFS emphasizes testing all aspects of data collection and processes of the intervention and measurement, but warns against between-group testing of efficacy due to lack of statistical power.

Population

People with PD manifesting gait impairments and meeting the criterion that usual walking is without a walking aid [27], corresponding to Hoehn and Yahr Scale of 2 to 3, were recruited from the Movement Disorders Clinics at McGill sites and the Quebec Parkinson Network. Patients with documented cognitive impairment based on their recorded score on the Montreal Cognitive Assessment (MOCA) [28] were not approached for inclusion. All patients kept their usual dopaminergic medication schedule throughout the study. People were assessed at a time that corresponded to their medication regimen.

Intervention

Both groups received a workbook with instructions on simple exercises to facilitate a better walking pattern (available at physiobiometrics.com) and 5 sessions with a physiotherapist (PT) over 2 weeks to practice walking well and four specific exercises, one for each major joint area involved in walking (foot and ankle, knees, hip, trunk). This personal gait training period was followed by independent home practice over 3 months. Both groups were instructed to practice walking with the sensor for a minimum period of 5 min, twice a day. The exercises were to be done before each walk, 10 to 15 repetitions.

During the 5 therapy sessions, the Heel2Toe group was taught to trigger the sensor with a strong heel strike to receive the feedback and how to use the sensor and the app on the smartphone. This instruction was in preparation for independent home use for 3 months. The Workbook group also received similar verbal instruction during these 5 therapy sessions when walking with the Heel2Toe sensor but received no feedback from the sensor.

Measures

The feasibility outcomes were pace of enrollment, completeness of data collection, retention into the study, amount of technical assistance required by the participants, and user experience with the technology. Our target pace of enrollment was 30 in 6 months as that was the time left to complete the study after COVID delayed the start date. Our target for missing data at baseline was 10% based on a target sample size of 30; target retention was 80% (loss of 6); and we did not estimate the effect of technology failure on use rates or satisfaction ratings as we were planning on making adjustments as rapidly as we received feedback from our participants, improving these feasibility metrics over time.

The primary efficacy potential outcome was the 6-Minute Walk Test (6MWT), a performance-based outcome (PerfO) of functional walking capacity [29]. A secondary PerfO was the Standardized Walking Obstacle Course (SWOC) [30], a timed performance-based test involving starting, stopping, turning, and making motor decisions. Average values for people with a mean age of 63 years are reported to be 12 s [31]. Sit-to-Stand, the number completed in 30 s, was also assessed. The average for people aged 70–74 years is reported to range from 10 to 17 depending on sex [32]. Assessors were unaware of the group assignment at time of assessment.

Data on constructs related to other aspects of brain health (motivation, symptoms, function, and quality of life) were also collected using patient-reported outcome measures (PROMs). Motivation was measured using the Starkstein Apathy Scale [33] and an inventory of activities based on the World Health Organization’s International Classification of Functioning, Disability and Health (ICF). From this ICF bank of 393 activity and participation items, 17 were chosen as relevant for this context and rated based on degree of self-initiation (0 to 2) and degree of effort (0 to 2). This measure is under development, and this study provided feasibility data to support further directions.

Symptoms of anxiety, depression, pain, and fatigue were measured using Visual Analogue Health States [34] on a 0 to 100 scale with higher values indicating better health states. Values of 60 or less would be considered to reflect a clinical situation where treatment might be indicated [35, 36]. Function was measured with the Neuro-QOL [37]; health-related quality of life (HRQL) was measured with the 8-item Parkinson's Disease Questionnaire (PDQ-8) [38], where higher scores indicate poorer HRQL, and the EuroQol measure [39]. Two other VAS scales were used to measure general health perception and quality of life on a 0 to 100 scale with 100 indicating best value.

Indicators of gait quality were obtained directly from the Heel2Toe sensor during the 6MWT. Due to inconsistent Bluetooth connection from sensor to smartphone (fixed over the course of the trial), the number of recorded steps varies. The indicators are as follows: percentage of good steps (those that passed the pre-determined threshold of −150°/s of ankle angular velocity (AV)); AV at each part of the gait cycle (heel strike, push off, swing); and associated coefficients of variation (CV) of AV, where CV is calculated as the ratio of standard deviation (SD) to the mean of each parameter, expressed as a percentage. Two measures were derived from these data: power phase, the area of the two troughs under the zero AV line (ankle still during stance), termed area under the line (AUL); and balance phase, the area above the zero line, termed area above the line (AAL). The balance phase is so named as its shape and area are determined by the ability of the person to do single-leg stance long enough and lift the swing leg high enough for the foot to clear the ground. Average time in swing and CV were also measured. A total of 13 gait quality parameters are reported here.
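As an illustration of how these indicators could be derived from per-step data, the following minimal Python sketch computes the percentage of good steps, the CV, and approximate AUL and AAL areas. It is not the sensor's algorithm. The function names, the sample values, the use of the sample SD, and the reading of "passed the threshold" as "at or below −150°/s" (heel strike being a negative trough) are all assumptions made for the example.

```python
from statistics import mean, stdev

GOOD_STEP_THRESHOLD = -150.0  # deg/s, the threshold stated in the text


def percent_good_steps(heel_strike_av, threshold=GOOD_STEP_THRESHOLD):
    """Percentage of steps whose heel-strike AV is at or below the threshold.

    Assumption: a stronger heel strike is a more negative angular velocity.
    """
    good = sum(1 for av in heel_strike_av if av <= threshold)
    return 100.0 * good / len(heel_strike_av)


def coefficient_of_variation(values):
    """CV as sample SD divided by the absolute mean, as a percentage."""
    return 100.0 * stdev(values) / abs(mean(values))


def phase_areas(av_trace, dt):
    """Approximate (AUL, AAL): areas below and above the zero AV line.

    The trace is clipped at zero on each side and integrated with the
    trapezoid rule, so areas near zero crossings are only approximate.
    """
    below = [max(-v, 0.0) for v in av_trace]
    above = [max(v, 0.0) for v in av_trace]

    def trapezoid(ys):
        return sum((a + b) / 2.0 * dt for a, b in zip(ys, ys[1:]))

    return trapezoid(below), trapezoid(above)


if __name__ == "__main__":
    # Hypothetical heel-strike AV values (deg/s), one per step.
    steps = [-180.0, -120.0, -160.0, -150.0]
    print(percent_good_steps(steps))        # share of good steps, in percent
    print(coefficient_of_variation(steps))  # step-to-step variability, in percent
    # Hypothetical AV trace sampled every 0.1 s.
    print(phase_areas([0.0, -100.0, 0.0, 100.0, 0.0], dt=0.1))
```

Taking the absolute value of the mean keeps the CV positive for parameters such as heel-strike AV, whose mean is negative.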

To identify whether gait quality parameters changed over the intervention period, a difference of 10% from baseline to 3 months was used as the critical value. A 10% change from baseline indicates important change in different types of measures [40, 41] including gait parameters [42].
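The 10% rule can be sketched as a small classification function. This is an illustration only: the function name, the `higher_is_better` flag, and the treatment of a change of exactly 10% as a change are assumptions, since the text does not say how the direction of improvement was coded for each parameter.

```python
def classify_change(baseline, follow_up, higher_is_better=True, critical=0.10):
    """Label a parameter as improved, no change, or deteriorated.

    Change is expressed relative to the absolute baseline value, and a
    relative change smaller than `critical` counts as no change.
    """
    if baseline == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    relative = (follow_up - baseline) / abs(baseline)
    if abs(relative) < critical:
        return "no change"
    improved = (relative > 0) == higher_is_better
    return "improved" if improved else "deteriorated"


if __name__ == "__main__":
    # Hypothetical values: percentage of good steps, where higher is better.
    print(classify_change(40.0, 50.0))
    # Hypothetical values: CV of heel-strike AV, where lower is better.
    print(classify_change(30.0, 24.0, higher_is_better=False))
```

The direction flag matters because a decrease is an improvement for variability measures such as CV but a deterioration for measures such as percentage of good steps.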

Analysis

Counts were made of people enrolled over 6 months, people with complete baseline data, and people completing the study. Technology failures, use, and positive endorsements of satisfaction were also counted, and the average number of help minutes was calculated.

Reliable change [43], magnitude of change relative to pre-post variability and observed inter-test correlation, was calculated for each participant within each group, over the intervention and follow-up periods. The critical value for a single arm, pre-post, study is 1.645. Also presented are the 95% confidence intervals for change in 6MWT, results from a paired t-test, and effect sizes [44] for each group. The sample was too small to use imputation as needed for an intention-to-treat analysis and so only per protocol, within group, results are presented. Data on secondary outcomes are presented for descriptive purposes only as sample sizes are small, and variability large. As this was a pilot study, no between-group analyses are indicated [25, 26], but estimates of change were used to guide power for a future trial.
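For readers unfamiliar with reliable change, a minimal sketch follows. It assumes the common Jacobson and Truax formulation, in which the standard error of the difference is derived from the baseline SD and the test-retest correlation; the text does not state the exact formula used, and the numbers in the example are hypothetical.

```python
import math

CRITICAL_VALUE = 1.645  # critical value stated in the text


def reliable_change_index(pre, post, sd_baseline, reliability):
    """Reliable change index under the assumed Jacobson-Truax formulation.

    SEM = SD * sqrt(1 - r); SE of the difference = sqrt(2) * SEM.
    """
    sem = sd_baseline * math.sqrt(1.0 - reliability)
    se_diff = math.sqrt(2.0) * sem
    return (post - pre) / se_diff


def is_reliable_improvement(pre, post, sd_baseline, reliability):
    """True when the index exceeds the critical value in the positive direction."""
    return reliable_change_index(pre, post, sd_baseline, reliability) > CRITICAL_VALUE


if __name__ == "__main__":
    # Hypothetical 6MWT distances (m), baseline SD (m), and correlation.
    print(reliable_change_index(350.0, 420.0, 100.0, 0.9))
    print(is_reliable_improvement(350.0, 430.0, 100.0, 0.9))
```

With these hypothetical inputs, a 70 m gain falls just short of the critical value while an 80 m gain exceeds it, which shows how a change can exceed measurement error on one criterion without being classed as reliable change.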

Data on gait quality parameters are presented per person according to group. The numbers of gait parameters showing improvement, no change, or deterioration were summed for each person and accumulated over all people. Rate of improvement per group was calculated as the total number of improved gait parameters divided by the total number of person-measures assessed (parameters × people); 95% confidence intervals (CI) for these rates were calculated.
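The rate calculation can be sketched as follows. The interval method used in the study is not stated, so the Wilson score interval here is an assumption chosen for illustration, and the counts in the example are hypothetical rather than taken from the trial.

```python
import math
from statistics import NormalDist


def improvement_rate_ci(improved, person_measures, confidence=0.95):
    """Rate of improvement with a Wilson score interval.

    `improved` is the number of improved gait parameters and
    `person_measures` is parameters x people actually assessed.
    Returns (rate, lower, upper) as proportions.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    n = person_measures
    p = improved / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return p, centre - half, centre + half


if __name__ == "__main__":
    # Hypothetical counts: 5 improved out of 10 person-measures.
    rate, lower, upper = improvement_rate_ci(5, 10)
    print(f"{100 * rate:.1f}% ({100 * lower:.1f} to {100 * upper:.1f}%)")
```

This sketch treats person-measures as independent, although several parameters come from the same person, so an interval computed this way may be narrower than one that accounts for clustering.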

Sample size

The study was powered to detect a minimal important within-group change of moderate or greater magnitude (effect size ½ standard deviation) on the 6MWT. A sample size of 20 per group was targeted to provide 80% power (Type I error 0.05) and 95% confidence that future estimates of within-group effect would exclude the null value of 0 (95% CI, 0.03 to 0.96 SD). The trial was approved to start on the day that McGill University shut down because of COVID (March 2020). The trial was not permitted to start with in-person assessments and therapy sessions until April 2021, and funding restrictions required the trial duration to be curtailed, resulting in a reduced sample size. Thus, we chose to assign people 2:1 to the intervention and control groups to maximize the number receiving intervention. The advantages of an unequal randomization ratio outweigh the disadvantages when the intervention is considered advantageous by potential participants, as in our case with new technology for a chronic disease [45]. In addition, we needed as much information as possible on how people used the sensor in the real world rather than information on how people with this slowly progressing condition fare over a short period of time with recommendations only.

Results

Once COVID restrictions were eased for clinical research, we assessed 33 people for eligibility and randomized 27 over a 6-month period, approximately 5 per month, which saturated the limited resources available to our team. Figure 3 shows the path of participants through the study. As the study was curtailed because of COVID, we did not attempt to recruit three other participants. Of the 27 people randomized, 18 were assigned to the Heel2Toe group and 9 to the Workbook group. One person in the Heel2Toe group did not receive any intervention owing to difficulty with scheduling. Fourteen people in the Heel2Toe group completed the 3-month assessment and 13 completed the 6-month assessment. In the Workbook group, these numbers were 7 and 6. Reasons for non-completion related to the demands of the trial, fear of COVID, travel, and illness. At baseline, there were 2 people with missing data on the primary outcome, the 6MWT. On the other outcomes, missing data ranged from 0 to 3. The results on usability of the Heel2Toe sensor are presented in Supplementary Fig. 2 (data on use) and Supplementary Fig. 3 (satisfaction).

Fig. 3 Flow of participants through the study

The characteristics of the participants in terms of demographics and brain health outcomes at randomization are shown in Table 1. There were some qualitative differences on symptoms such as pain, fatigue, and mood with the participants in the Workbook group reporting average values in the range of clinical concern; however, there was a considerable amount of variability in the ratings.

Table 1 Characteristics of participants in each of the two groups at randomization

The results on the primary outcome, the 6MWT, are presented for each group separately in Table 2. Average values at baseline were approximately 75–80% of what would be predicted for age. The number of participants differed at each timepoint because of missing data. Among the 14 people in the Heel2Toe group with both baseline and 3-month evaluations, the average change in the 6MWT was 66.4 m (SD, 55.6); the change in the 6MWT for the 7 people in the Workbook group was −19.4 m (SD, 41.6). The difference in the Heel2Toe group was associated with a paired t-test value of +4.47 (p = 0.0006) and an effect size of +0.47; the corresponding effect parameters for the Workbook group were −1.24 (p = 0.26) and −0.11, respectively. These parameters are also presented at the 6-month follow-up visit. The proportion of people in the Heel2Toe group who improved more than measurement error was 13/14 after 3 months and 12/13 after 6 months; for the Workbook group, these ratios were 0/7 and 5/6. The average change at follow-up in the Heel2Toe group was 75.7 m (SD, 81), and the average change in the Workbook group was 34.4 m (SD, 95.7). However, reliable change was 4/14 and 5/13 for these two time periods for the Heel2Toe group and 0/7 and 1/6 for the Workbook group. Individual changes to 3 and to 6 months on the 6MWT are presented in Supplementary Figs. 1a and b for each of the two groups. In the Heel2Toe group, most of the changes observed over the active intervention period were maintained to 6 months. For the Workbook group, one person made a dramatic change, resulting in a large mean change; others made some smaller changes.
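As a check on how the paired t-test values relate to the summary statistics above, the t statistic can be recovered from the mean change, the SD of change, and the number of pairs. The sketch below is illustrative; the function name is hypothetical, and it assumes the reported SDs are SDs of the within-person change.

```python
import math


def paired_t_from_summary(mean_change, sd_change, n):
    """Paired t statistic: mean change divided by its standard error."""
    return mean_change / (sd_change / math.sqrt(n))


if __name__ == "__main__":
    # Summary values reported in the text for the 3-month change in 6MWT.
    print(round(paired_t_from_summary(66.4, 55.6, 14), 2))   # Heel2Toe group
    print(round(paired_t_from_summary(-19.4, 41.6, 7), 2))   # Workbook group
```

The Heel2Toe value agrees with the reported +4.47, and the Workbook value comes out close to the reported −1.24, with the small gap plausibly due to rounding of the published mean and SD.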

Table 2 Results on the 6MWT and other performance measures for each group

Table 2 also presents the results on the other PerfOs, the SWOC and Sit-to-Stand. The time to complete the obstacle course was on average 20 to 30 s across groups with normal values reported as 12 s. Of interest is that the number of people agreeing to make more attempts on this course was greater after the intervention with the Heel2Toe sensor (33%), whereas only 14.3% chose this option in the Workbook group. Results on the Sit-to-Stand test were at the lower end of normal for age.

Values on PROMs for motivation and other brain health outcomes are presented in Table 3. Observed change on the Starkstein Apathy Scale, improvement in the Heel2Toe group (−4.2 ± 7.6) and worsening in the Workbook group (3.6 ± 10.9), supported our hypothesis that the feedback affects motivation. Also, overall rating of health and health-related quality of life as measured by the EQ-5D VAS and EQ-5D utility improved in the Heel2Toe group (VAS, +7.1 ± 10.1; utility, 0.01 ± 0.8) and worsened in the Workbook group (VAS, −10.2 ± 12.7; utility, −0.8 ± 0.12). In the Heel2Toe group, fatigue and overall quality of life showed improvement; in the Workbook group, pain showed improvement, although all effect sizes were less than 0.5 (medium effect size). There were no adverse events in either group.

Table 3 Brain health outcomes for participants in each group over time

Supplemental Tables 1 and 2 present the results of the analyses on the gait quality parameters. Despite small numbers, the groups were relatively well balanced at baseline (ST1). Values on gait quality parameters are presented at baseline and at the end of the active intervention period (3 months) for each person according to group (ST2). Values that changed by 10% or more were colored green for improvement and orange for deterioration; values that changed by less were colored yellow for no change. Supplemental Fig. 1 shows the proportion of participants in the two groups who, over 3 months of active intervention, improved, remained the same, or deteriorated on gait parameters. The rate of improvement was 49.7% in the Heel2Toe group (95% CI, 39.6 to 61.5%) and 13.5% in the Workbook group (95% CI, 5.4 to 27.7%).

Discussion

The results of this pilot study supported feasibility, as the enrollment rate was achieved and there was little missing data apart from when people dropped out. There was a higher dropout rate than hoped, but the COVID situation resulted in people being unavailable for testing. There was also compelling evidence that walking training with feedback from the Heel2Toe™ sensor was effective. The results, shown in Table 2 and Supplementary Fig. 1a, support efficacy potential for the Heel2Toe sensor based on the magnitude of average change in each of the two groups (+66.4 vs. −19.4), the proportion of people making change greater than measurement error (13/14 vs. 0/7), and the proportion making reliable change (4/14 vs. 0/7). The inclusion criteria for this pilot were broad, and we found important improvements across the range of baseline walking capacity.

Information on usability pointed out areas for revision of the Heel2Toe™ sensor, all of which have now been implemented in the latest version. The results also showed that people were able to use the sensor on their own at home with some technical support, which diminished over time, and that, despite technical challenges, the majority of people were satisfied with their experience with the technology.

We also found some changes in motivation favoring the Heel2Toe group; motivation is considered to be one of the mechanisms contributing to improved outcomes.

There is support from the literature for the effectiveness of biofeedback in improving gait patterns in healthy and clinical populations including people with PD [46, 47], but few feedback devices are available to the general public [48].

This study was designed to provide evidence as to feasibility and hence its limitations relate specifically to that design. Notwithstanding the large effect on the 6MWT in the intervention group, the sample sizes in the two groups were small, and the study was not powered for between-group comparisons. The planned sample size was not achieved because of the multiple delays and protocol changes owing to the volatile COVID situation during the study period. This also affected retention into the trial as people were worried about contagion and had other stressful situations with which they were dealing.

The number of technical issues uncovered was both a limitation and a strength as we were able to modify some in real-time and others for subsequent revision.

We used a measure of motivation (apathy) that is under development to obtain data on its performance in this population. Results from this early deployment should be interpreted with caution.

While the sample size was small, the variability in the outcomes studied was within those reported in other studies. For the 6MWT, 10 observational studies comprising 1004 people with PD provided data allowing for the calculation of the ratio of SD to distance walked in 6 min (coefficient of variation—CV) which was estimated at 26.5% with 6MWT values averaging 367 m [49,50,51,52,53,54,55,56,57,58]. The results from our study showed similar values for the 6MWT and a CV of 28% (see Table 2). The other measures with high SD were the visual analogue scales with CV ranging from 50 to 75%. A study of variability in these measures showed CV of approximately to 66% [59]. The variability in VAS for the EQ-5D is almost identical to that reported in a paper by Parkin et al. [60].

Based on the results, the feasibility of a pragmatic definitive trial is supported. Using the 6MWT as the outcome and based on conservative estimates of effect size (0.5), a sample size of 64 per group would be supported. This sample size would also be sufficient for estimating effects on other explanatory and downstream outcomes.
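The sample size of 64 per group can be related to a standard two-group calculation. The sketch below uses the normal approximation for comparing two means and assumes a two-sided Type I error of 0.05 and 80% power, neither of which is stated in the text for the main study; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Per-group sample size for a two-group comparison of means.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2.0 * ((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    print(n_per_group(0.5))  # normal approximation for effect size 0.5
```

Under these assumptions the normal approximation gives 63 per group; an exact calculation based on the t distribution adds one participant per group, which is consistent with the 64 per group given above.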