Introduction

Cochlear implants use electrodes implanted into the scala tympani to deliver electrical stimulation that bypasses the lost sensory cells of hearing (the hair cells) and directly activates the auditory nerve. Although cochlear implants have proven remarkably successful at restoring speech comprehension in quiet conditions1,2,3, many users report difficulties understanding speech in noisy conditions4,5,6, reduced music enjoyment7,8,9,10,11, inability to recognise speakers12,13, and trouble differentiating statements from questions14,15,16. These outcomes are believed to be due to the limited spectral (i.e., frequency) information provided by cochlear implants. Contemporary cochlear implants have ~ 22 electrodes, yet many studies indicate that users only receive ~ 8 spectrally independent channels4,17. This is sufficient in quiet environments, but more spectral information is needed in noisy environments18. Despite efforts to increase the number of independent channels through modification of implant design19, stimulation paradigm20,21, or encoding strategies22,23, users have demonstrated little improvement beyond ~ 8 independent channels.

Further studies indicate that the number of independent channels is fundamentally limited by the spread of electrical stimulation from the stimulating electrode throughout the perilymph-filled cochlear chambers (i.e., electrical stimulation has very low spatial resolution)24,25. As such, stimulation from two adjacent electrodes can activate overlapping regions of the auditory nerve and cause interactions, so that they are no longer perceived independently26,27.

In contrast, optogenetics, which relies on light rather than electrical stimulation, has been demonstrated to activate the auditory nerve with high spatial resolution28,29,30. The stimulating light can be focused and directed to minimise overlap or interaction between channels. Optogenetics is enabled by genetic modification of tissues to express light-sensitive ion channels, known as channelrhodopsins. Stimulation of these channelrhodopsins by light permits the flow of ions across the cell membrane that, in turn, evokes a response from the tissue. There have been several studies to date investigating the use of optogenetics in the cochlea, and the development of optogenetic cochlear implants31. However, studies show that optogenetics is unable to evoke responses with high fidelity (i.e., consistently evoke responses over time), stability (i.e., consistent response amplitude), or temporal precision (i.e., consistent response latency) at the high stimulation rates used in contemporary cochlear implants (> 400 pulses per second, pps)32,33. Such high rates are used in electrical stimulation to reduce the synchronicity of auditory nerve responses, which would otherwise negatively impact hearing outcomes.

Another novel approach to overcome the limitations of electrical stimulation in the cochlea is combined optogenetic-electrical stimulation. In optogenetically modified tissue, preceding the electrical stimulation with light can reduce electrical activation thresholds34, increase the spatial resolution of the stimuli35, and improve the temporal characteristics (fidelity, stability, and temporal precision) of responses to stimuli presented at up to 100 pps33. However, these studies were conducted in transgenic mice with consistently high channelrhodopsin expression in the entire population of neurons of the auditory nerve. Clinical translation of optogenetics requires introduction of channelrhodopsins to the auditory nerve through gene transfer techniques, such as viral transduction, which do not always result in consistent expression throughout the auditory nerve with currently available techniques36,37. Therefore, it is important to validate these results using a clinically relevant method of genetic modification and at the higher stimulation rates used in contemporary cochlear implants. Although many studies compare different channelrhodopsins to optimise optogenetic stimulation towards their application, no such study has yet been done for combined optogenetic-electrical stimulation. This research is necessary for the clinical translation of this stimulation approach.

A variety of channelrhodopsins have been discovered and developed to date, displaying a wide range of characteristics suited to different applications38. Many channelrhodopsins are engineered variants with specific mutations to alter their characteristics, such as wavelength sensitivity, kinetics, and/or light sensitivity38,39. For the high-rate stimulation needed for cochlear implants, the primary characteristic typically considered is the closing kinetics of the channelrhodopsin ion channel. To date, auditory neurons expressing opsins with the fastest channel closing kinetics (Chronos, f-Chrimson, and vf-Chrimson) have been shown to achieve 50% mean spike probability in response to optogenetic stimulation at ~ 150 Hz32,40,41, whereas electrical stimulation can easily achieve 100% fidelity of auditory nerve responses at these rates42. However, mutations that result in faster channel kinetics often negatively affect light sensitivity, demanding higher power for neural activation38. The opposite is also true: slower channel kinetics translate to lower power requirements38. ChR2-H134R (hereafter referred to as H134R), one of the earliest channelrhodopsins developed, demonstrates good light sensitivity but relatively poor kinetics for high-rate stimulation applications43. In contrast, the ChIEF channelrhodopsin has closing kinetics nearly twice as fast as H134R (9.8 ms versus ~ 19 ms39,44), but significantly reduced light sensitivity (EC50 0.9 mW/mm2 versus 0.7 mW/mm2)44.

In this study, we investigate two channelrhodopsin variants, H134R and ChIEF, in virally transduced mice for optogenetic and combined stimulation at rates up to 400 pps. The temporal resolution of responses (i.e., fidelity, temporal precision, and stability) and facilitation between stimuli were assessed via auditory nerve and inferior colliculus recordings.

Methods

Virus preparations & pup injections

Adeno-associated virus (AAV)-Anc80 and AAV-PhP.eB were chosen as the viral vectors due to their high specificity for spiral ganglion neurons following intra-cochlear injection in neonates45,46,47. The pAAV-hSyn1-ChR2(H134R)-EYFP (Addgene plasmid #26973) was packaged as an Anc80 AAV by the Vector and Genome Engineering Facility, Children’s Medical Research Institute (New South Wales, Australia) and prepared in phosphate buffered saline (PBS; pH 7.4) supplemented with 50 mM NaCl and 0.001% pluronic F-68. AAV-PhP.eB, with a hSyn1-driven ChIEF-EYFP fusion gene and Woodchuck hepatitis virus post-transcriptional regulatory element (WPRE), was packaged by VectorBuilder and prepared in PBS supplemented with 200 mM NaCl and 0.001% pluronic F-68. Viral titre was estimated at 1.76e12 vg/mL for AAV-Anc80-hSyn1-ChR2(H134R)-EYFP and 1.10e13 vg/mL for AAV-PhP.eB-hSyn1-ChIEF-EYFP. Under hypothermic anesthesia, the left cochlea of C57BL/6 neonates (P2-4) was exposed via a post-auricular incision and blunt dissection of the soft tissues to access the bulla, which was opened using a sharp needle and microscissors. One microlitre of one of the vectors was then delivered through the round window into the cochlear perilymph via rapid injection using a Hamilton syringe positioned using a micropositioner. Tissues were returned to their original position and the skin layer was sutured. Pups were warmed until breathing returned, rubbed with nesting material, and returned to the mother. Experimenters could not be blinded to which virus the mice received due to litter naming conventions perpetuated throughout each stage of the experiments.

Ototoxic deafening

Prior to electrophysiology recordings, mice were unilaterally deafened with neomycin using a technique identical to that described previously, which was shown to produce threshold shifts greater than ~ 25 dB at all frequencies tested and was also validated histologically33. In summary, mice were anesthetised (0.25–3% gaseous isoflurane), and analgesia (lignocaine hydrochloride; 0.1 mL 1% V/V, or bupivacaine; 0.1 mL 0.5% V/V) was applied subcutaneously at the site of incision. An incision was made ~ 1 mm caudal to the pinna, and soft tissues dissected to expose the bulla. A bullostomy was performed to expose the round window and the mid-apical turn of the cochlea. A fenestration in the mid-apical turn was hand-drilled using a sharp probe and the round window membrane was pierced with a ⌀ 0.22 mm steel wire. 10–15 µL of 10% neomycin sulfate (w/V) in 9% saline solution (w/V) was perfused through the round window while gentle aspiration was applied to the mid-apical fenestration over 20–30 min. While we did not confirm deafening in the current study, previous studies indicate it is sufficient to deactivate the hair cells and mitigate the potential of hair cell mediated responses in a highly reproducible manner.

Craniotomy

Following ototoxic deafening, mice were moved to a stereotaxic frame. Analgesia was applied (lignocaine hydrochloride; 0.1 mL 1% V/V, or bupivacaine hydrochloride; 0.1 mL 0.5% V/V; subcutaneous) to the surgical site before an incision was made along the midline of the scalp. The ear canals were exposed and cut close to the skull using sharp microscissors, and ear bars inserted. A diamond dental drill was used to perform a craniotomy over the right inferior colliculus. The dura over the inferior colliculus was carefully removed using a 30 G needle. A saline soaked cotton ball was placed over the exposed brain until the recording electrode was inserted. To access the auditory nerve, the muscles overlying the parietal and interparietal bones on the left side of the skull were dissected using sharp forceps. Again, the brain under the left interparietal bone was carefully exposed using a diamond dental drill, and a 30 G needle was used to lift the dura. A saline soaked cotton ball was placed over the exposed brain until the recording electrode was inserted.

Stimulating equipment

Following the craniotomies, the stereotaxic frame was rotated for better lateral access to the exposed cochlea. Stimulating equipment was inserted into the cochlea. Electrical stimulation was delivered via two 50 µm platinum wires (PTFE insulated; tips exposed), one positioned ~ 1 mm into the mid-apical fenestration used in the deafening procedure and one placed ~ 1 mm into the round window. Optical stimulation was delivered using an optical fibre (105 µm core, NA 0.22) attached to the round window platinum wire.

Recording equipment

A tungsten electrode (TM33B20, World Precision Instruments, USA; impedance ~ 2 MΩ) was inserted through the left hemisphere towards the auditory nerve using a microdriver (David Kopf Instruments, USA) while electrical stimulation was applied to the cochlea. The reference electrode was a steel needle inserted subcutaneously on the scalp between the eyes. Signals from the tungsten electrode were amplified 100 × by an isolated differential amplifier (ISO-80, World Precision Instruments, USA) and sampled at 200 kHz using a data acquisition device (National Instruments, USA). A single shank 32-channel array with 50 µm spacing (NeuroNexus Technologies, USA) was inserted into the inferior colliculus. To provide recording stability, a 1% (w/V) agar solution was applied over the exposed brain and inserted shank electrode. A steel needle was inserted into the right axillary region to form the reference electrode for inferior colliculus recordings. Signals from the recording array were amplified and acquired at 33 kHz by a Cerebus data acquisition system (Blackrock Microsystems, USA).

Stimulation

Light stimuli were delivered using a custom 452 nm laser (OptoTech, Australia) coupled to an optical fibre (105 µm core, NA 0.22) by an FC connector. Light was presented as 0.5–2 ms pulses at intensities of 0–14.5 mW. The laser was calibrated using a photodiode (PDA36A2, Thorlabs, USA) coupled via a 10 m optical fibre to the laser. Calibration was calculated as the plateau power intensity of 500, 750, 1000, and 2000 µs laser pulses measured and averaged over 10 repetitions across the full range of intensities. The stimulating laser was placed outside the Faraday cage such that electrical artefacts of the laser firing were negligible in the neural recordings.

Electrical stimuli were delivered by a custom-built battery powered stimulator connected to the platinum wires. Biphasic current pulses with a phase duration of 25 µs (interphase gap of 8 µs) were delivered over a range of current levels (CL), where the current in µA is calculated from CL as: \(I_{\mu A} = 17.5 \times 100^{CL/255}\).
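
As an illustration of the current level scale, the conversion above can be sketched in a few lines of Python. The function name is hypothetical, and the 0–255 range of CL is an assumption inferred from the denominator of the exponent rather than stated in the text.

```python
def current_level_to_microamps(cl: float) -> float:
    """Convert a stimulator current level (CL) to current in microamps.

    Implements I = 17.5 * 100 ** (CL / 255), the formula given in the text.
    """
    return 17.5 * 100.0 ** (cl / 255.0)


# Assumption: CL spans 0-255, inferred from the exponent's denominator.
for cl in (0, 128, 230, 255):
    print(f"CL {cl:3d} -> {current_level_to_microamps(cl):7.1f} uA")
```

Under this formula, the scale is logarithmic: a change of 255 CL corresponds to a 100-fold (40 dB) change in current, and the 230 CL level used for the long-duration stimulation corresponds to roughly 1.1 mA.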

Stimuli throughout the analysis are described as sub- (below), para- (equal to), and supra- (above) threshold relative to thresholds identified under 4 pps stimulation. Perithreshold is also used to describe a range around threshold. Specific ranges are stated in the relevant results sections.

For combined stimulation trials, the electrical stimuli were delayed such that they finished with the end of the laser pulse. Stimuli were synchronised by a trigger signal produced by the data acquisition system.
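
The timing rule above amounts to simple arithmetic, sketched below. This assumes the laser pulse starts at the trigger and that "finished with" means the offsets of the two stimuli coincide exactly; the function name is hypothetical.

```python
PHASE_US = 25.0          # duration of each phase of the biphasic pulse (from the text)
INTERPHASE_GAP_US = 8.0  # interphase gap (from the text)


def electrical_onset_delay_us(laser_pulse_us: float) -> float:
    """Delay from laser onset to electrical onset so that both end together.

    Assumption: laser onset defines time zero and the two offsets coincide.
    """
    electrical_total_us = 2 * PHASE_US + INTERPHASE_GAP_US  # 58 us in total
    if laser_pulse_us < electrical_total_us:
        raise ValueError("laser pulse is shorter than the electrical pulse")
    return laser_pulse_us - electrical_total_us


print(electrical_onset_delay_us(1000.0))  # delay for a 1 ms laser pulse
```

For a 1 ms laser pulse, the electrical pulse would therefore begin 942 µs after light onset under these assumptions.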

Tungsten recordings

Recordings were processed as previously described33. Tungsten recordings were filtered backwards using a bandpass filter (500–3000 Hz) to mitigate the effect of electrical stimulation artefacts on the auditory nerve response. To further mitigate the effect of the artefact, biphasic current pulses were delivered as both cathodic and anodic first and averaged together (50 repetitions each, total of 100 repetitions) for both electrical-only and combined stimulation trials. Optical responses were also averaged over 50–100 repetitions. The P1-N2 response following stimulation was captured as the auditory nerve response. Auditory nerve thresholds were identified as the minimum level at which a response could be confidently visually identified.
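
The rationale for averaging cathodic-first and anodic-first sweeps can be illustrated with a toy example. The sketch below assumes that the stimulus artefact inverts with pulse polarity while the neural response does not; the waveforms and function name are invented for illustration and are not recorded data.

```python
def average_alternating_polarity(cathodic_trials, anodic_trials):
    """Average equal numbers of cathodic- and anodic-first sweeps, sample by sample."""
    if len(cathodic_trials) != len(anodic_trials):
        raise ValueError("expected the same number of sweeps for each polarity")
    sweeps = list(cathodic_trials) + list(anodic_trials)
    n_samples = len(sweeps[0])
    return [sum(sweep[i] for sweep in sweeps) / len(sweeps) for i in range(n_samples)]


# Toy waveforms (assumptions, not data): the artefact flips sign with polarity,
# whereas the neural response is identical for both polarities.
neural = [0.0, 0.0, 1.0, -2.0, 0.5, 0.0]
artefact = [50.0, -50.0, 5.0, 0.0, 0.0, 0.0]
cathodic = [[n - a for n, a in zip(neural, artefact)] for _ in range(50)]
anodic = [[n + a for n, a in zip(neural, artefact)] for _ in range(50)]

averaged = average_alternating_polarity(cathodic, anodic)
print(averaged)  # the polarity-dependent artefact cancels in the average
```

In practice the cancellation is only partial, since real artefacts are not perfectly symmetric, which is consistent with the additional filtering step described above.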

Multielectrode recordings

Processing of multielectrode recordings is described in greater detail in George et al.48. Briefly, recordings were bandpass filtered (300–5000 Hz) and spikes counted in the 5–40 ms window following stimulation using a threshold of − 4.5 × root mean square (RMS). This window captures the majority of activity in the inferior colliculus without capturing electrical artefacts (see Supplementary Fig. S1 online).
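
A minimal sketch of this style of threshold-based spike counting is given below. Several details are assumptions, because the text does not specify them: the RMS is computed over the whole filtered trace, a spike is counted at each downward crossing of the threshold, and the toy trace uses a 1 kHz sample rate purely for readability.

```python
import math


def count_spikes(trace, sample_rate_hz, stim_index, k=4.5, window_ms=(5.0, 40.0)):
    """Count downward crossings of -k * RMS within a post-stimulus window."""
    rms = math.sqrt(sum(v * v for v in trace) / len(trace))  # assumption: whole-trace RMS
    threshold = -k * rms
    start = stim_index + int(round(window_ms[0] * 1e-3 * sample_rate_hz))
    stop = min(len(trace), stim_index + int(round(window_ms[1] * 1e-3 * sample_rate_hz)))
    spikes = 0
    for i in range(max(start, 1), stop):
        # A spike is registered when the signal passes below threshold.
        if trace[i] < threshold <= trace[i - 1]:
            spikes += 1
    return spikes


# Toy trace (hypothetical): low-level alternating noise with four large deflections.
trace = [1.0 if i % 2 == 0 else -1.0 for i in range(100)]
for idx in (3, 20, 30, 60):
    trace[idx] = -30.0

n_spikes = count_spikes(trace, sample_rate_hz=1000.0, stim_index=0)
print(n_spikes)  # only the deflections at samples 20 and 30 fall inside 5-40 ms
```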

For electrical-only and combined stimulation trials, responses only to cathodic-first stimuli were used. Responses were averaged over 10–20 repetitions for each stimulus or stimulus combination.

Inferior colliculus activation threshold was identified as the power level that produced activity at 33% of the maximum evoked activity. Spatial tuning curves were fitted for the 33% level (chosen based on previous studies48,49) at each recording electrode. Activation widths at any level were calculated using the crossings of the spatial tuning curve.
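
The 33% criterion can be sketched as follows. Linear interpolation between tested levels is an assumption made here for simplicity (the text describes fitted curves), and the function name and example values are hypothetical.

```python
def threshold_at_fraction(levels, responses, fraction=0.33):
    """Lowest stimulus level at which the response reaches `fraction` of its maximum.

    Assumption: linear interpolation between adjacent tested levels.
    Returns None if there is no evoked activity.
    """
    peak = max(responses)
    if peak <= 0:
        return None
    target = fraction * peak
    for i, response in enumerate(responses):
        if response >= target:
            if i == 0:
                return levels[0]
            x0, x1 = levels[i - 1], levels[i]
            y0, y1 = responses[i - 1], response
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    return None


# Hypothetical growth function: spike counts at five stimulus levels.
levels = [0.0, 1.0, 2.0, 3.0, 4.0]
spike_counts = [0.0, 0.0, 10.0, 60.0, 100.0]
print(threshold_at_fraction(levels, spike_counts))
```

Applying the same function to each recording electrode in turn would yield one threshold per electrode, which is the information a spatial tuning curve summarises.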

Histology

Following acute electrophysiology experiments, mice were euthanised (cervical dislocation under gaseous isoflurane anesthesia) and the left cochlea was dissected and processed for cryo-sectioning. Cochleae were immediately fixed following dissection in 10% (V/V) neutral buffered formalin for 2–16 h and then rinsed in PBS before 2–3 days of decalcification in 0.12 M ethylenediaminetetraacetic acid (EDTA). Processed cochleae were embedded in optimum cutting temperature compound (OCT, Tissue-Tek, Sakura, Japan) and snap-frozen for cryo-sectioning serial sections at 12 µm. Mid-modiolar sections were immunostained using mouse anti-HuD (1:1000, Santa Cruz Biotechnology, SC-48421) or anti-Tuj1 (1:500, Biolegend, 801201) for spiral ganglion neurons, and mounted with 4′,6-diamidino-2-phenylindole (DAPI) to visualise nuclei. Rosenthal’s canal in three turns of each cochlea was imaged using a fluorescence microscope and processed using ImageJ (National Institutes of Health, USA). Processing involved thresholding the image at the mean background fluorescence in the YFP channel to identify opsin-positive spiral ganglion neurons. These were counted as a percentage of all spiral ganglion cells (HuD positive cells).

Analysis/statistical methods

L50 was calculated by first fitting a sigmoid curve to the amplitude growth function (AGF) for auditory nerve recordings, or to the number of spikes at the best electrode for inferior colliculus responses, as a function of stimulation intensity, and then taking the slope of the fitted curve at the point halfway between threshold and maximum intensity.
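
To make the slope calculation concrete, the sketch below evaluates the derivative of an already-fitted sigmoid at the halfway point. The logistic form and its parameter names (`upper`, `midpoint`, `steepness`) are assumptions, as the exact sigmoid used is not specified here, and the fitting step itself is omitted.

```python
import math


def sigmoid(x, upper, midpoint, steepness):
    """Logistic curve (assumed form): upper / (1 + exp(-steepness * (x - midpoint)))."""
    return upper / (1.0 + math.exp(-steepness * (x - midpoint)))


def l50_slope(upper, midpoint, steepness, threshold, max_intensity):
    """Slope of the fitted sigmoid halfway between threshold and maximum intensity."""
    x_half = 0.5 * (threshold + max_intensity)
    s = sigmoid(x_half, 1.0, midpoint, steepness)  # normalised curve value at x_half
    # Analytical derivative of the logistic curve with respect to x.
    return upper * steepness * s * (1.0 - s)


# Hypothetical fitted parameters, with both axes expressed in dB.
print(l50_slope(upper=40.0, midpoint=5.0, steepness=0.5, threshold=0.0, max_intensity=10.0))
```

When stimulus intensity and response amplitude are both expressed in dB, the resulting slope has units of dB/dB, matching the units in which L50 is reported.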

For auditory nerve recordings to burst stimuli, the first response in sequence was always discarded due to a large and variable onset response. Burst recordings with at least two additional responses (i.e., two responses after the first discarded response) were required for inclusion in the analysis of fidelity. Recordings with fidelity of at least 50% were required for the analysis of stability and temporal spread. Individual responses were identified as having a P1-N2 size greater than the quiet period following the burst, and a P1 occurring within 1–2.5 ms following stimulation.
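
The inclusion rules above can be sketched as follows, under the assumption that fidelity is the percentage of pulses, after the discarded first pulse, that evoke a detectable response. The function names, the representation of each pulse as a (latency, amplitude) pair, and the example values are hypothetical.

```python
def is_response(p1_latency_ms, p1_n2_amplitude, quiet_amplitude):
    """Detection rule from the text: P1 within 1-2.5 ms and P1-N2 above the quiet period."""
    return 1.0 <= p1_latency_ms <= 2.5 and p1_n2_amplitude > quiet_amplitude


def burst_fidelity(pulses, quiet_amplitude):
    """Percentage of pulses after the first that evoke a detectable response.

    `pulses` holds one (p1_latency_ms, p1_n2_amplitude) pair per stimulus pulse.
    Returns None when fewer than two responses follow the discarded first one.
    """
    later = pulses[1:]  # the first response in the burst is always discarded
    hits = [is_response(latency, amplitude, quiet_amplitude) for latency, amplitude in later]
    if sum(hits) < 2:
        return None
    return 100.0 * sum(hits) / len(hits)


# Hypothetical burst of six pulses; amplitudes are in arbitrary units.
burst = [(1.2, 90.0), (1.5, 20.0), (1.6, 18.0), (3.0, 25.0), (1.4, 4.0), (1.7, 15.0)]
fidelity = burst_fidelity(burst, quiet_amplitude=5.0)
print(fidelity)
```

A burst scoring at least 50% under this measure would then be carried forward to the stability and temporal spread analyses.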

Numerical results stated are average ± standard error of the mean, except for burst stimulation fidelity, temporal spread, and stability, which state median ± interquartile range.

Optogenetic activation: Statistical tests for optogenetic activation thresholds and dynamic range used Student’s t-test. Normality was confirmed using the Shapiro–Wilk test.

Change in optogenetic response amplitude/threshold over time: Statistical tests used RM-ANOVA for within cohort comparisons and Student’s t-test for across cohort comparisons.

Burst stimulation measures: To better compare the three modalities (i.e., optical stimulation, electrical stimulation, or combined stimulation), temporal spread and stability include only the responses that achieved ≥ 50% fidelity. Statistical tests for fidelity, temporal spread, and stability used the Wilcoxon–Mann–Whitney test, as the Shapiro–Wilk test showed the data were not normally distributed.

Histology: Statistical tests for channelrhodopsin expression differences between turns used repeated measures analysis of variance (RM-ANOVA), and differences between cohorts used the Kruskal–Wallis test, as variances were not consistently equal according to Levene’s test.

Results

Optical responses

We first compared the response characteristics of auditory neurons expressing either H134R or ChIEF channelrhodopsin to 452 nm wavelength optical stimulation in vivo through tungsten electrode recordings of the compound action potential of the auditory nerve and 32-channel shank probe recordings of multiunit activity in the inferior colliculus (Fig. 1). Mice were acutely deafened with 10% neomycin perfused through the cochlea, as described previously33, 30–60 min prior to recordings.

Fig. 1

Overview of the experimental set up. (A) P2-4 neonatal mice received intracochlear AAV-H134R or AAV-ChIEF injections. After 6–11 weeks, mice were acutely deafened via perfusion of 10% neomycin through the cochlea. (B) Approximately 30–60 min after deafening, electrophysiology recordings were taken from the inferior colliculus in the form of multielectrode recordings and from the auditory nerve as compound action potential recordings. Electrical and optical stimuli were delivered via platinum wires inserted into the cochlea (round window and fenestration made during deafening) and an optical fibre inserted into the round window.

Auditory nerve compound action potentials could be elicited in all H134R mice (n = 7) and all ChIEF mice (n = 9) (Fig. 2A). Optogenetic activation thresholds for the auditory nerve were not significantly different between H134R (0.66 ± 0.4 mW) and ChIEF (0.18 ± 0.1 mW) (p = 0.3, Student’s t-test) and, for both channelrhodopsins, the optical activation thresholds reduced with increasing pulse duration (Fig. 2B).

Fig. 2

Characterising the auditory nerve compound action potentials to optical-only stimulation in H134R and ChIEF transduced mice. (A) Example auditory nerve compound action potentials to ~ 5.0 mW blue light pulses of increasing pulse duration showing increasing amplitude with pulse duration for both H134R and ChIEF. Blue bars indicate stimulus pulse duration. (B) Relative auditory nerve activation thresholds and (C) L50 (i.e., the slope of the response growth function) as a function of optical stimulation pulse duration. Results from n = 6–8 individual mice are shown. (D) Change in auditory nerve response amplitude in individual H134R and ChIEF mice over day-long pulsed optical (4.9 mW) and electrical stimulation (230 CL). Optical stimuli were 1 ms, electrical stimuli were 25 µs biphasic pulses. Stimuli were delivered at both slow (4 pps) and high (100–400 pps) rates. Individual mice are plotted with dotted lines and the median slope is plotted as a solid line. Open symbols indicate recordings where the response could not be confidently differentiated from the noise floor. (E) Rate of change in auditory nerve response amplitude. Symbols indicate individual mice, and solid bars indicate the median. Asterisk indicates significant difference compared to a mean of zero (Student’s t-test, p < 0.05).

For each mouse, we constructed an AGF using the peak-peak amplitude of the auditory nerve compound action potential response, then fitted it with a sigmoid curve. In a number of these mice, the AGF did not reach saturation, and dynamic ranges could not be compared using the 10–90% measure. Instead, we calculated the L50 between threshold and the highest stimulation intensity presented as an indication of dynamic range. The L50 of the auditory nerve response reduced with increasing stimulation pulse duration and was significantly (p < 0.05, Student’s t-test) elevated for ChIEF (1.77 ± 0.2 dB/dB) compared to H134R (0.91 ± 0.3 dB/dB) for a 1 ms stimulation pulse duration (Fig. 2C). Both were significantly reduced compared to electrical stimulation (3.9 ± 0.4 dB/dB and 4.4 ± 0.5 dB/dB for electrical stimulation gathered in H134R and ChIEF, respectively; p < 0.05, paired t-test, n = 6 for each group).

We also evaluated the stability of responses over day-long (range 3–13 h) optical stimulation for both channelrhodopsins. Auditory nerve response amplitude to 4.9 mW stimulation diminished over day-long stimulation for both channelrhodopsins (Fig. 2D) until, in many cases (6 of 6 ChIEF mice, 2 of 7 H134R mice), no response could be elicited at that stimulation intensity, and responses could not be recovered even after 1–2 h of darkness. Similarly, responses to electrical stimulation at 230 CL also decreased in amplitude over the experimental course of stimulation. For our statistical comparisons, we linearly fitted the amplitudes for optical-only and electrical-only responses (excluding recordings where a response was not elicited, i.e., the open symbols of Fig. 2D) of individual mice to obtain the rate of change of response amplitude over time (Fig. 2E). We included only mice which had stable stimulation set ups (i.e., no adjustments made to the stimulating electrodes or optical fibre over the course of stimulation) and had at least three recordings of optical-only and electrical-only responses (n = 7 for H134R, n = 5 for ChIEF). There was no significant difference between the rate of change in the amplitude of optogenetic and electrical responses for either H134R or ChIEF mice. Although loss of responses appeared to occur more rapidly in ChIEF mice compared to H134R mice, this did not reach significance (p > 0.05, Student’s t-test). However, when tested against no change (i.e., a mean rate of zero), the auditory nerve response amplitude to optogenetic stimulation was relatively stable for H134R mice (p > 0.29, Student’s t-test compared to mean = 0), but not for ChIEF mice (p < 0.05, Student’s t-test compared to mean = 0) (Fig. 2D,E). Electrical stimulation was relatively stable, with minimal change in auditory nerve response amplitude (p > 0.29, Student’s t-test compared to mean = 0) (Fig. 2D,E).
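
The per-mouse rate of change described above is the slope of a straight-line fit to amplitude against time. A minimal sketch using ordinary least squares is shown below; the choice of ordinary least squares, the function name, and the example values are assumptions for illustration only.

```python
def rate_of_change(times_h, amplitudes):
    """Ordinary least-squares slope of response amplitude against time.

    Recordings without an elicited response should be removed by the caller;
    at least three recordings are required, following the inclusion rule in the text.
    """
    n = len(times_h)
    if n < 3 or n != len(amplitudes):
        raise ValueError("need at least three paired recordings")
    mean_t = sum(times_h) / n
    mean_a = sum(amplitudes) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    if sxx == 0:
        raise ValueError("recordings must be taken at different times")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_h, amplitudes))
    return sxy / sxx


# Hypothetical mouse: amplitude (arbitrary units) recorded at four time points (hours).
print(rate_of_change([0.0, 2.0, 4.0, 6.0], [10.0, 8.0, 6.0, 4.0]))
```

The set of slopes obtained this way, one per mouse, is what would then be compared against zero or between cohorts.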

Inferior colliculus multi-unit responses to 4 pps optogenetic stimulation could be elicited in all H134R mice (n = 7) and 8 of 9 ChIEF mice (Fig. 3A) (NB: multi-unit responses could not be recorded from one of the ChIEF mice due to damage to the inferior colliculus recording electrode). As with the trend observed in the compound action potential recordings shown in Fig. 2, activation threshold measured in the inferior colliculus also decreased with increasing pulse duration (Fig. 3B). We constructed response growth functions using the responses at the best electrode, then calculated L50 from the fitted sigmoid. In contrast to the auditory nerve responses, the slopes of inferior colliculus responses at 1 ms were not significantly different between the two channelrhodopsins (1.56 ± 0.8 dB/dB for H134R, 1.50 ± 0.5 dB/dB for ChIEF; p > 0.9, Student’s t-test) (Fig. 3C), and neither was significantly different from the slopes of electrical responses (2.29 ± 0.2 dB/dB, p > 0.4, paired t-test, n = 7 for H134R; 2.78 ± 0.4 dB/dB, p > 0.08, paired t-test, n = 8 for ChIEF).

Fig. 3

Characterising multi-unit responses to optical-only stimulation in H134R and ChIEF transduced mice via recordings from the inferior colliculus. (A) Example inferior colliculus response images to optical stimulation from a H134R mouse (top) and a ChIEF mouse (bottom), demonstrating reduced thresholds with increasing stimulation pulse duration. White line indicates spatial tuning curve (i.e., the threshold for each recording electrode, defined as the lowest stimulation level to produce 33% of the maximum response). Colour scale shown is the same for all responses. Best recording electrode was selected as the electrode with the lowest threshold. (B) Relative inferior colliculus activation thresholds and (C) L50 (i.e., the slope of the response growth function) as a function of optical stimulation pulse duration. Results from n = 6–8 individual mice are shown. (D) Change in inferior colliculus activation threshold in individual H134R and ChIEF mice over extended pulsed optical and electrical stimulation. Individual mice are plotted with dotted lines and the median slope is plotted as a solid line. Open symbols indicate recordings where no response could be elicited using the maximum stimulation intensity (15.3 mW). (E) Rate of change in inferior colliculus activation thresholds. Symbols indicate individual mice, and solid bars indicate the median. Asterisks indicate significant difference between groups (RM-ANOVA, p < 0.05). Daggers indicate significant difference compared to a mean of zero (Student’s t-test, p < 0.05).

Inferior colliculus responses to repeated optical stimulation deteriorated over time, as measured by an increase in threshold compared to the first response measured (Fig. 3D). The rate of change in activation thresholds for optogenetic stimulation was significantly higher in H134R mice compared to electrical stimulation (p < 0.05, RM-ANOVA, n = 7) (Fig. 3E). The same was seen in ChIEF mice (p < 0.05, RM-ANOVA, n = 5), but the rate of change in optogenetic activation thresholds did not differ significantly between the two cohorts (p > 0.05, Student’s t-test). Irreversible loss of inferior colliculus responses to optical stimulation (up to a stimulation intensity of 15.3 mW) occurred in 5 of 6 ChIEF mice following extended stimulation (open symbols in Fig. 3D), whereas responses could be elicited in all 7 H134R mice for the duration of the experiments. Again, the change in electrical activation thresholds was not significantly different from no change (i.e., mean equivalent to zero) for either cohort (p > 0.1, Student’s t-test) (Fig. 3E). However, both H134R and ChIEF optical responses were significantly different from no change (p < 0.05, Student’s t-test compared to mean = 0) (Fig. 3E).

Facilitation of auditory nerve responses

Following characterisation of the optogenetic response, we investigated the facilitation of electrical activation by simultaneous stimulation with light (combined stimulation). Electrical-only and optogenetic-only trials were interleaved with combined stimulation trials, and responses from the auditory nerve and inferior colliculus were recorded. As described previously33, since the stimulating regions of the optical and electrical stimuli do not overlap exactly, the combined stimulation response actually comprises three summed responses: the electric-only response, the optical-only response, and the combined-stimulus response. Thus, to assess facilitation of the auditory nerve response, the optogenetic-only response was subtracted from the combined response to calculate changes in the electrical threshold and changes in the auditory nerve response amplitude (Fig. 4A).

Fig. 4

Characterising responses of the auditory nerve to combined stimulation. (A) To evaluate facilitation between the optical and electrical stimuli in the combined stimulation response (i), the optogenetic-only response (blue; 82 µW) was subtracted from combined stimulation responses (purple). The traces in (ii) show the results of the subtraction, which can be compared to the electrical-only response (iii). The residual between (ii) and (iii) is shown in (iv) revealing a small facilitation at the suprathreshold stimulation used in this example. Orange vertical bars indicate the electrical artefact. Triangles indicate response amplitude. (B) Change in electrical activation threshold with combined optogenetic/electrical stimulation at sub-, para-, and supra-threshold optical levels for each channelrhodopsin. (C) Change in response amplitude with the three levels of combined optogenetic/electrical stimulation measured at the electrical activation thresholds. (D) Same as (C) but measured at 0.7 dB above the electrical activation threshold. Categories refer to the relative intensity of optogenetic stimulation used for combined stimulation, (i.e., subthreshold, parathreshold, or suprathreshold light). Symbols show mean and error bars show standard error of the mean. Optogenetic threshold was defined as the minimum intensity to produce a visually identifiable response. Thresholds were regularly reassessed throughout the recording session and adjusted if required. Asterisks indicate significant difference compared to a mean of zero for H134R (Student’s t-test, p < 0.05).

A range of electrical levels were combined with subthreshold light (0.7–19 dB below threshold), parathreshold light (0 dB), or suprathreshold light (0.5–25 dB above threshold). Threshold was defined as the minimum light intensity to produce a visually identifiable auditory nerve response. Addition of subthreshold and parathreshold optogenetic stimulation to electrical stimulation had no significant effect on electrical thresholds in either H134R or ChIEF mice (p > 0.4, paired t-test, n = 6 for H134R and ChIEF, 3–9 light intensities per mouse) (Fig. 4B). However, a modest reduction in threshold was seen in ChIEF mice when using suprathreshold optogenetic stimulation (− 0.39 ± 0.2 dB; p = 0.05, paired t-test). Thresholds in H134R mice with suprathreshold optogenetic stimulation were neither substantially nor significantly different from those for electrical stimulation alone.

We quantified the change in response amplitude at the electrical threshold and at 0.7 dB above electrical threshold (i.e., at the electrical stimulation level that results in a 3 dB increase in response amplitude, calculated using the L50 for electrical stimulation). At electrical threshold, we observed no significant difference in the response amplitude with the addition of optogenetic stimulation at any intensity for either channelrhodopsin (p > 0.05, Student’s t-test) (Fig. 4C). At 0.7 dB above electrical threshold (Fig. 4D), there was a small but significant increase in auditory nerve response amplitudes with the addition of para- and suprathreshold optogenetic stimulation to electrical stimulation for H134R mice (0.36 ± 0.1 dB and 0.42 ± 0.1 dB, respectively, p < 0.05, Student’s t-test). Combined stimulation of ChIEF mice did not result in any significant effect on response amplitude at any optogenetic stimulation power.
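
The link between the 0.7 dB step and a 3 dB increase in response amplitude can be checked with simple arithmetic. The sketch below assumes that response growth is approximately linear in dB near threshold and that the slope used was close to the electrical L50 values reported for the auditory nerve (3.9 and 4.4 dB/dB); the exact slope used for this calculation is not stated, and the function name is hypothetical.

```python
def level_increase_for_gain(target_gain_db, slope_db_per_db):
    """Stimulus level increase (dB) needed for a given response gain (dB).

    Assumption: locally linear growth, i.e. gain = slope * level increase.
    """
    return target_gain_db / slope_db_per_db


# Electrical L50 values reported for the auditory nerve in the two cohorts.
for slope in (3.9, 4.4):
    step_db = level_increase_for_gain(3.0, slope)
    print(f"slope {slope} dB/dB -> {step_db:.2f} dB above threshold")
```

Both slopes give a step of roughly 0.7–0.8 dB, consistent with the value used in the analysis.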

Facilitation of inferior colliculus activity

In addition to the auditory nerve recordings, we also looked at the inferior colliculus responses for evidence of facilitation during combined stimulation by quantifying changes in the electrical activation threshold (Fig. 5A) and the activation width (recording electrode distance between crossings of the spatial tuning curve) at the stimulation level midway between threshold and the maximum current threshold tested (Fig. 5B). Both H134R and ChIEF mice demonstrated a reduction in electrical activation thresholds around the optogenetic activation threshold (p < 0.001, Student’s t-test, n = 8 for H134R, n = 6 for ChIEF) (Fig. 5A).
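
The activation width measure can be sketched as follows. This minimal Python example treats the spatial tuning curve as one threshold value per recording electrode and finds the distance between the two points at which the curve crosses a chosen stimulation level, using linear interpolation between electrodes. The function names, the interpolation step, and the 100 µm electrode spacing are hypothetical choices for illustration only.

```python
def midpoint_level(threshold, max_level):
    """Stimulation level midway between threshold and the maximum level tested."""
    return (threshold + max_level) / 2.0

def activation_width(stc, level, spacing_um=100.0):
    """Distance (µm) between the crossings of a spatial tuning curve at `level`.

    stc: threshold per recording electrode, ordered along the probe.
    spacing_um: electrode spacing (hypothetical default, not from the study).
    Crossings are linearly interpolated between neighbouring electrodes.
    """
    best = min(range(len(stc)), key=lambda i: stc[i])
    if stc[best] > level:
        return 0.0  # level is below the lowest threshold: nothing is activated

    # Walk towards lower electrode indices while electrodes stay activated.
    i = best
    while i > 0 and stc[i - 1] <= level:
        i -= 1
    left = i - (level - stc[i]) / (stc[i - 1] - stc[i]) if i > 0 else 0.0

    # Walk towards higher electrode indices while electrodes stay activated.
    j = best
    last = len(stc) - 1
    while j < last and stc[j + 1] <= level:
        j += 1
    right = j + (level - stc[j]) / (stc[j + 1] - stc[j]) if j < last else float(last)

    return (right - left) * spacing_um

# Hypothetical V-shaped tuning curve (thresholds in arbitrary current levels).
stc = [60, 40, 20, 40, 60]
level = midpoint_level(threshold=20, max_level=60)
print(activation_width(stc, level))
```

A ratio of the combined to the electrical-only width computed this way would be 1 when optogenetic stimulation has no effect on spread.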

Fig. 5

Characterising responses to combined stimulation at the level of the inferior colliculus. (A) Change in inferior colliculus electrical activation threshold with simultaneous optogenetic stimulation. Lines are sigmoid curves fitted to data. Data points represent individual trials, pooled across all mice (H134R: n = 7, ChIEF: n = 6). Asterisk indicates significant change in threshold (Student’s t-test, p < 0.05). (B) Change in inferior colliculus electrical activation spread with simultaneous optogenetic stimulation intensity (relative to threshold), as measured at the stimulation level midway between threshold and the maximum current threshold tested. Lines are exponential curves fitted to data. Asterisk indicates significant change in activation width (Student’s t-test, p < 0.05). Dagger indicates significant decrease in activation width for ChIEF (Student’s t-test, p < 0.05). (C) Example of residuals analysis. Response images showing the spiking activity in response to electrical-only stimulation (i) and combined stimulation (ii). Colour scale as per Fig. 3A. The electrical-only activity in (i) and the optogenetic-only activity (i.e., 0 CL level of ii) are subtracted from the combined stimulation recording (ii) to show the residual activity (iii), where an increase in spikes above the activity expected from electrical and optogenetic activity is labelled in red, and a decrease is labelled in blue. White lines in (i) and (ii) are the spatial tuning curves for each response. Black line in (iii) is the electrical spatial tuning curve (same as i). (iv) The residual combined stimulation activity at the best electrode (dotted green line in iii) is plotted against the current level relative to threshold as a percentage of the maximum electrical activity. Values above the horizontal dotted line indicate activity greater than the optogenetic-only and electrical-only responses (i.e., combined-only activity). Values to the left of the vertical dotted line indicate activity below the electrical activation threshold (i.e., facilitation). (D) The residual combined stimulation activity for (i) H134R and (ii) ChIEF at different optogenetic stimulation levels. Optogenetic stimulation levels are given relative to the optogenetic activation threshold. Solid line is the mean, shaded areas are minimum and maximum.

Activation width increased at suprathreshold optogenetic stimulation intensities in both H134R and ChIEF mice (p < 0.05, Student’s t-test) (Fig. 5B). Trials with perithreshold light intensities (between − 1 dB and 1 dB) demonstrated a reduction in activation width for ChIEF mice (ratio: 0.75 ± 0.2, p < 0.05, Z-test with mean of 1, 6 trials, 4 mice) but not for H134R mice (ratio: 1.0 ± 0.1, p > 0.9, Z-test with mean of 1, 4 trials, 3 mice). The reduction in activation width for ChIEF mice was also evident when measured at 10 CL, 20 CL, 30 CL, 40 CL, and 50 CL above threshold (p < 0.05, Student’s t-test), but was not evident at any level for H134R mice.

To understand how much of this facilitation was due to the interaction between the two stimuli rather than the two stimuli acting separately, we quantified the increase in spikes at the best recording electrode for combined stimulation compared to electrical stimulation. A recording in which combined stimulation facilitation was observed is shown in Fig. 5C to demonstrate the analysis method. Median smoothing was applied to the difference. Consistent with the results from the auditory nerve, we observed no substantial effect of optogenetic stimulation on inferior colliculus activity at subthreshold electrical levels for either channelrhodopsin (Fig. 5D).
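
The residuals analysis can be sketched in a few lines of Python. In this minimal example, the electrical-only spike counts and the optogenetic-only spike count (taken as the 0 CL level of the combined recording, as in the Fig. 5C caption) are subtracted from the combined stimulation spike counts at the best electrode; the difference is median-smoothed and expressed as a percentage of the maximum electrical activity. The function names, the three-point smoothing window, and the spike counts are assumptions for illustration, since the window size is not stated here.

```python
from statistics import median

def median_smooth(values, window=3):
    """Running median; the window is truncated at the edges.
    The window size of 3 is a hypothetical choice."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

def residual_activity(combined, electrical, max_electrical):
    """Residual combined-only activity (% of maximum electrical activity).

    combined, electrical: spike counts at the best electrode, one value per
    current level, with index 0 corresponding to 0 CL (no electrical current).
    """
    opto_only = combined[0]  # optogenetic-only activity = combined recording at 0 CL
    raw = [c - e - opto_only for c, e in zip(combined, electrical)]
    return [100.0 * r / max_electrical for r in median_smooth(raw)]

# Hypothetical spike counts across increasing current levels.
electrical = [0, 0, 2, 6, 10]
combined = [1, 1, 5, 8, 11]

residual = residual_activity(combined, electrical, max_electrical=max(electrical))
print(residual)
```

Positive residuals at current levels below the electrical activation threshold would correspond to facilitation in the sense used in Fig. 5C(iv).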

Temporal fidelity, precision and stability of high-rate stimulation

Next, we compared the temporal characteristics of auditory nerve responses to optical, electrical, or combined stimuli in H134R and ChIEF mice. Stimuli were presented as a 300 ms burst at 100, 200, and 400 pps at a 1 Hz burst presentation rate. Electrical-only, optogenetic-only, and combined stimulation trials were randomly interleaved. In accordance with our previous studies33 and consistent with the methods used in Fig. 4, the combined stimulation response was isolated by subtracting the electrical-only and optogenetic-only responses from the combined stimulation responses. Combined stimulation trials used subthreshold light (0.6–7 dB below threshold), parathreshold light (0 dB), or suprathreshold light (1.7–17 dB above threshold) with electrical stimulation (− 1.7 to 9 dB relative to threshold) with intensities relative to the threshold at 4 pps.

Three characteristics were extracted from the responses: fidelity (measured as the percentage of stimuli that elicited responses), temporal spread (measured as the change in response latency between the earliest and latest response) and stability (measured as the change in response amplitude between the first and last response) (Fig. 6A). To best compare temporal spread and stability, only trials that achieved a fidelity of 50% or above were included in these analyses.
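
The three measures can be illustrated with a short Python sketch. It assumes that response detection has already produced, for each burst, the latency and peak-to-peak amplitude of every detected response; the function names, the example values, and the 20·log10 convention for expressing the amplitude change in dB are assumptions for illustration, not the analysis code used in this study.

```python
import math

def stimuli_per_burst(rate_pps, burst_ms=300):
    """Number of pulses in one burst (300 ms bursts, as described above)."""
    return rate_pps * burst_ms // 1000

def fidelity_percent(n_responses, n_stimuli):
    """Percentage of stimuli that elicited a detectable response."""
    return 100.0 * n_responses / n_stimuli

def temporal_spread_us(latencies_us):
    """Change in response latency between the first and last response (µs)."""
    return latencies_us[-1] - latencies_us[0]

def stability_db(amplitudes):
    """Change in response amplitude between the first and last response,
    in dB (20*log10 convention assumed)."""
    return 20.0 * math.log10(amplitudes[-1] / amplitudes[0])

# Hypothetical burst at 100 pps in which 24 of 30 stimuli evoked a response.
n_stimuli = stimuli_per_burst(100)
fid = fidelity_percent(24, n_stimuli)
latencies_us = [1000, 1200, 1491]   # hypothetical latencies of detected responses
amplitudes = [2.0, 1.5, 1.0]        # hypothetical peak-to-peak amplitudes

if fid >= 50.0:  # inclusion criterion for temporal spread and stability
    print(fid, temporal_spread_us(latencies_us), round(stability_db(amplitudes), 1))
```

Under these definitions, an ideal response would show 100% fidelity, a temporal spread near 0 µs, and a stability near 0 dB.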

Fig. 6

Characterising the auditory nerve responses to high-rate stimulation using suprathreshold electrical (0.4–9 dB above threshold), suprathreshold (0.7–17 dB above threshold) optogenetic, or combined stimulation using three relative intensities of optical stimuli and electrical (− 1.7 to 9 dB) stimulation. Stimulation intensities were relative to the threshold at 4 pps. (A) Example optogenetic (top) and electrical (bottom) responses of a ChIEF transduced auditory nerve to 100 pps stimulation. Dark blue and dark red traces show recordings over time (triangles show peaks of evoked responses, i.e., fidelity). Blue-magenta traces show responses from the dark blue trace overlaid relative to the stimulus. Triangles show peaks and troughs of the first and last responses. Temporal spread is measured as the time difference between the peak of the first and last responses (1), and stability is measured as the relative change in the peak-peak response amplitude between the first and last responses (2). Blue bar represents the timing of the optical stimulus for the optogenetic response, and the red bar represents the timing of the electrical stimulus for the electrical response. (B) The percentage of responses elicited (fidelity) at 100 pps in H134R (i) and ChIEF (ii) transduced mice. (iii) The median percentage of responses elicited for H134R and ChIEF mice at 100–400 pps. (C) Change in response timing (temporal spread) across the burst at 100 pps for each channelrhodopsin. (D) Change in the response amplitude (stability) across the burst at 100 pps for each channelrhodopsin.

Comparisons of electrical-only and optical-only stimulation

In mice transduced with the H134R channelrhodopsin, auditory nerve responses to 100 pps suprathreshold (0.7–17 dB above threshold) optical stimulation showed significantly worse fidelity (median ± interquartile range: 20 ± 22%) compared to suprathreshold (0.4–9 dB above threshold) electrical-only stimulation (100 ± 0%) (p < 0.001, Wilcoxon-Mann–Whitney test) (Fig. 6Bi). Over the duration of the 300 ms burst, the responses to 100 pps optical stimulation increased in latency by 491 ± 167 µs and changed in amplitude by − 5.6 ± 0.9 dB, while there was minimal change to the responses to electrical stimulation (20 ± 26 µs change in latency and − 0.3 ± 0.4 dB change in amplitude) (Fig. 6C(i),D(i)).

Compared to H134R mice, the auditory nerve responses of ChIEF mice to suprathreshold 100 pps optical stimulation demonstrated significantly greater fidelity (80 ± 47%; p < 0.05, Wilcoxon-Mann–Whitney test), similar temporal spread (376 ± 347 µs; p > 0.4, Wilcoxon-Mann–Whitney test), and significantly poorer stability (− 8.3 ± 2.7 dB; p < 0.05, Wilcoxon-Mann–Whitney test) (Fig. 6B(ii), C(ii), D(ii)). Although the fidelity of auditory nerve responses to optical stimulation at 200 and 400 pps was not significantly different between H134R and ChIEF mice (p > 0.2, Wilcoxon-Mann–Whitney test), a subset of 200 pps trials in ChIEF mice achieved fidelity above 50% and were included in temporal spread and stability analyses (Fig. 6B(iii), Supplementary Fig. S2). Only a single trial in H134R mice achieved fidelity above 50%; it was analysed further (Supplementary Fig. S2), but no median was calculated for Fig. 6B–D(iii). None of the 400 pps trials for H134R or ChIEF achieved fidelity above 50%, so they were excluded from stability and temporal spread analyses (Fig. 6B(iii), Supplementary Fig. S2).

Compared to electrical stimulation, optical auditory nerve responses of ChIEF mice demonstrated significantly worse fidelity at all rates (p < 0.001, Wilcoxon-Mann–Whitney test) (Fig. 6B(iii), Supplementary Fig. S2). The temporal spread of optical auditory nerve responses in ChIEF mice was also significantly worse at 100 and 200 pps compared to electrical auditory nerve responses (p < 0.001, Wilcoxon-Mann–Whitney test) (Fig. 6C(iii), Supplementary Fig. S2). Although the stability of optical auditory nerve responses of ChIEF mice was significantly worse compared to electrical auditory nerve responses at 100 pps (p < 0.001, Wilcoxon-Mann–Whitney test), stability was not significantly different at 200 pps (p > 0.8, Wilcoxon-Mann–Whitney test) (Fig. 6D(iii), Supplementary Fig. S2).

Comparisons of combined stimulation (with suprathreshold optical stimulation) to optogenetic-only and electrical-only stimulation

Application of combined stimulation using suprathreshold optical stimulation significantly improved the fidelity compared to optogenetic-only stimulation in H134R mice at all rates (p < 0.05, Wilcoxon-Mann–Whitney test) (Fig. 6B(iii), Supplementary Fig. S2). This was also true for ChIEF mice at 200 and 400 pps (p < 0.05, Wilcoxon-Mann–Whitney test), but not at 100 pps, where optogenetic stimulation already demonstrated high fidelity (p > 0.4, Wilcoxon-Mann–Whitney test) (Fig. 6B(ii)). Compared to electrical stimulation, however, the fidelity of combined stimulation responses using suprathreshold optical stimulation was significantly worse (p < 0.05 at all rates and for both channelrhodopsins, Wilcoxon-Mann–Whitney test).

At 100 pps, temporal spread and stability were significantly improved for combined stimulation using suprathreshold optical stimulation compared to optical stimulation for both channelrhodopsins (p < 0.001, Wilcoxon-Mann–Whitney test) (Fig. 6C,D(i-ii)). The responses of ChIEF mice to 200 pps combined stimulation also demonstrated significantly better temporal spread (p < 0.01, Wilcoxon-Mann–Whitney test) but not stability (p > 0.8, Wilcoxon-Mann–Whitney test) over optical responses.

At 200 and 400 pps, electrical stimulation responses demonstrated significantly better temporal spread than the responses to combined stimulation using suprathreshold optical stimulation in ChIEF mice (p < 0.001, Wilcoxon-Mann–Whitney test). This was also true for H134R mice at 100 pps (p < 0.001, Wilcoxon-Mann–Whitney test) but not at 200 pps (p = 0.14, Wilcoxon-Mann–Whitney test) (Fig. 6C(iii), Supplementary Fig. S2). The stability of electrical responses was also significantly higher than the stability of responses to combined stimulation using suprathreshold optical stimulation at 200 and 400 pps for H134R mice (p < 0.05, Wilcoxon-Mann–Whitney test), but not for ChIEF mice (p > 0.19, Wilcoxon-Mann–Whitney test) (Fig. 6D(iii), Supplementary Fig. S2).

The fidelity of combined stimulation responses using suprathreshold optical stimulation was similar between H134R and ChIEF mice, except at 200 pps (H134R: 95 ± 72%, ChIEF: 56 ± 90%, p < 0.05, Wilcoxon-Mann–Whitney test) (Fig. 6B(iii), Supplementary Fig. S2). Temporal spread followed a similar trend, with responses at 200 pps demonstrating a significant difference (H134R: 36 ± 56 µs, ChIEF: 110 ± 182 µs, p < 0.05, Wilcoxon-Mann–Whitney test), but no difference at other rates (Fig. 6C(iii), Supplementary Fig. S2). The stability of responses to combined stimulation using suprathreshold optical stimulation differed significantly between the two channelrhodopsins at 100 and 200 pps (p < 0.01, Wilcoxon-Mann–Whitney test) but not at 400 pps (p = 0.1, Wilcoxon-Mann–Whitney test) (Fig. 6D(iii), Supplementary Fig. S2).

Opsin expression along the cochlea

To determine if the differences observed between H134R and ChIEF during high-rate stimulation might be a consequence of differences in transduction efficiency due to the injection or different virus preparations (see Methods), we conducted an immunohistological analysis of opsin expression in the apical, middle and basal turns of cochleae.

Histology showed high transduction of the channelrhodopsin in the auditory neurons of the left (injected) cochleae of H134R (n = 7) and ChIEF (n = 8) mice, as identified and quantified by the fluorescent YFP tag (Fig. 7A, B). The percentage of YFP-positive neurons was not significantly different between channelrhodopsin cohorts (p > 0.1, Kruskal–Wallis test; Fig. 7D), suggesting that it would not have significantly contributed to the differences in high-rate stimulation responses. A significant apical-basal gradient of YFP-positive cells was evident (Fig. 7D) (p < 0.05, RM-ANOVA). While not quantified, there was an apparent variability in YFP brightness between cells and, within some cells, patches of intense YFP labelling (Fig. 7C).

Fig. 7

Expression patterns of channelrhodopsin in the two virus-transduced cohorts. (A, B) Example fluorescence microscope images of auditory neurons in Rosenthal’s canal of the middle turn in H134R (left) and ChIEF (right) mice. Green = YFP (opsin), red = Tuj (auditory neurons), blue = DAPI (nuclei). Scale bar is 50 µm. Arrowheads indicate intense YFP labelling observed in a subset of cells. (C) Higher magnification images of intense YFP labelling indicated in (A, B). Scale bar is 5 µm. (D) Mean percentage of YFP-positive auditory neurons in the apical, middle and basal turns of H134R and ChIEF mice. Error bars represent standard error of the mean.

Discussion

This study compared the temporal characteristics of electrical, optogenetic, and combined stimulation of the auditory nerve using clinically relevant stimulation rates and a virus-transduced mouse model. Critically, it identified the subtle differences in responses of auditory neurons expressing channelrhodopsins with different channel kinetics and light sensitivity.

Optical responses

The activation thresholds measured for H134R and ChIEF were not consistent with expected thresholds based on findings in single cell recordings from cultured cells where the H134R opsin was reported to have a lower threshold compared to ChIEF. Many factors can influence threshold, including expression levels, expression distribution, protein trafficking, and stimulating equipment positioning32,37,50,51. Additionally, the faster kinetics of ChIEF result in higher synchronicity across the auditory nerve, and these synchronised compound action potentials lend themselves to being more readily identifiable from noise.

This study is the first to evaluate the reliability of optogenetic stimulation following day-long, high-rate stimulation. We observed substantial reduction of response amplitude and significant elevation in optogenetic activation thresholds over the course of repetitive optical stimulation for ChIEF transduced mice, resulting in complete loss of responses in a subset of mice. Optogenetic activation thresholds for H134R responses also became elevated over the day, but responses were not entirely lost. The possible reasons for the observed results include bleeding in the cochlea, unstable recording preparation, and channelrhodopsin bleaching. The first can be ruled out because the health of the mice was monitored closely, and intracochlear bleeding during recordings was not an issue for the mice included in this measure. Secondly, unlike the optogenetic responses, the activation threshold of electrical stimulation was relatively stable over the experiment duration, which indicates the recording preparation in the inferior colliculus was stable. Auditory nerve response amplitudes were also stable for electrical stimulation, as were optogenetic responses in H134R mice, but not in ChIEF mice. This might indicate a difference in the stability of the preparation between H134R and ChIEF mice, or that changes in the amplitude of responses in the auditory nerve are much harder to detect than changes in the activity of the inferior colliculus. Channelrhodopsin bleaching, however, is much harder to prove or disprove without a substantially larger cohort and accompanying in vitro study. Spectroscopy studies suggest that the bleaching phenomenon reflects a branched pathway that occurs between the ‘open’ state and ‘ground’ state of channelrhodopsin52,53. This phenomenon manifests as a reduction in photocurrents and elevation of optogenetic activation thresholds or as a complete loss of response to light stimulation. Although bleaching can last many minutes to hours, extended periods of darkness can recover the channelrhodopsins back to the ‘ground’ state. Bleaching has been reported for slow channelrhodopsins54, and Matarazzo et al.55 noted that high intensities (15 mW) or long pulse durations (5 ms) consistently resulted in loss of the optically evoked compound response in the sciatic nerve of H134R expressing transgenic mice. However, bleaching has not been studied in detail for rapid channelrhodopsins (i.e., ChIEF and H134R), possibly due to the relatively short duration of other studies and/or minimisation of the stimuli delivered. Indeed, bleaching of ChIEF and H134R appeared to correlate with the number of light stimuli presented in our study. In a subset of mice, we retested thresholds following 1–2 h of darkness, but did not observe an improvement in thresholds or response amplitude. It is possible that the 1–2 h were insufficient, and a longer period may be necessary to restore the response. Although this presents a challenge to the development of clinical optical stimulation, it is not prohibitive: turning off the device overnight and allowing bleached channelrhodopsins to return to a ground state may be a feasible solution in many applications. These results emphasise the need for extended and/or chronic stimulation studies for channelrhodopsins currently undergoing clinical translation. Techniques and encoding strategies that minimise the energy delivered, such as combined stimulation, would thus be more valuable for the clinical translation of optogenetics.

Combined facilitation

Unlike previous studies using transgenic H134R mice, we did not observe reliable facilitation of electrical activation thresholds at any light intensity for combined stimulation for either channelrhodopsin. In the case of H134R, and possibly ChIEF, this is likely a consequence of the difference in the method of mediating channelrhodopsin expression in the auditory neurons. Compared to transgenic mice, using an AAV-based transduction method resulted in fewer auditory neurons expressing the channelrhodopsin. In addition, the channelrhodopsin may not have been optimally trafficked to the cell membrane compared to transgenic mice used in previous studies. Indeed, similar results have been observed in previous studies of the inferior colliculus response. Although one study demonstrated facilitation of the electrical response using simultaneous optogenetic stimulation in a transgenic model35, another study using a virus-transduced model could not consistently achieve the same37. This indicates that the variability of channelrhodopsin expression in the auditory nerve resulting from viral transduction decreases the likelihood of overlapping the electrical and optogenetic stimuli and diminishes the success of combined stimulation at low rates.

Despite this, facilitation might be further improved by delaying the electrical pulse beyond the 1 ms used in this study. Indeed, several studies indicate that channelrhodopsins achieve peak photocurrent milliseconds after activation39,56. Although delaying the electrical stimulus to coincide with a higher photocurrent and/or greater optogenetic depolarisation can lead to better facilitation35, doing so limits the maximum stimulation rate that can be achieved, and should be considered carefully.

Lack of facilitation in ChIEF mice at 100 pps stimulation rates may also reflect a difference in the ion preference of those channelrhodopsins compared to H134R. A study by Hart et al.57 suggested that it is not merely the optogenetic depolarising currents that facilitate electrical activation thresholds, but rather the increase in intracellular calcium, which reduces the activation threshold of voltage-gated ion channels. The ion preference for ChIEF may be inferred from characterisation studies performed on ChEF, an opsin closely related to ChIEF. Ion selectivity was shown to be similar to H134R, with some differences in the permeability ratio of different cations between ChR2 and ChEF39.

High-rate stimulation

Compared to transgenic animals, fidelity at 100 pps for H134R was significantly lower, possibly as a result of the poorer quality and/or quantity of channelrhodopsin expression in transduced animal models33. Studies have shown that improving trafficking to the plasma membrane improves activation thresholds37 and the fidelity of responses at high stimulation rates32. Keppeler et al.32 demonstrated that the inclusion of a trafficking peptide motif at the end of the Chronos channelrhodopsin increased the maximum stimulation rate at which optical auditory brainstem responses could be elicited. Developments in gene therapy techniques that improve the consistency of transduction along the cochlea and the success of channelrhodopsin trafficking are likely to benefit all applications of optogenetic and combined stimulation.

ChIEF optogenetic-only responses demonstrated significantly higher reliability at 100 pps over H134R, likely a consequence of faster channel closing kinetics. Despite this, ChIEF responses were neither significantly more stable nor more temporally precise. Furthermore, reliability of ChIEF was not significantly improved over H134R responses at 200 and 400 pps, demonstrating that the improvements granted by faster channel kinetics are limited. Similar results have been observed with Chronos and vf-Chrimson, which exhibit channel closing kinetics of 3.6 ms40 and 2.7 ms56, respectively, much shorter than the ~ 9 ms of ChIEF. Studies of auditory neuron responses mediated by these channelrhodopsins indicate that spike probability drops below 50% by 200 pps even with the addition of trafficking peptide motifs to the channelrhodopsins32,41, and that they have significantly poorer fidelity compared to acoustic click responses at ≥ 100 pps58. Although this is described as “adaptation”, these studies show not the reduction in spike probability to which adaptation usually refers, but a complete loss of responses within the first few stimuli, and thus a complete halt of information delivery to the auditory system at these rates. Furthermore, these channelrhodopsins require high stimulating power for functional responses, raising further concerns regarding the feasibility of clinical translation of optogenetics for cochlear implants.

One could argue that the difference in performance between the two channelrhodopsins used in this study might also be influenced by the different virus capsids and viral titres. Given that we see no significant difference in the transduction efficiency in the immunohistochemistry, it is unlikely that this would be the most significant contributor to the results observed.

Importantly, the engineering and discovery of new channelrhodopsins continues today, with new variants with novel properties published regularly. It is possible that a channelrhodopsin that performs robustly in high-rate stimulation applications will become available in the future. Additionally, this application may benefit from the use of two opsins together. Bansal et al.59 conducted a theoretical analysis indicating that co-expressing a step-function opsin and fast channelrhodopsin in the same cell may permit long-term high-rate stimulation, as the two opsins overcome each other’s limitations. Although there are many technical challenges to achieving this, it may prove to be very effective as a treatment for hearing loss.

Although facilitation was not observed at low stimulation rates (4 pps), three characteristics of facilitated combined stimulation responses (fidelity, temporal precision and stability) were evident at the higher stimulation rates (100–400 pps). The presentation of many stimuli close together improves the probability of eliciting a combined response. For example, at 400 pps, the tissue is exposed to light for 40% of the burst stimulus.
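
The exposure figure quoted above follows from simple duty-cycle arithmetic, sketched below in Python. The 1 ms optical pulse width is an assumption inferred from the 40% figure at 400 pps and is not stated in this section; the function name is likewise hypothetical.

```python
def light_duty_cycle(rate_pps, pulse_width_ms):
    """Fraction of a burst during which the light is on.

    rate_pps: stimulation rate in pulses per second.
    pulse_width_ms: optical pulse width in milliseconds (assumed value).
    """
    return rate_pps * pulse_width_ms / 1000.0

# Rates used in this study, with an assumed 1 ms optical pulse.
for rate in (100, 200, 400):
    print(f"{rate} pps: light on for {light_duty_cycle(rate, 1.0):.0%} of the burst")
```

Under this assumption, the proportion of time the tissue is illuminated scales linearly with rate, from 10% at 100 pps to 40% at 400 pps.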

Combined stimulation using suprathreshold optogenetic stimulation showed very high reliability and stability even up to 400 pps for both channelrhodopsins. Although the temporal precision was poorer at these levels compared to electrical stimulation, combined stimulation still outperformed optogenetic-only stimulation whilst maintaining electrical-like reliability and stability, thus achieving a system that can deliver a spatially precise stimulus with high reliability and stability. Interestingly, the temporal properties of the combined stimulation responses for the two channelrhodopsins were similar, indicating that the closing kinetics do not strongly influence combined stimulation outcomes.

Whether these high rates of stimulation will be needed in optogenetic cochlear implants remains a point of debate in the field. Responses to optogenetic stimulation generally exhibit higher stochasticity than to electrical stimulation33,40, and this stochasticity may be sufficient to prevent the decrease in hearing outcomes caused by the hypersynchronicity of electrical stimulation60,61,62. However, temporal information such as the fundamental frequency (for speaker recognition and music appreciation) and temporal fine structure (for speech perception in noise) will require high fidelity stimulation in the range of 100–1000 Hz63,64,65.

Opsin expression

Using AAV gene therapy, we were able to successfully transduce a high proportion of auditory neurons with the channelrhodopsin genes. Similar to previous studies, we observed large variability of channelrhodopsin expression along the apical-basal gradient of the cochlea and between cells within the same turn36,37. We also report here sub-cellular areas of intense YFP expression in some cells. Given that the expression appears to be intracellular rather than on the cell membrane, this may be indicative of sub-optimal channelrhodopsin expression or trafficking in some cells. This may be caused by a number of cellular processes, including protein degradation or endosomal/lysosomal uptake. The cause and consequence of this phenotype require further investigation, but the fact that this phenotype is not observed in transgenic animals suggests that it might be resolved by optimising the viral constructs, a vital step towards the translation of optogenetics to the clinic.

Although virus transduction is a step towards clinical translation, the neonatal injections used here are an imperfect model for translatability. Unfortunately, the transduction efficiency of adult intracochlear injections (in mice and primates) remains highly variable37,47,51,66. Wrobel et al.51 improved the efficiency in adult Mongolian gerbils through the use of a direct modiolar injection technique that resulted in low but consistent transduction throughout the cochlea, at the cost of a ~ 25% loss of SGNs. Further work by Huet et al.66 showed slow rate intramodiolar injections could preserve the SGN population, but resulted in reduced transduction efficiency. Further research to improve viral transduction of adult auditory neurons in rodents is ongoing, and will be important for the translation of these technologies towards cochlear implants.

Limitations and clinical translation

For the best outcomes of combined stimulation, high spatial overlap of the two stimuli is critical. This is not easily achieved with viral methods of auditory neuron transduction in mice. Improving the quantity of transduced auditory neurons and the quality of channelrhodopsin expression will improve both optogenetic-only and combined stimulation, but combined stimulation will also require the clever design of devices that can maximise overlap between the two stimuli. In our study, stimulation was confined to the basal turn where expression was worst, although typically ~ 60% of neurons in that turn were transduced.

It is important to recognise that these studies were conducted in acutely deafened ears. Consequently, the auditory nerve had not undergone the degeneration seen in many chronically deaf ears. Previous studies have shown that degeneration resulting from chronic deafness can negatively impact the performance of both optogenetic and combined stimulation, demanding higher powers of optogenetic stimulation to achieve electrical-like temporal fidelity33. These results emphasise the importance of developing therapeutics to maintain or restore auditory nerve health, such as neurotrophins.

Conclusion

Novel stimulation methods are necessary for advances in cochlear implant recipient outcomes, and it is critically important to validate these methods in clinically translatable models. This study is the first to show that optogenetic responses can be severely diminished or lost during day-long, high-rate stimulation of auditory neurons expressing the rapid channelrhodopsin ChIEF. This has important implications for the preclinical assessment and translation of optogenetics to clinical applications.

Combined stimulation, previously shown to increase spatial resolution compared to electrical stimulation due to facilitation of the response, was shown here to also improve reliability and temporal precision of responses at 400 pps in mice transduced with the ChIEF and H134R channelrhodopsins. We reason that very high viral transduction efficiency is necessary for the priming effect of combined stimulation. This study provides important insight into the use of optogenetics for clinical applications.