Introduction

In the normal hearing process, sound waves enter through the outer ear, and travel along the external auditory canal to strike the eardrum, causing it to vibrate. These mechanical vibrations are transmitted to the inner ear via ossicles. Within the inner ear, the cochlea processes and encodes these vibrations into different frequencies and amplitudes. The traveling mechanical wave within the cochlea activates the corresponding hair cells on the basilar membrane based on the sound frequency. These hair cells then release neurotransmitters that stimulate the auditory neurons, conveying auditory information to the brain’s auditory cortex1,2,3.

Deafness may result from various factors that damage or disrupt components of the auditory system. Traditional hearing aids, which amplify sound, are commonly used to treat moderate-to-severe hearing loss4,5. In cases where the components responsible for transmitting vibrations, such as ossicles, are damaged, middle ear implants are employed. These implants capture incoming sounds through a microphone and convert them into micro-vibrations via a transducer implanted in the middle ear6,7. Dysfunction of hair cells leads to a loss of fine-tuning of incoming sound, which can result in severe-to-profound hearing loss. This type of hearing impairment is treatable with cochlear implants (CIs), which transform sound into electrical pulses that stimulate the auditory nerves8,9,10. However, for patients with cochlear or auditory nerve malformations, CIs are ineffective. In these cases, auditory brainstem implants (ABIs), which directly stimulate the cochlear nuclei at the brainstem using a process similar to CIs, provide a viable alternative11,12. Despite their effectiveness, cochlear implants, which are among the most successful neural prostheses to date, face challenges such as frequent battery replacement, susceptibility to water damage, and esthetic concerns regarding external components, prompting research into fully implantable cochlear implants (FICIs)4,13,14,15,16,17.

Research on FICIs primarily focuses on the design of implantable sensors, such as microphones, and the necessary interface electronics to stimulate auditory neurons. Briggs et al. proposed an electret-based microphone combined with a cochlear stimulator featuring both external and subcutaneous microphones to allow patients to switch between a regular CI and a fully implantable system18. Jung et al. introduced an implantable microphone with an acoustic tube to mitigate the filtering effects of skin and tissue19. Jia et al. developed a piezoelectric (PZT) microphone, encased in a titanium tube for biocompatibility, that senses sound through ossicle vibrations; this system was tested with fresh cadaveric heads20. Although these sensors show promising sound sensing capabilities, they output a single signal that requires electrical filters for frequency band allocation and processing. In our group’s previous work, Beker et al. developed a PZT transducer that acts as a mechanical filter for ambient sound within a specific frequency band21. This prototype, however, only supports single-channel operation; the system’s size and mass, limited by ear anatomy, constrain multi-channel operation. Our design overcomes this by using thin-film pulsed laser deposited (PLD) PZT sensors, which provide 8-channel outputs by mechanically filtering and discriminating the sound signal into different frequency bands22,23.

Interface electronics are crucial for converting these implantable sensors into functional hearing devices. Sarpeshkar et al. proposed an ultra-low power, programmable analog bionic ear processor for an electret microphone, capable of handling a wide input dynamic range to accommodate most daily sound levels. This processor generates control signals for the electrodes that stimulate the auditory neurons24. Similarly, Georgiou et al. reported an analog speech processor and stimulator for a totally implantable cochlear prosthesis system, featuring a single-chip design with a processor power dissipation of 126 μW25. Both cochlear implant circuits24,25 primarily focus on front-end signal conditioning and lack a comprehensive neural stimulation unit, essential for evoking responses from auditory neurons. Power consumption is the most critical design factor for FICI systems since battery replacement of an implantable system involves complex reimplantation surgery. A low-power, implantable signal conditioning IC for a PZT middle ear sensor, reported in ref. 26, achieves reduced power consumption of the overall interface electronics below 600 µW with an optimized stimulation pulse shape. However, the use of a single-sensor architecture with eight digital filters for frequency isolation increased the power dissipation. Jang et al. introduced a MEMS device that senses sound through mechanical filters and provides electrical stimulation to auditory neurons27. However, the proposed mechanical filters struggle to capture low frequencies (<3 kHz) and their low output voltage limits the sensing capability of the system. Additionally, the off-chip implementation of the interface circuit for neural stimulation fails to meet the power, compression rate, stimulation current level, and patient fitting requirements of FICI systems.

Here, we present a custom-designed FICI system featuring a low-profile MEMS acoustic hearing sensor and a low-power signal-conditioning circuit for long-lasting operation. The compact, highly sensitive acoustic sensor, attachable to the middle ear, offers efficient mechanical filtering for vibrations present in everyday sounds. The sensor’s electrical output is then processed by a low-power signal-conditioning circuit, efficiently stimulating auditory neurons. The electrical stimulations of auditory neurons corresponding to the selected frequency of incoming sound are conducted through cochlear electrodes, based on the signal processing performed by the system. The signal-conditioning circuit’s programmability allows customization of the stimulation current level and range to match each patient’s threshold and most comfortable levels. The implemented system was validated in vivo using an animal model, confirming its broad operational range.

Results

Design and realization of full-custom FICI system

The full-custom FICI system designed to sense daily sound signals and stimulate auditory neurons according to sound levels and frequencies is presented in Fig. 1a28,29. Sound vibrations are converted to mechanical vibrations on the eardrum and sensed by a low-profile acoustic sensor comprised of multi-channel PZT sensors acting as mechanical filters. The outputs of these sensors were evaluated by low-power signal-conditioning interface electronics, which generated stimulation pulses for auditory neurons. The neural stimulation pulses were delivered to auditory neurons through an intra-cochlear electrode. The interface electronics was powered by a rechargeable battery, chargeable through a radio frequency (RF) coil also used for post-implantation patient fitting applications.

Fig. 1: Full-custom fully implantable cochlear implant system architecture and sub-components.
a Detailed representation of the full-custom fully implantable cochlear implant (FICI) system, where the piezoelectric sensor is located on the eardrum, the sensor outputs are provided to the signal conditioning circuit through flexible interconnects, and the electrical stimulation pulses generated by the signal conditioning circuit are delivered to the cochlea through intra-cochlear electrodes. The power and patient fitting controls of the signal conditioning circuit can be delivered through a rechargeable battery and a radio frequency coil. b 3D illustration of the bilayer sound sensor including spacers, caps and flexible interconnect connections. c Fabricated sound sensor with an overall size of 3.5 × 3.5 mm². d Micrograph of the fabricated signal conditioning circuit with highlighted sub-blocks, where LA stands for logarithmic amplifiers, CR for current rectifiers, LPF for low-pass filters, S/H for sample and hold, Ctrl for control unit, V/I for voltage-to-current converter and SM for switch matrix circuits. e Schematic representation of the signal conditioning interface electronics.

Key considerations for sound sensing and processing in the middle ear include: (1) The compact structure of the sound sensor due to restrictive volume and area in the middle ear and intricacies of middle ear surgery; (2) placement of the sound sensor directly on the ossicular chain without anchoring to the middle ear walls to avoid reducing the middle ear’s vibration level through mass and force loadings; (3) the crucial role of the number of frequency bands in maintaining natural sound characteristics, despite the increased processing power required; (4) human voice mainly includes frequencies between 250 and 4000 Hz, with lower frequencies encompassing body noise and higher frequencies necessary for speech perception in noisy environments, hence the resonance frequencies of the frequency bands should be appropriately placed; and (5) the need for custom-designed ultra-low-power interface circuits to efficiently stimulate auditory neurons, with a novel PZT sensor designed for a wide dynamic range and maintaining a high signal-to-noise ratio at the output.

Commonly used PZT sensors for this application are beam structures with piezoelectric films in the 31-mode, which can produce high strain depending on the deflection of the beam. Adding a tip mass increases deflection and strain on the piezoelectric film, thus improving the output voltage. Additionally, these masses allow for denser structures by reducing the beam lengths, enabling sound detection from the vibration of the hearing chain while resonating in the hearing frequency range. The center frequencies of the bands are matched with the resonance frequencies of the beams. Previous studies indicate that 8-channel systems are ideal for low-power applications, with frequencies spread linearly from 300 to 1200 Hz and distributed logarithmically above 1200 Hz in daily sound signals30. Thin-film PLD-PZT (MESA+ Institute, Netherlands) was selected for its superior piezoelectric properties, allowing for a wide dynamic range.
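The mixed linear and logarithmic band placement described above can be illustrated with a short sketch. It assumes four linearly spaced centers between 300 and 1200 Hz followed by four centers with a constant frequency ratio up to 4800 Hz; the equal-ratio rule, the 4800 Hz upper limit, and the function name `channel_centers` are illustrative assumptions, and fabricated beam frequencies need not coincide with this idealized spacing.

```python
def channel_centers(f_low=300.0, f_knee=1200.0, f_high=4800.0,
                    n_linear=4, n_log=4):
    """Idealized band centers: linear up to f_knee, geometric above it.

    All default values are assumptions chosen for illustration only.
    """
    # Linear part: equal steps from f_low to f_knee (inclusive).
    step = (f_knee - f_low) / (n_linear - 1)
    linear = [f_low + i * step for i in range(n_linear)]

    # Logarithmic part: constant ratio between neighbours, ending at f_high.
    ratio = (f_high / f_knee) ** (1.0 / n_log)
    log_part = [f_knee * ratio ** k for k in range(1, n_log + 1)]

    return linear + log_part


centers = channel_centers()
for idx, f in enumerate(centers, start=1):
    print(f"channel {idx}: {f:.0f} Hz")
```

Under these assumptions the upper four centers are separated by a ratio of about 1.41 (half an octave), which shows how a logarithmic layout covers the range above 1200 Hz with few channels.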

The volume and area constraints of the implantable system were managed with a bilayer transducer design. Channels were located with respect to their length to optimize the dimensions of the structure. The layer consisting of three channels includes beams resonating at 300, 1200, and 1600 Hz. The layer consisting of five channels includes beams resonating at 600, 900, 2200, 3200, and 4800 Hz, as depicted in Fig. 1b. This new generation design has an active volume of 3 mm × 3 mm × 0.36 mm for each layer and a total active mass of 5.2 mg. The structure is compactly packaged with a total size of 3.5 mm × 3.5 mm × 1.52 mm and a total mass of 20.1 mg, staying below the loading limit of excess mass on ossicles during vibration. Unlike our previous sound sensor structures, the design was further optimized by reducing beam thickness to enhance the output levels of the sound sensor. This adjustment increased the stress levels on the PZT layer, enabling the sensor to generate ~300 mVpp output voltage under 100 dB SPL. The signal-to-noise ratio (SNR) of the channels of the transducer was measured in the absence of any input signal. SNR levels between 66.8 and 84.2 dB were obtained as a result. Table S1 displays the calculated SNR levels of the channels, noting that tests were not conducted in an anechoic chamber, and line noise on the system reduces actual performance, particularly at low frequencies.

The signal conditioning circuit, at the subsequent stage, utilized the voltage signals from the multichannel acoustic PZT sensor to build neural stimulation pulses31. One critical parameter of implantable systems is power consumption, which constrains the device’s operating time. Therefore, the signal conditioning circuit was primarily designed in current mode to minimize power losses in current-to-voltage conversion and vice versa, as shown in Fig. 1e. Another important parameter is the input dynamic range, which greatly affects speech perception. Daily sound levels range between 40 and 100 dB SPL16, and previous studies have demonstrated that a 50 dB input dynamic range provides adequate speech perception for multichannel cochlear implants32. Although the daily sound range is broad, the electrical dynamic range of the cochlea is about 20 dB32,33. Consequently, a low-power wide-range Logarithmic Amplifier (LA) is employed as the first stage to logarithmically compress the input sound range and amplify the limited amplitude of size-constrained PZT sensors. Our amplifier can fit a 60 dB input dynamic range into ~15 dB (Fig. S1). The amplified current output of the LA is fed to an original multiplying current rectifier, which applies additional amplification in current mode. The output current of the rectifier at each channel is then filtered and sampled with a sample/hold circuit to generate a reference bias for the stimulation current generator that drives the auditory neurons with the required high currents. The current generator circuit also operates as a 7-bit digital-to-analog converter for the threshold and most comfortable level adjustment, enabling easy and wide-range control of the current by adjusting these digital signals. To ensure charge neutrality, the generated current is converted into a biphasic pulse via a switch matrix. 
The switch matrix targets the corresponding electrode (E1-E8) for each channel that matches the PZT sensor frequencies with auditory neurons. In order to further reduce the power dissipation, the LAs at each channel are powered down when inactive. The control unit generates power enable signals for the LAs, selection signals to switch between channels for sampling, and switch matrix control signals for 8-channel continuous interleaved sampling stimulation strategy, where each electrode is stimulated sequentially without any overlapping time to prevent interference between channels. The control unit is designed to accommodate different channel modes, allowing the system to operate at 1, 4, 6, or 8 channels and enabling user trade-offs between sound perception quality and power dissipation.
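The continuous interleaved sampling strategy described above can be sketched as a simple time-slot schedule. The sketch assumes a 1 kHz per-channel pulse rate and a 50 μs duration for each phase of the biphasic pulse (both values appear later in this section, but reading the 50 μs pulse width as a per-phase value is an assumption), with no inter-phase gap. The function name `cis_schedule` and its parameters are hypothetical and do not describe the control unit's actual implementation.

```python
def cis_schedule(n_channels, rate_hz, phase_us, gap_us=0.0):
    """Return one (start_us, end_us) slot per channel inside one frame.

    Each channel fires a biphasic pulse (two phases plus an optional gap),
    and slots are spread evenly so that no two pulses overlap in time.
    """
    frame_us = 1e6 / rate_hz                 # duration of one stimulation frame
    pulse_us = 2.0 * phase_us + gap_us       # total biphasic pulse duration
    if n_channels * pulse_us > frame_us:
        raise ValueError("pulses do not fit in one frame without overlap")
    slot_us = frame_us / n_channels
    return [(ch * slot_us, ch * slot_us + pulse_us) for ch in range(n_channels)]


# The channel modes mentioned in the text: 1, 4, 6, or 8 channels.
for mode in (1, 4, 6, 8):
    slots = cis_schedule(n_channels=mode, rate_hz=1000.0, phase_us=50.0)
    print(mode, [f"{s:.1f}-{e:.1f}" for s, e in slots])

slots = cis_schedule(n_channels=8, rate_hz=1000.0, phase_us=50.0)
```

With these assumed timings, eight biphasic pulses occupy 800 μs of each 1000 μs frame, so sequential stimulation without overlap remains feasible in the 8-channel mode.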

The power dissipation of the signal conditioning circuit while operating at 8-channel mode is below 600 µW, comparable with state-of-the-art FICI interface circuits, and enables long-lasting operation with an implantable battery. Power consumption of the interface circuit can be distributed as front-end signal conditioning, covering from logarithmic amplifier to the sample hold circuit, and the neural stimulation circuit, which is the most power-hungry part of the interface including the stimulation current generator and the switch matrix at the final stage. The power distribution of the FICI interface is 9.6 µW at the front-end signal conditioning circuit and 580.2 µW at the neural stimulation part with a typical biphasic rectangular current pulse. This power consumption can be further decreased by optimizing the stimulation pulse shape, which helps reduce the voltage compliance at the stimulation electrodes. Figure S2 presents the stimulation current and voltage on the electrode with a typical rectangular pulse shape that is applied in this study and an optimized exponential pulse shape where the voltage on the electrode is reduced by around 20% with the pulse shape optimization. The pulse shape optimization enables reducing the power consumption of the neural stimulation part to 460 µW with the reduced supply voltage. Therefore, the power consumption of the overall system can be kept below 500 µW while it is expected to achieve the neural stimulation threshold with a lower stimulation current level with the optimized exponential current pulse shape.
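The power figures quoted above can be checked with a few lines of arithmetic. The sketch below only adds the reported front-end and stimulation contributions for the two pulse shapes; the variable and function names are illustrative.

```python
# Reported contributions (all in microwatts), taken from the text above.
FRONT_END_UW = 9.6      # logarithmic amplifier through sample/hold
STIM_RECT_UW = 580.2    # neural stimulation, rectangular biphasic pulse
STIM_EXP_UW = 460.0     # neural stimulation, optimized exponential pulse


def total_power_uw(front_end_uw, stim_uw):
    """Total interface power as the sum of the two reported blocks."""
    return front_end_uw + stim_uw


rect_total = total_power_uw(FRONT_END_UW, STIM_RECT_UW)
exp_total = total_power_uw(FRONT_END_UW, STIM_EXP_UW)
stim_saving = 1.0 - STIM_EXP_UW / STIM_RECT_UW

print(f"rectangular pulse total: {rect_total:.1f} uW")
print(f"exponential pulse total: {exp_total:.1f} uW")
print(f"stimulation power saving: {100 * stim_saving:.1f} %")
```

The totals stay below the 600 µW and 500 µW bounds stated in the text, and the stimulation stage accounts for nearly all of the budget in both cases.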

The implantable rechargeable batteries that can power the FICI interface are subject to international standards. A possible candidate for the full-custom FICI is the Contego 50 mAh, a Li-ion implantable-grade battery (ISO 13485) with a titanium casing34. The 50 mAh capacity and 4.1 V voltage rating (205 mWh) of the battery allow more than 10 days of operation at 80% cycles when the FICI operates uninterruptedly all day. Contrary to conventional cochlear implants, which acquire power in real time using a coil pair, in FLAMENCO, wireless charging is applied. When charging the battery through an inductive coil link, the maximum inductive power that can be transferred is limited by the part of the body receiving the energy and by the heat generated in the surrounding tissue. According to these limitations, the power density in the local head region cannot exceed 10 W/kg when averaged over a 6-min period and cannot exceed 20 W/kg over any 10 consecutive seconds35. In addition, temperature rise is limited to 1 °C. In light of these limitations, the charging current is limited to 35 mA, meaning that the battery can be charged in less than 2 h.
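The operating-time and charging-time estimates above follow from simple energy arithmetic, sketched below. The sketch interprets "80% cycles" as using 80% of the rated capacity per charge cycle, takes the sum of the reported power contributions (589.8 µW) as the load, and treats charging as constant-current with no losses or taper; all three are simplifying assumptions, and the variable names are illustrative.

```python
CAPACITY_MAH = 50.0        # rated battery capacity
VOLTAGE_V = 4.1            # rated battery voltage
USABLE_FRACTION = 0.8      # assumed reading of "80% cycles"
LOAD_UW = 589.8            # 9.6 uW front end + 580.2 uW stimulation
CHARGE_CURRENT_MA = 35.0   # charging current limit from the text

energy_mwh = CAPACITY_MAH * VOLTAGE_V            # about 205 mWh
usable_mwh = energy_mwh * USABLE_FRACTION        # energy available per cycle
load_mw = LOAD_UW / 1000.0

runtime_h = usable_mwh / load_mw                 # continuous operation
runtime_days = runtime_h / 24.0

# Idealized constant-current charge of the full capacity.
charge_time_h = CAPACITY_MAH / CHARGE_CURRENT_MA

print(f"stored energy: {energy_mwh:.0f} mWh")
print(f"runtime: {runtime_days:.1f} days")
print(f"charge time: {charge_time_h:.2f} h")
```

Under these assumptions the runtime comes out above 11 days and the charge time below 1.5 h, consistent with the "more than 10 days" and "less than 2 h" figures; a real charger with a constant-voltage phase and conversion losses would take somewhat longer.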

In vitro validation of the FICI system

The hearing sensor was combined with a signal conditioning circuit and tested under different acoustic conditions to validate the sound sensing and electrical stimulation performance of the FICI. The acoustic PZT sensor was located on an artificial membrane that mimicked the eardrum. The input sound was applied through Etymotic Research ER-2 insert earphones, which were controlled by an audio amplifier, and the sound signal was generated by a signal generator. One of the earphones was plugged into the designated cavity, while the second was used to calibrate the sound levels. The acoustic sensor converted the incoming sound into voltage signals; its response is presented in Fig. 2a. Even at the lowest sound level (40 dB SPL), the sensor output was more than 150 μV, well above the input-referred level of the signal conditioning circuit (10 μVrms), thus providing a high signal-to-noise ratio. The response for a combination of all 8 PZT sensor channels at a typical SPL (70 dB) that commonly occurs in daily life is given in Fig. S3. The channel bandwidths are wide enough to cover the full frequency band from 250 Hz to 6 kHz. The signal conditioning circuit sensed a wide range of input sound signals and provided a linear stimulation response to distribute the input sound signal to the electrical stimulation range of the cochlea. Figure 2b presents the sensing and neural stimulation response of the signal conditioning circuit with the acoustic sensor. The response showed that the FICI can capture sound signals between 45 and 100 dB SPL. For this test, a 616 Hz sound signal, which corresponds to the resonance frequency of the second channel of the hearing sensor, was applied. The signal conditioning circuit generated a biphasic current pulse to stimulate auditory neurons, where the amplitude of the current varied according to the input sound level. 
The current pulses were generated at 21.1 Hz with 50 μs pulse width, which are typical values used for electrically evoked compound action potentials27. The stimulation frequency could also be tuned to 1 kHz, which is a typical value used in CI systems. For input sound ranges between 45 and 100 dB SPL, the system was able to generate stimulation currents in the range between 250 μA and 1 mA. The evenly distributed sound signals can vary from device to device; therefore, a 7-bit controller and a calibration circuit were added to arrange minimum-threshold and maximum-current levels for stimulation. This also provides patient fitting control to cover inter-recipient variability.
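The reported input-output behavior can be approximated with a small model that maps sound level to stimulation current and then to charge per pulse phase. The sketch assumes a straight-line relation between dB SPL and current across the reported end points (45 dB SPL to 250 μA, 100 dB SPL to 1 mA), clamps inputs outside that range, and treats the 50 μs pulse width as the duration of one rectangular phase. These are simplifications for illustration; the function names are hypothetical and the measured response in Fig. 2b may deviate from a straight line.

```python
def stim_current_ua(spl_db, spl_min=45.0, spl_max=100.0,
                    i_min_ua=250.0, i_max_ua=1000.0):
    """Map sound level (dB SPL) to stimulation current (uA), linear in dB."""
    spl = min(max(spl_db, spl_min), spl_max)          # clamp to reported range
    frac = (spl - spl_min) / (spl_max - spl_min)
    return i_min_ua + frac * (i_max_ua - i_min_ua)


def charge_per_phase_nc(current_ua, phase_us=50.0):
    """Charge of one rectangular phase: uA x us gives pC, so scale to nC."""
    return current_ua * phase_us * 1e-3


for spl in (45, 60, 80, 100):
    i_ua = stim_current_ua(spl)
    q_nc = charge_per_phase_nc(i_ua)
    print(f"{spl} dB SPL -> {i_ua:.0f} uA, {q_nc:.1f} nC per phase")
```

With these assumptions the charge per phase spans roughly 12.5 to 50 nC over the operating range, which gives a sense of scale for the fitting range that the 7-bit controller has to cover.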

Fig. 2: In vitro characterization of piezoelectric sensor and signal conditioning circuit.
a Acoustic piezoelectric sensor response of the second and seventh channels which have resonance frequency at 616 and 2931 Hz, respectively, where the responses were measured in the SPL range between 50 and 100 dB. The minimum detected sensor output was around 150 μV, above the minimum detection level of the signal conditioning circuit. b Generated neural stimulation current of the signal conditioning circuit at varying sound pressure level between 45 and 100 dB SPL. The stimulation current level varies between 250 μA and 1 mA and could be further extended using patient fitting control of the circuit.

Verification of the FICI using an in vivo animal model

Figure 3a presents the schematic representation of an in vivo experimental setup for the full-custom FICI system where the intracochlear electrodes were surgically placed in the cochlea of the animal model. The low profile PZT hearing sensor was mounted on a parylene membrane, placed on a sensor holder with a cavity 10 mm in diameter and 2 cm in length to mimic the ear canal23,36. The sound input of the system was provided through the Etymotic Research ER-2 insert earphones where the earphone was plugged into the designated cavity. The PZT sensor on the membrane converts the sound vibration into electrical signals, which are processed by the signal conditioning circuit. The signal conditioning circuit generates biphasic neural stimulation current pulses to activate the auditory neurons of the guinea pig through the intracochlear electrode array (MED-EL GmbH, Innsbruck, Austria), which was inserted through the round window into the scala tympani of the cochlea. The electrode array includes a reference electrode, placed extra tympanically on the bony wall of the bulla, and two stimulating intracochlear electrodes, which were inserted into the cochlea through the scala tympani. The stimulation electrodes were connected to the second and seventh channels of the FICI system and were tested with different frequencies to demonstrate the frequency selectivity of the system. The stimulation response was observed by the electrical auditory brainstem response (eABR) measurements, acquired via the optical amplifier and the Universal Smart Box of Intelligent Hearing Systems. Figure 3b shows an in vivo experiment setup used to validate the capability of the full-custom FICI system.

Fig. 3: Demonstration of the in-vivo experimental setup with an animal model.
a Schematic representation of an in vivo experimental setup for the full-custom fully implantable cochlear implant system, where the intracochlear electrodes were surgically placed in the cochlea of the animal model. The piezoelectric sensor was mounted on a parylene membrane, which is placed on a sensor holder that has a cavity with 10 mm diameter and 2 cm length to mimic the ear canal. The sound input of the system was provided to the cavity through the Etymotic Research ER-2 insert earphone. The piezoelectric sensor outputs are processed by the signal conditioning circuit, which generates biphasic current pulses to stimulate auditory neurons of the guinea pig. The current pulses are provided through an intracochlear electrode array which was inserted through the round window into the scala tympani of the cochlea. The stimulation response was observed by the electrical auditory brainstem response (eABR) measurements, acquired via the optical amplifier and the Universal Smart Box of Intelligent Hearing Systems (IHS). b The full-custom FICI system under the in vivo tests on an animal model, which also shows the surgically placed intra-cochlear stimulation electrodes.

To validate the system performance, eABR responses of six ears of Hartley guinea pigs (one ear each of four animals and both ears of one animal) were tested and analyzed. Before measuring eABR responses, ototoxic deafening was performed to reduce the electrophonic activity in the eABR. Ototoxic deafening was achieved by administering 600 mg/kg kanamycin (Kanovet, Vetaş, İstanbul, Turkey) by intramuscular injection, followed after 1 hour by 75 mg/kg intraperitoneal furosemide (Lasix; Avetis Pharma, Istanbul, Turkey), once a week27,37,38. The deafening process of the guinea pigs was observed by checking the click and tone burst stimulus-evoked ABR responses. Figure 4a shows the response of an animal with normal hearing, whereas Fig. 4b represents the response after ototoxic deafening was performed on that animal.

Fig. 4: In vivo experiment results with different sound levels.
a Representative acoustic auditory brainstem response (ABR) waves with click stimulus (8–10 kHz) in a guinea pig with normal hearing before the deafening protocol. The stimulus was presented at a frequency of 21.1 pulses per second with 512 repetitions. Stimuli level 30–90 dB sound pressure level (SPL) is shown in 10 dB steps. Roman numerals denote the ABR wave numbers. The wave amplitudes and latency increase and decrease, respectively, with increasing dB SPL. b Representative acoustic ABR waves with click stimulus (8–10 kHz) in a deafened guinea pig after the deafening protocol. The stimulus is presented at 21.1 pulses per second with 512 repetitions. Stimuli level 90 dB SPL is shown for two different recordings. c Representative electrical auditory brainstem response (eABR) waves to FICI stimulation in a deafened guinea pig at the second channel of the piezoelectric sensor with pure-tone acoustic stimuli at 616 Hz, which is the corresponding resonance frequency. A biphasic pulse train is presented to electrode #1 in monopolar configuration at 21.1 pulses per second with 512 repetitions. Stimuli level 40–100 dB SPL is applied in 10 dB steps. The wave amplitudes and latency increase and decrease, respectively, with increasing dB SPL. d Representative eABR waves to FICI stimulation in a deafened guinea pig at the second channel of the piezoelectric sensor with pure-tone acoustic stimuli at 2922 Hz, which is an off-resonance frequency of the cantilever. As expected, no response was observed from the eABR recording at the off-resonance stimulation. e Representative eABR waves to FICI stimulation in a deafened guinea pig at the seventh channel of the PZT sensor with pure-tone acoustic stimuli at 2922 Hz, which is the corresponding resonance frequency. A biphasic pulse train is presented to electrode #2 in monopolar configuration at 21.1 pulses per second with 512 repetitions. Stimuli level 40–100 dB SPL is shown in 10 dB steps. 
The wave amplitudes and latency increase and decrease, respectively, with increasing dB SPL. f Injected charge level at each pulse phase of the FICI stimulation in a deafened guinea pig which is excited with pure-tone acoustic stimuli.

FICI system can evoke eABRs

The auditory brainstem response (ABR) is a compound action potential evoked in response to transient auditory stimuli, typically clicks or tone bursts. It is characterized by a series of waves representing different stations along the auditory pathway. Wave II was quantified in the analysis because the peak and trough of Wave II were reliably evoked by the acoustic stimuli and electrical pulse shapes applied to the guinea pigs. To verify whether FICI system pulses could generate an electrically evoked ABR (eABR), the system was applied to guinea pigs, and their eABRs were measured at varying SPLs.

Figure 4c depicts the eABR recordings with acoustic stimulation from 40 to 100 dB SPL at 616 Hz, which is the resonance frequency of the second channel of the PZT sensor. A stimulus artifact in the eABR recording occurred at the onset of the stimulus. Response recordings between 4 and 6 ms may be affected by the digastric muscle response. The eABR recordings of four deafened guinea pigs (both ears for one of them) are presented in Fig. S4. The mean eABR peak latency of Wave II was 1.37 ± 0.15 ms (n = 6) and the mean magnitude of Wave II was 0.78 ± 0.55 µV (n = 6). Decreasing the stimulus level resulted in reduced amplitudes and prolonged peak latency. As expected, this effect on latency increased systematically from Wave I to IV. As a result, the threshold of the FICI system was recorded as 45 dB SPL, based on Wave II. Although an electrical artifact occurred at lower input sound levels, no eABR response was observed, indicating that the generated charge was not enough for excitation.

For further confirmation of the frequency selectivity, the second and seventh channels of the PZT sensor were subjected to the same experiment with acoustic stimuli at 616 Hz (resonance frequency of the second channel) and 2931 Hz (resonance frequency of the seventh channel). The frequency selectivity of the PZT sensor was assessed by recording eABRs while connecting the respective output channel of the signal conditioning circuit to the intracochlear electrode. The designed system generated stimulation pulses only if the input sound frequency matched the frequency band of the corresponding channel. When an off-resonance frequency of the corresponding channel was applied, no stimulus artifacts or responses were observed in the eABR recordings. Figure 4d presents the measured eABRs using the second channel of the PZT sensor while applying a pure-tone sound at 2931 Hz. Since it was insensitive to off-resonance frequencies, the cantilever generated too low an output to initiate electrical stimulation from the signal conditioning circuit. Similarly, the stimulation performance for the seventh channel at its resonance frequency (2931 Hz) is given in Fig. 4e, which also has a threshold level of 45 dB SPL. The off-resonance response of the seventh channel was similar to the one shown in Fig. 4d. Figure 4e shows the amplitude of Wave II as a function of the input sound level. The amplitude of Wave II increased with the level of input sound pressure sensed by the PZT sensor. These eABR recordings clearly demonstrate that the FICI system could electrically stimulate auditory neurons using input acoustic stimuli. Figure 4f depicts the relation between input sound levels and injected charge.

Discussion

A full-custom FICI system was developed by considering all the limitations and requirements of a CI, and its performance was validated with in vivo animal models. Previous FICI publications highlighted either the sound sensor or the signal processing circuits. On the sensor side, both capacitive and piezoelectric methods have been investigated to achieve implantable acoustic transducers. Young et al. attached a MEMS capacitive accelerometer-based transducer to the umbo, but the sensor’s output was considerably low and their readout circuit consumed too much power (~4.5 mW), which limited the battery usage of the implantable system39. Conversely, the output level of piezoelectric approaches could be further improved by increasing strain levels on the structure, making the piezoelectric approach more feasible for implantable applications. Jang et al. developed an artificial basilar membrane using an AlN piezoelectric cantilever array, featuring an 8-channel distribution spanning from 2.92 to 12.6 kHz. This design produced 4.06 mV at a frequency of 7.04 kHz under 101.7 dB SPL27. However, the covered frequency bands are not suitable for hearing systems. Zhao et al. designed an AlN piezoelectric intracochlear acoustic transducer featuring four cantilever beams. This novel approach managed to produce up to 79.7 μV at 95 dB SPL in guinea pigs. However, the system required specialized sensing and amplification stages, which posed challenges for low-power applications40.

In the implantable system we present, ambient sound was sensed by the cantilevers via piezoelectric transduction. Implementing a PZT sensor that could provide high signal levels while covering the acoustic band with sufficient channels in the small volume available in the middle ear was challenging. During the development of the bilayer multi-channel sound sensor, a bulk piezo-ceramic PZT energy harvester prototype had been developed for the conceptual sound sensor and energy harvester system by our group21. Although the bulk PZT approach could generate significant energy, it would not fit into the middle ear for sound sensing. Consequently, we designed a thin-film single-channel sound sensor system as a proof of concept22. The device, placed on a parylene membrane for experimental characterization, generated 114 mVpp output voltage under 110 dB SPL at 1325 Hz. Subsequently, an 8-channel sound sensor with thin-film PLD-PZT layers was designed for the FICI concept. Dimensions of the sound sensor were suitable for middle ear implantation, with an active volume of 5 × 5 × 0.6 mm³ and a mass of 4.8 mg. An output of 50.7 mVpp was obtained on an artificial tympanic membrane under 100 dB SPL at 652 Hz. The correlation between input and output waveforms showed that such a system can solve the major problem of a middle ear implantable sound sensor23. In this study, we designed and fabricated the bilayer sound sensor by considering the surgical operation methods and optimizing the design using finite element model simulations. The bilayer system surpasses the previous one in terms of both volume efficiency and output levels. The structure and the design parameters of the transducer were considerably improved from previous studies. During the development of this concept, the configuration of channels and the layered structure were not the only factors taken into consideration. 
During the system design phase, each channel’s parameters were configured individually for its respective resonance frequency. Unlike our previous research, where all channels had uniform width, this variation in channel dimensions resulted in significant differences in structural strength, noise levels, output performance, and filter characteristics. These differences are evident when the earlier results are compared with those presented in this study. With this design, the overall volume of the transducer structure alone decreased by 60%, while output levels were enhanced by 15 dB. The latest design iteration is compact, with each layer having an active volume of 3 mm × 3 mm × 0.36 mm and a combined active mass of 5.2 mg. The entire structure is packaged in a total size of 3.5 mm × 3.5 mm × 1.52 mm and weighs 20.1 mg. This configuration remains within the loading limit for additional mass on the ossicles during vibration, ensuring the maximum obtainable vibration levels. The output signals generated by each channel (Fig. S3) can excite the interface electronics while covering the entire hearing frequency spectrum at a given excitation level. At low sound levels (i.e., 40–50 dB SPL), output levels at the frequency band edges become comparable with the input-referred noise of the interface electronics and cause dead frequency zones, which can be mitigated by vacuum packaging of the transducer on the sensor side to further enhance the signal-to-noise ratio. Moreover, this generation of transducers covers a wider range of the audible frequency spectrum while excluding irritating low frequencies. This limitation is in place to reduce the sensor’s vulnerability to vibrations caused by physical movements, particularly those at frequencies up to 200 Hz. For implantable sensors, the recommended bandwidth starts at 250 Hz and extends upward.
This setup is designed to address the spectrum relevant to everyday hearing while minimizing the impact of internal noise originating from the body.
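
As a rough consistency check on the figures quoted above, the reported 15 dB enhancement can be converted to a linear voltage ratio and applied to the 50.7 mVpp obtained with the earlier 8-channel sensor at 100 dB SPL. The sketch below assumes that the 15 dB figure is a voltage gain (20 log10 convention) and that both designs are compared at the same sound level; the function and variable names are illustrative and do not come from the original work.

```python
def db_to_voltage_ratio(gain_db: float) -> float:
    """Convert a gain in dB to a linear voltage ratio (20*log10 convention)."""
    return 10 ** (gain_db / 20)


# Values quoted in the text.
PREVIOUS_OUTPUT_MVPP = 50.7  # earlier 8-channel sensor at 100 dB SPL
IMPROVEMENT_DB = 15.0        # reported enhancement (assumed to be a voltage gain)

ratio = db_to_voltage_ratio(IMPROVEMENT_DB)
estimated_output_mvpp = PREVIOUS_OUTPUT_MVPP * ratio

print(f"voltage ratio: {ratio:.2f}")
print(f"estimated output at 100 dB SPL: {estimated_output_mvpp:.0f} mVpp")
```

Under these assumptions the estimate is about 285 mVpp, which is in the same range as the nearly 300 mVpp at 100 dB SPL stated in the Conclusion.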

The long-term degradation of piezoelectric materials is primarily attributed to factors such as temperature variations, moisture, chemical exposure, mechanical stress, aging, and operational conditions. In the proposed system, the acoustic transducer is implanted in the middle ear, where the temperature is maintained approximately constant, near body temperature. Additionally, the transducer is vacuum packaged to shield it from environmental influences, including moisture and chemical exposure. Consequently, the predominant concerns in this specific application are mechanical stress, operational conditions, and aging. Previous literature has extensively investigated the long-term reliability of the transverse piezoelectric coefficient, demonstrating outstanding performance even under extreme test cycles41. Depolarization is another aspect of degradation in thin film piezoelectric materials; however, in this application, the system’s operational range remains well below the polarization voltages. The PZT transducers were also tested on a vibration table under a constant-acceleration sweep of 1 g peak (>100 dB SPL) in the relevant frequency range. There was no physical damage at the end of the tests, showing the durability of the sensors at high intensity levels.

This research introduces a feasible method for integrating a MEMS sensor into the middle ear during a conventional cochlear implant surgery performed through posterior tympanotomy42. Utilizing this conventional procedure not only allows for middle ear access but also establishes critical anchor points along the ossicular chain for securing the sensor and opens avenues for improving cochlear implant functionality through the integration of a transducer. The average size of the available hole reported in the literature for posterior tympanotomy is ~4.7 mm43. However, excessively large surgical exposures of the ear may increase the risk of post-implantation infections. Hence, it is crucial to minimize the size of the device to mitigate potential complications. When determining the maximum dimensions of the transducer, factors such as individual anatomical variations, surgeon’s maneuverability, the connection aperture, surgical tools, and range of motion were taken into account. For this transducer design, the maximum dimension was set at 3.5 mm, which includes the frames and packaging.

The signal conditioning circuits reported for FICI systems have focused mainly on front-end signal processing, which is critical for sound quality. The ultra-low-power sound processor circuits proposed in refs. 24,25 combine low power dissipation with high-quality processing; however, they require a proper neural stimulation unit to activate auditory neurons. A complete FICI signal conditioning circuit was implemented by Yip et al., but with excessive power dissipation in the sound filtering stage26. A previous study by our group presented an ultra-low-power front-end circuit combined with a neural stimulation unit. However, only single-channel performance was validated and the circuit had a limited input dynamic range44. Currently, only a couple of studies include signal conditioning circuits that have been tested with eABRs in an animal model. The signal conditioning circuit implemented by Jang et al. to evoke the eABR of guinea pigs utilizes an amplitude modulation-based stimulation strategy27,45. However, that signal conditioning circuit is bulky and does not meet the main design parameters of FICIs. For instance, the input sound range of the previously reported eABR recordings was about half of that obtained with our full-custom FICI system. In addition to satisfying the design criteria of a FICI application, the eABR response of the full-custom FICI system provides the widest input sound dynamic range (45–100 dB SPL) that has been validated under in vivo conditions. Moreover, the power consumption of the system was comparable with that of state-of-the-art signal conditioning circuits and would thus enable long-term use of the implantable system. Although this work is a significant step toward bringing the FICI concept into everyday use, it has limitations. The transducer’s performance is not sufficient for full-band operation because of the low-power circuit requirements for the off-resonant bands.
A similar range can be obtained during stimulation by adjusting the input range of the logarithmic amplifiers and the patient fitting system of the interface circuit. However, this comes at the cost of higher power consumption. Although the voltage outputs of the transducer are higher than those reported in the literature, performance at off-resonance bands remains insufficient; this limitation can be overcome by packaging methods. The main purpose of packaging is not to improve the performance at the resonance frequency but to improve it at the frequency band edges and to protect the channels from environmental effects (humidity, moisture, chemical exposure, etc.). To adjust the dynamic range of the system, the output levels of the channels can be limited mechanically or electrically. The presented implant is the first custom-designed FICI system that covers the design and implementation of both the sound sensor and the signal conditioning circuits to meet the design specifications and that is validated with in vivo animal experiments. Furthermore, the validated system performance shows an adaptable and much wider dynamic range than previous in vivo studies, covering everyday sound levels.

The eABR signals from deafened guinea pigs were measured by applying an acoustic stimulus to the FICI system. The eABR recordings revealed that the FICI system stimulated the auditory neurons of ototoxically deafened guinea pigs. These findings showed that the FICI system elicited biological ABR responses by converting acoustic stimulation to electrical signals. The results shown in Fig. 4 strongly agree with earlier animal-based eABR morphology reports46,47. The fact that the observed eABR Wave II latency was 1–2 ms earlier than that of the acoustic ABR also verified the consistency. This is due to the absence of delays mediated by mechanical wave propagation, sensory cell transduction, and synaptic stimulation of primary afferent neurons48. The latency-intensity function of the eABR waveform pattern is similar to that of acoustic ABRs and has a steeper latency-intensity function as expected. In addition, the presented results confirmed reports in the literature that increasing stimulus intensity reduces the acoustic ABR Wave II latency by 2 ms between threshold and saturation, whereas the eABR Wave II latency changes only slightly49,50.

Conclusion

We designed a full-custom FICI system comprising a transducer and interface circuit, capable of stimulating the auditory nerve of a guinea pig across a sound range of 45–100 dB SPL within the selected frequency bands. The multi-channel system has eight filtering and stimulation channels, where the sensor outputs can generate nearly 300 mVpp for a 100 dB SPL input, and the signal conditioning circuit generates about 1 mA of stimulation current, which can be further controlled by the patient fitting system. For a typical speech sentence at 70 dB SPL, a total power dissipation of 600 µW enables a long battery life for the system. The in vivo animal test demonstrated the efficacy of the FICI system by integrating a middle ear implantable PLD-PZT acoustic sensor and a low-power sound processing circuit, proving the concept of our full-custom FICI system. Our study showed that the MEMS-based FICI system generated electrical potentials in response to acoustic stimuli after intracochlear electrode implantation into the guinea pig cochlea. This indicated the capacity of the FICI system to mimic the function of the cochlea. Compared to previous studies, this design provides a full-custom implementation of FICI, satisfying both integrated transducer and low-power interface circuit criteria and achieving the widest input sound range validated in vivo. There are additional possibilities for enhancing the system, particularly in the packaging of the implantable multi-channel transducer. Improved packaging methods could increase the sensitivity of the transducer and lower the minimum detectable sound level, allowing for mechanical or electrical limitation of the output levels of the channels. Future work might include exploring different compositions of piezoelectric material to minimize the noise level of the transducer. Further analysis will determine the optimum location of the transducer to maximize vibration reception while minimizing mass loading in the middle ear.
The geometry of the sensor was determined as optimal for insertion through the facial recess and placement in the middle ear cavity. The transducer and low-power IC design hold promising potential for clinical translation of FICI, but future work could optimize the transducer design for wider bandwidth, and an alternative packaging approach could be applied to enhance the results. Additionally, extra blocks could be added to the signal conditioning circuit to optimize the bandwidth required for improved speech intelligibility. As an alternative to CI, there are ABI systems that stimulate the surface of the cochlear nucleus. The ABI system comprises similar components (i.e., microphone, processor, telemetry interface, and current stimulator) to the CI systems except for the stimulator electrode type and location. This similarity indicates that our FICI system could potentially be suitable for the ABI system as well.

Methods

Fabrication of PZT hearing sensor

The sound sensors were fabricated on 4-in. p-type (100) SOI wafers consisting of a 10-μm Si device layer, a 1-μm buried oxide, and a 350-μm handle layer (Fig. S5-1). The fabrication process flow diagram is illustrated in Fig. S5. Thermal oxidation was carried out to grow a 500-nm-thick SiO2 layer, serving as an insulation layer between the device layer and the bottom electrode (Fig. S5-2). A 20-nm titanium (Ti) layer and a 100-nm platinum (Pt) layer were deposited by DC sputtering to form the bottom electrode. A 1-µm piezoelectric PLD-PZT layer was deposited on a 14-nm LNO orientation layer using the pulsed laser deposition technique (MESA+ Institute, Netherlands). The piezoelectric layer was patterned by wet etching (Fig. S5-3). The Pt layer was patterned using an improved etching technique51, and the Ti layer was removed using wet etching (Fig. S5-4). A 1-µm Parylene-C layer was deposited by evaporation using an SCS PDS 2010 system. Contacts for the top electrodes were opened by RIE using a CF4 and O2 gas mix (Fig. S5-5). A 30-nm Cr adhesion layer and a 400-nm gold (Au) layer were deposited by sputtering to form the top electrode on the PZT layer, with patterning achieved using Transene etchants. The exposed part of the parylene layer was etched by RIE (Fig. S5-6). Cantilevers were patterned on the device layer of the SOI wafer by first etching away the thermal oxide and the silicon layers using RIE and DRIE processes (Fig. S5-7). To form a freestanding cantilever array, the handle silicon layer on the backside was etched by DRIE. Prior to the DRIE, an alumina wafer was temporarily bonded to the device side of the wafer using a crystal bond, after spray coating to protect the cantilever surface. After patterning the photoresist, the exposed part of the SiO2 layer was etched away by RIE. DRIE was then performed to remove the silicon beneath the buried oxide layer, and the buried oxide was etched away by RIE (Fig. S5-8).
Finally, the temporary bond was released in DMSO at 80 °C (Fig. S5-9). As a result of this fabrication process on an SOI wafer, the beam and tip-mass thicknesses are constant for all channels at 10 µm and 350 µm, respectively. Each channel’s length, width, mass length, and piezoelectric length are given in Table S1. The two spacers and two caps used for packaging are each 200 µm thick. Therefore, the total thickness of the packaged transducer is 1.52 mm.
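
The stated total thickness can be reproduced by summing the layers named above. The following is a minimal sketch, assuming that each of the two sensor layers contributes its device-layer plus handle-layer thickness (0.36 mm, matching the per-layer active thickness given earlier) and that the thin films (oxides, electrodes, PZT, parylene) are negligible at this precision; the constant names are illustrative.

```python
# All thicknesses in micrometres, taken from the text.
DEVICE_LAYER_UM = 10    # cantilever beam (SOI device layer)
HANDLE_LAYER_UM = 350   # tip mass / frame (SOI handle layer)
SPACER_UM = 200
CAP_UM = 200

N_SENSOR_LAYERS = 2     # bilayer transducer
N_SPACERS = 2
N_CAPS = 2

# Assumption: thin films are neglected in the stack height.
sensor_layer_um = DEVICE_LAYER_UM + HANDLE_LAYER_UM
total_um = (N_SENSOR_LAYERS * sensor_layer_um
            + N_SPACERS * SPACER_UM
            + N_CAPS * CAP_UM)

print(f"single sensor layer: {sensor_layer_um / 1000:.2f} mm")
print(f"packaged stack: {total_um / 1000:.2f} mm")
```

With these values the stack comes to 1.52 mm, in agreement with the packaged thickness reported in the text.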

Characterization of FICI system

The acoustic characteristics of the sound sensor were measured on a PDMS membrane that simulates the tympanic membrane, with a flexible parylene substrate used to connect the channels electrically. The artificial tympanic membrane was excited with an insert earphone (Etymotic Research, ER-2) in an acoustic coupler. A pure-tone sine wave generated by a signal generator (Keysight 33522B) was amplified using an audio amplifier (DENON PMA520). The output level was calibrated with a sound level meter (IEC 651 Type II) and a 2-cc acoustic coupler. The sound sensor’s output was amplified by an instrumentation amplifier and recorded using a data acquisition board (DAQ – NI cDAQ-9174) and LabVIEW software. The vibration level of the transducer was measured during the acoustic performance measurements using a Laser Doppler Vibrometer. The measured levels are within the range of human cadaver vibration levels reported in the literature. Fig. S6 shows the measured vibration levels of the artificial tympanic membrane together with those reported in the literature for human cadavers under 80 dB SPL excitation. The outputs of the sound sensors were also applied to the signal conditioning circuit of the full-custom FICI, where the stimulation current generation performance of the circuit was measured for various input sound levels at the resonance frequency of the second channel of the sound sensor. An RC circuit model with a 3 kΩ series resistor and a 10 nF capacitor was used to model the electrode-tissue interface26,31. The output response was recorded using the NI cDAQ-9174 and LabVIEW software.
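
To illustrate what the series RC load implies for the stimulator output, the sketch below computes the RC time constant and the voltage that develops across the load during an ideal constant-current pulse. The 3 kΩ and 10 nF values are from the text, and the 1 mA amplitude follows the approximate stimulation current quoted in the Conclusion; the 20 µs pulse duration is a hypothetical value chosen only for illustration, and the ideal rectangular current pulse is an assumption.

```python
R_OHM = 3e3     # series resistance of the electrode-tissue model (from the text)
C_F = 10e-9     # series capacitance of the electrode-tissue model (from the text)


def load_voltage(current_a: float, t_s: float) -> float:
    """Voltage across a series RC load at time t_s after the onset of an
    ideal constant-current pulse: resistive drop plus capacitor charging."""
    return current_a * R_OHM + current_a * t_s / C_F


tau_s = R_OHM * C_F            # RC time constant
I_STIM_A = 1e-3                # about 1 mA, as quoted in the Conclusion
PULSE_S = 20e-6                # hypothetical pulse duration (assumption)

v_onset = load_voltage(I_STIM_A, 0.0)
v_end = load_voltage(I_STIM_A, PULSE_S)

print(f"time constant: {tau_s * 1e6:.0f} us")
print(f"voltage at pulse onset: {v_onset:.1f} V")
print(f"voltage at pulse end: {v_end:.1f} V")
```

Under these assumptions the resistive drop alone is 3 V, and the capacitor adds 0.1 V per microsecond of pulse duration, which gives a sense of the compliance voltage a current stimulator would need for a given pulse width.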

Experimental animals and ethics

Male adult Dunkin-Hartley guinea pigs (obtained from Kobay DHL A.S.) with an intact auropalpebral reflex, weighing 410–576 g, were used in this study. The external auditory canals and tympanic membranes of the animals were confirmed to be healthy by otomicroscopic examination. The animals were maintained under standard laboratory conditions, housed in cages at 21 °C with access to food and water ad libitum and a 12 h light:dark cycle, until the deafening was confirmed. During this process, the animals were administered 10 mL/kg IP saline once a day for hydration. All in vivo experimentation was performed in accordance with the World Medical Association Declaration of Helsinki.

Ethical committee approval was obtained (protocol number: 399/2019) from Kobay DHL A.S. Local Ethics Committee. All experimental animals used in this study were treated in accordance with the guidelines and rules of the Kobay DHL A.S. Local Ethics Committee.

ABR recording

An Etymotic 10B+ probe and High Frequency Transducers (Intelligent Hearing Systems; IHS) were inserted into the external ear canal depending on the range of the stimulation frequency. Recording electrodes were twisted to reduce the noise floor and were placed as follows: the inverting (negative) electrode at the ipsilateral superior post-auricular area, the noninverting (positive) electrode at the vertex, and the ground electrode deep in the muscles of the ipsilateral leg, with the impedances kept below 1 kΩ.

Electrophysiological responses were amplified using an IHS Opti-Amp bio-amplifier connected to the SmartEP system. A total of 1024 data points were sampled at a rate of 32 kHz. Click stimuli with a duration of 100 µs and tone burst stimuli at 16 and 32 kHz with a duration of 2 ms were presented. The stimulation rate and the number of sweeps per presentation were set to 21.1 Hz and 512, respectively, for both stimulus types. Stimulus intensity was increased in 10-dB steps from hearing threshold to 90 dB SPL, and the ABR was recorded at each level. Near the threshold level, 5-dB steps were used to determine the threshold. The reproducibility of the responses was tested by repeating the measurements at least twice. ABR thresholds were determined and compared before and after drug administration. ABR amplitude growth and intensity-latency function data were obtained using MATLAB (Natick, MA) software. Throughout this experiment, the guinea pig remained in a sound-attenuated, electrically shielded booth.
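
The acquisition parameters above fix the recording window and the time needed per averaged response. The sketch below works these out from the values in the text; the square-root estimate of noise reduction assumes that background noise is uncorrelated across sweeps, which is a standard simplification and not a result reported by the study. Variable names are illustrative.

```python
import math

# Acquisition parameters from the text.
N_POINTS = 1024          # samples per sweep
FS_HZ = 32_000           # sampling rate
STIM_RATE_HZ = 21.1      # stimuli per second
N_SWEEPS = 512           # sweeps averaged per presentation

window_ms = N_POINTS / FS_HZ * 1e3       # recording window per sweep
period_ms = 1e3 / STIM_RATE_HZ           # time between stimulus onsets
averaging_time_s = N_SWEEPS / STIM_RATE_HZ

# Assumption: noise is uncorrelated between sweeps, so averaging N sweeps
# reduces its RMS amplitude by sqrt(N).
noise_reduction = math.sqrt(N_SWEEPS)

print(f"recording window: {window_ms:.1f} ms")
print(f"inter-stimulus period: {period_ms:.1f} ms")
print(f"time per averaged response: {averaging_time_s:.1f} s")
print(f"noise reduction factor: {noise_reduction:.1f}")
```

The 32 ms window fits within the roughly 47 ms between stimuli, and each averaged response takes about 24 s to acquire.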

Animal surgery

During surgery, the guinea pig was placed in a stereotaxic frame in the prone position. A post-auricular 2–3 cm skin incision was made 3–4 mm posterior to the pinna. The auditory bulla was palpated and the overlying muscle tissue was dissected from the bulla. A retractor was then used to expose the posterior bulla. A hole was drilled into the bulla using a diamond burr and the bullostomy was enlarged to visualize the basal turn of the cochlea and the round window niche. A linear incision was made in the round-window membrane using a pick, and a custom-designed intra-cochlear electrode array was inserted into the scala tympani.

eABR recording

eABRs were recorded in a manner similar to the ABR measurements, except that the input sampling rate was 40 kHz and the stimulation sampling rate was 80 kHz. The electric current signals were generated and controlled by an IBM-compatible computer using IHS software and synchronized to the system via a 5 V TTL connection.

Statistics and reproducibility

Data sets are reported with descriptive statistics as mean ± standard deviation. The sample size for the study was calculated using the G*Power Version 3.1.9.4 statistical software (α: 0.05; 1 − β: 0.8).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.