Introduction

Biological neural networks (BNNs) have inspired today’s most successful artificial neural networks (ANNs), which consist of neurons linked through connections known as synapses. Traditionally, each synapse in such a network serves three functions: (1) storage of long-term memories in its weight (W), (2) synaptic transmission - modeled as input-weight multiplication, and (3) long-term plasticity - the update of W during training.

However, these ANN synapses only capture a subset of the functionalities of biological ones. The latter follow complex biophysical dynamics and learning rules such as Hebbian plasticity1 and short-term plasticity2,3 (Fig. 1a). Additionally, higher-order plasticity rules exist that do not directly determine the synaptic weight, but rather the properties of the plasticity rule itself. One example is the control over the decay timescale of the short-term plasticity rule, which can range from milliseconds to minutes, depending on the neuronal activation2,4. These rules, known as meta-plasticity5,6, play a crucial role in demanding tasks that require not only learning but also learning-to-learn, i.e., meta-learning7,8,9,10,11,12.

Fig. 1: Biologically inspired synaptic functions and their memristor implementation.

a Organization of the mammalian brain with several biological neurons connected through synapses. When a postsynaptic spike (light blue) coincides with a presynaptic spike (light green), the corresponding synaptic coupling is strengthened (Hebbian plasticity) for a limited amount of time (short-term plasticity). This biophysical process is illustrated in the circular insets: (I) an influx of ions (e.g., Ca2+) through the postsynaptic voltage-gated ion channels leads to (II) an increased number of synaptic receptors, which increases the synaptic weight. (III) The weight subsequently decays back to its original value as the receptors gradually detach from the membrane. b Table comparing the synaptic functions of artificial synapses in standard ANNs (Artificial column) and biological synapses (Biological column). The plot on the right shows the weight of a biological synapse as a function of time. The short-term weight (F) is updated (ΔF) when the pre- and post-synaptic spikes coincide. Additionally, the decay time of F can be controlled, which corresponds to meta-plasticity. c Bio-inspired Short-Term Plasticity Neuron (STPN) model combining a conventional neuron model with short-term Hebbian (ST-Hebb) synapses. d Hardware implementation of a neuromorphic ST-Hebb synapse with a Cr/Pt-SrTiO3-Ti memristor. The device measurement on the right mirrors the biological functions of ST-Hebb synapses, combining memory and computation as well as long- (W) and short-term (F) dynamics.

The complexity of biophysical mechanisms in synapses and the corresponding plasticity rules are essential for nervous system function (e.g., refs. 13,14,15), but are missing in conventional ANNs. This limited biological realism might partly explain why artificial intelligence (AI) systems often underperform humans and animals in various respects, such as motor skills and adaptability to dynamic environments16. Moreover, today’s ANNs consume vast amounts of energy due to the large network size required for complex tasks17. For instance, training the large language model GPT3 consumed 1.287 GWh of electrical energy18, enough to power over 100 households for a year.

To address these issues, a more bio-inspired model for synapses was developed, incorporating short-term and Hebbian plasticity, as well as meta-plasticity19. Specifically, this model, known as the ST-Hebb synapse, not only performs the three functions of traditional ANN synapses mentioned above but also takes on additional roles (Fig. 1b): (4) storage of short-term memories (F) that decay over time, (5) short-term plasticity - the update of F (ΔF) during training and inference, and (6) meta-plasticity - the control over the decay time. To incorporate ST-Hebb synapses into a deep neural network (DNN), the short-term plasticity neuron (STPN) model has been proposed (Fig. 1c), combining a conventional neuron model with ST-Hebb synapses20. This model utilizes all six synaptic functions (1) to (6), incorporates meta-learning, can be integrated into multi-layer networks, and outperforms more conventional ANNs with less biologically realistic synapses in various challenging tasks.

The hardware of choice to run such neural networks is a parallel computing architecture such as the graphics processing unit (GPU). However, GPU-based implementations of multi-functional synapses suffer from the computational overhead caused by the aforementioned additional synaptic operations. This overhead is exacerbated by the large number of synapses that make up state-of-the-art neural networks, ranging from \(10^{6}\) to \(10^{14}\)21. On top of that, the operations governing ST-Hebb’s synaptic dynamics are memory bound and are thus negatively affected by the well-known von Neumann bottleneck imposed by physically separated memory and processing units22. These factors render the implementation of ST-Hebb synapses on GPUs inefficient, thus motivating the development of new hardware paradigms that are better suited to neural networks with multi-functional synapses.

Several promising neuromorphic architectures use memristors as hardware synapses because of their ability to collocate memory and computation in a single device, which circumvents the von Neumann bottleneck23. Memristors are two-terminal devices that can change their conductance state upon electrical24,25 or optical26,27 stimuli, similar to the change of the synaptic coupling (weight) upon a neuronal spike in biological systems. A growing body of research suggests that the rich internal dynamics of memristors can be leveraged to mimic biophysical processes taking place in synapses and neurons28,29.

There have been multiple demonstrations of bio-inspired hardware synapses realized using memristors with both long- and short-term dynamics30,31 that exhibit biological learning rules such as triplet spike-timing-dependent plasticity (triplet-STDP)32 or Bienenstock-Cooper-Munro (BCM)33. However, these demonstrations rely on spike-timing plasticity rules and therefore cannot be integrated into DNNs34, which limits their applicability. Meanwhile, a single-layer neural network that makes use of bio-inspired, multi-functional synapses was recently demonstrated on memristive hardware35. The authors showed the benefit of adding short-term synaptic plasticity during inference for a classification task in dynamically changing environments. Memtransistive devices were used as synapses. In addition to the two electrical contacts common to all memristors, they possess a gate analogous to transistors. To realize decaying traces, a voltage signal with the shape of the short-term decay was applied to this third (gate) contact. Short-term plasticity is therefore not an intrinsic property of these devices, i.e., the devices do not inherently exhibit short-term memory, but require an additional stimulus to do so. The need for three-terminal devices and precisely engineered voltage signals applied to each memtransistive synapse poses challenges for a large-scale implementation of such systems, because the required control circuitry and wiring would quickly become complex. Therefore, the introduction of a two-terminal memristive device that intrinsically encompasses all six synaptic roles (1–6) is key to enable scalable neuromorphic hardware that is not only energy-efficient, but also reaches or even surpasses the performance of conventional AI approaches.

In this work, we propose such a two-terminal memristive device that relies on the valence-change switching mechanism in SrTiO3 (STO)36 and intrinsically possesses the six operations needed to function as an ST-Hebb synapse. A symbolic representation on top of an SEM image of the fabricated nanoscale device is shown in Fig. 1d. The measured memristor conductance acts as the plastic synaptic weight and mirrors the behavior displayed in Fig. 1b. Specifically, our device can store two different states in its memory, (I) a state with slow dynamics (long-term weight W) and (II) a state with fast dynamics (short-term weight F), which are both encoded in the conductance of the memristor. In terms of computation, the four synaptic operations labeled 2, 3, 5 and 6 in Fig. 1b can all be performed by our STO devices: (III) Long-term plasticity (i.e., a change in the long-term weight W) and (IV) short-term plasticity (short-term weight update ΔF) can both be triggered by voltage pulses of different magnitudes. Notably, the short-term decay happens spontaneously, without the application of a complex signal. (V) Meta-plasticity (i.e., control over the decay time) can be achieved by applying a DC bias voltage to one of the two terminals, which limits the complexity of the control circuitry and wiring. (VI) Additionally, our devices provide the standard in-memory multiplication of the input (voltage U) by the synaptic weight (conductance G), realized by Ohm’s law I = G ⋅ U. They also exhibit low cycle-to-cycle variability due to their non-filamentary switching operation. As a consequence, the random displacement of a few atoms does not induce as much noise as in filamentary valence-change-type memristors37. Moreover, we can operate our devices at very low conductance values (tens of nS), which lowers the power consumption during operation. Their achievable short-term timescales range from 10 milliseconds to hundreds of seconds. Importantly, timescales on the order of 100 seconds are typically difficult to realize with nanoscale footprints using other neuromorphic approaches such as analog circuits, because the required capacitors demand much larger dimensions38,39,40.
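To make function (VI) concrete, the following minimal numpy sketch (our own illustration, not the authors' measurement code) shows how Ohm's law turns an array of such devices into an analog vector-matrix multiplier once the per-device currents are summed along shared column wires (Kirchhoff's current law); the conductance range is the one quoted above.

```python
import numpy as np

# Illustrative sketch: in-memory multiply via Ohm's law, I = G * U.
# Each device conductance encodes a weight; a shared column wire sums
# the per-device currents (Kirchhoff), yielding analog dot products.
rng = np.random.default_rng(0)
G = rng.uniform(12e-9, 23e-9, size=(4, 3))  # conductances in S (tens of nS, as measured)
U = np.array([0.6, 0.3, 0.0, 0.6])          # input voltages in V applied to the rows

I_device = G * U[:, None]        # Ohm's law for every device
I_column = I_device.sum(axis=0)  # column currents = weighted sums of the inputs
print(I_column)                  # one output current per column, in A
```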

To estimate the energy consumption of our multi-functional hardware synapses in the context of a large DNN, we introduce a modified STPN (m-STPN) unit that emulates parts of the device characteristics and fully incorporates the measured energy consumption of our devices. We then integrate this unit into the original STPN network simulator of ref. 20 to perform a complex reinforcement learning task in software with multi-functional synapses, namely learning to play Atari’s video game Pong. The Atari suite is a common benchmark for reinforcement learning and is chosen here as an exemplary task in a dynamic environment. We show that the m-STPN unit enables faster and more stable training than the original version for the task of Atari Pong. A major reason for this is the constraint on the short-term decay time constant imposed by our devices. Furthermore, we demonstrate that short-term weights with long timescales, such as the ones exhibited by our memristors, are required for robust and fast training of the network. Finally, we compare the network’s energy consumption for a pure GPU implementation of the synapses with the estimated energy consumed by our memristive synapses. We demonstrate an estimated gain in energy efficiency between 96× and 966×, depending on the GPU implementation.

Results and discussion

Multi-functional synaptic behavior in a single memristor

We fabricated a multi-functional memristive synapse on an STO single crystal substrate (Fig. 1d). We chose STO as the active material because it is a versatile and well-understood platform with rich internal dynamics due to the generation and movement of oxygen vacancy defects41,42, which can be tuned by, e.g., doping43,44,45, different electrode materials33,46 or interface engineering47. First, a high work function contact (Pt with a Cr adhesion layer beneath) was deposited. This step was followed by the fabrication of a Ti electrode with a Pt capping layer that prevents the Ti from oxidizing in air. Both contacts were deposited using electron beam evaporation and patterned by electron beam lithography with a subsequent lift-off process, resulting in a typical gap between the electrodes of roughly 40 nm. The devices were annealed at 300 °C for 20 min in flowing Ar, which causes a thermal oxide to form at the Ti-STO interface. The whole stack was finally covered with a uniform layer of 15 nm of SiN. The fabrication process is discussed in detail in Methods section “Device fabrication”.

Figure 2a shows 30 cycles of the I-V characteristics of our Cr/Pt-STO-Ti memristor (Fig. 2b). The voltage (−2 V to 2 V) is applied to the Pt electrode, while the Ti electrode is grounded. A high cycle-to-cycle repeatability as well as low conductance values (tens of nS) are obtained, which allows for energy-efficient device operation. The low conductance values and the counter-clockwise switching direction, as indicated by the black arrows, are attributed to a non-filamentary switching mechanism, which has already been reported for similar material stacks48. In this switching regime the conductance change is not caused by the formation of a filament made of oxygen vacancies (\({{{\rm{V}}}}_{{{\rm{O}}}}^{\bullet \bullet }\)) that bridges the two electrodes, but by the modulation of the Schottky barrier at the Pt-STO interface49. This modulation is attributed to the generation and recombination of \({{{\rm{V}}}}_{{{\rm{O}}}}^{\bullet \bullet }\)’s upon the application of an external voltage (bottom of Fig. 2b). The vacancies in turn locally dope the STO, which changes the height and width of the Schottky barrier, affecting the conductance. When a positive voltage is applied to the Pt contact, oxygen from the crystal (\({{{\rm{O}}}}_{{{\rm{O}}}}^{\times }\)) moves to the Pt-STO interface or into the porous Pt electrode, leaving behind a positively charged crystal defect42. This kind of n-type doping increases the conductance due to a decrease in the Schottky barrier height and width. Since \({{{\rm{V}}}}_{{{\rm{O}}}}^{\bullet \bullet }\)’s are mobile and positively charged, they migrate away from the Pt electrode along the applied electric field towards the Ti electrode, where they accumulate and potentially form a filament in a process called electroforming50. We observed that at high positive voltages (>4 V) we are able to electroform our device and put it in a filamentary-switching operation (Supplementary Section S2). This confirms the generation of \({{{\rm{V}}}}_{{{\rm{O}}}}^{\bullet \bullet }\)’s at positive voltages in our devices and allows us to distinguish the filamentary and non-filamentary regimes based on an analysis of the I-V characteristics.

Fig. 2: DC and dynamical behavior of multi-functional memristive synapses.

a Conductance vs. voltage characteristic (30 cycles) of the fabricated Cr/Pt-STO-Ti memristors. The black arrows indicate the counter-clockwise switching direction. b Sketch of the device stack and of the underlying switching mechanism. The two insets zoom into the Pt-STO interface at different applied voltages, showing the dynamics of interfacial oxygen ions (O) and oxygen vacancies (\({{{\rm{V}}}}_{{{\rm{O}}}}^{\bullet \bullet }\)): (Vapp > 0) At positive voltages, \({{{\rm{V}}}}_{{{\rm{O}}}}^{\bullet \bullet }\) formation and migration occur. The negatively charged oxygen migrates towards the interface and into the porous Pt electrode, while the positively charged \({{{\rm{V}}}}_{{{\rm{O}}}}^{\bullet \bullet }\) move along the electric field away from the Pt electrode and towards the grounded Ti electrode. (Vapp ≤ 0) At zero applied voltage, \({{{\rm{V}}}}_{{{\rm{O}}}}^{\bullet \bullet }\)'s move back towards the Pt, driven by the built-in electrochemical gradient, where they get filled by oxygen. A negative voltage accelerates this process. c Conductance change from low to high under the application of 100 SET pulses with an amplitude of 4 V and a duration of 500 μs. d Time-dependent conductance measurement (read out at 0.6 V) when voltage pulses (2 V, 2.5 V, and 3 V) with a duration of 100 μs are applied. The pulses induce short-term increases of the conductance with subsequent decay. The long-term conductance (red area) remains constant. e Short-term conductance changes due to the voltage pulses in the dotted rectangle of (d). Only the conductance values during the read voltage are shown here. f Aggregate plot showing short-term plasticity for different values of the long-term weight W. The measurement data were obtained by first applying the protocol in (d) to characterize the short-term plasticity for the minimum long-term weight (W1). The long-term weight was then changed by 100 SET pulses (c) and, after a waiting period of 240 s, the short-term plasticity was measured again.

The number of vacancies generated, as well as the distance over which the \({{{\rm{V}}}}_{{{\rm{O}}}}^{\bullet \bullet }\)’s migrate from the Pt contact, depends on the voltage and duration of the applied electrical signal51. Long electrical pulses at high voltages are expected to lead to a high vacancy concentration extending far away from the Pt electrode, whereas short, low-voltage pulses result in a relatively small vacancy concentration close to the Pt. After these pulses the generated vacancies migrate back towards the Pt contact without an external voltage, driven by a gradient in electrochemical potential41, and are filled there by the interfacial oxygen52. In addition to the incorporation of molecular oxygen from the porous Pt electrode, atmospheric water vapor can also lead to the filling of oxygen vacancies by incorporating oxygen from water molecules into STO53. Through such processes vacancies start disappearing from the vicinity of the Pt contact, forming a growing, vacancy-free region. Since the Schottky barrier is mainly sensitive to the vacancy concentration immediately adjacent to the Pt electrode, even small vacancy movements in this region significantly change the contact resistance and thus the overall device conductance42, explaining the observed conductance decay in our memristors. Furthermore, the vacancies close to the Pt are annihilated first, on timescales of minutes (short-term), while the vacancies further away require an increasingly long time to migrate back, resulting in timescales of multiple hours (long-term), as described in ref. 41. Therein, this slowdown is attributed to the built-in electric field at the Pt-STO interface, which decreases monotonically with the distance from the Pt electrode. It is also likely that the oxygen incorporation kinetics at the Pt-STO interface play a role in determining the short-term decay timescale52.

Additionally, the back-migration flux of \({{{\rm{V}}}}_{{{\rm{O}}}}^{\bullet \bullet }\) and the subsequent vacancy filling at the Pt-STO interface can be increased by the application of a negative bias, leading to a faster conductance decay. Hence, the decay time can be voltage-controlled. A summary of the postulated physical mechanisms and how they underlie the synaptic functions in Fig. 1b is given in Supplementary Section S6. Even though this physical picture supports our experimental observations, it cannot be excluded that other effects play an important role in the switching process, such as interface trap states54 or protonic conduction, which is well studied in oxide-based memristors53,55,56,57. Further investigations will be needed to unequivocally determine the physical mechanism(s) at the origin of our devices’ behavior.

In our approach, the memristor’s conductance implements the synaptic weight, whose dynamics (long- and short-term) are crucial in ST-Hebb synapses (Fig. 1b). To investigate the conductance dynamics of our STO memristors we apply pulses of different voltages and widths to them (Fig. 2c, d). We first induce long-term plasticity (function 3 in Fig. 1b) by applying 100 SET pulses with an amplitude of 4 V and a duration of 500 μs that cause the device to switch from a low to a high conductance state (Fig. 2c). This high conductance state slowly decays over thousands of seconds without applied bias (Supplementary Section S3). After the SET procedure we leave the device at 0 V for 240 s (not shown) to let it settle to a stable state. We then proceed with measuring the conductance of the device at 0.6 V for 375 s (Fig. 2d), during which 100 μs-long pulses with voltages of 2, 2.5, and 3 V are applied. The long-term conductance induced by the SET pulses remains largely constant for the time period of the measurement. The 100 μs-long pulses lead to a short-term conductance increase, i.e., short-term plasticity (function 5 in Fig. 1b), whose magnitude depends on the pulse voltage (3, 5 and 10 nS for 2, 2.5 and 3 V, respectively) and is followed by a decay. This can be observed in Fig. 2e, where the conductance during the three last pulses of the protocol (dotted rectangle in Fig. 2d) is plotted. The conductance during the read voltage is shown, omitting the values during the 100 μs-long pulse. In Fig. 2f the long- and short-term components of the conductance (functions 1 and 4 in Fig. 1b) are visualized for six measurements with different values of the long-term weight W. The measurement data were obtained by repeating the protocol of Fig. 2c, d multiple times, i.e., first setting the long-term weight (W1, W2, ...) by 100 SET pulses, waiting for 240 s, and then applying the short-term pulse protocol of Fig. 2d. Values of the long-term conductance in the range of 12 to 23 nS can be set in this way (long-term plasticity). These conductance values can further undergo short-term increases induced by voltage pulses (short-term plasticity). The obtained collocation of both long- and short-term plasticity motivates the use of these devices as ST-Hebb synapses.

The short-term plasticity is investigated in more detail in Fig. 3a, which displays the mean (solid line) and standard deviation (shaded area) of five measurements. Pulse-induced short-term conductance updates (ΔF) and subsequent decays are obtained using four different voltage amplitudes (2, 2.5, 3, and 3.5 V). The pulse width was fixed to 100 μs and the read voltage to 0.6 V. The conductance values were normalized by subtracting the initial conductance at t = 0 from the data. We observe low cycle-to-cycle variability, in agreement with the I-V characteristics in Fig. 2a. The same measurement was repeated for two additional pulse widths (20 and 500 μs). The resulting ΔF’s are reported in Fig. 3b as a function of the pulse amplitude and width. It can be seen that ΔF values in the range of 0.7 to 38.6 nS can be achieved by adjusting these parameters. The corresponding energy per pulse is given in Fig. 3c for the same pulse voltage and width combinations. The details of the energy calculations are given in Supplementary Section S7, and a measurement with 200 pulse cycles is given and discussed in Supplementary Section S4.

Fig. 3: Control over the magnitude and dynamics of short-term conductance updates.

a Mean (solid line) and standard deviation (shaded area) of pulse-induced short-term conductance updates (ΔF) from five conductance measurements and using four different pulse voltages (2, 2.5, 3, and 3.5 V). The read voltage is set to 0.6 V and the pulse width to 100 μs. To better compare the measurements, the conductance values were adjusted by subtracting the initial conductance at t = 0 from the data. b Heatmap of the achieved ΔF for the different pulse voltages and widths. c Heatmap of the required pulse energy for the same voltage and width combinations as in (b). d (Top) Applied voltage protocol on a linear x-axis using 0.6 V/200 μs read pulses with a period of 700 μs and (bottom) corresponding conductance values shown on a logarithmic x-axis. In between the read pulses a constant bias voltage (Vbias) of variable amplitude is applied. The main pulse voltage and width are set to 3.5 V and 500 μs, respectively, in all measurements. The mean and standard deviation of the adjusted conductance values are shown for 5 measurements on a semi-log plot. e Extracted decay time constant Λ from the measurements in (d) as a function of Vbias. The experimental data points were fitted with a sigmoid function.

Besides the magnitude of the conductance increase, it is also possible to control the subsequent decay using a DC bias voltage (Vbias) that is constantly applied during the experiment (Fig. 3d), effectively implementing meta-plasticity (function 6 in Fig. 1b). The mean and standard deviation of the conductance for five measurements are shown as a function of time. The voltage pulse that triggers the conductance increase is the same in all cases (3.5 V / 500 μs), thus resulting in similar ΔF, whereas the bias voltage is varied (see Supplementary Section S9 for details). The timescale of the decay increases with increasing Vbias from hundreds of ms (Vbias = −0.6 V) to tens of seconds (Vbias = 0.6 V). To quantify the resulting decay time constant (Λ) as a function of the bias voltage, we fitted an exponential to the measured curves (Supplementary Section S9). In our fit, the maximum value of Λ = 1 indicates no decay and the minimum value (Λ = 0) corresponds to immediate decay. Similar measurements were performed on other devices to qualitatively assess device-to-device variability (Supplementary Section S5). Figure 3e demonstrates that we can experimentally control Λ over a range from 0.08 to 0.92 as a function of the applied Vbias. The relationship between Vbias and Λ is modeled by a sigmoid function \(\Lambda ({V}_{bias})=\frac{L}{1+\exp (-k\cdot ({V}_{bias}-{V}_{0}))}+{\Lambda }_{0}\), where L, k, V0, and Λ0 are fitting parameters.
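For illustration, such a sigmoid fit can be reproduced in a few lines of Python with scipy. In the sketch below, only the endpoint values (Λ = 0.08 at Vbias = −0.6 V and Λ = 0.92 at Vbias = 0.6 V) are taken from the text; the intermediate data points are hypothetical placeholders standing in for the measured curve.

```python
import numpy as np
from scipy.optimize import curve_fit

def lam_sigmoid(v_bias, L, k, v0, lam0):
    """Sigmoid model from the text: Lambda(V_bias) = L / (1 + exp(-k (V_bias - V0))) + Lambda0."""
    return L / (1.0 + np.exp(-k * (v_bias - v0))) + lam0

# Endpoints (0.08 at -0.6 V, 0.92 at +0.6 V) are from the text; the
# intermediate points are illustrative placeholders, not measured data.
v_bias = np.array([-0.6, -0.3, 0.0, 0.3, 0.6])
lam = np.array([0.08, 0.25, 0.55, 0.80, 0.92])

popt, _ = curve_fit(lam_sigmoid, v_bias, lam, p0=[0.9, 5.0, 0.0, 0.05])
L, k, v0, lam0 = popt
print(f"L = {L:.2f}, k = {k:.2f} 1/V, V0 = {v0:.2f} V, Lambda0 = {lam0:.2f}")
```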

In summary, the following functions are performed intrinsically by our memristors: storing both (1) long- (W) and (4) short-term (F(t)) weights (Fig. 2f), (3) long-term plasticity (Fig. 2c), (5) short-term plasticity (Fig. 3a, b), (6) meta-plasticity via control over the decay time parameter Λ (Fig. 3d, e), and (2) multiplication of the input voltage with the synaptic weight according to Ohm’s law.

DNN with multi-functional memristive synapses

The six intrinsic functionalities of our memristors can be utilized by ST-Hebb synapses in a deep STPN network. Such networks have been shown to outperform traditional DNN implementations without multi-functional synapses at a variety of complex tasks in dynamic environments20. One such dynamic task is learning to play Atari Pong, a video game and common machine learning benchmark. In Pong a player (the STPN network) confronts an opponent, each manipulating a vertically movable bar to strike a ball, aiming to get the ball past the opponent’s bar (i.e., scoring a point) or preventing the opponent from doing so. The game concludes when either player has scored 21 points. The STPN network’s reward is the difference between the player’s and the opponent’s points at the end of the game. Given only this scalar reward as input, the network finds a strategy that results in the maximum score of 21 by repeatedly playing the game and employing reinforcement learning, a bio-inspired learning paradigm58. Below we introduce modified STPN units (m-STPNs), which are STPNs20 with a modified weight normalization scheme (see Methods section “Modified STPN model” for details and benefits of this approach). These units make use of our multi-functional synapses to play Atari Pong. Through simulation, we estimate the energy consumption of the whole network if it were run on our memristive hardware and compare it to a pure GPU implementation.

Modified short-term plasticity neuron

The deep STPN network simulator investigated here (Fig. 4a) employs a network layer consisting of our modified STPN units (m-STPN layer). The network itself relies on an actor-critic architecture that takes frames of the Atari Pong environment as inputs and computes both the next action to take in the environment (actor) and an estimate of the value of the current state (critic). The frames are first processed by two convolutional layers into a dense feature set that forms the input for the m-STPN layer. The latter consists of 64 m-STPN units, each of which is connected through ST-Hebb synapses to the 2592 inputs as well as recurrently to the 64 outputs. In total, this amounts to (2592 + 64) ⋅ 64 = 169984 synapses. The output of the m-STPN layer is then fed into two fully-connected linear layers that compute the next action (the actor’s next step to take in the game) and the current value (how advantageous the current game state is). To compare the influence of the STPN implementation on the training performance, three networks with different STPN layers (m-STPN, STPN, and no plasticity) were investigated (Fig. 4b). Here, the reward during training is plotted as a function of the steps taken by the actor (see Methods section “Network training” for details). Each curve represents the average reward of 16 agents that learn to play the game with different randomly initialized parameters. We observe that in terms of training speed both m-STPN and STPN outperform the no-plasticity implementation (i.e., a traditional recurrent layer without time-dependent synaptic weights). Furthermore, the m-STPN version learns slightly faster than the original STPN network, while also exhibiting a much smaller standard deviation among different training runs (shaded areas in Fig. 4b). While the robust training performance of the m-STPN layer is encouraging, the main aim of our m-STPNs is to show that our multi-functional memristors can act as hardware ST-Hebb synapses in the STPN network of Fig. 4a. To achieve this, the following device characteristics were implemented into the m-STPN units: (1) mapping of the memristor conductance (Gmeas) to the simulated, unitless synaptic weight (G) by the linear relationship

$$G=({G}_{meas}-{G}_{min})/m$$
(1)

with m = 2 nS and Gmin = 12 nS. (2) Adding a discretization operation to the simulated short-term weight update (ΔF) that limits the number of ΔF values (states) to an amount that can be resolved by our memristors. To satisfy this requirement, the conductance values corresponding to two adjacent states should be separated by at least one standard deviation, which is below 1 nS for all short-term weight updates ΔFmeas (max.  ± 0.9 nS in Fig. 3b). We therefore chose a discretization step of 1 nS for ΔFmeas, which translates to a step of 0.5 for the simulated ΔF according to Eq. (1). (3) Fixing the maximum of ∣ΔF∣ to 20, which ensures that the weight update remains in a range that is achievable by the STO memristors. A histogram of ΔF for all synapses during an entire Pong game, with and without non-idealities, is given in Supplementary Section S11. (4) Limiting the range of the decay time constant Λ to values that can be reached by our devices ([0.08, 0.92]). Furthermore, it was observed that constraining Λ also affects the training performance of the network, as shown in Fig. 4c. The five lines denote different constraints imposed on the learned decay time parameter Λ. Notably, it is beneficial to incorporate synapses with large decay time constants during training: the larger the upper limit of Λ, the faster the reward increases. Unexpectedly, the case with Λ = 0 (i.e., immediate decay of the short-term weight changes for all synapses) also learns, albeit more slowly and less robustly, as can be seen from the larger standard deviation compared to Λ = [0.08, 0.92] (inset of Fig. 4c). The longer, constrained decay times were made possible by the modified weight normalization scheme in m-STPNs (Methods section “Modified STPN model”). Because the decay constant Λ is naturally limited in our devices, destabilizing phenomena such as an exponential gain (Λ > 1) instead of a decay (Λ < 1) are automatically prevented. Also note that non-volatile memristive devices, which correspond to Λ = 1, are insufficient for the implementation of synapses in STPN networks (Supplementary Fig. S15). The code sketch below summarizes these four constraints.
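A minimal Python sketch of constraints (1) to (4), assuming PyTorch as in the STPN simulator; the helper names are ours, while the constants come from the text and Eq. (1).

```python
import torch

M_NS, G_MIN_NS = 2.0, 12.0       # Eq. (1): G = (G_meas - G_min) / m
DF_STEP = 0.5                    # (2) 1 nS device resolution -> 0.5 simulated units
DF_MAX = 20.0                    # (3) maximum |dF| reachable by the STO devices
LAM_MIN, LAM_MAX = 0.08, 0.92    # (4) achievable decay-constant range

def weight_from_conductance(g_meas_ns: torch.Tensor) -> torch.Tensor:
    """(1) Map a measured conductance (in nS) to the unitless simulated weight."""
    return (g_meas_ns - G_MIN_NS) / M_NS

def constrain_dF(dF: torch.Tensor) -> torch.Tensor:
    """(2) Discretize the short-term update, then (3) clamp it to the device range."""
    dF = torch.round(dF / DF_STEP) * DF_STEP
    return dF.clamp(-DF_MAX, DF_MAX)

def constrain_lambda(lam: torch.Tensor) -> torch.Tensor:
    """(4) Keep the learned decay constant inside the device range [0.08, 0.92]."""
    return lam.clamp(LAM_MIN, LAM_MAX)
```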

Fig. 4: Simulation and energy consumption of an STPN network with multi-functional synapses.

a Sketch of the full STPN network. A frame of the Atari game is fed into two convolutional layers: Conv(kernel, stride) plus a ReLU activation function. The features are then fed into the m-STPN layer (blue ellipses and lines). The layer’s output is split into actions and a value by two fully connected linear layers. b, c Average reward as a function of agent steps during training for (b) three different implementations of the STPN layer and (c) five different ranges of Λ. Each curve represents the average reward of 16 agents with different random parameter initialization. The shaded area denotes the standard deviation. In the inset of (c), the cases Λ = 0 and Λ = [0.08, 0.92] (i.e., the achievable device range) are shown. d Total synaptic weight (long- and short-term component) of a single synapse of the trained network (\({S}_{max\Delta F}\)) during an entire game. The zoom-in additionally shows the long-term weight W in red and the ΔF as black bars. e Energy consumed by our memristors due to ΔF updates, i.e., voltage pulses with widths wp, fitted by a power law. f Power consumed by our memristors due to different decay bias voltages (Vbias). g Time evolution of the energy consumption of the synapse in (d) during an entire Pong game for a memristor (blue) and a pure GPU implementation (orange). Different energy contributions and the total energy are shown. h Histograms of all synapses in the network, indicating how many synapses consume a specific amount of energy during the whole Pong game. The two contributions, ΔF and Decay, are shown. For the Decay, the worst-case scenario Vbias = 0.6 V is assumed for all synapses. i Total energy histogram (ΔF plus Decay).

After training, some of the 16 trained agents achieve the maximum reward of 21 (Supplementary Section S12). The total synaptic weight G = W + F of a single synapse of such a trained agent is reported in Fig. 4d over the course of an entire game, which lasts roughly 50 seconds. This specific synapse was chosen because it exhibits the largest synaptic changes (ΔF) in the whole network. It is therefore referred to as \({S}_{max\Delta F}\) in the remainder of the text and serves as a representative example of the behavior of a synapse in an STPN network. It is observed that the value of the synapse’s weight G changes over time due to the short-term plasticity of ST-Hebb synapses. Importantly, the short-term updates are sparse, which makes the implementation of this reinforcement learning task energy efficient on our memristive hardware, as only a small number of energy-consuming short-term weight updates (ΔF) are needed. The zoom-in additionally shows both the long-term weight component W (in red) and the short-term weight updates ΔF (in black). Each simulation timestep is marked by a dot.

Energy consumption of deep STPN network

Next, we estimate the energy consumption of synapse \({S}_{max\Delta F}\) for the duration of the entire game if it were implemented on our memristor. Two sources of energy loss are considered: firstly, each voltage pulse that causes a short-term weight update consumes energy (Epulse) (Fig. 3c). Secondly, due to the application of a constant bias to control the decay time, a small current continuously flows through the devices, inducing a power loss (Pbias). We address these two components separately. Figure 4e reports the first one (Epulse) as a function of the short-term weight updates ΔF. This quantity is extracted from the measurement data in Fig. 3b, c for different pulse widths (wp). The measured energy data points closely follow a power law: \({E}_{pulse}(\Delta F)=c\cdot {(\Delta F)}^{\alpha }\) with c = 30 pJ and α = 1.52. This power law was incorporated into our neural network simulator to estimate the energy consumption of the short-term weight updates in our memristors. Because the value of ∣ΔF∣ is limited to 20 and because the weight updates are sparse, this first contribution to the energy consumption remains low. In Fig. 4f the second contribution to the energy consumption (Pbias) is given as a function of the total synaptic weight G. It is calculated according to \({P}_{bias}=| {G}_{meas}| \cdot {V}_{bias}^{2}\). Note that even for a simulated weight of G = 0 there is a remnant power draw (except if Vbias = 0) because of the finite minimum conductance value Gmin = 12 nS of the physical devices. For the maximum bias voltage Vbias = 0.6 V, the power consumed by a synapse with a constant weight of G = 0 is therefore 4.3 nW. This low power consumption is a direct consequence of our memristors’ low conductance values, enabled by their non-filamentary switching behavior.
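Combining the two fitted relations gives a simple per-synapse energy model. The following sketch (the helper names are ours) reproduces the 4.3 nW remnant bias draw quoted above and evaluates the pulse energy for the largest allowed update.

```python
import numpy as np

C_PJ, ALPHA = 30.0, 1.52             # fitted power law: E_pulse(dF) = c * dF**alpha

def pulse_energy_pj(dF):
    """Energy (pJ) of one short-term update of magnitude |dF| (simulated units)."""
    return C_PJ * np.abs(dF) ** ALPHA

def bias_power_nw(G, v_bias=0.6, m_ns=2.0, g_min_ns=12.0):
    """P_bias = |G_meas| * V_bias^2, mapping the simulated weight back via Eq. (1)."""
    g_meas_ns = G * m_ns + g_min_ns       # conductance in nS
    return np.abs(g_meas_ns) * v_bias**2  # nS * V^2 = nW

print(pulse_energy_pj(20.0))  # ~2.9e3 pJ, i.e., ~2.9 nJ for the largest update |dF| = 20
print(bias_power_nw(0.0))     # 4.3 nW remnant power draw at G = 0, V_bias = 0.6 V
```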

In Fig. 4g the estimated energy consumed during inference over the course of a Pong game by either a memristor (blue) or a pure GPU implementation (orange) of synapse \({S}_{max\Delta F}\) is provided. In the memristor case, the energy consumption can be decomposed into two contributions, the short-term weight updates (ΔF) and the applied bias voltage needed to control the decay time constant (Decay). These two components cover the short-term synaptic plasticity and meta-plasticity required by an ST-Hebb synapse during inference. The standard input-weight multiplication is obtained through Ohm’s law I = G ⋅ Vread, where Vread encodes the input. The power consumed by this operation is, however, already accounted for by Pbias: the current resulting from the application of the maximum bias voltage \(\max \{{V}_{bias}\}=0.6\,V\) can be read out to compute the input-weight multiplication. To implement the same plasticity, meta-plasticity, and input-weight multiplication on a GPU, the following four operations need to be executed at every time step during the game (6826 in total): (1) element-wise addition of the short- and long-term weight components, (2) element-wise multiplication of F with Λ for the short-term decay, (3) element-wise addition of F and ΔF for the short-term weight update, and (4) vector-matrix multiplication of inputs and weights (weight mult.); a sketch of these four operations is given below. For each of these operations the GPU’s energy consumption was measured for a single synapse (see Methods section “GPU energy measurement”). It is found that the energy consumption of the memristor increases more slowly with the number of timesteps than the GPU baseline. It should however be noted that even though our multi-functional memristive synapse can fully mimic the behavior of an ST-Hebb synapse, the operations of the neuron still need to be performed on a GPU: this concerns the calculation of the magnitude of ΔF via the first term in Methods Eq. (4), the calculation of the non-linear activation function in Methods Eq. (3), and the normalization of the pre-synaptic input (Supplementary Fig. S11b).
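Written out in PyTorch, the four synaptic operations of one time step look roughly as follows (a sketch with the m-STPN layer's shapes, not the exact benchmark code):

```python
import torch

n_in, n_out = 2592 + 64, 64  # m-STPN layer: 2656 inputs (features + recurrence), 64 units
dev = "cuda"
W   = torch.randn(n_out, n_in, device=dev)  # long-term weights
F   = torch.zeros(n_out, n_in, device=dev)  # short-term weights
Lam = torch.rand(n_out, n_in, device=dev)   # decay constants
dF  = torch.randn(n_out, n_in, device=dev)  # short-term updates for this step
x   = torch.randn(n_in, device=dev)         # input vector

G = W + F      # (1) element-wise addition of weight components
F = Lam * F    # (2) element-wise short-term decay
F = F + dF     # (3) element-wise short-term weight update
y = G @ x      # (4) vector-matrix input-weight multiplication
```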

To estimate the total synaptic energy consumption of the whole network the contribution of each synapse for an entire game of Pong has to be considered (Fig. 4h). Both the energy consumed by the ΔF updates (dark blue), and by the control of the decay time constant (light blue) are shown in the form of a histogram. Most synapses do not undergo any short-term weight update during the entire game and therefore do not consume energy for this operation, as indicated by the large ΔF spike centered around 0. For the decay control, we assume the worst-case scenario where a bias voltage of 0.6 V is applied to all synapses. The current due to this bias can be read out, which accounts for the energy consumption due to the calculation of the vector-matrix multiplication between the input and the weights. A crossbar array architecture is assumed for this purpose.

The total energy (i.e., ΔF plus Decay) consumed by each memristive synapse is shown in the histogram of Fig. 4i. By summing up the contributions from all synapses we obtain a total energy consumption of 36 mJ (Memristor row in Table 1). This value takes into account the four synaptic operations (ΔF, Decay, W+F, and weight multiplication) of all memristive synapses of the entire STPN network for a whole Pong game. To give a nuanced comparison with a pure GPU implementation, we provide two separate measurements using an NVIDIA A100 40GB device (see Methods section “GPU energy measurement” for details). We report the median of 100 individual runs per synaptic operation for half- and single-precision floating-point arithmetic (fp16 and fp32, respectively).

Table 1 Energy consumed in mJ by the whole STPN network during one game of Atari Pong

First, we measure the GPU’s energy consumption for executing each synaptic operation for all of the network’s 169984 multi-functional synapses. The results are shown in the GPU (standard) row. It is observed that roughly one third of the total energy consumption stems from the three ST-Hebb-specific operations (ΔF, Decay, and W + F) and two thirds from the standard input-weight multiplication. We note that since the GPU is a massively parallel machine, this number of synapses may not fully utilize the device, potentially leading to lower energy efficiency. Indeed, the A100 GPU achieves the highest energy efficiency for a hypothetical network with around \(2^{21}\) synapses. The case labeled GPU (optimal) is the corresponding energy consumption scaled to the original network’s number of synapses. By comparing the fp16 case of the GPU (optimal) energy consumption with the total in the Memristor row, an improvement by a factor of 96 is obtained. The saved power is due to both the multi-functional nature of our memristors and their in-memory compute capabilities, which in combination allow for the simultaneous computation of four operations without any memory traffic. The absence of memory traffic is especially beneficial, because all operations considered (i.e., element-wise and vector-matrix multiplication) have little to no data reuse and are memory-bound. As a consequence, most energy is consumed in data movement rather than computation (von Neumann bottleneck). This is demonstrated in Methods section “GPU energy measurement”, where we quantify the energy consumption of the GPU’s memory traffic: it accounts for more than 98% of the total. We also provide a discussion of the latency and energy delay product (EDP) of our implementation in Supplementary Section S17. Note that the energy consumption in the memristor case was estimated from the behavior of individual devices and not based on a comprehensive circuit simulation encompassing the whole STPN network. Although such investigations would certainly lead to increased energy consumption59, we believe that the memristor advantage is large enough (two orders of magnitude) to persist even under more realistic conditions.

In conclusion, we presented a two-terminal memristor based on STO that is able to store and compute both long- and short-term synaptic weight updates, effectively collocating memory and computation as well as long- and short-term dynamics. In particular, we demonstrated control over the short-term decay time constant without the need for an additional electrical contact or complex control signals, which implements a form of intrinsic meta-plasticity. All these features are essential for neuromorphic circuit implementations, e.g., STPN networks, which outperform traditional artificial neural networks in large-scale, complex machine learning tasks such as Atari Pong. We contributed here to the development of these networks with the introduction of m-STPN units, increasing the reliability during training and highlighting the importance of long decay time constants. Finally, in simulation, we compared our memristor implementation of an STPN network to a GPU one and obtained a significant increase in inference energy efficiency by a factor of at least 96.

To fully realize our simulation concept in hardware, further work is needed: firstly, our STO memristors should be converted to vertical structures, which is expected to reduce device-to-device variability and also allows for the creation of crossbar arrays. In such a vertical, thin-film-based structure the spacing between the electrodes could most likely be decreased significantly compared to our planar devices, which in all likelihood will lead to lower operating voltages. Secondly, the long-term retention of our memristors should be improved, while still preserving their short-term plasticity. It has been suggested that an oxide layer between the Pt electrode and STO could increase the retention of low conductance states51. Moreover, since we observed a significant impact of the decay time constant on the training performance, different decay models should be investigated for both long- and short-term components in STPN networks. The advancement of such biologically inspired neural networks holds the potential to significantly increase the performance of AI applications across diverse dynamic environments. Furthermore, multi-functional memristive synapses with intrinsic dynamics could function as a key enabling technology for the energy-efficient hardware implementation of next-generation neural networks.

Methods

Device fabrication

The STO single crystal substrate was first submerged in a 90 °C DI water bath under UV light illumination for 100 min60. The substrate was then baked at 250 °C for 5 min and subjected to an O2 plasma treatment (200 W) for 3 min. This water-leaching surface treatment is expected to produce an atomically flat, predominantly TiO2-terminated surface, which is characterized by terraces of 1 unit cell (u.c.) height. This was indeed observed at several locations of the substrate, as shown in Supplementary Fig. S1. Both electrode stacks (Cr-Pt and Ti-Pt) were then patterned using e-beam lithography and deposited by e-beam evaporation (Supplementary Figs. S2a and S2b). After deposition the whole device was annealed at 300 °C for 20 min in an Ar atmosphere (Supplementary Fig. S2c). This step causes a thermal oxide to form at the Ti-STO interface, leaving behind oxygen vacancies61. Annealing also likely leads to diffusion of chromium into the STO, doping it in the process62. The device stack was finally encapsulated within 30 nm of SiN using plasma-enhanced chemical vapor deposition (PECVD) to protect against oxidation (Supplementary Fig. S2d). The STO single crystal substrate was characterized by a four-point probe measurement, which resulted in a surface resistance of >10 GΩ, exceeding the measurement limit of the setup. We can therefore safely ignore surface contributions to our device conductance.

Experimental setup

The quasi static I-V characteristics were measured with a Keysight M9601A Source Measure Unit. Voltage pulses were generated with a Keysight 33500 Arbitrary Waveform Generator. The current was fed through a DHPCA-100 trans-impedance amplifier from Femto and read out with a Rohde&Schwarz RTE 1104 oscilloscope.

Modified STPN model

The equations describing the forward pass through an STPN layer follow20:

$${{{\bf{G}}}}^{(t)}={{\bf{W}}}+{{{\bf{F}}}}^{(t)}$$
(2)
$${{{\bf{h}}}}^{(t)}=\tanh ({{{\bf{G}}}}^{(t)}{{{\bf{x}}}}^{(t)})$$
(3)
$${{{\bf{F}}}}^{(t+1)}=\underbrace{{{\mathbf{\Gamma }}}\odot ({{{\bf{x}}}}^{(t)}\otimes {{{\bf{h}}}}^{(t)})}_{\Delta {{\bf{F}}}}+\underbrace{{{\mathbf{\Lambda }}}\odot {{{\bf{F}}}}^{(t)}}_{{{\rm{Decay}}}}$$
(4)

where bold letters denote matrices,  ⊙ element-wise multiplications and  ⊗ outer products. The STPN layer model is parameterized by the long-term weight W, the Hebbian association strength Γ, and the short-term decay parameter Λ. During training these three parameters are learned using backpropagation through time (BPTT). While W directly controls the synaptic strength, the Λ and Γ parameters define how the synaptic weight responds to stimuli, effectively implementing a form of meta-plasticity or learning-to-learn. The plastic update of the synapse is modeled by Eq. (4). Equations (3) and (4) are adapted slightly from the original work in ref. 20 to reflect the specific implementation here. In addition to Eqs. (2) to (4) the original STPN model also includes a form of normalization on both the synaptic input \(x\to {x}_{eff}=\frac{x}{| | W+F| | }\) and the plastic weight \(F\to {F}_{eff}=\frac{F}{| | W+F| | }\) (Supplementary Fig. S11a). This speeds up stochastic gradient descent during training. The normalization of F leads to a modification of Eq. (4) in which the decay parameter Λ becomes \({\Lambda }_{eff}=\frac{\Lambda }{| | W+{F}^{(t)}| | }\). As a consequence, the decay time constant changes at every time step, because F(t) varies over time. Such variations can lead to instabilities during training and cannot be straightforwardly implemented on our memristors. Another consequence is that the decay time constant Λ cannot be constrained a priori to a certain range, because Λeff depends on the values of W and F, which are unknown at the start of training. However, clamping Λ is important, as training becomes highly unstable if synapses reach values Λeff > 1 (see Supplementary Fig. S15). The solution adopted to circumvent this issue in the original formulation of ref. 20 consisted of starting with small values of Λ at the beginning of the training to ensure that Λeff does not exceed 1. This has the disadvantage that the network only slowly learns longer decay time constants. By removing the normalization of the plastic weight F and only normalizing the input (Supplementary Fig. S11b) in our modified STPN unit, we achieve better performance during training and also make the implementation on memristors feasible.
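A minimal PyTorch sketch of the resulting m-STPN forward step (Eqs. (2) to (4) with input-only normalization) is given below; the function name is ours, and we assume a Frobenius norm for ∣∣W + F∣∣, an implementation detail not fixed by the text.

```python
import torch

def m_stpn_step(x, F, W, Gamma, Lam):
    """One m-STPN time step: Eqs. (2)-(4) with normalization of the input only."""
    G = W + F                            # Eq. (2): total synaptic weight
    x_eff = x / torch.linalg.norm(G)     # input normalization (F itself is NOT normalized)
    h = torch.tanh(G @ x_eff)            # Eq. (3): neuron output
    dF = Gamma * torch.outer(h, x_eff)   # Hebbian term of Eq. (4) (outer product)
    F_next = dF + Lam * F                # Eq. (4): update plus plain decay
    return h, F_next
```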

Network training

We closely follow the training protocol established in ref. 20. Concretely, we use RLlib63 to train and evaluate agents in PongNoFrameskip-v4. During training the network repeatedly plays against the computer opponent of the gymnasium software library (a common Python implementation of Atari game environments) on the standard difficulty setting (0 out of 3). Preprocessing (dimensionality and color scale) of the game frames is done as in ref. 64, with the exception of frame stacking, which was omitted. The training parameters were also adopted from ref. 20: rollout length (50), gradient clipping (40), discount factor (0.99) and a learning rate starting at 0.0001 with a linear decay schedule finishing at \(10^{-11}\) at 200 million iterations. Models are trained from the experience collected by 4 parallel agents.
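In RLlib terms, these hyperparameters correspond roughly to a configuration like the sketch below; the key names follow common RLlib conventions and are not taken verbatim from the scripts actually used.

```python
# Hedged sketch of the training configuration; key names are assumptions
# based on typical RLlib versions, not the authors' exact code.
config = {
    "env": "PongNoFrameskip-v4",
    "num_workers": 4,                       # 4 parallel experience-collecting agents
    "gamma": 0.99,                          # discount factor
    "grad_clip": 40,                        # gradient clipping
    "rollout_fragment_length": 50,          # rollout length
    "lr": 1e-4,                             # initial learning rate
    "lr_schedule": [[0, 1e-4], [200_000_000, 1e-11]],  # linear decay to 1e-11
}
```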

GPU Energy measurement

To fairly compare the efficiency of a memristor and a GPU implementation of the network in Fig. 4a, it is essential that the GPU’s energy consumption is only measured for the specific arithmetic operations that can be performed on the memristor: (1) W+F, (2) Decay, (3) ΔF and (4) weight multiplication (for more details see Supplementary Section S15). On the GPU, these kernels take the form of (matrix) additions and multiplications that can be performed optimally on such hardware. A dedicated Python script runs each kernel separately. To measure the GPU’s energy consumption, we use the pyJoules library65, a Python wrapper for NVIDIA’s own energy reporting framework, the NVIDIA Management Library (nvml). Since all operations have very short runtimes, we measure the energy spent for 10,000 to 200,000 executions of the corresponding kernels to improve accuracy. We report the median of 100 multi-executions and estimate the 99% confidence interval (CI) using bootstrapping with 1000 samples. We report the GPU energy consumption of operations (1) to (4) in three ways: (I) per full Atari Pong game of the whole neural network, which employs 64 ∗ 2656 = 169984 synapses and runs for 6826 time steps (Table 1 in the main text), (II) per operation, i.e., for a single synapse and one time step (Table 2 in this section), and (III) for a single synapse over the course of a full Atari Pong game, i.e., 6826 time steps (Fig. 4g).
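A minimal sketch of such a measurement is shown below; it assumes a typical pyJoules setup (EnergyContext with an NVIDIA GPU domain), and the exact wiring of handlers and tags may differ from our benchmark scripts.

```python
import torch
from pyJoules.energy_meter import EnergyContext
from pyJoules.device.nvidia_device import NvidiaGPUDomain
from pyJoules.handler.print_handler import PrintHandler

# Example: measure kernel (1), W + F, repeated many times because a single
# execution is too short to resolve with nvml-based energy counters.
W = torch.randn(64, 2656, device="cuda")
F = torch.randn(64, 2656, device="cuda")

with EnergyContext(handler=PrintHandler(), domains=[NvidiaGPUDomain(0)], start_tag="W_plus_F"):
    for _ in range(10_000):
        G = W + F                 # the kernel under test
    torch.cuda.synchronize()      # wait for all kernels before the meter stops
```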

Table 2 Energy consumed per floating point operation (flop) in pJ

(I) For the GPU (standard) results in Table 1, the matrices W, F(t), x(t), and Λ required by operations (1) to (4) have the same size as in the neural network simulation. For the GPU (optimal) results, we increase the size of the matrices using the formula (2592 + 64) ⋅ k, where k is a power of two and ranges from 64 (original network size) to 4096. We report the energy spent for k = 1024, which exhibits the highest energy efficiency, scaled down to the original network’s size (see Supplementary Fig. S17).

(II) The GPU (standard) and GPU (optimal) rows in Table 2 were obtained by dividing the values of Table 1 by the number of operations executed during the whole game (64 ∗ 2656 ∗ 6826). The energy is given per floating point operation (flop) in pJ. Note that the weight multiplication is computed by a fused multiply-add (FMA) operation, which counts as two flops (one for addition and one for multiplication).

In the GPU (compute) row of Table 2 we implement a CUDA kernel that only operates on data stored in registers, without reading/writing from/to the GPU global memory. These results therefore measure the energy spent for the computation only, without the contribution of memory traffic. Concretely, each kernel execution performs as many arithmetic operations (addition, multiplication or FMA) as needed for one complete Atari Pong game. To increase accuracy, each measurement combines 10,000 kernel executions. As before, we report the median of 100 multi-executions and estimate the 99% CI. The energy consumption per flop for the weight multiplication corresponds to approximately 5.9 and 9.5 pJ/flop for half- and single-precision, respectively. This is in the same ballpark as measurements provided by NVIDIA and independent testing of the GPU’s floating point unit (FPU)66,67, which validates our measurements. By comparing the GPU (compute) results with the GPU (optimal) ones, we observe that the memory traffic accounts for more than 98% of the GPU’s total energy consumption. This result shows the remarkable energy efficiency of the GPU’s FPU and the benefit of reducing memory traffic. Note, however, that this particular GPU implementation would not be useful in practice, because the results of the kernel’s computations are not accessible via the memory and can therefore not be used by a program running on the GPU. For this reason the highest-efficiency GPU benchmark that corresponds to a working implementation is the fp16 energy measurement in the GPU (optimal) row.

(III) For the GPU energy consumption of a single synapse shown in Fig. 4g we made use of the energy measurements per operation in the fp32 case of the GPU (standard) row in Table 2. It should be noted that the energy values in Table 2 were computed by first measuring the energy consumed by all synapses of the network in parallel and then dividing by the number of synapses. This ensures that the massive parallelism of GPUs is utilized, even though we are only interested in the energy consumption of a single synapse. The energy contributions per operation were then cumulatively summed over all time steps to obtain the time-series GPU data in Fig. 4g.

We note that the kernels utilize the GPU’s regular FP cores rather than the tensor cores because the operations (W+F, Decay, ΔF and weight multiplication) do not compute matrix-matrix products.

The specifications of our test system are:

Hardware:

  • GPU: NVIDIA A100 with 40GB memory

  • CPU: 2x AMD EPYC 7742 @ 2.25 GHz (2 x 64/128 Physical/Logical Cores)

  • RAM: 512 GB

Software:

  • Rocky Linux release 8.4

  • Python 3.11.5

  • Pytorch 2.2.0 dev20230913

  • CUDA 12.1.1