Memristor crossbar architectures for implementing deep neural networks

Liu, Xiaoyang; Zeng, Zhigang

doi:10.1007/s40747-021-00282-4


Original Article
Open access
Published: 20 July 2021

Volume 8, pages 787–802, (2022)

Complex & Intelligent Systems


Xiaoyang Liu^1,2 &
Zhigang Zeng^1,2


Abstract

The paper presents memristor crossbar architectures for implementing layers in deep neural networks, including the fully connected layer, the convolutional layer, and the pooling layer. The crossbars achieve positive and negative weight values and approximately realize various nonlinear activation functions. The layers constructed from the crossbars are then used to build the memristor-based multi-layer neural network (MMNN) and the memristor-based convolutional neural network (MCNN). Two in-situ weight update schemes, the fixed-voltage update and the approximately linear update, are used to train the networks. Considering variations resulting from the inherent characteristics of memristors and from errors in programming voltages, the robustness of MMNN and MCNN to these variations is analyzed. The simulation results on standard datasets show that deep neural networks (DNNs) built from the memristor crossbars work satisfactorily in pattern recognition tasks and have a certain robustness to memristor variations.


Introduction

Great progress has been made in DNNs in recent years. DNNs perform excellently in image recognition, speech recognition, machine translation, and related fields, and have been widely used in artificial intelligence. But as tasks become more complex, the demand for computing power keeps growing. Traditional computing devices based on the von Neumann architecture suffer from the memory wall problem because the computing units and the storage units are separated, which hinders further improvement of their computing capability. Memristors achieve in-memory and parallel computing, accelerating the operation of DNNs. Meanwhile, the plasticity of a memristor is very similar to that of a synapse. Memristors also have the advantages of low power consumption and nanoscale size, and are compatible with complementary metal-oxide-semiconductor (CMOS) technology. Therefore, memristors are promising elements for building new computing architectures.

The existence of the memristor was predicted theoretically in 1971 [6]. Since the memristor was first manufactured in 2008 [31], research on memristor-based neural networks has developed rapidly. Memristors laid out in the form of crossbars have low power consumption and high density, and can perform vector-matrix multiplication in parallel. Various memristor-based neural networks have been proposed, such as single-layer and multi-layer neural networks (SNNs, MNNs) [3, 4, 7, 10, 11, 12, 19, 27, 30, 41, 42, 45, 49, 50], convolutional neural networks (CNNs) [8, 21, 26, 36, 39, 40, 46, 49], Pavlov associative memory networks [5, 24, 25, 34, 43, 52], long short-term memory networks (LSTMs) [1, 2, 9, 20, 23, 28, 29, 35], pulse coupled neural networks (PCNNs) [38, 53], and hierarchical temporal memory (HTM) [13, 14, 22, 33, 54].

A memristor bridge synapse-based neural network is proposed in [3]. The memristor bridge synapse achieves a positive or negative synaptic weight value using four memristors. A modified chip-in-the-loop learning scheme is put forward to train the network. In [4], a memristor-based SNN is presented and trained with ex-situ and in-situ methods. The signed weight value is achieved by subtracting the conductance value of one memristor from that of another memristor. The results show that memristor-based networks are a promising implementation of neuromorphic computing systems. In [10], a memristor crossbar-based neural network with on-chip back propagation (BP) training is presented. Memristor-based multi-layer neural networks with online gradient descent training are presented in [30]. The network uses one memristor and two CMOS transistors to construct one synapse. Compared with CMOS-based counterparts, the memristor-based MNNs [30] consume between 2% and 8% of the area and static power. In [48], a sign backpropagation (SBP) method is proposed to train resistive random access memory (RRAM)-based neural networks. In [50], a circuit design for memristor-based MNNs is presented and a modified BP algorithm is adopted to train the networks. In [19], the in-situ learning capability of an MNN based on a hafnium oxide-based memristor crossbar is experimentally demonstrated. In [49], memristor-based quantized neural networks are presented. The weights are quantized to accelerate the operation of the neural networks.

Besides memristor-based MNNs, there is also work on memristor-based CNNs. In [39], a memristor-based CNN is presented, which is the first memristor-based circuit implementation of a CNN. One memristor crossbar represents all groups of convolution kernels in one convolutional layer and performs the convolutional operation. An extremely parallel implementation of a memristor crossbar-based CNN is presented in [40]. It uses a very sparse crossbar that replicates the convolution kernels to implement the convolutional operation, and one feature map is convolved at a time. In [8], convolutional layers are mapped to resistive cross-point arrays, and the impacts of noise and bound limitations on the performance of the CNN are analyzed. In [36], a memristor-based fully convolutional network (MFCN) is put forward for semantic segmentation tasks. A fully hardware-implemented memristor-based CNN is presented in [46]. High-yield, high-performance, and uniform memristor crossbars are reported in [46], and an effective hybrid-training method that can adapt to device imperfections is put forward to train the memristor crossbar-based neural networks.

In this paper, memristor-based crossbar architectures for implementing memristor-based DNNs are presented. They use few elements in each synapse circuit and approximately achieve many activation functions. In the crossbars, signed weight values are achieved by subtracting the conductance values of reference resistors from those of memristors [32]. Nonlinear activation functions are approximately implemented through circuits. MMNN and MCNN are built from the presented crossbars, which also substantiates the effectiveness of the crossbars. The networks are trained by two in-situ update schemes, the fixed-voltage update and the approximately linear update. The performance of MMNN and MCNN trained by the two update schemes is analyzed. The robustness of the two networks to conductance variations, which are caused by the inherent characteristics of memristors and errors of programming voltages, is also explored.

The rest of the paper is organized as follows. Section “Memristor crossbar architectures” introduces the memristor model and the memristor crossbar architectures designed for fully connected (FC) layer, convolutional operation, and average pooling operation. Section “Operation of the memristor-based DNNs” introduces the operation of the DNNs built by the crossbars. Simulations and analyses are conducted in Section “Simulations and analysis”. Section “Conclusions” concludes the paper.

Memristor crossbar architectures

In this section, memristor crossbar architectures for DNNs are presented. Memristor crossbars perform vector-matrix multiplications, which are computationally expensive operations in neural networks, in parallel through Kirchhoff’s law. This section first introduces the memristor model and then presents memristor crossbars for the FC layer, convolutional operation, and average pooling operation. These memristor-based crossbars can be used to build DNNs.

Memristor model

A memristor model describes the behavior of a real memristor mathematically and can be used to explore the characteristics of the memristor. It can also be adopted in simulations to speed up system design. The HP model is [31]

$$\begin{aligned}&v(t)=i(t)R(t), \end{aligned}$$

(1)

$$\begin{aligned}&R(t)=R_{on}x(t)+R_{off}\left( 1-x(t)\right) , \end{aligned}$$

(2)

where R(t) is the resistance of the memristor, $x\left( t\right) $ is the state variable, $R_{on}$ and $R_{off}$ are the internal low and high resistance of the memristor, respectively, and v(t) and i(t) are the voltage and the current, respectively. And

$$\begin{aligned}&x(t)=\frac{w\left( t\right) }{D}, \end{aligned}$$

(3)

$$\begin{aligned}&\frac{\mathrm{{d}}w\left( t\right) }{\mathrm{{d}}t}=\mu _v\frac{R_{on}}{D}i\left( t\right) f\left( x\left( t\right) \right) , \end{aligned}$$

(4)

where $w\left( t\right) $ is the internal state variable, D is the thickness, and $\mu _v$ is the average ion mobility. D and $\mu _v$ are constants.

The HP model cannot precisely capture the characteristics of many real memristors, so various memristor models have been put forward to describe the behavior of different memristors [17, 18, 37, 51]. A voltage-controlled threshold model [51] that can fit real memristors is adopted in this paper:

$$\begin{aligned} \frac{\mathrm{{d}}x\left( t\right) }{\mathrm{{d}}t}= {\left\{ \begin{array}{ll} \mu _v\frac{R_{on}}{D^2}\frac{i_{off}}{i(t)-i_0}f\left( x(t)\right) , &{}0<V_{on}<v(t) \\ 0, &{}V_{off}\le v(t)\le V_{on} \\ \mu _v\frac{R_{on}}{D^2}\frac{i(t)}{i_{on}}f\left( x(t)\right) , &{}v(t)<V_{off}<0 \end{array}\right. } \end{aligned}$$

(5)

where $i_0$, $i_{on}$, and $i_{off}$ are constants, and $f\left( x\left( t\right) \right) $ is the window function which is defined as

$$\begin{aligned} f\left( x\left( t\right) \right) =1-\left( 2x\left( t\right) -1\right) ^2. \end{aligned}$$

(6)
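As a rough illustration of how such a state equation behaves, the sketch below integrates the simpler HP model of Eqs. (1) to (4) with the window function of Eq. (6) using forward Euler. It does not implement the threshold model of Eq. (5). All parameter values (`R_ON`, `R_OFF`, `D`, `MU_V`) and the function names are assumptions chosen only so that the example runs; they are not the values of Table 1.

```python
import numpy as np

# Hypothetical device parameters (assumptions, not taken from the paper).
R_ON, R_OFF = 100.0, 16e3   # low / high resistance in ohms
D = 10e-9                   # device thickness in metres
MU_V = 1e-14                # average ion mobility in m^2 s^-1 V^-1


def window(x):
    """Window function of Eq. (6): zero at x = 0 and x = 1, one at x = 0.5."""
    return 1.0 - (2.0 * x - 1.0) ** 2


def resistance(x):
    """Memristor resistance of Eq. (2) for state variable x in [0, 1]."""
    return R_ON * x + R_OFF * (1.0 - x)


def simulate(v, x0, dt):
    """Forward-Euler integration of dx/dt = mu_v * R_on / D^2 * i(t) * f(x).

    This combines Eqs. (3) and (4); `v` is a sequence of applied voltages,
    one per time step of length `dt`. Returns the state after every step.
    """
    x = x0
    states = np.empty(len(v))
    for n, v_n in enumerate(v):
        i_n = v_n / resistance(x)                      # Eq. (1)
        x += dt * MU_V * R_ON / D**2 * i_n * window(x)
        x = min(max(x, 0.0), 1.0)                      # keep the state physical
        states[n] = x
    return states
```

With these assumed values, a constant positive voltage moves the state toward 1 (lower resistance) and a negative voltage moves it toward 0, while the window function slows the change near both boundaries.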

Memristor crossbar for FC layer

The FC layer is the basic building block of an MNN and is also an essential part of a CNN. In the FC layer, inputs are weighted and summed, that is

$$\begin{aligned} y_j=f\left( \sum ^M_{i=1}W_{ji}x_i\right) , \end{aligned}$$

(7)

where $x_i$ is the ith input, $y_j$ is the jth output, $W_{ji}$ is the weight value between the ith input unit and the jth output unit, M is the number of input units, and $f(\cdot )$ is the activation function which could be the binary function, the sigmoid function, the rectified linear unit (ReLU), or the hyperbolic tangent (tanh) function.

The memristor crossbar for the FC layer is shown in Fig. 1. Through Kirchhoff’s law, the memristor crossbar implements the weighted summation in (7) with a computation complexity of O(1), whereas the computation complexity of this operation is generally $O(N^2)$. In the inference phase, the TGs (transmission gates) in the left column and in the row below the memristor rows are closed, which gives

$$\begin{aligned} V_f=-R_f\left( \sum ^M_{i=1}\frac{V_{x,i}}{R_s}+\frac{V_b}{R_s}\right) . \end{aligned}$$

(8)

$V_{x,i}=V_r\cdot x_i$, where $V_r$ is the read voltage and $x_i$ is the original input value. The current of the jth column is

$$\begin{aligned} I_j=&\sum ^M_{i=1}\frac{V_{x,i}}{R_{i,j}}+\frac{V_b}{R_{M+1,j}}+\frac{V_f}{R_f} \nonumber \\ =&\sum ^M_{i=1}V_{x,i}\cdot \left( G_{i,j}-G_s\right) \nonumber \\&+V_b\cdot \left( G_{M+1,j}-G_s\right) , \end{aligned}$$

(9)

where $G_s=1/R_s$ and $G_{i,j}=1/R_{i,j}\ (j=1,2,\dots ,N)$ is the conductance of the memristor in the ith row and the jth column. The output voltage is

$$\begin{aligned} -V'_{o,j}=I_jR_{a}+\frac{V_cR_{a}}{R_c}. \end{aligned}$$

(10)
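The cancellation that produces signed weights in Eq. (9) can be checked numerically. The following sketch, which assumes ideal amplifiers and ideal wires, evaluates the first line of Eq. (9) using $V_f$ from Eq. (8) and compares it with the second line. The function names and the array layout (bias memristors stored as the last row of `G`) are assumptions made for the illustration.

```python
import numpy as np


def column_currents(v_x, v_b, G, R_s, R_f):
    """First line of Eq. (9): memristor currents plus the V_f / R_f branch.

    v_x : input voltages, shape (M,)
    v_b : bias voltage (scalar)
    G   : memristor conductances, shape (M + 1, N); last row is the bias row
    """
    G_s = 1.0 / R_s
    v_f = -R_f * (np.sum(v_x) * G_s + v_b * G_s)   # Eq. (8)
    v_rows = np.append(v_x, v_b)
    return v_rows @ G + v_f / R_f


def column_currents_signed(v_x, v_b, G, R_s):
    """Second line of Eq. (9): effective signed conductance G - G_s."""
    G_s = 1.0 / R_s
    return np.append(v_x, v_b) @ (G - G_s)


def pre_activation_voltage(i_j, R_a, v_c, R_c):
    """Right-hand side of Eq. (10), i.e. -V'_{o,j}, ignoring saturation."""
    return i_j * R_a + v_c * R_a / R_c
```

Both current functions return the same values, which shows that a memristor with conductance above $G_s$ acts as a positive weight and one below $G_s$ acts as a negative weight.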

Through setting different values of $R_a$ and $R_c$, various activation functions can be approximately achieved. Denote the source voltages of amplifiers in the dotted box by $V_s$ and $V_d$. When the resistance of $R_a$ is very large and $V_s=0$, $V_d=-1\mathrm V$, then it can be approximately obtained that

$$\begin{aligned} -V'_{o,j}= {\left\{ \begin{array}{ll} 1, &{}I_j>0 \\ 0. &{}I_j\le 0 \end{array}\right. } \end{aligned}$$

(11)

It is a binary activation function.

Set $R_{a}={0.25\mathrm {V}}/\left( V_rr_{gw}\right) $, $V_cR_a/R_c=0.5\mathrm V$, where V is volt, and then

$$\begin{aligned} -V'_{o,j}=0.25\frac{I_j}{V_rr_{gw}}\mathrm {V}+0.5\mathrm {V}, \end{aligned}$$

(12)

where $I_j/\left( V_rr_{gw}\right) $ is the numerical value of the output of the jth column and $r_{gw}$ is the ratio of the conductance value to the weight value. Let $x=I_j/\left( V_rr_{gw}\right) $, $y=-V'_{o,j}$, $V_s=0\mathrm V$, and $V_d=-1\mathrm V$. Ignoring the voltage unit volt, there is

$$\begin{aligned} y= {\left\{ \begin{array}{ll} 1, &{}x>2 \\ 0.25x+0.5, &{}-2\le x\le 2 \\ 0. &{}x<-2 \end{array}\right. } \end{aligned}$$

(13)

This formula is an approximate realization of the sigmoid function [39].

Similarly, set $R_{a}={1\mathrm {V}}/\left( V_rr_{gw}\right) $, $V_c=0$, $V_s=1\mathrm V$, and $V_d=-1\mathrm V$, there is

$$\begin{aligned} y= {\left\{ \begin{array}{ll} 1, &{}x>1 \\ x, &{}-1\le x\le 1 \\ -1, &{}x<-1 \end{array}\right. } \end{aligned}$$

(14)

It approximately achieves the tanh function [35].

Set $R_{a}={1\mathrm {V}}/\left( V_rr_{gw}\right) $, $V_c=0$, $V_s=0$, and $V_d=-v_H$, and then

$$\begin{aligned} y= {\left\{ \begin{array}{ll} v_H, &{}x>v_H \\ x, &{}0\le x\le v_H \\ 0. &{}x<0 \end{array}\right. } \end{aligned}$$

(15)

It is an approximate realization of the ReLU function with an upper bound of $v_H$.
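In software terms, Eqs. (13) to (15) are clipped linear functions. A minimal NumPy sketch of the three approximations follows; the function names are illustrative and not taken from the paper.

```python
import numpy as np


def hard_sigmoid(x):
    """Eq. (13): piecewise-linear approximation of the sigmoid."""
    return np.clip(0.25 * x + 0.5, 0.0, 1.0)


def hard_tanh(x):
    """Eq. (14): piecewise-linear approximation of tanh."""
    return np.clip(x, -1.0, 1.0)


def bounded_relu(x, v_h):
    """Eq. (15): ReLU with an upper bound v_h set by the amplifier supply."""
    return np.clip(x, 0.0, v_h)
```

The clipping limits play the role of the amplifier supply voltages $V_s$ and $V_d$, and the slope and offset play the role of $R_a$ and $V_cR_a/R_c$.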

Then

$$\begin{aligned} V_{o,j}=-\frac{R_1}{R_0}V'_{o,j}, \end{aligned}$$

(16)

where $\frac{R_1}{R_0}$ is to rescale the amplitude of the output voltage to be within thresholds of memristors.

For the classification layer, the activation function part of the crossbar is shown in Fig. 2, and it meanwhile calculates the error between the prediction and the target which is

$$\begin{aligned} \varDelta V_j=V_{o,j}-V_{T,j}. \end{aligned}$$

(17)

In the training phase, the TGs in the first row and the right column are closed by setting S to low level and ${{\bar{S}}}$ to high level. The crossbar then back propagates errors: $V_{\delta ,1}$ to $V_{\delta ,N}$ are the errors to be back propagated, and $I_{\delta ,1}$ to $I_{\delta ,M}$ and $I_{\delta ,b}$ are the back propagated errors.

Memristor crossbar for convolutional operation

The convolutional operation uses several groups of convolution kernels to convolve feature maps, as shown in Fig. 3. The number of kernel groups is equal to the number of output feature maps. The size of each kernel is $K_1\times K_2$, where $K_1$ and $K_2$ are the width and height of the kernel, respectively. The convolutional operation is

$$\begin{aligned} y^p_j=\sum ^M_{i=1}\sum ^{K_1}_{k_1=1}\sum ^{K_2}_{k_2=1}W_{k_1,k_2,i,j}x^p_{k_1,k_2,i}, \end{aligned}$$

(18)

where $y^p_j\ (j=1,2,\dots ,N)$ is the pth value in the jth output feature map, $x^p_{k_1,k_2,i}$ is the value at the position $(k_1,k_2)$ of the pth receptive field in the ith input feature map, $W_{k_1,k_2,i,j}$ is the weight value at the position $(k_1,k_2)$ of the ith kernel in the jth kernel group, M is the number of input channels, and N is the number of output channels. Suppose the convolution stride is s and the padding size is P; then the dimension of the output feature map is

$$\begin{aligned} \left( N,\left[ \left( H_1-K_1+2P\right) /s\right] +1,\left[ \left( H_2-K_2+2P\right) /s\right] +1\right) ,\nonumber \\ \end{aligned}$$

(19)

where $H_1$ and $H_2$ are the width and the height of the input feature map, respectively, and $\left[ \cdot \right] $ denotes taking the integer part.
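Eq. (19) can be evaluated with a small helper. The sketch below is illustrative only: the function name and argument order are assumptions, and the integer part is taken with floor division, which matches it whenever the numerators are non-negative.

```python
def conv_output_shape(n_out, h1, h2, k1, k2, stride=1, padding=0):
    """Output feature-map dimensions according to Eq. (19).

    n_out   : number of output channels N
    h1, h2  : width and height of the input feature map
    k1, k2  : width and height of the kernel
    """
    out1 = (h1 - k1 + 2 * padding) // stride + 1
    out2 = (h2 - k2 + 2 * padding) // stride + 1
    return (n_out, out1, out2)
```

For example, a $28\times 28$ input convolved with six $5\times 5$ kernels at stride 1 without padding gives `conv_output_shape(6, 28, 28, 5, 5)`, that is, `(6, 24, 24)`.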

There are two methods to implement the convolutional operation with a memristor crossbar. One is to treat a compact memristor crossbar as a set of sliding windows that slide over the input feature maps in turn to obtain the output feature map [39]. The other is to input an entire feature map to a sparse crossbar [40], but this method needs many redundant memristors, and it is also challenging to keep the conductance values of all copies of the same convolution kernel identical. This paper adopts the first method, whose crossbar scale is much smaller.

The memristor crossbar for convolutional operation is shown in Fig. 4. The current of the jth column is

$$\begin{aligned} I^p_j=&\sum ^M_{i=1}\sum ^{K_1\times K_2}_{k=1}\frac{V^p_{k,i}}{R_{k,i,j}}+\frac{V_b}{R_{M\times K_1\times K_2+1,j}}+\frac{V_f}{R_f} \nonumber \\ =&\sum ^M_{i=1}\sum ^{K_1\times K_2}_{k=1}V^p_{k,i}\cdot \left( G_{k,i,j}-G_s\right) \nonumber \\&+V_b\cdot \left( G_{M\times K_1\times K_2+1,j}-G_s\right) , \end{aligned}$$

(20)

where $G_{k,i,j}$ is the conductance value of the memristor in the jth column that receives $V^p_{k,i}$ and $G_{M\times K_1\times K_2+1,j}$ is the conductance of the memristor in the jth column that receives $V_b$. Then

$$\begin{aligned} V^p_{o,j}=\frac{R_1}{R_0}\left( I^p_jR_{a}+\frac{V_cR_{a}}{R_c}\right) , \end{aligned}$$

(21)

where j is also the index of the jth output feature map. Each column in the crossbar represents one kernel group.
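The sliding-window mapping can be sketched in NumPy: each receptive field is flattened into one input vector of length $M\times K_1\times K_2$, and each kernel group becomes one column of a weight matrix, so a single vector-matrix product yields the values of all N output feature maps at that position. The sketch ignores the bias row, the reference conductance $G_s$, and the activation stage, and its function names and array layouts are assumptions.

```python
import numpy as np


def receptive_fields(x, k1, k2, stride=1):
    """Flatten every receptive field of x (shape (M, H1, H2), no padding).

    Returns an array of shape (P, M * k1 * k2), one row per window position.
    """
    _, h1, h2 = x.shape
    rows = []
    for r in range(0, h1 - k1 + 1, stride):
        for c in range(0, h2 - k2 + 1, stride):
            rows.append(x[:, r:r + k1, c:c + k2].reshape(-1))
    return np.stack(rows)


def crossbar_conv(x, W, stride=1):
    """Convolution of Eq. (18) as one matrix product per receptive field.

    W has shape (N, M, k1, k2). After reshaping, each of the N columns of
    `weights` holds one kernel group, like one column of the crossbar.
    Returns an array of shape (P, N).
    """
    n, _, k1, k2 = W.shape
    weights = W.reshape(n, -1).T          # shape (M * k1 * k2, N)
    return receptive_fields(x, k1, k2, stride) @ weights
```

In hardware, the rows of `receptive_fields(...)` would be applied one after another as input voltage vectors, which is why the input and intermediate data have to be buffered outside the crossbar.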

Memristor array for average pooling operation

The average pooling operation is

$$\begin{aligned} y^p=\sum ^{K_1}_{i=1}\sum ^{K_2}_{j=1}\frac{x^p_{ij}}{K_1\times K_2}, \end{aligned}$$

(22)

where $x^p_{ij}$ is the input value at the position (i, j) of the pth receptive field, $y^p$ is the output value of the pth receptive field, and $K_1$ and $K_2$ are the width and the height of the pooling kernel, respectively. This operation can be implemented by a convolutional operation whose stride is equal to the kernel size and whose weight values are all $1/\left( K_1\times K_2\right) $.

The memristor array for average pooling operation is shown in Fig. 5. All memristors have resistance values of $K_1\times K_2\times R_a$, where $R_a$ represents the resistance of the resistor $R_a$ in Fig. 5. The output voltage of each column is

$$\begin{aligned} V_{o,p}=\frac{\mathrm{R}_a \mathrm{R}_1}{\mathrm{R}_0}\sum ^{K_1\times K_2}_{i=1}\frac{V^p_i}{K_1\times K_2}, \end{aligned}$$

(23)

where $p=1,2,\dots ,P$.
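A small numerical sketch of Eq. (22) as a fixed-weight dot product is given below. It assumes a single feature map whose sides are divisible by the kernel size and leaves out the resistor scaling of Eq. (23); the function name is illustrative.

```python
import numpy as np


def avg_pool_crossbar(x, k1, k2):
    """Average pooling of Eq. (22) over non-overlapping k1 x k2 windows.

    Every window is flattened and multiplied by a constant weight vector
    1 / (k1 * k2), mirroring a column of identical memristors.
    """
    h1, h2 = x.shape
    weights = np.full(k1 * k2, 1.0 / (k1 * k2))
    out = np.empty((h1 // k1, h2 // k2))
    for r in range(h1 // k1):
        for c in range(h2 // k2):
            field = x[r * k1:(r + 1) * k1, c * k2:(c + 1) * k2].reshape(-1)
            out[r, c] = field @ weights
    return out
```

For the $4\times 4$ input `np.arange(16.0).reshape(4, 4)` and a $2\times 2$ kernel, the result is `[[2.5, 4.5], [10.5, 12.5]]`.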

Operation of the memristor-based DNNs

The memristor-based DNNs are trained through the error back propagation (BP) algorithm. The memristors are updated in-situ according to the weight update values. The advantages of in-situ learning are that the learning process can adjust to hardware imperfections [4, 19, 47] and that the memristors can be updated in parallel. In-situ learning also provides a possible solution for completely on-chip learning.

Weight update schemes

Two kinds of weight update schemes are adopted to in-situ update memristors in the crossbar. They are the fixed-voltage update and the approximately linear update.

Fixed-voltage update

The fixed-voltage update means that the amplitudes and the durations of the writing voltages are fixed. There are two kinds of writing voltages, one to increase the conductance and one to decrease it, and they differ in sign and duration. Which one is used depends on the sign of the weight update value. This method is very easy to implement because weight update values need not be precisely converted to appropriate writing voltages, which is a difficult process because of the nonlinearity of the conductance change of the memristor [47, 48]. If $\varDelta W\ge \sigma $, the positive writing voltage is applied to the corresponding memristor, and if $\varDelta W<-\sigma $, the negative writing voltage is applied, where $\varDelta W$ is the weight update value and $\sigma $ is a small non-negative constant that filters out small update values. Because of the nonlinearity, the conductance update values are not the same for all memristors. Denote the absolute values of the rising slope and the descending slope of the approximately linear region of the conductance change versus the duration of the writing voltage by $k_r$ and $k_d$, respectively. The ratio of the duration of the pulse that increases the conductance to that of the pulse that decreases the conductance is equal to $k_d/k_r$.

Approximately linear update

The approximately linear update means that the middle approximately linear region of conductance changing of the memristor is adopted to represent most weight values [23]. The desired conductance update value $\varDelta {G}=\varDelta {W}\cdot r_{gw}$. The desired voltage duration for adjusting the memristor is approximately calculated as $\varDelta {G}/k_r$ for increasing or $\varDelta {G}/k_d$ for decreasing the conductance.
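The two schemes differ only in how a weight update value is turned into a pulse duration. The sketch below shows one reading of this mapping, consistent with Eqs. (30) to (34) in the training steps. Two points are assumptions rather than statements from the paper: the bracketed comparisons in Eqs. (33) and (34) are interpreted as selecting the entries of $\varDelta G$ that pass the threshold, and the duration for decreasing the conductance is taken as $|\varDelta G|/k_d$ so that it is non-negative. The function and argument names are illustrative.

```python
import numpy as np


def fixed_voltage_durations(dW, sigma, t_inc0, t_dec0):
    """Pulse durations for the fixed-voltage update, Eqs. (30)-(31).

    Every selected memristor gets the same duration, whatever |dW| is.
    """
    t_inc = (dW >= sigma) * t_inc0
    t_dec = (dW < -sigma) * t_dec0
    return t_inc, t_dec


def linear_durations(dW, sigma, r_gw, k_r, k_d):
    """Pulse durations for the approximately linear update.

    dG = dW * r_gw is Eq. (32). Entries below the threshold get zero
    duration; the others get |dG| divided by the slope of the linear
    region (assumed reading of Eqs. (33)-(34)).
    """
    dG = dW * r_gw
    t_inc = np.where(dG >= r_gw * sigma, dG / k_r, 0.0)
    t_dec = np.where(dG < -r_gw * sigma, -dG / k_d, 0.0)
    return t_inc, t_dec
```

With `dW = [0.3, -0.2, 0.01, -0.01]` and `sigma = 0.05`, both schemes leave the last two memristors untouched; the fixed-voltage scheme applies its constant durations to the first two, while the linear scheme scales the durations with the size of the update.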

Update memristors in the crossbar

Memristors are updated by applying voltages with appropriate durations through a row-parallel updating method. The method is illustrated in Fig. 6, in which the conductance of $M_1$ and $M_3$ needs to be increased and that of $M_2$ needs to be decreased. The writing is divided into two phases. The first phase increases the conductance and the second phase decreases it. $V^+_w$ and $V^-_w$ satisfy $V^+_w>V_{on}>\frac{1}{2}V^+_w>0$ and $V^-_w<V_{off}<\frac{1}{2}V^-_w<0$. In the first phase, the amplitude of the row voltage is $-\frac{1}{2}V^+_w$ and the amplitudes of the column voltages for increasing the conductance are all $\frac{1}{2}V^+_w$. So only the voltages across $M_1$ and $M_3$ are beyond the positive threshold voltage of the memristor. In the second phase, the amplitude of the row voltage is $-\frac{1}{2}V^-_w$ and the amplitudes of the column voltages for decreasing the conductance are all $\frac{1}{2}V^-_w$. So only the voltage across $M_2$ is below the negative threshold voltage. Therefore only memristors in the second row are updated and the rest remain unchanged. For the fixed-voltage update, the pulse durations of the column voltages do not vary, but for the approximately linear update, the pulse durations depend on the weight update values.
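The half-voltage selection can be checked with a few lines of NumPy. The sketch assumes that unselected rows and columns are held at 0 V and that the voltage across a memristor is the column voltage minus the row voltage; neither convention is stated explicitly in the text. The function name and the numerical values in the usage note are hypothetical.

```python
import numpy as np


def updated_cells(n_rows, selected_row, col_mask, v_w, v_th):
    """Return a boolean matrix of memristors that switch in one phase.

    v_w  : writing amplitude (V_w^+ > 0 for phase 1, V_w^- < 0 for phase 2)
    v_th : matching threshold (V_on for phase 1, V_off for phase 2)
    """
    row_v = np.zeros(n_rows)
    row_v[selected_row] = -0.5 * v_w              # selected row gets -V_w / 2
    col_v = np.where(col_mask, 0.5 * v_w, 0.0)    # selected columns get +V_w / 2
    v_cell = col_v[None, :] - row_v[:, None]      # voltage across each cell
    return v_cell > v_th if v_w > 0 else v_cell < v_th
```

For a $3\times 3$ array with the second row selected, `updated_cells(3, 1, np.array([True, False, True]), 1.2, 1.0)` marks only the first and third memristors of that row, and `updated_cells(3, 1, np.array([False, True, False]), -1.2, -1.0)` marks only the middle one. Every other cell sees at most half of the writing voltage and stays below threshold.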

The BP training

The BP training of the network is completed through the following steps:

1.
Reset all memristors to $R_{off}$ by applying reset voltages $V^-_w$, and then adjust the conductance to the approximately linear region by $V^+_w$ with appropriate timing.
2.
S is set high level and TGs in the left column in the memristor crossbar are closed. Feed input voltages to the DNNs and obtain errors through (17). Then the loss is calculated as
$$\begin{aligned} {\mathcal {L}}=\frac{1}{2}\sum ^C_{j=1}|V_{o,j}-V_{T,j}|^2, \end{aligned}$$
(24)
where C is the number of classes. Or
$$\begin{aligned} {\mathcal {L}}=\frac{1}{2}\Vert {\mathbf {V}}_o-{\mathbf {V}}_T\Vert ^2_2, \end{aligned}$$
(25)
where ${\mathbf {V}}_o$ is the final output voltage vector and ${\mathbf {V}}_T$ is the target voltage vector.
3.
Back propagate errors from the ($l+1$)th layer to the lth layer through weights of the ($l+1$)th layer.

For the FC layer, the error voltage vector of the lth layer is
$$\begin{aligned} \varDelta {\mathbf {V}}^{(l)}= {\left\{ \begin{array}{ll} \left( {\mathbf {V}}_o-{\mathbf {V}}_T\right) \odot f'_l\left( {\mathbf {V}}^{(l)}_z\right) , &{}l=L \\ \left( \left( {\mathbf {W}}^{(l+1)}\right) ^\mathrm{T}\varDelta {\mathbf {V}}^{(l+1)}\right) \odot f'_l\left( {\mathbf {V}}^{(l)}_z\right) , &{}l<L \end{array}\right. } \end{aligned}$$
(26)
Table 1 Simulation parameters
where ${\mathbf {W}}^{(l+1)}$ is the weight matrix of the $(l+1)$th layer, L is the number of layers, $f'_l(\cdot )$ is the derivative of the activation function in the lth layer, ${\mathbf {V}}^{(l)}_z$ is the unactivated output voltage vector of the lth layer, and $\odot $ is the element-wise multiplication. The backpropagation can be implemented through the memristor crossbar with S at low level: the columns are fed the error voltages and the rows output the propagated values.

For the convolution layer, there is
$$\begin{aligned} \varDelta {\mathbf {V}}^{(l)}=\varDelta {\mathbf {V}}^{(l+1)}\otimes \mathbf {rot180}\left( {\mathbf {W}}^{(l+1)}\right) \odot f'_l\left( {\mathbf {V}}^{(l)}_z\right) ,\nonumber \\ \end{aligned}$$
(27)
where ${\mathbf {W}}^{(l+1)}$ is one kernel in the $(l+1)$th layer, $\varDelta {\mathbf {V}}^{(l+1)}$ is the corresponding receptive field in the error matrix, $\otimes $ is the convolution operation, and $\mathbf {rot180}(\cdot )$ is the function to rotate the matrix 180 degrees. The backpropagation is implemented through weights read out from the crossbar.
4.
Determine the weight update values. For the FC layer
$$\begin{aligned} \varDelta {\mathbf {W}}^{(l)}=\varDelta {\mathbf {V}}^{(l)}\left( {\mathbf {V}}^{(l-1)}_o\right) ^\mathrm{T}, \end{aligned}$$
(28)
where ${\mathbf {V}}^{(l-1)}_o$ is the output voltage vector of the $(l-1)$th layer.

For the convolution layer
$$\begin{aligned} \varDelta {\mathbf {W}}^{(l)}={\mathbf {V}}^{(l-1)}_o\otimes \varDelta {\mathbf {V}}^{(l)}. \end{aligned}$$
(29)
5.
Determine desired writing voltages. For the fixed-voltage update, the pulse durations of writing voltages are
$$\begin{aligned} {\mathbf {t}}_{inc}&=\left( \varDelta {\mathbf {W}}\ge \sigma \right) \cdot t^+_0, \end{aligned}$$
(30)
$$\begin{aligned} {\mathbf {t}}_{dec}&=\left( \varDelta {\mathbf {W}}<-\sigma \right) \cdot t^-_0, \end{aligned}$$
(31)
where $t^+_0$ and $t^-_0$ are pulse durations of writing voltages for increasing and decreasing conductance, respectively, and amplitudes of these two voltages are $V^+_w$ and $V^-_w$, respectively.

For the approximately linear update, conductance update values of memristors are
$$\begin{aligned} \varDelta {\mathbf {G}}=\varDelta {\mathbf {W}}\cdot \ r_{gw}. \end{aligned}$$
(32)
Then desired pulse durations of writing voltages are
$$\begin{aligned}&{\mathbf {t}}_{inc}=\left[ \varDelta {\mathbf {G}}\ge \left( r_{gw}\cdot \sigma \right) \right] /k_r, \end{aligned}$$
(33)
$$\begin{aligned}&{\mathbf {t}}_{dec}=\left[ \varDelta {\mathbf {G}}<\left( -r_{gw}\cdot \sigma \right) \right] /k_d. \end{aligned}$$
(34)
6.
S is set low level. Apply desired writing voltages to memristors to update their conductance through the introduced weight update schemes.
7.
Repeat Step 2 to Step 6 until the loss is smaller than a predefined threshold value.
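For FC layers, Steps 2 to 4 can be summarized in software as follows. This is a minimal sketch under stated assumptions: biases are omitted, the same function is used in the forward and backward pass, and the quantity returned for each layer is the raw $\varDelta {\mathbf {W}}$ of Eq. (28), before any learning rate or sign convention is applied. The function names are illustrative and do not come from the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def backprop_fc(weights, v_in, v_target, act, act_prime):
    """Loss of Eq. (24) and weight update values of Eq. (28) for FC layers.

    weights : list of matrices, weights[l] has shape (n_out, n_in)
    Returns (loss, list of dW matrices with the same shapes as weights).
    """
    outputs, pre = [v_in], []
    for W in weights:                         # forward pass, Eq. (7)
        z = W @ outputs[-1]
        pre.append(z)
        outputs.append(act(z))
    loss = 0.5 * np.sum((outputs[-1] - v_target) ** 2)          # Eq. (24)
    delta = (outputs[-1] - v_target) * act_prime(pre[-1])       # Eq. (26), l = L
    d_weights = [None] * len(weights)
    for l in range(len(weights) - 1, -1, -1):
        d_weights[l] = np.outer(delta, outputs[l])              # Eq. (28)
        if l > 0:
            # Eq. (26), l < L: propagate through the transposed weights.
            delta = (weights[l].T @ delta) * act_prime(pre[l - 1])
    return loss, d_weights
```

The returned matrices would then be converted to pulse durations in Step 5 and written to the crossbar in Step 6.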

Because the main purpose of the paper is to evaluate the performance of the memristor-based DNNs constructed from the presented memristor crossbars along with the weight update schemes, the input and the intermediate data of the convolution operation are processed and stored in peripheral digital circuits, and so is the calculation of the weight update values, the desired conductance update values, and the durations of the writing voltages. The BP process can also be achieved by analog circuits [15], and the conductance update value can also be determined by a look-up table (LUT) [49].

Simulations and analysis

In this section, MMNN and MCNN are built from the presented memristor crossbars to carry out simulation experiments. The effectiveness of the circuits is substantiated in SPICE. The circuits and the learning process are also evaluated in Matlab under hardware-defined constraints. The simulation parameters are listed in Table 1. In the forward pass, the activation functions are the approximate formulas (13), (14), and (15), and in the backward pass, the derivatives are based on the original functions.

Results of MMNN

Two-layer neural networks are built based on the memristor-based crossbar in Fig. 1 for XOR operation and digits recognition on MNIST (Modified National Institute of Standards and Technology) [44] dataset, respectively.

Table 2 The truth table of the XOR operation


The MMNN for the XOR operation has two input units, three hidden units, and one output unit [50] and is trained by the approximately linear update scheme. The activation function is the binary function. The truth table of the XOR operation is shown in Table 2. Variations of the input and output voltages with training cycles of the XOR operation are shown in Fig. 7. After about 36 training cycles, the network correctly performs the XOR operation. The power consumption of the crossbars for the XOR operation is measured as 2.18 mW in SPICE in the inference phase. The total consumed energy is nevertheless very low because the inference time is very short, on the nanosecond scale. If smaller input voltages and memristors with larger resistance are adopted, the power consumption can be reduced further.

The MMNN for digit recognition on MNIST has 784 input units, 256 hidden units, and 10 output units. The MNIST dataset contains handwritten digits from 0 to 9. There are 60,000 training samples and 10,000 test samples in total, covering ten classes. Samples of each class are shown in Fig. 8. The input values are first converted to voltages in the range $[-V_r,V_r]$ through digital-to-analog converters (DACs) and are then input to the memristor crossbars. The curves of training accuracy, training loss, test accuracy, and test loss versus training epochs under the two weight update schemes are shown in Figs. 9 and 10, respectively. The classification accuracy of the fixed-voltage update is 96.42% and that of the approximately linear update is 96.29%.

Results of MCNN

Table 3 The architecture of MCNN


The architecture of MCNN in simulations is shown in Table 3 [39]. Conv1 is the first convolutional layer and Avgpool1 is the first average pooling layer. $5\times 5$, 6, $s=1$ means that the kernel size is $5\times 5$, the number of output channels is 6, and the convolution stride is 1.

The MNIST and CIFAR-10 [16] datasets are adopted to substantiate the effectiveness of MCNN. CIFAR-10 is a widely used benchmark for image recognition. It contains 50,000 color training images and 10,000 test images of 10 classes, and samples of each class are shown in Fig. 11. The classification results of MCNN trained by the approximately linear update on the two datasets are shown in Figs. 12 and 13, respectively. The final test accuracies on MNIST and CIFAR-10 are about 98.98% and 60.38%, respectively. MCNN is also trained by the fixed-voltage update scheme on MNIST, and the test accuracy is 97.82%.

Table 4 Classification Results of MMNN


Table 5 Classification results of MCNN


Results analysis

Classification results of MMNN and MCNN on MNIST are listed in Tables 4 and 5; they are obtained by running cross-validation multiple times. It is seen that MCNN has better results than MMNN, and it can also be seen that the approximately linear update performs better than the fixed-voltage update. The confusion matrices of classification results of MNIST and Fashion-MNIST obtained by MMNN and MCNN are shown in Fig. 14.

Robustness analysis

For the fixed-voltage update, the duration of the writing voltage has an impact on the performance. Test errors of MMNN trained on MNIST by writing voltages with different pulse durations are shown in Fig. 15, where the duration is that of the voltage used to decrease the conductance. It is seen that the test error becomes large if the pulse duration is very long.

Because of the inherent characteristics of memristors, there are cycle-to-cycle (C2C) and device-to-device (D2D) variations in conductance adjustment. Errors of the writing voltages also result in conductance variations. To evaluate the impact of these variations on the performance of the networks, Gaussian noise with zero mean and a standard deviation from 0 to $12\%$ of the conductance value is considered as the conductance variation in the approximately linear update. The conductance value after updating is $G_{new}=\left( G_{old}+\varDelta G\right) \left( 1+s\right) $ [15], where $G_{old}$ is the conductance before the update and s is the noise level. Test errors of MMNN and MCNN under different variation levels are shown in Figs. 16 and 17, respectively. It is seen that the test error increases as the variation level increases.
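The noisy update rule can be sketched as below, assuming that s is drawn independently for every memristor and every update from a zero-mean Gaussian whose standard deviation is the chosen variation level. The function name and the use of NumPy's `Generator` are choices made for the illustration, and conductance bounds are not modeled.

```python
import numpy as np


def noisy_update(g_old, d_g, noise_std, rng):
    """Apply G_new = (G_old + dG) * (1 + s) with s ~ N(0, noise_std^2).

    g_old, d_g : arrays of conductances and desired conductance changes
    noise_std  : relative variation level, e.g. 0.12 for 12 %
    rng        : numpy.random.Generator used to draw the noise
    """
    s = rng.normal(0.0, noise_std, size=np.shape(g_old))
    return (np.asarray(g_old) + np.asarray(d_g)) * (1.0 + s)
```

Setting `noise_std` to 0 recovers the ideal update, so the same training loop can be reused to sweep the variation level from 0 to 12%.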

Computing complexity analysis

In the FC layer, the computational complexity of the vector-matrix multiplication is generally $O(N^2)$. In the memristor crossbar-based FC layer, the vector-matrix multiplication is performed with complexity $O(1)$, and the activation functions are applied in the circuit at the same time. In the convolutional layer, an efficient way to perform the convolutional operation is to convert it to a matrix multiplication, whose complexity is generally $O(N^3)$. This conversion is also needed for the memristor crossbar-based convolutional operation, and it is carried out outside the crossbar. The advantage of the memristor crossbar-based convolutional operation is that it reduces the complexity of the matrix multiplication to $O(N)$. In the average pooling layer, the average pooling operation is performed with complexity $O(1)$. The weights are stored as the conductances of the memristors, and the vector-matrix multiplication is performed in memory. The intermediate data of the convolutional operation are stored in external storage, which could also be realized by a memristor array.
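The conversion of a convolution into a sequence of vector-matrix multiplications can be sketched as follows. This is a software illustration under stated assumptions, not the authors' circuit: `im2col`, `crossbar_read`, and `conv_via_crossbar` are hypothetical helper names, the kernels are applied without flipping (cross-correlation, as is usual in CNNs) with stride 1 and no padding, and signed weights stand in for the effective conductances of the crossbar. Each call to `crossbar_read` models one parallel read, so the number of reads equals the number of input patches.

```python
def im2col(image, k):
    """Unroll every k x k patch of a 2-D image (stride 1, no padding) into a row."""
    h, w = len(image), len(image[0])
    rows = []
    for r in range(h - k + 1):
        for c in range(w - k + 1):
            rows.append([image[r + i][c + j] for i in range(k) for j in range(k)])
    return rows


def crossbar_read(voltages, conductances):
    """Model one crossbar read: column current j = sum_i V_i * G[i][j].

    In hardware all columns settle in parallel; here they are summed in software.
    """
    n_cols = len(conductances[0])
    return [sum(v * row[j] for v, row in zip(voltages, conductances))
            for j in range(n_cols)]


def conv_via_crossbar(image, kernels):
    """Apply several k x k kernels using one modelled crossbar read per patch."""
    k = len(kernels[0])
    # Each kernel is flattened into one crossbar column of length k * k.
    columns = [[kern[i][j] for i in range(k) for j in range(k)] for kern in kernels]
    weights = [list(row) for row in zip(*columns)]   # shape (k * k, n_kernels)
    patches = im2col(image, k)                       # done outside the crossbar
    outputs = [crossbar_read(p, weights) for p in patches]
    return outputs, len(patches)


def conv_direct(image, kernel):
    """Reference sliding-window computation for a single kernel."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for r in range(h - k + 1) for c in range(w - k + 1)]


image = [[1, 2, 3, 0], [4, 5, 6, 1], [7, 8, 9, 2], [0, 1, 2, 3]]
kernels = [[[1, 0], [0, -1]], [[0, 1], [1, 0]]]
outputs, reads = conv_via_crossbar(image, kernels)
print("crossbar reads:", reads)
print("matches direct convolution:",
      [o[0] for o in outputs] == conv_direct(image, kernels[0]))
```

In this sketch the unrolling step corresponds to the conversion realized outside the crossbar, while the per-patch reads correspond to the constant-time vector-matrix multiplications, which is why the crossbar-side cost grows only with the number of patches.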

Comparisons

Table 6 Comparisons of MMNN with software-based and other memristor-based MNNs

Table 7 Comparisons of MCNN with software-based and other memristor-based CNNs

Comparisons of MMNN with a software-based MNN and other memristor-based MNNs [19, 41, 48, 50] are shown in Table 6. Comparisons of MCNN with a software-based CNN and other memristor-based CNNs [39, 41, 49] are shown in Table 7. The comparisons show that MMNN and MCNN, built from the presented memristor crossbars and trained in situ with the two weight update schemes, have advantages in circuit functions and classification results over other memristor-based neural network circuits.

Conclusions

This paper presents memristor crossbar architectures for implementing DNNs, including architectures for the FC layer, the convolutional operation, and the average pooling operation. MMNN and MCNN are built to evaluate the performance of these architectures. The networks are trained in situ with two weight update schemes, the fixed-voltage update and the approximately linear update, and simulation results show that networks trained with either scheme achieve satisfactory performance. The robustness of MMNN and MCNN to conductance variations of memristors is also analyzed. In summary, the memristor-based DNNs constructed from the presented memristor crossbars perform satisfactorily in pattern recognition tasks and show a certain robustness to hardware imperfections.

References

Adam K, Smagulova K, James AP (2018) Memristive LSTM network hardware architecture for time-series predictive modeling problems. In: IEEE Asia Pacific conference on circuits and systems. Chengdu, China, pp 459–462
Adam K, Smagulova K, Krestinskaya O, James AP (2018) Wafer quality inspection using memristive LSTM, ANN, DNN and HTM. In: IEEE electrical design of advanced packaging and systems symposium. Chandigarh, India
Adhikari SP, Yang C, Kim H, Chua LO (2012) Memristor bridge synapse-based neural network and its learning. IEEE Trans Neural Netw Learn Syst 23(9):1426–1435
Alibart F, Zamanidoost E, Strukov DB (2013) Pattern classification by memristive crossbar circuits using ex situ and in situ training. Nat Commun 4:4
Cantley KD, Subramaniam A, Stiegler HJ, Chapman RA, Vogel EM (2012) Neural learning circuits utilizing nano-crystalline silicon transistors and memristors. IEEE Trans Neural Netw Learn Syst 23(4):565–573
Chua LO (1971) Memristor-the missing circuit element. IEEE Trans Circ Theory 18(5):507–519
Di Marco M, Forti M, Pancioni L (2017) Memristor standard cellular neural networks computing in the flux-charge domain. Neural Netw 93:152–164
Gokmen T, Onen M, Haensch W (2017) Training deep convolutional neural networks with resistive cross-point devices. Front Neurosci 11:538
Gokmen T, Rasch M, Haensch W (2018) Training LSTM networks with resistive cross-point devices. URL http://arxiv.org/abs/1806.00166
Hasan R, Taha TM (2014) Enabling back propagation training of memristor crossbar neuromorphic processors. In: International joint conference on neural networks. Beijing, China, pp 21–28
Hasan R, Taha TM, Yakopcic C (2017) On-chip training of memristor crossbar based multi-layer neural networks. Microelectron J 66:31–40
Hu M, Graves CE, Li C, Li Y, Ge N, Montgomery E et al (2018) Memristor-based analog computation and neural network classification with a dot product engine. Adv Mater 30(9):1
Krestinskaya O, Ibrayev T, James AP (2018) Hierarchical temporal memory features with memristor logic circuits for pattern recognition. IEEE Trans Comput Aided Des Integr Circ Syst 37(6):1143–1156
Krestinskaya O, James AP (2018) Feature extraction without learning in an analog spatial pooler memristive-cmos circuit design of hierarchical temporal memory. Analog Integr Circ Signal Process 95(3):457–465
Krestinskaya O, Salama KN, James AP (2019) Learning in memristive neural network architectures using analog backpropagation circuits. IEEE Trans Circ Syst I Reg Papers 66(2):719–732
Krizhevsky A (2009) Learning multiple layers of features from tiny images
Kvatinsky S, Friedman EG, Kolodny A, Weiser UC (2013) TEAM: threshold adaptive memristor model. IEEE Trans Circ Syst I Reg Papers 60(1):211–221
Kvatinsky S, Ramadan M, Friedman EG, Kolodny A (2015) VTEAM: a general model for voltage-controlled memristors. IEEE Trans Circ Syst II Exp Briefs 62(8):786–790
Li C, Belkin D, Li Y, Yan P, Hu M, Ge N et al (2018) Efficient and self-adaptive in-situ learning in multilayer memristor neural networks. Nat Commun 9(1):1
Li C, Wang Z, Rao M, Belkin D, Song W, Jiang H et al (2019) Long short-term memory networks in memristor crossbar arrays. Nat Mach Intell 1(1):49
Liu J, Li Z, Tang Y, Hu W, Wu J (2020) 3D convolutional neural network based on memristor for video recognition. Pattern Recogn Lett 130:116–124
Liu X, Huang Y, Zeng Z, Wunsch DC II (2020) Memristor-based HTM spatial pooler with on-device learning for pattern recognition. IEEE Trans Syst Man Cybern Syst. https://doi.org/10.1109/TSMC.2020.3035612
Liu X, Zeng Z, Wunsch DC II (2020) Memristor-based LSTM network with in situ training and its applications. Neural Netw 131:300–311
Liu X, Zeng Z, Wen S (2016) Implementation of memristive neural network with full-function Pavlov associative memory. IEEE Trans Circ Syst I Reg Papers 63(9):1454–1463
Pershin Y, Di Ventra M (2010) Experimental demonstration of associative memory with memristive neural networks. Neural Netw 23(7):881–886
Shafiee A, Nag A, Muralimanohar N, Balasubramonian R, Strachan JP, Hu M, Williams RS, Srikumar V (2016) ISAAC: A convolutional neural network accelerator with in-situ analog arithmetic in crossbars. In: ACM/IEEE 43rd annual international symposium on computer architecture. Seoul, South Korea, pp 14–26
Silva F, Sanz M, Seixas J, Solano E, Omar Y (2020) Perceptrons from memristors. Neural Netw 122:273–278
Smagulova K, Adam K, Krestinskaya O, James AP (2018) Design of CMOS-memristor circuits for LSTM architecture. In: IEEE International conferences on electron devices and solid-state circuits. Shenzhen, China
Smagulova K, Krestinskaya O, James AP (2018) A memristor-based long short term memory circuit. Analog Integ Circ Signal Process 95(3):467–472
Soudry D, Di Castro D, Gal A, Kolodny A, Kvatinsky S (2015) Memristor-based multilayer neural networks with online gradient descent training. IEEE Trans Neural Netw Learn Syst 26(10):2408–2421
Strukov DB, Snider GS, Stewart DR, Williams RS (2008) The missing memristor found. Nature 453(7191):80
Truong SN, Min KS (2014) New memristor-based crossbar array architecture with 50-% area reduction and 48-% power saving for matrix-vector multiplication of analog neuromorphic computing. J Semicond Technol Sci 14(3):356–363
Truong SN, Van Pham K, Min KS (2018) Spatial-pooling memristor crossbar converting sensory information to sparse distributed representation of cortical neurons. IEEE Trans Nanotechnol 17(3):482–491
Wang Z, Wang X (2018) A novel memristor-based circuit implementation of full-function Pavlov associative memory accorded with biological feature. IEEE Trans Circ Syst I Reg Papers 65(7):2210–2220
Wen S, Wei H, Yang Y, Guo Z, Zeng Z, Huang T, Chen Y (2019) Memristive LSTM network for sentiment analysis. IEEE Trans Syst Man Cybern Syst. https://doi.org/10.1109/TSMC.2019.2906098
Wen S, Wei H, Zeng Z, Huang T (2018) Memristive fully convolutional network: An accurate hardware image-segmentor in deep learning. IEEE Trans Emerg Top Comput Intell 2(5):324–334
Wen S, Xie X, Yan Z, Huang T, Zeng Z (2018) General memristor with applications in multilayer neural networks. Neural Netw 103:142–149
Xie X, Wen S, Zeng Z, Huang T (2018) Memristor-based circuit implementation of pulse-coupled neural network with dynamical threshold generators. Neurocomputing 284:10–16
Yakopcic C, Alom MZ, Taha TM (2016) Memristor crossbar deep network implementation based on a convolutional neural network. In: International joint conference on neural networks. Vancouver, Canada, pp 963–970
Yakopcic C, Alom MZ, Taha TM (2017) Extremely parallel memristor crossbar architecture for convolutional neural network implementation. In: International joint conference on neural networks. Anchorage, USA, pp 1696–1703
Yakopcic C, Hasan R, Taha TM (2015) Memristor based neuromorphic circuit for ex-situ training of multi-layer neural network algorithms. In: International joint conference on neural networks. Killarney, Ireland
Yang L, Zeng Z, Shi X (2019) A memristor-based neural network circuit with synchronous weight adjustment. Neurocomputing 363:114–124
Yang L, Zeng Z, Wen S (2018) A full-function Pavlov associative memory implementation with memristance changing circuit. Neurocomputing 272:513–519
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
Yao P, Wu H, Gao B, Eryilmaz SB, Huang X, Zhang W et al (2017) Face classification using electronic synapses. Nature Commun 8:1
Yao P, Wu H, Gao B, Tang J, Zhang Q, Zhang W, Yang JJ, Qian H (2020) Fully hardware-implemented memristor convolutional neural network. Nature 577(7792):641–646
Zamanidoost E, Bayat FM, Strukov D, Perceptron AMl (2015) Manhattan rule training for memristive crossbar circuit pattern classifiers. In: IEEE International symposium on intelligent signal processing, pp 1–6. Siena, Italy
Zhang Q, Wu H, Yao P, Zhang W, Gao B (2018) Sign backpropagation: an on-chip learning algorithm for analog RRAM neuromorphic computing systems. Neural Netw 108:217–223
Zhang Y, Cui M, Shen L, Zeng Z (2019) Memristive quantized neural networks: a novel approach to accelerate deep learning on-chip. IEEE Trans Cybern 1:1–13
Zhang Y, Wang X, Friedman EG (2018) Memristor-based circuit design for multilayer neural networks. IEEE Trans Circ Syst I Reg Papers 65(2):677–686
Zhang Y, Wang X, Li Y, Friedman EG (2017) Memristive model for synaptic circuits. IEEE Trans Circ Syst II Exp Briefs 64(7):767–771
Zhang Y, Zeng Z, Wen S (2014) Implementation of memristive neural networks with spike-rate-dependent plasticity synapses. In: International joint conference on neural networks, pp 2226–2233
Zhu S, Wang L, Duan S (2017) Memristive pulse coupled neural network with applications in medical image processing. Neurocomputing 227:149–157
Zyarah AM, Kudithipudi D (2019) Neuromemristive architecture of HTM with on-device learning and neurogenesis. ACM J Emerg Technol Comput Syst 15(3):24

Author information

Authors and Affiliations

Key Laboratory of Image Processing and Intelligent Control of Education Ministry of China, Wuhan, China
Xiaoyang Liu & Zhigang Zeng
School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, 430074, China
Xiaoyang Liu & Zhigang Zeng

Authors

Xiaoyang Liu
Zhigang Zeng

Corresponding author

Correspondence to Zhigang Zeng.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Liu, X., Zeng, Z. Memristor crossbar architectures for implementing deep neural networks. Complex Intell. Syst. 8, 787–802 (2022). https://doi.org/10.1007/s40747-021-00282-4

Received: 13 October 2020
Accepted: 20 January 2021
Published: 20 July 2021
Issue Date: April 2022
DOI: https://doi.org/10.1007/s40747-021-00282-4
