International Journal of Fuzzy Logic and Intelligent Systems 2021; 21(4): 317-337
Published online December 25, 2021
https://doi.org/10.5391/IJFIS.2021.21.4.317
© The Korean Institute of Intelligent Systems
Chan Sik Han and Keon Myung Lee
Department of Computer Science, Chungbuk National University, Cheongju, Korea
Correspondence to :
Keon Myung Lee (kmlee@cbnu.ac.kr)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Spiking neural networks (SNNs) have attracted attention as the third generation of neural networks for their promising characteristics of energy efficiency and biological plausibility. The diversity of spiking neuron models and architectures has led to the development of various learning algorithms. This paper provides a gentle survey of SNNs to give an overview of what they are and how they are trained. It first presents how biological neurons work and how they are mathematically modelled, especially with differential equations. Next, it categorizes the learning algorithms for SNNs into groups and presents how their representative algorithms work. Then it briefly describes the neuromorphic hardware on which SNNs run.
Keywords: Spiking neural network, Deep learning, Neural network, Machine learning, Learning algorithms
Deep neural networks (DNNs) have achieved great success in various formerly difficult tasks such as vision, speech, natural language, and games [1–5]. They have many layers and parameters and thus require huge computing resources, resulting in high energy consumption. Spiking neural networks (SNNs) are computational models which mimic biological neural networks more closely than conventional artificial neural networks (ANNs) [6]. In SNNs, neurons called spiking neurons receive spikes from other neurons and generate spikes. SNNs can be implemented in an energy-efficient way because they operate on sparsely generated spikes. With this expectation, machine learning researchers have made efforts to develop SNNs with performance comparable to DNNs [7]. On the other hand, in neuroscience, researchers have been interested in simulating brain-scale SNNs to study brain functions. The diversity of neuron models for biological neurons, spike coding methods, and network architectures has led to the development of various learning algorithms for SNNs.
This survey focuses on SNNs from the perspective of machine learning. In Section 2, for a better understanding of SNNs, we first explain the structure of biological neurons and their behaviors, and then several mathematical models of spiking neurons. In conventional ANNs and DNNs, a simple neuron model is used whose operation is expressed as the application of an activation function to the weighted sum of its inputs. On the contrary, in SNNs, spiking neurons can be modelled in various ways using differential equations with several hyperparameters such as the threshold level, time constant, refractory time, latency, and so on. We also present the notion of synaptic plasticity, which is used to model local learning occurring in SNNs. Spiking neurons receive and generate a spike or a spike train. Hence, all inputs and outputs for training should be encoded as spikes.
We present the neural coding methods for the input and output of SNNs. Then we describe how to simulate SNNs in software and what issues there are in the simulation.
In Section 3, we deal with learning methods for SNNs. They can be grouped into unsupervised methods and supervised methods. The unsupervised methods include some local learning algorithms. The supervised methods can be categorized into direct methods, ANN-SNN conversion methods, and hybrid methods. The direct methods train the SNNs directly, despite the non-differentiable spiking activation, by using surrogate functions or imposed restrictions. The ANN-SNN methods first train an ANN and then convert the trained ANN into an SNN with the same architecture. The hybrid methods first apply an ANN-SNN method and then fine-tune the converted SNN with a direct method.
Section 4 briefly reviews the neuromorphic hardware that has been developed for neuroscience studies and machine learning. Despite numerous publications on neuromorphic hardware, few neuromorphic hardware platforms are on the market. In Section 5, we draw conclusions along with the current research trends.
This section presents the behaviors of biological neurons, the computational models of spiking neurons, the notion of synaptic plasticity for brain learning, the architectures of SNNs, neural coding methods for input/output representations, and simulation of SNNs.
Brains consist of a large number of nerve cells called neurons (approximately 86 billion in the human brain) which are heavily interconnected with each other [8]. Neurons communicate with each other via spikes, also known as action potentials [9]. Action potentials are electrical impulses with a short duration of around 10 ms in the brain. A neuron has three main components: dendrites, an axon, and a cell body called the soma [10]. Dendrites are tree-shaped components which receive signals from the axons of other neurons. When spikes arrive through the dendrites, the soma increases or decreases its membrane potential and sends an action potential down the axon when the membrane potential reaches its threshold level. The axon is a long nerve fiber which conducts action potentials away from the soma. When an action potential is transmitted through an axon, synaptic vesicles move toward the axon terminal and release neurotransmitters into the synaptic cleft [11]. The synaptic cleft is the small gap that separates the presynaptic and postsynaptic neurons. A synapse is the junction between the axon of one neuron and a dendrite of another, through which the two neurons communicate with neurotransmitters. Neurotransmitters are chemicals that influence a receiving neuron either to promote or to prevent the generation of spikes by binding with their corresponding receptors on the dendrite of the receiving neuron [12]. Such bindings open or close ion channels, which causes the membrane potential to change. Synaptic weight refers to the connection strength at a synapse, which is determined by the amount of released neurotransmitters, the number of receptors that absorb them, and the signal propagation resistance in the axon and the dendrite.
Membrane potential experiences a sequence of phases when generating an action potential: resting state, depolarization, repolarization, and hyperpolarization, as shown in Figure 1 [13]. In the resting state, there is an imbalance of electrical charges between the interior of the neuron and its surroundings. The resting potential is the ground value of the trans-membrane voltage, a negatively charged potential of approximately −70 mV. When the neurotransmitter glutamate binds with the AMPA receptor, the sodium channels open, which results in a large influx of sodium ions. This causes a rapid rise of the membrane potential, and when the membrane potential reaches the threshold level, the action potential starts to be generated. This phase of a rise of potential from negative toward positive is called depolarization. When the membrane potential increases enough to open the potassium channels, potassium ions start to move out of the neuron, which results in a rapid drop of the membrane potential. This phase is called repolarization because the membrane potential becomes negatively charged again. The action potential is shaped during the transition from depolarization to repolarization. The slow closing of the potassium channels causes an undershoot of the membrane potential, which results in a period in which the potential is lower than the resting potential. This phase is called hyperpolarization, and its period is called the refractory time. The hyperpolarized potential gradually returns to the resting potential through ion movement across the membrane.
Membrane potential increases when a spike is received through an excitatory synapse, whereas it decreases when a spike is received through an inhibitory synapse. Membrane potential leaks away exponentially over time.
Several mathematical models for spiking neurons have been developed to describe the characteristics of membrane potential change shown in Figure 1. A spiking neuron receives spikes, accumulates them into its membrane potential, generates a spike when its potential reaches the threshold, and then resets its potential. Among the spiking neuron models, there are the Hodgkin-Huxley model, the leaky integrate-and-fire (LIF) model, the integrate-and-fire (IF) model, the spike response model (SRM), Izhikevich's model, the FitzHugh-Nagumo (FHN) model, and so on.
The Hodgkin-Huxley model [14] describes the membrane potential in terms of a sodium channel current, a potassium channel current, and a leak current. For a neuron with sodium and potassium channels, it models the total membrane current as the sum of a capacitive current, the two channel currents, and the leak current, where each ionic current is the product of a maximum conductance, voltage-dependent gating variables, and the difference between the membrane potential and the reversal potential of the channel.
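For reference, the standard textbook form of the Hodgkin-Huxley current equation (the symbols here follow the common convention rather than the paper's figure) is:

```latex
I(t) = C_m \frac{dV_m}{dt}
     + \bar{g}_{K}\, n^4 \,(V_m - V_K)
     + \bar{g}_{Na}\, m^3 h \,(V_m - V_{Na})
     + \bar{g}_{l}\,(V_m - V_l),
\qquad
\frac{dn}{dt} = \alpha_n(V_m)\,(1-n) - \beta_n(V_m)\, n,
```

where $C_m$ is the membrane capacitance, $\bar{g}_{K}, \bar{g}_{Na}, \bar{g}_{l}$ are the maximum conductances, $V_K, V_{Na}, V_l$ are the reversal potentials, and the gating variables $n$, $m$, and $h$ each obey a first-order equation of the form shown for $n$.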
The LIF model [15] describes the membrane potential in a simpler way than the Hodgkin-Huxley model, without introducing channel potentials. The LIF model integrates the injected current into the membrane potential, but lets the membrane potential slowly leak over time.
The LIF model can be expressed as the electric circuit shown in Figure 3, in which a resistor and a capacitor are connected in parallel and driven by the injected current. Its membrane potential obeys a first-order linear differential equation whose time constant is the product of the membrane resistance and capacitance, where the potential decays exponentially toward the resting potential in the absence of input. Solving the equation expresses the membrane potential as an exponentially weighted integral of the past input current. Due to this integral term, the continuous-time form is inconvenient to evaluate step by step, so in practice the model is discretized into a difference equation that is applied at every simulation time step.
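A commonly used form of the LIF equation, with standard notation assumed here, is:

```latex
\tau_m \frac{dV_m(t)}{dt} = -\bigl(V_m(t) - V_{rest}\bigr) + R\, I(t),
\qquad \tau_m = R\,C,
```

where $R$ and $C$ are the membrane resistance and capacitance and $\tau_m$ is the membrane time constant; when $V_m$ reaches the threshold, a spike is emitted and $V_m$ is reset. For a constant input current $I_0$ applied from rest at $t=0$, the solution is $V_m(t) = V_{rest} + R I_0 \bigl(1 - e^{-t/\tau_m}\bigr)$.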
The IF model is a simplified version of the LIF model in which the leakage of the membrane potential is ignored. It is somewhat weak in expressing the behavior of the membrane potential of biological neurons, yet its simplicity is beneficial in computational and implementation aspects, especially in hardware. The difference equation for the IF model simply accumulates the weighted incoming spikes into the membrane potential at each time step, as sketched below.
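A minimal discrete-time form consistent with this description (generic symbols, not necessarily the paper's notation) is:

```latex
V_m[t+1] = V_m[t] + \sum_j w_j\, s_j[t];
\qquad
\text{if } V_m[t+1] \ge \vartheta:\ \text{emit a spike and set } V_m[t+1] \leftarrow V_{reset},
```

where $s_j[t] \in \{0,1\}$ indicates a spike from presynaptic neuron $j$ at step $t$, $w_j$ is the synaptic weight, and $\vartheta$ is the firing threshold.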
The SRM is a generalization of the LIF model whose membrane potential is formulated with filter (kernel) functions instead of a differential equation [16]. The potential is written as the sum of a refractory kernel triggered by the neuron's own most recent spike, the postsynaptic potential kernels evoked by incoming spikes, and a filtered external current, where the neuron fires when the resulting potential crosses the threshold.
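A standard form of the SRM membrane potential (notation assumed) is:

```latex
u_i(t) = \eta\bigl(t - \hat{t}_i\bigr)
       + \sum_j w_{ij} \sum_{f} \varepsilon\bigl(t - t_j^{(f)}\bigr)
       + \int_0^{\infty} \kappa(s)\, I^{ext}(t - s)\, ds,
```

where $\hat{t}_i$ is the last firing time of neuron $i$, $t_j^{(f)}$ are the spike times of presynaptic neuron $j$, $\eta$ is the refractory kernel, $\varepsilon$ is the postsynaptic response kernel, and $\kappa$ filters the external current.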
The Izhikevich model [17] is a generalized neuron model that can generate most recognized firing patterns of biological neurons, and is as biologically plausible as the Hodgkin-Huxley model yet as computationally efficient as the IF model. The model is expressed in two differential equations for the membrane potential and a recovery variable, together with an auxiliary after-spike resetting rule, where the dimensionless parameters a, b, c, and d shape the firing pattern.
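The two differential equations and the after-spike reset, in the standard published form of the model, are:

```latex
\frac{dv}{dt} = 0.04\,v^{2} + 5v + 140 - u + I, \qquad
\frac{du}{dt} = a\,(b\,v - u),
\qquad
\text{if } v \ge 30\ \mathrm{mV}:\ v \leftarrow c,\ u \leftarrow u + d,
```

where $v$ is the membrane potential, $u$ is the membrane recovery variable, and $I$ is the injected current.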
The FHN model is a simplified, two-variable reduction of the Hodgkin-Huxley model [18]. It couples the membrane potential with a single slow recovery variable, where the recovery variable plays the role of the gating variables of the Hodgkin-Huxley model.
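A standard form of the FitzHugh-Nagumo equations (parameter names assumed) is:

```latex
\frac{dv}{dt} = v - \frac{v^{3}}{3} - w + I^{ext},
\qquad
\tau \frac{dw}{dt} = v + a - b\,w,
```

where $v$ plays the role of the membrane potential, $w$ is the recovery variable, and $a$, $b$, and $\tau$ control the shape and time scale of the recovery.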
There are variants of the above-mentioned neuron models as well as other models. Neuroscience studies usually use the Hodgkin-Huxley model, Izhikevich's model, the FHN model, and their variants, whereas machine learning studies usually use the LIF model, the IF model, and the SRM.
Synaptic strengths, i.e., synaptic weights, between neurons strongly affect the behavior of the brain. Synaptic plasticity is the ability of synapses to strengthen or weaken over time, which is widely believed to contribute to learning and memory in the brain. Spike-timing-dependent plasticity (STDP) is a biological process that adjusts synaptic weights according to the tight temporal correlations between the spikes of presynaptic and postsynaptic neurons [19]. There are two major phenomena in synaptic plasticity: long-term potentiation (LTP) and long-term depression (LTD) [20]. LTP is a persistent strengthening of synapses, whereas LTD is a persistent weakening of synapses, both based on recent patterns of spikes between presynaptic and postsynaptic neurons.
According to STDP, repeated presynaptic spike arrival a few milliseconds before postsynaptic spikes leads to LTP of the synapses in many synapse types. On the other hand, repeated presynaptic spike arrival after postsynaptic spikes leads to LTD of the synapses. Figure 5 shows an STDP function that plots the change of synaptic weight as a function of the relative timing of presynaptic and postsynaptic spikes. In the figure, the magnitude of the weight change decays exponentially as the time difference between the presynaptic and postsynaptic spikes grows.
Neuroscientists have paid attention to how to model STDP because it seems to play a key role in learning and information storage in the brain. Several mathematical models of STDP have been proposed. Machine learning researchers have also tried to apply STDP to train SNN models.
SNNs are ANNs that consist of spiking neurons, and they mimic biological neural networks more closely. SNNs incorporate the concept of time into their operations, which are carried out in a spike-time-dependent manner. In conventional ANNs, neurons transmit information at each propagation cycle regardless of their activation value. On the contrary, in SNNs, neurons transmit information (i.e., spikes) to postsynaptic neurons only when their membrane potential reaches the threshold level. That is, only when a spike is generated at a neuron is the spike propagated to its postsynaptic neurons. This saves energy because pulse-shaped spikes are propagated only when they are generated. SNNs are called the third-generation neural networks and have attracted attention as an energy-efficient neural network model from an engineering perspective [21].
In SNNs, spiking neurons are connected with synapses by which the transmitted spikes are either amplified or attenuated. There are two types of synapses: excitatory and inhibitory synapses [22]. When a spike is propagated through an excitatory synapse, the membrane potential of the receiving neuron is increased. When a neuron receives a spike through an inhibitory synapse, its membrane potential is decreased.
SNNs can be organized into layered architectures, recurrent architectures, or hybrid architectures [23]. In a hybrid architecture, some subpopulations are layered, and others are recurrent. Examples of hybrid architectures include the synfire chain and the liquid state machine (LSM). A synfire chain has a multi-layered architecture, each layer of which is organized as a recurrent network, as shown in Figure 6 [24]. An LSM is a large, recurrent network of spiking neurons, some of which receive inputs and some of which are read out as the output values, as shown in Figure 7 [25]. In an LSM, only the weights connected to the output neurons are trained, while all other weights are randomly initialized and kept fixed.
Neuroscientists have been interested in building nearly human-brain-sized neural networks and analyzing their behaviors to understand the mechanisms of various functions in the brain [26]. They deal with complex SNNs organized into recurrent or hybrid architectures. For such complex networks, there are as yet no successful learning algorithms except STDP and its variants, and STDP and its variants are not powerful enough to train complex SNNs. Hence, neuroscientists are usually not so much interested in training the SNNs; they simulate the operation of huge SNNs and analyze their characteristic patterns. Machine learning model developers are interested in developing SNN models with high accuracy and low energy consumption. They usually deal with layered SNN models and have developed various learning algorithms for SNNs.
In SNNs, spiking neurons receive and produce spike signals. Hence, all inputs to SNNs should be encoded into a spike or a spike train, which is a sequence of spikes spread over the time dimension. The outputs of SNNs are likewise a spike or a spike train, and such outputs need to be decoded into an understandable format such as scalar values. These kinds of encoding and decoding are called neural coding.
A spike occurring at a given time point is usually represented mathematically with the Dirac delta function located at that time, and a spike train is represented as the sum of such delta functions over the firing times, where the delta function is nonzero only at the moment of the spike.
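In this common representation, a spike train with firing times $t^{(f)}$ is written as:

```latex
S(t) = \sum_{f} \delta\bigl(t - t^{(f)}\bigr),
```

where $\delta(\cdot)$ is the Dirac delta function.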
The neural encoding methods can be classified into rate coding, temporal coding, population coding, and direct input coding [23]. In rate coding, a value is represented as a spike train whose firing rate is proportional to the value [27]. To generate a spike train with a specific firing rate, we can distribute as many spikes as (the firing rate) × (the latency) uniformly over the time span of the latency, with some random perturbation of the spike occurrence times. The latency indicates the number of time steps during which a spike train is presented. To generate a spike train, we can also use a Poisson distribution whose mean corresponds to the firing rate [28]. In addition, we can use a stochastic encoding method which normalizes input values into the interval [0, 1] and uses the normalized values as the probability of generating a spike at each time step.
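As an illustration of these rate coding variants, the following sketch generates spike trains over a latency of T time steps; the helper functions and the interpretation of the input value as a fraction of a maximum rate are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def rate_encode_uniform(value, T, max_rate=1.0, rng=np.random.default_rng()):
    """Spread roughly value*max_rate*T spikes uniformly over T steps with random jitter."""
    n_spikes = int(round(value * max_rate * T))
    times = np.linspace(0, T - 1, num=n_spikes, dtype=int)
    times = np.clip(times + rng.integers(-1, 2, size=times.shape), 0, T - 1)
    train = np.zeros(T, dtype=np.uint8)
    train[times] = 1
    return train

def rate_encode_stochastic(value, T, rng=np.random.default_rng()):
    """Stochastic (Bernoulli) encoding: value in [0, 1] is the per-step firing probability."""
    return (rng.random(T) < value).astype(np.uint8)
```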
The temporal coding methods generate a single spike at a specific time point to represent an input value. Among the temporal coding methods, there are the time-to-first-spike code, the rank code, and so on [23]. The time-to-first-spike code (a.k.a. latency code) represents a value with a single spike such that the larger the input value, the earlier the onset of the induced spike. In the rank code (a.k.a. spike-order code), all the values are first sorted in decreasing order, and then each value is assigned a spike time in such a way that the order of the assigned spike times follows the order of the values. The population code makes a population (i.e., a set) of input neurons represent an input value together [29]. Each neuron has its own receptive field whose sensitivity is roughly a Gaussian function, as shown in Figure 6. Once an input value is given, each input neuron generates a spike at the time corresponding to its Gaussian-like sensitivity to the input.
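For illustration, a minimal sketch of the time-to-first-spike code (the linear mapping of values to firing times is an assumption; other monotone mappings are possible):

```python
import numpy as np

def latency_encode(values, T):
    """Larger values fire earlier: a value of 1.0 fires at step 0, 0.0 at the last step."""
    values = np.clip(np.asarray(values, dtype=float), 0.0, 1.0)
    fire_times = np.round((1.0 - values) * (T - 1)).astype(int)
    spikes = np.zeros((len(values), T), dtype=np.uint8)
    spikes[np.arange(len(values)), fire_times] = 1   # exactly one spike per input value
    return spikes
```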
In direct input coding [30], we use the input value as a constant current to its corresponding spiking neuron without generation of spikes. The input values are usually normalized into a specific interval like [0, 1].
When the neurons of the output layer reach their final membrane potentials, they produce spike(s) as the output. To interpret the output spike(s), decoding methods such as spike counting, rate-based decoding, and temporal decoding are used. For the SNNs of regression tasks, the spike counting method regards the count of spikes at the output nodes as the estimated value. For the SNNs of classification tasks, the rate-based decoding method selects the node label with the maximum frequency of spikes as the output. For SNNs whose output neurons generate at most one spike, the node label with the earliest firing time is selected as the class label. Some SNNs use as the output value the membrane potentials of the output neurons without generating spikes [31,32]. They directly apply the softmax function to the membrane potential values in order to obtain the class probabilities.
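A small sketch of the decoding side for a classification SNN (illustrative helpers; which one applies depends on the output scheme used by the model):

```python
import numpy as np

def decode_rate(output_spikes):
    """output_spikes: shape (T, n_classes); the class that spiked most often wins."""
    return int(np.argmax(output_spikes.sum(axis=0)))

def decode_membrane_softmax(membrane_potentials):
    """Final membrane potentials of the output neurons; softmax gives class probabilities."""
    z = membrane_potentials - np.max(membrane_potentials)   # numerical stability
    p = np.exp(z) / np.sum(np.exp(z))
    return p
```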
When a spike train is fed into an LIF neuron, the membrane potential of the neuron is expressed by a differential equation in which the input current is the weighted sum of the incoming spike trains, where each incoming spike causes an instantaneous change of the membrane potential proportional to its synaptic weight.
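With standard LIF notation (assumed here), this takes the form:

```latex
\tau_m \frac{dV_m(t)}{dt} = -\bigl(V_m(t) - V_{rest}\bigr)
 + R \sum_j w_j \sum_{f} \delta\bigl(t - t_j^{(f)}\bigr),
```

where $t_j^{(f)}$ are the spike times of presynaptic neuron $j$ and $w_j$ is the corresponding synaptic weight.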
SNNs can be executed on a neuromorphic hardware which is specialized for SNN model execution, or simulated in software. Only a few neuromorphic hardware are available on the market. Most of them support a limited architecture of SNNs. Hence, in the training and testing phases, software simulations are widely used. The simulation methods can be categorized into synchronous simulation and asynchronous simulation.
In synchronous simulations, also known as clock-driven simulations, all neurons are updated simultaneously at every tick of a clock. In asynchronous simulations, also known as event-driven simulations, neurons are updated only when they receive or produce a spike [33]. In the simulations, spikes are represented as 0 and 1, and neurons have a variable to keep the value of their membrane potential which is updated by the differential or difference equations corresponding to the adopted neuron model.
The differential equations can be rather simple, as in the LIF and IF models, or considerably more complex, as in the Hodgkin-Huxley model; in either case they are discretized into update rules applied at each simulation time step.
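The following sketch shows one clock-driven update of a fully connected layer of LIF neurons, using the discretized form implied above; all names and constants are illustrative rather than taken from the paper.

```python
import numpy as np

def lif_layer_step(v, in_spikes, W, v_rest=0.0, v_th=1.0, v_reset=0.0,
                   tau_m=20.0, dt=1.0):
    """One synchronous (clock-driven) update of a LIF layer.

    v          : membrane potentials of the layer, shape (n_out,)
    in_spikes  : 0/1 spikes from the previous layer at this tick, shape (n_in,)
    W          : synaptic weights, shape (n_out, n_in)
    Returns the updated potentials and the 0/1 output spikes of the layer.
    """
    leak = -(v - v_rest) * (dt / tau_m)        # exponential leak toward the resting potential
    v = v + leak + W @ in_spikes               # integrate the weighted incoming spikes
    out_spikes = (v >= v_th).astype(np.uint8)  # fire where the threshold is reached
    v = np.where(out_spikes == 1, v_reset, v)  # hard reset of the fired neurons
    return v, out_spikes
```

Running the network over a latency of T ticks then amounts to calling such a step for every layer at every tick, feeding each layer the spikes emitted by the preceding layer.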
When an SNN processes its input, the input is fed into the SNN over a specified latency. A spike pattern for the input is produced by the adopted neural encoding scheme. In synchronous simulations, it is usually assumed that at each time step (i.e., each tick of the clock) an input signal passes through the entire SNN from the input neurons to the output neurons. Machine learning applications usually use synchronous simulations for training SNN models, whereas neuroscience studies mainly use asynchronous simulations for investigating brain functions. The input to an SNN is given as a spike or spike train generated by the neural encoding scheme. Such a spike or spike train is generated either in advance or on the fly, depending on the adopted encoding scheme. When a uniform or Poisson distribution is used to generate a spike train, the entire spike train needs to be generated in advance before feeding the input to the SNN. When a stochastic encoding method is used, the spikes of a spike train can be generated on the fly and fed into the input of the SNN.
When an SNN model is organized for a simulation, the ensemble and connection paradigm can be used. The ensemble component is used to represent a group of neurons that operate together. The connection component is used to connect an ensemble to another or to the same ensemble. When a connection is made from an ensemble to itself, the ensemble becomes a recurrent network. When ensembles are arranged in a chain with connections, a multi-layered SNN is formed.
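In a Nengo-style simulator, for example, the ensemble and connection paradigm looks roughly like the following minimal sketch (assuming the Nengo API; the neuron counts and input value are arbitrary):

```python
import nengo

with nengo.Network() as model:
    stim = nengo.Node([0.5])                           # constant input signal
    a = nengo.Ensemble(n_neurons=100, dimensions=1)    # an ensemble of spiking neurons
    b = nengo.Ensemble(n_neurons=100, dimensions=1)
    nengo.Connection(stim, a)
    nengo.Connection(a, b)     # feedforward connection between two ensembles
    nengo.Connection(b, b)     # a connection from an ensemble to itself makes it recurrent

with nengo.Simulator(model) as sim:
    sim.run(1.0)               # simulate one second of network activity
```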
For the simulations of SNNs, there are various hyperparameters to control the behaviors of spiking neurons as follows: resting potential, minimum potential, threshold level, spike potential, refractory period, membrane time constant, latency, and so on. Simpler models for machine learning applications have a few hyperparameters, while complicated models for neuroscience studies have more hyperparameters. As the markup languages for exchanging the SNN models across the platforms, there are NeuroML, ONNX, NNEF, and so on. As the simulator-independent languages for designing and simulating SNNs, there are PyNN, EDLUT, and Nengo. There are also domain-specific languages such as OptiML and Corelet.
The learning methods for SNNs can be roughly categorized into unsupervised learning and supervised learning.
The typical unsupervised learning method for an SNN is a local training method such as the STDP method. As shown in Figure 5, the STDP algorithm adjusts synaptic weights in such a way that synaptic plasticity is controlled by the timing difference between the spike times of the presynaptic and postsynaptic neurons. There are several variants of STDP algorithms. The vanilla STDP method uses an update quantity Δw that decays exponentially with the magnitude of the time difference between the presynaptic and postsynaptic spikes: the weight is potentiated when the presynaptic spike precedes the postsynaptic spike and depressed otherwise, where the amplitudes and time constants of potentiation and depression are separate hyperparameters.
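With the spike-time difference written as $\Delta t = t_{post} - t_{pre}$ (symbols assumed), the vanilla STDP update is usually given as:

```latex
\Delta w =
\begin{cases}
 A_{+}\, e^{-\Delta t/\tau_{+}}, & \Delta t > 0 \quad (\text{LTP}),\\[2pt]
 -A_{-}\, e^{\Delta t/\tau_{-}}, & \Delta t < 0 \quad (\text{LTD}),
\end{cases}
```

where $A_{+}$ and $A_{-}$ are learning-rate amplitudes and $\tau_{+}$ and $\tau_{-}$ are the time constants of the potentiation and depression windows.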
There are several modifications of the vanilla STDP method. One modified STDP rule does not use the exponential function of the spike-time difference; instead, the weight change depends only on the sign of the timing difference (and possibly on the current weight value), where fixed potentiation and depression amounts are applied.
There are other variants of the STDP method as well [34]. In these variants, the update quantities are typically defined in terms of presynaptic and postsynaptic trace variables, where each trace decays exponentially between spikes and is increased whenever the corresponding neuron fires, and the weight change at a spike is proportional to the value of the opposite trace, as sketched below.
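A common online implementation of such trace-based, pair-wise STDP keeps one decaying trace per neuron and applies the update at spike times; the following is a minimal sketch with illustrative names and constants, not the exact rule of [34].

```python
import numpy as np

def stdp_step(W, pre_spikes, post_spikes, x_pre, x_post,
              a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0, dt=1.0):
    """One time step of pair-based STDP with exponentially decaying traces.

    W           : weights, shape (n_post, n_pre)
    pre_spikes  : 0/1 presynaptic spikes, shape (n_pre,)
    post_spikes : 0/1 postsynaptic spikes, shape (n_post,)
    x_pre/x_post: presynaptic/postsynaptic traces
    Returns the updated weights and traces.
    """
    x_pre = x_pre * np.exp(-dt / tau_plus) + pre_spikes      # decay, then bump at spikes
    x_post = x_post * np.exp(-dt / tau_minus) + post_spikes
    # LTP: a postsynaptic spike potentiates synapses with a recent presynaptic trace.
    W = W + a_plus * np.outer(post_spikes, x_pre)
    # LTD: a presynaptic spike depresses synapses with a recent postsynaptic trace.
    W = W - a_minus * np.outer(x_post, pre_spikes)
    return W, x_pre, x_post
```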
The Bienenstock-Cooper-Munro (BCM) rule is an unsupervised learning method in which weights are modified depending on the rates of the presynaptic and postsynaptic spikes [35]. In this rule, the weight change is proportional to the presynaptic activity and to the postsynaptic activity relative to a sliding threshold, where the threshold itself adapts to the recent average of the postsynaptic activity, so that the same synapse can be potentiated or depressed depending on how active the postsynaptic neuron has recently been.
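A standard form of the BCM rule with a sliding modification threshold (notation assumed) is:

```latex
\frac{dw}{dt} = \eta\, x\, y\,\bigl(y - \theta_M\bigr),
\qquad
\theta_M \approx \left\langle y^{2} \right\rangle,
```

where $x$ and $y$ are the presynaptic and postsynaptic firing rates, $\eta$ is a learning rate, and $\theta_M$ is the sliding threshold that tracks the recent average of the squared postsynaptic activity.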
Both STDP and BCM rules are biologically-plausible unsupervised training algorithms and relatively easy to implement, yet usually not easy to apply to train high accuracy models having multiple layers.
Diehl and Cook [34] applied the STDP rule to train a handwritten digit recognition SNN which consists of an excitatory neuron layer and an inhibitory neuron layer. Each excitatory neuron receives the spikes for all the pixels as input, which are encoded using the Poisson distribution-based method. Each excitatory neuron has a connection to only one inhibitory neuron, while each inhibitory neuron is connected to all excitatory neurons except the one from which it receives its incoming connection. All synaptic weights from input neurons to excitatory neurons are learned with an STDP rule.
Kheradpisheh et al. [36] used the STDP rule to train a spiking deep convolutional neural network for object recognition. Their network has a multi-layered architecture in which convolution layers and pooling layers are interleaved, and the feature vector of the last pooling layer is used as the input to a support vector machine (SVM) classifier. For an input image, there is a temporal coding cell for each pixel location. The temporal coding cells first apply the difference of Gaussians (DoG) filter to the input image, and then convert each computed contrast into a spike according to the rank-order encoding method. The learning of the convolutional layers is carried out layer by layer using the STDP rule. The last layer is a global max pooling layer applied to each channel of its preceding convolutional layer. The results of the global max pooling are used as the input to a linear SVM classifier.
The supervised learning methods can be categorized into direct training, ANN-SNN conversion, and hybrid training methods. In the direct training approach, the training methods use a differentiable surrogate function in place of the discrete activation function during the training phase and apply a gradient-based optimization technique with the surrogate function. In the ANN-SNN conversion approach, a conventional ANN is first trained on the given training data, and its weights are then used to set the weights of an SNN of the same architecture. In the hybrid approach, the ANN-SNN approach is first used to initialize the weights of an SNN, and the SNN is then fine-tuned with a direct training method.
The direct training methods make use of inherent characteristics of spiking neurons such as spike timing. Table 1 summarizes some direct training methods in terms of their neuron type, architecture, input encoding method, output decoding method, and unique features. In the early days of SNN studies, direct methods that try to mimic biological behaviors attracted attention. Later algorithms have paid more attention to applying conventional neural network techniques, such as gradient-based optimization, to SNNs.
SpikeProp [37] is a training algorithm for a shallow SNN in which the input is encoded using the population code, the output is a single spike coded by the time-to-first code, the neurons are of the SRM type, and each connection is made of multiple synaptic paths with different fixed delays and trainable weights, as shown in Figure 9. Because the output is given as a single spike, the objective of training is to make the actual spike time as close as possible to the desired spike time. Hence the loss function is defined as the squared difference between the actual and desired firing times of the output neurons.
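This loss is commonly written as:

```latex
E = \frac{1}{2} \sum_{j \in \text{outputs}} \bigl(t_j^{a} - t_j^{d}\bigr)^{2},
```

where $t_j^{a}$ and $t_j^{d}$ are the actual and desired firing times of output neuron $j$.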
A remote supervised method (ReSuMe) [38] is a training algorithm for an SNN which consists of a front subnetwork and a following output layer, where the front subnetwork can be a feedforward, recurrent, or hybrid network such as the LSM shown in Figure 7. In ReSuMe, the SNN receives and generates spike trains, and, interestingly, there are teacher neurons, each of which provides the desired spike timings for its corresponding output neuron. Only the weights to the output neurons are trained, and the teacher neurons are not connected into the SNN although they provide the supervision signal, as shown in Figure 10. ReSuMe adjusts the weights so as to make the spike train generated by the SNN similar to the spike train presented by the teacher neurons. For a connection weight to an output neuron, the update combines an STDP-like term driven by the teacher (desired) spike train and an anti-STDP-like term driven by the actual output spike train, where both terms are gated by the presynaptic input spike train.
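A commonly cited form of the ReSuMe rule for the weight from input neuron $i$ to output neuron $o$ (symbols assumed here) is:

```latex
\frac{d}{dt} w_{oi}(t) =
\bigl[S^{d}(t) - S^{o}(t)\bigr]
\left[ a + \int_0^{\infty} W(s)\, S^{i}(t - s)\, ds \right],
```

where $S^{d}$, $S^{o}$, and $S^{i}$ are the desired (teacher), actual output, and input spike trains, $a$ is a non-Hebbian constant, and $W(s)$ is an STDP-like learning window.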
The neural engineering framework (NEF) is a general methodology for building large-scale, biologically plausible, neural models of cognition [39]. It represents a vector by the activities of a population of spiking neurons: each neuron has an encoder (a preferred-direction vector) and a nonlinear response function, and the represented vector is recovered by a linear combination of the neuron activities, where the decoding weights are obtained by least-squares optimization over the neurons' tuning curves.
Two populations of neurons can be connected to perform a linear or nonlinear transformation of the vector represented by the preceding population into the vector represented by the following population. When such a connection is made, the connection weight between a presynaptic and a postsynaptic neuron can be factored into the decoder of the presynaptic neuron, the desired transform, and the encoder of the postsynaptic neuron, so that the full weight matrix never needs to be stored explicitly.
Under the assumption that the encoders and neuron response functions are fixed, learning in the NEF reduces to adapting the decoders of a connection. The prescribed error sensitivity (PES) rule [40] adjusts each decoder in proportion to the product of the corresponding neuron's activity and an error signal, defined as the difference between the desired and the decoded output. When the decoded output matches the desired value, the error vanishes and the decoders stop changing.
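In the standard NEF formulation (notation assumed), a population encodes a vector $\mathbf{x}$ through its tuning curves, decodes it linearly, and the PES rule adapts the decoders with an error signal:

```latex
a_i = G_i\bigl[\alpha_i \langle \mathbf{e}_i, \mathbf{x} \rangle + J_i^{bias}\bigr],
\qquad
\hat{\mathbf{x}} = \sum_i a_i\, \mathbf{d}_i,
\qquad
\Delta \mathbf{d}_i = \kappa\, a_i\, \mathbf{E},
```

where $\mathbf{e}_i$ is the encoder of neuron $i$, $G_i$ its nonlinear response function, $\alpha_i$ a gain, $\mathbf{d}_i$ its decoder, $\kappa$ a learning rate, and $\mathbf{E}$ the error between the desired and decoded values; the full connection weight between neuron $i$ and a downstream neuron $j$ can then be recovered as $\omega_{ij} = \alpha_j \langle \mathbf{e}_j, \mathbf{d}_i \rangle$.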
Backpropagation spike-timing-dependent plasticity (BP-STDP) [32] is an STDP-based training algorithm for multilayer SNNs which may contain convolutional layers, as shown in Figure 11. For these SNNs, input vectors are encoded into spike trains in which the numbers of spikes (i.e., spike counts) correspond to the scalar values of the input vector. The output of the SNN is a spike train whose spike count is the output value. The loss function for BP-STDP is the mean squared error between the output spike counts of the SNN and the desired output values. BP-STDP divides the simulation time into short intervals of fixed duration and applies a weight update in each interval.
BP-STDP uses a weight change Δw that mimics the backpropagation update while relying only on locally available spike activity. For an output neuron, the error term of an interval is the difference between the desired and actual spike counts in that interval; for a hidden neuron, the error term is obtained by backpropagating the error terms of the following layer through the connection weights. In both cases, the weight change is the product of the learning rate, the error term of the postsynaptic neuron, and the presynaptic spike activity in the interval, so that, as in STDP, only synapses whose presynaptic neurons have recently fired are modified.
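Schematically, and under the assumptions stated above (the exact formulation in [32] differs in details such as how hidden-layer activity gates the update), the BP-STDP rule for each interval can be written as:

```latex
\Delta w_{ij} = \mu\, \xi_j \sum_{t \in \text{interval}} s_i(t),
\qquad
\xi_j =
\begin{cases}
 r_j^{d} - r_j, & j \ \text{an output neuron},\\[2pt]
 \sum_k w_{jk}\, \xi_k, & j \ \text{a hidden neuron},
\end{cases}
```

where $s_i(t)$ is the presynaptic spike indicator, $r_j$ and $r_j^{d}$ are the actual and desired spike counts, and $\mu$ is the learning rate; the presynaptic-spike factor is what makes the update STDP-like.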
Spatio-temporal backpropagation (STBP) [42] is a direct training algorithm for a shallow fully connected or convolutional SNN with LIF neurons, which receives and generates spike trains and regards the firing rate of the output spikes at the output layer as the inferred output value. It pays attention to both the spatial and temporal domains in the execution of an SNN. In the spatial domain, an SNN processes its incoming spike signals from the preceding layer in a layer-by-layer manner. In the temporal domain, an SNN repeatedly updates the states of its neurons during the execution latency. This temporal aspect is closely related to the execution of recurrent neural networks (RNNs). STBP is a gradient-based training rule derived in a manner similar to backpropagation through time (BPTT) for RNNs. It uses surrogate gradients for the non-differentiable spike activation function.
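Surrogate-gradient training of this kind is typically implemented with a custom autograd function that applies the hard threshold in the forward pass and a smooth approximation in the backward pass. The following is a minimal PyTorch-style sketch with a rectangular surrogate; it is illustrative, not the exact STBP formulation.

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass, rectangular surrogate gradient in the backward pass."""

    @staticmethod
    def forward(ctx, membrane_potential, threshold, width):
        ctx.save_for_backward(membrane_potential)
        ctx.threshold, ctx.width = threshold, width
        return (membrane_potential >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        (membrane_potential,) = ctx.saved_tensors
        # Gradient is 1/(2*width) inside a band of +-width around the threshold, 0 elsewhere.
        surrogate = (torch.abs(membrane_potential - ctx.threshold) < ctx.width).float() / (2 * ctx.width)
        return grad_output * surrogate, None, None

spike_fn = SurrogateSpike.apply   # usage: spikes = spike_fn(v, 1.0, 0.5)
```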
Spike-based backpropagation (SBBP) [43] is a training algorithm for an LIF-based SNN model which has no bias in its neurons, receives spike trains generated by the Poisson distribution-based rate coding method, and uses the average membrane potentials of the output neurons as the output values. Such SNNs consist of front-end convolutional layers with average pooling and back-end fully connected layers. Each convolution operation generates a spike only when the computed membrane potential is greater than or equal to the specified threshold level. Only when an average pooling result is greater than or equal to the specified threshold level is a spike generated as the pooling value. Because the output of the SNN is a scalar value (i.e., an average membrane potential), the loss function is defined as the mean squared error between the output values and the target values, where the gradients are backpropagated through the layers using an approximate (surrogate) derivative of the spike generation function of the LIF neurons.
The ANN-SNN conversion methods first train an ANN model and then construct an SNN model of the same architecture as the trained ANN model, whose weights are initialized with the weights of the ANN model. Table 2 summarizes the characteristics of some ANN-SNN conversion methods.
Hunsberger and Eliasmith's method [44] first trains an ANN which may contain convolutional layers and average pooling, and whose neurons have no bias terms. The ANN uses the so-called soft-LIF activation function instead of ReLU. The soft-LIF is a firing-rate function of the input current similar to that of the LIF neuron, as shown in Figure 13, but the soft-LIF firing-rate function is differentiable everywhere while the LIF firing-rate function is not. The soft-LIF function is obtained by smoothing the LIF rate curve around the firing threshold, where a smoothing parameter controls how closely it approximates the original LIF rate function.
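Concretely, the idea is to take the steady-state LIF firing-rate curve and replace its hard threshold term with a softplus. With threshold $V_{th}$ and smoothing parameter $\gamma$ (notation assumed; the paper's exact parameterization may differ slightly):

```latex
r(j) = \Bigl[\tau_{ref} + \tau_{m} \log\Bigl(1 + \frac{V_{th}}{\rho(j - V_{th})}\Bigr)\Bigr]^{-1},
\qquad
\rho(x) = \gamma \log\bigl(1 + e^{x/\gamma}\bigr),
```

where $j$ is the input current; as $\gamma \to 0$, $\rho(x) \to \max(x, 0)$ and the soft-LIF rate approaches the LIF rate.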
Cao et al.'s method [45] first trains a tailored CNN model in which all layers produce positive values, all neurons of the convolutional layers and fully connected layers have no biases, and average pooling is used instead of max pooling. For the tailored CNN model, an SNN with the same architecture is organized, which uses IF neurons with no bias terms, receives and generates spike trains, and uses average pooling, if any. Once the CNN model is trained, its weights are used to initialize the weights of the organized SNN.
Diehl et al.'s method [46] is an improvement of Cao et al.'s method [45], which first trains an ANN model and then normalizes its weights before deploying them into an SNN model of the same architecture. When the weights of a trained ANN model are used directly as those of the corresponding SNN model, the neurons of the SNN model may accumulate insufficient membrane potential to reach the threshold level. In addition, some membrane potentials may become too large to be expressed by just a single spike. To handle these issues, weight normalization techniques have been developed, among them the model-based normalization and the data-based normalization [46]. In the model-based method, weights are normalized layer-wise by dividing them by the maximum positive weight of the layer. In the data-based method, the scaling factor of each neuron is chosen as the maximum activation of that neuron over the training data in the trained ANN model; the weights into the neuron are then divided by the scaling factor, or alternatively the scaling factor is used as the threshold level of the corresponding neuron of the SNN model. Experiments have shown that such weight-normalized SNN models give better performance than baseline models with no weight normalization.
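A minimal sketch of layer-wise data-based normalization in the spirit of [46] (the helper and the layer-wise, rather than per-neuron, scaling are assumptions for illustration):

```python
import numpy as np

def data_based_normalize(weights, activations):
    """Rescale layer weights by the maximum activation observed in the trained ANN.

    weights     : list of weight matrices, one per layer, in input-to-output order
    activations : list of activation arrays recorded on the training data, one per layer
    Returns rescaled weights so that one spike per time step suffices to carry each activation.
    """
    normalized = []
    prev_scale = 1.0
    for W, A in zip(weights, activations):
        scale = max(float(np.max(A)), 1e-9)        # scaling factor = max activation of the layer
        normalized.append(W * prev_scale / scale)  # undo the previous layer's scaling, apply this one
        prev_scale = scale
    return normalized
```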
In [30], the authors proposed an algorithm that first trains an ANN having some architectural restrictions and then converts it into an SNN model whose neurons are IF neurons with bias terms. The work established a theoretical foundation for the relationship between ANN activations and the firing rates of spiking neurons. It supports the following two reset modes for the membrane potential at spike generation: the reset-to-zero mode and the reset-by-subtraction mode. The converted SNN model allows its neurons to have bias terms, uses the input as an input current to the neurons of the first layer, supports max pooling by means of a gating function, and generates spikes according to the softmax probability at the output layer. When weights are transferred from the trained ANN model to the SNN, a slightly modified data-based weight normalization method is used.
The Whetstone method [47] is a process to train binary, threshold-activation SNNs using existing deep learning methods. It first trains an ANN until the performance no longer improves. Then it progressively sharpens the activation function toward a step activation at each layer, one layer at a time, beginning from the input layer, while monitoring performance. The sharpening process is automated with an adaptive sharpening schedule. As the activation function, it uses the bounded rectified linear unit (bReLU), which is linear within a narrow band around the threshold and saturates at 0 and 1 outside the band. As the band is narrowed during sharpening, the bReLU approaches the step function, so that at the end of training the network operates with binary (spiking) activations.
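One way to parameterize such a bounded ReLU with a sharpness parameter $a$ (this particular parameterization is illustrative; Whetstone's exact definition may differ) is:

```latex
h_a(x) = \min\Bigl(1, \max\Bigl(0, \frac{x - \vartheta}{a} + \frac{1}{2}\Bigr)\Bigr),
```

which is linear in a band of width $a$ around the threshold $\vartheta$ and converges to the step function $\mathbf{1}[x \ge \vartheta]$ as $a \to 0$.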
Sengupta et al.'s method [48] is an ANN-SNN conversion method whose SNN models have IF neurons with no bias terms and may include average pooling and identity skip connections for deeper networks. The method first trains an ANN, next initializes an SNN of the same architecture with the trained weights, and then performs threshold balancing to adjust the threshold levels of the spiking neurons. For threshold balancing, it uses Spike-Norm [48], which can be regarded as an improvement of Diehl et al.'s normalization method [46]. Spike-Norm chooses as the scaling factor of each layer the maximum membrane potential observed for the training data in the converted SNN, and then uses the scaling factor as the threshold value of the spiking neurons of that layer.
The RMP-SNN method [49] is an ANN-SNN conversion method in which the SNN model uses IF neurons with soft-reset [57]. Hard-reset refers to a mechanism that resets the membrane potential of a spiking neuron to a pre-specified low potential just after a spike is generated when the membrane potential reaches the threshold level. On the other hand, soft-reset (a.k.a. reset-by-subtraction) is a mechanism that keeps the residual potential above the firing threshold just after a spike is generated.
The trained ANN uses the ReLU activation function, and its weights are transferred to an SNN of the same architecture. A neuron with the ReLU function produces an output proportional to its weighted input sum, but a spiking neuron with hard-reset usually does not produce spikes at a rate proportional to its membrane potential. It is assumed that the conversion loss of ANN-SNN conversion is caused by this nonlinearity between the membrane potential and the spiking rate of an IF neuron with hard-reset. In the RMP-SNN method, the SNN consists of IF neurons with soft-reset, which receive and generate spike trains. To guarantee an approximately linear relationship between the accumulated membrane potential and the firing rate, the threshold level of each layer is kept within an appropriate range determined from the weighted input sums of that layer, where the soft-reset preserves the residual potential so that no input charge is lost at spike generation.
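The difference between the two reset modes amounts to a one-line change in the IF update, as in the following illustrative sketch:

```python
import numpy as np

def if_step(v, input_current, v_th=1.0, soft_reset=True):
    """One update step of an IF neuron layer with hard or soft (reset-by-subtraction) reset."""
    v = v + input_current                       # integrate the weighted input
    spikes = (v >= v_th).astype(np.uint8)
    if soft_reset:
        v = v - spikes * v_th                   # subtract the threshold, keep the residual potential
    else:
        v = np.where(spikes == 1, 0.0, v)       # hard reset to zero, residual charge is lost
    return v, spikes
```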
Deng and Gu [50] paid attention to the conversion loss from an ANN to its corresponding SNN in terms of the activation function and the reset operation after spike generation. To begin with, they adopt IF neurons with soft-reset [57] to reduce the loss. In addition, they observed that training the ANN with a threshold ReLU (a ReLU clipped at the firing threshold) and shifting the bias of each layer by a small amount reduces the gap between the ANN activations and the SNN firing rates, where the shift is chosen so as to minimize the expected conversion error.
Ding et al. [51] introduced a weight normalization method called rate normalization. The method adjusts the threshold level of each layer with an optimal scaling factor obtained by an optimization that matches the firing rates of the SNN to the activations of the ANN, where a rate norm layer that clips and scales the activation replaces the ReLU during ANN training.
Patel et al. [52] applied an ANN-SNN conversion method to develop an SNN-based U-Net [53] for 2D image segmentation. When they train the U-Net, they use a modified ReLU function with a parameter Δ, as shown in Figure 16.
Once the U-Net is trained, it is converted into a multi-layered SNN. They use a percentile-based loss function which regularizes the maximum firing rate of each neuron across all examples in a batch to lie between a minimum and a maximum value. Once an SNN is obtained, they apply a quantization method and a partitioning method to the SNN so as to deploy it onto the Loihi chip [60], which is a neuromorphic chip.
The hybrid training methods first initialize weights of an SNN model with those trained by the ANN-SNN conversion method, and then fine-tune the SNN model with a direct learning method. Table 3 summarizes the characteristics of some hybrid methods.
Rathi et al.'s method [54] first uses an ANN-SNN conversion method similar to Diehl et al.'s method [46] to obtain an SNN which uses LIF neurons with no biases, uses average pooling for the pooling operation, receives spike trains generated by the Poisson rate coding method, has output neurons with no leakage and no spike generation, and applies the softmax function to the accumulated membrane potentials of the output neurons so as to get classification probabilities. For fine-tuning the SNN, it uses the cross-entropy between the target labels and these softmax probabilities as the loss function and backpropagates the error through time with a spike-time-dependent surrogate gradient, where the surrogate gradient of the spike generation function decays with the time elapsed since the neuron's last spike (spike-timing-dependent backpropagation, STDB).
The direct input encoding with leakage and threshold optimization in deep spiking neural networks (DIET-SNN) algorithm [55] first trains an ANN, next converts the trained ANN to an SNN, and then fine-tunes the SNN using a surrogate gradient. The ANN uses the ReLU activation function with no bias terms for neurons, does not apply batch normalization, and uses average pooling, if needed. In the ANN-SNN conversion, the converted SNN consists of IF neurons, the input is fed directly into the neurons of the SNN without any encoding, and a threshold balancing method is applied. In the fine-tuning phase, the SNN consists of LIF neurons, receives the input as an input current directly to the neurons of the input layer, and produces probabilities computed by applying the softmax function to the accumulated membrane potentials of the output layer. As the loss function, the cross-entropy is used. On computing the gradient of the loss function with respect to the weights, a surrogate of the spike generation function that is linear around the threshold is used, where the membrane leak and the threshold level of the LIF neurons are treated as trainable parameters and optimized by gradient descent together with the weights.
Takuya et al.'s method [58] uses a knowledge distillation technique which effectively learns a small student model from a large trained teacher model. As shown in Figure 17, the method first trains a large ANN which generates class probabilities, and next trains a small ANN from the trained large ANN using a knowledge distillation technique whose loss function combines the ordinary cross-entropy with the hard labels and a distillation term that matches the softened class probabilities of the student to those of the teacher, where a temperature parameter controls the softness of the probabilities and a weighting coefficient balances the two terms. The small ANN is then converted into an SNN and fine-tuned with a direct training method.
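The standard knowledge-distillation loss that this description corresponds to (the exact weighting used in [58] may differ) is:

```latex
\mathcal{L} = (1 - \lambda)\, \mathcal{L}_{CE}\bigl(\mathbf{y}, \mathbf{p}_{S}\bigr)
 + \lambda\, T^{2}\, \mathrm{KL}\bigl(\mathbf{q}_{T}^{teacher} \,\|\, \mathbf{q}_{T}^{student}\bigr),
```

where $\mathbf{y}$ are the hard labels, $\mathbf{p}_{S}$ are the student's ordinary softmax probabilities, $\mathbf{q}_{T}$ are the class probabilities computed with a softmax at temperature $T$, and $\lambda$ balances the two terms.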
The SNN has the same architecture as the small ANN. In the SNN, each input neuron directly receives the input value as an input current, and the output neurons do not generate spikes but produce the class probabilities by applying the softmax function to their membrane potentials.
Neuromorphic hardware is specialized hardware that simulates SNNs very fast and efficiently. Various neuromorphic hardware platforms have been developed as processor clusters, FPGAs, or chips. Some neuromorphic hardware supports only specific hardwired SNN architectures, while others allow the architecture of the SNN to be configured.
Some neuromorphic hardware such as SpiNNaker, BrainScaleS, and Neurogrid has been developed for neuroscience simulations to study the brain [59]. SpiNNaker uses a network of ARM processors tightly coupled to local memory as its building blocks; it is housed in 10 racks, with each rack holding over 100,000 cores. It supports several spiking neuron models, including the LIF and Izhikevich models, and provides software tools for learning SNNs. BrainScaleS is constructed with several interconnected wafers, each wafer consisting of 384 cores, 200K neurons, and 45M synapses. It is used to simulate brain-scale neural networks. Neurogrid consists of 16 Neurocores, each of which has 65,536 neurons, and allows one million neurons and six billion synapses to be simulated in real time.
A few neuromorphic chips such as TrueNorth and Loihi have been developed, which target low-power, large-scale SNN execution. TrueNorth is a chip which consists of 4,096 cores with 1 million neurons and 256 million synapses. Its neuron is a modified LIF neuron. Loihi is a chip with a many-core mesh comprising 128 neuromorphic cores, 3 embedded x86 processor cores, and off-chip communication interfaces that hierarchically extend the mesh in 4 planar directions to other chips, up to 16,384 chips. It supports the LIF neuron model, which can be used as an IF neuron when the leakage is set to zero. Neither TrueNorth nor Loihi was being sold for general research and development at the time of writing [60].
Several SNN FPGA boards such as the PYNQ-Z1 and DE1-SoC are on the market. Such boards support SNNs with a fixed number of layers, typically 1 to 4, and limited types of neuron models. Some of them support online learning such as PES learning or STDP learning. There are also some analog-digital chips designed to support a fixed SNN architecture [61]. In neuromorphic chips, the SNN models are usually trained offline and the trained models are later downloaded onto the chips. Due to the complexity of the circuits, online learning algorithms other than STDP are usually not supported in neuromorphic chips.
In the semiconductor sector, neuron models and synaptic connections have been designed and experimentally fabricated using CMOS and memristor technologies [62]. They have so far shown only the possibility of energy-efficient neuromorphic chips. At the time of writing, there is not yet widely accessible neuromorphic hardware on which SNNs can be deployed and executed.
Spiking neural networks have attracted attention for their energy-efficient operation and biological plausibility. Neuroscientists are interested in simulating brain-scale SNNs to study brain functions. In machine learning, DNNs have achieved remarkable success in formerly difficult tasks like vision, speech, and natural language processing. Machine learning researchers have worked on how to develop SNNs that are as good as DNNs. The performance of SNNs is approaching that of DNNs, but is not yet sufficient to replace them.
This paper has addressed SNNs mainly from the perspective of machine learning. Various SNN learning algorithms have been developed and are still being developed. The direct learning algorithms still seem to have difficulty training deep SNNs. The ANN-SNN conversion algorithms currently seem to be the most practical way to build deep SNNs, and more effort will be exerted to reduce the conversion loss from an ANN to its corresponding SNN. The hybrid learning methods will take further advantage of both the ANN-SNN conversion algorithms and the direct training algorithms as each of them advances.
An ANN takes just a single pass from the input layer to the output layer, whereas an SNN has to run for multiple time steps to produce a stable output. Hence, one research direction in SNNs is to reduce the latency of SNN execution.
Most SNN machine learning studies have been conducted in software simulation. The accessibility to neuromorphic hardware is yet limited. Once some low-cost neuromorphic hardware are available, SNN models are expected to be widely deployed in edge devices of the IoT environments due to their energy efficiency.
No potential conflict of interest relevant to this article was reported.
Figure. An SNN for BP-STDP training [32].
Figure. Surrogate gradient functions.
Table 1. Direct training algorithms.
Algorithm | Neuron model | Architecture | Input encoding | Output decoding | Features |
---|---|---|---|---|---|
SpikeProp (2000, [37]) | SRM | Shallow network | Population code | Time-to-first code | Surrogate gradient; multiple delayed synaptic terminals |
ReSuMe (2005, [38]) | don’t care | (FF, RNN, LSM)+ trainable single layer | Spike train | Spike train | Train the weights for the last layer; STDP & anti-STDP |
PES (2011, [40]) | IF/LIF model | Two-layered network | Spike train (firing rate) | Spike train (firing rate) | MSE loss for decoded value |
STBP (2018, [42]) | LIF | Shallow network | Spike train (rate code) | Spike train (firing rate) | BPTT-like over spatial & time domains |
BP-STDP (2019, [32]) | LIF | Deep network | Spike train (spike count) | Direct output (spike count) | Backpropagation + STDP |
SBBP (2019, [43]) | IF/LIF | Deep network | Spike train (rate code) | Direct output (membrane potential) | Surrogate gradient |
Table 2. ANN-SNN conversion algorithms.
Algorithm | Neuron model | Architecture | Input encoding | Output decoding | Features |
---|---|---|---|---|---|
soft-LIF (2015, [44]) | soft-LIF (ANN) | Deep network | Spike train (rate code) | Spike train (firing rate) | Use soft-LIF in ANN for LIF |
Cao et al. (2015, [45]) | ReLU (ANN) | Shallow network | Spike train (rate code) | Spike train (firing rate) | Constrained arch.; avg. pooling, no bias |
Diehl et al. (2015, [46]) | ReLU (ANN) | Shallow network | Spike train (rate code) | Spike train (firing rate) | Constrained arch.; weight normalization |
Rueckauer et al. (2017, [30]) | ReLU (ANN) | Deep network | Direct input | Spike train (firing rate) | Constrained arch.; batch norm.; softmax |
Whetstone (2018, [47]) | bReLU (ANN) | Deep network | Spike train (rate code) | Spike train (firing rate) | Adaptive sharpening of activation function |
Sengupta et al. (2019, [48]) | ReLU (ANN) | Deep network | Spike train (rate code) | Spike train (firing rate) | Normalization in SNN; Spike-Norm |
RMP-SNN (2020, [49]) | ReLU (ANN) | Deep network | Spike train (rate code) | Spike train (firing rate) | IF with soft-reset; control threshold range; threshold balancing |
Deng et al. (2021, [50]) | thr. ReLU (ANN) | Deep network | Spike train (rate code) | Spike train (firing rate) | Conversion loss-aware bias adaptation; threshold ReLU; shifted bias |
Ding et al. (2021, [51]) | RNL (ANN) | Deep network | Spike train (rate code) | Spike train (rate code) | Optimal scaling factors for threshold balancing |
Patel et al. (2021, [52]) | mod. ReLU (ANN) | Scaled-down U-Net | Spike train (rate code) | Spike train (rate code) | Image segmentation; Loihi deployment |
Table 3. Hybrid training algorithms.
Algorithm | Neuron model | Architecture | Input encoding | Output decoding | Features |
---|---|---|---|---|---|
Rathi et al. (2020, [54]) | ReLU (ANN) | Deep network | Spike train (rate coding) | Direct output (membrane potential) | ANN-SNN conv. + STDB; ST-based surrogate gradient |
DIET-SNN (2020, [55]) | ReLU (ANN) | Deep network | Direct input | Direct output | Trainable leakage and threshold in LIF |
Takuya et al. (2021, [58]) | ReLU (ANN) | Deep network | Direct input | Direct output (membrane potential) | Knowledge distillation for conv.; fine-tuning |
International Journal of Fuzzy Logic and Intelligent Systems 2021; 21(4): 317-337
Published online December 25, 2021 https://doi.org/10.5391/IJFIS.2021.21.4.317
Copyright © The Korean Institute of Intelligent Systems.
Chan Sik Han and Keon Myung Lee
Department of Computer Science, Chungbuk National University, Cheongju, Korea
Correspondence to:Keon Myung Lee (kmlee@cbnu.ac.kr)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Spiking neural networks (SNNs) have attracted attention as the third generation of neural networks for their promising characteristics of energy-efficiency and biological plausibility. The diversity of spiking neuron models and architectures have made various learning algorithms developed. This paper provides a gentle survey of SNNs to give an overview of what they are and how they are trained. It first presents how biological neurons works and how they are mathematically modelled specially in differential equations. Next it categorizes the learning algorithms of SNNs into groups and presents how their representative algorithms work. Then it briefly describe the neuromorphic hardware on which SNNs run.
Keywords: Spiking neural network, Deep learning, Neural network, Machine learning, Learning algorithms
Deep neural networks (DNNs) have made great successes in various formerly difficult tasks such as vision, speech, natural language, and games [1–5]. They have many layers and parameters and thus require huge computing resources, resulting in high energy consumption. Spiking neural networks (SNNs) are computation models which mimic biological neural networks in a more similar way than artificial neural networks (ANNs) [6]. In SNNs, neurons called spiking neurons receive spikes from other neurons and generates spikes. The SNNs can be implemented in an energy-efficient way since they are operated according to spikes sparsely generated. With this expectation, machine learning people have made efforts to develop SNNs with comparable performance to DNNs [7]. On the other hand, in neuroscience, researchers have been interested in simulating brain-scale SNNs to study brain functions. The diversity of neuron models for biological neurons, spike coding methods, and network architectures have made various learning algorithms developed for SNNs.
This survey focuses on the SNNs in the perspective of machine learning. In Section 2, for the better understanding of SNNs, we first explain the structure of biological neurons and their behaviors, and then several mathematical models of spiking neurons. In conventional ANNs and DNNs, a simple neuron model is used of which operation is expressed as the application of activation function to the weighted sum of its input. On the contrary, in SNNs, spike neurons can be modelled in various ways using differential equations with several hyperparameters like threshold level, time constant, refractory time, latency, and so on. We also present the notion of synaptic plasticity which is used to model local learning occurred in SNNs. Spiking neurons receive and generate a spike or a spike train. Hence all input and output for training should be encoded in spikes.
We present the neuron coding methods for input and output of SNNs. Then we describe how to simulate SNNs in software and what issues there are in the simulation.
In Section 3, we deal with learning methods for SNNs. They can be grouped into the unsupervised methods and the supervised methods. The unsupervised methods include some local learning algorithms. The supervised methods can be categorized into the direct methods, the ANN-SNN conversion methods, and the hybrid methods. The direct methods train directly the SNNs despite of the nonlinearity of the SNN activation functions using some surrogate functions or imposed restrictions. The ANN-SNN methods first train an ANN, and then convert the trained ANN into an SNN with the same architecture as the ANN. The hybrid methods first apply an ANN-SNN method and then fine-tune the converted SNN with a direct method.
Section 4 briefly reviews the neuromorphic hardware which have been developed for neuroscience studies and machine learning. Despite of various publications of neuromorphic hardware, there are few neuromorphic hardware on the market. In Section 5, we draw the conclusions along with the current research trends.
This section presents the behaviors of biological neurons, the computational models of spiking neurons, the notion of synaptic plasticity for brain learning, the architectures of SNNs, neural coding methods for input/output representations, and simulation of SNNs.
Brains consist of a large number of nerve cells called neurons (e.g., approximately 86 billions in the human brain) which are heavily interconnected each other [8]. Neurons communicate with each other via spikes also known as action potentials [9]. Action potentials are the electrical impulses which have a short duration of around 10 ms in brain. A neuron has three main components: dendrites, an axon, and a cell body called soma [10]. Dendrites are tree-shaped component which receive signals from axon of other neurons. Upon arriving spikes through dendrites, a soma increases or decreases its membrane potential and sends an action potential down an axon when the membrane potential reaches its threshold level. Axons are long nerve fiber which conducts action potentials away from the soma. When an action potential is transmitted through an axon, synaptic vesicles move towards an axon terminal and release neurotransmitters in the synaptic cleft [11]. The synaptic cleft is the small gap that separates the presynaptic and postsynaptic neurons. A synapse is the junction between the axon of one neuron and a dendrite of another, through which the two neurons communicate with neurotransmitters. Neurotransmitters are chemicals that influence a receiving neuron either to promote the generation of spikes or to prevent it by binding with their corresponding receptors of the dendrite of the receiving neuron [12]. Such bindings open or close the ion channels which cause the membrane potential to change. Synaptic weight refers to the connection strength at a synapse which is determined by the amount of released neurotransmitters, the amount of receptors to absorb, and the signal propagation resistance in the axons and the dendrite.
Membrane potential experiences a sequence of state phases when generating an action potential: resting state, depolarization, repolarization, and hyperpolarization, as shown in Figure 1 [13]. There exists the imbalance of electrical charges between the interior of neurons and their surroundings in a resting state. The resting state is the ground value of trans-membrane voltage which is negatively charged potential, approximately −70 mV. When neurotransmitter glutamate binds with the AMPA receptor, the sodium channels open, which results in a large influx of sodium ions. It causes a rapid rise of the membrane potential and when the membrane potential reaches the threshold level, the action potential starts to be generated. This phase of a rise of potential from negative towards positive is called depolarization. When membrane potential increases sufficiently enough to open the potassium channels, potassium ions start to move out of the neuron, which results in a rapid drop of membrane potential. The phase of such potential drop is called repolarization because membrane potential gets negatived charged back. Action potential is shaped during the phase shift from depolarization and repolarization. Slow close of potassium channels make an undershoot of membrane potential, which results in a period of potential lower than the potential of resting potential. The phase of such lower potential is called hyperpolarization. The period of hyperpolarization is called refractory time. The hyperpolarized potential gradually returns to the resting potential state by ion movement through the membrane.
Membrane potential increases when a spike is received through an excitatory synapse, whereas it decreases when a spike is received through an inhibitory synapse. Membrane potential leaks away exponentially over time.
Several mathematical models for spiking neurons have been developed to describe the characteristics of membrane potential change as shown in Figure 1. A spiking neurons receives spikes, accumulates them into its membrane potential, generates a spike when its potential reaches the threshold, and consumes its potential. In the spiking neuron models, there are Hodgkin-Huxley model, leaky integrate- and-fire (LIF) model, integrateand-fire (IF) model, spike response model (SRM), Izhikevich’s model, FitzHugh-Nagumo (FHN) model, and so on.
The Hodgkin-Huxley model [14] describes membrane potential in terms of sodium channel potential, potassium channel potential, and leak potential. For a neuron with sodium and potassium channels, it models the total current
where
LIF model [15] describes the membrane potential without introducing channel potentials in a simpler way than the Hodgkin-Huxley model. The LIF model integrates its injected current into membrane potential but allows membrane potential to slowly leak over time.
The LIF model can be expressed in the electric circuit shown in Figure 3. Its membrane potential V(t) evolves according to

τ_m dV(t)/dt = −(V(t) − V_rest) + R I(t),

where τ_m = RC is the membrane time constant, R is the membrane resistance, C is the membrane capacitance, V_rest is the resting potential, and I(t) is the injected current. When V(t) reaches the threshold level V_th, the neuron emits a spike and the membrane potential is reset to a reset value V_reset, where it stays during the refractory period.
Due to the integral term appearing in the solution of this differential equation, the LIF model is usually simulated with a discrete-time approximation in which the membrane potential is updated step by step.
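To make this concrete, the following is a minimal discrete-time LIF simulation sketch in Python; it is not taken from any of the cited papers, and the parameter values (time step, time constant, resistance, threshold) are illustrative assumptions.

# Minimal leaky integrate-and-fire (LIF) simulation sketch (illustrative parameters).
def simulate_lif(input_current, dt=1.0, tau_m=20.0, v_rest=-70.0,
                 v_reset=-70.0, v_th=-55.0, r_m=10.0):
    v = v_rest
    spikes = []
    for i_t in input_current:                              # input current at each time step
        v += (-(v - v_rest) + r_m * i_t) * (dt / tau_m)    # leaky integration (Euler step)
        if v >= v_th:                                      # threshold crossing -> spike
            spikes.append(1)
            v = v_reset                                    # reset after the spike
        else:
            spikes.append(0)
    return spikes

# Example: a constant input current applied for 100 time steps produces a regular spike train.
print(sum(simulate_lif([2.0] * 100)))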
The IF model is a simplified version of the LIF model, in which the leakage of membrane potential is ignored. It is somewhat weak in expressing the behavior of membrane potential in biological neurons, yet its simplicity can be beneficial in the computational and implementation aspects, especially in hardware. The difference equation for the IF model is as follows:

V[t] = V[t−1] + Σ_i w_i s_i[t],

where s_i[t] indicates whether a spike arrives from the i-th presynaptic neuron at time step t and w_i is the corresponding synaptic weight; when V[t] reaches the threshold, the neuron generates a spike and the potential is reset.
The SRM is a generalization of the LIF model whose equation is formulated using filters instead of a differential equation, as follows [16]:

u_i(t) = η(t − t̂_i) + Σ_j w_ij Σ_f ε(t − t_j^(f)) + ∫_0^∞ κ(s) I_ext(t − s) ds,

where u_i(t) is the membrane potential of neuron i, t̂_i is the last firing time of neuron i, η is a kernel describing the reset and after-potential following a spike, ε is a kernel describing the response of the membrane potential to a presynaptic spike arriving at time t_j^(f) through the synapse with weight w_ij, and κ is a kernel describing the response to an external current I_ext. A spike is generated when u_i(t) crosses the threshold from below.
The Izhikevich model [17] is a generalized neuron model that can generate most recognized firing patterns of biological neurons; it is as biologically plausible as the Hodgkin-Huxley model, yet as computationally efficient as the IF model. The model is expressed in the following two differential equations:

dv/dt = 0.04v^2 + 5v + 140 − u + I,
du/dt = a(bv − u),

with the auxiliary after-spike resetting as follows:

if v ≥ 30 mV, then v ← c and u ← u + d,

where v is the membrane potential, u is a membrane recovery variable, I is the input current, and a, b, c, and d are dimensionless parameters that determine the firing pattern.
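The Izhikevich equations above can be simulated with a simple Euler loop; the sketch below uses the commonly cited regular-spiking parameter set (a = 0.02, b = 0.2, c = −65, d = 8) purely as an illustration.

# Izhikevich neuron, Euler-integrated with a 1-ms step (regular-spiking parameters).
def simulate_izhikevich(input_current, a=0.02, b=0.2, c=-65.0, d=8.0, dt=1.0):
    v, u = -65.0, b * -65.0
    spike_times = []
    for t, i_t in enumerate(input_current):
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + i_t)
        u += dt * a * (b * v - u)
        if v >= 30.0:                 # spike: apply the after-spike resetting
            spike_times.append(t)
            v, u = c, u + d
    return spike_times

print(simulate_izhikevich([10.0] * 200))   # constant input current of 10 for 200 ms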
The FHN model is a simplified model of the Hodgkin-Huxley model, which is expressed in the following differential equations [18]:

dv/dt = v − v^3/3 − w + I,
τ dw/dt = v + a − b w,

where v is a variable playing the role of the membrane potential, w is a recovery variable, I is the external stimulus current, and a, b, and τ are parameters controlling the dynamics.
There are variants of the above-mentioned neuron models as well as other models. Neuroscience studies usually use the Hodgkin-Huxley model, Izhikevich's model, the FHN model, and their variants, whereas machine learning studies usually use the LIF model, the IF model, and the SRM.
Synaptic strengths, i.e., synaptic weights, between neurons strongly affect the behaviors of the brain. Synaptic plasticity is the ability of synapses to strengthen or weaken over time, which is widely believed to contribute to learning and memory in the brain. Spike-timing-dependent plasticity (STDP) is a biological process that adjusts synaptic weights based on tight temporal correlations between the spikes of presynaptic and postsynaptic neurons [19]. There are two major phenomena in synaptic plasticity: long-term potentiation (LTP) and long-term depression (LTD) [20]. LTP is a persistent strengthening of synapses, whereas LTD is a persistent weakening of synapses, both depending on recent patterns of spikes between presynaptic and postsynaptic neurons.
According to STDP, repeated presynaptic spike arrival a few milliseconds prior to postsynaptic spikes leads to LTP of the synapses in many synapse types. On the other hand, repeated presynaptic spike arrival after postsynaptic spikes leads to LTD of the synapses. Figure 5 shows an STDP function that plots the change of synaptic weight as a function of the relative timing of presynaptic and postsynaptic spikes. In the figure, a positive time difference (the presynaptic spike precedes the postsynaptic spike) yields potentiation, whereas a negative time difference yields depression.
Neuroscientists have paid attention to how to model STDP because it seems to play key roles in learning and information storage in the brain. Several mathematical models for STDP have been proposed. Machine learning researchers have also tried to apply STDP to train SNN models.
SNNs are ANNs that consist of spiking neurons, and they mimic biological neural networks more closely. SNNs incorporate the notion of time into their operations, which are carried out in a spike-timing-dependent manner. In conventional ANNs, neurons transmit information at each propagation cycle regardless of their activation value. On the contrary, in SNNs, neurons transmit information (i.e., spikes) to postsynaptic neurons only when their membrane potential reaches the threshold level. That is, only when a spike is generated at a neuron is the spike propagated to its postsynaptic neurons. This makes it possible to save energy because pulse-shaped spikes are propagated only when they are generated. SNNs are called the third-generation neural networks and have attracted attention as an energy-efficient neural network model in engineering aspects [21].
In SNNs, spiking neurons are connected with synapses by which the transmitted spikes are either amplified or attenuated. There are two types of synapses: excitatory and inhibitory synapses [22]. When a spike is propagated through an excitatory synapse, the membrane potential of the receiving neuron is increased. When a neuron receives a spike through an inhibitory synapse, its membrane potential is decreased.
SNNs can be organized into layered architectures, recurrent architectures, or hybrid architectures [23]. In a hybrid architecture, some subpopulations are layered, and others are recurrent. Among the hybrid architectures, there are the Synfire chain, the liquid state machine (LSM), and so on. A Synfire chain has a multi-layered architecture in which each layer is organized into a recurrent network, as shown in Figure 6 [24]. An LSM is a large recurrent network of spiking neurons, some of which receive the inputs and some of which are read out as the output values, as shown in Figure 7 [25]. In an LSM, only the weights connected to the output neurons are trained; all other weights are initialized and fixed.
Neuroscientists have been interested in building nearly human brain-sized neural networks and analyzing their behaviors to understand the mechanisms of various functions in the brain [26]. They deal with complex SNNs organized into recurrent or hybrid architectures. For such complex networks, there are not yet successful learning algorithms other than STDP and its variants, which are not powerful enough to train them. Hence, neuroscientists are usually less interested in training SNNs; they simulate the operations of huge SNNs and analyze their characteristic patterns. Machine learning model developers, in contrast, are interested in developing SNN models with high accuracy and low energy consumption. They usually deal with layered SNN models and have developed various learning algorithms for SNNs.
In SNNs, spiking neurons receive and produce spike signals. Hence all inputs to an SNN should be encoded into a spike or a spike train, which is a sequence of spikes spread over the time dimension. Likewise, the outputs of an SNN are a spike or a spike train, and such outputs need to be decoded into an understandable format like scalar values. These kinds of encoding and decoding are called neural coding.
A spike occurring at time point t^(f) can be expressed as a Dirac delta function δ(t − t^(f)), and a spike train can be expressed as the sum of such functions, s(t) = Σ_f δ(t − t^(f)), where δ(x) is zero everywhere except at x = 0 and integrates to 1.
The neural encoding methods can be classified into rate coding, temporal coding, population coding, and direct input coding [23]. In rate coding, a value is represented as a spike train whose firing rate is proportional to the value [27]. To generate a spike train with a specific firing rate, we can distribute uniformly as many spikes as (the firing rate) × (the latency), with some random perturbation of the occurrence times, over the time span of the latency. The latency indicates the number of time steps during which a spike train is presented. To generate a spike train, we can also use a Poisson distribution whose mean corresponds to the firing rate [28]. In addition, we can use a stochastic encoding method which normalizes input values into the interval [0, 1] and uses the normalized values as the probability of generating a spike at each time step.
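As an illustration of the stochastic rate-coding variant described above, the following sketch (with an assumed latency of 100 time steps) converts a value normalized to [0, 1] into a spike train by emitting a spike at each time step with probability equal to the value; the function name and parameters are hypothetical.

import random

def rate_encode(value, latency=100, seed=None):
    """Encode a normalized value in [0, 1] as a spike train of `latency` time steps.
    At each step a spike (1) is emitted with probability equal to the value."""
    rng = random.Random(seed)
    return [1 if rng.random() < value else 0 for _ in range(latency)]

train = rate_encode(0.3, latency=100, seed=0)
print(sum(train))   # roughly 30 spikes on average for an input value of 0.3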
The temporal coding methods generate a single spike at a specific time point to represent an input value. Among the temporal coding methods, there are the time-to-first-spike code, the rank code, and so on [23]. The time-to-first-spike code (a.k.a. latency code) represents a value as a spike whose onset is earlier for larger input values. In the rank code (a.k.a. spike-order code), all the values are first sorted in decreasing order, and then each value is assigned a spike time such that the order of the assigned spike times follows the order of those values. The population code makes a population (i.e., a set) of input neurons take care of an input value together [29]. Each neuron has its own receptive field whose sensitivity is roughly a Gaussian function, as shown in Figure 8. Once an input value is given, each input neuron generates a spike at the time corresponding to its Gaussian-like function value for that input.
In direct input coding [30], we use the input value as a constant current to its corresponding spiking neuron without generation of spikes. The input values are usually normalized into a specific interval like [0, 1].
When the neurons of the output layer reach their final membrane potentials, they produce spike(s) as the output. To interpret the output spike(s), we use decoding methods such as spike counting, rate-based decoding, temporal decoding, and so on. For SNNs for regression tasks, the spike counting method regards the count of spikes at the output nodes as the estimated value. For SNNs for classification tasks, the rate-based decoding method selects the node label with the maximum frequency of spikes as the output. For SNNs whose output neurons generate at most one spike, the node label with the earliest firing time is selected as the class label. Some SNNs use as the output value the membrane potential of the output neurons without generating spikes [31,32]. They directly apply the softmax function to the membrane potential values in order to get the class probabilities.
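A minimal decoding sketch along the lines described above; the spike-train representations and function names are hypothetical illustrations, not the formulations of any cited method.

def decode_spike_count(output_spike_train):
    # Regression-style decoding: the estimate is the number of output spikes.
    return sum(output_spike_train)

def decode_rate_label(output_spike_trains):
    # Classification decoding: pick the output neuron that fired most often.
    counts = [sum(train) for train in output_spike_trains]
    return counts.index(max(counts))

print(decode_rate_label([[0, 1, 0, 1], [1, 1, 1, 0], [0, 0, 0, 1]]))  # -> 1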
When a spike train is fed into an LIF neuron, the membrane potential of the neuron is expressed in the following differential equation:

τ_m dV(t)/dt = −(V(t) − V_rest) + R Σ_j w_j s_j(t),

where s_j(t) = Σ_f δ(t − t_j^(f)) is the spike train from the j-th presynaptic neuron, w_j is the corresponding synaptic weight, and the remaining symbols are as in the LIF model above.
SNNs can be executed on neuromorphic hardware specialized for SNN model execution, or simulated in software. Only a few neuromorphic hardware platforms are available on the market, and most of them support only limited SNN architectures. Hence, in the training and testing phases, software simulations are widely used. The simulation methods can be categorized into synchronous simulation and asynchronous simulation.
In synchronous simulations, also known as clock-driven simulations, all neurons are updated simultaneously at every tick of a clock. In asynchronous simulations, also known as event-driven simulations, neurons are updated only when they receive or produce a spike [33]. In the simulations, spikes are represented as 0 and 1, and neurons have a variable to keep the value of their membrane potential which is updated by the differential or difference equations corresponding to the adopted neuron model.
The differential equations can be rather simple, as in the LIF and IF models, so that they can be approximated by difference equations and evaluated step by step during the simulation.
When an SNN processes its input, the input is fed into the SNN over a specified latency. A spike pattern for the input is produced by the adopted neural encoding scheme. In synchronous simulations, it is usually assumed that at each time step (i.e., a tick of the clock) an input signal passes through the entire SNN from the input neurons to the output neurons. Machine learning applications usually use synchronous simulations for training SNN models, whereas neuroscience studies mainly use asynchronous simulations for investigating brain functions. The input to an SNN is given as a spike or spike train generated by the neural encoding scheme. Such a spike or spike train is generated either in advance or on the fly, depending on the adopted encoding scheme. When a uniform or Poisson distribution is used, the entire spike train needs to be generated in advance before feeding the input to the SNN. When a stochastic encoding method is used, the spikes of a spike train can be generated on the fly and fed into the input of the SNN.
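The following sketch illustrates a synchronous (clock-driven) simulation of a single fully connected layer of IF neurons; the layer size, weights, threshold, and input spike trains are illustrative assumptions.

def simulate_layer_clock_driven(input_trains, weights, v_th=1.0):
    """input_trains: list over time steps of binary input vectors.
    weights[j][i]: synaptic weight from input neuron i to output neuron j."""
    n_out = len(weights)
    v = [0.0] * n_out                       # membrane potentials of the output neurons
    output_trains = []
    for x_t in input_trains:                # one clock tick per time step
        out_t = []
        for j in range(n_out):
            v[j] += sum(w * s for w, s in zip(weights[j], x_t))  # integrate weighted spikes
            if v[j] >= v_th:                # fire and reset
                out_t.append(1)
                v[j] = 0.0
            else:
                out_t.append(0)
        output_trains.append(out_t)
    return output_trains

inputs = [[1, 0], [1, 1], [0, 1]]           # three time steps, two input neurons
print(simulate_layer_clock_driven(inputs, weights=[[0.6, 0.6], [0.2, 0.9]]))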
When an SNN model is organized for a simulation, the ensemble and connection paradigm can be used. The ensemble component is used to represent a group of neurons that operate together. The connection component is used to connect an ensemble to another or to the same ensemble. When a connection is made from an ensemble to itself, the ensemble becomes a recurrent network. When ensembles are arranged in a chain with connections, a multi-layered SNN is formed.
For the simulations of SNNs, there are various hyperparameters to control the behaviors of spiking neurons as follows: resting potential, minimum potential, threshold level, spike potential, refractory period, membrane time constant, latency, and so on. Simpler models for machine learning applications have a few hyperparameters, while complicated models for neuroscience studies have more hyperparameters. As the markup languages for exchanging the SNN models across the platforms, there are NeuroML, ONNX, NNEF, and so on. As the simulator-independent languages for designing and simulating SNNs, there are PyNN, EDLUT, and Nengo. There are also domain-specific languages such as OptiML and Corelet.
The learning methods for SNNs can be roughly categorized into unsupervised learning and supervised learning.
The typical unsupervised learning method for an SNN is a local training method such as the STDP method. As shown in Figure 5, the STDP algorithm adjusts synaptic weights in such a way that synaptic plasticity is controlled by the timing difference between the presynaptic and postsynaptic neurons' spike times. There are several variants of the STDP algorithm. The vanilla STDP method uses the following update quantity Δw for a synaptic weight:

Δw = A_+ exp(−Δt/τ_+) if Δt > 0, and Δw = −A_− exp(Δt/τ_−) if Δt ≤ 0,

where Δt = t_post − t_pre is the difference between the postsynaptic and presynaptic spike times, A_+ and A_− are the learning rates for potentiation and depression, and τ_+ and τ_− are the time constants of the exponential windows.
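A pair-based implementation of the above exponential STDP window might look as follows; the learning rates and time constants are illustrative choices.

import math

def stdp_delta_w(t_pre, t_post, a_plus=0.01, a_minus=0.012,
                 tau_plus=20.0, tau_minus=20.0):
    """Weight change for a single pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:      # pre before post -> potentiation (LTP)
        return a_plus * math.exp(-dt / tau_plus)
    else:           # post before (or at) pre -> depression (LTD)
        return -a_minus * math.exp(dt / tau_minus)

print(stdp_delta_w(t_pre=10.0, t_post=15.0))   # small positive weight change
print(stdp_delta_w(t_pre=15.0, t_post=10.0))   # small negative weight change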
There are several modifications of the vanilla STDP method. The following one is the weight change used by a modified STDP method which does not use the exponential function of the spike time difference:
where
There are other variants of the STDP method as follows [34]:
where
where
The Bienenstock-Cooper-Munro (BCM) rule is an unsupervised learning method in which weights are modified depending on the rates of the presynaptic and postsynaptic spikes [35]. Its update rule can be expressed as follows:

dw/dt = η x y (y − θ),

where x and y are the presynaptic and postsynaptic firing rates, η is a learning rate, and θ is a sliding modification threshold that increases with the recent average of the postsynaptic activity, so that the weight is potentiated when y exceeds θ and depressed otherwise.
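A minimal sketch of a BCM-style update under the assumptions above, where the sliding threshold is tracked as a running average of the squared postsynaptic rate (one common choice, not necessarily the exact formulation in [35]); all parameter values are illustrative.

def bcm_update(w, pre_rate, post_rate, theta, eta=1e-4, theta_decay=0.99):
    """One BCM-style step: potentiate when post_rate exceeds the sliding
    threshold theta, depress otherwise; theta tracks recent postsynaptic activity."""
    dw = eta * pre_rate * post_rate * (post_rate - theta)
    theta = theta_decay * theta + (1.0 - theta_decay) * post_rate ** 2
    return w + dw, theta

w, theta = 0.5, 1.0
for _ in range(10):
    w, theta = bcm_update(w, pre_rate=5.0, post_rate=2.0, theta=theta)
print(w, theta)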
Both the STDP and BCM rules are biologically plausible unsupervised training algorithms and are relatively easy to implement, yet they are usually difficult to apply to train high-accuracy models having multiple layers.
Diehl and Cook [34] applied the STDP rule to train a handwritten digit recognition SNN which consists of one excitatory neuron layer and one inhibitory neuron layer. Each excitatory neuron receives the spikes for all the pixels as its input, encoded using the Poisson distribution-based method. Each excitatory neuron has a connection to only one inhibitory neuron, while each inhibitory neuron is connected to all excitatory neurons except the one from which it receives its incoming connection. All synaptic weights from the input neurons to the excitatory neurons are learned using an STDP rule.
Kheradpisheh et al. [36] used the STDP rule to train a spiking deep convolutional neural network for object recognition. Their network has a multi-layered architecture in which convolution layers and pooling layers are interleaved, and the feature vector of the last pooling layer is used as the input to a support vector machine (SVM) classifier. For an input image, there is a temporal coding cell for each pixel location. The temporal coding cells first apply the difference of Gaussians (DoG) filter to the input image, and then convert each computed contrast into a spike according to the rank-order encoding method. The learning of the convolutional layers is carried out layer by layer using the STDP rule. The last layer is a global max pooling layer applied to each channel of its preceding convolutional layer. The results of global max pooling are used as the input to a linear SVM classifier.
The supervised learning methods can be categorized into direct training, ANN-SNN conversion, and hybrid training methods. In the direct training approach, the training methods use a differentiable surrogate function in place of the discrete activation function during the training phase, and apply a gradient-based optimization technique with the surrogate function. In the ANN-SNN conversion approach, a conventional ANN is first trained for the given training data, and its weights are then used to set the weights of an SNN of the same architecture. In the hybrid approach, we first use the ANN-SNN conversion approach to initialize the weights of an SNN, and then fine-tune the SNN with a direct training method.
The direct training methods make use of the inherent characteristics of spiking neurons such as spike timing. Table 1 summarizes some direct training methods in terms of their neuron type, architecture, input encoding method, output decoding method, and unique features. In the early days of SNN studies, direct methods that try to mimic biological behaviors attracted attention. Later algorithms have paid more attention to applying conventional neural network techniques, such as gradient-based optimization, to SNNs.
SpikeProp [37] is a training algorithm for a shallow SNN in which the input is encoded using the population code, the output is a single spike coded by the time-to-first-spike code, the neurons are of the SRM type, and each connection is made of multiple synaptic paths with different fixed delays and trainable weights, as shown in Figure 9. Because the output is given in a single spike, the objective of training is to make the actual spike time as close as possible to the desired spike time. Hence the loss function is defined as the sum of squared differences between the actual and desired firing times of the output neurons, E = (1/2) Σ_j (t_j^a − t_j^d)^2, and the weights are updated by gradient descent on this loss.
A remote supervised method (ReSuMe) [38] is a training algorithm for an SNN which consists of a front subnetwork and a following output layer, where the front subnetwork can be either feedforward, recurrent, or hybrid network like LSM shown in Figure 7. In ReSuMe, an SNN receives and generates a spike train, and interestingly there are teacher neurons each of which provides the information of desired spike timings for its corresponding output neuron. The weights for the output neurons are trained and the teacher neurons are not connected into the SNN although they provide such supervising information as shown in Figure 10. ReSuMe adjusts the weights so as to make the spike train generated by the SNN similar to the spike train presented by the teacher neurons. For a connection weight
where
The neural engineering framework (NEF) is a general methodology for building large-scale, biologically plausible, neural models of cognition [39]. It represents a vector
where
Two populations of neurons can be connected to do linear or nonlinear transformation of a vector represented by the preceding population into a vector represented by the following population. When such a connection is made, the connection weight
Under the assumption that
When
Backpropagation spike-timing-dependent plasticity (BP-STDP) [32] is an STDP-based training algorithm for multilayered SNNs which may contain convolutional layers as shown in Figure 11. For the SNNs, input vectors are encoded into spike trains in which the numbers of spikes (i.e., spike counts) correspond to scalar values in the input vector. The output of the SNNs is a spike train of which spike count is the output value. The loss function for BP-STDP is the mean square error for the differences of the output spike counts of an SNN and the desired output values. BP-STDP assumes the time step size of duration
BP-STDP uses the following weight change Δ
where
In the above equation,
where
Spatio-temporal backpropagation (STBP) [42] is a direct training algorithm for a shallow fully connected or convolutional SNN with LIF neurons, which receives and generates spike trains and regards the firing rate of the output spikes at the output layer as the inferred output value. It pays attention to the spatial and temporal domains in the execution of an SNN. In the spatial domain, an SNN processes its incoming spike signals from the preceding layer in a layer-by-layer manner. In the temporal domain, an SNN repeatedly updates the states of its neurons during the execution latency. This temporal aspect is closely related to the execution of recurrent neural networks (RNNs). STBP is a gradient-based training rule derived in a similar manner to backpropagation through time (BPTT) for RNNs. It uses surrogate gradients for the non-differentiable spike activation function.
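In PyTorch-style code, such a surrogate gradient can be realized with a custom autograd function; the rectangular window used below is only one common choice and is given as an illustration of the general idea, not as the exact surrogate used in STBP. The class name and parameter values are hypothetical.

import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike generation in the forward pass, rectangular surrogate in backward."""
    threshold = 1.0
    window = 0.5

    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v >= SurrogateSpike.threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        # pass gradients only for potentials near the threshold (rectangular window)
        near = ((v - SurrogateSpike.threshold).abs() < SurrogateSpike.window).float()
        return grad_output * near

v = torch.tensor([0.3, 0.9, 1.4], requires_grad=True)
spikes = SurrogateSpike.apply(v)
spikes.sum().backward()
print(spikes, v.grad)   # spikes: [0., 0., 1.]; gradient nonzero only near the threshold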
Spike-based backpropagation (SBBP) [43] is a training algorithm for an LIF-based SNN model which has no bias in its neurons, receives spike trains generated by the Poisson distribution-based rate coding method, and uses the average membrane potentials of the output neurons as the output values. Such SNNs consist of front-end convolutional layers with average pooling, and back-end fully-connected layers. Each convolution operation generates a spike only when the computed membrane potential is greater than or equal to the specified threshold level. Only when an average pooling result is greater than or equal to the specified threshold level, a spike is generated as the pooling value. Because the output of the SNNs is a scalar value (i.e., average membrane potential), the loss function
where
where
The ANN-SNN conversion methods first train an ANN model and then construct an SNN model of the same architecture, whose weights are initialized with the weights of the trained ANN model. Table 2 summarizes the characteristics of some ANN-SNN conversion methods.
Hunsberger and Eliasmith's method [44] first trains an ANN which may contain convolutional layers and average pooling, and whose neurons have no bias terms. The ANN uses the so-called soft-LIF activation function instead of ReLU. The soft-LIF is a firing-rate function of the input current similar to that of the LIF neuron, as shown in Figure 13, but the soft-LIF firing rate function is differentiable while the LIF firing rate function is not. The soft-LIF function is defined as follows:
where
Cao et al.’s method [45] first trains a tailored CNN model in which all layers produce positive values, all neurons at convolutional layers and fully connected layers have no biases, and average pooling is used instead of max pooling. For the tailed CNN model, an SNN is organized with the same architecture, which uses IF neurons with no bias terms, receives and generates spike trains, and uses average pooling, if any. Once the CNN model is trained, its weights are used to initialize the weights of the organized SNN.
Diehl et al.’s method [46] is an improvement of Cao et al.’s method [45] which first trains an ANN model, and normalizes its weights before deploying them into an SNN model of the same architecture. When the weights of a trained ANN model are directly used as those of its corresponding SNN model, the neurons of the SNN model may get insufficient membrane potential to reach the threshold level. In addition, some membrane potentials are too large to generate just a single spike. To handle these issues, the weight normalization techniques have been developed. Among them, there are the model-based normalization and the data-based normalization [46]. In the model-based method, weights are normalized in a layer-wise manner, which are divided by the maximum of the positive weights. In the data-based method, we choose as the scaling factors the maximum of the activation of training data for each neuron of the trained ANN model. Then we divide the weights to each neuron by its scaling factor or use the scaling factor as the threshold level of the corresponding neuron for the SNN model. Those weight normalization methods have shown that such a weight-normalized SNN model gives better performance than the baseline model with no weight normalization.
In [30], the authors proposed an algorithm that first trains an ANN having some architectural restrictions, and then converts it into an SNN model whose neurons are IF neurons with bias terms. It established a theoretical foundation for the relationship between the ANN activation and the firing rate of spiking neurons. It supports the following two reset modes for the membrane potential at spike generation: reset-to-zero mode and reset-by-subtraction mode. The transformed SNN model allows its neurons to have bias terms, uses the input as an input current to the neurons of the first layer, supports max pooling by using a gating function, and generates spikes according to the softmax probability at the output layer. When weights are transferred from the trained ANN model to the SNN, it uses a slightly modified data-based weight normalization method.
The Whetstone method [47] is a process to train binary, threshold-activation SNNs using existing deep learning methods. It first trains an ANN until its performance no longer improves. Then it progressively sharpens the activation function toward a step activation, one layer at a time beginning from the input layer, while monitoring performance. The sharpening process is automated with an adaptive sharpening schedule. As the activation function, it uses the bounded rectified linear unit (bReLU), a ReLU whose output is clipped to the interval [0, 1].
As the sharpening proceeds, the linear region of the bReLU shrinks and the function approaches a binary step activation, so that the trained network can be executed with spiking (binary) activations.
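A sketch of the sharpening idea, assuming a bReLU parameterized by a half-width h that is annealed toward zero; the exact parameterization and schedule used in Whetstone may differ, so this is only an illustration.

def brelu(x, h=0.5):
    """Bounded ReLU sketch: linear between 0.5 - h and 0.5 + h, clipped to [0, 1].
    As h approaches 0 the function approaches a step activation at 0.5."""
    if h == 0.0:
        return 1.0 if x >= 0.5 else 0.0
    y = (x - (0.5 - h)) / (2.0 * h)
    return min(max(y, 0.0), 1.0)

for h in (0.5, 0.25, 0.0):      # progressively sharpened activation
    print([round(brelu(x, h), 2) for x in (0.0, 0.4, 0.5, 0.6, 1.0)])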
Sengupta et al.’s method [48] is an ANN-SNN conversion method of which SNN models have IF neurons with no bias terms, and may include the average pooling and the identity skip connections for deeper networks. The method first trains an ANN, next initializes an SNN of the same architecture with the trained weights, then do the threshold balancing to adjust the threshold level of spiking neurons. For threshold balancing, it uses Spike-Norm [48] which can be regarded as an improvement of Diehl et al.’s normalization method [46]. Spike-Norm chooses as the scaling factor for each neuron the maximum membrane potential for the training data in the converted SNN. Then it uses the scaling factor as the threshold value of spiking neurons.
The RMP-SNN method [49] is an ANN-SNN conversion method in which the SNN model uses IF neurons with soft reset [57]. Hard reset refers to a mechanism that resets the membrane potential of a spiking neuron to a pre-specified low potential just after generating a spike when the membrane potential reaches the threshold level. On the other hand, soft reset (a.k.a. reset by subtraction) is a mechanism that subtracts the threshold from the membrane potential after generating a spike, so that the residual potential in excess of the firing threshold is retained.
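The two reset modes can be contrasted with a small sketch; the threshold value and function name are assumptions for illustration.

def reset_after_spike(v, v_th=1.0, mode="soft"):
    """Apply the reset used after a spike is generated."""
    if mode == "hard":
        return 0.0             # hard reset: discard any residual potential
    return v - v_th            # soft reset: keep the potential in excess of the threshold

print(reset_after_spike(1.3, mode="hard"))   # 0.0
print(reset_after_spike(1.3, mode="soft"))   # 0.3 is carried over to the next time step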
The trained ANN uses the ReLU activation function, and its weights are transferred to an SNN of the same architecture. A neuron with the ReLU function produces an output proportional to its weighted input sum, but a spiking neuron with hard reset usually does not produce spikes at a rate proportional to its membrane potential. It is assumed that the conversion loss of ANN-SNN conversion is caused by this nonlinearity between the membrane potential and the spiking rate in an IF neuron with hard reset. In the RMP-SNN method, an SNN consists of IF neurons with soft reset, which receive spike trains and generate spike trains. To guarantee an approximately linear relationship between the membrane potential and the firing rate, the method controls the range of the threshold through threshold balancing.
Deng and Gu [50] paid attention to the conversion loss from an ANN to its corresponding SNN in terms of the activation function and the reset operation after spike generation. To begin with, they adopt IF neurons with soft reset [57] to reduce the loss. In addition, they observed that using a shifted threshold ReLU in the ANN, together with a corresponding bias shift in the SNN, further reduces the layer-wise conversion error.
Ding et al. [51] introduced a weight normalization method called rate normalization. The method adjusts the threshold level of each layer using optimal scaling factors so that the firing rates of the spiking neurons better match the activations of the corresponding ANN neurons.
Patel et al. [52] applied an ANN-SNN conversion method to develop an SNN-based U-Net [53] for 2D image segmentation. When they train a U-Net, they use a modified ReLU function as shown in Figure 16, which is defined as follows:
where Δ
Once the U-Net is trained, it is converted into a multi-layered SNN. They use a percentile-based loss function which regularizes the maximum firing rate of each neuron across all examples in a batch to be between a minimum and a maximum value. Once the SNN is obtained, they apply a quantization method and a partitioning method to the SNN so as to deploy it onto the Loihi chip [60], which is a neuromorphic chip.
The hybrid training methods first initialize weights of an SNN model with those trained by the ANN-SNN conversion method, and then fine-tune the SNN model with a direct learning method. Table 3 summarizes the characteristics of some hybrid methods.
Rathi et al.'s method [54] first uses an ANN-SNN conversion method similar to Diehl et al.'s method [46] to obtain an SNN which uses LIF neurons with no biases, uses average pooling for the pooling operation, receives spike trains generated by the Poisson rate coding method, has output neurons with no leakage and no spike generation, and applies the softmax function to the accumulated membrane potentials of the output neurons to get classification probabilities. For fine-tuning the SNN, it uses the cross-entropy as the loss function.
where
The direct input encoding with leakage and threshold optimization in deep spiking neural networks (DIET-SNN) algorithm [55] first trains an ANN, next converts the trained ANN to an SNN, and then fine-tunes the SNN using a surrogate gradient. For the ANN, it uses the ReLU activation function with no bias terms for the neurons, does not apply batch normalization, and uses average pooling, if needed. In the ANN-SNN conversion, the converted SNN consists of IF neurons, the input is fed directly into the neurons of the SNN without any encoding, and a threshold balancing method is applied. In the fine-tuning phase, the SNN consists of LIF neurons, receives the input as an input current directly to the neurons of the input layer, and produces probabilities computed by applying the softmax function to the accumulated membrane potentials of the output layer. As the loss function, the cross-entropy is used. In computing the gradient of the loss function with respect to the weights, a surrogate of the non-differentiable spike activation is used.
where
Takuka et al.’method [58] uses a knowledge distillation technique which effectively learns a small student model from a large trained teacher model. As shown in Figure 17, the method first trains a large ANN which generates the class probabilities, and next trains a small ANN from the trained large ANN using a knowledge distillation technique which uses the following loss function
where
where
The SNN has the same architecture as the small ANN. In the SNN, each input neuron directly receives the input value as an input current, and the output neurons do not generate spikes but produce the class probabilities by applying the softmax function to their membrane potentials.
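The distillation loss in this family of methods generally combines a softened teacher-student term with an ordinary cross-entropy term; the sketch below shows the standard temperature-scaled formulation as an illustration, and the exact weighting used in [58] may differ. The function names and parameter values are hypothetical.

import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term (teacher) and a hard-target cross-entropy term."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    soft = -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student_soft))
    p_student = softmax(student_logits)
    hard = -math.log(p_student[true_label])
    return alpha * (temperature ** 2) * soft + (1.0 - alpha) * hard

print(distillation_loss([2.0, 0.5, -1.0], [3.0, 0.2, -2.0], true_label=0))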
Neuromorphic hardware is specialized hardware that simulates SNNs very fast and efficiently. Various neuromorphic hardware platforms have been developed as processor clusters, FPGAs, or chips. Some neuromorphic hardware supports only specific hardwired SNN architectures, while others allow the architecture of SNNs to be configured.
Some neuromorphic hardware such as SpiNNaker, BrainScaleS, and Neurogrid has been developed for neuroscience simulations to study the brain [59]. SpiNNaker uses a network of ARM processors tightly coupled to local memory as its building blocks; it is housed in 10 racks, with each rack holding over 100,000 cores. It supports several spiking neuron models including the LIF and Izhikevich models, and provides software tools for training SNNs. BrainScaleS is constructed with several interconnected wafers, each wafer consisting of 384 cores, 200K neurons, and 45M synapses, and is used to simulate brain-scale neural networks. Neurogrid consists of 16 Neurocores, each of which has 65,536 neurons, and can simulate one million neurons and six billion synapses in real time.
A few neuromorphic chips such as TrueNorth and Loihi have been developed, which target low-power, large-scale SNN evaluation. TrueNorth is a chip which consists of 4,096 cores with 1 million neurons and 256 million synapses; its neuron is a modified LIF neuron. Loihi is a chip with a many-core mesh comprising 128 neuromorphic cores, 3 embedded x86 processor cores, and off-chip communication interfaces that hierarchically extend the mesh in 4 planar directions to other chips, up to 16,384 chips. It supports the LIF neuron model, which can be used as an IF neuron when the leakage is set to zero. Neither TrueNorth nor Loihi was available for general research and development purchase at the time of writing [60].
Several FPGA boards for SNNs, such as PYNQ-Z1 and DE1-SoC, are on the market. Such boards support SNNs with a fixed number of layers (e.g., 1 to 4 layers) and limited types of neuron models. Some of them support online learning such as PES learning or STDP learning. There are also some analog-digital chips designed to support a fixed SNN architecture [61]. In neuromorphic chips, SNN models are usually trained offline, and the trained models are later downloaded onto the chips. Due to the complexity of the circuits, online learning algorithms other than STDP are usually not supported in neuromorphic chips.
In the semiconductor sector, neuron models and synaptic connections have been designed and experimentally fabricated using CMOS and memristor technologies [62]. They have so far shown only the possibility of energy-efficient neuromorphic chips, and there is not yet widely accessible neuromorphic hardware on which SNNs can be deployed and executed at the time of writing.
Spiking neural networks have attracted attention for their energy-efficient operation and biological plausibility. Neuroscientists are interested in simulating brain-scale SNNs to study brain functions. In machine learning, DNNs have achieved success in formerly difficult tasks like vision, speech, and natural language processing, and machine learning researchers have worked on developing SNNs that are as good as DNNs. The performance of SNNs is approaching that of DNNs, but it is not yet sufficient to replace them.
This paper has addressed SNNs mainly from the perspective of machine learning. Various SNN learning algorithms have been developed and are still being developed. The direct learning algorithms still seem to have difficulty training deep SNNs, while the ANN-SNN conversion algorithms currently seem to be the best way to build deep SNNs. More effort will be needed to reduce the conversion loss from an ANN to its corresponding SNN. The hybrid learning methods will take further advantage of both the ANN-SNN conversion algorithms and the direct training algorithms as each of them advances.
An ANN takes just a single pass from the input layer to the output layer, but an SNN has to run over multiple time steps to obtain a stable output. Hence, one research direction in SNNs is to reduce the latency of SNN execution.
Most SNN machine learning studies have been conducted in software simulation, and access to neuromorphic hardware is still limited. Once low-cost neuromorphic hardware becomes available, SNN models are expected to be widely deployed in edge devices of IoT environments due to their energy efficiency.
Phase shifts of membrane potential [
Hodgkin-Huxley model.
Leaky integrate-and-fire (LIF) model.
Izhikevich model [
An spike-timing-dependent plasticity (STDP) function.
A Synfire chain architecture [
A liquid state machine architecture [
Population coding.
An SNN with multiple synaptic connections [
An SNN for ReSuMe training [
An SNN for BP-STDP training [
Surrogate gradient functions which lim
Firing rate functions for soft-LIF (dotted curve) and LIF (solid curve) [
Sharpening of bReLU function at the Whetstone method.
Activation functions of ReLU, threshold ReLU, and SNN.
A modified ReLU function and its derivative.
Knowledge distillation-based SNN training [
Table 1 . Direct training algorithms.
Algorithm | Neuron model | Architecture | Input encoding | Output decoding | Features |
---|---|---|---|---|---|
SpikeProp (2000, [37]) | SRM | Shallow network | Population code | Time-to-first code | Surrogate gradient; multiple delayed synaptic terminals |
ReSuMe (2005, [38]) | don’t care | (FF, RNN, LSM)+ trainable single layer | Spike train | Spike train | Train the weights for the last layer; STDP & anti-STDP |
PES (2011, [40]) | IF/LIF model | Two-layered network | Spike train (firing rate) | Spike train (firing rate) | MSE loss for decoded value |
STBP (2018, [42]) | LIF | Shallow network | Spike train (rate code) | Spike train (firing rate) | BPTT-like over spatial & time domains |
BP-STDP (2019, [32]) | LIF | Deep network | Spike train (spike count) | Direct output (spike count) | Backpropagation + STDP |
SBBP (2019, [43]) | IF/LIF | Deep network | Spike train (rate code) | Direct output (membrane potential) | Surrogate gradient |
Table 2 . ANN-SNN conversion algorithms.
Algorithm | Neuron model | Architecture | Input encoding | Output decoding | Features |
---|---|---|---|---|---|
soft-LIF (2015, [44]) | soft-LIF (ANN) | Deep network | Spike train (rate code) | Spike train (firing rate) | Use soft-LIF in ANN for LIF |
Cao et al. (2015, [45]) | ReLU (ANN) | Shallow network | Spike train (rate code) | Spike train (firing rate) | Constrained arch.; avg. pooling, no bias |
Diehl et al. (2015, [46]) | ReLU (ANN) | Shallow network | Spike train (rate code) | Spike train (firing rate) | Constrained arch.; weight normalization |
Rueckauer et al. (2017, [30]) | ReLU (ANN) | Deep network | Direct input | Spike train (firing rate) | Constrained arch.; batch norm.; softmax |
Whetstone (2018, [47]) | bReLU (ANN) | Deep network | Spike train (rate code) | Spike train (firing rate) | Adaptive sharpening of activation function |
Sengupta et al. (2019, [48]) | ReLU (ANN) | Deep network | Spike train (rate code) | Spike train (firing rate) | Normalization in SNN; Spike-Norm |
RMP-SNN (2020, [49]) | ReLU (ANN) | Deep network | Spike train (rate code) | Spike train (firing rate) | IF with soft-reset; control threshold range; threshold balancing |
Deng et al. (2021, [50]) | thr. ReLU (ANN) | Deep network | Spike train (rate code) | Spike train (firing rate) | Conversion loss-aware bias adaptation; threshold ReLU; shifted bias |
Ding et al. (2021, [51]) | RNL (ANN) | Deep network | Spike train (rate code) | Spike train (rate code) | Optimal scaling factors for threshold balancing |
Patel et al. (2021, [52]) | mod. ReLU (ANN) | Scaled-down U-Net | Spike train (rate code) | Spike train (rate code) | Image segmentation; Loihi deployment |
Table 3 . Hybrid training algorithms.
Algorithm | Neuron model | Architecture | Input encoding | Output decoding | Features |
---|---|---|---|---|---|
Rathi et al. (2020, [54]) | ReLU (ANN) | Deep network | Spike train (rate coding) | Direct output (membrane potential) | ANN-SNN conv. + STDB; ST-based surrogate gradient |
DIET-SNN (2020, [55]) | ReLU (ANN) | Deep network | Direct input | Direct output | Trainable leakage and threshold in LIF |
Takuya et al. (2021, [58]) | ReLU (ANN) | Deep network | Direct input | Direct output (membrane potential) | Knowledge distillation for conv.; fine-tuning |