Featured paper: Efficient event-based delay learning in spiking neural networks

Disclaimer: This content was generated by NotebookLM. Dr. Tram doesn’t know anything about this topic and is learning about it.

Artificial Intelligence (AI) has exploded in popularity, driving incredible improvements in technology over the last decade. But this progress comes at a huge cost: traditional Artificial Neural Networks (ANNs) consume immense amounts of energy. Think about it: the human brain, which is the ultimate computing machine, operates on a power budget of only about 20W. Clearly, we need a smarter, more efficient way to build our AI systems.

That’s where Spiking Neural Networks (SNNs) come in.

Why Spiking Neural Networks are the Next Big Thing

SNNs are a type of AI model that takes inspiration directly from neuroscience. Unlike standard ANNs, which communicate using dense, continuous values (like a constant stream of water), neurons in the brain transmit sparse binary events called “spikes”. SNNs leverage these sparse communication patterns, making them excellent candidates for energy-efficient machine learning, especially when running on specialized hardware designed for brain-like computing (called neuromorphic systems).
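
To make the contrast concrete, here is a tiny Python sketch (the numbers are invented, purely for illustration) of the two representations: a dense vector of continuous activations versus a sparse list of spike events:

```python
import numpy as np

# Dense ANN representation: every neuron outputs a continuous value at every step.
ann_activations = np.array([0.12, 0.87, 0.05, 0.44])   # one float per neuron, per step

# Spiking representation: only the (time, neuron) pairs where a spike occurred.
# Everything else is implicitly zero, so nothing is stored or transmitted for it.
spike_events = [(1.3, 0), (4.7, 2), (9.1, 0)]           # (time in ms, neuron index)

for t, neuron in spike_events:
    print(f"neuron {neuron} spiked at t = {t} ms")
```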

Because spiking neurons have state and hidden temporal dynamics, they possess an “implicit recurrence,” meaning they are inherently good at processing information that changes over time, like sequences or videos. In short, SNNs are ideal for spatio-temporal tasks.
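
As an illustration (not code from the paper; the time constant and threshold are made up), here is a toy leaky integrate-and-fire neuron in NumPy. The membrane potential v is the state that ties one time step to the next, which is exactly the "implicit recurrence" mentioned above:

```python
import numpy as np

def lif_forward(input_current, tau_mem=20.0, dt=1.0, v_thresh=1.0):
    """Toy leaky integrate-and-fire neuron: the membrane potential v is state
    that carries information from one time step to the next, even though no
    explicit recurrent weights are present."""
    alpha = np.exp(-dt / tau_mem)      # per-step leak factor
    v, spikes = 0.0, []
    for t, i_t in enumerate(input_current):
        v = alpha * v + i_t            # the previous state leaks into the present
        if v >= v_thresh:              # a threshold crossing emits a spike...
            spikes.append(t)
            v = 0.0                    # ...and resets the state
    return spikes

print(lif_forward(np.array([0.3, 0.3, 0.3, 0.0, 0.9, 0.9])))
```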

However, even though SNNs are, like ANNs, universal function approximators, matching or surpassing the performance of traditional ANNs remains a major challenge.

The Training Challenge: Gradients and Spikes

The most common way to train an ANN is a method called gradient descent. This requires calculating gradients (how much a small change in each parameter affects the final outcome). But in an SNN, things get complicated because the act of a neuron spiking is a non-differentiable transition, making exact gradient calculation mathematically difficult.
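
A quick numerical illustration of the problem (the values are illustrative only): a spike is a hard threshold on the membrane potential, and the derivative of that threshold is zero almost everywhere:

```python
import numpy as np

v = np.linspace(0.0, 2.0, 201)          # membrane potentials around a threshold of 1.0
spike = (v >= 1.0).astype(float)        # Heaviside step: 0 below threshold, 1 at/above

# The derivative of this step is zero almost everywhere and undefined exactly at
# the threshold, so naive gradient descent gets no learning signal through it.
dspike_dv = np.gradient(spike, v)
print(np.count_nonzero(dspike_dv), "of", dspike_dv.size, "points have a nonzero derivative")
```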

Researchers have tried several workarounds:

  1. Converting ANNs: Training a standard ANN and then transferring its weights to an SNN. This doesn’t capture the full efficiency of sparse spiking during the training process.
  2. Surrogate Gradients: Unrolling the SNN in time and training it with Backpropagation Through Time (BPTT), replacing the non-differentiable spike with a smooth “surrogate gradient”. The huge drawback is that this requires storing the state of every neuron at every time step, so memory requirements scale linearly with the length of the data sequence; this limits the complexity of tasks SNNs can handle (a rough sketch of the memory problem follows this list).
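
Here is that rough sketch in plain NumPy (invented sizes, not the paper's code): the forward pass has to keep the state of every neuron at every time step so the backward pass can revisit it.

```python
import numpy as np

def bptt_forward(inputs, tau_mem=20.0, dt=1.0, v_thresh=1.0):
    """Sketch of why surrogate-gradient BPTT is memory-hungry: the membrane
    potential of every neuron at every time step must be kept for the backward
    pass, so memory grows linearly with sequence length (and with network size)."""
    alpha = np.exp(-dt / tau_mem)
    n_steps, n_neurons = inputs.shape
    v = np.zeros(n_neurons)
    v_history = np.empty((n_steps, n_neurons))    # the whole state trajectory is stored
    for t in range(n_steps):
        v = alpha * v + inputs[t]
        v_history[t] = v                          # needed later by the backward pass
        v = np.where(v >= v_thresh, 0.0, v)       # reset after a spike
    return v_history

history = bptt_forward(np.random.rand(1000, 256) * 0.1)
print(history.nbytes / 1e6, "MB of stored state for 1000 steps x 256 neurons")
```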

A more elegant approach is the EventProp algorithm, which calculates exact gradients. EventProp applies the adjoint method to systems that combine continuous change with discontinuous jumps (like a spike happening), yielding an event-driven counterpart of BPTT. It efficiently calculates “blame” for network errors only when spikes occur, rather than at every time step.
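
Schematically, this is what “event-based” buys you. In the original EventProp paper (Wunderlich & Pehle, 2021), the gradient of the loss with respect to a synaptic weight reduces, up to constants and sign conventions, to a sum of an adjoint (“blame”) variable sampled only at the presynaptic spike times:

\[
\frac{\partial L}{\partial w} \;\propto\; \sum_{t_{\text{spike}}} \lambda_I\!\left(t_{\text{spike}}\right)
\]

Nothing has to be evaluated between spikes, which is why memory and computation stay low when activity is sparse.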

Unlocking Potential with Delays

Even though SNNs are good at temporal tasks, the duration of their intrinsic memory is limited by parameters like synaptic time constants. This new research, detailed in the paper “Efficient event-based delay learning in spiking neural networks” (Mészáros et al. 2025), introduces a powerful additional mechanism: Delays.

Delays refer to the time it takes for a signal (a spike) to travel from one neuron to the next. In the brain, these synaptic delays arise naturally and can be adjusted. From a computational perspective, adding delays has been shown to significantly increase a network’s capacity, and specialized neuromorphic hardware such as SpiNNaker and Loihi is already built to accommodate synaptic delays efficiently.
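
To make “delay” concrete, here is a toy calculation (made-up numbers, not from the paper): a spike emitted at time t over a connection with delay d arrives at t + d, so suitable delays can make inputs that were emitted at different times coincide at the receiving neuron.

```python
# Emission times of spikes from two presynaptic neurons (ms) - made-up numbers.
emission = {"A": 2.0, "B": 7.0}

# Per-connection delays onto the same postsynaptic neuron (ms).
delay = {"A": 5.0, "B": 0.0}

# A spike emitted at time t arrives at t + delay, so these two spikes,
# emitted 5 ms apart, now arrive together and can drive the neuron jointly.
arrival = {pre: t + delay[pre] for pre, t in emission.items()}
print(arrival)   # {'A': 7.0, 'B': 7.0}
```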

While learning delays has been attempted before, most methods rely on inefficient surrogate gradients or on complex temporal convolutions, both of which incur large overheads in memory and computation. Previous exact-gradient methods for delays (such as DelGrad) were limited to simple feedforward networks in which each neuron spikes only once.

The Breakthrough: EventProp with Learnable Delays

The authors developed an event-based training method for SNNs that includes learnable delays, which builds upon the EventProp framework.

By extending EventProp, they can calculate exact gradients with respect to both weights and delays. Crucially, this new method supports multiple spikes per neuron and can be applied to recurrent SNNs - networks where connections loop back on themselves.

How does it handle delays efficiently? In the original EventProp, the time a spike is emitted is also the time it arrives at the next neuron. With delays, these two events are separated. The researchers figured out how to adjust the EventProp mathematics so that the “blame” for an error still propagates backward in time, linking the spike arrival time (which is delayed) back to the original spike emission time. The resulting training algorithm remains fully event-based and therefore incredibly efficient.
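
Schematically, the chain-rule intuition behind that adjustment (a simplified sketch, not the paper's full derivation) is:

\[
t_{\text{arr}} = t_{\text{sp}} + d
\quad\Rightarrow\quad
\frac{\partial t_{\text{arr}}}{\partial d} = 1
\quad\text{and}\quad
\frac{\partial t_{\text{arr}}}{\partial t_{\text{sp}}} = 1 ,
\]

so whatever error signal the backward pass attaches to the delayed arrival time flows, unchanged, both into the update of the delay d and back to the neuron that emitted the spike at the earlier time t_sp.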

The algorithm was implemented in mlGeNN, a machine learning library built on the GPU-optimized GeNN simulator. In mlGeNN, increasing the range of delays only requires enlarging a small per-neuron buffer, leading to much lower memory overhead than approaches based on temporal convolutions.
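
For intuition, here is a minimal ring-buffer sketch in Python (an illustration of the general idea, not mlGeNN's actual implementation): the buffer holds max_delay + 1 slots per neuron, so memory grows with the longest delay rather than with the sequence length or a convolution kernel.

```python
import numpy as np

class DelayRingBuffer:
    """Toy circular buffer for delayed spike delivery (illustrative only).
    Each postsynaptic neuron keeps max_delay + 1 slots; a spike scheduled with
    delay d is written d slots ahead of the read pointer."""

    def __init__(self, n_neurons, max_delay):
        self.buffer = np.zeros((max_delay + 1, n_neurons))
        self.t = 0  # current read position

    def schedule(self, neuron, weight, delay):
        slot = (self.t + delay) % self.buffer.shape[0]
        self.buffer[slot, neuron] += weight       # accumulate delayed input

    def advance(self):
        """Return this step's delivered input, then clear the slot and move on."""
        current = self.buffer[self.t].copy()
        self.buffer[self.t] = 0.0
        self.t = (self.t + 1) % self.buffer.shape[0]
        return current

buf = DelayRingBuffer(n_neurons=4, max_delay=10)
buf.schedule(neuron=2, weight=0.5, delay=3)       # arrives 3 steps from now
for step in range(4):
    print(step, buf.advance())
```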

Impressive Results: Speed, Efficiency, and Smarter Networks

The new method was tested on various machine learning tasks, demonstrating its power and efficiency:

  1. Sequence Detection: On a simple binary task designed to test whether the network can sense the order of two inputs, the algorithm optimized the delays from a poor starting distribution to achieve 100% accuracy. This demonstrates that learned delays let SNNs detect the temporal order of events, not just their coincidence.

  2. Efficiency Wins: Benchmarked against the current state-of-the-art delay learning method (which uses dilated convolutions), the new EventProp approach was up to 26 times faster and used less than half the memory. Importantly, its memory and computational costs grew only slowly with longer delays, unlike those of convolution-based methods.

  3. Enhanced Accuracy with Fewer Parameters: On datasets like the Spiking Heidelberg Digits (SHD) and Spiking Speech Commands (SSC), the algorithm improved classification accuracy. For instance, on SHD it reached accuracy comparable to the state of the art with approximately five times fewer parameters than a previous EventProp approach.

  4. Recurrent Delays Shine in Small Networks: A major finding across multiple datasets (SHD, SSC, and Braille reading) was that recurrent delays provided substantial advantages when networks were constrained in size. In other words, when you can’t make your network bigger, learning how to precisely time the internal communication dramatically boosts performance.

This research successfully combines the theoretical efficiency needed for future neuromorphic hardware with the practical speed of current GPUs. By integrating learnable delays into an extended EventProp framework, it opens the door for SNNs to become a more powerful, versatile, and, most importantly, energy-efficient foundation for machine learning, helping us move toward AI that uses time as intelligently as the human brain does.

