Models Categorization

Given the brain-decoding framework from the previous page, we define our neural networks, denoted \(f\), as a composition of sequential transformations:

\[f_{\mathrm{method}} \;=\; f_{\mathrm{linear}} \circ \cdots \circ f_{\ell} \circ \cdots \circ f_{\mathrm{convolution}}\,,\]

where each \(f_\ell\) is the \(\ell\)-th layer of the neural network. The parameters \(\theta \in \Theta\) of these layers are fit on the training data to learn the mapping \(f_{\mathrm{method}} : \mathcal{X} \to \mathcal{Y}\). How these core layer-wise transformations are structured and combined defines the overall focus and strengths of each model.
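As a minimal sketch of this idea, the snippet below chains a few toy layers, applied one after another from a first convolution-like stage to a final linear readout. The helper `compose` and the three stand-in layers are hypothetical names introduced only for illustration. They are not part of Braindecode, and real models would use trainable PyTorch modules instead of plain functions.

```python
from functools import reduce
from typing import Callable, List, Sequence

Layer = Callable[[List[float]], List[float]]


def compose(layers: Sequence[Layer]) -> Layer:
    """Chain layers so that layers[0] is applied first and layers[-1] last."""
    def f_method(x: List[float]) -> List[float]:
        return reduce(lambda h, layer: layer(h), layers, x)
    return f_method


# Hypothetical stand-in layers with fixed (not learned) parameters.
def f_convolution(x: List[float]) -> List[float]:
    # Unpadded temporal convolution with kernel [-1, 1]: successive differences.
    return [b - a for a, b in zip(x, x[1:])]


def f_relu(x: List[float]) -> List[float]:
    # Element-wise non-linearity.
    return [max(0.0, v) for v in x]


def f_linear(x: List[float]) -> List[float]:
    # Linear readout to one output, with all weights set to 1.
    return [sum(x)]


f_method = compose([f_convolution, f_relu, f_linear])
output = f_method([0.0, 1.0, 3.0, 2.0, 5.0])
```

Swapping, adding, or removing entries in the list passed to `compose` changes which kind of transformation dominates the model, which is the sense in which the families below are distinguished.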

Here, we categorize the main families of brain decoding models based on their core components and design philosophies. The categories are not mutually exclusive; each indicates what chiefly governs a given neural network model, and many models blend elements from multiple families to leverage their combined strengths. There are nine categories: Convolution, Recurrent, Small Attention, Filterbank, Interpretability, Large Language Model, Graph Neural Network, Symmetric Positive-Definite, and Channel.

Not all of the categories are implemented, validated, and tested yet, but some are noteworthy for introducing or popularizing concepts or layer designs that can take decoding further.

The convolutional layer appears as the core primitive across most architectures. This is because convolutions are filtering operations (a band-pass filter is one example), and such filters are needed to extract local features from brain signals. More details about each category can be found in the respective sections below.
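To illustrate why a convolution acts as a filter, here is a small pure-Python sketch in which a fixed kernel (a Hann-windowed cosine) behaves as a band-pass filter around 10 Hz. The sampling rate, kernel length, centre frequency, and all function names are assumptions chosen for the example. In an actual network the kernel weights would be learned rather than hand-designed.

```python
import math


def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal without padding (no kernel flip,
    which is how deep-learning convolution layers compute it)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def sine(freq_hz, n_samples, sfreq):
    """Unit-amplitude sine wave sampled at sfreq Hz."""
    return [math.sin(2 * math.pi * freq_hz * t / sfreq) for t in range(n_samples)]


def peak(x):
    """Largest absolute value in the sequence."""
    return max(abs(v) for v in x)


SFREQ = 100      # sampling rate in Hz (assumed)
KERNEL_LEN = 50  # kernel spans 0.5 s (assumed)
CENTER_HZ = 10   # centre of the pass band (assumed)

# Hann-windowed cosine: a simple FIR band-pass kernel centred on CENTER_HZ.
window = [
    0.5 - 0.5 * math.cos(2 * math.pi * j / KERNEL_LEN) for j in range(KERNEL_LEN)
]
gain = sum(window) / 2  # normalises the response at CENTER_HZ to about 1
kernel = [
    w * math.cos(2 * math.pi * CENTER_HZ * j / SFREQ) / gain
    for j, w in enumerate(window)
]

# Filter three test tones: one inside the pass band, two outside it.
peak_in_band = peak(conv1d_valid(sine(10, 300, SFREQ), kernel))
peak_slow = peak(conv1d_valid(sine(2, 300, SFREQ), kernel))
peak_fast = peak(conv1d_valid(sine(25, 300, SFREQ), kernel))
```

The 10 Hz tone passes through with close to its original amplitude, while the 2 Hz and 25 Hz tones are strongly attenuated. A temporal convolution layer can learn kernels with this kind of frequency selectivity directly from data.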

Convolution Layers

Convolution

Diagram of a convolutional layer

Applies temporal and/or spatial convolutions to extract local features from brain signals.

Recurrent Layers

Recurrent

Diagram of recurrent/TCN models

Models temporal dependencies via recurrent units or TCNs with dilations.

Small Attention

Small Attention

Diagram of attention modules

Uses attention mechanisms for feature focusing. Can be trained effectively without self-supervised pre-training.

Filterbank Models

Filterbank

Diagram of filterbank models

Decomposes signals into multiple bands (learned or fixed) to capture frequency-specific information.

Interpretability-by-Design

Interpretability

Diagram of interpretable architectures

Architectures with inherently interpretable layers allow direct neuroscientific validation of learned features.

Symmetric Positive-Definite

SPD To be released soon!

Diagram of SPD learning

Learns on covariance/connectivity as SPD matrices using BiMap/ReEig/LogEig layers.

Large Transformer Models

Large Language Model

Diagram of transformer models

Large-scale transformer layers require self-supervised pre-training to work effectively.

Graph Neural Network

Graph Neural Network

Diagram of GNN models

Treats channels/regions as nodes with learned/static edges to model connectivity.

Channel-Domain

Channel

Diagram of channel-domain methods

Uses montage information with spatial filtering and channel, hemisphere, or brain-region selection strategies.

  • Across most architectures, the earliest stages are convolutional (Convolution), reflecting the noisy, locally structured nature of brain time series. These layers apply temporal and/or spatial convolutions (often depthwise-separable, as in EEGNet), per channel or across channel groups, to extract robust local features. Examples include EEGNet, ShallowFBCSPNet, EEGNeX, and EEGInceptionERP.

  • In the recurrent family (Recurrent), many modern EEG models actually rely on temporal convolutional networks (TCNs) with dilations to grow the receptive field, rather than explicit recurrence (11); one example is BDTCN.

  • In contrast, several methods employ small attention modules (Small Attention) to capture longer-range dependencies efficiently, e.g., EEGConformer, CTNet, ATCNet, AttentionBaseNet (12, 13, 14).

  • Filterbank-style models (Filterbank) explicitly decompose signals into multiple bands before (or while) learning, echoing the classic FBCSP pipeline; examples include FBCNet and FBMSNet (15, 16).

  • Interpretability-by-design (Interpretability) architectures expose physiologically meaningful primitives (e.g., band-pass/sinc filters, variance or connectivity features), enabling direct neuroscientific inspection; see SincShallowNet and EEGMiner (17, 18).

  • SPD / Riemannian (SPD) methods operate on covariance (or connectivity) matrices as points on the SPD manifold, combining layers such as BiMap, ReEig, and LogEig; deep SPD networks and Riemannian classifiers motivate this family (19). (Coming soon in a dedicated repository.)

  • Large-model / Transformer (Large Language Model) approaches pretrain attention-based encoders on diverse biosignals and fine-tune for EEG tasks; e.g., BIOT (20). These typically need heavy self-supervised pre-training before decoding.

  • Graph neural networks (Graph Neural Network) treat channels/regions as nodes with learned (static or dynamic) edges to model functional connectivity explicitly; a representative example is EEG-GNN, and this family is more common in epilepsy decoding (21).

  • Channel-domain robustness (Channel) techniques target variability in electrode layouts by learning montage-agnostic or channel-selective layers (e.g., dynamic spatial filtering, differentiable channel re-ordering); these strategies improve cross-setup generalization, as in SignalJEPA (22, 23).
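The remark in the list above about dilations growing the receptive field of TCN-style models can be made concrete with a short calculation. The sketch below assumes a plain stack of stride-1 temporal convolutions with one convolution per layer. The function name and the example kernel size and dilation schedule are illustrative assumptions, not the configuration of any specific Braindecode model.

```python
from typing import Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input samples seen by one output sample of a stack of
    stride-1 convolutions, one layer per entry in `dilations`.

    Each layer widens the receptive field by (kernel_size - 1) * dilation.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Four layers with kernel size 3 (assumed values).
rf_plain = receptive_field(kernel_size=3, dilations=[1, 1, 1, 1])
rf_dilated = receptive_field(kernel_size=3, dilations=[1, 2, 4, 8])
```

With the same depth and the same number of weights per layer, doubling the dilation at each layer extends the receptive field from 9 to 31 samples, which is how such models cover long temporal context without recurrent units.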

We are continually expanding this collection and welcome contributions! If you have implemented a model relevant to EEG, ECoG, or MEG analysis, consider adding it to Braindecode.

Submit a new model

Want to contribute a new model to Braindecode? Great! You can propose a new model by opening an issue (please include a link to the relevant publication or description) or, even better, directly submit your implementation via a pull request. We appreciate your contributions to expanding the library!
