class braindecode.models.EEGConformer(n_outputs=None, n_chans=None, n_filters_time=40, filter_time_length=25, pool_time_length=75, pool_time_stride=15, drop_prob=0.5, att_depth=6, att_heads=10, att_drop_prob=0.5, final_fc_length=2440, return_features=False, n_times=None, chs_info=None, input_window_seconds=None, sfreq=None, n_classes=None, n_channels=None, input_window_samples=None, add_log_softmax=True)

EEG Conformer.

Convolutional Transformer for EEG decoding.

The paper and the original code, which give more detail on the methodological choices, are available at [Song2022] and [ConformerCode].

This neural network architecture takes the standard braindecode input: a three-dimensional tensor of EEG signals with shape (batch_size, n_channels, n_timesteps).

The EEG Conformer architecture is composed of three modules:
  • PatchEmbedding

  • TransformerEncoder

  • ClassificationHead

Parameters:

  • n_outputs (int) – Number of outputs of the model. This is the number of classes in the case of classification.

  • n_chans (int) – Number of EEG channels.

  • n_filters_time (int) – Number of temporal filters; this also defines the embedding size.

  • filter_time_length (int) – Length of the temporal filter.

  • pool_time_length (int) – Length of temporal pooling filter.

  • pool_time_stride (int) – Stride between successive temporal pooling windows.

  • drop_prob (float) – Dropout rate of the convolutional layer.

  • att_depth (int) – Number of self-attention layers.

  • att_heads (int) – Number of attention heads.

  • att_drop_prob (float) – Dropout rate of the self-attention layer.

  • final_fc_length (int | str) – The dimension of the fully connected layer.

  • return_features (bool) – If True, the forward method returns the features before the last classification layer. Defaults to False.

  • n_times (int) – Number of time samples of the input window.

  • chs_info (list of dict) – Information about each individual EEG channel. This should be filled with info["chs"]. Refer to mne.Info for more details.

  • input_window_seconds (float) – Length of the input window in seconds.

  • sfreq (float) – Sampling frequency of the EEG recordings.

  • n_classes – Alias for n_outputs.

  • n_channels – Alias for n_chans.

  • input_window_samples – Alias for n_times.

  • add_log_softmax (bool) – Whether to use a log-softmax non-linearity as the output function. The LogSoftmax final layer will be removed in the future, so please adjust your loss function accordingly (e.g. CrossEntropyLoss). See the documentation of the torch.nn loss functions.

Raises:

  • ValueError – If some input signal-related parameters are not specified and cannot be inferred.

  • FutureWarning – If add_log_softmax is True, since the LogSoftmax final layer will be removed in the future.
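The note on add_log_softmax rests on a standard identity: a negative log-likelihood loss applied to log-softmax outputs equals a cross-entropy loss applied to raw logits. In PyTorch this is the pairing of NLLLoss with log-probabilities versus CrossEntropyLoss with logits. The following is a minimal pure-Python sketch of that identity, not braindecode code; the helper names log_softmax, nll_loss, and cross_entropy are hypothetical and written here only for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_norm for z in logits]


def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class, given log-probabilities."""
    return -log_probs[target]


def cross_entropy(logits, target):
    """Cross-entropy computed directly from raw logits."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[target]


# Hypothetical scores for one trial with four classes.
logits = [2.0, -1.0, 0.5, 0.0]
target = 2

# Model ends in log-softmax: pair it with an NLL-style loss.
loss_with_log_softmax = nll_loss(log_softmax(logits), target)

# Model outputs raw logits: pair it with a cross-entropy-style loss.
loss_with_logits = cross_entropy(logits, target)

print(abs(loss_with_log_softmax - loss_with_logits) < 1e-12)
```

In practice this means the loss should be switched together with add_log_softmax: applying a cross-entropy loss to outputs that have already passed through log-softmax would apply the normalization twice.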


The authors recommend applying data augmentation, e.g. segmentation and recombination, before using Conformer. Please refer to the original paper and code for more details.

The model was initially tuned on 4 seconds of 250 Hz data. For data with a different window length or sampling rate, adjust the scale of the temporal convolutional layer and the pooling layer for better performance.
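To see how the window length and sampling rate interact with the temporal parameters, the sketch below estimates the flattened feature length that reaches the fully connected layer. It assumes an unpadded temporal convolution followed by unpadded pooling, using the usual output-length formula floor((n - kernel) / stride) + 1; this assumption is not stated in this page, and the helper names conv_out_len and conformer_flat_length are hypothetical. With the default parameters and 4 seconds at 250 Hz the estimate equals the default final_fc_length of 2440, which is consistent with the assumption.

```python
def conv_out_len(n_in: int, kernel: int, stride: int = 1) -> int:
    """Output length of an unpadded 1-D convolution or pooling layer."""
    if n_in < kernel:
        raise ValueError("input is shorter than the kernel")
    return (n_in - kernel) // stride + 1


def conformer_flat_length(
    input_window_seconds: float,
    sfreq: float,
    n_filters_time: int = 40,
    filter_time_length: int = 25,
    pool_time_length: int = 75,
    pool_time_stride: int = 15,
) -> int:
    """Estimate of (number of time patches) * (embedding size).

    Assumes unpadded temporal convolution and pooling (an assumption of
    this sketch, not taken from the library source).
    """
    # Window length in samples, derived from duration and sampling rate.
    n_times = int(round(input_window_seconds * sfreq))
    after_conv = conv_out_len(n_times, filter_time_length)
    n_patches = conv_out_len(after_conv, pool_time_length, pool_time_stride)
    return n_patches * n_filters_time


# Setting the model was tuned on: 4 s at 250 Hz, default parameters.
print(conformer_flat_length(4.0, 250.0))  # 2440

# Same window length at 128 Hz with unchanged kernels: fewer patches.
print(conformer_flat_length(4.0, 128.0))  # 1120
```

Under these assumptions, changing the sampling rate without rescaling filter_time_length, pool_time_length, and pool_time_stride changes both the time span each kernel covers and the feature length expected by the final layer, so final_fc_length may need to be adjusted as well.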

New in version 0.8.

The parameters are grouped by the part of the model they belong to, or by where they are first used, e.g. n_filters_time.



[Song2022] Song, Y., Zheng, Q., Liu, B. and Gao, X., 2022. EEG conformer: Convolutional transformer for EEG decoding and visualization. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 31, pp.710-719.

[ConformerCode] Song, Y., Zheng, Q., Liu, B. and Gao, X., 2022. EEG conformer: Convolutional transformer for EEG decoding and visualization. eeyhsong/EEG-Conformer.


forward(x: Tensor) → Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.


Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.


x (Tensor) – Batch of EEG signals with shape (batch_size, n_channels, n_timesteps).