Basic Brain Decoding on EEG Data

This tutorial shows you how to train and test deep learning models with Braindecode in a classical EEG setting: you have trials of data with labels (e.g., Right Hand, Left Hand, etc.).

Loading and preparing the data

Loading the dataset

First, we load the data. In this tutorial, we load the BCI Competition IV 2a data [1] using Braindecode’s wrapper around the MOABB library [2].

Note

To load your own datasets either via mne or from preprocessed X/y numpy arrays, see MNE Dataset Example and Custom Dataset Example.

from braindecode.datasets import MOABBDataset

subject_id = 3
dataset = MOABBDataset(dataset_name="BNCI2014_001", subject_ids=[subject_id])

Now we apply preprocessing like bandpass filtering to our dataset. You can either apply functions provided by mne.io.Raw or mne.Epochs or apply your own functions, either to the MNE object or the underlying numpy array.

Note

Generally, Braindecode preprocessing is applied directly to the loaded data rather than on the fly as transformations, as is done in PyTorch libraries such as torchvision.

from numpy import multiply

from braindecode.preprocessing import (
    Preprocessor,
    exponential_moving_standardize,
    preprocess,
)

low_cut_hz = 4.0  # low cut frequency for filtering
high_cut_hz = 38.0  # high cut frequency for filtering
# Parameters for exponential moving standardization
factor_new = 1e-3
init_block_size = 1000
# Factor to convert from V to uV
factor = 1e6

preprocessors = [
    Preprocessor("pick_types", eeg=True, meg=False, stim=False),  # Keep EEG sensors
    Preprocessor(lambda data: multiply(data, factor)),  # Convert from V to uV
    Preprocessor("filter", l_freq=low_cut_hz, h_freq=high_cut_hz),  # Bandpass filter
    Preprocessor(
        exponential_moving_standardize,  # Exponential moving standardization
        factor_new=factor_new,
        init_block_size=init_block_size,
    ),
]

# Transform the data
preprocess(dataset, preprocessors, n_jobs=-1)

/home/runner/work/braindecode/braindecode/braindecode/preprocessing/preprocess.py:78: UserWarning: apply_on_array can only be True if fn is a callable function. Automatically correcting to apply_on_array=False.
  warn(
/home/runner/work/braindecode/braindecode/braindecode/preprocessing/preprocess.py:76: UserWarning: Preprocessing choices with lambda functions cannot be saved.
  warn("Preprocessing choices with lambda functions cannot be saved.")

BaseConcatDataset
Type	BaseConcatDataset of RawDataset
Recordings	12
Total samples	1160820
Sfreq*	250.0 Hz
Channels*	22 (22 EEG)
Ch. names*	Fz, FC3, FC1, FCz, FC2, FC4, C5, C3, C1, Cz, ... (+12 more)
Montage*	head
Duration*	386.9 s
* from first recording
Description	12 recordings × 3 columns [subject, session, run]
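
The exponential moving standardization step in the preprocessing pipeline above can be illustrated with a minimal pure-Python sketch for a single channel. This is a simplified recursive form written for illustration, not Braindecode's implementation: the function name exponential_moving_standardize_1d, the eps floor, and the way the running mean is initialized are assumptions, and the library version may differ in details such as how the first init_block_size samples are handled.

```python
import math


def exponential_moving_standardize_1d(signal, factor_new=1e-3, eps=1e-4):
    """Standardize one channel with exponentially weighted mean and variance.

    Simplified recursive sketch (an assumption, not braindecode's exact code):
        mean_t = factor_new * x_t + (1 - factor_new) * mean_{t-1}
        var_t  = factor_new * (x_t - mean_t)**2 + (1 - factor_new) * var_{t-1}
        out_t  = (x_t - mean_t) / max(sqrt(var_t), eps)
    """
    mean = signal[0]  # start the running mean at the first sample
    var = 0.0
    standardized = []
    for x_t in signal:
        mean = factor_new * x_t + (1 - factor_new) * mean
        var = factor_new * (x_t - mean) ** 2 + (1 - factor_new) * var
        # eps keeps the division stable while the variance is still tiny
        standardized.append((x_t - mean) / max(math.sqrt(var), eps))
    return standardized


# A constant signal never deviates from its running mean, so the output is
# numerically zero everywhere.
flat = exponential_moving_standardize_1d([5.0] * 10)

# A slow drift plus an oscillation: the running mean follows the drift.
drifting = exponential_moving_standardize_1d(
    [0.01 * t + math.sin(t / 5.0) for t in range(2000)], factor_new=1e-2
)
```

A larger factor_new makes the running statistics adapt faster to recent samples; a smaller one averages over a longer history.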

Extracting Compute Windows

Now we extract compute windows from the signals; these will be the inputs to the deep networks during training. In the case of trialwise decoding, we just have to decide whether to include some part before and/or after the trial. For our work with this dataset, it was often beneficial to also include the 500 ms before the trial.

from braindecode.preprocessing import create_windows_from_events

trial_start_offset_seconds = -0.5
# Extract sampling frequency, check that they are same in all datasets
sfreq = dataset.datasets[0].raw.info["sfreq"]
assert all([ds.raw.info["sfreq"] == sfreq for ds in dataset.datasets])
# Calculate the trial start offset in samples.
trial_start_offset_samples = int(trial_start_offset_seconds * sfreq)

# Create windows using braindecode function for this. It needs parameters to define how
# trials should be used.
windows_dataset = create_windows_from_events(
    dataset,
    trial_start_offset_samples=trial_start_offset_samples,
    trial_stop_offset_samples=0,
    preload=True,
)
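
As a rough check of the offset arithmetic, the sketch below computes the sample range of one window by hand. The trial length of 1000 samples (4 s at 250 Hz) is an assumption inferred from the n_times of 1125 that infer_signal_properties reports further down; the helper window_bounds and the onset value are hypothetical and only serve the illustration.

```python
sfreq = 250.0  # Hz, as reported for this dataset above
trial_start_offset_seconds = -0.5
trial_stop_offset_samples = 0
trial_len_samples = 1000  # assumption: 4 s trials, inferred from n_times = 1125

# Same conversion as in the tutorial code: seconds -> samples
trial_start_offset_samples = int(trial_start_offset_seconds * sfreq)


def window_bounds(onset_sample):
    """Return (start, stop) sample indices of the window for one trial."""
    start = onset_sample + trial_start_offset_samples
    stop = onset_sample + trial_len_samples + trial_stop_offset_samples
    return start, stop


start, stop = window_bounds(onset_sample=5000)  # hypothetical trial onset
window_len = stop - start
print(trial_start_offset_samples, start, stop, window_len)
# prints: -125 4875 6000 1125
```

The negative start offset moves the window start 125 samples before the trial onset, which is where the extra 500 ms in each window comes from.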

Splitting the dataset into training and validation sets

We can easily split the dataset using additional information stored in the description attribute, in this case the session column. We select 0train for training and 1test for validation.

splitted = windows_dataset.split("session")
train_set = splitted["0train"]  # Session train
valid_set = splitted["1test"]  # Session evaluation
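
Conceptually, splitting by a description column amounts to grouping recordings by the value in that column. The following sketch shows the idea on a hand-built table; the helper split_by, the run labels, and the assumption of six recordings per session (12 recordings over two sessions) are hypothetical and do not reflect how Braindecode implements split().

```python
from collections import defaultdict

# Hypothetical stand-in for the description table: one row per recording.
description = [
    {"subject": 3, "session": "0train", "run": str(run)} for run in range(6)
] + [
    {"subject": 3, "session": "1test", "run": str(run)} for run in range(6)
]


def split_by(rows, column):
    """Group recording indices by the value found in `column`."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        groups[row[column]].append(index)
    return dict(groups)


groups = split_by(description, "session")
train_indices = groups["0train"]  # recordings used for training
valid_indices = groups["1test"]  # recordings used for validation
print(train_indices, valid_indices)
```

Splitting along sessions rather than at random keeps all windows of one recording session on the same side of the split.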

Creating a model

Now we create the deep learning model! First, we need to know the properties of our signals. For this, we use the braindecode.datautil.infer_signal_properties() function:

from braindecode.datautil import infer_signal_properties

sig_props = infer_signal_properties(train_set, mode="classification")
print(sig_props)

{'n_times': 1125, 'n_chans': 22, 'n_outputs': 4}

Braindecode comes with some predefined convolutional neural network architectures for raw time-domain EEG. Here, we use the ShallowFBCSPNet model from [3]. These models are pure PyTorch models, so if you want to use your own model, it just has to be a normal PyTorch torch.nn.Module.

import torch

from braindecode.models import ShallowFBCSPNet
from braindecode.util import set_random_seeds

cuda = torch.cuda.is_available()  # use the GPU if one is available
device = "cuda" if cuda else "cpu"
if cuda:
    torch.backends.cudnn.benchmark = True
# Set random seed to be able to roughly reproduce results
# Note that with cudnn benchmark set to True, GPU indeterminism
# may still make results substantially different between runs.
# To obtain more consistent results at the cost of increased computation time,
# you can set `cudnn_benchmark=False` in `set_random_seeds`
# or remove `torch.backends.cudnn.benchmark = True`
seed = 20200220
set_random_seeds(seed=seed, cuda=cuda)

model = ShallowFBCSPNet(
    n_chans=sig_props["n_chans"],
    n_outputs=sig_props["n_outputs"],
    n_times=sig_props["n_times"],
    final_conv_length="auto",
)

# Display torchinfo table describing the model
print(model)

# Send model to GPU
if cuda:
    model = model.cuda()

=================================================================================================================================================
Layer (type (var_name):depth-idx)             Input Shape               Output Shape              Param #                   Kernel Shape
=================================================================================================================================================
ShallowFBCSPNet (ShallowFBCSPNet)             [1, 22, 1125]             [1, 4]                    --                        --
├─Ensure4d (ensuredims): 1-1                  [1, 22, 1125]             [1, 22, 1125, 1]          --                        --
├─Rearrange (dimshuffle): 1-2                 [1, 22, 1125, 1]          [1, 1, 1125, 22]          --                        --
├─CombinedConv (conv_time_spat): 1-3          [1, 1, 1125, 22]          [1, 40, 1101, 1]          36,240                    --
├─BatchNorm2d (bnorm): 1-4                    [1, 40, 1101, 1]          [1, 40, 1101, 1]          80                        --
├─Square (conv_nonlin_exp): 1-5               [1, 40, 1101, 1]          [1, 40, 1101, 1]          --                        --
├─AvgPool2d (pool): 1-6                       [1, 40, 1101, 1]          [1, 40, 69, 1]            --                        [75, 1]
├─SafeLog (pool_nonlin_exp): 1-7              [1, 40, 69, 1]            [1, 40, 69, 1]            --                        --
├─Dropout (drop): 1-8                         [1, 40, 69, 1]            [1, 40, 69, 1]            --                        --
├─Sequential (final_layer): 1-9               [1, 40, 69, 1]            [1, 4]                    --                        --
│    └─Conv2d (conv_classifier): 2-1          [1, 40, 69, 1]            [1, 4, 1, 1]              11,044                    [69, 1]
│    └─SqueezeFinalOutput (squeeze): 2-2      [1, 4, 1, 1]              [1, 4]                    --                        --
│    │    └─Rearrange (squeeze): 3-1          [1, 4, 1, 1]              [1, 4, 1]                 --                        --
=================================================================================================================================================
Total params: 47,364
Trainable params: 47,364
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 0.01
=================================================================================================================================================
Input size (MB): 0.10
Forward/backward pass size (MB): 0.35
Params size (MB): 0.04
Estimated Total Size (MB): 0.50
=================================================================================================================================================
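
The shapes and parameter counts in the summary above can be checked by hand. The sketch below does so under a few assumptions that are not stated in the table itself: a temporal kernel of 25 samples (inferred from 1125 -> 1101), a pooling stride of 15 (consistent with 1101 -> 69), and a spatial convolution without bias (consistent with the reported 36,240 parameters).

```python
def conv_out_len(n_in, kernel, stride=1):
    """Output length of an unpadded 1-D convolution or pooling layer."""
    return (n_in - kernel) // stride + 1


n_chans, n_times, n_outputs = 22, 1125, 4  # from sig_props
n_filters = 40  # number of feature maps in the summary
filter_time_length = 25  # assumption: inferred from 1125 -> 1101
pool_time_length = 75  # pooling kernel shape reported in the summary
pool_time_stride = 15  # assumption: consistent with 1101 -> 69

after_conv = conv_out_len(n_times, filter_time_length)
after_pool = conv_out_len(after_conv, pool_time_length, pool_time_stride)

# Parameter counts, assuming the spatial convolution carries no bias.
temporal_params = n_filters * filter_time_length + n_filters  # weights + bias
spatial_params = n_filters * n_filters * n_chans  # weights only
batchnorm_params = 2 * n_filters  # scale and shift per feature map
classifier_params = n_filters * after_pool * n_outputs + n_outputs
total = temporal_params + spatial_params + batchnorm_params + classifier_params

print(after_conv, after_pool)  # prints: 1101 69
print(temporal_params + spatial_params, classifier_params, total)
# prints: 36240 11044 47364
```

Because the classifier kernel spans all 69 remaining time steps, the number of classifier weights depends on n_times; changing the window length changes the model size.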

Model Training

Now we will train the network! EEGClassifier is a Braindecode object responsible for managing the training of neural networks. It inherits from skorch.classifier.NeuralNetClassifier, so the training logic is the same as in skorch.

Note

In this tutorial, we use some default parameters that we have found to work well for motor decoding. However, we strongly encourage you to perform your own hyperparameter optimization using cross-validation on your training data.

from skorch.callbacks import EarlyStopping, LRScheduler
from skorch.helper import predefined_split

from braindecode import EEGClassifier

# We found these values to be good for the shallow network:
lr = 0.0625 * 0.01
weight_decay = 0

# For deep4 they should be:
# lr = 1 * 0.01
# weight_decay = 0.5 * 0.001

batch_size = 64
n_epochs = 4
classes = list(range(sig_props["n_outputs"]))

clf = EEGClassifier(
    model,
    criterion=torch.nn.CrossEntropyLoss,
    optimizer=torch.optim.AdamW,
    train_split=predefined_split(valid_set),  # using valid_set for validation
    optimizer__lr=lr,
    optimizer__weight_decay=weight_decay,
    batch_size=batch_size,
    callbacks=[
        "accuracy",
        ("lr_scheduler", LRScheduler("CosineAnnealingLR", T_max=max(1, n_epochs - 1))),
        ("early_stopping", EarlyStopping(patience=10, load_best=True)),
    ],
    device=device,
    classes=classes,
)
# Model training for the specified number of epochs. ``y`` is ``None`` as it is
# already supplied in the dataset.
clf.fit(train_set, y=None, epochs=n_epochs)

  epoch    train_accuracy    train_loss    valid_acc    valid_accuracy    valid_loss      lr     dur
-------  ----------------  ------------  -----------  ----------------  ------------  ------  ------
      1            0.2500        1.6341       0.2500            0.2500        5.8027  0.0006  1.9045
      2            0.2500        1.2511       0.2500            0.2500        6.6946  0.0005  1.8016
      3            0.2500        1.1439       0.2500            0.2500        6.0062  0.0002  1.8003
      4            0.2604        1.0877       0.2535            0.2535        4.9447  0.0000  1.7881
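
The lr column in the log above follows the cosine annealing schedule. As a minimal sketch, the closed-form expression below reproduces those values up to rounding, assuming a minimum learning rate of 0 and one scheduler step per epoch; cosine_annealing_lr is a hypothetical helper written for this illustration, not part of skorch or PyTorch.

```python
import math


def cosine_annealing_lr(base_lr, epoch, t_max, eta_min=0.0):
    """Closed-form cosine annealing: learning rate after `epoch` steps."""
    cosine = math.cos(math.pi * epoch / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + cosine)


lr = 0.0625 * 0.01
n_epochs = 4
t_max = max(1, n_epochs - 1)  # same T_max as passed to LRScheduler

# Epoch indices start at 0 here, so the first epoch uses the base rate.
schedule = [cosine_annealing_lr(lr, epoch, t_max) for epoch in range(n_epochs)]
for epoch, value in enumerate(schedule, start=1):
    print(f"epoch {epoch}: lr = {value:.4f}")
```

With T_max = n_epochs - 1, the learning rate reaches zero exactly in the last epoch, which is why the final row of the log shows 0.0000.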

Training for longer

The gallery build above uses only n_epochs = 4. When trained offline for up to 100 epochs with early stopping, the model reaches 68.4 % accuracy on the held-out session (chance = 25 %).

We can load the pretrained checkpoint from the Hugging Face Hub and inspect the full training curves. If the optional huggingface_hub dependency is missing or the download fails, we continue with the locally trained short-run model.

import warnings

repo_id = "braindecode/plot_bcic_iv_2a_moabb_trial"
try:
    from huggingface_hub import hf_hub_download

    clf.initialize()
    clf.load_params(
        f_params=hf_hub_download(repo_id, "params.safetensors"),
        f_history=hf_hub_download(repo_id, "history.json"),
        use_safetensors=True,
    )
except Exception as exc:
    warnings.warn(
        f"Could not load pretrained checkpoint from {repo_id} ({exc}); "
        "continuing with the locally trained short-run model.",
        stacklevel=2,
    )

Re-initializing module.
Re-initializing criterion.
Re-initializing optimizer.

Plot training curves

import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.lines import Line2D

# Extract loss and accuracy values for plotting from history object
results_columns = ["train_loss", "valid_loss", "train_accuracy", "valid_accuracy"]
df = pd.DataFrame(
    clf.history[:, results_columns],
    columns=results_columns,
    index=clf.history[:, "epoch"],
)

# get percent of misclass for better visual comparison to loss
df = df.assign(
    train_misclass=100 - 100 * df.train_accuracy,
    valid_misclass=100 - 100 * df.valid_accuracy,
)

fig, ax1 = plt.subplots(figsize=(8, 3))
df.loc[:, ["train_loss", "valid_loss"]].plot(
    ax=ax1, style=["-", ":"], marker="o", color="tab:blue", legend=False, fontsize=14
)

ax1.tick_params(axis="y", labelcolor="tab:blue", labelsize=14)
ax1.set_ylabel("Loss", color="tab:blue", fontsize=14)

ax2 = ax1.twinx()  # instantiate a second axes that shares the same x-axis

df.loc[:, ["train_misclass", "valid_misclass"]].plot(
    ax=ax2, style=["-", ":"], marker="o", color="tab:red", legend=False
)
ax2.tick_params(axis="y", labelcolor="tab:red", labelsize=14)
ax2.set_ylabel("Misclassification Rate [%]", color="tab:red", fontsize=14)
ax2.set_ylim(ax2.get_ylim()[0], 85)  # make some room for legend
ax1.set_xlabel("Epoch", fontsize=14)

# Build a shared legend by hand: line style distinguishes train from valid
handles = []
handles.append(
    Line2D([0], [0], color="black", linewidth=1, linestyle="-", label="Train")
)
handles.append(
    Line2D([0], [0], color="black", linewidth=1, linestyle=":", label="Valid")
)
plt.legend(handles, [h.get_label() for h in handles], fontsize=14)
plt.tight_layout()

Plotting a Confusion Matrix

Here we generate a confusion matrix as in [3].

from sklearn.metrics import ConfusionMatrixDisplay

y_true = valid_set.get_metadata().target
y_pred = clf.predict(valid_set)

label_dict = windows_dataset.datasets[0].window_kwargs[0][1]["mapping"]
sorted_items = sorted(label_dict.items(), key=lambda kv: kv[1])
labels = [k for k, _ in sorted_items]
class_ids = [v for _, v in sorted_items]

ConfusionMatrixDisplay.from_predictions(
    y_true, y_pred, labels=class_ids, display_labels=labels
)

<sklearn.metrics._plot.confusion_matrix.ConfusionMatrixDisplay object at 0x7f607b212a80>
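
To make explicit what the confusion matrix counts, here is a small pure-Python sketch that builds one from label lists. The labels are made up for illustration and are not results from this tutorial; the helper confusion_matrix is hypothetical and follows the convention of rows for true classes and columns for predicted classes.

```python
def confusion_matrix(y_true, y_pred, class_ids):
    """Count predictions: rows are true classes, columns are predicted classes."""
    index = {class_id: i for i, class_id in enumerate(class_ids)}
    matrix = [[0] * len(class_ids) for _ in class_ids]
    for true, pred in zip(y_true, y_pred):
        matrix[index[true]][index[pred]] += 1
    return matrix


# Hypothetical labels for illustration only, not results from this tutorial.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 0, 3, 3]

matrix = confusion_matrix(y_true, y_pred, class_ids=[0, 1, 2, 3])
# Correct predictions sit on the diagonal, so accuracy is its sum over all trials.
accuracy = sum(matrix[i][i] for i in range(4)) / len(y_true)
print(matrix)
print(accuracy)  # prints: 0.75
```

Off-diagonal entries show which pairs of classes are confused with each other, which a single accuracy number hides.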

References
