MOABB Dataset Example

In this example, we show how to fetch a MOABB dataset and prepare it for use with Braindecode.

# Authors: Lukas Gemein <l.gemein@gmail.com>
#          Hubert Banville <hubert.jbanville@gmail.com>
#          Simon Brandt <simonbrandt@protonmail.com>
#          Daniel Wilson <dan.c.wil@gmail.com>
#
# License: BSD (3-clause)

from braindecode.datasets import MOABBDataset
from braindecode.preprocessing import preprocess

First, we create a dataset based on BCI Competition IV 2a (named BNCI2014_001 in MOABB), fetched with MOABB:

dataset = MOABBDataset(dataset_name="BNCI2014_001", subject_ids=[1])

The dataset has a pandas DataFrame that describes each of its internal datasets, one row per recording:

dataset.description
    subject session run
0         1  0train   0
1         1  0train   1
2         1  0train   2
3         1  0train   3
4         1  0train   4
5         1  0train   5
6         1   1test   0
7         1   1test   1
8         1   1test   2
9         1   1test   3
10        1   1test   4
11        1   1test   5


Iterating over the dataset yields one time point of the continuous signal, x, together with a target, y. Here x has shape (n_channels, 1), and y is None because no targets are defined for the continuous signal as a whole.

for x, y in dataset:
    print(x.shape, y)
    break
(26, 1) None
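
To make this indexing behaviour concrete, here is a toy stand-in written in plain Python. It is not Braindecode's implementation: ToyContinuousDataset and its attributes are hypothetical names used only for illustration, and the signal values are made up. The sketch only mimics the idea that each index selects one time point across all channels and that the target is None.

```python
class ToyContinuousDataset:
    """Hypothetical stand-in for a continuous recording (not Braindecode code)."""

    def __init__(self, signal):
        # signal: list of channels, each a list of samples of equal length
        self.signal = signal

    def __len__(self):
        # One item per time point, so the length is the number of samples
        return len(self.signal[0])

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Keep a trailing time axis of length 1: shape (n_channels, 1)
        x = [[channel[index]] for channel in self.signal]
        y = None  # no target is defined for the continuous signal
        return x, y


# Made-up data: 26 channels, 5 time points
n_channels, n_times = 26, 5
signal = [[float(ch * 10 + t) for t in range(n_times)] for ch in range(n_channels)]
toy = ToyContinuousDataset(signal)

x, y = toy[0]
shape = (len(x), len(x[0]))
print(shape, y)
print(len(toy))
```

Under this model the length of a continuous dataset is its number of time points, which is why the subset lengths reported later in this example are counted in samples rather than in trials.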

We can apply preprocessing transforms that are defined in MNE and that modify the recordings in place. Braindecode provides dedicated preprocessing classes for common operations; for anything else, you can use the generic Preprocessor class.

# Using the new dedicated preprocessing classes (recommended):
from braindecode.preprocessing import Pick, Resample

preprocessors = [
    Pick(picks="eeg", exclude=()),  # Keep only EEG channels
    Resample(sfreq=100),  # Resample to 100 Hz
]

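# Alternative: using the generic Preprocessor class (legacy approach).
# Note that stim=True also keeps the stimulus channel, unlike Pick above.
# from braindecode.preprocessing import Preprocessor
# preprocessors = [
#     Preprocessor("pick_types", eeg=True, meg=False, stim=True),
#     Preprocessor("resample", sfreq=100),
# ]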

print(dataset.datasets[0].raw.info["sfreq"])
preprocess(dataset, preprocessors)
print(dataset.datasets[0].raw.info["sfreq"])
250.0
100.0
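
The idea of a generic, name-based preprocessor that changes recordings in place can be sketched in plain Python. This is a simplified illustration of the general pattern, not a description of Braindecode's or MNE's internals: ToyRaw, ToyPreprocessor and toy_preprocess are hypothetical names, and the toy resample only rescales a sample count instead of filtering a signal.

```python
class ToyRaw:
    """Hypothetical stand-in for a recording object (not mne.io.Raw)."""

    def __init__(self, sfreq, n_times):
        self.sfreq = sfreq
        self.n_times = n_times

    def resample(self, sfreq):
        # Scale the sample count by the ratio of new to old sampling rate
        self.n_times = round(self.n_times * sfreq / self.sfreq)
        self.sfreq = sfreq
        return self


class ToyPreprocessor:
    """Stores a method name plus keyword arguments to apply later."""

    def __init__(self, fn, **kwargs):
        self.fn = fn
        self.kwargs = kwargs

    def apply(self, raw):
        # Look the method up by name and call it; the object changes in place
        getattr(raw, self.fn)(**self.kwargs)


def toy_preprocess(raws, preprocessors):
    # Apply every preprocessor, in order, to every recording
    for raw in raws:
        for preprocessor in preprocessors:
            preprocessor.apply(raw)


raws = [ToyRaw(sfreq=250.0, n_times=1000) for _ in range(2)]
print(raws[0].sfreq, raws[0].n_times)
toy_preprocess(raws, [ToyPreprocessor("resample", sfreq=100.0)])
print(raws[0].sfreq, raws[0].n_times)
```

Because nothing is returned, the only way to observe the effect is to inspect the objects before and after the call, which is what the two print statements around preprocess do in the example above.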

We can easily split the dataset by the values of a column in the description DataFrame, here the session:

subsets = dataset.split("session")
print({subset_name: len(subset) for subset_name, subset in subsets.items()})
{'0train': 232164, '1test': 232164}
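
The reported lengths can be reproduced with a small plain-Python sketch of the grouping idea. It is an illustration only, not how Braindecode implements split. The description rows are rebuilt from the table shown earlier, and the per-run sample count is an assumption: it supposes that all six runs of a session are equally long, so that each holds 232164 / 6 = 38694 samples at 100 Hz.

```python
from collections import defaultdict

# Description rows as shown earlier: six runs in each of two sessions
description = [
    {"subject": 1, "session": session, "run": run}
    for session in ("0train", "1test")
    for run in range(6)
]

# Assumption: every run holds the same number of samples after resampling
samples_per_run = 232164 // 6
run_lengths = [samples_per_run] * len(description)

# Group run indices by the value in the "session" column
groups = defaultdict(list)
for index, row in enumerate(description):
    groups[row["session"]].append(index)

# A subset's length is the summed length of the runs it contains
subset_lengths = {
    name: sum(run_lengths[i] for i in indices) for name, indices in groups.items()
}
print(dict(groups))
print(subset_lengths)

# Duration of one run in seconds at the resampled rate of 100 Hz
sfreq = 100.0
print(samples_per_run / sfreq)
```

Under the equal-length assumption, each run lasts roughly 387 seconds, and each session subset is simply the concatenation of its six runs.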

See our Trialwise Decoding and Cropped Decoding examples for training with this dataset.
