Load and save dataset example

In this example, we show how to load and save braindecode datasets.

# Authors: Lukas Gemein <l.gemein@gmail.com>
#
# License: BSD (3-clause)

import tempfile

from braindecode.datasets import MOABBDataset
from braindecode.datautil import load_concat_dataset
from braindecode.preprocessing import (
    Preprocessor, create_windows_from_events, preprocess)

First, we load some dataset using MOABB.

dataset = MOABBDataset(
    dataset_name='BNCI2014001',
    subject_ids=[1],
)
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]

We can apply preprocessing steps to the dataset. It is also possible to skip this step and not apply any preprocessing.

preprocess(
    concat_ds=dataset,
    preprocessors=[Preprocessor(fn='resample', sfreq=10)]
)
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]

<braindecode.datasets.moabb.MOABBDataset object at 0x7f50b21bca50>

We save the dataset to an existing directory. Saving creates a ‘.fif’ file for every dataset in the concat dataset. Additionally, it creates two JSON files: the first holds the description of the dataset, the second holds the name of the target. If you want to store to the same directory several times, for example because you are trying different preprocessing, you can choose to overwrite the existing files.

tmpdir = tempfile.mkdtemp()  # write in a temporary directory
dataset.save(
    path=tmpdir,
    overwrite=False,
)
Writing /tmp/tmp_73cii3t/0/0-raw.fif
Closing /tmp/tmp_73cii3t/0/0-raw.fif
[done]
Writing /tmp/tmp_73cii3t/1/1-raw.fif
Closing /tmp/tmp_73cii3t/1/1-raw.fif
[done]
Writing /tmp/tmp_73cii3t/2/2-raw.fif
Closing /tmp/tmp_73cii3t/2/2-raw.fif
[done]
Writing /tmp/tmp_73cii3t/3/3-raw.fif
Closing /tmp/tmp_73cii3t/3/3-raw.fif
[done]
Writing /tmp/tmp_73cii3t/4/4-raw.fif
Closing /tmp/tmp_73cii3t/4/4-raw.fif
[done]
Writing /tmp/tmp_73cii3t/5/5-raw.fif
Closing /tmp/tmp_73cii3t/5/5-raw.fif
[done]
Writing /tmp/tmp_73cii3t/6/6-raw.fif
Closing /tmp/tmp_73cii3t/6/6-raw.fif
[done]
Writing /tmp/tmp_73cii3t/7/7-raw.fif
Closing /tmp/tmp_73cii3t/7/7-raw.fif
[done]
Writing /tmp/tmp_73cii3t/8/8-raw.fif
Closing /tmp/tmp_73cii3t/8/8-raw.fif
[done]
Writing /tmp/tmp_73cii3t/9/9-raw.fif
Closing /tmp/tmp_73cii3t/9/9-raw.fif
[done]
Writing /tmp/tmp_73cii3t/10/10-raw.fif
Closing /tmp/tmp_73cii3t/10/10-raw.fif
[done]
Writing /tmp/tmp_73cii3t/11/11-raw.fif
Closing /tmp/tmp_73cii3t/11/11-raw.fif
[done]
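
The resulting directory layout can be inspected with the standard library alone. The sketch below is not part of braindecode: `list_saved_files` is a hypothetical helper, and the demo directory only imitates the `<i>/<i>-raw.fif` paths shown in the log above, using empty placeholder files instead of real recordings.

```python
import tempfile
from pathlib import Path


def list_saved_files(path):
    """Return all files below ``path`` as sorted, POSIX-style relative paths."""
    root = Path(path)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob('*') if p.is_file()
    )


# Stand-in layout for illustration only: two empty placeholder files
demo_dir = Path(tempfile.mkdtemp())
for i in range(2):
    (demo_dir / str(i)).mkdir()
    (demo_dir / str(i) / f'{i}-raw.fif').touch()

print(list_saved_files(demo_dir))  # ['0/0-raw.fif', '1/1-raw.fif']
```

Pointing the helper at the real `tmpdir` instead of `demo_dir` would list whatever files the save call actually wrote there.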

We load the saved dataset from a directory. Signals can be preloaded in compliance with mne. Optionally, only specific ‘.fif’ files can be loaded by specifying their ids. The target name can be changed if the dataset supports it. TUHAbnormal, for example, supports ‘pathological’, ‘age’, and ‘gender’, so if you stored a preprocessed version with target ‘pathological’, you can change the target upon loading.

dataset_loaded = load_concat_dataset(
    path=tmpdir,
    preload=True,
    ids_to_load=[1, 3],
    target_name=None,
)
Opening raw data file /tmp/tmp_73cii3t/1/1-raw.fif...
    Range : 0 ... 3868 =      0.000 ...   386.800 secs
Ready.
Reading 0 ... 3868  =      0.000 ...   386.800 secs...
Opening raw data file /tmp/tmp_73cii3t/3/3-raw.fif...
    Range : 0 ... 3868 =      0.000 ...   386.800 secs
Ready.
Reading 0 ... 3868  =      0.000 ...   386.800 secs...
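
In the log above, `ids_to_load=[1, 3]` opened `1/1-raw.fif` and `3/3-raw.fif`, which suggests that the ids correspond to the integer-named subdirectories. Under that assumption, a minimal sketch for finding out which ids a directory offers could look as follows. `available_ids` is a hypothetical helper, not a braindecode function, and the demo directory contains only empty stand-in folders.

```python
import tempfile
from pathlib import Path


def available_ids(path):
    """Return the integer-named subdirectories of ``path`` in numeric order."""
    return sorted(
        int(p.name) for p in Path(path).iterdir()
        if p.is_dir() and p.name.isdigit()
    )


# Stand-in directory for illustration only (no real recordings)
demo_dir = Path(tempfile.mkdtemp())
for i in (0, 1, 2, 10, 11):
    (demo_dir / str(i)).mkdir()

print(available_ids(demo_dir))  # [0, 1, 2, 10, 11]
```

Converting the names to integers before sorting keeps 2 ahead of 10, which a plain string sort would not.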

The serialization utility also supports WindowsDatasets, so we create windows next.

windows_dataset = create_windows_from_events(
    concat_ds=dataset_loaded,
    trial_start_offset_samples=0,
    trial_stop_offset_samples=0,
)

windows_dataset.description
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Using data from preloaded Raw for 48 events and 40 original time points ...
0 bad epochs dropped
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Using data from preloaded Raw for 48 events and 40 original time points ...
0 bad epochs dropped
   subject    session    run
0        1  session_T  run_1
1        1  session_T  run_3


Again, we save the dataset to an existing directory. It will create a ‘-epo.fif’ file for every dataset in the concat dataset. Additionally it will create a JSON file holding the description of the dataset. If you want to store to the same directory several times, for example due to trying different windowing parameters, you can choose to overwrite the existing files.

windows_dataset.save(
    path=tmpdir,
    overwrite=True,
)
Using data from preloaded Raw for 1 events and 40 original time points ...
Using data from preloaded Raw for 48 events and 40 original time points ...
Using data from preloaded Raw for 1 events and 40 original time points ...
Using data from preloaded Raw for 48 events and 40 original time points ...
/home/runner/work/braindecode/braindecode/braindecode/datasets/base.py:570: UserWarning: The number of saved datasets (2) does not match the number of existing subdirectories (12). You may now encounter a mix of differently preprocessed datasets!
  f"datasets!", UserWarning)
/home/runner/work/braindecode/braindecode/braindecode/datasets/base.py:574: UserWarning: Chosen directory /tmp/tmp_73cii3t contains other subdirectories or files ['5', '11', '3', '7', '2', '10', '4', '9', '8', '6'].
  warnings.warn(f'Chosen directory {path} contains other '
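
The warnings above report that subdirectories from the earlier save remain next to the two newly written datasets. A minimal sketch, assuming the leftovers are simply the subdirectories whose names are not among the ids just saved, shows how they could be identified. `stale_subdirs` is a hypothetical helper, not part of braindecode, and the demo directory only mimics the twelve numbered folders.

```python
import tempfile
from pathlib import Path


def stale_subdirs(path, saved_ids):
    """Return names of subdirectories of ``path`` that are not in ``saved_ids``."""
    saved = {str(i) for i in saved_ids}
    names = [p.name for p in Path(path).iterdir() if p.is_dir()]
    # Shorter names first, so that '9' sorts before '10'
    return sorted((n for n in names if n not in saved), key=lambda n: (len(n), n))


# Stand-in directory for illustration only: twelve empty numbered folders
demo_dir = Path(tempfile.mkdtemp())
for i in range(12):
    (demo_dir / str(i)).mkdir()

print(stale_subdirs(demo_dir, saved_ids=[0, 1]))
# ['2', '3', '4', '5', '6', '7', '8', '9', '10', '11']
```

Whether such leftovers should be deleted or kept depends on the workflow; the sketch only lists them.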

Load the saved dataset from a directory. Signals can be preloaded in compliance with mne. Optionally, only specific ‘-epo.fif’ files can be loaded by specifying their ids.

windows_dataset_loaded = load_concat_dataset(
    path=tmpdir,
    preload=False,
    ids_to_load=[0],
    target_name=None,
)

windows_dataset_loaded.description
Reading /tmp/tmp_73cii3t/0/0-epo.fif ...
    Found the data of interest:
        t =       0.00 ...    3900.00 ms
        0 CTF compensation matrices available
Adding metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
   subject    session    run
0        1  session_T  run_1
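
The time values in the logs follow directly from the sampling rate of 10 Hz chosen in the resampling step. As a small worked check, assuming evenly spaced samples that start at t = 0, the sketch below reproduces the reported spans. `time_span_s` is a hypothetical helper written for this illustration.

```python
def time_span_s(n_samples, sfreq):
    """Return (first, last) sample time in seconds for samples starting at t=0."""
    return 0.0, (n_samples - 1) / sfreq


sfreq = 10  # Hz, the rate passed to the 'resample' preprocessor

# A window of 40 time points, as reported when the windows were created
print(time_span_s(40, sfreq))    # (0.0, 3.9)

# A recording whose sample indices run from 0 to 3868, i.e. 3869 samples
print(time_span_s(3869, sfreq))  # (0.0, 386.8)
```

The first result matches the 0.00 to 3900.00 ms range of the loaded epochs, the second the 0.000 to 386.800 secs range of the loaded raw files.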


