Load and save dataset example

In this example, we show how to load and save braindecode datasets.

# Authors: Lukas Gemein <l.gemein@gmail.com>
#
# License: BSD (3-clause)

import tempfile

from braindecode.datasets import MOABBDataset
from braindecode.preprocessing import preprocess, Preprocessor
from braindecode.datautil import load_concat_dataset
from braindecode.preprocessing import create_windows_from_events

First, we load a dataset using MOABB.

dataset = MOABBDataset(
    dataset_name='BNCI2014001',
    subject_ids=[1],
)

Out:

48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]

Next, we can apply preprocessing steps to the dataset; here we resample the signals to 10 Hz. This step is optional, and the dataset can also be saved without any preprocessing.

preprocess(
    concat_ds=dataset,
    preprocessors=[Preprocessor(fn='resample', sfreq=10)]
)

Out:

48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]

<braindecode.datasets.moabb.MOABBDataset object at 0x7f748ee50850>

We save the dataset to an existing directory. This creates a ‘.fif’ file for every dataset in the concat dataset. Additionally, it creates two JSON files: the first holds the description of the dataset, the second holds the name of the target. If you want to save to the same directory several times, for example because you are trying different preprocessing, you can choose to overwrite the existing files.

tmpdir = tempfile.mkdtemp()  # write in a temporary directory
dataset.save(
    path=tmpdir,
    overwrite=False,
)

Out:

Writing /tmp/tmp4_0_urks/0/0-raw.fif
Closing /tmp/tmp4_0_urks/0/0-raw.fif
[done]
Writing /tmp/tmp4_0_urks/1/1-raw.fif
Closing /tmp/tmp4_0_urks/1/1-raw.fif
[done]
Writing /tmp/tmp4_0_urks/2/2-raw.fif
Closing /tmp/tmp4_0_urks/2/2-raw.fif
[done]
Writing /tmp/tmp4_0_urks/3/3-raw.fif
Closing /tmp/tmp4_0_urks/3/3-raw.fif
[done]
Writing /tmp/tmp4_0_urks/4/4-raw.fif
Closing /tmp/tmp4_0_urks/4/4-raw.fif
[done]
Writing /tmp/tmp4_0_urks/5/5-raw.fif
Closing /tmp/tmp4_0_urks/5/5-raw.fif
[done]
Writing /tmp/tmp4_0_urks/6/6-raw.fif
Closing /tmp/tmp4_0_urks/6/6-raw.fif
[done]
Writing /tmp/tmp4_0_urks/7/7-raw.fif
Closing /tmp/tmp4_0_urks/7/7-raw.fif
[done]
Writing /tmp/tmp4_0_urks/8/8-raw.fif
Closing /tmp/tmp4_0_urks/8/8-raw.fif
[done]
Writing /tmp/tmp4_0_urks/9/9-raw.fif
Closing /tmp/tmp4_0_urks/9/9-raw.fif
[done]
Writing /tmp/tmp4_0_urks/10/10-raw.fif
Closing /tmp/tmp4_0_urks/10/10-raw.fif
[done]
Writing /tmp/tmp4_0_urks/11/11-raw.fif
Closing /tmp/tmp4_0_urks/11/11-raw.fif
[done]
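
To check what a save call actually wrote, it can help to list the files below the target directory. The following is a minimal sketch using only the standard library. The helper name `list_saved_files` is hypothetical and not part of braindecode, and the demo directory holds empty placeholder files that merely imitate the `<id>/<id>-raw.fif` layout seen in the log above.

```python
import os
import tempfile


def list_saved_files(path):
    """Return sorted paths of all files below `path`, relative to `path`."""
    found = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            found.append(os.path.relpath(os.path.join(root, name), path))
    return sorted(found)


# Build a throwaway directory with empty placeholder files (not real FIF data).
demo_dir = tempfile.mkdtemp()
for i in range(2):
    os.makedirs(os.path.join(demo_dir, str(i)))
    open(os.path.join(demo_dir, str(i), f"{i}-raw.fif"), "w").close()

demo_files = list_saved_files(demo_dir)
```

Calling `list_saved_files(tmpdir)` after `dataset.save` should show the ‘.fif’ files together with the JSON files mentioned above.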

We load the saved dataset from the directory. Signals can be preloaded, as in mne. Optionally, only specific ‘.fif’ files can be loaded by specifying their ids. The target name can be changed if the dataset supports it. TUHAbnormal, for example, supports ‘pathological’, ‘age’, and ‘gender’: if you stored a preprocessed version with target ‘pathological’, you can change the target upon loading.

dataset_loaded = load_concat_dataset(
    path=tmpdir,
    preload=True,
    ids_to_load=[1, 3],
    target_name=None,
)

Out:

Opening raw data file /tmp/tmp4_0_urks/1/1-raw.fif...
    Range : 0 ... 3868 =      0.000 ...   386.800 secs
Ready.
Reading 0 ... 3868  =      0.000 ...   386.800 secs...
Opening raw data file /tmp/tmp4_0_urks/3/3-raw.fif...
    Range : 0 ... 3868 =      0.000 ...   386.800 secs
Ready.
Reading 0 ... 3868  =      0.000 ...   386.800 secs...
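
Before choosing values for `ids_to_load`, it can be useful to see which ids exist on disk. The sketch below assumes the layout visible in the logs above, that is, one subdirectory per dataset named by its integer id. This layout is an observation from this example's output, not a documented guarantee, and the helper name `available_ids` is hypothetical.

```python
import os
import tempfile


def available_ids(path):
    """Return the integer-named subdirectories of `path` as sorted ints."""
    ids = []
    for entry in os.listdir(path):
        if entry.isdigit() and os.path.isdir(os.path.join(path, entry)):
            ids.append(int(entry))
    # Sort numerically so that 10 comes after 2, not after 1.
    return sorted(ids)


# Demo on a throwaway directory; "notes" is ignored because it is not numeric.
demo_dir = tempfile.mkdtemp()
for name in ["0", "1", "10", "2", "notes"]:
    os.makedirs(os.path.join(demo_dir, name))

demo_ids = available_ids(demo_dir)
```

Any subset of the returned ids could then be passed as `ids_to_load` to `load_concat_dataset`.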

The serialization utility also supports WindowsDatasets, so we create windows next.

windows_dataset = create_windows_from_events(
    concat_ds=dataset_loaded,
    trial_start_offset_samples=0,
    trial_stop_offset_samples=0,
)

windows_dataset.description

Out:

Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 40 original time points ...
0 bad epochs dropped
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 40 original time points ...
0 bad epochs dropped

	subject	session	run
0	1	session_T	run_1
1	1	session_T	run_3

Again, we save the dataset to an existing directory. This creates a ‘-epo.fif’ file for every dataset in the concat dataset. Additionally, it creates a JSON file holding the description of the dataset. If you want to save to the same directory several times, for example because you are trying different windowing parameters, you can choose to overwrite the existing files. Because we reuse the directory that still holds the raw files saved earlier, the output below contains warnings about a possible mix of datasets.

windows_dataset.save(
    path=tmpdir,
    overwrite=True,
)

Out:

Loading data for 1 events and 40 original time points ...
Loading data for 48 events and 40 original time points ...
Loading data for 1 events and 40 original time points ...
Loading data for 48 events and 40 original time points ...
/home/runner/work/braindecode/braindecode/braindecode/datasets/base.py:569: UserWarning: The number of saved datasets (2) does not match the number of existing subdirectories (12). You may now encounter a mix of differently preprocessed datasets!
  f"datasets!", UserWarning)
/home/runner/work/braindecode/braindecode/braindecode/datasets/base.py:573: UserWarning: Chosen directory /tmp/tmp4_0_urks contains other subdirectories or files ['10', '8', '5', '7', '3', '11', '6', '2', '4', '9'].
  warnings.warn(f'Chosen directory {path} contains other '
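
The warnings above appear because the two windowed datasets were written into a directory that already held twelve subdirectories from the earlier save. One way to avoid such a mix is to give each version of the data its own empty subdirectory. This is a minimal sketch using only the standard library; the helper name `make_save_dir` and the subdirectory names `raw` and `windows` are hypothetical choices for illustration.

```python
import os
import tempfile


def make_save_dir(base_dir, name):
    """Create and return the subdirectory `name` under `base_dir`.

    Raises FileExistsError if the subdirectory already contains entries,
    so that two differently processed datasets are never mixed.
    """
    path = os.path.join(base_dir, name)
    os.makedirs(path, exist_ok=True)
    if os.listdir(path):
        raise FileExistsError(f"{path} is not empty")
    return path


base_dir = tempfile.mkdtemp()
raw_dir = make_save_dir(base_dir, "raw")          # for the continuous data
windows_dir = make_save_dir(base_dir, "windows")  # for the windowed data
```

Passing `raw_dir` as `path` to `dataset.save` and `windows_dir` to `windows_dataset.save` would keep the ‘.fif’ and ‘-epo.fif’ files apart.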

We load the saved dataset from the directory. Signals can be preloaded, as in mne. Optionally, only specific ‘-epo.fif’ files can be loaded by specifying their ids.

windows_dataset_loaded = load_concat_dataset(
    path=tmpdir,
    preload=False,
    ids_to_load=[0],
    target_name=None,
)

windows_dataset_loaded.description

Out:

Reading /tmp/tmp4_0_urks/0/0-epo.fif ...
    Found the data of interest:
        t =       0.00 ...    3900.00 ms
        0 CTF compensation matrices available
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated

	subject	session	run
0	1	session_T	run_1
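
As a sanity check, the time ranges in the logs above are consistent with the 10 Hz resampling applied during preprocessing. The short sketch below redoes the arithmetic; the function name `sample_time_ms` is hypothetical, and the sample indices are copied from the logged output.

```python
def sample_time_ms(index, sfreq):
    """Time of sample `index` in milliseconds, with sample 0 at t = 0."""
    return index * 1000 / sfreq


sfreq = 10  # Hz, as passed to the 'resample' preprocessor

# Last sample of a continuous recording ("Range : 0 ... 3868").
last_raw_ms = sample_time_ms(3868, sfreq)       # 386800.0 ms = 386.8 s

# Last sample of a window with 40 time points (indices 0 ... 39).
last_window_ms = sample_time_ms(40 - 1, sfreq)  # 3900.0 ms
```

Both values match the logs: 386.800 secs for the continuous files and 3900.00 ms for the windows.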

Total running time of the script: ( 0 minutes 4.979 seconds)

Estimated memory usage: 408 MB