Load and save dataset example

In this example, we show how to load and save braindecode datasets.

# Authors: Lukas Gemein <l.gemein@gmail.com>
#
# License: BSD (3-clause)

import tempfile

from braindecode.datasets import MOABBDataset
from braindecode.preprocessing import preprocess, Preprocessor
from braindecode.datautil import load_concat_dataset
from braindecode.preprocessing import create_windows_from_events

First, we load some dataset using MOABB.

dataset = MOABBDataset(
    dataset_name='BNCI2014001',
    subject_ids=[1],
)
BNCI2014001 has been renamed to BNCI2014_001. BNCI2014001 will be removed in version 1.1.
The dataset class name 'BNCI2014001' must be an abbreviation of its code 'BNCI2014-001'. See moabb.datasets.base.is_abbrev for more information.

We can apply preprocessing steps to the dataset. It is also possible to skip this step and not apply any preprocessing.

preprocess(
    concat_ds=dataset,
    preprocessors=[Preprocessor(fn='resample', sfreq=10)]
)
48 events found
Event IDs: [1 2 3 4]
(the two lines above repeat once per recording; repeated output omitted)

<braindecode.datasets.moabb.MOABBDataset object at 0x7f45429de380>

We save the dataset to an existing directory. It will create a ‘.fif’ file for every dataset in the concat dataset. Additionally, it will create two JSON files: the first holds the description of the dataset, the second the name of the target. If you want to store to the same directory several times, for example when trying different preprocessing, you can choose to overwrite the existing files.

tmpdir = tempfile.mkdtemp()  # write in a temporary directory
dataset.save(
    path=tmpdir,
    overwrite=False,
)
Writing /tmp/tmpmcfvy2r0/0/0-raw.fif
Closing /tmp/tmpmcfvy2r0/0/0-raw.fif
[done]
... (output for subdirectories 1 to 10 omitted)
Writing /tmp/tmpmcfvy2r0/11/11-raw.fif
Closing /tmp/tmpmcfvy2r0/11/11-raw.fif
[done]
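The resulting directory layout can be sketched with plain `pathlib` — stand-in empty files here, not a real save, and the JSON file names are hypothetical placeholders for the two metadata files mentioned above:

```python
from pathlib import Path
import tempfile

# Recreate the layout `save` produces: one numbered subdirectory per
# recording, each holding a FIF file plus JSON metadata (stand-in files).
root = Path(tempfile.mkdtemp())
for i in range(3):  # the real example above writes 12 such subdirectories
    sub = root / str(i)
    sub.mkdir()
    (sub / f'{i}-raw.fif').touch()
    (sub / 'description.json').touch()   # hypothetical metadata file name
    (sub / 'target_name.json').touch()   # hypothetical metadata file name

print(sorted(p.relative_to(root).as_posix() for p in root.rglob('*-raw.fif')))
# ['0/0-raw.fif', '1/1-raw.fif', '2/2-raw.fif']
```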

We load the saved dataset from a directory. Signals can be preloaded in compliance with MNE. Optionally, only specific ‘.fif’ files can be loaded by specifying their ids. The target name can be changed if the dataset supports it (TUHAbnormal, for example, supports ‘pathological’, ‘age’, and ‘gender’; if you stored a preprocessed version with target ‘pathological’, it is possible to change the target upon loading).

dataset_loaded = load_concat_dataset(
    path=tmpdir,
    preload=True,
    ids_to_load=[1, 3],
    target_name=None,
)
Opening raw data file /tmp/tmpmcfvy2r0/1/1-raw.fif...
    Range : 0 ... 3868 =      0.000 ...   386.800 secs
Ready.
Reading 0 ... 3868  =      0.000 ...   386.800 secs...
Opening raw data file /tmp/tmpmcfvy2r0/3/3-raw.fif...
    Range : 0 ... 3868 =      0.000 ...   386.800 secs
Ready.
Reading 0 ... 3868  =      0.000 ...   386.800 secs...

The serialization utility also supports WindowsDatasets, so we create compute windows next.

windows_dataset = create_windows_from_events(
    concat_ds=dataset_loaded,
    trial_start_offset_samples=0,
    trial_stop_offset_samples=0,
)

windows_dataset.description
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
  subject session run
0       1  0train   1
1       1  0train   3
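Once windowed, the dataset yields one sample per window. The toy class below is a hypothetical stand-in (plain lists, not the braindecode implementation) that mimics the sample convention — signal `X`, target `y`, and crop indices — just to show how samples are consumed:

```python
# Toy stand-in mimicking how a windowed dataset yields samples:
# X (channels x times), y (target label), and the crop indices
# [i_window_in_trial, i_start_in_trial, i_stop_in_trial].
class ToyWindowsDataset:
    def __init__(self, windows, targets):
        self.windows = windows
        self.targets = targets

    def __len__(self):
        return len(self.windows)

    def __getitem__(self, i):
        n_times = len(self.windows[i][0])
        return self.windows[i], self.targets[i], [0, 0, n_times]


toy = ToyWindowsDataset(
    windows=[[[0.1, 0.2, 0.3]], [[0.4, 0.5, 0.6]]],  # 1 channel, 3 samples
    targets=['left_hand', 'feet'],
)
X, y, crop_inds = toy[0]
print(y, crop_inds)  # left_hand [0, 0, 3]
```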


Again, we save the dataset to an existing directory. It will create a ‘-epo.fif’ file for every dataset in the concat dataset. Additionally, it will create a JSON file holding the description of the dataset. If you want to store to the same directory several times, for example when trying different windowing parameters, you can choose to overwrite the existing files.

windows_dataset.save(
    path=tmpdir,
    overwrite=True,
)
Writing /tmp/tmpmcfvy2r0/0/0-raw.fif
Closing /tmp/tmpmcfvy2r0/0/0-raw.fif
[done]
Writing /tmp/tmpmcfvy2r0/1/1-raw.fif
Closing /tmp/tmpmcfvy2r0/1/1-raw.fif
[done]
/home/runner/work/braindecode/braindecode/braindecode/datasets/base.py:700: UserWarning: The number of saved datasets (2) does not match the number of existing subdirectories (12). You may now encounter a mix of differently preprocessed datasets!
  warnings.warn(f"The number of saved datasets ({i_ds+1+offset}) "
/home/runner/work/braindecode/braindecode/braindecode/datasets/base.py:708: UserWarning: Chosen directory /tmp/tmpmcfvy2r0 contains other subdirectories or files ['8', '7', '10', '5', '4', '9', '3', '2', '6', '11'].
  warnings.warn(f'Chosen directory {path} contains other '
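One way to avoid the mixed-contents warnings above (a sketch of one option, not the only one) is to give each preprocessing or windowing variant its own fresh directory instead of overwriting in place:

```python
import tempfile
from pathlib import Path

# A clean target directory per preprocessing/windowing variant keeps
# saved datasets from mixing with the results of earlier saves.
windows_dir = tempfile.mkdtemp(prefix='windows_')
# windows_dataset.save(path=windows_dir, overwrite=False)  # as above
print(Path(windows_dir).is_dir())  # True
```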

We load the saved dataset from a directory. Signals can be preloaded in compliance with MNE. Optionally, only specific ‘-epo.fif’ files can be loaded by specifying their ids.

windows_dataset_loaded = load_concat_dataset(
    path=tmpdir,
    preload=False,
    ids_to_load=[0],
    target_name=None,
)

windows_dataset_loaded.description
Opening raw data file /tmp/tmpmcfvy2r0/0/0-raw.fif...
    Range : 0 ... 3868 =      0.000 ...   386.800 secs
Ready.
  subject session run
0       1  0train   1


Total running time of the script: (0 minutes 4.689 seconds)

Estimated memory usage: 10 MB
