Split Dataset Example

In this example, we show several ways to split a dataset.

# Authors: Lukas Gemein <l.gemein@gmail.com>
#
# License: BSD (3-clause)

from IPython.display import display

from braindecode.datasets import MOABBDataset
from braindecode.datautil.windowers import create_windows_from_events

First, we create a dataset based on BCIC IV 2a, fetched with MOABB.

ds = MOABBDataset(dataset_name="BNCI2014001", subject_ids=[1])

Out:

/home/circleci/.local/lib/python3.7/site-packages/moabb/datasets/download.py:53: RuntimeWarning: Setting non-standard config type: "MNE_DATASETS_BNCI_PATH"
  set_config(key, osp.join(osp.expanduser("~"), "mne_data"))
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]
48 events found
Event IDs: [1 2 3 4]

The dataset ds carries a pandas DataFrame, ds.description, that holds additional information about its internal datasets, one row per dataset.

display(ds.description)

Out:

    subject  ...    run
0         1  ...  run_0
1         1  ...  run_1
2         1  ...  run_2
3         1  ...  run_3
4         1  ...  run_4
5         1  ...  run_5
6         1  ...  run_0
7         1  ...  run_1
8         1  ...  run_2
9         1  ...  run_3
10        1  ...  run_4
11        1  ...  run_5

[12 rows x 3 columns]

We can split the dataset based on the information in the description, for example by run. The returned dictionary has string keys corresponding to the unique entries in the chosen column of the description DataFrame.

splits = ds.split("run")
display(splits)
display(splits["run_4"].description)

Out:

{'run_0': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2e8127950>, 'run_1': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2d8991950>, 'run_2': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2d8810e10>, 'run_3': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2e81c0e10>, 'run_4': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2cf92ed10>, 'run_5': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2cf92ee10>}
   subject  ...    run
0        1  ...  run_4
1        1  ...  run_4

[2 rows x 3 columns]
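To make the grouping explicit, here is a minimal pure-Python sketch of what splitting by a description column amounts to: rows that share a value in that column end up under the same string key. The helper group_rows_by_column and the list-of-dicts description are hypothetical stand-ins for illustration only. They are not part of braindecode, whose actual implementation may differ.

```python
from collections import defaultdict


def group_rows_by_column(description, column):
    """Map each unique value in `column` to the row indices that hold it."""
    groups = defaultdict(list)
    for row_index, row in enumerate(description):
        # Keys are converted to strings, mirroring the string keys shown above.
        groups[str(row[column])].append(row_index)
    return dict(groups)


# Hypothetical stand-in for ds.description: 12 rows, runs 0 to 5 repeated twice.
description = [{"subject": 1, "run": f"run_{i % 6}"} for i in range(12)]

run_groups = group_rows_by_column(description, "run")
print(run_groups["run_4"])  # [4, 10]
```

Each of the six keys collects two rows here, which matches the two-row description displayed for splits["run_4"].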

We can also split the dataset based on a list of integers corresponding to rows in the description. In this case, the returned dictionary has "0" as its only key.

splits = ds.split([0, 1, 5])
display(splits)
display(splits["0"].description)

Out:

{'0': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2cf92ef10>}
   subject  ...    run
0        1  ...  run_0
1        1  ...  run_1
2        1  ...  run_5

[3 rows x 3 columns]

If we want multiple splits based on indices, we can also specify a list of lists of integers. In this case, the dictionary has string keys that give the position of each split in the outer list.

splits = ds.split([[0, 1, 5], [2, 3, 4], [6, 7, 8, 9, 10, 11]])
display(splits)
display(splits["2"].description)

Out:

{'0': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2e819bf10>, '1': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2da1cf610>, '2': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2da1cff10>}
   subject  ...    run
0        1  ...  run_0
1        1  ...  run_1
2        1  ...  run_2
3        1  ...  run_3
4        1  ...  run_4
5        1  ...  run_5

[6 rows x 3 columns]
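The two index-based forms differ only in how the keys are derived. The following sketch, assuming a hypothetical helper normalize_split_indices that is not part of braindecode, shows one way a flat list and a list of lists could both be mapped to the string keys seen in the outputs above.

```python
def normalize_split_indices(indices):
    """Return a dict that maps string keys to lists of row indices.

    A flat list of integers becomes a single split "0"; a list of lists
    becomes one split per inner list, keyed by its position.
    """
    if all(isinstance(i, int) for i in indices):
        # Wrap a flat list so both cases share the same code path.
        indices = [indices]
    return {str(split_id): list(rows) for split_id, rows in enumerate(indices)}


print(normalize_split_indices([0, 1, 5]))
# {'0': [0, 1, 5]}
print(normalize_split_indices([[0, 1, 5], [2, 3, 4]]))
# {'0': [0, 1, 5], '1': [2, 3, 4]}
```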

Similarly, we can split datasets after creating windows.

windows = create_windows_from_events(
    ds, trial_start_offset_samples=0, trial_stop_offset_samples=0)
splits = windows.split("run")
display(splits)
splits = windows.split([4, 8])
display(splits)
splits = windows.split([[4, 8], [5, 9, 11]])
display(splits)

Out:

Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 1000 original time points ...
0 bad epochs dropped
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 1000 original time points ...
0 bad epochs dropped
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 1000 original time points ...
0 bad epochs dropped
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 1000 original time points ...
0 bad epochs dropped
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 1000 original time points ...
0 bad epochs dropped
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 1000 original time points ...
0 bad epochs dropped
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 1000 original time points ...
0 bad epochs dropped
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 1000 original time points ...
0 bad epochs dropped
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 1000 original time points ...
0 bad epochs dropped
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 1000 original time points ...
0 bad epochs dropped
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 1000 original time points ...
0 bad epochs dropped
Used Annotations descriptions: ['feet', 'left_hand', 'right_hand', 'tongue']
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
48 matching events found
No baseline correction applied
0 projection items activated
Loading data for 48 events and 1000 original time points ...
0 bad epochs dropped
{'run_0': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2cf92ef90>, 'run_1': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2d8810e10>, 'run_2': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2cf937790>, 'run_3': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2cf937810>, 'run_4': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2cf912e10>, 'run_5': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2cf912bd0>}
{'0': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2da1cff10>}
{'0': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2cf937810>, '1': <braindecode.datasets.base.BaseConcatDataset object at 0x7fe2e819bf10>}
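One possible use of the list-of-lists form is holding out one run for evaluation. The sketch below only builds the index lists. The helper holdout_indices and the runs list are hypothetical and stand in for the run column of the description, so the block runs without braindecode or any downloaded data.

```python
def holdout_indices(runs, held_out_run):
    """Return [train_rows, test_rows] for holding out one run label."""
    train_rows = [i for i, run in enumerate(runs) if run != held_out_run]
    test_rows = [i for i, run in enumerate(runs) if run == held_out_run]
    return [train_rows, test_rows]


# Hypothetical run labels, one per description row (runs 0 to 5, twice).
runs = [f"run_{i % 6}" for i in range(12)]

train_rows, test_rows = holdout_indices(runs, "run_5")
print(test_rows)  # [5, 11]
# With braindecode available, the nested list could be passed on as
# windows.split([train_rows, test_rows]), which by the rule described above
# would give the keys "0" (train) and "1" (test).
```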

Total running time of the script: ( 0 minutes 14.083 seconds)

Estimated memory usage: 406 MB
