Custom Dataset Example
This example shows how to convert data X and y as numpy arrays to a braindecode compatible data format.
# Authors: Lukas Gemein <l.gemein@gmail.com>
#
# License: BSD (3-clause)
import mne
from braindecode.datasets import create_from_X_y
To set up the example, we first fetch some data using mne:
# 5, 6, 9, 10, 13, 14 are codes for executed and imagined hands/feet
subject_id = 22
event_codes = [5, 6, 9, 10, 13, 14]
# event_codes = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
# This will download the files if you don't have them yet,
# and then return the paths to the files.
physionet_paths = mne.datasets.eegbci.load_data(
subject_id, event_codes, update_path=False)
# Load each of the files
parts = [mne.io.read_raw_edf(path, preload=True, stim_channel='auto')
for path in physionet_paths]
Out:
Extracting EDF parameters from /home/runner/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S022/S022R05.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 19999 = 0.000 ... 124.994 secs...
Extracting EDF parameters from /home/runner/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S022/S022R06.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 19999 = 0.000 ... 124.994 secs...
Extracting EDF parameters from /home/runner/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S022/S022R09.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 19999 = 0.000 ... 124.994 secs...
Extracting EDF parameters from /home/runner/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S022/S022R10.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 19999 = 0.000 ... 124.994 secs...
Extracting EDF parameters from /home/runner/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S022/S022R13.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 19999 = 0.000 ... 124.994 secs...
Extracting EDF parameters from /home/runner/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S022/S022R14.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 19999 = 0.000 ... 124.994 secs...
We take the required data, targets, and additional information (sampling frequency and channel names) from the loaded data. Note that this data and information can originate from any source.
X = [raw.get_data() for raw in parts]
y = event_codes
sfreq = parts[0].info["sfreq"]
ch_names = parts[0].info["ch_names"]
Convert to a data format compatible with skorch and braindecode:
windows_dataset = create_from_X_y(
X, y, drop_last_window=False, sfreq=sfreq, ch_names=ch_names,
window_stride_samples=500,
window_size_samples=500,
)
windows_dataset.description  # look at the dataset description
Out:
Creating RawArray with float64 data, n_channels=64, n_times=20000
Range : 0 ... 19999 = 0.000 ... 124.994 secs
Ready.
Creating RawArray with float64 data, n_channels=64, n_times=20000
Range : 0 ... 19999 = 0.000 ... 124.994 secs
Ready.
Creating RawArray with float64 data, n_channels=64, n_times=20000
Range : 0 ... 19999 = 0.000 ... 124.994 secs
Ready.
Creating RawArray with float64 data, n_channels=64, n_times=20000
Range : 0 ... 19999 = 0.000 ... 124.994 secs
Ready.
Creating RawArray with float64 data, n_channels=64, n_times=20000
Range : 0 ... 19999 = 0.000 ... 124.994 secs
Ready.
Creating RawArray with float64 data, n_channels=64, n_times=20000
Range : 0 ... 19999 = 0.000 ... 124.994 secs
Ready.
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
40 matching events found
No baseline correction applied
0 projection items activated
Loading data for 40 events and 500 original time points ...
0 bad epochs dropped
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
40 matching events found
No baseline correction applied
0 projection items activated
Loading data for 40 events and 500 original time points ...
0 bad epochs dropped
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
40 matching events found
No baseline correction applied
0 projection items activated
Loading data for 40 events and 500 original time points ...
0 bad epochs dropped
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
40 matching events found
No baseline correction applied
0 projection items activated
Loading data for 40 events and 500 original time points ...
0 bad epochs dropped
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
40 matching events found
No baseline correction applied
0 projection items activated
Loading data for 40 events and 500 original time points ...
0 bad epochs dropped
Adding metadata with 4 columns
Replacing existing metadata with 4 columns
40 matching events found
No baseline correction applied
0 projection items activated
Loading data for 40 events and 500 original time points ...
0 bad epochs dropped
You can manipulate the dataset
print(len(windows_dataset)) # get the number of samples
Out:
240
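The 240 windows arise from six recordings of 20000 samples each, cut into non-overlapping 500-sample windows (stride equal to window size). A minimal sketch of that arithmetic in plain Python — the helper name is ours for illustration, not a braindecode API:

```python
def count_windows(n_times, window_size, window_stride):
    """Number of complete windows a recording of n_times samples yields."""
    if n_times < window_size:
        return 0
    return (n_times - window_size) // window_stride + 1

per_recording = count_windows(20000, 500, 500)  # windows per 125 s run
total = 6 * per_recording                       # six runs were loaded
print(per_recording, total)                     # prints: 40 240
```

With `drop_last_window=False` a final partial window would be kept as well; here 500 divides 20000 evenly, so the count is the same either way.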
You can now index the data
i = 0
x_i, y_i, window_ind = windows_dataset[i]
n_channels, n_times = x_i.shape # the EEG data
_, start_ind, stop_ind = window_ind
print(f"n_channels={n_channels} -- n_times={n_times} -- y_i={y_i}")
print(f"start_ind={start_ind} -- stop_ind={stop_ind}")
Out:
Loading data for 1 events and 500 original time points ...
n_channels=64 -- n_times=500 -- y_i=5
start_ind=0 -- stop_ind=500
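Because the stride equals the window size here, the start and stop indices of a window follow directly from its position within the recording. A hypothetical helper illustrating the relation (again not part of braindecode):

```python
def window_bounds(idx, window_size, window_stride):
    """Start/stop sample indices of the idx-th window in a recording."""
    start = idx * window_stride
    return start, start + window_size

print(window_bounds(0, 500, 500))  # first window:  (0, 500)
print(window_bounds(1, 500, 500))  # second window: (500, 1000)
```

This matches the `start_ind=0 -- stop_ind=500` printed for the first window above.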
Total running time of the script: (0 minutes 1.127 seconds)
Estimated memory usage: 96 MB