Custom Dataset Example
This example shows how to convert data X and targets y, given as numpy arrays, to a braindecode-compatible data format.
# Authors: Lukas Gemein <l.gemein@gmail.com>
#
# License: BSD (3-clause)
import mne
from braindecode.datasets import create_from_X_y
To set up the example, we first fetch some data using mne:
# 5, 6, 9, 10, 13, 14 are codes for executed and imagined hands/feet
subject_id = 22
event_codes = [5, 6, 9, 10, 13, 14]
# event_codes = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
# This will download the files if you don't have them yet,
# and then return the paths to the files.
physionet_paths = mne.datasets.eegbci.load_data(
    subject_id, event_codes, update_path=False)
# Load each of the files
parts = [mne.io.read_raw_edf(path, preload=True, stim_channel='auto')
         for path in physionet_paths]
Extracting EDF parameters from /home/runner/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S022/S022R05.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 19999 = 0.000 ... 124.994 secs...
Extracting EDF parameters from /home/runner/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S022/S022R06.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 19999 = 0.000 ... 124.994 secs...
Extracting EDF parameters from /home/runner/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S022/S022R09.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 19999 = 0.000 ... 124.994 secs...
Extracting EDF parameters from /home/runner/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S022/S022R10.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 19999 = 0.000 ... 124.994 secs...
Extracting EDF parameters from /home/runner/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S022/S022R13.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 19999 = 0.000 ... 124.994 secs...
Extracting EDF parameters from /home/runner/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S022/S022R14.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Reading 0 ... 19999 = 0.000 ... 124.994 secs...
We take the required data and targets, plus additional information (sampling frequency and channel names), from the loaded data. Note that this data and information can originate from any source.
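The code for this step is not shown on this page, although X, y, sfreq and ch_names are used in the next call. A minimal sketch of what it could look like follows. The helper name arrays_from_raws is hypothetical, and using the event codes as one target per recording is an assumption inferred from the output further down (the first window has y_i=5). The sketch only relies on each recording exposing get_data() and an info mapping with "sfreq" and "ch_names", as mne Raw objects do.

```python
def arrays_from_raws(raws, targets):
    """Collect X, y, sfreq and ch_names from a list of loaded recordings.

    Hypothetical helper: assumes every element of ``raws`` has a
    ``get_data()`` method and an ``info`` mapping with the keys
    "sfreq" and "ch_names" (true for mne Raw objects).
    """
    if len(raws) != len(targets):
        raise ValueError("need exactly one target per recording")
    # One (n_channels, n_times) array per recording
    X = [raw.get_data() for raw in raws]
    # One target per recording (assumption: the event code of each file)
    y = list(targets)
    # Assumption: all recordings share sampling rate and channel layout,
    # so the first recording is representative
    sfreq = raws[0].info["sfreq"]
    ch_names = raws[0].info["ch_names"]
    return X, y, sfreq, ch_names
```

With the variables from this example, the call would be X, y, sfreq, ch_names = arrays_from_raws(parts, event_codes).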
Convert to a data format compatible with skorch and braindecode:
windows_dataset = create_from_X_y(
    X, y, drop_last_window=False, sfreq=sfreq, ch_names=ch_names,
    window_stride_samples=500,
    window_size_samples=500,
)
windows_dataset.description  # look at the dataset description
Creating RawArray with float64 data, n_channels=64, n_times=20000
Range : 0 ... 19999 = 0.000 ... 124.994 secs
Ready.
Creating RawArray with float64 data, n_channels=64, n_times=20000
Range : 0 ... 19999 = 0.000 ... 124.994 secs
Ready.
Creating RawArray with float64 data, n_channels=64, n_times=20000
Range : 0 ... 19999 = 0.000 ... 124.994 secs
Ready.
Creating RawArray with float64 data, n_channels=64, n_times=20000
Range : 0 ... 19999 = 0.000 ... 124.994 secs
Ready.
Creating RawArray with float64 data, n_channels=64, n_times=20000
Range : 0 ... 19999 = 0.000 ... 124.994 secs
Ready.
Creating RawArray with float64 data, n_channels=64, n_times=20000
Range : 0 ... 19999 = 0.000 ... 124.994 secs
Ready.
Using data from preloaded Raw for 40 events and 500 original time points ...
0 bad epochs dropped
Using data from preloaded Raw for 40 events and 500 original time points ...
0 bad epochs dropped
Using data from preloaded Raw for 40 events and 500 original time points ...
0 bad epochs dropped
Using data from preloaded Raw for 40 events and 500 original time points ...
0 bad epochs dropped
Using data from preloaded Raw for 40 events and 500 original time points ...
0 bad epochs dropped
Using data from preloaded Raw for 40 events and 500 original time points ...
0 bad epochs dropped
You can inspect the dataset, for example by checking its length:
print(len(windows_dataset))  # get the number of windows (samples)
240
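The length of 240 follows from the window parameters: each of the 6 recordings has 20000 time points, which are cut into non-overlapping windows of 500 samples. The sketch below reproduces that arithmetic in plain Python. The helper window_bounds is hypothetical and is not how braindecode computes windows internally; it only covers windows that fit entirely into the recording, which is sufficient here because 20000 is a multiple of 500.

```python
def window_bounds(n_times, window_size, stride):
    """Return (start, stop) sample indices of all windows that fit entirely."""
    last_start = n_times - window_size
    return [(start, start + window_size)
            for start in range(0, last_start + 1, stride)]


# Values taken from the example above
bounds = window_bounds(n_times=20000, window_size=500, stride=500)
n_recordings = 6

print(len(bounds))                 # windows per recording: 40
print(len(bounds) * n_recordings)  # windows in the whole dataset: 240
print(bounds[0])                   # first window: (0, 500)
```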
You can now index the dataset to get a single window:
i = 0
x_i, y_i, window_ind = windows_dataset[i]
n_channels, n_times = x_i.shape # the EEG data
_, start_ind, stop_ind = window_ind
print(f"n_channels={n_channels} -- n_times={n_times} -- y_i={y_i}")
print(f"start_ind={start_ind} -- stop_ind={stop_ind}")
Using data from preloaded Raw for 1 events and 500 original time points ...
n_channels=64 -- n_times=500 -- y_i=5
start_ind=0 -- stop_ind=500
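The start and stop indices are given in samples. To relate them to time in seconds, divide by the sampling frequency. The sketch below assumes sfreq is 160 Hz, which is inferred from the log output above (20000 samples spanning about 125 seconds); the helper window_times is hypothetical.

```python
def window_times(start_ind, stop_ind, sfreq):
    """Convert window boundaries from sample indices to seconds."""
    return start_ind / sfreq, stop_ind / sfreq


# Indices of the first window; sfreq=160.0 is an assumption based on the logs
t_start, t_stop = window_times(0, 500, sfreq=160.0)
print(f"window spans {t_start} s to {t_stop} s")  # 0.0 s to 3.125 s
```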
Total running time of the script: ( 0 minutes 1.164 seconds)
Estimated memory usage: 96 MB