braindecode.datasets package#
Loader code for some datasets.
- class braindecode.datasets.BCICompetitionIVDataset4(subject_ids=None)[source]#
Bases:
BaseConcatDataset
BCI competition IV dataset 4.
Contains ECoG recordings for three patients moving fingers during the experiment. Targets correspond to the time courses of the flexion of each of five fingers. See http://www.bbci.de/competition/iv/desc_4.pdf and http://www.bbci.de/competition/iv/ for the dataset and competition description. ECoG library containing the dataset: https://searchworks.stanford.edu/view/zk881ps0522
Notes
When using this dataset, please cite [1].
- Parameters:
subject_ids (list(int) | int | None) – (list of) int of subject(s) to be loaded. If None, load all available subjects. Should be in range 1-3.
References
[1] Miller, Kai J. “A library of human electrocorticographic data and analyses.” Nature Human Behaviour 3, no. 11 (2019): 1225-1235. https://doi.org/10.1038/s41562-019-0678-3
- static download(path=None, force_update=False, verbose=None)[source]#
Download the dataset.
- Parameters:
path (None | str) – Location of where to look for the data storing location. If None, the environment variable or config parameter MNE_DATASETS_(dataset)_PATH is used. If it doesn’t exist, the “~/mne_data” directory is used. If the dataset is not found under the given path, the data will be automatically downloaded to the specified folder.
force_update (bool) – Force update of the dataset even if a local copy exists.
verbose (bool, str, int, or None) – If not None, override default verbose level (see mne.verbose()).
- possible_subjects = [1, 2, 3]#
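Example (a minimal sketch based on the signature above; the subject choice is only an illustration and the files are stored under the MNE default location described in download()):

>>> from braindecode.datasets import BCICompetitionIVDataset4
>>> BCICompetitionIVDataset4.download()  # fetch the files to the default MNE data folder
>>> dataset = BCICompetitionIVDataset4(subject_ids=[1])  # any of subjects 1-3
>>> dataset.description  # pandas DataFrame describing the concatenated recordings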
- class braindecode.datasets.BNCI2014001(subject_ids)[source]#
Bases:
MOABBDataset
BNCI 2014-001 Motor Imagery dataset.
Dataset summary
Dataset IIa from BCI Competition 4 [1].
Dataset Description
This data set consists of EEG data from 9 subjects. The cue-based BCI paradigm consisted of four different motor imagery tasks, namely the imagination of movement of the left hand (class 1), right hand (class 2), both feet (class 3), and tongue (class 4). Two sessions on different days were recorded for each subject. Each session comprises 6 runs separated by short breaks. One run consists of 48 trials (12 for each of the four possible classes), yielding a total of 288 trials per session.
The subjects were sitting in a comfortable armchair in front of a computer screen. At the beginning of a trial (t = 0 s), a fixation cross appeared on the black screen. In addition, a short acoustic warning tone was presented. After two seconds (t = 2 s), a cue in the form of an arrow pointing either to the left, right, down or up (corresponding to one of the four classes left hand, right hand, foot or tongue) appeared and stayed on the screen for 1.25 s. This prompted the subjects to perform the desired motor imagery task. No feedback was provided. The subjects were asked to carry out the motor imagery task until the fixation cross disappeared from the screen at t = 6 s.
Twenty-two Ag/AgCl electrodes (with inter-electrode distances of 3.5 cm) were used to record the EEG; the montage is shown in Figure 3 left. All signals were recorded monopolarly with the left mastoid serving as reference and the right mastoid as ground. The signals were sampled at 250 Hz and bandpass-filtered between 0.5 Hz and 100 Hz. The sensitivity of the amplifier was set to 100 μV. An additional 50 Hz notch filter was enabled to suppress line noise.
- Parameters:
- subject_ids: list(int) | int | None
(list of) int of subject(s) to be fetched. If None, data of all subjects is fetched.
See moabb.datasets.bnci.BNCI2014001
- class BNCI2014001(*args, **kwargs)[source]#
Bases:
BNCI2014_001
BNCI 2014-001 Motor Imagery dataset.
Dataset summary
Dataset IIa from BCI Competition 4 [1].
Dataset Description
This data set consists of EEG data from 9 subjects. The cue-based BCI paradigm consisted of four different motor imagery tasks, namely the imagination of movement of the left hand (class 1), right hand (class 2), both feet (class 3), and tongue (class 4). Two sessions on different days were recorded for each subject. Each session comprises 6 runs separated by short breaks. One run consists of 48 trials (12 for each of the four possible classes), yielding a total of 288 trials per session.
The subjects were sitting in a comfortable armchair in front of a computer screen. At the beginning of a trial (t = 0 s), a fixation cross appeared on the black screen. In addition, a short acoustic warning tone was presented. After two seconds (t = 2 s), a cue in the form of an arrow pointing either to the left, right, down or up (corresponding to one of the four classes left hand, right hand, foot or tongue) appeared and stayed on the screen for 1.25 s. This prompted the subjects to perform the desired motor imagery task. No feedback was provided. The subjects were asked to carry out the motor imagery task until the fixation cross disappeared from the screen at t = 6 s.
Twenty-two Ag/AgCl electrodes (with inter-electrode distances of 3.5 cm) were used to record the EEG; the montage is shown in Figure 3 left. All signals were recorded monopolarly with the left mastoid serving as reference and the right mastoid as ground. The signals were sampled at 250 Hz and bandpass-filtered between 0.5 Hz and 100 Hz. The sensitivity of the amplifier was set to 100 μV. An additional 50 Hz notch filter was enabled to suppress line noise.
References
[1] Tangermann, M., Müller, K.R., Aertsen, A., Birbaumer, N., Braun, C., Brunner, C., Leeb, R., Mehring, C., Miller, K.J., Mueller-Putz, G. and Nolte, G., 2012. Review of the BCI competition IV. Frontiers in Neuroscience, 6, p.55.
- doc = 'See moabb.datasets.bnci.BNCI2014001\n\n Parameters\n ----------\n subject_ids: list(int) | int | None\n (list of) int of subject(s) to be fetched. If None, data of all\n subjects is fetched.\n '#
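Example (a minimal sketch of fetching this dataset through the braindecode wrapper; the subject choice is illustrative):

>>> from braindecode.datasets import BNCI2014001
>>> dataset = BNCI2014001(subject_ids=[3])
>>> len(dataset.datasets)        # one BaseDataset per continuous recording of subject 3
>>> dataset.description.head()   # metadata of the fetched recordings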
- class braindecode.datasets.BaseConcatDataset(list_of_ds, target_transform=None)[source]#
Bases:
ConcatDataset
A base class for concatenated datasets. Holds either mne.Raw or mne.Epoch in self.datasets and has a pandas DataFrame with additional description.
- Parameters:
list_of_ds (list) – list of BaseDataset, BaseConcatDataset or WindowsDataset
target_transform (callable | None) – Optional function to call on targets before returning them.
- property description#
- get_metadata()[source]#
Concatenate the metadata and description of the wrapped Epochs.
- Returns:
metadata – DataFrame containing as many rows as there are windows in the BaseConcatDataset, with the metadata and description information for each window.
- Return type:
pd.DataFrame
- save(path, overwrite=False, offset=0)[source]#
Save datasets to files by creating one subdirectory for each dataset:
path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
- Parameters:
path (str) – Directory in which subdirectories are created to store -raw.fif | -epo.fif and .json files.
overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.
offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful in the setting of very large datasets, where one dataset has to be processed and saved at a time to account for its original position.
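Example (a sketch of saving a concatenated dataset, assuming dataset is a BaseConcatDataset as in the examples above; the output directory name is hypothetical):

>>> import os
>>> save_dir = "./saved_dataset"          # hypothetical output directory
>>> os.makedirs(save_dir, exist_ok=True)
>>> dataset.save(path=save_dir, overwrite=True)
>>> sorted(os.listdir(save_dir))          # one numbered subdirectory per dataset, e.g. ['0', '1', ...]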
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- split(by=None, property=None, split_ids=None)[source]#
Split the dataset based on information listed in its description DataFrame or based on indices.
- Parameters:
by (str | list | dict) – If by is a string, splitting is performed based on the description DataFrame column with this name. If by is a (list of) list of integers, the position in the first list corresponds to the split id and the integers to the datapoints of that split. If a dict, then each key will be used in the returned splits dict and each value should be a list of int.
property (str) – Some property which is listed in info DataFrame.
split_ids (list | dict) – List of indices to be combined in a subset. It can be a list of int or a list of list of int.
- Returns:
splits – A dictionary with the name of the split (a string) as key and the dataset as value.
- Return type:
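Example (a sketch of the splitting modes documented above, assuming dataset is a BaseConcatDataset; the "subject" column name is an assumption and depends on the concrete dataset's description):

>>> by_subject = dataset.split(by="subject")              # one split per unique column value
>>> positional = dataset.split(by=[[0, 1], [2]])          # splits '0' and '1' by dataset position
>>> named = dataset.split(by={"train": [0, 1], "valid": [2]})
>>> sorted(named.keys())
['train', 'valid']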
- property target_transform#
- property transform#
- class braindecode.datasets.BaseDataset(raw, description=None, target_name=None, transform=None)[source]#
Bases:
Dataset
Returns samples from an mne.io.Raw object along with a target.
Dataset which serves samples from an mne.io.Raw object along with a target. The target is unique for the dataset, and is obtained through the description attribute.
- Parameters:
raw (mne.io.Raw) – Continuous data.
description (dict | pandas.Series | None) – Holds additional description about the continuous signal / subject.
target_name (str | tuple | None) – Name(s) of the index in description that should be used to provide the target (e.g., to be used in a prediction task later on).
transform (callable | None) – On-the-fly transform applied to the example before it is returned.
- property description#
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- property transform#
- class braindecode.datasets.HGD(subject_ids)[source]#
Bases:
MOABBDataset
High-gamma dataset described in Schirrmeister et al. 2017.
Dataset summary
Name               #Subj   #Chan   #Classes   #Trials / class   Trials len   Sampling rate   #Sessions
Schirrmeister2017  14      128     4          120               4 s          500 Hz          1
Dataset from [1]
Our “High-Gamma Dataset” is a 128-electrode dataset (of which we later only use 44 sensors covering the motor cortex, see Section 2.7.1), obtained from 14 healthy subjects (6 female, 2 left-handed, age 27.2 ± 3.6 (mean ± std)) with roughly 1000 (963.1 ± 150.9, mean ± std) four-second trials of executed movements divided into 13 runs per subject. The four classes of movements were movements of either the left hand, the right hand, both feet, and rest (no movement, but the same type of visual cue as for the other classes). The training set consists of the approx. 880 trials of all runs except the last two runs; the test set consists of the approx. 160 trials of the last two runs. This dataset was acquired in an EEG lab optimized for non-invasive detection of high-frequency movement-related EEG components (Ball et al., 2008; Darvas et al., 2010).
Depending on the direction of a gray arrow that was shown on black background, the subjects had to repetitively clench their toes (downward arrow), perform sequential finger-tapping of their left (leftward arrow) or right (rightward arrow) hand, or relax (upward arrow). The movements were selected to require little proximal muscular activity while still being complex enough to keep subjects involved. Within the 4-s trials, the subjects performed the repetitive movements at their own pace, which had to be maintained as long as the arrow was showing. Per run, 80 arrows were displayed for 4 s each, with 3 to 4 s of continuous random inter-trial interval. The order of presentation was pseudo-randomized, with all four arrows being shown every four trials. Ideally 13 runs were performed to collect 260 trials of each movement and rest. The stimuli were presented and the data recorded with BCI2000 (Schalk et al., 2004). The experiment was approved by the ethical committee of the University of Freiburg.
- Parameters:
- subject_ids: list(int) | int | None
(list of) int of subject(s) to be fetched. If None, data of all subjects is fetched.
See moabb.datasets.schirrmeister2017.Schirrmeister2017
- class Schirrmeister2017[source]#
Bases:
BaseDataset
High-gamma dataset described in Schirrmeister et al. 2017.
Dataset summary
Name               #Subj   #Chan   #Classes   #Trials / class   Trials len   Sampling rate   #Sessions
Schirrmeister2017  14      128     4          120               4 s          500 Hz          1
Dataset from [1]
Our “High-Gamma Dataset” is a 128-electrode dataset (of which we later only use 44 sensors covering the motor cortex, see Section 2.7.1), obtained from 14 healthy subjects (6 female, 2 left-handed, age 27.2 ± 3.6 (mean ± std)) with roughly 1000 (963.1 ± 150.9, mean ± std) four-second trials of executed movements divided into 13 runs per subject. The four classes of movements were movements of either the left hand, the right hand, both feet, and rest (no movement, but the same type of visual cue as for the other classes). The training set consists of the approx. 880 trials of all runs except the last two runs; the test set consists of the approx. 160 trials of the last two runs. This dataset was acquired in an EEG lab optimized for non-invasive detection of high-frequency movement-related EEG components (Ball et al., 2008; Darvas et al., 2010).
Depending on the direction of a gray arrow that was shown on black background, the subjects had to repetitively clench their toes (downward arrow), perform sequential finger-tapping of their left (leftward arrow) or right (rightward arrow) hand, or relax (upward arrow). The movements were selected to require little proximal muscular activity while still being complex enough to keep subjects involved. Within the 4-s trials, the subjects performed the repetitive movements at their own pace, which had to be maintained as long as the arrow was showing. Per run, 80 arrows were displayed for 4 s each, with 3 to 4 s of continuous random inter-trial interval. The order of presentation was pseudo-randomized, with all four arrows being shown every four trials. Ideally 13 runs were performed to collect 260 trials of each movement and rest. The stimuli were presented and the data recorded with BCI2000 (Schalk et al., 2004). The experiment was approved by the ethical committee of the University of Freiburg.
References
[1] Schirrmeister, Robin Tibor, et al. “Deep learning with convolutional neural networks for EEG decoding and visualization.” Human Brain Mapping 38.11 (2017): 5391-5420.
- data_path(subject, path=None, force_update=False, update_path=None, verbose=None)[source]#
Get path to local copy of a subject data.
- Parameters:
subject (int) – Number of subject to use
path (None | str) – Location of where to look for the data storing location. If None, the environment variable or config parameter MNE_DATASETS_(dataset)_PATH is used. If it doesn’t exist, the “~/mne_data” directory is used. If the dataset is not found under the given path, the data will be automatically downloaded to the specified folder.
force_update (bool) – Force update of the dataset even if a local copy exists.
update_path (bool | None, deprecated) – If True, set the MNE_DATASETS_(dataset)_PATH in mne-python config to the given path. If None, the user is prompted.
verbose (bool, str, int, or None) – If not None, override default verbose level (see mne.verbose()).
- Returns:
path – Local path to the given data file. This path is contained inside a list of length one, for compatibility.
- Return type:
- doc = 'See moabb.datasets.schirrmeister2017.Schirrmeister2017\n\n Parameters\n ----------\n subject_ids: list(int) | int | None\n (list of) int of subject(s) to be fetched. If None, data of all\n subjects is fetched.\n '#
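Example (a minimal sketch of loading a single subject of the High-Gamma Dataset through the wrapper class above; the subject choice is illustrative):

>>> from braindecode.datasets import HGD
>>> dataset = HGD(subject_ids=[1])
>>> dataset.description  # pandas DataFrame describing the fetched recordings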
- class braindecode.datasets.MOABBDataset(dataset_name, subject_ids, dataset_kwargs=None)[source]#
Bases:
BaseConcatDataset
A class for moabb datasets.
- Parameters:
dataset_name (str) – name of dataset included in moabb to be fetched
subject_ids (list(int) | int | None) – (list of) int of subject(s) to be fetched. If None, data of all subjects is fetched.
dataset_kwargs (dict, optional) – optional dictionary containing keyword arguments to pass to the moabb dataset when instantiating it.
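Example (a sketch of fetching an arbitrary MOABB dataset by name with the parameters documented above; the dataset name and subject selection are illustrative):

>>> from braindecode.datasets import MOABBDataset
>>> dataset = MOABBDataset(dataset_name="BNCI2014001", subject_ids=[1, 2])
>>> dataset.description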
- class braindecode.datasets.SleepPhysionet(subject_ids=None, recording_ids=None, preload=False, load_eeg_only=True, crop_wake_mins=30, crop=None)[source]#
Bases:
BaseConcatDataset
Sleep Physionet dataset.
Sleep dataset from https://physionet.org/content/sleep-edfx/1.0.0/. Contains overnight recordings from 78 healthy subjects.
See [MNE example](https://mne.tools/stable/auto_tutorials/sample-datasets/plot_sleep.html).
- Parameters:
subject_ids (list(int) | int | None) – (list of) int of subject(s) to be loaded. If None, load all available subjects.
recording_ids (list(int) | None) – Recordings to load per subject (each subject except 13 has two recordings). Can be [1], [2] or [1, 2] (same as None).
preload (bool) – If True, preload the data of the Raw objects.
load_eeg_only (bool) – If True, only load the EEG channels and discard the others (EOG, EMG, temperature, respiration) to avoid resampling the other signals.
crop_wake_mins (float) – Number of minutes of wake time to keep before the first sleep event and after the last sleep event. Used to reduce the imbalance in this dataset. Default of 30 mins.
crop (None | tuple) – If not None, crop the raw files (e.g. to use only the first 3 h). Example: crop=(0, 3600*3) to keep only the first 3 h.
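Example (a sketch using the parameters documented above; the subject and recording selection is illustrative):

>>> from braindecode.datasets import SleepPhysionet
>>> dataset = SleepPhysionet(subject_ids=[0, 1], recording_ids=[1],
...                          load_eeg_only=True, crop_wake_mins=30)
>>> len(dataset.datasets)  # one BaseDataset per loaded recording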
- class braindecode.datasets.TUH(path, recording_ids=None, target_name=None, preload=False, add_physician_reports=False, n_jobs=1)[source]#
Bases:
BaseConcatDataset
Temple University Hospital (TUH) EEG Corpus (www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml#c_tueg).
- Parameters:
path (str) – Parent directory of the dataset.
recording_ids (list(int) | int) – A (list of) int of recording id(s) to be read (order matters and will overwrite the default chronological order; e.g. if recording_ids=[1,0], then the first recording returned by this class will be chronologically later than the second recording. Provide recording_ids in ascending order to preserve chronological order).
target_name (str) – Can be ‘gender’, or ‘age’.
preload (bool) – If True, preload the data of the Raw objects.
add_physician_reports (bool) – If True, the physician reports will be read from disk and added to the description.
n_jobs (int) – Number of jobs to be used to read files in parallel.
- class braindecode.datasets.TUHAbnormal(path, recording_ids=None, target_name='pathological', preload=False, add_physician_reports=False, n_jobs=1)[source]#
Bases:
TUH
Temple University Hospital (TUH) Abnormal EEG Corpus. see www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml#c_tuab
- Parameters:
path (str) – Parent directory of the dataset.
recording_ids (list(int) | int) – A (list of) int of recording id(s) to be read (order matters and will overwrite the default chronological order; e.g. if recording_ids=[1,0], then the first recording returned by this class will be chronologically later than the second recording. Provide recording_ids in ascending order to preserve chronological order).
target_name (str) – Can be ‘pathological’, ‘gender’, or ‘age’.
preload (bool) – If True, preload the data of the Raw objects.
add_physician_reports (bool) – If True, the physician reports will be read from disk and added to the description.
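Example (a sketch for the TUH Abnormal corpus using the parameters documented above; the corpus must be obtained separately, and the path below is hypothetical):

>>> from braindecode.datasets import TUHAbnormal
>>> dataset = TUHAbnormal(path="/data/tuh_abnormal/",  # hypothetical local copy of the corpus
...                       target_name="pathological", preload=False, n_jobs=1)
>>> dataset.description.head()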
- class braindecode.datasets.WindowsDataset(windows, description=None, transform=None, targets_from='metadata', last_target_only=True)[source]#
Bases:
BaseDataset
Returns windows from an mne.Epochs object along with a target.
Dataset which serves windows from an mne.Epochs object along with their target and additional information. The metadata attribute of the Epochs object must contain a column called target, which will be used to return the target that corresponds to a window. Additional columns i_window_in_trial, i_start_in_trial, i_stop_in_trial are also required to serve information about the windowing (e.g., useful for cropped training). See braindecode.datautil.windowers to directly create a WindowsDataset from a BaseDataset object.
- Parameters:
windows (mne.Epochs) – Windows obtained through the application of a windower to a BaseDataset (see braindecode.datautil.windowers).
description (dict | pandas.Series | None) – Holds additional info about the windows.
transform (callable | None) – On-the-fly transform applied to a window before it is returned.
targets_from (str) – Defines whether targets will be extracted from mne.Epochs metadata or mne.Epochs misc channels (time series targets). It can be metadata (default) or channels.
- property description#
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- property transform#
- braindecode.datasets.create_from_X_y(X, y, drop_last_window, sfreq, ch_names=None, window_size_samples=None, window_stride_samples=None)[source]#
Create a BaseConcatDataset of WindowsDatasets from X and y to be used for decoding with skorch and braindecode, where X is a list of pre-cut trials and y are corresponding targets.
- Parameters:
X (array-like) – list of pre-cut trials as n_trials x n_channels x n_times
y (array-like) – targets corresponding to the trials
drop_last_window (bool) – whether or not to have a last overlapping window when windows do not equally divide the continuous signal
sfreq (float) – Sampling frequency of signals.
ch_names (array-like) – Names of the channels.
window_size_samples (int) – window size
window_stride_samples (int) – stride between windows
- Returns:
windows_datasets – X and y transformed to a dataset format that is compatible with skorch and braindecode
- Return type:
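Example (a sketch with synthetic trials, following the parameter list above; shapes, channel names and sampling frequency are illustrative):

>>> import numpy as np
>>> from braindecode.datasets import create_from_X_y
>>> X = np.random.randn(10, 4, 500)          # 10 pre-cut trials, 4 channels, 500 samples
>>> y = np.random.randint(0, 2, size=10)     # one target per trial
>>> windows_dataset = create_from_X_y(
...     X, y, drop_last_window=False, sfreq=250.,
...     ch_names=["C3", "Cz", "C4", "Pz"],
...     window_size_samples=250, window_stride_samples=125)
>>> len(windows_dataset)                     # total number of windows over all trials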
- braindecode.datasets.create_from_mne_epochs(list_of_epochs, window_size_samples, window_stride_samples, drop_last_window)[source]#
Create WindowsDatasets from mne.Epochs
- Parameters:
- Returns:
windows_datasets – X and y transformed to a dataset format that is compatible with skorch and braindecode
- Return type:
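Example (a sketch assuming epochs is an existing mne.Epochs object created elsewhere; the window sizes are illustrative):

>>> from braindecode.datasets import create_from_mne_epochs
>>> windows_dataset = create_from_mne_epochs(
...     [epochs],                      # list of mne.Epochs
...     window_size_samples=200,
...     window_stride_samples=100,
...     drop_last_window=False)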
- braindecode.datasets.create_from_mne_raw(raws, trial_start_offset_samples, trial_stop_offset_samples, window_size_samples, window_stride_samples, drop_last_window, descriptions=None, mapping=None, preload=False, drop_bad_windows=True, accepted_bads_ratio=0.0)[source]#
Create WindowsDatasets from mne.RawArrays
- Parameters:
raws (array-like) – list of mne.RawArrays
trial_start_offset_samples (int) – start offset from original trial onsets in samples
trial_stop_offset_samples (int) – stop offset from original trial stop in samples
window_size_samples (int) – window size
window_stride_samples (int) – stride between windows
drop_last_window (bool) – whether or not to have a last overlapping window when windows do not equally divide the continuous signal
descriptions (array-like) – list of dicts or pandas.Series with additional information about the raws
mapping (dict(str: int)) – mapping from event description to target value
preload (bool) – if True, preload the data of the Epochs objects.
drop_bad_windows (bool) – If True, call .drop_bad() on the resulting mne.Epochs object. This step allows identifying e.g., windows that fall outside of the continuous recording. It is suggested to run this step here as otherwise the BaseConcatDataset has to be updated as well.
accepted_bads_ratio (float, optional) – Acceptable proportion of trials with inconsistent length in a raw. If the number of trials whose length is exceeded by the window size is smaller than this, then only the corresponding trials are dropped, but the computation continues. Otherwise, an error is raised. Defaults to 0.0 (raise an error).
- Returns:
windows_datasets – X and y transformed to a dataset format that is compatible with skorch and braindecode
- Return type:
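Example (a sketch assuming raws is a list of annotated mne.io.Raw objects created elsewhere; the event names in mapping are hypothetical and must match the actual annotation descriptions):

>>> from braindecode.datasets import create_from_mne_raw
>>> windows_dataset = create_from_mne_raw(
...     raws,
...     trial_start_offset_samples=0,
...     trial_stop_offset_samples=0,
...     window_size_samples=500,
...     window_stride_samples=500,
...     drop_last_window=False,
...     mapping={"left_hand": 0, "right_hand": 1})  # hypothetical event descriptions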
Submodules#
braindecode.datasets.base module#
Dataset classes.
- class braindecode.datasets.base.BaseConcatDataset(list_of_ds, target_transform=None)[source]#
Bases:
ConcatDataset
A base class for concatenated datasets. Holds either mne.Raw or mne.Epoch in self.datasets and has a pandas DataFrame with additional description.
- Parameters:
list_of_ds (list) – list of BaseDataset, BaseConcatDataset or WindowsDataset
target_transform (callable | None) – Optional function to call on targets before returning them.
- property description#
- get_metadata()[source]#
Concatenate the metadata and description of the wrapped Epochs.
- Returns:
metadata – DataFrame containing as many rows as there are windows in the BaseConcatDataset, with the metadata and description information for each window.
- Return type:
pd.DataFrame
- save(path, overwrite=False, offset=0)[source]#
Save datasets to files by creating one subdirectory for each dataset:
path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
- Parameters:
path (str) – Directory in which subdirectories are created to store -raw.fif | -epo.fif and .json files.
overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.
offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful in the setting of very large datasets, where one dataset has to be processed and saved at a time to account for its original position.
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- split(by=None, property=None, split_ids=None)[source]#
Split the dataset based on information listed in its description DataFrame or based on indices.
- Parameters:
by (str | list | dict) – If by is a string, splitting is performed based on the description DataFrame column with this name. If by is a (list of) list of integers, the position in the first list corresponds to the split id and the integers to the datapoints of that split. If a dict, then each key will be used in the returned splits dict and each value should be a list of int.
property (str) – Some property which is listed in info DataFrame.
split_ids (list | dict) – List of indices to be combined in a subset. It can be a list of int or a list of list of int.
- Returns:
splits – A dictionary with the name of the split (a string) as key and the dataset as value.
- Return type:
- property target_transform#
- property transform#
- class braindecode.datasets.base.BaseDataset(raw, description=None, target_name=None, transform=None)[source]#
Bases:
Dataset
Returns samples from an mne.io.Raw object along with a target.
Dataset which serves samples from an mne.io.Raw object along with a target. The target is unique for the dataset, and is obtained through the description attribute.
- Parameters:
raw (mne.io.Raw) – Continuous data.
description (dict | pandas.Series | None) – Holds additional description about the continuous signal / subject.
target_name (str | tuple | None) – Name(s) of the index in description that should be used to provide the target (e.g., to be used in a prediction task later on).
transform (callable | None) – On-the-fly transform applied to the example before it is returned.
- property description#
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- property transform#
- class braindecode.datasets.base.EEGWindowsDataset(raw, metadata, description=None, transform=None, targets_from='metadata', last_target_only=True)[source]#
Bases:
BaseDataset
Returns windows from an mne.Raw object, its window indices, along with a target.
Dataset which serves windows from an mne.Epochs object along with their target and additional information. The metadata attribute of the Epochs object must contain a column called target, which will be used to return the target that corresponds to a window. Additional columns i_window_in_trial, i_start_in_trial, i_stop_in_trial are also required to serve information about the windowing (e.g., useful for cropped training). See braindecode.datautil.windowers to directly create a WindowsDataset from a BaseDataset object.
- Parameters:
windows (mne.Raw or mne.Epochs (Epochs is outdated)) – Windows obtained through the application of a windower to a BaseDataset (see braindecode.datautil.windowers).
description (dict | pandas.Series | None) – Holds additional info about the windows.
transform (callable | None) – On-the-fly transform applied to a window before it is returned.
targets_from (str) – Defines whether targets will be extracted from metadata or from misc channels (time series targets). It can be metadata (default) or channels.
last_target_only (bool) – If targets are obtained from misc channels, whether all targets of the entire (compute) window will be returned or only the last target in the window.
metadata (pandas.DataFrame) – Dataframe with crop indices, so i_window_in_trial, i_start_in_trial, i_stop_in_trial as well as targets.
- property description#
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- property transform#
- class braindecode.datasets.base.WindowsDataset(windows, description=None, transform=None, targets_from='metadata', last_target_only=True)[source]#
Bases:
BaseDataset
Returns windows from an mne.Epochs object along with a target.
Dataset which serves windows from an mne.Epochs object along with their target and additional information. The metadata attribute of the Epochs object must contain a column called target, which will be used to return the target that corresponds to a window. Additional columns i_window_in_trial, i_start_in_trial, i_stop_in_trial are also required to serve information about the windowing (e.g., useful for cropped training). See braindecode.datautil.windowers to directly create a WindowsDataset from a BaseDataset object.
- Parameters:
windows (mne.Epochs) – Windows obtained through the application of a windower to a BaseDataset (see braindecode.datautil.windowers).
description (dict | pandas.Series | None) – Holds additional info about the windows.
transform (callable | None) – On-the-fly transform applied to a window before it is returned.
targets_from (str) – Defines whether targets will be extracted from mne.Epochs metadata or mne.Epochs misc channels (time series targets). It can be metadata (default) or channels.
- property description#
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- property transform#
braindecode.datasets.bbci module#
- class braindecode.datasets.bbci.BBCIDataset(filename, load_sensor_names=None, check_class_names=False)[source]#
Bases:
object
Loader class for files created by saving BBCI files in matlab (make sure to save with ‘-v7.3’ in matlab, see https://de.mathworks.com/help/matlab/import_export/mat-file-versions.html#buk6i87 )
- Parameters:
filename (str) –
load_sensor_names (list of str, optional) – Also speeds up loading if you only load some sensors. None means load all sensors.
check_class_names (bool, optional) – check if the class names are part of some known class names at Translational NeuroTechnology Lab, AG Ball, Freiburg, Germany.
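Example (a sketch; the file path and sensor selection are hypothetical, and the load() call returning an MNE Raw object is assumed from typical use of this loader rather than documented above):

>>> from braindecode.datasets.bbci import BBCIDataset
>>> loader = BBCIDataset("/data/subject1_bbci.mat",          # hypothetical '-v7.3' MATLAB file
...                      load_sensor_names=["C3", "Cz", "C4"])
>>> raw = loader.load()  # assumed loading method; returns the recording as an MNE Raw object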
braindecode.datasets.bcicomp module#
- class braindecode.datasets.bcicomp.BCICompetitionIVDataset4(subject_ids=None)[source]#
Bases:
BaseConcatDataset
BCI competition IV dataset 4.
Contains ECoG recordings for three patients moving fingers during the experiment. Targets correspond to the time courses of the flexion of each of five fingers. See http://www.bbci.de/competition/iv/desc_4.pdf and http://www.bbci.de/competition/iv/ for the dataset and competition description. ECoG library containing the dataset: https://searchworks.stanford.edu/view/zk881ps0522
Notes
When using this dataset, please cite [1].
- Parameters:
subject_ids (list(int) | int | None) – (list of) int of subject(s) to be loaded. If None, load all available subjects. Should be in range 1-3.
References
[1] Miller, Kai J. “A library of human electrocorticographic data and analyses.” Nature Human Behaviour 3, no. 11 (2019): 1225-1235. https://doi.org/10.1038/s41562-019-0678-3
- static download(path=None, force_update=False, verbose=None)[source]#
Download the dataset.
- Parameters:
path (None | str) – Location of where to look for the data storing location. If None, the environment variable or config parameter MNE_DATASETS_(dataset)_PATH is used. If it doesn’t exist, the “~/mne_data” directory is used. If the dataset is not found under the given path, the data will be automatically downloaded to the specified folder.
force_update (bool) – Force update of the dataset even if a local copy exists.
verbose (bool, str, int, or None) – If not None, override default verbose level (see mne.verbose()).
- possible_subjects = [1, 2, 3]#
braindecode.datasets.mne module#
- braindecode.datasets.mne.create_from_mne_epochs(list_of_epochs, window_size_samples, window_stride_samples, drop_last_window)[source]#
Create WindowsDatasets from mne.Epochs
- Parameters:
- Returns:
windows_datasets – X and y transformed to a dataset format that is compatible with skorch and braindecode
- Return type:
- braindecode.datasets.mne.create_from_mne_raw(raws, trial_start_offset_samples, trial_stop_offset_samples, window_size_samples, window_stride_samples, drop_last_window, descriptions=None, mapping=None, preload=False, drop_bad_windows=True, accepted_bads_ratio=0.0)[source]#
Create WindowsDatasets from mne.RawArrays
- Parameters:
raws (array-like) – list of mne.RawArrays
trial_start_offset_samples (int) – start offset from original trial onsets in samples
trial_stop_offset_samples (int) – stop offset from original trial stop in samples
window_size_samples (int) – window size
window_stride_samples (int) – stride between windows
drop_last_window (bool) – whether or not to have a last overlapping window when windows do not equally divide the continuous signal
descriptions (array-like) – list of dicts or pandas.Series with additional information about the raws
mapping (dict(str: int)) – mapping from event description to target value
preload (bool) – if True, preload the data of the Epochs objects.
drop_bad_windows (bool) – If True, call .drop_bad() on the resulting mne.Epochs object. This step allows identifying e.g., windows that fall outside of the continuous recording. It is suggested to run this step here as otherwise the BaseConcatDataset has to be updated as well.
accepted_bads_ratio (float, optional) – Acceptable proportion of trials with inconsistent length in a raw. If the number of trials whose length is exceeded by the window size is smaller than this, then only the corresponding trials are dropped, but the computation continues. Otherwise, an error is raised. Defaults to 0.0 (raise an error).
- Returns:
windows_datasets – X and y transformed to a dataset format that is compatible with skorch and braindecode
- Return type:
braindecode.datasets.moabb module#
Dataset objects for some public datasets.
- class braindecode.datasets.moabb.BNCI2014001(subject_ids)[source]#
Bases:
MOABBDataset
BNCI 2014-001 Motor Imagery dataset.
Dataset summary
Dataset IIa from BCI Competition 4 [1].
Dataset Description
This data set consists of EEG data from 9 subjects. The cue-based BCI paradigm consisted of four different motor imagery tasks, namely the imagination of movement of the left hand (class 1), right hand (class 2), both feet (class 3), and tongue (class 4). Two sessions on different days were recorded for each subject. Each session comprises 6 runs separated by short breaks. One run consists of 48 trials (12 for each of the four possible classes), yielding a total of 288 trials per session.
The subjects were sitting in a comfortable armchair in front of a computer screen. At the beginning of a trial (t = 0 s), a fixation cross appeared on the black screen. In addition, a short acoustic warning tone was presented. After two seconds (t = 2 s), a cue in the form of an arrow pointing either to the left, right, down or up (corresponding to one of the four classes left hand, right hand, foot or tongue) appeared and stayed on the screen for 1.25 s. This prompted the subjects to perform the desired motor imagery task. No feedback was provided. The subjects were asked to carry out the motor imagery task until the fixation cross disappeared from the screen at t = 6 s.
Twenty-two Ag/AgCl electrodes (with inter-electrode distances of 3.5 cm) were used to record the EEG; the montage is shown in Figure 3 left. All signals were recorded monopolarly with the left mastoid serving as reference and the right mastoid as ground. The signals were sampled at 250 Hz and bandpass-filtered between 0.5 Hz and 100 Hz. The sensitivity of the amplifier was set to 100 μV. An additional 50 Hz notch filter was enabled to suppress line noise.
- Parameters:
- subject_ids: list(int) | int | None
(list of) int of subject(s) to be fetched. If None, data of all subjects is fetched.
See moabb.datasets.bnci.BNCI2014001
- class BNCI2014001(*args, **kwargs)[source]#
Bases:
BNCI2014_001
BNCI 2014-001 Motor Imagery dataset.
Dataset summary
Dataset IIa from BCI Competition 4 [1].
Dataset Description
This data set consists of EEG data from 9 subjects. The cue-based BCI paradigm consisted of four different motor imagery tasks, namely the imagination of movement of the left hand (class 1), right hand (class 2), both feet (class 3), and tongue (class 4). Two sessions on different days were recorded for each subject. Each session comprises 6 runs separated by short breaks. One run consists of 48 trials (12 for each of the four possible classes), yielding a total of 288 trials per session.
The subjects were sitting in a comfortable armchair in front of a computer screen. At the beginning of a trial (t = 0 s), a fixation cross appeared on the black screen. In addition, a short acoustic warning tone was presented. After two seconds (t = 2 s), a cue in the form of an arrow pointing either to the left, right, down or up (corresponding to one of the four classes left hand, right hand, foot or tongue) appeared and stayed on the screen for 1.25 s. This prompted the subjects to perform the desired motor imagery task. No feedback was provided. The subjects were asked to carry out the motor imagery task until the fixation cross disappeared from the screen at t = 6 s.
Twenty-two Ag/AgCl electrodes (with inter-electrode distances of 3.5 cm) were used to record the EEG; the montage is shown in Figure 3 left. All signals were recorded monopolarly with the left mastoid serving as reference and the right mastoid as ground. The signals were sampled at 250 Hz and bandpass-filtered between 0.5 Hz and 100 Hz. The sensitivity of the amplifier was set to 100 μV. An additional 50 Hz notch filter was enabled to suppress line noise.
References
[1] Tangermann, M., Müller, K.R., Aertsen, A., Birbaumer, N., Braun, C., Brunner, C., Leeb, R., Mehring, C., Miller, K.J., Mueller-Putz, G. and Nolte, G., 2012. Review of the BCI competition IV. Frontiers in Neuroscience, 6, p.55.
- doc = 'See moabb.datasets.bnci.BNCI2014001\n\n Parameters\n ----------\n subject_ids: list(int) | int | None\n (list of) int of subject(s) to be fetched. If None, data of all\n subjects is fetched.\n '#
- class braindecode.datasets.moabb.HGD(subject_ids)[source]#
Bases:
MOABBDataset
High-gamma dataset described in Schirrmeister et al. 2017.
Dataset summary
Name               #Subj   #Chan   #Classes   #Trials / class   Trials len   Sampling rate   #Sessions
Schirrmeister2017  14      128     4          120               4 s          500 Hz          1
Dataset from [1]
Our “High-Gamma Dataset” is a 128-electrode dataset (of which we later only use 44 sensors covering the motor cortex, see Section 2.7.1), obtained from 14 healthy subjects (6 female, 2 left-handed, age 27.2 ± 3.6 (mean ± std)) with roughly 1000 (963.1 ± 150.9, mean ± std) four-second trials of executed movements divided into 13 runs per subject. The four classes of movements were movements of either the left hand, the right hand, both feet, and rest (no movement, but the same type of visual cue as for the other classes). The training set consists of the approx. 880 trials of all runs except the last two runs; the test set consists of the approx. 160 trials of the last two runs. This dataset was acquired in an EEG lab optimized for non-invasive detection of high-frequency movement-related EEG components (Ball et al., 2008; Darvas et al., 2010).
Depending on the direction of a gray arrow that was shown on black background, the subjects had to repetitively clench their toes (downward arrow), perform sequential finger-tapping of their left (leftward arrow) or right (rightward arrow) hand, or relax (upward arrow). The movements were selected to require little proximal muscular activity while still being complex enough to keep subjects involved. Within the 4-s trials, the subjects performed the repetitive movements at their own pace, which had to be maintained as long as the arrow was showing. Per run, 80 arrows were displayed for 4 s each, with 3 to 4 s of continuous random inter-trial interval. The order of presentation was pseudo-randomized, with all four arrows being shown every four trials. Ideally 13 runs were performed to collect 260 trials of each movement and rest. The stimuli were presented and the data recorded with BCI2000 (Schalk et al., 2004). The experiment was approved by the ethical committee of the University of Freiburg.
- Parameters:
- subject_ids: list(int) | int | None
(list of) int of subject(s) to be fetched. If None, data of all subjects is fetched.
See moabb.datasets.schirrmeister2017.Schirrmeister2017
- class Schirrmeister2017[source]#
Bases:
BaseDataset
High-gamma dataset described in Schirrmeister et al. 2017.
Dataset summary
Name               #Subj   #Chan   #Classes   #Trials / class   Trials len   Sampling rate   #Sessions
Schirrmeister2017  14      128     4          120               4 s          500 Hz          1
Dataset from [1]
Our “High-Gamma Dataset” is a 128-electrode dataset (of which we later only use 44 sensors covering the motor cortex, see Section 2.7.1), obtained from 14 healthy subjects (6 female, 2 left-handed, age 27.2 ± 3.6 (mean ± std)) with roughly 1000 (963.1 ± 150.9, mean ± std) four-second trials of executed movements divided into 13 runs per subject. The four classes of movements were movements of either the left hand, the right hand, both feet, and rest (no movement, but the same type of visual cue as for the other classes). The training set consists of the approx. 880 trials of all runs except the last two runs; the test set consists of the approx. 160 trials of the last two runs. This dataset was acquired in an EEG lab optimized for non-invasive detection of high-frequency movement-related EEG components (Ball et al., 2008; Darvas et al., 2010).
Depending on the direction of a gray arrow that was shown on black background, the subjects had to repetitively clench their toes (downward arrow), perform sequential finger-tapping of their left (leftward arrow) or right (rightward arrow) hand, or relax (upward arrow). The movements were selected to require little proximal muscular activity while still being complex enough to keep subjects involved. Within the 4-s trials, the subjects performed the repetitive movements at their own pace, which had to be maintained as long as the arrow was showing. Per run, 80 arrows were displayed for 4 s each, with 3 to 4 s of continuous random inter-trial interval. The order of presentation was pseudo-randomized, with all four arrows being shown every four trials. Ideally 13 runs were performed to collect 260 trials of each movement and rest. The stimuli were presented and the data recorded with BCI2000 (Schalk et al., 2004). The experiment was approved by the ethical committee of the University of Freiburg.
References
[1] Schirrmeister, Robin Tibor, et al. “Deep learning with convolutional neural networks for EEG decoding and visualization.” Human Brain Mapping 38.11 (2017): 5391-5420.
- data_path(subject, path=None, force_update=False, update_path=None, verbose=None)[source]#
Get path to local copy of a subject data.
- Parameters:
subject (int) – Number of subject to use
path (None | str) – Location of where to look for the data storing location. If None, the environment variable or config parameter MNE_DATASETS_(dataset)_PATH is used. If it doesn’t exist, the “~/mne_data” directory is used. If the dataset is not found under the given path, the data will be automatically downloaded to the specified folder.
force_update (bool) – Force update of the dataset even if a local copy exists.
update_path (bool | None, deprecated) – If True, set the MNE_DATASETS_(dataset)_PATH in mne-python config to the given path. If None, the user is prompted.
verbose (bool, str, int, or None) – If not None, override default verbose level (see mne.verbose()).
- Returns:
path – Local path to the given data file. This path is contained inside a list of length one, for compatibility.
- Return type:
- doc = 'See moabb.datasets.schirrmeister2017.Schirrmeister2017\n\n Parameters\n ----------\n subject_ids: list(int) | int | None\n (list of) int of subject(s) to be fetched. If None, data of all\n subjects is fetched.\n '#
- class braindecode.datasets.moabb.MOABBDataset(dataset_name, subject_ids, dataset_kwargs=None)[source]#
Bases:
BaseConcatDataset
A class for moabb datasets.
- Parameters:
dataset_name (str) – name of dataset included in moabb to be fetched
subject_ids (list(int) | int | None) – (list of) int of subject(s) to be fetched. If None, data of all subjects is fetched.
dataset_kwargs (dict, optional) – optional dictionary containing keyword arguments to pass to the moabb dataset when instantiating it.
braindecode.datasets.sleep_physionet module#
- class braindecode.datasets.sleep_physionet.SleepPhysionet(subject_ids=None, recording_ids=None, preload=False, load_eeg_only=True, crop_wake_mins=30, crop=None)[source]#
Bases:
BaseConcatDataset
Sleep Physionet dataset.
Sleep dataset from https://physionet.org/content/sleep-edfx/1.0.0/. Contains overnight recordings from 78 healthy subjects.
See [MNE example](https://mne.tools/stable/auto_tutorials/sample-datasets/plot_sleep.html).
- Parameters:
subject_ids (list(int) | int | None) – (list of) int of subject(s) to be loaded. If None, load all available subjects.
recording_ids (list(int) | None) – Recordings to load per subject (each subject except 13 has two recordings). Can be [1], [2] or [1, 2] (same as None).
preload (bool) – If True, preload the data of the Raw objects.
load_eeg_only (bool) – If True, only load the EEG channels and discard the others (EOG, EMG, temperature, respiration) to avoid resampling the other signals.
crop_wake_mins (float) – Number of minutes of wake time to keep before the first sleep event and after the last sleep event. Used to reduce the imbalance in this dataset. Default of 30 mins.
crop (None | tuple) – If not None, crop the raw files (e.g. to use only the first 3 h). Example: crop=(0, 3600*3) to keep only the first 3 h.
braindecode.datasets.tuh module#
Dataset classes for the Temple University Hospital (TUH) EEG Corpus and the TUH Abnormal EEG Corpus.
- class braindecode.datasets.tuh.TUH(path, recording_ids=None, target_name=None, preload=False, add_physician_reports=False, n_jobs=1)[source]#
Bases:
BaseConcatDataset
Temple University Hospital (TUH) EEG Corpus (www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml#c_tueg).
- Parameters:
path (str) – Parent directory of the dataset.
recording_ids (list(int) | int) – A (list of) int of recording id(s) to be read (order matters and will overwrite the default chronological order; e.g. if recording_ids=[1,0], then the first recording returned by this class will be chronologically later than the second recording. Provide recording_ids in ascending order to preserve chronological order).
target_name (str) – Can be ‘gender’, or ‘age’.
preload (bool) – If True, preload the data of the Raw objects.
add_physician_reports (bool) – If True, the physician reports will be read from disk and added to the description.
n_jobs (int) – Number of jobs to be used to read files in parallel.
- class braindecode.datasets.tuh.TUHAbnormal(path, recording_ids=None, target_name='pathological', preload=False, add_physician_reports=False, n_jobs=1)[source]#
Bases:
TUH
Temple University Hospital (TUH) Abnormal EEG Corpus. see www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml#c_tuab
- Parameters:
path (str) – Parent directory of the dataset.
recording_ids (list(int) | int) – A (list of) int of recording id(s) to be read (order matters and will overwrite the default chronological order; e.g. if recording_ids=[1,0], then the first recording returned by this class will be chronologically later than the second recording. Provide recording_ids in ascending order to preserve chronological order).
target_name (str) – Can be ‘pathological’, ‘gender’, or ‘age’.
preload (bool) – If True, preload the data of the Raw objects.
add_physician_reports (bool) – If True, the physician reports will be read from disk and added to the description.
braindecode.datasets.xy module#
- braindecode.datasets.xy.create_from_X_y(X, y, drop_last_window, sfreq, ch_names=None, window_size_samples=None, window_stride_samples=None)[source]#
Create a BaseConcatDataset of WindowsDatasets from X and y to be used for decoding with skorch and braindecode, where X is a list of pre-cut trials and y are corresponding targets.
- Parameters:
X (array-like) – list of pre-cut trials as n_trials x n_channels x n_times
y (array-like) – targets corresponding to the trials
drop_last_window (bool) – whether or not to have a last overlapping window when windows do not equally divide the continuous signal
sfreq (float) – Sampling frequency of signals.
ch_names (array-like) – Names of the channels.
window_size_samples (int) – window size
window_stride_samples (int) – stride between windows
- Returns:
windows_datasets – X and y transformed to a dataset format that is compatible with skorch and braindecode
- Return type: