braindecode.datasets package#
Loader code for some datasets.
- class braindecode.datasets.BCICompetitionIVDataset4(subject_ids=None)[source]#
Bases:
BaseConcatDataset
BCI competition IV dataset 4.
Contains ECoG recordings for three patients moving fingers during the experiment. Targets correspond to the time courses of the flexion of each of five fingers. See http://www.bbci.de/competition/iv/desc_4.pdf and http://www.bbci.de/competition/iv/ for the dataset and competition description. ECoG library containing the dataset: https://searchworks.stanford.edu/view/zk881ps0522
Notes
When using this dataset, please cite [1].
- Parameters:
subject_ids (list(int) | int | None) – (list of) int of subject(s) to be loaded. If None, load all available subjects. Should be in range 1-3.
References
[1] Miller, Kai J. “A library of human electrocorticographic data and analyses.” Nature Human Behaviour 3, no. 11 (2019): 1225-1235. https://doi.org/10.1038/s41562-019-0678-3
- static download(path=None, force_update=False, verbose=None)[source]#
Download the dataset.
- Parameters:
path (None | str) – Location of where to look for the data storing location. If None, the environment variable or config parameter MNE_DATASETS_(dataset)_PATH is used. If it doesn’t exist, the “~/mne_data” directory is used. If the dataset is not found under the given path, the data will be automatically downloaded to the specified folder.
force_update (bool) – Force update of the dataset even if a local copy exists.
verbose (bool, str, int, or None) – If not None, override default verbose level (see mne.verbose()).
- possible_subjects = [1, 2, 3]#
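Example (a minimal sketch based on the signature above; the subject choice is only an illustration and the files are stored under the MNE default location described in download()):

>>> from braindecode.datasets import BCICompetitionIVDataset4
>>> BCICompetitionIVDataset4.download()  # fetch the files to the default MNE data folder
>>> dataset = BCICompetitionIVDataset4(subject_ids=[1])  # any of subjects 1-3
>>> dataset.description  # pandas DataFrame describing the concatenated recordings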
- class braindecode.datasets.BNCI2014001(subject_ids)[source]#
Bases:
MOABBDataset
BNCI 2014-001 Motor Imagery dataset.
Dataset summary
Dataset IIa from BCI Competition 4 [1].
Dataset Description
This data set consists of EEG data from 9 subjects. The cue-based BCI paradigm consisted of four different motor imagery tasks, namely the imagination of movement of the left hand (class 1), right hand (class 2), both feet (class 3), and tongue (class 4). Two sessions on different days were recorded for each subject. Each session comprises 6 runs separated by short breaks. One run consists of 48 trials (12 for each of the four possible classes), yielding a total of 288 trials per session.
The subjects were sitting in a comfortable armchair in front of a computer screen. At the beginning of a trial (t = 0 s), a fixation cross appeared on the black screen. In addition, a short acoustic warning tone was presented. After two seconds (t = 2 s), a cue in the form of an arrow pointing either to the left, right, down or up (corresponding to one of the four classes left hand, right hand, foot or tongue) appeared and stayed on the screen for 1.25 s. This prompted the subjects to perform the desired motor imagery task. No feedback was provided. The subjects were asked to carry out the motor imagery task until the fixation cross disappeared from the screen at t = 6 s.
Twenty-two Ag/AgCl electrodes (with inter-electrode distances of 3.5 cm) were used to record the EEG; the montage is shown in Figure 3 left. All signals were recorded monopolarly with the left mastoid serving as reference and the right mastoid as ground. The signals were sampled at 250 Hz and bandpass-filtered between 0.5 Hz and 100 Hz. The sensitivity of the amplifier was set to 100 μV. An additional 50 Hz notch filter was enabled to suppress line noise.
- Parameters:
- subject_ids: list(int) | int | None
(list of) int of subject(s) to be fetched. If None, data of all subjects is fetched.
See moabb.datasets.bnci.BNCI2014001
- class BNCI2014001(*args, **kwargs)[source]#
Bases:
BNCI2014_001
BNCI 2014-001 Motor Imagery dataset.
Dataset summary
Dataset IIa from BCI Competition 4 [1].
Dataset Description
This data set consists of EEG data from 9 subjects. The cue-based BCI paradigm consisted of four different motor imagery tasks, namely the imagination of movement of the left hand (class 1), right hand (class 2), both feet (class 3), and tongue (class 4). Two sessions on different days were recorded for each subject. Each session comprises 6 runs separated by short breaks. One run consists of 48 trials (12 for each of the four possible classes), yielding a total of 288 trials per session.
The subjects were sitting in a comfortable armchair in front of a computer screen. At the beginning of a trial (t = 0 s), a fixation cross appeared on the black screen. In addition, a short acoustic warning tone was presented. After two seconds (t = 2 s), a cue in the form of an arrow pointing either to the left, right, down or up (corresponding to one of the four classes left hand, right hand, foot or tongue) appeared and stayed on the screen for 1.25 s. This prompted the subjects to perform the desired motor imagery task. No feedback was provided. The subjects were asked to carry out the motor imagery task until the fixation cross disappeared from the screen at t = 6 s.
Twenty-two Ag/AgCl electrodes (with inter-electrode distances of 3.5 cm) were used to record the EEG; the montage is shown in Figure 3 left. All signals were recorded monopolarly with the left mastoid serving as reference and the right mastoid as ground. The signals were sampled at 250 Hz and bandpass-filtered between 0.5 Hz and 100 Hz. The sensitivity of the amplifier was set to 100 μV. An additional 50 Hz notch filter was enabled to suppress line noise.
References
[1] Tangermann, M., Müller, K.R., Aertsen, A., Birbaumer, N., Braun, C., Brunner, C., Leeb, R., Mehring, C., Miller, K.J., Mueller-Putz, G. and Nolte, G., 2012. Review of the BCI competition IV. Frontiers in Neuroscience, 6, p.55.
- doc = 'See moabb.datasets.bnci.BNCI2014001\n\n Parameters\n ----------\n subject_ids: list(int) | int | None\n (list of) int of subject(s) to be fetched. If None, data of all\n subjects is fetched.\n '#
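Example (a minimal sketch of fetching this dataset through the braindecode wrapper; the subject choice is illustrative):

>>> from braindecode.datasets import BNCI2014001
>>> dataset = BNCI2014001(subject_ids=[3])
>>> len(dataset.datasets)        # one BaseDataset per continuous recording of subject 3
>>> dataset.description.head()   # metadata of the fetched recordings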
- class braindecode.datasets.BaseConcatDataset(list_of_ds, target_transform=None)[source]#
Bases:
ConcatDataset
A base class for concatenated datasets. Holds either mne.Raw or mne.Epoch in self.datasets and has a pandas DataFrame with additional description.
- Parameters:
list_of_ds (list) – list of BaseDataset, BaseConcatDataset or WindowsDataset
target_transform (callable | None) – Optional function to call on targets before returning them.
- property description#
- get_metadata()[source]#
Concatenate the metadata and description of the wrapped Epochs.
- Returns:
metadata – DataFrame containing as many rows as there are windows in the BaseConcatDataset, with the metadata and description information for each window.
- Return type:
pd.DataFrame
- save(path, overwrite=False, offset=0)[source]#
Save datasets to files by creating one subdirectory for each dataset:
path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
- Parameters:
path (str) – Directory in which subdirectories are created to store -raw.fif | -epo.fif and .json files.
overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.
offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful in the setting of very large datasets, where one dataset has to be processed and saved at a time to account for its original position.
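Example (a sketch of saving a concatenated dataset, assuming dataset is a BaseConcatDataset as in the examples above; the output directory name is hypothetical):

>>> import os
>>> save_dir = "./saved_dataset"          # hypothetical output directory
>>> os.makedirs(save_dir, exist_ok=True)
>>> dataset.save(path=save_dir, overwrite=True)
>>> sorted(os.listdir(save_dir))          # one numbered subdirectory per dataset, e.g. ['0', '1', ...]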
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- split(by=None, property=None, split_ids=None)[source]#
Split the dataset based on information listed in its description DataFrame or based on indices.
- Parameters:
by (str | list | dict) – If by is a string, splitting is performed based on the description DataFrame column with this name. If by is a (list of) list of integers, the position in the first list corresponds to the split id and the integers to the datapoints of that split. If a dict, then each key will be used in the returned splits dict and each value should be a list of int.
property (str) – Some property which is listed in info DataFrame.
split_ids (list | dict) – List of indices to be combined in a subset. It can be a list of int or a list of list of int.
- Returns:
splits – A dictionary with the name of the split (a string) as key and the dataset as value.
- Return type:
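Example (a sketch of the splitting modes documented above, assuming dataset is a BaseConcatDataset; the "subject" column name is an assumption and depends on the concrete dataset's description):

>>> by_subject = dataset.split(by="subject")              # one split per unique column value
>>> positional = dataset.split(by=[[0, 1], [2]])          # splits '0' and '1' by dataset position
>>> named = dataset.split(by={"train": [0, 1], "valid": [2]})
>>> sorted(named.keys())
['train', 'valid']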
- property target_transform#
- property transform#
- class braindecode.datasets.BaseDataset(raw, description=None, target_name=None, transform=None)[source]#
Bases:
Dataset
Returns samples from an mne.io.Raw object along with a target.
Dataset which serves samples from an mne.io.Raw object along with a target. The target is unique for the dataset, and is obtained through the description attribute.
- Parameters:
raw (mne.io.Raw) – Continuous data.
description (dict | pandas.Series | None) – Holds additional description about the continuous signal / subject.
target_name (str | tuple | None) – Name(s) of the index in description that should be used to provide the target (e.g., to be used in a prediction task later on).
transform (callable | None) – On-the-fly transform applied to the example before it is returned.
- property description#
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- property transform#
- class braindecode.datasets.HGD(subject_ids)[source]#
Bases:
MOABBDataset
High-gamma dataset described in Schirrmeister et al. 2017.
Dataset summary
Name               #Subj   #Chan   #Classes   #Trials / class   Trials len   Sampling rate   #Sessions
Schirrmeister2017  14      128     4          120               4 s          500 Hz          1
Dataset from [1]
Our “High-Gamma Dataset” is a 128-electrode dataset (of which we later only use 44 sensors covering the motor cortex, see Section 2.7.1), obtained from 14 healthy subjects (6 female, 2 left-handed, age 27.2 ± 3.6 (mean ± std)) with roughly 1000 (963.1 ± 150.9, mean ± std) four-second trials of executed movements divided into 13 runs per subject. The four classes of movements were movements of either the left hand, the right hand, both feet, and rest (no movement, but the same type of visual cue as for the other classes). The training set consists of the approx. 880 trials of all runs except the last two runs; the test set consists of the approx. 160 trials of the last two runs. This dataset was acquired in an EEG lab optimized for non-invasive detection of high-frequency movement-related EEG components (Ball et al., 2008; Darvas et al., 2010).
Depending on the direction of a gray arrow that was shown on black background, the subjects had to repetitively clench their toes (downward arrow), perform sequential finger-tapping of their left (leftward arrow) or right (rightward arrow) hand, or relax (upward arrow). The movements were selected to require little proximal muscular activity while still being complex enough to keep subjects involved. Within the 4-s trials, the subjects performed the repetitive movements at their own pace, which had to be maintained as long as the arrow was showing. Per run, 80 arrows were displayed for 4 s each, with 3 to 4 s of continuous random inter-trial interval. The order of presentation was pseudo-randomized, with all four arrows being shown every four trials. Ideally 13 runs were performed to collect 260 trials of each movement and rest. The stimuli were presented and the data recorded with BCI2000 (Schalk et al., 2004). The experiment was approved by the ethical committee of the University of Freiburg.
- Parameters:
- subject_ids: list(int) | int | None
(list of) int of subject(s) to be fetched. If None, data of all subjects is fetched.
See moabb.datasets.schirrmeister2017.Schirrmeister2017
- class Schirrmeister2017[source]#
Bases:
BaseDataset
High-gamma dataset described in Schirrmeister et al. 2017.
Dataset summary
Name               #Subj   #Chan   #Classes   #Trials / class   Trials len   Sampling rate   #Sessions
Schirrmeister2017  14      128     4          120               4 s          500 Hz          1
Dataset from [1]
Our “High-Gamma Dataset” is a 128-electrode dataset (of which we later only use 44 sensors covering the motor cortex, see Section 2.7.1), obtained from 14 healthy subjects (6 female, 2 left-handed, age 27.2 ± 3.6 (mean ± std)) with roughly 1000 (963.1 ± 150.9, mean ± std) four-second trials of executed movements divided into 13 runs per subject. The four classes of movements were movements of either the left hand, the right hand, both feet, and rest (no movement, but the same type of visual cue as for the other classes). The training set consists of the approx. 880 trials of all runs except the last two runs; the test set consists of the approx. 160 trials of the last two runs. This dataset was acquired in an EEG lab optimized for non-invasive detection of high-frequency movement-related EEG components (Ball et al., 2008; Darvas et al., 2010).
Depending on the direction of a gray arrow that was shown on black background, the subjects had to repetitively clench their toes (downward arrow), perform sequential finger-tapping of their left (leftward arrow) or right (rightward arrow) hand, or relax (upward arrow). The movements were selected to require little proximal muscular activity while still being complex enough to keep subjects involved. Within the 4-s trials, the subjects performed the repetitive movements at their own pace, which had to be maintained as long as the arrow was showing. Per run, 80 arrows were displayed for 4 s each, with 3 to 4 s of continuous random inter-trial interval. The order of presentation was pseudo-randomized, with all four arrows being shown every four trials. Ideally 13 runs were performed to collect 260 trials of each movement and rest. The stimuli were presented and the data recorded with BCI2000 (Schalk et al., 2004). The experiment was approved by the ethical committee of the University of Freiburg.
References
[1] Schirrmeister, Robin Tibor, et al. “Deep learning with convolutional neural networks for EEG decoding and visualization.” Human Brain Mapping 38.11 (2017): 5391-5420.
- data_path(subject, path=None, force_update=False, update_path=None, verbose=None)[source]#
Get path to local copy of a subject data.
- Parameters:
subject (int) – Number of subject to use
path (None | str) – Location of where to look for the data storing location. If None, the environment variable or config parameter MNE_DATASETS_(dataset)_PATH is used. If it doesn’t exist, the “~/mne_data” directory is used. If the dataset is not found under the given path, the data will be automatically downloaded to the specified folder.
force_update (bool) – Force update of the dataset even if a local copy exists.
update_path (bool | None, deprecated) – If True, set the MNE_DATASETS_(dataset)_PATH in mne-python config to the given path. If None, the user is prompted.
verbose (bool, str, int, or None) – If not None, override default verbose level (see mne.verbose()).
- Returns:
path – Local path to the given data file. This path is contained inside a list of length one, for compatibility.
- Return type:
- doc = 'See moabb.datasets.schirrmeister2017.Schirrmeister2017\n\n Parameters\n ----------\n subject_ids: list(int) | int | None\n (list of) int of subject(s) to be fetched. If None, data of all\n subjects is fetched.\n '#
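Example (a minimal sketch of loading a single subject of the High-Gamma Dataset through the wrapper class above; the subject choice is illustrative):

>>> from braindecode.datasets import HGD
>>> dataset = HGD(subject_ids=[1])
>>> dataset.description  # pandas DataFrame describing the fetched recordings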
- class braindecode.datasets.MOABBDataset(dataset_name, subject_ids, dataset_kwargs=None)[source]#
Bases:
BaseConcatDataset
A class for moabb datasets.
- Parameters:
dataset_name (str) – name of dataset included in moabb to be fetched
subject_ids (list(int) | int | None) – (list of) int of subject(s) to be fetched. If None, data of all subjects is fetched.
dataset_kwargs (dict, optional) – optional dictionary containing keyword arguments to pass to the moabb dataset when instantiating it.
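Example (a sketch of fetching an arbitrary MOABB dataset by name with the parameters documented above; the dataset name and subject selection are illustrative):

>>> from braindecode.datasets import MOABBDataset
>>> dataset = MOABBDataset(dataset_name="BNCI2014001", subject_ids=[1, 2])
>>> dataset.description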
- class braindecode.datasets.SleepPhysionet(subject_ids=None, recording_ids=None, preload=False, load_eeg_only=True, crop_wake_mins=30, crop=None)[source]#
Bases:
BaseConcatDataset
Sleep Physionet dataset.
Sleep dataset from https://physionet.org/content/sleep-edfx/1.0.0/. Contains overnight recordings from 78 healthy subjects.
See [MNE example](https://mne.tools/stable/auto_tutorials/sample-datasets/plot_sleep.html).
- Parameters:
subject_ids (list(int) | int | None) – (list of) int of subject(s) to be loaded. If None, load all available subjects.
recording_ids (list(int) | None) – Recordings to load per subject (each subject except 13 has two recordings). Can be [1], [2] or [1, 2] (same as None).
preload (bool) – If True, preload the data of the Raw objects.
load_eeg_only (bool) – If True, only load the EEG channels and discard the others (EOG, EMG, temperature, respiration) to avoid resampling the other signals.
crop_wake_mins (float) – Number of minutes of wake time to keep before the first sleep event and after the last sleep event. Used to reduce the imbalance in this dataset. Default of 30 mins.
crop (None | tuple) – If not None, crop the raw files (e.g. to use only the first 3 h). Example: crop=(0, 3600*3) to keep only the first 3 h.
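Example (a sketch using the parameters documented above; the subject and recording selection is illustrative):

>>> from braindecode.datasets import SleepPhysionet
>>> dataset = SleepPhysionet(subject_ids=[0, 1], recording_ids=[1],
...                          load_eeg_only=True, crop_wake_mins=30)
>>> len(dataset.datasets)  # one BaseDataset per loaded recording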
- class braindecode.datasets.TUH(path, recording_ids=None, target_name=None, preload=False, add_physician_reports=False, n_jobs=1)[source]#
Bases:
BaseConcatDataset
Temple University Hospital (TUH) EEG Corpus (www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml#c_tueg).
- Parameters:
path (str) – Parent directory of the dataset.
recording_ids (list(int) | int) – A (list of) int of recording id(s) to be read (order matters and will overwrite the default chronological order; e.g. if recording_ids=[1,0], then the first recording returned by this class will be chronologically later than the second recording. Provide recording_ids in ascending order to preserve chronological order).
target_name (str) – Can be ‘gender’, or ‘age’.
preload (bool) – If True, preload the data of the Raw objects.
add_physician_reports (bool) – If True, the physician reports will be read from disk and added to the description.
n_jobs (int) – Number of jobs to be used to read files in parallel.
- class braindecode.datasets.TUHAbnormal(path, recording_ids=None, target_name='pathological', preload=False, add_physician_reports=False, n_jobs=1)[source]#
Bases:
TUH
Temple University Hospital (TUH) Abnormal EEG Corpus. see www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml#c_tuab
- Parameters:
path (str) – Parent directory of the dataset.
recording_ids (list(int) | int) – A (list of) int of recording id(s) to be read (order matters and will overwrite the default chronological order; e.g. if recording_ids=[1,0], then the first recording returned by this class will be chronologically later than the second recording. Provide recording_ids in ascending order to preserve chronological order).
target_name (str) – Can be ‘pathological’, ‘gender’, or ‘age’.
preload (bool) – If True, preload the data of the Raw objects.
add_physician_reports (bool) – If True, the physician reports will be read from disk and added to the description.
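Example (a sketch for the TUH Abnormal corpus using the parameters documented above; the corpus must be obtained separately, and the path below is hypothetical):

>>> from braindecode.datasets import TUHAbnormal
>>> dataset = TUHAbnormal(path="/data/tuh_abnormal/",  # hypothetical local copy of the corpus
...                       target_name="pathological", preload=False, n_jobs=1)
>>> dataset.description.head()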
- class braindecode.datasets.WindowsDataset(windows, description=None, transform=None, targets_from='metadata', last_target_only=True)[source]#
Bases:
BaseDataset
Returns windows from an mne.Epochs object along with a target.
Dataset which serves windows from an mne.Epochs object along with their target and additional information. The metadata attribute of the Epochs object must contain a column called target, which will be used to return the target that corresponds to a window. Additional columns i_window_in_trial, i_start_in_trial, i_stop_in_trial are also required to serve information about the windowing (e.g., useful for cropped training). See braindecode.datautil.windowers to directly create a WindowsDataset from a BaseDataset object.
- Parameters:
windows (mne.Epochs) – Windows obtained through the application of a windower to a BaseDataset (see braindecode.datautil.windowers).
description (dict | pandas.Series | None) – Holds additional info about the windows.
transform (callable | None) – On-the-fly transform applied to a window before it is returned.
targets_from (str) – Defines whether targets will be extracted from mne.Epochs metadata or mne.Epochs misc channels (time series targets). It can be metadata (default) or channels.
- property description#
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- property transform#
- braindecode.datasets.create_from_X_y(X, y, drop_last_window, sfreq, ch_names=None, window_size_samples=None, window_stride_samples=None)[source]#
Create a BaseConcatDataset of WindowsDatasets from X and y to be used for decoding with skorch and braindecode, where X is a list of pre-cut trials and y are corresponding targets.
- Parameters:
X (array-like) – list of pre-cut trials as n_trials x n_channels x n_times
y (array-like) – targets corresponding to the trials
drop_last_window (bool) – whether or not to have a last overlapping window when windows do not equally divide the continuous signal
sfreq (float) – Sampling frequency of signals.
ch_names (array-like) – Names of the channels.
window_size_samples (int) – window size
window_stride_samples (int) – stride between windows
- Returns:
windows_datasets – X and y transformed to a dataset format that is compatible with skorch and braindecode
- Return type:
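Example (a sketch with synthetic trials, following the parameter list above; shapes, channel names and sampling frequency are illustrative):

>>> import numpy as np
>>> from braindecode.datasets import create_from_X_y
>>> X = np.random.randn(10, 4, 500)          # 10 pre-cut trials, 4 channels, 500 samples
>>> y = np.random.randint(0, 2, size=10)     # one target per trial
>>> windows_dataset = create_from_X_y(
...     X, y, drop_last_window=False, sfreq=250.,
...     ch_names=["C3", "Cz", "C4", "Pz"],
...     window_size_samples=250, window_stride_samples=125)
>>> len(windows_dataset)                     # total number of windows over all trials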
- braindecode.datasets.create_from_mne_epochs(list_of_epochs, window_size_samples, window_stride_samples, drop_last_window)[source]#
Create WindowsDatasets from mne.Epochs
- Parameters:
- Returns:
windows_datasets – X and y transformed to a dataset format that is compatible with skorch and braindecode
- Return type:
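Example (a sketch assuming epochs is an existing mne.Epochs object created elsewhere; the window sizes are illustrative):

>>> from braindecode.datasets import create_from_mne_epochs
>>> windows_dataset = create_from_mne_epochs(
...     [epochs],                      # list of mne.Epochs
...     window_size_samples=200,
...     window_stride_samples=100,
...     drop_last_window=False)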
- braindecode.datasets.create_from_mne_raw(raws, trial_start_offset_samples, trial_stop_offset_samples, window_size_samples, window_stride_samples, drop_last_window, descriptions=None, mapping=None, preload=False, drop_bad_windows=True, accepted_bads_ratio=0.0)[source]#
Create WindowsDatasets from mne.RawArrays
- Parameters:
raws (array-like) – list of mne.RawArrays
trial_start_offset_samples (int) – start offset from original trial onsets in samples
trial_stop_offset_samples (int) – stop offset from original trial stop in samples
window_size_samples (int) – window size
window_stride_samples (int) – stride between windows
drop_last_window (bool) – whether or not to have a last overlapping window when windows do not equally divide the continuous signal
descriptions (array-like) – list of dicts or pandas.Series with additional information about the raws
mapping (dict(str: int)) – mapping from event description to target value
preload (bool) – if True, preload the data of the Epochs objects.
drop_bad_windows (bool) – If True, call .drop_bad() on the resulting mne.Epochs object. This step allows identifying e.g., windows that fall outside of the continuous recording. It is suggested to run this step here as otherwise the BaseConcatDataset has to be updated as well.
accepted_bads_ratio (float, optional) – Acceptable proportion of trials with inconsistent length in a raw. If the number of trials whose length is exceeded by the window size is smaller than this, then only the corresponding trials are dropped, but the computation continues. Otherwise, an error is raised. Defaults to 0.0 (raise an error).
- Returns:
windows_datasets – X and y transformed to a dataset format that is compatible with skorch and braindecode
- Return type:
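Example (a sketch assuming raws is a list of annotated mne.io.Raw objects created elsewhere; the event names in mapping are hypothetical and must match the actual annotation descriptions):

>>> from braindecode.datasets import create_from_mne_raw
>>> windows_dataset = create_from_mne_raw(
...     raws,
...     trial_start_offset_samples=0,
...     trial_stop_offset_samples=0,
...     window_size_samples=500,
...     window_stride_samples=500,
...     drop_last_window=False,
...     mapping={"left_hand": 0, "right_hand": 1})  # hypothetical event descriptions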
Submodules#
braindecode.datasets.base module#
Dataset classes.
- class braindecode.datasets.base.BaseConcatDataset(list_of_ds, target_transform=None)[source]#
Bases:
ConcatDataset
A base class for concatenated datasets. Holds either mne.Raw or mne.Epoch in self.datasets and has a pandas DataFrame with additional description.
- Parameters:
list_of_ds (list) – list of BaseDataset, BaseConcatDataset or WindowsDataset
target_transform (callable | None) – Optional function to call on targets before returning them.
- property description#
- get_metadata()[source]#
Concatenate the metadata and description of the wrapped Epochs.
- Returns:
metadata – DataFrame containing as many rows as there are windows in the BaseConcatDataset, with the metadata and description information for each window.
- Return type:
pd.DataFrame
- save(path, overwrite=False, offset=0)[source]#
Save datasets to files by creating one subdirectory for each dataset:
path/
    0/
        0-raw.fif | 0-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
    1/
        1-raw.fif | 1-epo.fif
        description.json
        raw_preproc_kwargs.json (if raws were preprocessed)
        window_kwargs.json (if this is a windowed dataset)
        window_preproc_kwargs.json (if windows were preprocessed)
        target_name.json (if target_name is not None and dataset is raw)
- Parameters:
path (str) – Directory in which subdirectories are created to store -raw.fif | -epo.fif and .json files.
overwrite (bool) – Whether to delete old subdirectories that will be saved to in this call.
offset (int) – If provided, the integer is added to the id of the dataset in the concat. This is useful in the setting of very large datasets, where one dataset has to be processed and saved at a time to account for its original position.
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- split(by=None, property=None, split_ids=None)[source]#
Split the dataset based on information listed in its description DataFrame or based on indices.
- Parameters:
by (str | list | dict) – If by is a string, splitting is performed based on the description DataFrame column with this name. If by is a (list of) list of integers, the position in the first list corresponds to the split id and the integers to the datapoints of that split. If a dict, then each key will be used in the returned splits dict and each value should be a list of int.
property (str) – Some property which is listed in info DataFrame.
split_ids (list | dict) – List of indices to be combined in a subset. It can be a list of int or a list of list of int.
- Returns:
splits – A dictionary with the name of the split (a string) as key and the dataset as value.
- Return type:
- property target_transform#
- property transform#
- class braindecode.datasets.base.BaseDataset(raw, description=None, target_name=None, transform=None)[source]#
Bases:
Dataset
Returns samples from an mne.io.Raw object along with a target.
Dataset which serves samples from an mne.io.Raw object along with a target. The target is unique for the dataset, and is obtained through the description attribute.
- Parameters:
raw (mne.io.Raw) – Continuous data.
description (dict | pandas.Series | None) – Holds additional description about the continuous signal / subject.
target_name (str | tuple | None) – Name(s) of the index in description that should be used to provide the target (e.g., to be used in a prediction task later on).
transform (callable | None) – On-the-fly transform applied to the example before it is returned.
- property description#
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- property transform#
- class braindecode.datasets.base.EEGWindowsDataset(raw, metadata, description=None, transform=None, targets_from='metadata', last_target_only=True)[source]#
Bases:
BaseDataset
Returns windows from an mne.Raw object, its window indices, along with a target.
Dataset which serves windows from an mne.Epochs object along with their target and additional information. The metadata attribute of the Epochs object must contain a column called target, which will be used to return the target that corresponds to a window. Additional columns i_window_in_trial, i_start_in_trial, i_stop_in_trial are also required to serve information about the windowing (e.g., useful for cropped training). See braindecode.datautil.windowers to directly create a WindowsDataset from a BaseDataset object.
- Parameters:
windows (mne.Raw or mne.Epochs (Epochs is outdated)) – Windows obtained through the application of a windower to a BaseDataset (see braindecode.datautil.windowers).
description (dict | pandas.Series | None) – Holds additional info about the windows.
transform (callable | None) – On-the-fly transform applied to a window before it is returned.
targets_from (str) – Defines whether targets will be extracted from metadata or from misc channels (time series targets). It can be metadata (default) or channels.
last_target_only (bool) – If targets are obtained from misc channels, whether all targets of the entire (compute) window will be returned or only the last target in the window.
metadata (pandas.DataFrame) – Dataframe with crop indices, so i_window_in_trial, i_start_in_trial, i_stop_in_trial as well as targets.
- property description#
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- property transform#
- class braindecode.datasets.base.WindowsDataset(windows, description=None, transform=None, targets_from='metadata', last_target_only=True)[source]#
Bases:
BaseDataset
Returns windows from an mne.Epochs object along with a target.
Dataset which serves windows from an mne.Epochs object along with their target and additional information. The metadata attribute of the Epochs object must contain a column called target, which will be used to return the target that corresponds to a window. Additional columns i_window_in_trial, i_start_in_trial, i_stop_in_trial are also required to serve information about the windowing (e.g., useful for cropped training). See braindecode.datautil.windowers to directly create a WindowsDataset from a BaseDataset object.
- Parameters:
windows (mne.Epochs) – Windows obtained through the application of a windower to a BaseDataset (see braindecode.datautil.windowers).
description (dict | pandas.Series | None) – Holds additional info about the windows.
transform (callable | None) – On-the-fly transform applied to a window before it is returned.
targets_from (str) – Defines whether targets will be extracted from mne.Epochs metadata or mne.Epochs misc channels (time series targets). It can be metadata (default) or channels.
- property description#
- set_description(description, overwrite=False)[source]#
Update (add or overwrite) the dataset description.
- property transform#
braindecode.datasets.bbci module#
- class braindecode.datasets.bbci.BBCIDataset(filename, load_sensor_names=None, check_class_names=False)[source]#
Bases:
object
Loader class for files created by saving BBCI files in matlab (make sure to save with ‘-v7.3’ in matlab, see https://de.mathworks.com/help/matlab/import_export/mat-file-versions.html#buk6i87 )
- Parameters:
filename (str) –
load_sensor_names (list of str, optional) – Also speeds up loading if you only load some sensors. None means load all sensors.
check_class_names (bool, optional) – check if the class names are part of some known class names at Translational NeuroTechnology Lab, AG Ball, Freiburg, Germany.
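Example (a sketch; the file path and sensor selection are hypothetical, and the load() call returning an MNE Raw object is assumed from typical use of this loader rather than documented above):

>>> from braindecode.datasets.bbci import BBCIDataset
>>> loader = BBCIDataset("/data/subject1_bbci.mat",          # hypothetical '-v7.3' MATLAB file
...                      load_sensor_names=["C3", "Cz", "C4"])
>>> raw = loader.load()  # assumed loading method; returns the recording as an MNE Raw object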
braindecode.datasets.bcicomp module#
- class braindecode.datasets.bcicomp.BCICompetitionIVDataset4(subject_ids=None)[source]#
Bases:
BaseConcatDataset
BCI competition IV dataset 4.
Contains ECoG recordings for three patients moving fingers during the experiment. Targets correspond to the time courses of the flexion of each of five fingers. See http://www.bbci.de/competition/iv/desc_4.pdf and http://www.bbci.de/competition/iv/ for the dataset and competition description. ECoG library containing the dataset: https://searchworks.stanford.edu/view/zk881ps0522
Notes
When using this dataset, please cite [1].
- Parameters:
subject_ids (list(int) | int | None) – (list of) int of subject(s) to be loaded. If None, load all available subjects. Should be in range 1-3.
References
[1] Miller, Kai J. “A library of human electrocorticographic data and analyses.” Nature Human Behaviour 3, no. 11 (2019): 1225-1235. https://doi.org/10.1038/s41562-019-0678-3
- static download(path=None, force_update=False, verbose=None)[source]#
Download the dataset.
- Parameters:
path (None | str) – Location of where to look for the data storing location. If None, the environment variable or config parameter MNE_DATASETS_(dataset)_PATH is used. If it doesn’t exist, the “~/mne_data” directory is used. If the dataset is not found under the given path, the data will be automatically downloaded to the specified folder.
force_update (bool) – Force update of the dataset even if a local copy exists.
verbose (bool, str, int, or None) – If not None, override default verbose level (see mne.verbose()).
- possible_subjects = [1, 2, 3]#
braindecode.datasets.mne module#
- braindecode.datasets.mne.create_from_mne_epochs(list_of_epochs, window_size_samples, window_stride_samples, drop_last_window)[source]#
Create WindowsDatasets from mne.Epochs
- Parameters:
- Returns:
windows_datasets – X and y transformed to a dataset format that is compatible with skorch and braindecode
- Return type:
- braindecode.datasets.mne.create_from_mne_raw(raws, trial_start_offset_samples, trial_stop_offset_samples, window_size_samples, window_stride_samples, drop_last_window, descriptions=None, mapping=None, preload=False, drop_bad_windows=True, accepted_bads_ratio=0.0)[source]#
Create WindowsDatasets from mne.RawArrays
- Parameters:
raws (array-like) – list of mne.RawArrays
trial_start_offset_samples (int) – start offset from original trial onsets in samples
trial_stop_offset_samples (int) – stop offset from original trial stop in samples
window_size_samples (int) – window size
window_stride_samples (int) – stride between windows
drop_last_window (bool) – whether or not to have a last overlapping window when windows do not equally divide the continuous signal
descriptions (array-like) – list of dicts or pandas.Series with additional information about the raws
mapping (dict(str: int)) – mapping from event description to target value
preload (bool) – if True, preload the data of the Epochs objects.
drop_bad_windows (bool) – If True, call .drop_bad() on the resulting mne.Epochs object. This step allows identifying e.g., windows that fall outside of the continuous recording. It is suggested to run this step here as otherwise the BaseConcatDataset has to be updated as well.
accepted_bads_ratio (float, optional) – Acceptable proportion of trials with inconsistent length in a raw. If the number of trials whose length is exceeded by the window size is smaller than this, then only the corresponding trials are dropped, but the computation continues. Otherwise, an error is raised. Defaults to 0.0 (raise an error).
- Returns:
windows_datasets – X and y transformed to a dataset format that is compatible with skorch and braindecode
- Return type:
braindecode.datasets.moabb module#
Dataset objects for some public datasets.
- class braindecode.datasets.moabb.BNCI2014001(subject_ids)[source]#
Bases:
MOABBDataset
BNCI 2014-001 Motor Imagery dataset.
Dataset summary
Dataset IIa from BCI Competition 4 [1].
Dataset Description
This data set consists of EEG data from 9 subjects. The cue-based BCI paradigm consisted of four different motor imagery tasks, namely the imagination of movement of the left hand (class 1), right hand (class 2), both feet (class 3), and tongue (class 4). Two sessions on different days were recorded for each subject. Each session comprises 6 runs separated by short breaks. One run consists of 48 trials (12 for each of the four possible classes), yielding a total of 288 trials per session.
The subjects were sitting in a comfortable armchair in front of a computer screen. At the beginning of a trial (t = 0 s), a fixation cross appeared on the black screen. In addition, a short acoustic warning tone was presented. After two seconds (t = 2 s), a cue in the form of an arrow pointing either to the left, right, down or up (corresponding to one of the four classes left hand, right hand, foot or tongue) appeared and stayed on the screen for 1.25 s. This prompted the subjects to perform the desired motor imagery task. No feedback was provided. The subjects were asked to carry out the motor imagery task until the fixation cross disappeared from the screen at t = 6 s.
Twenty-two Ag/AgCl electrodes (with inter-electrode distances of 3.5 cm) were used to record the EEG; the montage is shown in Figure 3 left. All signals were recorded monopolarly with the left mastoid serving as reference and the right mastoid as ground. The signals were sampled at 250 Hz and bandpass-filtered between 0.5 Hz and 100 Hz. The sensitivity of the amplifier was set to 100 μV. An additional 50 Hz notch filter was enabled to suppress line noise.
- Parameters:
- subject_ids: list(int) | int | None
(list of) int of subject(s) to be fetched. If None, data of all subjects is fetched.
See moabb.datasets.bnci.BNCI2014001
- class BNCI2014001(*args, **kwargs)[source]#
Bases:
BNCI2014_001
BNCI 2014-001 Motor Imagery dataset.
Dataset summary
Dataset IIa from BCI Competition 4 [1].
Dataset Description
This data set consists of EEG data from 9 subjects. The cue-based BCI paradigm consisted of four different motor imagery tasks, namely the imagination of movement of the left hand (class 1), right hand (class 2), both feet (class 3), and tongue (class 4). Two sessions on different days were recorded for each subject. Each session comprises 6 runs separated by short breaks. One run consists of 48 trials (12 for each of the four possible classes), yielding a total of 288 trials per session.
The subjects were sitting in a comfortable armchair in front of a computer screen. At the beginning of a trial (t = 0 s), a fixation cross appeared on the black screen. In addition, a short acoustic warning tone was presented. After two seconds (t = 2 s), a cue in the form of an arrow pointing either to the left, right, down or up (corresponding to one of the four classes left hand, right hand, foot or tongue) appeared and stayed on the screen for 1.25 s. This prompted the subjects to perform the desired motor imagery task. No feedback was provided. The subjects were asked to carry out the motor imagery task until the fixation cross disappeared from the screen at t = 6 s.
Twenty-two Ag/AgCl electrodes (with inter-electrode distances of 3.5 cm) were used to record the EEG; the montage is shown in Figure 3 left. All signals were recorded monopolarly with the left mastoid serving as reference and the right mastoid as ground. The signals were sampled at 250 Hz and bandpass-filtered between 0.5 Hz and 100 Hz. The sensitivity of the amplifier was set to 100 μV. An additional 50 Hz notch filter was enabled to suppress line noise.
References
[1] Tangermann, M., Müller, K.R., Aertsen, A., Birbaumer, N., Braun, C., Brunner, C., Leeb, R., Mehring, C., Miller, K.J., Mueller-Putz, G. and Nolte, G., 2012. Review of the BCI competition IV. Frontiers in Neuroscience, 6, p.55.
- doc = 'See moabb.datasets.bnci.BNCI2014001\n\n Parameters\n ----------\n subject_ids: list(int) | int | None\n (list of) int of subject(s) to be fetched. If None, data of all\n subjects is fetched.\n '#
- class braindecode.datasets.moabb.HGD(subject_ids)[source]#
Bases:
MOABBDataset
High-gamma dataset described in Schirrmeister et al. 2017.
Dataset summary
Name               #Subj   #Chan   #Classes   #Trials / class   Trials len   Sampling rate   #Sessions
Schirrmeister2017  14      128     4          120               4 s          500 Hz          1
Dataset from [1]
Our “High-Gamma Dataset” is a 128-electrode dataset (of which we later only use 44 sensors covering the motor cortex, see Section 2.7.1), obtained from 14 healthy subjects (6 female, 2 left-handed, age 27.2 ± 3.6 (mean ± std)) with roughly 1000 (963.1 ± 150.9, mean ± std) four-second trials of executed movements divided into 13 runs per subject. The four classes of movements were movements of either the left hand, the right hand, both feet, and rest (no movement, but the same type of visual cue as for the other classes). The training set consists of the approx. 880 trials of all runs except the last two runs; the test set consists of the approx. 160 trials of the last two runs. This dataset was acquired in an EEG lab optimized for non-invasive detection of high-frequency movement-related EEG components (Ball et al., 2008; Darvas et al., 2010).
Depending on the direction of a gray arrow that was shown on black background, the subjects had to repetitively clench their toes (downward arrow), perform sequential finger-tapping of their left (leftward arrow) or right (rightward arrow) hand, or relax (upward arrow). The movements were selected to require little proximal muscular activity while still being complex enough to keep subjects involved. Within the 4-s trials, the subjects performed the repetitive movements at their own pace, which had to be maintained as long as the arrow was showing. Per run, 80 arrows were displayed for 4 s each, with 3 to 4 s of continuous random inter-trial interval. The order of presentation was pseudo-randomized, with all four arrows being shown every four trials. Ideally 13 runs were performed to collect 260 trials of each movement and rest. The stimuli were presented and the data recorded with BCI2000 (Schalk et al., 2004). The experiment was approved by the ethical committee of the University of Freiburg.
- Parameters:
- subject_ids: list(int) | int | None
(list of) int of subject(s) to be fetched. If None, data of all subjects is fetched.
See moabb.datasets.schirrmeister2017.Schirrmeister2017
- class Schirrmeister2017[source]#
Bases:
BaseDataset
High-gamma dataset described in Schirrmeister et al. 2017.
Dataset summary
Name               #Subj   #Chan   #Classes   #Trials / class   Trials len   Sampling rate   #Sessions
Schirrmeister2017  14      128     4          120               4 s          500 Hz          1
Dataset from [1]
Our “High-Gamma Dataset” is a 128-electrode dataset (of which we later only use 44 sensors covering the motor cortex, see Section 2.7.1), obtained from 14 healthy subjects (6 female, 2 left-handed, age 27.2 ± 3.6 (mean ± std)) with roughly 1000 (963.1 ± 150.9, mean ± std) four-second trials of executed movements divided into 13 runs per subject. The four classes of movements were movements of either the left hand, the right hand, both feet, and rest (no movement, but the same type of visual cue as for the other classes). The training set consists of the approx. 880 trials of all runs except the last two runs; the test set consists of the approx. 160 trials of the last two runs. This dataset was acquired in an EEG lab optimized for non-invasive detection of high-frequency movement-related EEG components (Ball et al., 2008; Darvas et al., 2010).
Depending on the direction of a gray arrow that was shown on black background, the subjects had to repetitively clench their toes (downward arrow), perform sequential finger-tapping of their left (leftward arrow) or right (rightward arrow) hand, or relax (upward arrow). The movements were selected to require little proximal muscular activity while still being complex enough to keep subjects involved. Within the 4-s trials, the subjects performed the repetitive movements at their own pace, which had to be maintained as long as the arrow was showing. Per run, 80 arrows were displayed for 4 s each, with 3 to 4 s of continuous random inter-trial interval. The order of presentation was pseudo-randomized, with all four arrows being shown every four trials. Ideally 13 runs were performed to collect 260 trials of each movement and rest. The stimuli were presented and the data recorded with BCI2000 (Schalk et al., 2004). The experiment was approved by the ethical committee of the University of Freiburg.
References
[1] Schirrmeister, Robin Tibor, et al. “Deep learning with convolutional neural networks for EEG decoding and visualization.” Human Brain Mapping 38.11 (2017): 5391-5420.
- data_path(subject, path=None, force_update=False, update_path=None, verbose=None)[source]#
Get path to local copy of a subject data.
- Parameters:
subject (int) – Number of subject to use
path (None | str) – Location of where to look for the data storing location. If None, the environment variable or config parameter MNE_DATASETS_(dataset)_PATH is used. If it doesn’t exist, the “~/mne_data” directory is used. If the dataset is not found under the given path, the data will be automatically downloaded to the specified folder.
force_update (bool) – Force update of the dataset even if a local copy exists.
update_path (bool | None, deprecated) – If True, set the MNE_DATASETS_(dataset)_PATH in mne-python config to the given path. If None, the user is prompted.
verbose (bool, str, int, or None) – If not None, override default verbose level (see mne.verbose()).
- Returns:
path – Local path to the given data file. This path is contained inside a list of length one, for compatibility.
- Return type:
- doc = 'See moabb.datasets.schirrmeister2017.Schirrmeister2017\n\n Parameters\n ----------\n subject_ids: list(int) | int | None\n (list of) int of subject(s) to be fetched. If None, data of all\n subjects is fetched.\n '#
- class braindecode.datasets.moabb.MOABBDataset(dataset_name, subject_ids, dataset_kwargs=None)[source]#
Bases:
BaseConcatDataset
A class for moabb datasets.
- Parameters:
dataset_name (str) – name of dataset included in moabb to be fetched
subject_ids (list(int) | int | None) – (list of) int of subject(s) to be fetched. If None, data of all subjects is fetched.
dataset_kwargs (dict, optional) – optional dictionary containing keyword arguments to pass to the moabb dataset when instantiating it.
braindecode.datasets.sleep_physionet module#
- class braindecode.datasets.sleep_physionet.SleepPhysionet(subject_ids=None, recording_ids=None, preload=False, load_eeg_only=True, crop_wake_mins=30, crop=None)[source]#
Bases:
BaseConcatDataset
Sleep Physionet dataset.
Sleep dataset from https://physionet.org/content/sleep-edfx/1.0.0/. Contains overnight recordings from 78 healthy subjects.
See [MNE example](https://mne.tools/stable/auto_tutorials/sample-datasets/plot_sleep.html).
- Parameters:
subject_ids (list(int) | int | None) – (list of) int of subject(s) to be loaded. If None, load all available subjects.
recording_ids (list(int) | None) – Recordings to load per subject (each subject except 13 has two recordings). Can be [1], [2] or [1, 2] (same as None).
preload (bool) – If True, preload the data of the Raw objects.
load_eeg_only (bool) – If True, only load the EEG channels and discard the others (EOG, EMG, temperature, respiration) to avoid resampling the other signals.
crop_wake_mins (float) – Number of minutes of wake time to keep before the first sleep event and after the last sleep event. Used to reduce the imbalance in this dataset. Default of 30 mins.
crop (None | tuple) – If not None, crop the raw files (e.g. to use only the first 3 h). Example: crop=(0, 3600*3) to keep only the first 3 h.
braindecode.datasets.tuh module#
Dataset classes for the Temple University Hospital (TUH) EEG Corpus and the TUH Abnormal EEG Corpus.
- class braindecode.datasets.tuh.TUH(path, recording_ids=None, target_name=None, preload=False, add_physician_reports=False, n_jobs=1)[source]#
Bases:
BaseConcatDataset
Temple University Hospital (TUH) EEG Corpus (www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml#c_tueg).
- Parameters:
path (str) – Parent directory of the dataset.
recording_ids (list(int) | int) – A (list of) int of recording id(s) to be read (order matters and will overwrite the default chronological order; e.g. if recording_ids=[1,0], then the first recording returned by this class will be chronologically later than the second recording. Provide recording_ids in ascending order to preserve chronological order).
target_name (str) – Can be ‘gender’, or ‘age’.
preload (bool) – If True, preload the data of the Raw objects.
add_physician_reports (bool) – If True, the physician reports will be read from disk and added to the description.
n_jobs (int) – Number of jobs to be used to read files in parallel.
- class braindecode.datasets.tuh.TUHAbnormal(path, recording_ids=None, target_name='pathological', preload=False, add_physician_reports=False, n_jobs=1)[source]#
Bases:
TUH
Temple University Hospital (TUH) Abnormal EEG Corpus. see www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml#c_tuab
- Parameters:
path (str) – Parent directory of the dataset.
recording_ids (list(int) | int) – A (list of) int of recording id(s) to be read (order matters and will overwrite the default chronological order; e.g. if recording_ids=[1,0], then the first recording returned by this class will be chronologically later than the second recording. Provide recording_ids in ascending order to preserve chronological order).
target_name (str) – Can be ‘pathological’, ‘gender’, or ‘age’.
preload (bool) – If True, preload the data of the Raw objects.
add_physician_reports (bool) – If True, the physician reports will be read from disk and added to the description.
braindecode.datasets.xy module#
- braindecode.datasets.xy.create_from_X_y(X, y, drop_last_window, sfreq, ch_names=None, window_size_samples=None, window_stride_samples=None)[source]#
Create a BaseConcatDataset of WindowsDatasets from X and y to be used for decoding with skorch and braindecode, where X is a list of pre-cut trials and y are corresponding targets.
- Parameters:
X (array-like) – list of pre-cut trials as n_trials x n_channels x n_times
y (array-like) – targets corresponding to the trials
drop_last_window (bool) – whether or not to have a last overlapping window when windows do not equally divide the continuous signal
sfreq (float) – Sampling frequency of signals.
ch_names (array-like) – Names of the channels.
window_size_samples (int) – window size
window_stride_samples (int) – stride between windows
- Returns:
windows_datasets – X and y transformed to a dataset format that is compatible with skorch and braindecode
- Return type: