Multiple discrete targets with the TUH EEG Corpus
This example shows how to use multiple discrete targets per recording with the TUH EEG Corpus.
# Author: Lukas Gemein <l.gemein@gmail.com>
#
# License: BSD (3-clause)
import mne
from torch.utils.data import DataLoader
from braindecode.datasets import TUH
from braindecode.preprocessing import create_fixed_length_windows
mne.set_log_level('ERROR')  # avoid messages every time a window is extracted
If you want to try this code with the actual data, delete the following import. We have to mock some dataset functionality because the data was not available when this example was created.
from braindecode.datasets.tuh import _TUHMock as TUH # noqa F811
We start by creating a TUH dataset. Instead of just a str, we give it multiple strings as target names. Each of the strings has to exist as a column in the description DataFrame.
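As a rough illustration of what passing several target names means, the sketch below looks up several columns of one description row and returns them as a list. The `description_row` dict, the column names `age` and `gender`, and the `get_targets` helper are hypothetical stand-ins written for this illustration, not braindecode code; the values mirror the last recording shown in this example.

```python
# Hypothetical stand-in for one row of the description DataFrame.
# The column names "age" and "gender" are assumptions for this sketch.
description_row = {"age": 83, "gender": "F"}


def get_targets(row, target_name):
    """Return one target for a str name, or a list of targets for several names."""
    if isinstance(target_name, str):
        return row[target_name]
    # Every requested target name has to exist as a column.
    missing = [name for name in target_name if name not in row]
    if missing:
        raise KeyError(f"target names not in description: {missing}")
    return [row[name] for name in target_name]


single = get_targets(description_row, "age")
multiple = get_targets(description_row, ("age", "gender"))
print(single)    # 83
print(multiple)  # [83, 'F']
```

Requesting a name that is not a column raises a `KeyError` in this sketch, which mirrors the requirement that each target name has to exist in the description.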
Iterating through the dataset gives x as ndarray(n_channels x 1) as well as the target as [age of the subject, gender of the subject]. Let’s look at the last example as it has more interesting age/gender labels (compare to the last row of the dataframe above).
x: [[-1.39513281]
[ 0.87884217]
[ 0.76573971]
[-1.07025866]
[ 2.08539375]
[ 0.04078953]
[-0.49139174]
[-0.00571471]
[ 0.43150701]
[ 0.6832762 ]
[ 0.07593693]
[-0.23194062]
[-1.60383273]
[ 1.10370984]
[-1.33047682]
[-0.56567374]
[-0.52561288]
[ 0.38533583]
[-0.13483532]
[-1.0106733 ]
[ 0.61984129]]
y: [83, 'F']
We skip the preprocessing steps here, since they are not the aim of this example. Instead, we directly create fixed-length compute windows. We specify a mapping from the genders ‘M’ and ‘F’ to integers, since decoding requires numeric targets.
tuh_windows = create_fixed_length_windows(
    tuh,
    start_offset_samples=0,
    stop_offset_samples=None,
    window_size_samples=1000,
    window_stride_samples=1000,
    drop_last_window=False,
    mapping={'M': 0, 'F': 1},  # map non-digit targets
)
# store the number of windows required for loading later on
tuh_windows.set_description({
    "n_windows": [len(d) for d in tuh_windows.datasets]})
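To make the effect of the mapping on a multi-target label concrete, here is a minimal sketch in plain Python. The `map_targets` helper is hypothetical and only illustrates the idea; it is not how braindecode applies the mapping internally.

```python
def map_targets(target, mapping):
    """Replace every target value found in `mapping`; keep all other values."""
    return [mapping.get(value, value) for value in target]


mapping = {"M": 0, "F": 1}
mapped = map_targets([83, "F"], mapping)
print(mapped)  # [83, 1]
```

The age is not a key of the mapping and passes through unchanged, while the gender string is replaced by its integer code.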
Iterating through the dataset gives x as ndarray(n_channels x 1000), y as [age, gender], and ind as [index of the window within the recording, start sample, stop sample]. Let’s look at the last example again.
x: [[ 1.8291361 0.13107584 0.09874357 ... 1.5261606 -1.7010179
-1.3951328 ]
[-1.0897747 0.53174806 0.40103233 ... -2.2451496 -0.872565
0.8788422 ]
[-0.3393165 -0.14407331 1.3640033 ... 0.11461935 0.11734861
0.7657397 ]
...
[-0.2753506 -1.1553683 0.13125211 ... -0.37722126 -0.5284417
-0.13483532]
[-0.3746723 0.05251044 1.1372249 ... -0.9949117 0.50443316
-1.0106733 ]
[-1.0833622 1.2574992 -1.4423852 ... 1.876381 1.0440995
0.6198413 ]]
y: [83, 1]
ind: [3, 2600, 3600]
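The `ind` values above can be reproduced with a small sketch of fixed-length windowing. It assumes the last recording is 3600 samples long, which is inferred from the stop index in the output rather than stated in the example, and the helper `fixed_length_window_bounds` is hypothetical, not part of braindecode.

```python
def fixed_length_window_bounds(n_samples, size, stride, drop_last_window):
    """Return (window index, start, stop) for fixed-length windows."""
    starts = list(range(0, n_samples - size + 1, stride))
    # If the strided windows do not reach the end of the recording,
    # add a final window that ends exactly at the last sample.
    # This window overlaps its predecessor.
    if not drop_last_window and starts and starts[-1] + size < n_samples:
        starts.append(n_samples - size)
    return [(i, start, start + size) for i, start in enumerate(starts)]


# Assumed recording length of 3600 samples (inferred from the output above).
bounds = fixed_length_window_bounds(3600, 1000, 1000, drop_last_window=False)
for window in bounds:
    print(window)
```

Under these assumptions the last tuple is `(3, 2600, 3600)`, matching the `ind` shown above: the fourth window is shifted back so that it ends at the final sample instead of being dropped.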
We pass the dataset to a PyTorch DataLoader so that it can be used for model training.
dl = DataLoader(
    dataset=tuh_windows,
    batch_size=4,
)
Iterating through the DataLoader gives batch_X as tensor(4 x n_channels x 1000), batch_y as [tensor([4 x age of subject]), tensor([4 x gender of subject])], and batch_ind. We will iterate to the end to look at the last example again.
batch_X: tensor([[[ 5.5363e-02, 8.2126e-01, -8.4776e-01, ..., 9.8022e-01,
-7.0758e-01, -1.3419e+00],
[ 5.3741e-01, 2.8508e-01, 1.7888e+00, ..., 1.8968e+00,
-2.7722e-01, 1.2931e+00],
[-2.6328e-01, 2.8525e+00, 1.6408e+00, ..., -1.7391e-01,
-1.3787e-01, 1.3067e+00],
...,
[ 7.6871e-01, 9.6170e-02, -1.9727e+00, ..., -1.3167e+00,
1.2657e-01, -5.7562e-01],
[-7.7614e-01, -1.3191e-01, 1.0751e+00, ..., 6.8062e-01,
-1.4114e+00, -5.6931e-01],
[-1.1234e+00, -9.2222e-01, -8.1196e-01, ..., 2.9172e-01,
-1.1871e+00, 1.1650e+00]],
[[ 5.3379e-01, 4.7839e-01, -1.5889e+00, ..., -9.3951e-01,
6.9861e-01, 1.2567e-01],
[-7.3573e-01, 1.1935e+00, 6.5482e-01, ..., 1.7572e+00,
-1.4189e+00, -1.2484e-01],
[ 3.1653e-01, -1.4104e+00, -1.0475e-01, ..., 1.0421e+00,
9.9583e-01, 6.4541e-01],
...,
[-7.6099e-01, 1.1348e+00, 1.2041e+00, ..., -9.5639e-01,
-3.9937e-01, -2.1994e+00],
[ 1.0830e-03, -4.0075e-01, -1.2254e+00, ..., 2.0254e+00,
-1.2927e-01, 9.5079e-02],
[-1.9702e-01, 1.1700e+00, -7.7355e-01, ..., 6.4929e-01,
2.8302e+00, 4.4761e-01]],
[[ 4.4920e-01, 2.9287e-01, -4.4798e-01, ..., 1.4734e-01,
-1.9941e+00, -3.7620e-01],
[ 1.3327e+00, 1.9766e-01, -4.9474e-01, ..., -2.8459e-01,
2.9799e-01, 7.7121e-01],
[ 1.2965e-01, 7.2586e-01, 3.1546e-01, ..., -5.4598e-01,
-1.4853e-01, 7.8705e-01],
...,
[-4.2984e-01, -7.6921e-01, 6.0587e-01, ..., -5.4981e-01,
1.7690e-01, 6.0530e-01],
[ 4.0353e-01, 5.1134e-01, 4.1883e-01, ..., 6.0788e-01,
-1.6549e+00, -6.7648e-01],
[-5.7833e-01, -4.9057e-01, 8.2945e-01, ..., 2.6360e-01,
1.0847e-01, 5.0733e-01]],
[[ 1.8291e+00, 1.3108e-01, 9.8744e-02, ..., 1.5262e+00,
-1.7010e+00, -1.3951e+00],
[-1.0898e+00, 5.3175e-01, 4.0103e-01, ..., -2.2451e+00,
-8.7256e-01, 8.7884e-01],
[-3.3932e-01, -1.4407e-01, 1.3640e+00, ..., 1.1462e-01,
1.1735e-01, 7.6574e-01],
...,
[-2.7535e-01, -1.1554e+00, 1.3125e-01, ..., -3.7722e-01,
-5.2844e-01, -1.3484e-01],
[-3.7467e-01, 5.2510e-02, 1.1372e+00, ..., -9.9491e-01,
5.0443e-01, -1.0107e+00],
[-1.0834e+00, 1.2575e+00, -1.4424e+00, ..., 1.8764e+00,
1.0441e+00, 6.1984e-01]]])
batch_y: [tensor([83, 83, 83, 83]), tensor([1, 1, 1, 1])]
batch_ind: [tensor([0, 1, 2, 3]), tensor([ 0, 1000, 2000, 2600]), tensor([1000, 2000, 3000, 3600])]
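The shape of `batch_y` follows from how per-sample target lists are batched: four samples that each carry [age, gender] become one sequence of ages and one sequence of genders. The plain-Python sketch below only illustrates this transposition with a hypothetical `collate_targets` helper. The DataLoader's default collation additionally converts each sequence to a tensor, which is omitted here.

```python
def collate_targets(sample_targets):
    """Transpose per-sample [age, gender] lists into one list per target."""
    return [list(values) for values in zip(*sample_targets)]


# Four windows of the same recording, as in the last batch above.
sample_targets = [[83, 1], [83, 1], [83, 1], [83, 1]]
batch_y = collate_targets(sample_targets)
print(batch_y)  # [[83, 83, 83, 83], [1, 1, 1, 1]]
```

For training, each entry of `batch_y` can then be treated as the label vector of one decoding task, for example age and gender.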