Fixed-Length Windows Extraction

Introduction to the Fixed-Length Windows Function

In many EEG decoding tasks, such as self-supervised pre-training, it is useful to split long continuous recordings into fixed-length windows, either overlapping or non-overlapping.

The function braindecode.preprocessing.create_fixed_length_windows() provides an easy way to slice a continuous EEG recording into such windows.

This tutorial explains how to use it, what its parameters mean, and how it can be applied to EEG datasets.

Overview of create_fixed_length_windows

The function signature, with default values:

create_fixed_length_windows(
    concat_ds,
    start_offset_samples=0,
    stop_offset_samples=None,
    window_size_samples=None,
    window_stride_samples=None,
    drop_last_window=None,
    mapping=None,
    preload=False,
    picks=None,
    reject=None,
    flat=None,
    targets_from='metadata',
    last_target_only=True,
    lazy_metadata=False,
    on_missing='error',
    n_jobs=1,
    verbose='error',
)

Parameters

concat_ds : ConcatDataset
  • A concatenation of base datasets, each holding a raw recording and a description.

start_offset_samples : int (default=0)
  • Start offset from the beginning of the recording, in samples.

stop_offset_samples : int or None (default=None)
  • Stop offset from the beginning of the recording, in samples. If None, set to the end of the recording.

window_size_samples : int or None
  • Window size in samples. If None, set to the maximum possible window size, i.e. the length of the recording once offsets are accounted for.

window_stride_samples : int or None
  • Stride between windows in samples. If None, set equal to window_size_samples, so windows do not overlap.

drop_last_window : bool or None
  • Whether to drop the last window when the windows do not evenly divide the continuous signal. If False, an additional last window that overlaps the previous one is created so that the end of the signal is covered. Must be set to a bool if window size and stride are not None.

mapping : dict(str: int) or None
  • Mapping from event description to target value.

preload : bool (default=False)
  • If True, preload the data of the Epochs objects.

picks : str | list | slice | None
  • Channels to include. If None, all available channels are used. See mne.Epochs.

reject : dict or None
  • Epoch rejection parameters based on peak-to-peak amplitude. If None, no rejection is done based on peak-to-peak amplitude. See mne.Epochs.

flat : dict or None
  • Epoch rejection parameters based on flatness of signals. If None, no rejection based on flatness is done. See mne.Epochs.

targets_from : str (default='metadata')
  • Where to get targets from: either 'metadata' or 'channels'.

last_target_only : bool (default=True)
  • If True, only use the last target in the window.

lazy_metadata : bool (default=False)
  • If True, metadata is not computed immediately, but only when accessed, by using the _LazyDataFrame (experimental).

on_missing : str (default='error')
  • What to do if one or several event ids are not found in the recording. Valid values are 'error' | 'warning' | 'ignore'. See mne.Epochs.

n_jobs : int (default=1)
  • Number of jobs to use to parallelize the windowing.

verbose : bool | str | int | None
  • Control verbosity of the logging output when calling mne.Epochs.
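
To make the interplay of the offsets, window size, stride, and drop_last_window more concrete, here is a minimal pure-Python sketch of how window start indices could be computed. It illustrates the parameter semantics described above under simplifying assumptions and is not braindecode's implementation; the helper name fixed_window_starts and the sample counts are hypothetical.

```python
def fixed_window_starts(n_samples, window_size, stride,
                        start_offset=0, stop_offset=None,
                        drop_last_window=True):
    """Return start indices of fixed-length windows (illustrative only)."""
    stop = n_samples if stop_offset is None else stop_offset
    last_start = stop - window_size  # latest start at which a full window fits
    if last_start < start_offset:
        return []  # not even one full window fits
    starts = list(range(start_offset, last_start + 1, stride))
    # Optionally add one extra window aligned with the end of the signal.
    # It overlaps the previous window by more than the stride implies.
    if not drop_last_window and starts[-1] < last_start:
        starts.append(last_start)
    return starts


# Hypothetical recording of 1100 samples, 500-sample windows, stride 250.
starts_drop = fixed_window_starts(1100, 500, 250, drop_last_window=True)
starts_keep = fixed_window_starts(1100, 500, 250, drop_last_window=False)
print(starts_drop)  # [0, 250, 500]
print(starts_keep)  # [0, 250, 500, 600]
```

With drop_last_window=True the final 100 samples are not covered by any window; with drop_last_window=False they are covered by an extra window starting at sample 600.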

Example: Basic 2-Second Windows with 50% Overlap

from numpy import multiply

from braindecode.datasets import MOABBDataset
from braindecode.preprocessing import (
    Filter,
    Pick,
    Preprocessor,
    create_fixed_length_windows,
    preprocess,
)
# Load the EEG dataset
dataset = MOABBDataset(dataset_name="BNCI2014_001", subject_ids=[1])

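Preprocessing: keep only the EEG channels, convert the signals from volts to microvolts, and band-pass filter between 4 and 38 Hz.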

preprocessors = [
    Pick(eeg=True, meg=False, stim=False),
    Preprocessor(lambda data: multiply(data, 1e6)),
    Filter(l_freq=4.0, h_freq=38.0),
]
preprocess(dataset, preprocessors)

# Sampling frequency
sfreq = dataset.datasets[0].raw.info["sfreq"]

Create the windows, converting the window duration and overlap into sample counts first.

window_size_samples = int(sfreq * 2)  # 2-second windows
window_stride_samples = int(window_size_samples * 0.5)  # 50% overlap

windows_dataset = create_fixed_length_windows(
    concat_ds=dataset,
    start_offset_samples=0,
    stop_offset_samples=None,
    window_size_samples=window_size_samples,
    window_stride_samples=window_stride_samples,
    drop_last_window=True,
    mapping=None,
    preload=True,
    picks="eeg",  # Only EEG channels
    reject=dict(eeg=150),  # Reject windows where EEG p2p > 150 µV (data were scaled to µV above)
    flat=None,
    targets_from="metadata",
    last_target_only=True,
    on_missing="warning",
    n_jobs=1,
    verbose="error",
)

# Let's inspect the output to better understand what we created.

# Check how many windows were created
print(f"Number of windows: {len(windows_dataset)}")

# Each item is a tuple (X, y, window_ind); X has shape (n_channels, window_size_samples)
X, y, window_ind = windows_dataset[0]
print(f"Window data shape: {X.shape}")
print(f"Window label: {y}")
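
The window size and stride used above come from a duration in seconds and an overlap fraction. The following sketch shows that conversion in isolation. The helper name window_params is hypothetical, and the sampling frequency of 250 Hz is an assumed value for illustration; in practice, read it from the recording as done above.

```python
def window_params(sfreq, window_seconds, overlap):
    """Convert a window duration and overlap fraction into sample counts."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    size = int(round(sfreq * window_seconds))
    # A stride smaller than the window size produces overlapping windows.
    stride = max(1, int(round(size * (1 - overlap))))
    return size, stride


# Assumed sampling frequency of 250 Hz (illustrative value).
size, stride = window_params(sfreq=250.0, window_seconds=2.0, overlap=0.5)
print(size, stride)  # 500 250
```

Setting overlap to 0 yields a stride equal to the window size, which corresponds to non-overlapping windows.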

Working with Targets

In create_fixed_length_windows, targets can be derived in two ways:

  1. From the recording metadata (e.g., “session”, “condition”). This is the default, targets_from='metadata'.

  2. From the signal channels themselves, with targets_from='channels'. This is useful for cases like sleep staging, where annotations are stored in auxiliary channels.

Additionally:

  • mapping: Optionally map target values (e.g. from “0train” / “1test” to 0 / 1).

  • last_target_only=True (default): If multiple targets are present within a window (as in time-varying labels), use only the final target value in that window.
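
The effect of mapping and last_target_only on a single window can be sketched in plain Python. This is a conceptual illustration only and does not mirror braindecode's internals; the helper name window_target and the "rest"/"task" labels are hypothetical.

```python
def window_target(sample_targets, mapping=None, last_target_only=True):
    """Reduce the targets found inside one window (illustrative only)."""
    targets = list(sample_targets)
    if mapping is not None:
        # Translate each raw label into its numeric target value.
        targets = [mapping[t] for t in targets]
    # Keep only the final target, or return all of them.
    return targets[-1] if last_target_only else targets


# Hypothetical time-varying labels inside one window.
labels = ["rest", "rest", "task"]
mapping = {"rest": 0, "task": 1}

last_only = window_target(labels, mapping, last_target_only=True)
all_targets = window_target(labels, mapping, last_target_only=False)
print(last_only)    # 1
print(all_targets)  # [0, 0, 1]
```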

# Example: mapping session names ("0train" and "1test") to integers

mapping = {"0train": 0, "1test": 1}

windows_dataset = create_fixed_length_windows(
    dataset,
    window_size_samples=window_size_samples,
    window_stride_samples=window_stride_samples,
    drop_last_window=True,
    mapping=mapping,
    preload=True,
)

# View first few targets
print("Targets for first 10 windows:")
print(windows_dataset.get_metadata()["target"][:10])

Example: Rejecting Windows Based on Amplitude

You can set rejection criteria to exclude windows with extreme values:

reject_criteria = dict(eeg=150)  # 150 µV max peak-to-peak allowed (data were scaled to µV during preprocessing)

windows_with_rejection = create_fixed_length_windows(
    concat_ds=dataset,
    window_size_samples=200,
    window_stride_samples=100,
    reject=reject_criteria,
    drop_last_window=True,
)

print(windows_with_rejection)
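
The peak-to-peak criterion itself is simple: a window is rejected if, on any channel, the difference between the largest and smallest sample exceeds the threshold. The sketch below shows this idea in plain Python. It is not the MNE implementation; the helper names and the sample values are hypothetical. Note that the threshold must be expressed in the same unit as the data.

```python
def peak_to_peak(channel):
    """Difference between the largest and smallest sample of one channel."""
    return max(channel) - min(channel)


def keep_window(window, threshold):
    """Keep a window only if no channel exceeds the threshold.

    `window` is a list of channels, each a list of samples, expressed in
    the same unit as `threshold`.
    """
    return all(peak_to_peak(channel) <= threshold for channel in window)


# Hypothetical two-channel windows with values in microvolts.
clean = [[-20.0, 10.0, 30.0], [5.0, -5.0, 0.0]]     # p2p: 50 and 10
noisy = [[-20.0, 10.0, 30.0], [-100.0, 90.0, 0.0]]  # p2p: 50 and 190

clean_kept = keep_window(clean, threshold=150.0)
noisy_kept = keep_window(noisy, threshold=150.0)
print(clean_kept, noisy_kept)  # True False
```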

Example: Using Lazy Metadata Generation

For large datasets, it can be faster to generate the window metadata on demand instead of computing it up front:

lazy_windows = create_fixed_length_windows(
    concat_ds=dataset,
    window_size_samples=200,
    window_stride_samples=100,
    drop_last_window=True,
    lazy_metadata=True,
)

print(lazy_windows)
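
The general idea of on-demand metadata can be sketched with a cached property: nothing is computed when the object is created, and the table is built once on first access. This is only an analogy for the concept and does not reproduce braindecode's _LazyDataFrame; the class LazyWindows is hypothetical, and the dictionary keys are chosen to resemble window metadata columns.

```python
from functools import cached_property


class LazyWindows:
    """Toy container whose window metadata is built on first access."""

    def __init__(self, n_samples, window_size, stride):
        self.n_samples = n_samples
        self.window_size = window_size
        self.stride = stride
        self.n_builds = 0  # counts how often the metadata is computed

    @cached_property
    def metadata(self):
        self.n_builds += 1
        last_start = self.n_samples - self.window_size
        return [
            {"i_window_in_trial": i,
             "i_start_in_trial": start,
             "i_stop_in_trial": start + self.window_size}
            for i, start in enumerate(range(0, last_start + 1, self.stride))
        ]


lazy = LazyWindows(n_samples=1000, window_size=200, stride=100)
builds_before = lazy.n_builds         # nothing has been computed yet
n_windows = len(lazy.metadata)        # first access triggers the computation
n_windows_again = len(lazy.metadata)  # second access reuses the cached result
print(builds_before, lazy.n_builds, n_windows)  # 0 1 9
```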

Example: Shifted Windows

You can also shift where windowing begins or ends by using start_offset_samples or stop_offset_samples. For example, to start windowing 500 ms into the recording:

start_offset_seconds = 0.5
start_offset_samples = int(start_offset_seconds * sfreq)

shifted_windows_dataset = create_fixed_length_windows(
    dataset,
    start_offset_samples=start_offset_samples,
    window_size_samples=window_size_samples,
    window_stride_samples=window_stride_samples,
    drop_last_window=True,
    preload=True,
)

print(f"Number of shifted windows: {len(shifted_windows_dataset)}")
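
A start offset shortens the usable part of each recording, so it can reduce the number of windows. The sketch below estimates the window count per recording for the drop_last_window=True case. The helper name count_windows, the recording length, and the 250 Hz sampling frequency are assumptions made for illustration.

```python
def count_windows(n_samples, window_size, stride, start_offset=0):
    """Number of full windows when the last partial window is dropped."""
    usable = n_samples - start_offset
    if usable < window_size:
        return 0
    return (usable - window_size) // stride + 1


sfreq = 250.0                  # assumed sampling frequency in Hz
offset = int(0.5 * sfreq)      # 500 ms expressed in samples: 125
n_samples = 10_000             # hypothetical recording length

n_unshifted = count_windows(n_samples, window_size=500, stride=250)
n_shifted = count_windows(n_samples, window_size=500, stride=250,
                          start_offset=offset)
print(n_unshifted, n_shifted)  # 39 38
```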
