braindecode.functional.rescale_parameter
- braindecode.functional.rescale_parameter(param, layer_id)
Rescaling the l-th transformer layer.
Rescales the parameter tensor in place by the inverse square root of the layer id [Beit2022].
In LaBraM, this is used to rescale the output matrices (i.e., the last linear projection within each sub-layer) of the self-attention module.
- Parameters:
param (torch.Tensor) – tensor to be rescaled
layer_id (int) – layer id in the neural network
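Example (illustrative sketch)
The in-place rescaling described above can be sketched as follows. This is a minimal stand-in, not the library source: `rescale_parameter_sketch` is a hypothetical name, the toy `torch.nn.Linear` layers and their sizes are arbitrary, and both the factor of 2 inside the square root (taken from the BEiT reference code) and the layer ids starting at 1 are assumptions.

```python
import math

import torch


def rescale_parameter_sketch(param: torch.Tensor, layer_id: int) -> None:
    """Hypothetical stand-in: divide `param` in place by sqrt(2 * layer_id)."""
    # Assumption: the factor 2 follows the BEiT reference code; the description
    # above only states "inverse square root of the layer id".
    param.div_(math.sqrt(2.0 * layer_id))


# Toy stand-ins for the output projections of three transformer layers.
layers = [torch.nn.Linear(4, 4) for _ in range(3)]

with torch.no_grad():  # in-place edits of parameters must not be tracked by autograd
    for layer_id, layer in enumerate(layers, start=1):  # assumed: ids start at 1
        torch.nn.init.ones_(layer.weight)  # known values so the scaling is visible
        rescale_parameter_sketch(layer.weight, layer_id)

# One entry per layer: the value every weight was scaled to.
scales = [layer.weight[0, 0].item() for layer in layers]
```

Under these assumptions deeper layers end up with smaller weights, and the function returns nothing because the tensor is modified in place. A layer id of 0 would divide by zero, which is why the sketch starts counting at 1.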
References
[Beit2022] Hangbo Bao, Li Dong, Songhao Piao, Furu Wei (2022). BEiT: BERT Pre-Training of Image Transformers.