WindowMSAV2

class mmpretrain.models.utils.WindowMSAV2(embed_dims, window_size, num_heads, qkv_bias=True, attn_drop=0.0, proj_drop=0.0, cpb_mlp_hidden_dims=512, pretrained_window_size=(0, 0), init_cfg=None)[source]

Window based multi-head self-attention (W-MSA) module with relative position bias.

Based on the implementation in the original Swin Transformer V2 repository. Refer to https://github.com/microsoft/Swin-Transformer/blob/main/models/swin_transformer_v2.py for more details.

Parameters:
  • embed_dims (int) – Number of input channels.

  • window_size (tuple[int]) – The height and width of the window.

  • num_heads (int) – Number of attention heads.

  • qkv_bias (bool) – If True, add a learnable bias to q, k, v. Defaults to True.

  • attn_drop (float) – Dropout ratio of the attention weights. Defaults to 0.

  • proj_drop (float) – Dropout ratio of the output. Defaults to 0.

  • cpb_mlp_hidden_dims (int) – The hidden dimensions of the continuous relative position bias network. Defaults to 512.

  • pretrained_window_size (tuple[int]) – The height and width of the window used in pre-training. Defaults to (0, 0), which means no pretrained model is loaded.

  • init_cfg (dict, optional) – The extra config for initialization. Defaults to None.

forward(x, mask=None)[source]
Parameters:
  • x (tensor) – Input features with shape (num_windows*B, N, C).

  • mask (tensor, optional) – Mask with shape (num_windows, Wh*Ww, Wh*Ww); values should lie in (-inf, 0]. Defaults to None.
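A minimal usage sketch based on the signatures above, assuming mmpretrain and torch are installed. The embedding size, window size, head count, and batch/window counts are illustrative choices, not values taken from the docstring.

    import torch
    from mmpretrain.models.utils import WindowMSAV2

    # C = 96 channels, 7x7 window (so N = 49 tokens per window), 3 heads.
    attn = WindowMSAV2(
        embed_dims=96,
        window_size=(7, 7),
        num_heads=3,
    )

    # x has shape (num_windows*B, N, C): here 4 windows and batch size 2.
    x = torch.randn(4 * 2, 7 * 7, 96)
    out = attn(x)  # output keeps the input shape: (8, 49, 96)

    # Optional mask with shape (num_windows, Wh*Ww, Wh*Ww); 0 keeps a
    # position, a large negative value (toward -inf) masks it out.
    mask = torch.zeros(4, 49, 49)
    out_masked = attn(x, mask=mask)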
