PatchMerging

class mmpretrain.models.utils.PatchMerging(in_channels, out_channels, kernel_size=2, stride=None, padding='corner', dilation=1, bias=False, norm_cfg={'type': 'LN'}, use_post_norm=False, init_cfg=None)

Merge patch feature map.

Modified from the mmcv implementation; this version additionally supports specifying whether to use post-norm.

This layer groups the feature map by kernel_size and applies normalization and a linear layer to the grouped feature map (as used in Swin Transformer). Our implementation uses torch.nn.Unfold to merge patches, which is about 25% faster than the original implementation. However, pretrained models must be modified for compatibility.
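The grouping step can be illustrated without any framework. The following is a minimal sketch, not the library's implementation: `merge_patches` is a hypothetical helper, it assumes the height and width are divisible by the kernel size (so no padding is needed), and it assumes the channel-major ordering that torch.nn.Unfold produces (all kernel positions of channel 0, then channel 1, and so on). The normalization and linear layers are omitted.

```python
def merge_patches(tokens, height, width, kernel=2):
    """Group a flattened (H*W, C) token list into (H/k * W/k, C*k*k) rows.

    Hypothetical helper for illustration only; assumes stride == kernel
    and that height and width are divisible by kernel.
    """
    assert len(tokens) == height * width
    assert height % kernel == 0 and width % kernel == 0
    channels = len(tokens[0])
    merged = []
    for top in range(0, height, kernel):
        for left in range(0, width, kernel):
            row = []
            # Channel-major order (assumed, as in torch.nn.Unfold):
            # every kernel position of channel 0, then channel 1, ...
            for c in range(channels):
                for dy in range(kernel):
                    for dx in range(kernel):
                        row.append(tokens[(top + dy) * width + (left + dx)][c])
            merged.append(row)
    return merged, (height // kernel, width // kernel)


# A 4x4 map with 2 channels; token i carries the values (i, 100 + i).
tokens = [[i, 100 + i] for i in range(16)]
merged, out_size = merge_patches(tokens, height=4, width=4)
print(out_size)                     # (2, 2)
print(len(merged), len(merged[0]))  # 4 8
print(merged[0])                    # [0, 1, 4, 5, 100, 101, 104, 105]
```

Each merged row has kernel * kernel times as many channels as an input token; the linear layer then projects that to out_channels. An ordering difference of this kind is one plausible reason pretrained weights from another implementation would need to be converted.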

Parameters:
  • in_channels (int) – The number of input channels.

  • out_channels (int) – The number of output channels.

  • kernel_size (int | tuple, optional) – The kernel size in the unfold layer. Defaults to 2.

  • stride (int | tuple, optional) – The stride of the sliding blocks in the unfold layer. Defaults to None, in which case it is set to kernel_size.

  • padding (int | tuple | string) – The padding length. When it is a string, it selects the adaptive padding mode; “same” and “corner” are currently supported. Defaults to “corner”.

  • dilation (int | tuple, optional) – The dilation parameter in the unfold layer. Defaults to 1.

  • bias (bool, optional) – Whether to add a bias to the linear layer. Defaults to False.

  • norm_cfg (dict, optional) – Config dict for normalization layer. Defaults to dict(type='LN').

  • use_post_norm (bool) – Whether to use post normalization here. Defaults to False.

  • init_cfg (dict, optional) – The extra config for initialization. Defaults to None.

forward(x, input_size)
Parameters:
  • x (Tensor) – Has shape (B, H*W, C_in).

  • input_size (tuple[int]) – The spatial shape of x, arranged as (H, W).

Returns:

Contains the merged result and its spatial shape.

  • x (Tensor): Has shape (B, Merged_H * Merged_W, C_out).

  • out_size (tuple[int]): The spatial shape of x, arranged as (Merged_H, Merged_W).

Return type:

tuple
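To make the relationship between input_size and out_size concrete, here is a rough sketch of how the merged spatial shape could be computed. It is not taken from the library source: `adaptive_pad` and `merged_size` are hypothetical helpers, all parameters are assumed to be plain ints, and the adaptive-padding rule (pad just enough that the sliding window fully covers the input) is an assumption about how the “same” and “corner” modes size their padding. The sliding-window count itself follows the output-length formula documented for torch.nn.Unfold.

```python
import math


def adaptive_pad(size, kernel, stride, dilation):
    """Assumed rule: pad so the sliding window fully covers `size`."""
    out = math.ceil(size / stride)
    return max((out - 1) * stride + (kernel - 1) * dilation + 1 - size, 0)


def merged_size(input_size, kernel_size=2, stride=None, dilation=1):
    """Estimate (Merged_H, Merged_W) for an (H, W) input (hypothetical helper)."""
    # stride=None falls back to kernel_size, as described for the stride parameter.
    stride = kernel_size if stride is None else stride
    out = []
    for size in input_size:
        padded = size + adaptive_pad(size, kernel_size, stride, dilation)
        # Number of sliding blocks along one dimension (torch.nn.Unfold formula).
        out.append((padded - dilation * (kernel_size - 1) - 1) // stride + 1)
    return tuple(out)


print(merged_size((56, 56)))  # (28, 28)
print(merged_size((7, 7)))    # (4, 4)
```

Under these assumptions, an even-sized input is halved exactly with the default settings, while an odd-sized input is padded by one row and one column before merging.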
