MAEPretrainHead

class mmpretrain.models.heads.MAEPretrainHead(loss, norm_pix=False, patch_size=16, in_channels=3)

Head for MAE Pre-training.

Parameters:

loss (dict) – Config of loss.
norm_pix (bool) – Whether to normalize the reconstruction target. Defaults to False.
patch_size (int) – Patch size. Defaults to 16.
in_channels (int) – Number of input channels. Defaults to 3.
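
As a rough illustration, the head can be declared as a config dict built from the parameters above. The loss entry here (a loss registered as PixelReconstructionLoss with an L2 criterion) is an assumption for the sake of the example; this section does not specify which loss config to use.

```python
# Hypothetical head config; the `loss` entry is an assumption for illustration.
head_cfg = dict(
    type='MAEPretrainHead',
    norm_pix=True,     # normalize the reconstruction target
    patch_size=16,     # side length of each square patch
    in_channels=3,     # e.g. RGB input
    loss=dict(type='PixelReconstructionLoss', criterion='L2'),
)
```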

construct_target(target)

Construct the reconstruction target.

In addition to splitting the image into tokens, this method also normalizes the target when norm_pix is True.

Parameters: target (torch.Tensor) – Image with the shape of B x C x H x W
Returns: Tokenized images with the shape of B x L x C
Return type: torch.Tensor
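
The normalization step can be sketched as follows. This is a minimal sketch, not the library's code: it assumes the image has already been split into tokens (see patchify) and that normalization uses each token's own mean and variance, as in the MAE paper. The helper name normalize_tokens and the eps value are hypothetical.

```python
import torch


def normalize_tokens(tokens: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Normalize each token by its own mean and variance (assumed behavior).

    tokens: tensor of shape (B, L, D), one row of pixel values per patch.
    """
    mean = tokens.mean(dim=-1, keepdim=True)
    var = tokens.var(dim=-1, keepdim=True)
    # eps guards against division by zero for constant patches
    return (tokens - mean) / (var + eps) ** 0.5


tokens = torch.arange(16, dtype=torch.float32).reshape(1, 4, 4)
normalized = normalize_tokens(tokens)
```

After this step every token has approximately zero mean and unit variance, so the reconstruction loss is not dominated by bright or high-contrast patches.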

loss(pred, target, mask)

Generate loss.

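Parameters:

pred (torch.Tensor) – The reconstructed image.
target (torch.Tensor) – The target image.
mask (torch.Tensor) – The mask of the target image.
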
Returns:

The reconstruction loss.

Return type:

torch.Tensor
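
For intuition, a masked reconstruction loss can be sketched as below. This assumes an L2 criterion, a target that has already been converted to tokens, and a mask of shape (B, L) in which 1 marks a masked patch. The actual loss is whatever the loss config builds; the function name masked_l2_loss is hypothetical.

```python
import torch


def masked_l2_loss(pred: torch.Tensor, target: torch.Tensor,
                   mask: torch.Tensor) -> torch.Tensor:
    """Mean squared error averaged over masked patches only (assumed form).

    pred, target: (B, L, D) token tensors. mask: (B, L), 1 = masked patch.
    """
    per_patch = ((pred - target) ** 2).mean(dim=-1)  # (B, L)
    # Only masked patches contribute; visible patches are ignored.
    return (per_patch * mask).sum() / mask.sum()


pred = torch.zeros(1, 4, 3)
target = torch.ones(1, 4, 3)
mask = torch.tensor([[1.0, 1.0, 0.0, 0.0]])
loss_value = masked_l2_loss(pred, target, mask)
```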

patchify(imgs)

Split images into non-overlapping patches.

Parameters: imgs (torch.Tensor) – A batch of images. The shape should be \((B, C, H, W)\).
Returns: Patchified images. The shape is \((B, L, \text{patch_size}^2 \times C)\).
Return type: torch.Tensor

unpatchify(x)

Combine non-overlapping patches into images.

Parameters: x (torch.Tensor) – Patchified images. The shape is \((B, L, \text{patch_size}^2 \times C)\).
Returns: Reconstructed images. The shape is \((B, C, H, W)\).
Return type: torch.Tensor
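
The inverse operation can be sketched in the same way. This sketch assumes a square patch grid (L is a perfect square) and channel-last ordering inside each token. Passing patch_size and in_channels as explicit arguments is a choice made for this example only.

```python
import math

import torch


def unpatchify(x: torch.Tensor, patch_size: int = 16,
               in_channels: int = 3) -> torch.Tensor:
    """Combine (B, L, patch_size**2 * C) tokens into (B, C, H, W) images."""
    B, L, D = x.shape
    p = patch_size
    h = w = math.isqrt(L)  # assumes a square grid of patches
    assert h * w == L, 'L must be a perfect square'
    assert D == p * p * in_channels, 'token length does not match patch size'
    x = x.reshape(B, h, w, p, p, in_channels)
    # Reorder to (B, C, h, p, w, p) so rows and columns can be merged.
    x = x.permute(0, 5, 1, 3, 2, 4)
    return x.reshape(B, in_channels, h * p, w * p)


tokens = torch.arange(2 * 4 * 48, dtype=torch.float32).reshape(2, 4, 48)
images = unpatchify(tokens, patch_size=4, in_channels=3)
```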