CLIPGenerator

class mmpretrain.models.selfsup.CLIPGenerator(tokenizer_path)

Get the features and attention from the last layer of CLIP.

This module is used to generate target features in masked image modeling.

Parameters: tokenizer_path (str) – Path to the CLIP checkpoint.

forward(x)

Get the features and attention from the last layer of CLIP.

Parameters: x (torch.Tensor) – The input image, of shape (N, 3, H, W).
Returns: The features and attention from the last layer of CLIP, of shape (N, L, C) and (N, L, L), respectively.
Return type: Tuple[torch.Tensor, torch.Tensor]
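
The shape contract described for forward(x) can be checked with a small helper. The following is a minimal sketch, not part of mmpretrain: the function name `check_output_shapes` is hypothetical, and it operates only on shape tuples, so it does not need a CLIP checkpoint or PyTorch to run. The example values 197 and 512 are arbitrary placeholders, not values taken from the documentation.

```python
from typing import Sequence, Tuple


def check_output_shapes(
    x_shape: Sequence[int],
    feat_shape: Sequence[int],
    attn_shape: Sequence[int],
) -> Tuple[int, int, int]:
    """Validate the documented shapes and return (N, L, C).

    Hypothetical helper: it only encodes the contract stated in the
    docstring, i.e. input (N, 3, H, W), features (N, L, C) and
    attention (N, L, L).
    """
    x_shape = tuple(x_shape)
    feat_shape = tuple(feat_shape)
    attn_shape = tuple(attn_shape)

    # The input must be a batch of 3-channel images.
    if len(x_shape) != 4 or x_shape[1] != 3:
        raise ValueError(f"expected input (N, 3, H, W), got {x_shape}")

    # Features must be a 3-D tensor (N, L, C).
    if len(feat_shape) != 3:
        raise ValueError(f"expected features (N, L, C), got {feat_shape}")

    n, l, c = feat_shape

    # The batch size must match between input and features.
    if n != x_shape[0]:
        raise ValueError(
            f"batch size mismatch: input {x_shape[0]}, features {n}")

    # Attention must be square over the L tokens for each sample.
    if attn_shape != (n, l, l):
        raise ValueError(
            f"expected attention {(n, l, l)}, got {attn_shape}")

    return n, l, c


# Placeholder shapes for illustration only (L=197 and C=512 are arbitrary).
n, l, c = check_output_shapes(
    (2, 3, 224, 224), (2, 197, 512), (2, 197, 197))
```

With real tensors, the same helper could be called with `x.shape` and the `.shape` of each returned tensor, since `torch.Size` behaves like a tuple of ints.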