VideoDataPreprocessor

class mmpretrain.models.utils.data_preprocessor.VideoDataPreprocessor(mean=None, std=None, pad_size_divisor=1, pad_value=0, to_rgb=False, format_shape='NCHW')

Video pre-processor for operations such as normalization and BGR-to-RGB conversion.

Compared with mmaction.ActionDataPreprocessor, this module supports inputs given as a torch.Tensor or a list of torch.Tensor.

Parameters:

mean (Sequence[float or int], optional) – The pixel mean of channels of images or stacked optical flow. Defaults to None.
std (Sequence[float or int], optional) – The pixel standard deviation of channels of images or stacked optical flow. Defaults to None.
pad_size_divisor (int) – The size of padded image should be divisible by pad_size_divisor. Defaults to 1.
pad_value (float or int) – The padded pixel value. Defaults to 0.
to_rgb (bool) – Whether to convert image from BGR to RGB. Defaults to False.
format_shape (str) – Format shape of input data. Defaults to 'NCHW'.
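
The effect of pad_size_divisor can be illustrated with a little arithmetic. The following is a minimal sketch, not the library's code: padded_size is a hypothetical helper that computes the smallest height and width that are at least as large as the input and divisible by pad_size_divisor.

```python
from typing import Tuple


def padded_size(height: int, width: int, pad_size_divisor: int = 1) -> Tuple[int, int]:
    """Hypothetical helper: round each spatial size up to a multiple of the divisor."""
    # -(-x // d) is integer ceiling division, so no floating point is involved.
    target_h = -(-height // pad_size_divisor) * pad_size_divisor
    target_w = -(-width // pad_size_divisor) * pad_size_divisor
    return target_h, target_w


print(padded_size(224, 300, 32))  # (224, 320)
print(padded_size(224, 300, 1))   # (224, 300)
```

With the default pad_size_divisor=1 every size is already divisible, so no padding is required.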

forward(data, training=False)

Performs normalization, padding and BGR-to-RGB conversion based on BaseDataPreprocessor.

Parameters:

data (dict) – data sampled from dataloader.
training (bool) – Whether to enable training time augmentation. If subclasses override this method, they can perform different preprocessing strategies for training and testing based on the value of training.

Returns:

Data in the same format as the model input.

Return type:

Tuple[List[torch.Tensor], Optional[list]]
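
The three operations named above can be sketched in plain PyTorch for a single clip. This is an illustrative stand-in rather than the module's implementation: preprocess_clip is a hypothetical function, the (C, T, H, W) layout is assumed, and the order of the steps (channel swap, then normalization, then padding on the bottom and right) is an assumption made for this example.

```python
import torch
import torch.nn.functional as F


def preprocess_clip(clip, mean, std, to_rgb=False, pad_size_divisor=1, pad_value=0):
    """Hypothetical sketch of the documented steps for one (C, T, H, W) clip."""
    clip = clip.float()
    if to_rgb:
        # Reverse the channel order: BGR -> RGB.
        clip = clip[[2, 1, 0]]
    # Reshape per-channel statistics so they broadcast over T, H and W.
    mean_t = torch.tensor(mean, dtype=torch.float32).view(-1, 1, 1, 1)
    std_t = torch.tensor(std, dtype=torch.float32).view(-1, 1, 1, 1)
    clip = (clip - mean_t) / std_t
    # Pad H and W up to the next multiple of pad_size_divisor.
    h, w = clip.shape[-2:]
    pad_h = -h % pad_size_divisor
    pad_w = -w % pad_size_divisor
    # F.pad takes (left, right, top, bottom) for the last two dimensions.
    return F.pad(clip, (0, pad_w, 0, pad_h), mode="constant", value=pad_value)


clip = torch.full((3, 2, 5, 7), 100.0)  # C, T, H, W
out = preprocess_clip(clip, mean=[100, 100, 100], std=[50, 50, 50], pad_size_divisor=4)
print(out.shape)  # torch.Size([3, 2, 8, 8])
```

Here a 5x7 frame is padded to 8x8 because 8 is the next multiple of 4 in both dimensions.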