ConvNeXt

class mmpretrain.models.backbones.ConvNeXt(arch='tiny', in_channels=3, stem_patch_size=4, norm_cfg={'eps': 1e-06, 'type': 'LN2d'}, act_cfg={'type': 'GELU'}, linear_pw_conv=True, use_grn=False, drop_path_rate=0.0, layer_scale_init_value=1e-06, out_indices=-1, frozen_stages=0, gap_before_final_norm=True, with_cp=False, init_cfg=[{'type': 'TruncNormal', 'layer': ['Conv2d', 'Linear'], 'std': 0.02, 'bias': 0.0}, {'type': 'Constant', 'layer': ['LayerNorm'], 'val': 1.0, 'bias': 0.0}])[source]

ConvNeXt v1 & v2 backbone.

A PyTorch implementation of "A ConvNet for the 2020s" and "ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders".

Modified from the official repo and timm.

To use ConvNeXt v2, set use_grn=True and layer_scale_init_value=0. (see the example after the parameter list).

Parameters:
  • arch (str | dict) –

    The model’s architecture. If a string, it should be one of the architectures in ConvNeXt.arch_settings. If a dict, it should include the following two keys:

    • depths (list[int]): Number of blocks at each stage.

    • channels (list[int]): The number of channels at each stage.

    Defaults to ‘tiny’.

  • in_channels (int) – Number of input image channels. Defaults to 3.

  • stem_patch_size (int) – The size of one patch in the stem layer. Defaults to 4.

  • norm_cfg (dict) – The config dict for norm layers. Defaults to dict(type='LN2d', eps=1e-6).

  • act_cfg (dict) – The config dict for activation between pointwise convolution. Defaults to dict(type='GELU').

  • linear_pw_conv (bool) – Whether to use linear layer to do pointwise convolution. Defaults to True.

  • use_grn (bool) – Whether to add Global Response Normalization in the blocks. Defaults to False.

  • drop_path_rate (float) – Stochastic depth rate. Defaults to 0.

  • layer_scale_init_value (float) – Init value for Layer Scale. Defaults to 1e-6.

  • out_indices (Sequence | int) – Output from which stages. Defaults to -1, which means the last stage.

  • frozen_stages (int) – Stages to be frozen (all parameters fixed). Defaults to 0, which means no parameters are frozen.

  • gap_before_final_norm (bool) – Whether to globally average-pool the feature map before the final norm layer. In the official repo, this is only used for the classification task. Defaults to True.

  • with_cp (bool) – Whether to use checkpointing. Using checkpointing saves some memory at the cost of slower training. Defaults to False.

  • init_cfg (dict, optional) – Initialization config dict.
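
Example (a minimal usage sketch; the printed shape assumes the default gap_before_final_norm=True and the 'tiny' channel widths):

    import torch
    from mmpretrain.models.backbones import ConvNeXt

    # ConvNeXt v1 'tiny': depths (3, 3, 9, 3), channels (96, 192, 384, 768).
    model = ConvNeXt(arch='tiny', out_indices=-1)
    model.eval()

    inputs = torch.rand(1, 3, 224, 224)
    with torch.no_grad():
        outputs = model(inputs)  # a tuple with one tensor per out_index
    # The last stage is globally pooled before the final norm layer, so the
    # feature should have shape (1, 768) for the 'tiny' architecture.
    print(outputs[-1].shape)

    # ConvNeXt v2 variant of the same architecture: enable GRN and
    # disable Layer Scale, as described above.
    model_v2 = ConvNeXt(arch='tiny', use_grn=True, layer_scale_init_value=0.)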

get_layer_depth(param_name, prefix='')[source]

Get the layer-wise depth of a parameter.

Parameters:
  • param_name (str) – The name of the parameter.

  • prefix (str) – The prefix for the parameter. Defaults to an empty string.

Returns:

The layer-wise depth and the number of layers.

Return type:

Tuple[int, int]
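
Example (a minimal sketch of how the layer-wise depths might be used, e.g. when building layer-wise learning-rate decay parameter groups; parameter names are read from the model itself rather than hard-coded):

    from mmpretrain.models.backbones import ConvNeXt

    model = ConvNeXt(arch='tiny')
    # Map every backbone parameter to its layer-wise depth.
    for name, _ in model.named_parameters():
        depth, num_layers = model.get_layer_depth(name)
        print(f'{name}: depth {depth} of {num_layers} layers')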
