- class mmpretrain.models.backbones.Conformer(arch='tiny', patch_size=16, base_channels=64, mlp_ratio=4.0, qkv_bias=True, with_cls_token=True, drop_path_rate=0.0, norm_eval=True, frozen_stages=0, out_indices=-1, init_cfg=None)
A PyTorch implementation of "Conformer: Local Features Coupling Global Representations for Visual Recognition".
patch_size (int) – The patch size. Defaults to 16.
base_channels (int) – The base number of channels in the CNN. Defaults to 64.
mlp_ratio (float) – The expansion ratio of the FFN in each transformer block. Defaults to 4.
with_cls_token (bool) – Whether to use a class token. Defaults to True.
drop_path_rate (float) – Stochastic depth rate. Defaults to 0.
out_indices (Sequence | int) – Indices of the stages whose outputs are returned. Defaults to -1, which means the last stage.
init_cfg (dict, optional) – Initialization config dict. Defaults to None.
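
The parameters above can be collected into a config dict, as in the minimal sketch below. The keyword names and defaults are copied from the signature. The `type='Conformer'` key assumes the usual OpenMMLab convention of naming the class to build, and `as_index_list` is a hypothetical helper, not part of mmpretrain, that only illustrates how an `int` or `Sequence` value for `out_indices` could be normalized to a list.

```python
from collections.abc import Sequence

# Defaults copied from the documented signature. The 'type' key is an
# assumption based on the OpenMMLab config convention.
conformer_cfg = dict(
    type='Conformer',
    arch='tiny',
    patch_size=16,
    base_channels=64,
    mlp_ratio=4.0,
    qkv_bias=True,
    with_cls_token=True,
    drop_path_rate=0.0,
    norm_eval=True,
    frozen_stages=0,
    out_indices=-1,  # -1 selects the last stage
    init_cfg=None,
)


def as_index_list(out_indices):
    """Hypothetical helper: turn an int or a sequence of ints into a list."""
    if isinstance(out_indices, int):
        return [out_indices]
    if isinstance(out_indices, Sequence) and not isinstance(out_indices, str):
        return list(out_indices)
    raise TypeError(
        f'out_indices must be an int or a Sequence, got {type(out_indices)}')


# Both accepted forms end up as a plain list of stage indices.
last_stage_only = as_index_list(conformer_cfg['out_indices'])
several_stages = as_index_list((0, 2))
```

Keeping the arguments in a dict makes it easy to override a single entry, for example `out_indices`, without repeating the remaining defaults.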