LearningRateDecayOptimWrapperConstructor

class mmpretrain.engine.optimizers.LearningRateDecayOptimWrapperConstructor(optim_wrapper_cfg, paramwise_cfg=None)[source]

Different learning rates are set for different layers of the backbone.

By default, each parameter shares the same optimizer settings, and we provide the argument paramwise_cfg to specify parameter-wise settings. It is a dict and may contain the following fields:

  • layer_decay_rate (float): The learning rate of a parameter is multiplied by this value multiple times according to the layer depth of the parameter: the earlier the layer, the more times the factor is applied. Usually it's less than 1, so that earlier layers have a lower learning rate. Defaults to 1.

  • bias_decay_mult (float): The weight decay of all bias parameters (except those in normalization layers) will be multiplied by this value.

  • norm_decay_mult (float): The weight decay of all weight and bias parameters in normalization layers will be multiplied by this value.

  • flat_decay_mult (float): The weight decay of all one-dimensional parameters will be multiplied by this value.

  • custom_keys (dict): Specifies parameter-wise settings by key. If one of the keys in custom_keys is a substring of a parameter's name, the settings of that parameter are taken from custom_keys[key], and other settings such as bias_decay_mult are ignored. Each value should be a dict and may contain the field decay_mult. (lr_mult is disabled in this constructor.)

Example:

In the config file, you can use this constructor as below:

optim_wrapper = dict(
    optimizer=dict(
        type='AdamW',
        lr=4e-3,
        weight_decay=0.05,
        eps=1e-8,
        betas=(0.9, 0.999)),
    constructor='LearningRateDecayOptimWrapperConstructor',
    paramwise_cfg=dict(
        layer_decay_rate=0.75,  # layer-wise lr decay factor
        norm_decay_mult=0.,
        flat_decay_mult=0.,
        custom_keys={
            '.cls_token': dict(decay_mult=0.0),
            '.pos_embed': dict(decay_mult=0.0)
        }))
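With the settings above, the learning-rate scale decays geometrically toward the earlier layers. A minimal sketch of the arithmetic (lr_scale is a hypothetical helper; the exact depth indexing depends on the backbone's get_layer_depth implementation):

```python
# Hypothetical helper illustrating layer-wise lr decay: a parameter at a
# shallower layer depth receives a smaller learning-rate scale.
def lr_scale(layer_depth: int, max_depth: int, decay_rate: float) -> float:
    # The deepest layer keeps the base lr (scale 1.0); each step toward
    # the input multiplies the scale by decay_rate once more.
    return decay_rate ** (max_depth - layer_depth)

base_lr = 4e-3  # matches the config above
for depth in range(4):
    print(f"depth {depth}: lr = {base_lr * lr_scale(depth, 3, 0.75):.2e}")
```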
add_params(params, module, prefix='', get_layer_depth=None, **kwargs)[source]

Add all parameters of module to the params list.

The parameters of the given module will be added to the list of param groups, with specific rules defined by paramwise_cfg.

Parameters:
  • params (List[dict]) – A list of param groups, it will be modified in place.

  • module (nn.Module) – The module to be added.

  • optimizer_cfg (dict) – The configuration of the optimizer.

  • prefix (str) – The prefix of the module.
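As an illustration of what this method produces, here is a simplified sketch of appending one param group per parameter with paramwise_cfg-style multipliers applied (add_params_sketch and the parameter names are hypothetical, not the actual mmpretrain implementation):

```python
# Hypothetical sketch: walk (name, shape) pairs and append one group per
# parameter to the params list, applying decay multipliers in the spirit
# of norm_decay_mult and flat_decay_mult.
def add_params_sketch(params, named_params, base_wd,
                      norm_decay_mult=1.0, flat_decay_mult=1.0):
    for name, shape in named_params:
        group = {"name": name, "weight_decay": base_wd}
        if "norm" in name:        # parameter of a normalization layer
            group["weight_decay"] = base_wd * norm_decay_mult
        elif len(shape) == 1:     # one-dimensional ("flat") parameter
            group["weight_decay"] = base_wd * flat_decay_mult
        params.append(group)      # modified in place, as in add_params

groups = []
add_params_sketch(
    groups,
    [("backbone.proj.weight", (4, 4)),
     ("backbone.proj.bias", (4,)),
     ("backbone.norm.weight", (4,))],
    base_wd=0.05, norm_decay_mult=0.0, flat_decay_mult=0.0)
print([g["weight_decay"] for g in groups])  # only the 2-D weight keeps decay
```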
