LARS

class mmpretrain.engine.optimizers.LARS(params, lr, momentum=0, weight_decay=0, dampening=0, eta=0.001, nesterov=False, eps=1e-08)[source]

Implements layer-wise adaptive rate scaling for SGD.

Based on Algorithm 1 of the paper by You, Gitman, and Ginsburg, Large Batch Training of Convolutional Networks.

Parameters:
  • params (Iterable) – Iterable of parameters to optimize or dicts defining parameter groups.

  • lr (float) – Base learning rate.

  • momentum (float) – Momentum factor. Defaults to 0.

  • weight_decay (float) – Weight decay (L2 penalty). Defaults to 0.

  • dampening (float) – Dampening for momentum. Defaults to 0.

  • eta (float) – LARS coefficient. Defaults to 0.001.

  • nesterov (bool) – Enables Nesterov momentum. Defaults to False.

  • eps (float) – A small number to avoid division by zero (see the sketch after this parameter list). Defaults to 1e-8.
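
The eta, weight_decay, and eps parameters combine into a per-layer "trust ratio" that rescales the base learning rate for each parameter tensor. The snippet below is an illustrative sketch of that computation following Algorithm 1 of the paper, not the optimizer's actual code; the name local_lr is introduced here only for illustration.

>>> import torch
>>> def local_lr(weight, grad, eta=1e-3, weight_decay=0., eps=1e-8):
...     # Trust ratio from Algorithm 1: eta * ||w|| / (||g|| + wd * ||w||).
...     # The effective learning rate for this layer is lr * local_lr.
...     w_norm = torch.norm(weight)
...     g_norm = torch.norm(grad)
...     if w_norm == 0 or g_norm == 0:
...         # Degenerate tensors fall back to the plain base learning rate.
...         return 1.0
...     return eta * w_norm / (g_norm + weight_decay * w_norm + eps)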

Examples

>>> optimizer = LARS(model.parameters(), lr=0.1, momentum=0.9,
...                  weight_decay=1e-4, eta=1e-3)
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()
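
In mmpretrain training configs, the optimizer is usually selected declaratively rather than constructed by hand. A minimal sketch, assuming the standard MMEngine optim_wrapper convention; the hyper-parameter values are placeholders, not recommended settings:

optim_wrapper = dict(
    optimizer=dict(
        type='LARS', lr=1.6, momentum=0.9, weight_decay=1e-6, eta=0.001))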
step(closure=None)[source]

Performs a single optimization step.

Parameters:

closure (callable, optional) – A closure that reevaluates the model and returns the loss.
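
As in the standard torch.optim interface, a closure should clear the gradients, recompute the loss, call backward(), and return the loss; step() then returns that loss. A minimal sketch, assuming model, loss_fn, input, and target are defined as in the example above:

>>> def closure():
...     optimizer.zero_grad()
...     loss = loss_fn(model(input), target)
...     loss.backward()
...     return loss
>>> loss = optimizer.step(closure)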
