vissl.optimizers package

vissl.optimizers.get_optimizer_param_groups(model, model_config, optimizer_config, optimizer_schedulers)[source]

Go through all the layers, sort out which parameters should be regularized (weight decay applied) and which should not, and pick the optimization settings for the head and trunk. Only the trainable parameters are kept and added to the param_groups.

Returns

param_groups (List[Dict])

[
    {
        "params": trunk_regularized_params,
        "lr": lr_value,
        "weight_decay": wd_value,
    },
    {
        "params": trunk_unregularized_params,
        "lr": lr_value,
        "weight_decay": 0.0,
    },
    {
        "params": head_regularized_params,
        "lr": head_lr_value,
        "weight_decay": head_weight_decay,
    },
    {
        "params": head_unregularized_params,
        "lr": head_lr_value,
        "weight_decay": 0.0,
    },
    {
        "params": remaining_regularized_params,
        "lr": lr_value,
    },
]
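A minimal sketch of the kind of split this function performs, using plain dicts as stand-ins for model parameters. The naming rules (exempting "bias" and "bn" parameters from weight decay) and the helper `build_param_groups` are illustrative assumptions, not VISSL's actual implementation:

```python
# Hypothetical sketch of a regularized/unregularized split. Parameters
# whose name contains "bias" or "bn" are exempted from weight decay;
# head and trunk get separate LR / weight-decay settings.
def build_param_groups(named_params, lr, head_lr, wd, head_wd):
    groups = {
        "trunk_reg": [], "trunk_unreg": [],
        "head_reg": [], "head_unreg": [],
    }
    for name, param in named_params:
        if not param.get("requires_grad", True):
            continue  # only trainable params enter param_groups
        is_head = name.startswith("head.")
        unregularized = "bias" in name or "bn" in name
        key = ("head_" if is_head else "trunk_") + ("unreg" if unregularized else "reg")
        groups[key].append(name)
    return [
        {"params": groups["trunk_reg"], "lr": lr, "weight_decay": wd},
        {"params": groups["trunk_unreg"], "lr": lr, "weight_decay": 0.0},
        {"params": groups["head_reg"], "lr": head_lr, "weight_decay": head_wd},
        {"params": groups["head_unreg"], "lr": head_lr, "weight_decay": 0.0},
    ]

# Dicts stand in for nn.Parameter objects here.
params = [
    ("trunk.conv.weight", {}),
    ("trunk.bn.weight", {}),
    ("head.fc.weight", {}),
    ("head.fc.bias", {}),
    ("head.frozen.weight", {"requires_grad": False}),  # filtered out
]
groups = build_param_groups(params, lr=0.1, head_lr=0.01, wd=1e-4, head_wd=1e-5)
```

A list shaped like this can be passed directly to a PyTorch optimizer (e.g. `torch.optim.SGD(param_groups, ...)`), where per-group "lr" and "weight_decay" override the optimizer defaults.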

vissl.optimizers.optimizer_helper module

vissl.optimizers.optimizer_helper.get_optimizer_param_groups(model, model_config, optimizer_config, optimizer_schedulers)[source]

Go through all the layers, sort out which parameters should be regularized (weight decay applied) and which should not, and pick the optimization settings for the head and trunk. Only the trainable parameters are kept and added to the param_groups.

Returns

param_groups (List[Dict])

[
    {
        "params": trunk_regularized_params,
        "lr": lr_value,
        "weight_decay": wd_value,
    },
    {
        "params": trunk_unregularized_params,
        "lr": lr_value,
        "weight_decay": 0.0,
    },
    {
        "params": head_regularized_params,
        "lr": head_lr_value,
        "weight_decay": head_weight_decay,
    },
    {
        "params": head_unregularized_params,
        "lr": head_lr_value,
        "weight_decay": 0.0,
    },
    {
        "params": remaining_regularized_params,
        "lr": lr_value,
    },
]

vissl.optimizers.param_scheduler.cosine_warm_restart_scheduler module

class vissl.optimizers.param_scheduler.cosine_warm_restart_scheduler.CosineWaveTypes(value)[source]

Bases: str, enum.Enum

An enumeration.

half = 'half'

full = 'full'

class vissl.optimizers.param_scheduler.cosine_warm_restart_scheduler.CosineWarmRestartScheduler(start_value: float, end_value: float, restart_interval_length: float, wave_type: str, lr_multiplier: float, is_adaptive: bool, update_interval: classy_vision.optim.param_scheduler.classy_vision_param_scheduler.UpdateInterval = <UpdateInterval.STEP: 'step'>)[source]

Bases: classy_vision.optim.param_scheduler.classy_vision_param_scheduler.ClassyParamScheduler

Changes the param value based on a cosine schedule. By default the schedule is updated after every train step; see update_interval.

Can be used for cosine learning rate with warm restarts. For restarts, we calculate what will be the maximum learning rate after every restart. There are 3 options:

  • Option 1: LR after every restart is the same as the original max LR

  • Option 2: LR after every restart decays with a fixed LR multiplier

  • Option 3: LR after every restart is adaptively calculated such that the resulting max LR matches the original cosine wave LR

Parameters
  • wave_type – half | full

  • lr_multiplier – float; the max LR after every restart decays by this fixed multiplier

  • is_adaptive – if True, the max LR after every restart is adaptively calculated such that the resulting max LR matches the original cosine wave LR

  • update_interval – step | epoch; whether the LR should be updated after every training iteration or after every training epoch

Example

start_value: 0.1
end_value: 0.0001
restart_interval_length: 0.5  # for 1 restart
wave_type: half
lr_multiplier: 1.0  # set < 1.0 to use a decayed max LR value at every restart
use_adaptive_decay: False
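The restart behavior can be sketched as follows. This is an illustrative half-wave implementation with a fixed lr_multiplier (Option 2), assuming `where` is the fraction of training completed in [0, 1); it is not VISSL's exact code:

```python
import math

# Cosine schedule with warm restarts (sketch). A new wave starts every
# `restart_interval_length` fraction of training; the max value of each
# wave decays by `lr_multiplier` per restart.
def cosine_warm_restart(where, start_value, end_value,
                        restart_interval_length, lr_multiplier=1.0):
    restart_idx = int(where / restart_interval_length)
    # Position within the current wave, rescaled to [0, 1)
    local = (where % restart_interval_length) / restart_interval_length
    # Max LR after each restart decays by a fixed multiplier
    max_value = start_value * (lr_multiplier ** restart_idx)
    return end_value + 0.5 * (max_value - end_value) * (1 + math.cos(math.pi * local))
```

With `restart_interval_length: 0.5` the schedule restarts once, at the halfway point of training; with `lr_multiplier: 1.0` the second wave again peaks at `start_value`.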

classmethod from_config(config: Dict[str, Any]) → vissl.optimizers.param_scheduler.cosine_warm_restart_scheduler.CosineWarmRestartScheduler[source]

Instantiates a CosineWarmRestartScheduler from a configuration.

Parameters

config – A configuration for a CosineWarmRestartScheduler. See __init__() for parameters expected in the config.

Returns

A CosineWarmRestartScheduler instance.

vissl.optimizers.param_scheduler.inverse_sqrt_decay module

class vissl.optimizers.param_scheduler.inverse_sqrt_decay.InverseSqrtScheduler(start_value: float, warmup_interval_length: float, update_interval: classy_vision.optim.param_scheduler.classy_vision_param_scheduler.UpdateInterval = <UpdateInterval.STEP: 'step'>)[source]

Bases: classy_vision.optim.param_scheduler.classy_vision_param_scheduler.ClassyParamScheduler

Decay the LR based on the inverse square root of the update number.

Example

start_value: 4.8
warmup_interval_length: 0.1

Corresponds to an inverse sqrt decay schedule with values in [4.8, 0].
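A common shape for this schedule (linear warmup followed by 1/sqrt decay, as popularized by fairseq-style schedulers) can be sketched as below. The function name and the exact warmup handling are assumptions, not VISSL's implementation; `where` is the fraction of training completed in [0, 1):

```python
import math

# Inverse square-root decay with a linear warmup fraction (sketch).
def inverse_sqrt_lr(where, start_value, warmup_interval_length):
    if warmup_interval_length > 0 and where < warmup_interval_length:
        # Linear warmup from 0 up to start_value
        return start_value * where / warmup_interval_length
    # Decay proportional to 1/sqrt of progress past the warmup point,
    # so the value is continuous at where == warmup_interval_length.
    return start_value * math.sqrt(warmup_interval_length / where)
```

With `start_value: 4.8` and `warmup_interval_length: 0.1`, the value ramps to 4.8 over the first 10% of training and then decays toward 0.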

classmethod from_config(config: Dict[str, Any]) → vissl.optimizers.param_scheduler.inverse_sqrt_decay.InverseSqrtScheduler[source]

Instantiates an InverseSqrtScheduler from a configuration.

Parameters

config – A configuration for a InverseSqrtScheduler. See __init__() for parameters expected in the config.

Returns

An InverseSqrtScheduler instance.