vissl.optimizers package

vissl.optimizers.get_optimizer_param_groups(model, model_config, optimizer_config, optimizer_schedulers)[source]

Go through all the layers, sort out which parameters should be regularized (weight decay applied) and which should not, and pick the optimization settings for the head and trunk. Only the trainable parameters are kept and added to the param_groups.

Returns

param_groups (List[Dict])

[
    {
        "params": trunk_regularized_params,
        "lr": lr_value,
        "weight_decay": wd_value,
    },
    {
        "params": trunk_unregularized_params,
        "lr": lr_value,
        "weight_decay": 0.0,
    },
    {
        "params": head_regularized_params,
        "lr": head_lr_value,
        "weight_decay": head_weight_decay,
    },
    {
        "params": head_unregularized_params,
        "lr": head_lr_value,
        "weight_decay": 0.0,
    },
    {
        "params": remaining_regularized_params,
        "lr": lr_value,
    },
]
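A minimal sketch of the kind of split this function performs, using plain dicts as stand-ins for model parameters. The naming rules (exempting "bias" and "bn" parameters from weight decay) and the helper `build_param_groups` are illustrative assumptions, not VISSL's actual implementation:

```python
# Hypothetical sketch of a regularized/unregularized split. Parameters
# whose name contains "bias" or "bn" are exempted from weight decay;
# head and trunk get separate LR / weight-decay settings.
def build_param_groups(named_params, lr, head_lr, wd, head_wd):
    groups = {
        "trunk_reg": [], "trunk_unreg": [],
        "head_reg": [], "head_unreg": [],
    }
    for name, param in named_params:
        if not param.get("requires_grad", True):
            continue  # only trainable params enter param_groups
        is_head = name.startswith("head.")
        unregularized = "bias" in name or "bn" in name
        key = ("head_" if is_head else "trunk_") + ("unreg" if unregularized else "reg")
        groups[key].append(name)
    return [
        {"params": groups["trunk_reg"], "lr": lr, "weight_decay": wd},
        {"params": groups["trunk_unreg"], "lr": lr, "weight_decay": 0.0},
        {"params": groups["head_reg"], "lr": head_lr, "weight_decay": head_wd},
        {"params": groups["head_unreg"], "lr": head_lr, "weight_decay": 0.0},
    ]

# Dicts stand in for nn.Parameter objects here.
params = [
    ("trunk.conv.weight", {}),
    ("trunk.bn.weight", {}),
    ("head.fc.weight", {}),
    ("head.fc.bias", {}),
    ("head.frozen.weight", {"requires_grad": False}),  # filtered out
]
groups = build_param_groups(params, lr=0.1, head_lr=0.01, wd=1e-4, head_wd=1e-5)
```

A list shaped like this can be passed directly to a PyTorch optimizer (e.g. `torch.optim.SGD(param_groups, ...)`), where per-group "lr" and "weight_decay" override the optimizer defaults.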

vissl.optimizers.optimizer_helper module

vissl.optimizers.optimizer_helper.get_optimizer_param_groups(model, model_config, optimizer_config, optimizer_schedulers)[source]

Go through all the layers, sort out which parameters should be regularized (weight decay applied) and which should not, and pick the optimization settings for the head and trunk. Only the trainable parameters are kept and added to the param_groups.

Returns

param_groups (List[Dict])

[
    {
        "params": trunk_regularized_params,
        "lr": lr_value,
        "weight_decay": wd_value,
    },
    {
        "params": trunk_unregularized_params,
        "lr": lr_value,
        "weight_decay": 0.0,
    },
    {
        "params": head_regularized_params,
        "lr": head_lr_value,
        "weight_decay": head_weight_decay,
    },
    {
        "params": head_unregularized_params,
        "lr": head_lr_value,
        "weight_decay": 0.0,
    },
    {
        "params": remaining_regularized_params,
        "lr": lr_value,
    },
]

vissl.optimizers.param_scheduler.cosine_warm_restart_scheduler module

class vissl.optimizers.param_scheduler.cosine_warm_restart_scheduler.CosineWaveTypes(value)[source]

Bases: str, enum.Enum

An enumeration.

half = 'half'

full = 'full'

class vissl.optimizers.param_scheduler.cosine_warm_restart_scheduler.CosineWarmRestartScheduler(start_value: float, end_value: float, restart_interval_length: float, wave_type: str, lr_multiplier: float, is_adaptive: bool, update_interval: classy_vision.optim.param_scheduler.classy_vision_param_scheduler.UpdateInterval = <UpdateInterval.STEP: 'step'>)[source]

Bases: classy_vision.optim.param_scheduler.classy_vision_param_scheduler.ClassyParamScheduler

Changes the param value based on a cosine schedule. By default the schedule is updated after every train step; see update_interval.

Can be used for cosine learning rate with warm restarts. For restarts, we calculate what will be the maximum learning rate after every restart. There are 3 options:

  • Option 1: LR after every restart is the same as the original max LR

  • Option 2: LR after every restart decays with a fixed LR multiplier

  • Option 3: LR after every restart is adaptively calculated such that the resulting max LR matches the original cosine wave LR

Parameters
  • wave_type – half | full

  • lr_multiplier – float; the max LR after every restart decays by this fixed multiplier

  • is_adaptive – if True, the max LR after every restart is adaptively calculated such that the resulting max LR matches the original cosine wave LR

  • update_interval – step | epoch; whether the LR should be updated after every training iteration or after every training epoch

Example

start_value: 0.1
end_value: 0.0001
restart_interval_length: 0.5  # for 1 restart
wave_type: half
lr_multiplier: 1.0  # set < 1.0 to use a decayed max LR value at every restart
use_adaptive_decay: False
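The restart behavior can be sketched as follows. This is an illustrative half-wave implementation with a fixed lr_multiplier (Option 2), assuming `where` is the fraction of training completed in [0, 1); it is not VISSL's exact code:

```python
import math

# Cosine schedule with warm restarts (sketch). A new wave starts every
# `restart_interval_length` fraction of training; the max value of each
# wave decays by `lr_multiplier` per restart.
def cosine_warm_restart(where, start_value, end_value,
                        restart_interval_length, lr_multiplier=1.0):
    restart_idx = int(where / restart_interval_length)
    # Position within the current wave, rescaled to [0, 1)
    local = (where % restart_interval_length) / restart_interval_length
    # Max LR after each restart decays by a fixed multiplier
    max_value = start_value * (lr_multiplier ** restart_idx)
    return end_value + 0.5 * (max_value - end_value) * (1 + math.cos(math.pi * local))
```

With `restart_interval_length: 0.5` the schedule restarts once, at the halfway point of training; with `lr_multiplier: 1.0` the second wave again peaks at `start_value`.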

classmethod from_config(config: Dict[str, Any]) → vissl.optimizers.param_scheduler.cosine_warm_restart_scheduler.CosineWarmRestartScheduler[source]

Instantiates a CosineWarmRestartScheduler from a configuration.

Parameters

config – A configuration for a CosineWarmRestartScheduler. See __init__() for parameters expected in the config.

Returns

A CosineWarmRestartScheduler instance.

vissl.optimizers.param_scheduler.inverse_sqrt_decay module

class vissl.optimizers.param_scheduler.inverse_sqrt_decay.InverseSqrtScheduler(start_value: float, warmup_interval_length: float, update_interval: classy_vision.optim.param_scheduler.classy_vision_param_scheduler.UpdateInterval = <UpdateInterval.STEP: 'step'>)[source]

Bases: classy_vision.optim.param_scheduler.classy_vision_param_scheduler.ClassyParamScheduler

Decay the LR based on the inverse square root of the update number.

Example

start_value: 4.8
warmup_interval_length: 0.1

Corresponds to an inverse sqrt decay schedule with values in [4.8, 0].
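A common shape for this schedule (linear warmup followed by 1/sqrt decay, as popularized by fairseq-style schedulers) can be sketched as below. The function name and the exact warmup handling are assumptions, not VISSL's implementation; `where` is the fraction of training completed in [0, 1):

```python
import math

# Inverse square-root decay with a linear warmup fraction (sketch).
def inverse_sqrt_lr(where, start_value, warmup_interval_length):
    if warmup_interval_length > 0 and where < warmup_interval_length:
        # Linear warmup from 0 up to start_value
        return start_value * where / warmup_interval_length
    # Decay proportional to 1/sqrt of progress past the warmup point,
    # so the value is continuous at where == warmup_interval_length.
    return start_value * math.sqrt(warmup_interval_length / where)
```

With `start_value: 4.8` and `warmup_interval_length: 0.1`, the value ramps to 4.8 over the first 10% of training and then decays toward 0.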

classmethod from_config(config: Dict[str, Any]) → vissl.optimizers.param_scheduler.inverse_sqrt_decay.InverseSqrtScheduler[source]

Instantiates an InverseSqrtScheduler from a configuration.

Parameters

config – A configuration for a InverseSqrtScheduler. See __init__() for parameters expected in the config.

Returns

An InverseSqrtScheduler instance.