vissl.optimizers package

vissl.optimizers.get_optimizer_param_groups(model, model_config, optimizer_config, optimizer_schedulers)[source]
    Go through all the layers and sort out which parameters should be regularized and which left unregularized, along with the optimization settings for the head and trunk. Only the trainable parameters are kept and added to the param_groups.

    Returns
        param_groups (List[Dict]) –

        [
            {
                "params": trunk_regularized_params,
                "lr": lr_value,
                "weight_decay": wd_value,
            },
            {
                "params": trunk_unregularized_params,
                "lr": lr_value,
                "weight_decay": 0.0,
            },
            {
                "params": head_regularized_params,
                "lr": head_lr_value,
                "weight_decay": head_weight_decay,
            },
            {
                "params": head_unregularized_params,
                "lr": head_lr_value,
                "weight_decay": 0.0,
            },
            {
                "params": remaining_regularized_params,
                "lr": lr_value,
            },
        ]
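The partitioning above can be sketched with plain Python. This is an illustrative sketch, not VISSL's implementation: the parameter names, the rule that biases and batch-norm parameters go unregularized, and the helper `build_param_groups` are all assumptions for illustration.

```python
# Hypothetical named parameters of a model with a "trunk" and a "head"
# (stand-in strings instead of real tensors, for illustration only).
named_params = {
    "trunk.conv1.weight": "w0",
    "trunk.bn1.bias": "w1",
    "head.fc.weight": "w2",
    "head.fc.bias": "w3",
}

def build_param_groups(named_params, lr=0.1, wd=1e-4, head_lr=0.3, head_wd=1e-6):
    """Build param_groups in the documented shape (illustrative rule:
    biases and norm parameters are left unregularized)."""
    groups = {k: [] for k in ("trunk_reg", "trunk_unreg", "head_reg", "head_unreg")}
    for name, param in named_params.items():
        is_head = name.startswith("head.")
        is_unreg = name.endswith(".bias") or ".bn" in name  # skip weight decay
        key = ("head" if is_head else "trunk") + ("_unreg" if is_unreg else "_reg")
        groups[key].append(param)
    return [
        {"params": groups["trunk_reg"], "lr": lr, "weight_decay": wd},
        {"params": groups["trunk_unreg"], "lr": lr, "weight_decay": 0.0},
        {"params": groups["head_reg"], "lr": head_lr, "weight_decay": head_wd},
        {"params": groups["head_unreg"], "lr": head_lr, "weight_decay": 0.0},
    ]

param_groups = build_param_groups(named_params)
```

In real use, a list in this shape is what gets handed to the optimizer constructor (e.g. a torch.optim optimizer), which applies each group's `lr` and `weight_decay` separately.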
vissl.optimizers.optimizer_helper module

vissl.optimizers.optimizer_helper.get_optimizer_param_groups(model, model_config, optimizer_config, optimizer_schedulers)[source]
    Go through all the layers and sort out which parameters should be regularized and which left unregularized, along with the optimization settings for the head and trunk. Only the trainable parameters are kept and added to the param_groups.

    Returns
        param_groups (List[Dict]) –

        [
            {
                "params": trunk_regularized_params,
                "lr": lr_value,
                "weight_decay": wd_value,
            },
            {
                "params": trunk_unregularized_params,
                "lr": lr_value,
                "weight_decay": 0.0,
            },
            {
                "params": head_regularized_params,
                "lr": head_lr_value,
                "weight_decay": head_weight_decay,
            },
            {
                "params": head_unregularized_params,
                "lr": head_lr_value,
                "weight_decay": 0.0,
            },
            {
                "params": remaining_regularized_params,
                "lr": lr_value,
            },
        ]
vissl.optimizers.param_scheduler.cosine_warm_restart_scheduler module

class vissl.optimizers.param_scheduler.cosine_warm_restart_scheduler.CosineWaveTypes(value)[source]
    An enumeration.

    half = 'half'

    full = 'full'
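The enumeration above maps directly onto Python's standard `enum` module; a minimal equivalent sketch:

```python
from enum import Enum

# Minimal sketch of the wave-type enumeration documented above:
# "half" is a half cosine wave (decay only) per interval, "full" a full wave.
class CosineWaveTypes(Enum):
    half = 'half'
    full = 'full'
```

A config string such as `wave_type: half` can then be resolved with `CosineWaveTypes('half')`.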
class vissl.optimizers.param_scheduler.cosine_warm_restart_scheduler.CosineWarmRestartScheduler(start_value: float, end_value: float, restart_interval_length: float, wave_type: str, lr_multiplier: float, is_adaptive: bool, update_interval: classy_vision.optim.param_scheduler.classy_vision_param_scheduler.UpdateInterval = <UpdateInterval.STEP: 'step'>)[source]
    Bases: classy_vision.optim.param_scheduler.classy_vision_param_scheduler.ClassyParamScheduler

    Changes the param value after every epoch based on a cosine schedule. By default the schedule is updated after every train step.

    Can be used for a cosine learning rate with warm restarts. For restarts, we calculate what the maximum learning rate will be after every restart. There are 3 options:

    Option 1: the LR after every restart is the same as the original max LR.

    Option 2: the LR after every restart decays by a fixed LR multiplier.

    Option 3: the LR after every restart is adaptively calculated such that the resulting max LR matches the original cosine wave LR.

    Parameters
        wave_type – half | full
        lr_multiplier – float value -> the LR after every restart decays by this fixed multiplier
        is_adaptive – True -> after every restart, the maximum LR is adaptively calculated such that the resulting max LR matches the original cosine wave LR
        update_interval – step | epoch -> whether the LR should be updated after every training iteration or after every training epoch

    Example

        start_value: 0.1
        end_value: 0.0001
        restart_interval_length: 0.5  # for 1 restart
        wave_type: half
        lr_multiplier: 1.0  # for using a decayed max LR value at every restart
        use_adaptive_decay: False
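To make Option 2 concrete, here is a self-contained sketch of a half-wave cosine schedule with warm restarts, where the peak LR decays by a fixed multiplier after each restart. This is an illustration under assumed arithmetic, not VISSL's implementation; parameter names follow the constructor above, and `where` is the fraction of training completed, as in Classy Vision schedulers.

```python
import math

def cosine_warm_restart(where, start_value, end_value,
                        restart_interval_length, lr_multiplier=1.0):
    """Half cosine wave from a (possibly decayed) max LR down to end_value,
    restarting every `restart_interval_length` fraction of training.
    `where` is the fraction of training completed, in [0, 1)."""
    restarts_so_far = int(where / restart_interval_length)
    # Position within the current interval, rescaled to [0, 1).
    t = (where % restart_interval_length) / restart_interval_length
    # Option 2: the peak decays by a fixed multiplier after each restart.
    max_value = start_value * (lr_multiplier ** restarts_so_far)
    return end_value + 0.5 * (max_value - end_value) * (1 + math.cos(math.pi * t))
```

With `lr_multiplier=1.0` this reduces to Option 1 (each restart returns to the original max LR); the adaptive peak of Option 3 would replace the `max_value` line.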
    classmethod from_config(config: Dict[str, Any]) → vissl.optimizers.param_scheduler.cosine_warm_restart_scheduler.CosineWarmRestartScheduler[source]
        Instantiates a CosineWarmRestartScheduler from a configuration.

        Parameters
            config – A configuration for a CosineWarmRestartScheduler. See __init__() for parameters expected in the config.

        Returns
            A CosineWarmRestartScheduler instance.
vissl.optimizers.param_scheduler.inverse_sqrt_decay module

class vissl.optimizers.param_scheduler.inverse_sqrt_decay.InverseSqrtScheduler(start_value: float, warmup_interval_length: float, update_interval: classy_vision.optim.param_scheduler.classy_vision_param_scheduler.UpdateInterval = <UpdateInterval.STEP: 'step'>)[source]
    Bases: classy_vision.optim.param_scheduler.classy_vision_param_scheduler.ClassyParamScheduler

    Decay the LR based on the inverse square root of the update number.

    Example

        start_value: 4.8
        warmup_interval_length: 0.1

    Corresponds to an inverse sqrt decay schedule with values in [4.8, 0].
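A self-contained sketch of such a schedule, with linear warmup followed by inverse-square-root decay. The exact VISSL arithmetic is an assumption here; the parameter names match the constructor above, and `where` is the fraction of training completed.

```python
import math

def inverse_sqrt_lr(where, start_value, warmup_interval_length):
    """`where` is the fraction of training completed, in [0, 1).
    Illustrative: linear warmup to start_value, then 1/sqrt decay."""
    if where < warmup_interval_length:
        # Linear warmup from 0 up to start_value.
        return start_value * where / warmup_interval_length
    # Decay proportionally to the inverse square root of progress,
    # normalized so the peak start_value is reached at the end of warmup.
    return start_value * math.sqrt(warmup_interval_length / where)
```

With `start_value: 4.8` and `warmup_interval_length: 0.1`, the LR rises to 4.8 over the first 10% of training and then decays toward 0, matching the `[4.8, 0]` range stated above.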
    classmethod from_config(config: Dict[str, Any]) → vissl.optimizers.param_scheduler.inverse_sqrt_decay.InverseSqrtScheduler[source]
        Instantiates an InverseSqrtScheduler from a configuration.

        Parameters
            config – A configuration for an InverseSqrtScheduler. See __init__() for parameters expected in the config.

        Returns
            An InverseSqrtScheduler instance.
    classmethod