vissl.utils package¶
vissl.utils.instance_retrieval_utils.data_util module¶
-
vissl.utils.instance_retrieval_utils.data_util.
is_revisited_dataset
(dataset_name: str)[source]¶ Computes whether the specified dataset name is a revisited version of the Oxford and Paris datasets. Simply looks for the patterns “roxford5k” and “rparis6k” in the specified dataset_name.
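This check amounts to a plain substring match; a minimal, hypothetical sketch of the logic:

```python
def is_revisited_dataset(dataset_name):
    # the revisited benchmarks are identified purely by substring match
    return "roxford5k" in dataset_name or "rparis6k" in dataset_name

assert is_revisited_dataset("revisited_roxford5k")
```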
-
vissl.utils.instance_retrieval_utils.data_util.
is_instre_dataset
(dataset_name: str)[source]¶ Returns True if the dataset name is “instre”. Helper function used in code at several places.
-
vissl.utils.instance_retrieval_utils.data_util.
is_whiten_dataset
(dataset_name: str)[source]¶ Returns True if the specified dataset has the name “whitening”. Users can use any dataset they want for whitening.
-
vissl.utils.instance_retrieval_utils.data_util.
add_bias_channel
(x, dim: int = 1)[source]¶ Adds a bias channel, which is useful during the pooling + whitening operation.
-
vissl.utils.instance_retrieval_utils.data_util.
flatten
(x: torch.Tensor, keepdims: bool = False)[source]¶ Flattens a B x C x H x W input to a B x (C*H*W) output; optionally retains the trailing dimensions.
-
vissl.utils.instance_retrieval_utils.data_util.
gem
(x: torch.Tensor, p: int = 3, eps: float = 1e-06, clamp: bool = True, add_bias: bool = False, keepdims: bool = False)[source]¶ GeM (generalized mean) pooling on the given tensor.
- Parameters
x (torch.Tensor) – tensor on which the pooling should be done
p (int) – pooling exponent. If p=inf, simply perform max_pool2d; if p=1 and the x tensor has grad, simply perform avg_pool2d; otherwise, perform GeM pooling for the specified p
eps (float) – if clamping the x tensor, the epsilon to use for clamping
clamp (bool) – whether to clamp the tensor
add_bias (bool) – whether to add the bias channel
keepdims (bool) – whether to flatten or keep the dimensions as is
- Returns
x (torch.Tensor) – Gem pooled tensor
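The GeM pooling above can be sketched in NumPy for the finite-p case (a torch-free illustration, not the actual implementation, which uses pooling ops on torch tensors):

```python
import numpy as np

def gem_pool(x, p=3, eps=1e-6, clamp=True):
    """Generalized-mean (GeM) pooling over the spatial dims of a (C, H, W) map."""
    if clamp:
        x = np.clip(x, eps, None)  # avoid issues with 0 ** (1/p)
    # per-channel generalized mean: mean(x ** p) ** (1/p)
    return np.mean(x ** p, axis=(1, 2)) ** (1.0 / p)

feats = np.ones((4, 7, 7))
pooled = gem_pool(feats, p=3)  # shape (4,)
```

With p=1 this reduces to average pooling, and as p grows it approaches max pooling, which matches the special cases described above.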
-
vissl.utils.instance_retrieval_utils.data_util.
l2n
(x: torch.Tensor, eps: float = 1e-06, dim: int = 1)[source]¶ L2 normalize the input tensor along the specified dimension
- Parameters
x (torch.Tensor) – the tensor to normalize
eps (float) – epsilon used during normalization to avoid an inf output
dim (int) – along which dimension to L2 normalize
- Returns
x (torch.Tensor) – L2 normalized tensor
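A NumPy sketch of the same L2 normalization (the actual function operates on torch tensors):

```python
import numpy as np

def l2_normalize(x, eps=1e-6, axis=1):
    # divide by the L2 norm along `axis`; eps avoids division by zero
    norm = np.linalg.norm(x, ord=2, axis=axis, keepdims=True)
    return x / (norm + eps)

v = np.array([[3.0, 4.0]])
unit = l2_normalize(v)  # approximately [[0.6, 0.8]]
```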
-
class
vissl.utils.instance_retrieval_utils.data_util.
MultigrainResize
(size: int, largest: bool = False, **kwargs)[source]¶ Bases:
torchvision.transforms.transforms.Resize
Resize with a largest=False argument, allowing images to be resized to a common largest side without cropping. Approach used in the MultiGrain paper: https://arxiv.org/pdf/1902.05509.pdf
-
class
vissl.utils.instance_retrieval_utils.data_util.
WhiteningTrainingImageDataset
(base_dir: str, image_list_file: str, num_samples: int = 0)[source]¶ Bases:
object
A set of training images for whitening
-
class
vissl.utils.instance_retrieval_utils.data_util.
InstreDataset
(dataset_path: str, num_samples: int = 0)[source]¶ Bases:
object
A dataset class that reads and parses the Instre Dataset so it’s ready to be used in the code for retrieval evaluations
-
class
vissl.utils.instance_retrieval_utils.data_util.
RevisitedInstanceRetrievalDataset
(dataset: str, dir_main: str)[source]¶ Bases:
object
A dataset class used for the Revisited Instance retrieval datasets: Revisited Oxford and Revisited Paris. The object reads and parses the datasets so it’s ready to be used in the code for retrieval evaluations.
-
class
vissl.utils.instance_retrieval_utils.data_util.
InstanceRetrievalImageLoader
(S, transforms)[source]¶ Bases:
object
The custom loader for the Paris and Oxford Instance Retrieval datasets.
-
load_and_prepare_whitening_image
(fname)[source]¶ From the filename, load the whitening image and prepare it to be used by applying the data transforms.
-
load_and_prepare_instre_image
(fname)[source]¶ From the filename, load the db or query image and prepare it to be used by applying the data transforms.
-
-
class
vissl.utils.instance_retrieval_utils.data_util.
InstanceRetrievalDataset
(path, eval_binary_path, num_samples=None)[source]¶ Bases:
object
A dataset class used for the Instance retrieval datasets: Oxford and Paris. The object reads and parses the datasets so it’s ready to be used in the code for retrieval evaluations.
Credits: https://github.com/facebookresearch/deepcluster/blob/master/eval_retrieval.py # NOQA Adapted by: Priya Goyal (prigoyal@fb.com)
vissl.utils.instance_retrieval_utils.evaluate module¶
-
vissl.utils.instance_retrieval_utils.evaluate.
score_ap_from_ranks_1
(ranks, nres)[source]¶ Compute the average precision of one search.
- Parameters
ranks – ordered list of ranks of true positives
nres – total number of positives in dataset
- Returns
ap (float) – the average precision following the Holidays and the INSTRE package
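The Holidays-style AP from ranked positives can be sketched as follows; `average_precision_from_ranks` is a hypothetical name for illustration, and the exact interpolation used in VISSL may differ:

```python
def average_precision_from_ranks(ranks, nres):
    """AP of one search from the ordered, zero-based ranks of true positives.

    ranks: sorted ranks of the true positives in the result list
    nres:  total number of positives for this query
    """
    ap = 0.0
    recall_step = 1.0 / nres
    for i, rank in enumerate(ranks):
        # precision just before and just after including this positive,
        # averaged (trapezoidal interpolation over one recall step)
        precision_0 = i / rank if rank > 0 else 1.0
        precision_1 = (i + 1) / (rank + 1)
        ap += (precision_0 + precision_1) / 2.0 * recall_step
    return ap

ap = average_precision_from_ranks([0, 1, 2], nres=3)  # perfect retrieval
```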
-
vissl.utils.instance_retrieval_utils.evaluate.
compute_ap
(ranks, nres)[source]¶ Computes average precision for given ranked indexes.
- Parameters
ranks – zero-based ranks of positive images
nres – number of positive images
- Returns
ap (float) – average precision
-
vissl.utils.instance_retrieval_utils.evaluate.
compute_map
(ranks, gnd, kappas)[source]¶ Computes the mAP for a given set of returned results.
- Credits:
https://github.com/filipradenovic/revisitop/blob/master/python/evaluate.py
- Usage:
- map = compute_map(ranks, gnd)
computes mean average precision (map) only
- map, aps, pr, prs = compute_map(ranks, gnd, kappas)
-> computes mean average precision (map) and average precision (aps) for each query
-> computes mean precision at kappas (pr) and precision at kappas (prs) for each query
Notes: 1) ranks start from 0, ranks.shape = db_size X #queries. 2) The junk results (e.g., the query itself) should be declared in the gnd struct array.
If there are no positive images for some query, that query is excluded from the evaluation
vissl.utils.instance_retrieval_utils.pca module¶
vissl.utils.instance_retrieval_utils.rmac module¶
-
vissl.utils.instance_retrieval_utils.rmac.
normalize_L2
(a, dim)[source]¶ L2 normalize the input tensor along the specified dimension
- Parameters
a (torch.Tensor) – the tensor to normalize
dim (int) – along which dimension to L2 normalize
- Returns
a (torch.Tensor) – L2 normalized tensor
-
vissl.utils.instance_retrieval_utils.rmac.
get_rmac_region_coordinates
(H, W, L)[source]¶ Almost verbatim from the Tolias et al. Matlab implementation. Could be heavily pythonized, but really not worth it. Computes region coordinates with the desired overlap between neighboring regions.
-
vissl.utils.instance_retrieval_utils.rmac.
get_rmac_descriptors
(features, rmac_levels, pca=None)[source]¶ Computes RMAC descriptors. Coordinates are retrieved following Tolias et al. The descriptors are L2 normalized and, if specified by the user, PCA is applied. After PCA, the descriptors are aggregated (summed), and the aggregated descriptor is normalized and returned.
vissl.utils.svm_utils.evaluate module¶
-
vissl.utils.svm_utils.evaluate.
calculate_ap
(rec, prec)[source]¶ Computes the AP under the precision recall curve.
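A common VOC-style way to compute the area under the precision-recall curve, shown as a hedged sketch (the exact interpolation used here may differ):

```python
import numpy as np

def voc_style_ap(rec, prec):
    # pad with sentinels, force precision to be monotonically non-increasing,
    # then sum the rectangles under the resulting step function
    mrec = np.concatenate(([0.0], rec, [1.0]))
    mpre = np.concatenate(([0.0], prec, [0.0]))
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # indices where recall changes; each contributes a rectangle of area
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))

ap = voc_style_ap(np.array([0.5, 1.0]), np.array([1.0, 1.0]))
```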
-
vissl.utils.svm_utils.evaluate.
get_precision_recall
(targets, scores, weights=None)[source]¶ [P, R, score, ap] = get_precision_recall(targets, scores, weights)
- Parameters
targets – number of occurrences of this class in the ith image
scores – score for this image
weights – 0 or 1, where 0 means the sample should be ignored
- Returns
P, R – precision and recall
score – the score which corresponds to the particular precision and recall
ap – average precision
vissl.utils.svm_utils.svm_trainer module¶
-
class
vissl.utils.svm_utils.svm_trainer.
SVMTrainer
(config, layer, output_dir)[source]¶ Bases:
object
SVM trainer that takes care of training (using k-fold cross validation), and evaluating the SVMs
-
load_input_data
(data_file, targets_file)[source]¶ Given the input data (features) and targets (labels) files, load the features of shape N x D and labels of shape (N,)
-
get_best_cost_value
()[source]¶ During the SVM training, we write the cross-validation AP value for each class and cost value combination. We load the AP values and, for each class, determine the cost value that gives the maximum AP. We return the chosen cost values for each class as a numpy matrix.
-
train_cls
(features, targets, cls_num)[source]¶ Train an SVM on the input features and targets for a given class. The SVMs are trained for all cost values for the given class. We also save the cross-validation AP at each cost value for the given class.
-
train
(features, targets)[source]¶ Train SVMs on the given features and targets for all classes and all cost values.
-
test
(features, targets)[source]¶ Test the trained SVM models on the test features and targets. For each class, we use the cost that gives the maximum cross-validation AP on the training data and load the corresponding trained SVM model for that cost value and class.
Log the test ap to stdout and also save the AP in a file.
-
vissl.utils.svm_utils.svm_low_shot_trainer module¶
-
class
vissl.utils.svm_utils.svm_low_shot_trainer.
SVMLowShotTrainer
(config, layer, output_dir)[source]¶ Bases:
vissl.utils.svm_utils.svm_trainer.SVMTrainer
Train the SVM for the low-shot image classification tasks. Currently, datasets like VOC07 and Places205 are supported.
The trainer inherits from the SVMTrainer class and takes care of training the SVMs, evaluating them, and aggregating the metrics.
-
train
(features, targets, sample_num, low_shot_kvalue)[source]¶ Train SVM on the input features and targets for a given low-shot k-value and the independent low-shot sample number.
- We save the trained SVM model for each combination:
cost value, class number, sample number, k-value
-
test
(features, targets, sample_num, low_shot_kvalue)[source]¶ - Test the SVM for the input test features and targets for the given:
low-shot k-value, sample number
We compute the meanAP across all classes for a given cost value. We get the output matrix of shape (1, #costs) for the given sample_num and k-value and save the matrix. We use this information to aggregate later.
-
aggregate_stats
(k_values, sample_inds)[source]¶ Aggregate the test AP across all k-values and independent samples.
For each low-shot k-value, we obtain the mean, max, min, std AP value. Steps:
For each k-value, get the min/max/mean/std value across all the independent samples. This results in matrices [#k-values x #classes]
Then we aggregate stats across the classes. For the mean stats in step 1, for each k-value, we get the class which has maximum mean.
-
vissl.utils.activation_checkpointing module¶
This module centralizes all activation checkpointing related code. It is a work-in-progress as we evolve the APIs and eventually put this in fairscale so that multiple projects can potentially share it.
-
vissl.utils.activation_checkpointing.
manual_gradient_reduction
(model: torch.nn.modules.module.Module, config_flag: bool) → bool[source]¶ Return if we should use manual gradient reduction or not.
We should use manual DDP if config says so and model is wrapped by DDP.
-
vissl.utils.activation_checkpointing.
manual_sync_params
(model: torch.nn.parallel.distributed.DistributedDataParallel) → None[source]¶ Manually sync params and buffers for DDP.
-
vissl.utils.activation_checkpointing.
manual_gradient_all_reduce
(model: torch.nn.parallel.distributed.DistributedDataParallel) → None[source]¶ Gradient reduction function used after backward is done.
vissl.utils.checkpoint module¶
-
vissl.utils.checkpoint.
is_training_finished
(cfg: vissl.utils.hydra_config.AttrDict, checkpoint_folder: str)[source]¶ Given the checkpoint folder, we check whether a final checkpoint already exists. If the final checkpoint exists but the user wants to override it, we mark training as not finished.
-
vissl.utils.checkpoint.
get_checkpoint_folder
(config: vissl.utils.hydra_config.AttrDict)[source]¶ Check, create and return the checkpoint folder. User can specify their own checkpoint directory otherwise the default “.” is used.
Optionally, for training that involves more than 1 machine, we allow appending the distributed run id, which helps to uniquely identify the training. This is completely optional; users can set APPEND_DISTR_RUN_ID=true for this.
-
vissl.utils.checkpoint.
is_checkpoint_phase
(mode_num: int, mode_frequency: int, train_phase_idx: int, num_epochs: int, mode: str)[source]¶ Determines whether a checkpoint should be saved at the current epoch. If epoch=1, then we check whether to save at the current iteration or not.
- Parameters
mode (str) – whether we are checkpointing models every few iterations or at the end of every phase/epoch. The mode is encoded in the checkpoint filename.
mode_num (int) – the current iteration or epoch number at which we are trying to checkpoint
mode_frequency (int) – checkpoint frequency - every N iterations or every N epochs/phase
train_phase_idx (int) – the current training phase we are in. Starts from 0
num_epochs (int) – total number of epochs in training
- Returns
checkpointing_phase (bool) – whether the model should be checkpointed or not
-
vissl.utils.checkpoint.
has_checkpoint
(checkpoint_folder: str, skip_final: bool = False)[source]¶ Check whether there are any checkpoints at all in the checkpoint folder.
-
vissl.utils.checkpoint.
has_final_checkpoint
(checkpoint_folder: str, final_checkpoint_pattern: str = 'model_final')[source]¶ Check whether the final checkpoint exists in the checkpoint folder. The final checkpoint is recognized by the prefix “model_final_” in VISSL.
-
vissl.utils.checkpoint.
get_checkpoint_resume_files
(checkpoint_folder: str, config: vissl.utils.hydra_config.AttrDict, skip_final: bool = False, latest_checkpoint_resume_num: int = 1)[source]¶ Get the checkpoint file from which the model should be resumed. We look at all the checkpoints in the checkpoint_folder and if the final model checkpoint exists (starts with model_final_) and not overriding it, then return the final checkpoint. Otherwise find the latest checkpoint.
- Parameters
checkpoint_folder (str) – path to the checkpoint folder.
config (AttrDict) – root config
skip_final (bool) – whether the final model checkpoint should be skipped or not
latest_checkpoint_resume_num (int) – the Nth-latest checkpoint to resume from. Sometimes the latest checkpoints can be corrupt, so this option helps to instead resume from a checkpoint a few steps before the last one.
-
vissl.utils.checkpoint.
get_resume_checkpoint
(cfg: vissl.utils.hydra_config.AttrDict, checkpoint_folder: str)[source]¶ Return the checkpoint from which to resume training. If no checkpoint is found, return None. Resuming training is optional, and users can set AUTO_RESUME=false to not resume the training.
If we want to overwrite the existing final checkpoint, we ignore the final checkpoint and return the previous checkpoints if they exist.
-
vissl.utils.checkpoint.
print_state_dict_shapes
(state_dict: Dict[str, Any])[source]¶ For the given model state dictionary, print the name and shape of each parameter tensor in the model state. Helps debugging.
- Parameters
state_dict (Dict[str, Any]) – model state dictionary
-
vissl.utils.checkpoint.
print_loaded_dict_info
(model_state_dict: Dict[str, Any], state_dict: Dict[str, Any], skip_layers: List[str], model_config: vissl.utils.hydra_config.AttrDict)[source]¶ Print what layers were loaded, what layers were ignored/skipped/not found when initializing a model from a specified model params file.
-
vissl.utils.checkpoint.
replace_module_prefix
(state_dict: Dict[str, Any], prefix: str, replace_with: str = '')[source]¶ Remove prefixes from a state_dict; needed when loading models that were not trained with VISSL.
Specify the prefix in the keys that should be removed.
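The prefix replacement is essentially a key-renaming pass over the state dict; a minimal sketch:

```python
def replace_module_prefix(state_dict, prefix, replace_with=""):
    # strip (or swap) `prefix` on every key that starts with it;
    # all other keys pass through unchanged
    return {
        (replace_with + key[len(prefix):]) if key.startswith(prefix) else key: val
        for key, val in state_dict.items()
    }

sd = {"_feature_blocks.conv1.weight": "w", "fc.bias": "b"}
out = replace_module_prefix(sd, "_feature_blocks.")
```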
-
vissl.utils.checkpoint.
append_module_prefix
(state_dict: Dict[str, Any], prefix: str)[source]¶ Append prefixes to a state_dict; needed when loading models that were not trained with VISSL.
- In order to load a model not trained with VISSL into VISSL, there are 2 scenarios:
If you are interested in evaluating the model features with a frozen trunk, set APPEND_PREFIX=”trunk.base_model.”. This assumes that your model is compatible with the VISSL trunks. VISSL trunk layer names start with the “_feature_blocks.” prefix; if your model doesn’t have this prefix, you can append it. For example, for a TorchVision ResNet trunk, set APPEND_PREFIX=”trunk.base_model._feature_blocks.”.
If you want to simply load the model and finetune the full model, set APPEND_PREFIX=”trunk.”. This assumes that your model is compatible with the VISSL trunks. VISSL trunk layer names start with the “_feature_blocks.” prefix; if your model doesn’t have this prefix, you can append it. For a TorchVision ResNet trunk, set APPEND_PREFIX=”trunk._feature_blocks.”.
NOTE: the prefix is appended to all the layers in the model
-
vissl.utils.checkpoint.
check_model_compatibilty
(config: vissl.utils.hydra_config.AttrDict, state_dict: Dict[str, Any])[source]¶ Given a VISSL model and a state_dict, check if the state_dict can be loaded into the VISSL model (trunk + head) based on the trunk and head prefixes that are expected. If not compatible, we raise an exception.
Prefix checked for head: heads. Prefix checked for trunk: trunk._feature_blocks. or trunk.base_model._feature_blocks.
depending on the workflow type (training | evaluation).
-
vissl.utils.checkpoint.
get_checkpoint_model_state_dict
(config: vissl.utils.hydra_config.AttrDict, state_dict: Dict[str, Any])[source]¶ Given a specified pre-trained VISSL model (composed of head and trunk), we get the state_dict that can be loaded by appending prefixes to model and trunk.
- Parameters
config (AttrDict) – full config file
state_dict (Dict) – raw state_dict loaded from the checkpoint or weights file
- Returns
state_dict (Dict) – VISSL state_dict with layer names compatible with the VISSL model, so this state_dict can be loaded directly.
-
vissl.utils.checkpoint.
init_model_from_weights
(config: vissl.utils.hydra_config.AttrDict, model, state_dict: Dict[str, Any], state_dict_key_name: str, skip_layers: List[str], replace_prefix=None, append_prefix=None)[source]¶ Initialize the model from any given params file. This is particularly useful during the feature evaluation process or when we want to evaluate a model on a range of tasks.
- Parameters
config (AttrDict) – config file
model (object) – instance of base_ssl_model
state_dict (Dict) – torch.load() of user provided params file path.
state_dict_key_name (string) – key name containing the model state dict
skip_layers (List(string)) – layer names with this key are not copied
replace_prefix (string) – remove these prefixes from the layer names (executed first)
append_prefix (string) – append the prefix to the layer names (executed after replace_prefix)
- Returns
model (object) – the model initialized from the weights file
vissl.utils.collect_env module¶
vissl.utils.env module¶
-
vissl.utils.env.
set_env_vars
(local_rank: int, node_id: int, cfg: vissl.utils.hydra_config.AttrDict)[source]¶ Set some environment variables, like the total number of gpus used in training, the distributed rank and local rank of the current gpu, whether to print the nccl debugging info, and the nccl tuning settings.
vissl.utils.hydra_config module¶
-
class
vissl.utils.hydra_config.
AttrDict
(dictionary)[source]¶ Bases:
dict
Dictionary subclass whose entries can be accessed like attributes (as well as normally). Credits: https://aiida.readthedocs.io/projects/aiida-core/en/latest/_modules/aiida/common/extendeddicts.html#AttributeDict # noqa
-
__init__
(dictionary)[source]¶ Recursively turn the dict and all its nested dictionaries into AttrDict instances.
-
__getattr__
(key)[source]¶ Read a key as an attribute.
- Raises
AttributeError – if the attribute does not correspond to an existing key.
-
__delattr__
(key)[source]¶ Delete a key as an attribute.
- Raises
AttributeError – if the attribute does not correspond to an existing key.
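A minimal sketch of such an attribute-accessible dict (simplified relative to the credited aiida AttributeDict):

```python
class AttrDict(dict):
    """Dict whose entries can be accessed as attributes, recursively."""

    def __init__(self, dictionary):
        super().__init__()
        for key, value in dictionary.items():
            # recurse so nested dicts become attribute-accessible too
            self[key] = AttrDict(value) if isinstance(value, dict) else value

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

    def __setattr__(self, key, value):
        self[key] = value

    def __delattr__(self, key):
        try:
            del self[key]
        except KeyError:
            raise AttributeError(key)

cfg = AttrDict({"OPTIMIZER": {"name": "sgd", "lr": {"base": 0.1}}})
# cfg.OPTIMIZER.name and cfg["OPTIMIZER"]["name"] both work
```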
-
-
vissl.utils.hydra_config.
convert_to_attrdict
(cfg: omegaconf.dictconfig.DictConfig, cmdline_args: List[Any] = None)[source]¶ Given the user-input Hydra config and some command line options to override the config file: 1. Merge and override the command line options in the config. 2. Convert the Hydra OmegaConf to an AttrDict structure to make it easy to access the keys in the config file.
Also check that the config version used is compatible and supported in VISSL. In the future, we would want to support upgrading old config versions if we make changes to the VISSL default config structure (deleting, renaming keys).
We infer values of some parameters in the config file using the other parameter values.
-
vissl.utils.hydra_config.
is_hydra_available
()[source]¶ Check if Hydra is available. Simply python import to test.
-
vissl.utils.hydra_config.
print_cfg
(cfg)[source]¶ Supports printing both Hydra DictConfig and also the AttrDict config
-
vissl.utils.hydra_config.
resolve_linear_schedule
(cfg, param_schedulers)[source]¶ For the given composite schedulers, for each linear schedule, if training uses 1 node only, check whether the linear warmup rule of https://arxiv.org/abs/1706.02677 is applicable and necessary.
We set end_value = scaled_lr (assuming it’s a linear warmup). In case only 1 machine is used in training, start_lr = scaled_lr and the linear warmup is not needed.
-
vissl.utils.hydra_config.
get_scaled_lr_scheduler
(cfg, param_schedulers, scaled_lr)[source]¶ Scale learning rate value for different Learning rate types. See assert_learning_rate() for how the scaled LR is calculated.
Values changed for learning rate schedules:
- cosine:
start_value = scaled_lr and end_value = scaled_lr * (end_value / start_value)
- multistep:
gamma = values[1] / values[0] and values = [scaled_lr * pow(gamma, idx) for idx in range(len(values))]
- step_with_fixed_gamma:
base_value = scaled_lr
- linear:
end_value = scaled_lr
- inverse_sqrt:
start_value = scaled_lr
- constant:
value = scaled_lr
- composite:
recursively call to scale each composition. If the composition consists of a linear schedule, we assume that a linear warmup is applied. If the linear warmup is applied, it’s possible the warmup is not necessary if the global batch_size is smaller than the base_lr_batch_size; in that case, we remove the linear warmup from the schedule.
-
vissl.utils.hydra_config.
assert_learning_rate
(cfg)[source]¶ 1) Assert the learning rate here. The LR is scaled as per https://arxiv.org/abs/1706.02677. To turn this automatic scaling off, set config.OPTIMIZER.param_schedulers.lr.auto_lr_scaling.auto_scale=false
- scaled_lr is calculated:
given base_lr_batch_size = the batch size for which the base learning rate is specified, and
base_value = the base learning rate value that will be scaled. The current batch size is used to determine how to scale the base learning rate value.
scaled_lr = ((batchsize_per_gpu * world_size) * base_value ) / base_lr_batch_size
We perform this auto-scaling for the head learning rate as well if the user wants to use a different learning rate for the head.
2) Infer the model head params weight decay: if the head should use a different weight decay value than the trunk, set it here; otherwise, the same value as the trunk will be used automatically.
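Plugging hypothetical numbers into the scaling rule above (all values chosen purely for illustration):

```python
# scaled_lr = ((batchsize_per_gpu * world_size) * base_value) / base_lr_batch_size
batchsize_per_gpu = 32   # hypothetical per-gpu batch size
world_size = 8           # hypothetical number of gpus
base_value = 0.1         # base LR, specified for base_lr_batch_size
base_lr_batch_size = 256

scaled_lr = (batchsize_per_gpu * world_size) * base_value / base_lr_batch_size
# global batch size is 32 * 8 = 256 here, so scaled_lr stays at the base value
```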
-
vissl.utils.hydra_config.
assert_losses
(cfg)[source]¶ Infer settings for the various self-supervised losses. Takes care of setting various loss parameters correctly, like the world size, batch size per gpu, effective global batch size, collator, etc. Each loss has an additional set of parameters that can be inferred to ensure smooth training in case the user forgets to adjust all the parameters.
-
vissl.utils.hydra_config.
assert_hydra_conf
(cfg)[source]¶ Infer the values of a few parameters in the config file using the values of other config parameters:
1. Infer losses.
2. Auto-scale the learning rate if the user has specified auto scaling to be True.
3. Infer the meter names (model layer names being evaluated), since we support list meters that have multiple outputs and the same target. This is very common in self-supervised learning, where we want to evaluate a metric for several layers of the model. VISSL supports running evaluation for multiple model layers in a single training run.
4. Support multi-gpu DDP eval models by attaching a dummy parameter. This is particularly helpful for multi-gpu feature extraction, especially when the dataset for which features are being extracted is large.
5. Infer what kind of labels are being used. If the user has specified a labels source, we set LABEL_TYPE to “standard” (also the vissl default); otherwise, if no label is specified, we set LABEL_TYPE to “sample_index”.
vissl.utils.io module¶
-
vissl.utils.io.
cache_url
(url: str, cache_dir: str) → str[source]¶ This implementation downloads the remote resource and caches it locally. The resource will only be downloaded if not previously requested.
-
vissl.utils.io.
create_file_symlink
(file1, file2)[source]¶ Simply create a symlink from a given file1 to file2. Useful during model checkpointing for creating symlinks to the latest successful checkpoint.
-
vissl.utils.io.
save_file
(data, filename)[source]¶ Common i/o utility to handle saving data to various file formats. Supported:
.pkl, .pickle, .npy, .json
-
vissl.utils.io.
load_file
(filename, mmap_mode=None)[source]¶ Common i/o utility to handle loading data from various file formats. Supported:
.pkl, .pickle, .npy, .json
For the npy files, we support reading the files in mmap_mode. If the mmap_mode of reading is not successful, we load data without the mmap_mode.
-
vissl.utils.io.
is_url
(input_url)[source]¶ Check if an input string is a url. We look for http(s)://, ignoring the case.
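The check reduces to a case-insensitive prefix match; a minimal sketch:

```python
import re

def is_url(input_url):
    # match an http:// or https:// scheme at the start, case-insensitively
    return bool(re.match(r"^https?://", str(input_url), re.IGNORECASE))
```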
-
vissl.utils.io.
cleanup_dir
(dir)[source]¶ Utility for deleting a directory. Useful for cleaning the storage space that contains various training artifacts like checkpoints, data etc.
-
vissl.utils.io.
copy_file
(input_file, destination_dir, tmp_destination_dir)[source]¶ Copy a given input_file from source to the destination directory.
Steps: 1. We use PathManager to extract the data to a local path. 2. We simply move the files from the PathManager-cached local directory to the user-specified destination directory. We use rsync.
How the destination dir is chosen:
If the user is using slurm, we set destination_dir = slurm_dir (see get_slurm_dir)
If the local path used by PathManager is the same as the input_file path, and the destination directory is not specified, we set destination_dir = tmp_destination_dir
- Returns
output_file (str) – the new path of the file
destination_dir (str) – the destination dir that was actually used
-
vissl.utils.io.
copy_dir
(input_dir, destination_dir, num_threads)[source]¶ Copy contents of one directory to the specified destination directory using the number of threads to speed up the copy. When the data is copied successfully, we create a copy_complete file in the destination_dir folder to mark the completion. If the destination_dir folder already exists and has the copy_complete file, we don’t copy the file.
Useful for copying datasets like ImageNet to speed up the dataloader. Using 20 threads, ImageNet takes about 20 minutes to copy.
- Returns
destination_dir (str) – directory where the contents were copied
-
vissl.utils.io.
copy_data
(input_file, destination_dir, num_threads, tmp_destination_dir)[source]¶ Copy data from one source to the other using num_threads. The data to copy can be a single file or a directory. We check what type of data and call the relevant functions.
- Returns
output_file (str) – the new path of the data (could be a file or a dir)
destination_dir (str) – the destination dir that was actually used
-
vissl.utils.io.
copy_data_to_local
(input_files, destination_dir, num_threads=40, tmp_destination_dir=None)[source]¶ Iteratively copy the list of data to a destination directory. Each data to copy could be a single file or a directory.
- Returns
output_file (str) – the new path of the file. If there were no files to copy, simply return the input_files
destination_dir (str) – the destination dir that was actually used
vissl.utils.logger module¶
-
vissl.utils.logger.
setup_logging
(name, output_dir=None, rank=0)[source]¶ Setup various logging streams: stdout and file handlers.
For file handlers, we only set them up on the master gpu.
-
vissl.utils.logger.
shutdown_logging
()[source]¶ After training is done, we ensure to shut down all the logger streams.
vissl.utils.misc module¶
-
vissl.utils.misc.
is_apex_available
()[source]¶ Check if apex is available with simple python imports.
-
vissl.utils.misc.
is_faiss_available
()[source]¶ Check if faiss is available with simple python imports. To install faiss, simply do:
If using PIP env: pip install faiss-gpu If using conda env: conda install faiss-gpu -c pytorch
-
vissl.utils.misc.
is_opencv_available
()[source]¶ Check if opencv is available with simple python imports. To install opencv, simply do: pip install opencv-python regardless of whether using conda or pip environment.
-
vissl.utils.misc.
find_free_tcp_port
()[source]¶ Find the free port that can be used for Rendezvous on the local machine. We use this for 1 machine training where the port is automatically detected.
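A standard way to implement this is to bind to port 0 and let the OS pick a free ephemeral port; a sketch:

```python
import socket

def find_free_tcp_port():
    # binding to port 0 asks the OS for any currently free port;
    # getsockname() then reports which port was assigned
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("", 0))
        return sock.getsockname()[1]

port = find_free_tcp_port()
```

Note the small race inherent to this trick: the port is released when the socket closes, so another process could grab it before the rendezvous starts.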
-
vissl.utils.misc.
get_dist_run_id
(cfg, num_nodes)[source]¶ For multi-gpu training with PyTorch, we have to specify how the gpus are going to rendezvous. This requires specifying the communication method: file, tcp and the unique rendezvous run_id that is specific to 1 run.
- We recommend:
for 1-node: use init_method=tcp and run_id=auto
for multi-node, use init_method=tcp and specify run_id={master_node}:{port}
-
vissl.utils.misc.
setup_multiprocessing_method
(method_name: str)[source]¶ PyTorch supports several multiprocessing options: forkserver | spawn | fork
We recommend and use forkserver as the default method in VISSL.
-
vissl.utils.misc.
set_seeds
(cfg, node_id=0)[source]¶ Set the python random, numpy and torch seed for each gpu. Also set the CUDA seeds if the CUDA is available. This ensures deterministic nature of the training.
-
vissl.utils.misc.
get_indices_sparse
(data)[source]¶ Faster than np.argwhere. Used in loss functions like the SwAV loss, etc.
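The same grouping idea can be sketched in pure NumPy (illustrative only; the VISSL version builds a scipy sparse matrix, which is what makes it faster than np.argwhere):

```python
import numpy as np

def group_indices_by_value(data):
    # group flat array positions by value: one index array per value 0..max
    flat = data.ravel()
    order = np.argsort(flat, kind="stable")
    # split the sorted positions at each value boundary
    boundaries = np.searchsorted(flat[order], np.arange(1, flat.max() + 1))
    return np.split(np.arange(flat.size)[order], boundaries)

groups = group_indices_by_value(np.array([0, 2, 1, 0]))
```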
-
vissl.utils.misc.
merge_features
(output_dir, split, layer, cfg)[source]¶ For multi-gpu feature extraction, each gpu saves features corresponding to its share of the data. We can merge the features across all gpus to get the features for the full data.
The features are saved along with the data indexes and label. The data indexes can be used to sort the data and ensure the uniqueness.
We organize the features, targets corresponding to the data index of each feature, ensure the uniqueness and return.
- Returns
output (Dict) – contains features, targets, inds as the keys
vissl.utils.perf_stats module¶
-
class
vissl.utils.perf_stats.
PerfTimer
(timer_name: str, perf_stats: Optional[PerfStats])[source]¶ Bases:
object
Very simple timing wrapper, with context manager wrapping. Typical usage:
- with PerfTimer('forward_pass', perf_stats):
model.forward(data)
# ...
- with PerfTimer('backward_pass', perf_stats):
model.backward(loss)
# ...
print(perf_stats.report_str())
Note that timer stats accumulate by name, so you can effectively resume a timer by re-using its name.
You can also use it without context manager, i.e. via start() / stop() directly.
If supplied PerfStats is constructed with use_cuda_events=True (which is default), then Cuda events will be added to correctly track time of async execution of Cuda kernels:
- with PerfTimer('foobar', perf_stats):
some_cpu_work()
schedule_some_cuda_work()
In the example above, the “Host” column will capture the elapsed time from the perspective of the Python process, while the “CudaEvent” column will capture the elapsed time between the scheduling of Cuda work (within the PerfTimer scope) and the completion of this work, some of which might happen outside the PerfTimer scope.
If perf_stats is None, using PerfTimer does nothing.
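The host-side timing can be sketched with a simple context manager (CPU-only, no Cuda events; `SimplePerfTimer` and the plain dict stats are illustrative stand-ins for the real PerfTimer/PerfStats classes):

```python
import time
from collections import defaultdict

class SimplePerfTimer:
    """A minimal, CPU-only sketch of the PerfTimer context-manager API."""

    def __init__(self, name, stats):
        self.name, self.stats = name, stats

    def __enter__(self):
        self._t0 = time.perf_counter()
        return self

    def __exit__(self, *exc):
        if self.stats is not None:
            # stats accumulate by name, so re-using a name resumes the timer
            self.stats[self.name] += time.perf_counter() - self._t0

stats = defaultdict(float)
with SimplePerfTimer("forward_pass", stats):
    time.sleep(0.01)  # stand-in for model.forward(data)
```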
-
start
()[source]¶ Start the recording if the perfTimer should not be skipped or if the recording is not already in progress. If using cuda, we record time of cuda events as well.
-
class
vissl.utils.perf_stats.
PerfMetric
[source]¶ Bases:
object
Encapsulates numerical tracking of a single metric, with a .update(value) API. Under-the-hood this can additionally keep track of sums, (exp.) moving averages, sum of squares (e.g. for stdev), filtered values, etc.
-
EMA_FACTOR
= 0.1¶
-
-
class
vissl.utils.perf_stats.
PerfStats
(use_cuda_events=True)[source]¶ Bases:
object
Accumulate stats (from timers) over many iterations
-
MAX_PENDING_TIMERS
= 1000¶
-
update_with_timer
(timer: vissl.utils.perf_stats.PerfTimer)[source]¶
-
vissl.utils.slurm module¶
vissl.utils.tensorboard module¶
This script contains some helpful functions to handle tensorboard setup.
-
vissl.utils.tensorboard.
is_tensorboard_available
()[source]¶ Check whether tensorboard is available or not.
- Returns
tb_available (bool) –
- based on tensorboard imports, returns whether tensorboard
is available or not.
-
vissl.utils.tensorboard.
get_tensorboard_dir
(cfg)[source]¶ Get the output directory where the tensorboard events will be written.
- Parameters
cfg (AttrDict) – user-specified config file containing the tensorboard settings, such as the log directory, logging frequency, etc.
- Returns
tensorboard_dir (str) – output directory path
-
vissl.utils.tensorboard.
get_tensorboard_hook
(cfg)[source]¶ Construct the Tensorboard hook for visualization from the specified config
- Parameters
cfg (AttrDict) – user-specified config file containing the tensorboard settings, such as the log directory, logging frequency, etc.
- Returns
SSLTensorboardHook (function) – the tensorboard hook constructed