vissl.utils package

vissl.utils.instance_retrieval_utils.data_util module

vissl.utils.instance_retrieval_utils.data_util.is_revisited_dataset(dataset_name: str)[source]

Computes whether the specified dataset name is a revisited version of the Oxford and Paris datasets. Simply looks for the patterns “roxford5k” and “rparis6k” in the specified dataset_name.

vissl.utils.instance_retrieval_utils.data_util.is_instre_dataset(dataset_name: str)[source]

Returns True if the dataset name is “instre”. Helper function used at several places in the code.

vissl.utils.instance_retrieval_utils.data_util.is_whiten_dataset(dataset_name: str)[source]

Returns True if the specified dataset has the name “whitening”. Users can use any dataset they want for whitening.

vissl.utils.instance_retrieval_utils.data_util.add_bias_channel(x, dim: int = 1)[source]

Adds a bias channel, which is useful during the pooling + whitening operation.

vissl.utils.instance_retrieval_utils.data_util.flatten(x: torch.Tensor, keepdims: bool = False)[source]

Flattens a B x C x H x W input to a B x C*H*W output; optionally retains the trailing dimensions.

vissl.utils.instance_retrieval_utils.data_util.gem(x: torch.Tensor, p: int = 3, eps: float = 1e-06, clamp: bool = True, add_bias: bool = False, keepdims: bool = False)[source]

GeM pooling on the given tensor.

  • x (torch.Tensor) – tensor on which the pooling should be done

  • p (int) – pooling exponent. If p=inf, simply perform max_pool2d. If p=1 and the x tensor has grad, simply perform avg_pool2d. Otherwise, perform GeM pooling for the specified p.

  • eps (float) – if clamping the x tensor, use the eps for clamping

  • clamp (bool) – whether to clamp the tensor

  • add_bias (bool) – whether to add the bias channel

  • keepdims (bool) – whether to flatten or keep the dimensions as is


x (torch.Tensor) – GeM pooled tensor
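The math behind GeM pooling is compact enough to sketch in plain Python; this is an illustrative reimplementation of the formula (mean of p-th powers, then the 1/p root), not the torch-based implementation documented above:

```python
def gem_pool(values, p=3.0, eps=1e-6):
    """Generalized-mean (GeM) pooling over a flat list of activations.

    Computes (mean(x ** p)) ** (1 / p): p=1 reduces to average pooling,
    and p -> inf approaches max pooling. Values are clamped to at least
    eps so the fractional power stays well defined.
    """
    clamped = [max(v, eps) for v in values]
    mean_pow = sum(v ** p for v in clamped) / len(clamped)
    return mean_pow ** (1.0 / p)
```

With p=1 this matches the average; a large p approaches the maximum activation, which mirrors the p=inf max_pool2d special case above.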

vissl.utils.instance_retrieval_utils.data_util.l2n(x: torch.Tensor, eps: float = 1e-06, dim: int = 1)[source]

L2 normalize the input tensor along the specified dimension

  • x (torch.Tensor) – the tensor to normalize

  • eps (float) – epsilon used during normalization to avoid an inf output

  • dim (int) – along which dimension to L2 normalize


x (torch.Tensor) – L2 normalized tensor
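The operation is just division by the eps-stabilized L2 norm; a minimal pure-Python sketch of the same idea:

```python
import math

def l2_normalize(vec, eps=1e-6):
    """Divide a vector by its L2 norm; eps avoids division by zero."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]
```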

class vissl.utils.instance_retrieval_utils.data_util.MultigrainResize(size: int, largest: bool = False, **kwargs)[source]

Bases: torchvision.transforms.transforms.Resize

Resize with a largest=False argument, allowing to resize to a common largest side without cropping. Approach used in the Multigrain paper.

static target_size(w: int, h: int, size: int, largest: bool = False)[source]
class vissl.utils.instance_retrieval_utils.data_util.WhiteningTrainingImageDataset(base_dir: str, image_list_file: str, num_samples: int = 0)[source]

Bases: object

A set of training images for whitening

get_filename(i: int)[source]
class vissl.utils.instance_retrieval_utils.data_util.InstreDataset(dataset_path: str, num_samples: int = 0)[source]

Bases: object

A dataset class that reads and parses the Instre Dataset so it’s ready to be used in the code for retrieval evaluations


Number of images in the dataset


Number of query images in the dataset

get_filename(i: int)[source]

Return the image filepath for the db image

get_query_filename(i: int)[source]

Return the image filepath for the query image

get_query_roi(i: int)[source]

INSTRE dataset has no notion of ROI so we return None.


Return the mean average precision value for both the train and validation splits, given the ranks (scores of the model).

score(scores, temp_dir, verbose=True)[source]

For the input scores of the model, calculate the AP metric

class vissl.utils.instance_retrieval_utils.data_util.RevisitedInstanceRetrievalDataset(dataset: str, dir_main: str)[source]

Bases: object

A dataset class used for the Revisited Instance retrieval datasets: Revisited Oxford and Revisited Paris. The object reads and parses the datasets so it’s ready to be used in the code for retrieval evaluations.

get_filename(i: int)[source]

Return the image filepath for the db image

get_query_filename(i: int)[source]

Return the image filepath for the query image


Number of images in the dataset


Number of query images in the dataset

get_query_roi(i: int)[source]

Get the ROI for the query image that we want to test retrieval

score(sim, temp_dir: str)[source]

For the input similarity scores of the model, calculate the mean AP metric and mean Precision@k metrics.

class vissl.utils.instance_retrieval_utils.data_util.InstanceRetrievalImageLoader(S, transforms)[source]

Bases: object

The custom loader for the Paris and Oxford Instance Retrieval datasets.


Apply the pre-defined transforms on the image.


from the filename, load the whitening image and prepare it to be used by applying data transforms


from the filename, load the db or query image and prepare it to be used by applying data transforms

load_and_prepare_image(fname, roi=None)[source]

Read the image, get the aspect ratio, and resize such that the largest side equals S. If there is a ROI, adapt the ROI to the new size and crop. Do not rescale the image once again. The ROI format is (xmin, ymin, xmax, ymax).

load_and_prepare_revisited_image(img_path, roi=None)[source]

Load the image, crop the roi from the image if the roi is not None, apply the image transforms.

class vissl.utils.instance_retrieval_utils.data_util.InstanceRetrievalDataset(path, eval_binary_path, num_samples=None)[source]

Bases: object

A dataset class used for the Instance retrieval datasets: Oxford and Paris. The object reads and parses the datasets so it’s ready to be used in the code for retrieval evaluations.

Credits: adapted by Priya Goyal.


Number of images in the dataset


Number of query images in the dataset


Load the data ground truth and parse the data so it’s ready to be used.

score(sim, temp_dir)[source]

From the input similarity score, compute the mean average precision

score_rnk_partial(i, idx, temp_dir)[source]

Compute the mean AP for a given single query


Return the image filepath for the db image


Return the image filepath for the query image


Get the ROI for the query image that we want to test retrieval

vissl.utils.instance_retrieval_utils.evaluate module

vissl.utils.instance_retrieval_utils.evaluate.score_ap_from_ranks_1(ranks, nres)[source]

Compute the average precision of one search.

  • ranks – ordered list of ranks of true positives

  • nres – total number of positives in dataset


ap (float) – the average precision following the Holidays and the INSTRE package

vissl.utils.instance_retrieval_utils.evaluate.compute_ap(ranks, nres)[source]

Computes average precision for given ranked indexes.

  • ranks – zero-based ranks of positive images

  • nres – number of positive images


ap (float) – average precision
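One common way to compute AP from zero-based ranks, used by the revisited Oxford/Paris benchmark code, is trapezoidal interpolation of the precision-recall curve. A pure-Python sketch (an illustration, not necessarily this exact implementation):

```python
def average_precision(ranks, nres):
    """AP from zero-based ranks of the positive images.

    ranks: sorted zero-based positions of positives in the ranked list.
    nres: total number of positives for the query.
    Precision is interpolated between just-before and just-after each
    positive is retrieved (trapezoidal rule).
    """
    ap = 0.0
    recall_step = 1.0 / nres
    for j, rank in enumerate(ranks):
        precision_before = 1.0 if rank == 0 else j / rank
        precision_after = (j + 1) / (rank + 1)
        ap += (precision_before + precision_after) * recall_step / 2.0
    return ap
```

A perfect ranking (ranks = [0, 1, ...]) yields AP = 1.0.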

vissl.utils.instance_retrieval_utils.evaluate.compute_map(ranks, gnd, kappas)[source]

Computes the mAP for a given set of returned results.


map = compute_map(ranks, gnd)
-> computes mean average precision (map) only

map, aps, pr, prs = compute_map(ranks, gnd, kappas)
-> computes mean average precision (map) and average precision (aps) for each query
-> computes mean precision at kappas (pr) and precision at kappas (prs) for each query

Notes:
  1. ranks starts from 0; ranks.shape = db_size X #queries
  2. The junk results (e.g., the query itself) should be declared in the gnd struct array
  3. If there are no positive images for some query, that query is excluded from the evaluation
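The aggregation over queries can be sketched as follows (hypothetical helper names, for illustration; per-query APs of excluded queries are represented as None):

```python
def mean_average_precision(aps):
    """Mean of per-query APs; queries with no positives (None) are excluded."""
    valid = [ap for ap in aps if ap is not None]
    return sum(valid) / len(valid) if valid else 0.0

def precision_at_k(positives_mask, k):
    """Fraction of the top-k ranked results that are positives."""
    return sum(positives_mask[:k]) / k
```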

vissl.utils.instance_retrieval_utils.pca module

class vissl.utils.instance_retrieval_utils.pca.PCA(n_components)[source]

Bases: object

Fits and applies PCA whitening

vissl.utils.instance_retrieval_utils.pca.train_and_save_pca(features, n_pca, pca_out_fname)[source]

vissl.utils.instance_retrieval_utils.rmac module

vissl.utils.instance_retrieval_utils.rmac.normalize_L2(a, dim)[source]

L2 normalize the input tensor along the specified dimension

  • a (torch.Tensor) – the tensor to normalize

  • dim (int) – along which dimension to L2 normalize


a (torch.Tensor) – L2 normalized tensor

vissl.utils.instance_retrieval_utils.rmac.get_rmac_region_coordinates(H, W, L)[source]

Almost verbatim from the Tolias et al. Matlab implementation. Could be heavily pythonized, but really not worth it. Regions are computed for a desired overlap of neighboring regions.

vissl.utils.instance_retrieval_utils.rmac.get_rmac_descriptors(features, rmac_levels, pca=None)[source]

RMAC descriptors. Coordinates are retrieved following Tolias et al. L2 normalize the descriptors and optionally apply PCA on the descriptors if specified by the user. After PCA, aggregate the descriptors (sum) and normalize the aggregated descriptor and return.

vissl.utils.svm_utils.evaluate module

vissl.utils.svm_utils.evaluate.calculate_ap(rec, prec)[source]

Computes the AP under the precision recall curve.

vissl.utils.svm_utils.evaluate.get_precision_recall(targets, scores, weights=None)[source]

[P, R, score, ap] = get_precision_recall(targets, scores, weights)

  • targets – number of occurrences of this class in the ith image

  • scores – score for this image

  • weights – 0 or 1, where 0 means the sample should be ignored


P, R – precision and recall
score – score which corresponds to the particular precision and recall
ap – average precision
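The precision-recall computation can be sketched like this (a simplified pure-Python illustration that ignores the weights argument):

```python
def precision_recall_curve(targets, scores):
    """Sort samples by decreasing score and accumulate precision/recall.

    targets: 1 for positive samples, 0 for negatives.
    Returns (precisions, recalls), one point per retrieved sample.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_pos = sum(targets)
    tp = 0
    precisions, recalls = [], []
    for n, i in enumerate(order, start=1):
        tp += targets[i]
        precisions.append(tp / n)
        recalls.append(tp / total_pos)
    return precisions, recalls
```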

vissl.utils.svm_utils.svm_trainer module

class vissl.utils.svm_utils.svm_trainer.SVMTrainer(config, layer, output_dir)[source]

Bases: object

SVM trainer that takes care of training (using k-fold cross validation), and evaluating the SVMs

load_input_data(data_file, targets_file)[source]

Given the input data (features) and targets (labels) files, load the features of shape N x D and labels of shape (N,)


During SVM training, we write the cross-validation AP value for each class and cost value combination. We load the AP values and, for each class, determine the cost value that gives the maximum AP. We return the chosen cost values for each class as a numpy matrix.
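The selection described above is an argmax over the per-class AP-vs-cost matrix; a small sketch with hypothetical names:

```python
def choose_best_costs(ap_matrix, costs):
    """ap_matrix[c][j] holds the cross-validation AP for class c at
    costs[j]; return, for each class, the cost with the maximum AP."""
    best = []
    for class_aps in ap_matrix:
        j = max(range(len(costs)), key=lambda idx: class_aps[idx])
        best.append(costs[j])
    return best
```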

train_cls(features, targets, cls_num)[source]

Train SVM on the input features and targets for a given class. The SVMs are trained for all costs values for the given class. We also save the cross-validation AP at each cost value for the given class.

train(features, targets)[source]

Train SVMs on the given features and targets for all classes and all the costs values.

test(features, targets)[source]

Test the trained SVM models on the test features and targets values. We use the cost per class that gives the maximum cross validation AP on the training data and load the corresponding trained SVM model for that cost value and class.

Log the test ap to stdout and also save the AP in a file.

vissl.utils.svm_utils.svm_low_shot_trainer module

class vissl.utils.svm_utils.svm_low_shot_trainer.SVMLowShotTrainer(config, layer, output_dir)[source]

Bases: vissl.utils.svm_utils.svm_trainer.SVMTrainer

Train the SVM for the low-shot image classification tasks. Currently, datasets like VOC07 and Places205 are supported.

The trainer inherits from the SVMTrainer class and takes care of training the SVM, evaluating it, and aggregating the metrics.

train(features, targets, sample_num, low_shot_kvalue)[source]

Train SVM on the input features and targets for a given low-shot k-value and the independent low-shot sample number.

We save the trained SVM model for each combination:

cost value, class number, sample number, k-value

test(features, targets, sample_num, low_shot_kvalue)[source]
Test the SVM for the input test features and targets for the given:

low-shot k-value, sample number

We compute the mean AP across all classes for a given cost value. We get the output matrix of shape (1, #costs) for the given sample_num and k-value and save the matrix. We use this information to aggregate later.

aggregate_stats(k_values, sample_inds)[source]

Aggregate the test AP across all k-values and independent samples.

For each low-shot k-value, we obtain the mean, max, min, std AP value. Steps:

  1. For each k-value, get the min/max/mean/std value across all the independent samples. This results in matrices [#k-values x #classes]

  2. Then we aggregate stats across the classes. For the mean stats in step 1, for each k-value, we get the class which has maximum mean.

vissl.utils.activation_checkpointing module

This module centralizes all activation checkpointing related code. It is a work-in-progress as we evolve the APIs and eventually put this in fairscale so that multiple projects can potentially share it.

vissl.utils.activation_checkpointing.manual_gradient_reduction(model: torch.nn.modules.module.Module, config_flag: bool)bool[source]

Return if we should use manual gradient reduction or not.

We should use manual DDP if config says so and model is wrapped by DDP.

vissl.utils.activation_checkpointing.manual_sync_params(model: torch.nn.parallel.distributed.DistributedDataParallel)None[source]

Manually sync params and buffers for DDP.

vissl.utils.activation_checkpointing.manual_gradient_all_reduce(model: torch.nn.parallel.distributed.DistributedDataParallel)None[source]

Gradient reduction function used after backward is done.

vissl.utils.activation_checkpointing.layer_splittable_before(m: torch.nn.modules.module.Module)bool[source]

Return if this module can be split in front of it for checkpointing. We don’t split the relu module.

vissl.utils.activation_checkpointing.checkpoint_trunk(feature_blocks: Dict[str, torch.nn.modules.module.Module], unique_out_feat_keys: List[str], checkpointing_splits: int) → Dict[str, torch.nn.modules.module.Module][source]

Checkpoint a list of blocks and return back the split version.

vissl.utils.checkpoint module

vissl.utils.checkpoint.is_training_finished(cfg: vissl.utils.hydra_config.AttrDict, checkpoint_folder: str)[source]

Given the checkpoint folder, we check whether a final checkpoint already exists. If the final checkpoint exists but the user wants to override it, we mark training as not finished.

  • cfg (AttrDict) – input config file specified by user and parsed by vissl

  • checkpoint_folder (str) – the directory where the checkpoints exist


boolean whether training is finished or not.

vissl.utils.checkpoint.get_checkpoint_folder(config: vissl.utils.hydra_config.AttrDict)[source]

Check, create and return the checkpoint folder. User can specify their own checkpoint directory otherwise the default “.” is used.

Optionally, for training that involves more than 1 machine, we allow appending the distributed run id, which helps to uniquely identify the training. This is completely optional and the user can set APPEND_DISTR_RUN_ID=true for this.

vissl.utils.checkpoint.is_checkpoint_phase(mode_num: int, mode_frequency: int, train_phase_idx: int, num_epochs: int, mode: str)[source]

Determines if a checkpoint should be saved at the current epoch. If epoch=1, then we check whether to save at the current iteration or not.

  • mode (str) – the mode at which we are checkpointing models: every few iterations or at the end of every phase/epoch. The mode is encoded in the checkpoint filename.

  • mode_num (int) – what is the current iteration or epoch number that we are trying to checkpoint at.

  • mode_frequency (int) – checkpoint frequency - every N iterations or every N epochs/phase

  • train_phase_idx (int) – the current training phase we are in. Starts from 0

  • num_epochs (int) – total number of epochs in training


checkpointing_phase (bool) – whether the model should be checkpointed or not
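A simplified sketch of such a frequency check (hypothetical logic that only illustrates the idea; the actual function also encodes the mode in the filename):

```python
def should_checkpoint(mode_num, mode_frequency, train_phase_idx,
                      num_epochs, mode):
    """Checkpoint every `mode_frequency` iterations, or every
    `mode_frequency` phases plus always at the final phase."""
    if mode == "phase":
        is_last_phase = train_phase_idx == num_epochs - 1
        return (train_phase_idx + 1) % mode_frequency == 0 or is_last_phase
    # mode == "iteration"
    return mode_num % mode_frequency == 0
```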

vissl.utils.checkpoint.has_checkpoint(checkpoint_folder: str, skip_final: bool = False)[source]

Check whether there are any checkpoints at all in the checkpoint folder.

  • checkpoint_folder (str) – path to the checkpoint folder

  • skip_final (bool) – if a checkpoint with the model_final_ prefix exists, whether to skip it and train.


checkpoint_exists (bool) – whether checkpoint exists or not

vissl.utils.checkpoint.has_final_checkpoint(checkpoint_folder: str, final_checkpoint_pattern: str = 'model_final')[source]

Check whether the final checkpoint exists in the checkpoint folder. The final checkpoint is recognized by the prefix “model_final_” in VISSL.

  • checkpoint_folder (str) – path to the checkpoint folder.

  • final_checkpoint_pattern (str) – what prefix is used to save the final checkpoint.


has_final_checkpoint – whether the final checkpoint exists or not

vissl.utils.checkpoint.get_checkpoint_resume_files(checkpoint_folder: str, config: vissl.utils.hydra_config.AttrDict, skip_final: bool = False, latest_checkpoint_resume_num: int = 1)[source]

Get the checkpoint file from which the model should be resumed. We look at all the checkpoints in the checkpoint_folder and if the final model checkpoint exists (starts with model_final_) and not overriding it, then return the final checkpoint. Otherwise find the latest checkpoint.

  • checkpoint_folder (str) – path to the checkpoint folder.

  • config (AttrDict) – root config

  • skip_final (bool) – whether the final model checkpoint should be skipped or not

  • latest_checkpoint_resume_num (int) – which Nth latest checkpoint to resume from. Sometimes the latest checkpoints could be corrupt, so this option helps to resume instead from a checkpoint a few steps before the latest one.

vissl.utils.checkpoint.get_resume_checkpoint(cfg: vissl.utils.hydra_config.AttrDict, checkpoint_folder: str)[source]

Return the checkpoint from which to resume training. If no checkpoint is found, return None. Resuming training is optional and the user can set AUTO_RESUME=false to not resume the training.

If we want to overwrite the existing final checkpoint, we ignore the final checkpoint and return the previous checkpoints if they exist.

vissl.utils.checkpoint.print_state_dict_shapes(state_dict: Dict[str, Any])[source]

For the given model state dictionary, print the name and shape of each parameter tensor in the model state. Helps debugging.


state_dict (Dict[str, Any]) – model state dictionary

vissl.utils.checkpoint.print_loaded_dict_info(model_state_dict: Dict[str, Any], state_dict: Dict[str, Any], skip_layers: List[str], model_config: vissl.utils.hydra_config.AttrDict)[source]

Print what layers were loaded, what layers were ignored/skipped/not found when initializing a model from a specified model params file.

vissl.utils.checkpoint.replace_module_prefix(state_dict: Dict[str, Any], prefix: str, replace_with: str = '')[source]

Remove prefixes in a state_dict needed when loading models that are not VISSL trained models.

Specify the prefix in the keys that should be removed.
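The key rewriting is essentially a dictionary comprehension; a sketch with an illustrative helper name:

```python
def strip_state_dict_prefix(state_dict, prefix, replace_with=""):
    """Rewrite keys that start with `prefix`, e.g. "module.conv1.weight"
    -> "conv1.weight"; keys without the prefix are kept unchanged."""
    return {
        (replace_with + key[len(prefix):]) if key.startswith(prefix) else key: value
        for key, value in state_dict.items()
    }
```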

vissl.utils.checkpoint.append_module_prefix(state_dict: Dict[str, Any], prefix: str)[source]

Append prefixes in a state_dict needed when loading models that are not VISSL trained models.

In order to load the model (if not trained with VISSL) with VISSL, there are 2 scenarios:
  1. If you are interested in evaluating the model features and freezing the trunk: set APPEND_PREFIX=”trunk.base_model.” This assumes that your model is compatible with the VISSL trunks. The VISSL trunks start with the “_feature_blocks.” prefix. If your model doesn’t have these prefixes you can append them. For example: for a TorchVision ResNet trunk, set APPEND_PREFIX=”trunk.base_model._feature_blocks.”

  2. If you want to simply load the model and finetune the full model: set APPEND_PREFIX=”trunk.” This assumes that your model is compatible with the VISSL trunks. The VISSL trunks start with the “_feature_blocks.” prefix. If your model doesn’t have these prefixes you can append them. For a TorchVision ResNet trunk, set APPEND_PREFIX=”trunk._feature_blocks.”

NOTE: the prefix is appended to all the layers in the model

vissl.utils.checkpoint.check_model_compatibilty(config: vissl.utils.hydra_config.AttrDict, state_dict: Dict[str, Any])[source]

Given a VISSL model and state_dict, check if the state_dict can be loaded to VISSL model (trunk + head) based on the trunk and head prefix that is expected. If not compatible, we raise exception.

Prefix checked for the head: heads. Prefix checked for the trunk: trunk._feature_blocks. or trunk.base_model._feature_blocks., depending on the workflow type (training | evaluation).

  • config (AttrDict) – root config

  • state_dict (Dict[str, Any]) – state dict that should be checked for compatibility

vissl.utils.checkpoint.get_checkpoint_model_state_dict(config: vissl.utils.hydra_config.AttrDict, state_dict: Dict[str, Any])[source]

Given a specified pre-trained VISSL model (composed of head and trunk), we get the state_dict that can be loaded by appending prefixes to model and trunk.

  • config (AttrDict) – full config file

  • state_dict (Dict) – raw state_dict loaded from the checkpoint or weights file


state_dict (Dict) – vissl state_dict with layer names compatible with the vissl model, so this state_dict can be loaded directly.

vissl.utils.checkpoint.init_model_from_weights(config: vissl.utils.hydra_config.AttrDict, model, state_dict: Dict[str, Any], state_dict_key_name: str, skip_layers: List[str], replace_prefix=None, append_prefix=None)[source]

Initialize the model from any given params file. This is particularly useful during the feature evaluation process or when we want to evaluate a model on a range of tasks.

  • config (AttrDict) – config file

  • model (object) – instance of base_ssl_model

  • state_dict (Dict) – torch.load() of user provided params file path.

  • state_dict_key_name (string) – key name containing the model state dict

  • skip_layers (List(string)) – layer names with this key are not copied

  • replace_prefix (string) – remove these prefixes from the layer names (executed first)

  • append_prefix (string) – append the prefix to the layer names (executed after replace_prefix)


model (object) – the model initialized from the weights file

vissl.utils.collect_env module


Collect information about the user system including cuda, torch, gpus, vissl and its dependencies. Users are strongly recommended to run this script to collect this information if they need debugging help.

vissl.utils.env module

vissl.utils.env.set_env_vars(local_rank: int, node_id: int, cfg: vissl.utils.hydra_config.AttrDict)[source]

Set some environment variables like total number of gpus used in training, distributed rank and local rank of the current gpu, whether to print the nccl debugging info and tuning nccl settings.


Print information about user system environment where VISSL is running.


Get the distributed and local rank of the current gpu.

vissl.utils.hydra_config module

class vissl.utils.hydra_config.AttrDict(dictionary)[source]

Bases: dict

Dictionary subclass whose entries can be accessed like attributes (as well as normally).


Recursively turn the dict and all its nested dictionaries into AttrDict instance.


Read a key as an attribute.


AttributeError – if the attribute does not correspond to an existing key.

__setattr__(key, value)[source]

Set a key as an attribute.


Delete a key as an attribute.


AttributeError – if the attribute does not correspond to an existing key.


Needed for pickling this class.


Needed for pickling this class.


Deep copy.

vissl.utils.hydra_config.convert_to_attrdict(cfg: omegaconf.dictconfig.DictConfig, cmdline_args: List[Any] = None)[source]

Given the user input Hydra Config, and some command line input options to override the config file:
  1. Merge and override the command line options in the config.
  2. Convert the Hydra OmegaConf to an AttrDict structure to make it easy to access the keys in the config file.
  3. Also check that the config version used is compatible and supported in vissl. In the future, we would want to support upgrading old config versions if we make changes to the VISSL default config structure (deleting, renaming keys).
  4. Infer values of some parameters in the config file using the other parameter values.


Check if Hydra is available. Simply python import to test.


Supports printing both Hydra DictConfig and also the AttrDict config

vissl.utils.hydra_config.resolve_linear_schedule(cfg, param_schedulers)[source]

For the given composite schedulers, for each linear schedule, if the training is on 1 node only, we check whether the linear warmup rule is applicable and necessary.

We set the end_value = scaled_lr (assuming it’s a linear warmup). In case only 1 machine is used in training, the start_lr = scaled_lr and then the linear warmup is not needed.

vissl.utils.hydra_config.get_scaled_lr_scheduler(cfg, param_schedulers, scaled_lr)[source]

Scale learning rate value for different Learning rate types. See assert_learning_rate() for how the scaled LR is calculated.

Values changed for learning rate schedules:
  1. cosine:
     end_value = scaled_lr * (end_value / start_value), start_value = scaled_lr
  2. multistep:
     gamma = values[1] / values[0], values = [scaled_lr * pow(gamma, idx) for idx in range(len(values))]
  3. step_with_fixed_gamma:
     base_value = scaled_lr
  4. linear:
     end_value = scaled_lr
  5. inverse_sqrt:
     start_value = scaled_lr
  6. constant:
     value = scaled_lr
  7. composite:
     recursively call to scale each composition. If the composition consists of a linear schedule, we assume that a linear warmup is applied. If the linear warmup is applied, it’s possible the warmup is not necessary if the global batch_size is smaller than the base_lr_batch_size; in that case, we remove the linear warmup from the schedule.


1. Assert the learning rate here. The LR is scaled automatically; to turn this automatic scaling off, set the auto-scaling option in the config to false.

scaled_lr is calculated as follows: given base_lr_batch_size = the batch size for which the base learning rate is specified, and base_value = the base learning rate value that will be scaled, the current batch size is used to determine how to scale the base learning rate value:

scaled_lr = ((batchsize_per_gpu * world_size) * base_value) / base_lr_batch_size

We perform this auto-scaling for the head learning rate as well if the user wants to use a different learning rate for the head.

2. Infer the model head params weight decay: if the head should use a different weight decay value than the trunk, set it here; otherwise, the same value as the trunk will be automatically used.
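The scaling rule above can be expressed directly (illustrative helper name):

```python
def scale_lr(base_value, base_lr_batch_size, batchsize_per_gpu, world_size):
    """Linear LR scaling: grow the base LR in proportion to the global
    batch size relative to the reference batch size."""
    global_batch_size = batchsize_per_gpu * world_size
    return (global_batch_size * base_value) / base_lr_batch_size
```

For example, with a base LR of 0.1 specified for batch size 256, training at a global batch size of 512 yields a scaled LR of 0.2.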


Infer settings for various self-supervised losses. Takes care of setting various loss parameters correctly like world size, batch size per gpu, effective global batch size, collator etc. Each loss has additional set of parameters that can be inferred to ensure smooth training in case user forgets to adjust all the parameters.


Infer values of a few parameters in the config file using the values of other config parameters:
  1. Infer losses.
  2. Auto-scale the learning rate if the user has set auto scaling to True.
  3. Infer meter names (the model layer name being evaluated), since we support list meters that have multiple outputs and the same target. This is very common in self-supervised learning where we want to evaluate a metric for several layers of the model. VISSL supports running evaluation for multiple model layers in a single training run.
  4. Support multi-gpu DDP eval models by attaching a dummy parameter. This is particularly helpful for multi-gpu feature extraction, especially when the dataset is large.
  5. Infer what kind of labels are being used. If the user has specified a labels source, we set LABEL_TYPE to “standard” (also the vissl default); otherwise, if no label is specified, we set LABEL_TYPE to “sample_index”.

vissl.utils.io module

This implementation downloads the remote resource and caches it locally. The resource will only be downloaded if not previously requested.

Simply create the symlinks for a given file1 to file2. Useful during model checkpointing to symlink to the latest successful checkpoint., filename)[source]

Common i/o utility to handle saving data to various file formats. Supported:

.pkl, .pickle, .npy, .json, mmap_mode=None)[source]

Common i/o utility to handle loading data from various file formats. Supported:

.pkl, .pickle, .npy, .json

For the npy files, we support reading the files in mmap_mode. If reading with mmap_mode is not successful, we load the data without mmap_mode.[source]

Create the directory if it does not exist.[source]

Check if an input string is a url. We look for http(s):// and ignore the case.[source]

Utility for deleting a directory. Useful for cleaning the storage space that contains various training artifacts like checkpoints, data etc.[source]

Given a file, get the size of the file in MB., destination_dir, tmp_destination_dir)[source]

Copy a given input_file from source to the destination directory.

Steps:
  1. We use PathManager to extract the data to a local path.
  2. We simply move the files from the PathManager cached local directory to the user specified destination directory. We use rsync.

How the destination dir is chosen:
  1. If the user is using slurm, we set destination_dir = slurm_dir (see get_slurm_dir).
  2. If the local path used by PathManager is the same as the input_file path, and the destination directory is not specified, we set destination_dir = tmp_destination_dir.

output_file (str) – the new path of the file
destination_dir (str) – the destination dir that was actually used, destination_dir, num_threads)[source]

Copy contents of one directory to the specified destination directory using the number of threads to speed up the copy. When the data is copied successfully, we create a copy_complete file in the destination_dir folder to mark the completion. If the destination_dir folder already exists and has the copy_complete file, we don’t copy the file.

Useful for copying datasets like ImageNet to speed up the dataloader. Using 20 threads, ImageNet takes about 20 minutes to copy.


destination_dir (str) – directory where the contents were copied, destination_dir, num_threads, tmp_destination_dir)[source]

Copy data from one source to the other using num_threads. The data to copy can be a single file or a directory. We check what type of data and call the relevant functions.


output_file (str) – the new path of the data (could be a file or a dir)
destination_dir (str) – the destination dir that was actually used, destination_dir, num_threads=40, tmp_destination_dir=None)[source]

Iteratively copy the list of data to a destination directory. Each data to copy could be a single file or a directory.


output_file (str) – the new path of the file. If there were no files to copy, simply return the input_files.

destination_dir (str) – the destination dir that was actually used

vissl.utils.logger module

vissl.utils.logger.setup_logging(name, output_dir=None, rank=0)[source]

Setup various logging streams: stdout and file handlers.

For file handlers, we only set them up for the master gpu.


After training is done, we ensure to shut down all the logger streams.


Log nvidia-smi snapshot. Useful to capture the configuration of gpus.


Parse the nvidia-smi output and extract the memory used stats. Not recommended to use.

vissl.utils.misc module


Check if apex is available with simple python imports.


Check if faiss is available with simple python imports. To install faiss, simply do:

If using a PIP env: pip install faiss-gpu
If using a conda env: conda install faiss-gpu -c pytorch


Check if opencv is available with simple python imports. To install opencv, simply do: pip install opencv-python regardless of whether using conda or pip environment.


Find the free port that can be used for Rendezvous on the local machine. We use this for 1 machine training where the port is automatically detected.
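The usual trick for this is to bind a socket to port 0 and let the OS pick an unused port; a sketch:

```python
import socket

def find_free_tcp_port():
    """Bind to port 0 so the OS assigns a free ephemeral port, and
    return that port number. Note the port could in principle be taken
    by another process before it is actually used."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("localhost", 0))
        return sock.getsockname()[1]
```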

vissl.utils.misc.get_dist_run_id(cfg, num_nodes)[source]

For multi-gpu training with PyTorch, we have to specify how the gpus are going to rendezvous. This requires specifying the communication method: file, tcp and the unique rendezvous run_id that is specific to 1 run.

We recommend:
  1. for 1-node: use init_method=tcp and run_id=auto

  2. for multi-node, use init_method=tcp and specify run_id={master_node}:{port}
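For the tcp case, the run_id above typically becomes a torch.distributed init URL. A hypothetical helper (to_init_url is not VISSL's API, just an illustration of the URL shape):

```python
def to_init_url(init_method: str, run_id: str) -> str:
    # torch.distributed expects init_method URLs like
    # "tcp://host:port" or "file:///path/to/file".
    # Assumption: run_id already has the "{master_node}:{port}"
    # (or file path) shape described above.
    assert init_method in ("tcp", "file")
    prefix = "tcp://" if init_method == "tcp" else "file://"
    return prefix + run_id
```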

vissl.utils.misc.setup_multiprocessing_method(method_name: str)[source]

PyTorch supports several multiprocessing options: forkserver | spawn | fork

We recommend and use forkserver as the default method in VISSL.
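A sketch using the stdlib multiprocessing module, whose set_start_method API torch.multiprocessing mirrors:

```python
import multiprocessing

def setup_multiprocessing_method(method_name: str = "forkserver") -> None:
    # force=True allows overriding a previously set start method;
    # "forkserver" is a Unix-only option (use "spawn" elsewhere).
    multiprocessing.set_start_method(method_name, force=True)
```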

vissl.utils.misc.set_seeds(cfg, node_id=0)[source]

Set the python random, numpy and torch seeds for each gpu. Also set the CUDA seeds if CUDA is available. This helps make training deterministic.
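A minimal sketch of such seeding, assuming a simple seed + node_id derivation (VISSL's exact derivation may differ); torch seeding is guarded so the sketch also runs without torch installed:

```python
import random
import numpy as np

def set_seeds_sketch(seed_value: int, node_id: int = 0) -> int:
    # Derive a per-node seed (illustrative scheme) so different
    # nodes do not produce identical random streams.
    node_seed = seed_value + node_id
    random.seed(node_seed)
    np.random.seed(node_seed)
    try:
        import torch
        torch.manual_seed(node_seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(node_seed)
    except ImportError:
        pass
    return node_seed
```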


Is faster than np.argwhere. Used in loss functions like the SwAV loss, etc.
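The standard trick for finding the positions of every value in a single pass is to build one sparse matrix instead of calling np.argwhere (or np.where) once per value; a sketch of that technique:

```python
import numpy as np
from scipy.sparse import csr_matrix

def get_indices_sparse(data: np.ndarray):
    # For a non-negative integer array, return, for each value
    # 0..data.max(), the positions where it occurs. One CSR build
    # replaces max()+1 separate np.argwhere scans.
    cols = np.arange(data.size)
    matrix = csr_matrix(
        (cols, (data.ravel(), cols)),
        shape=(int(data.max()) + 1, data.size),
    )
    return [np.unravel_index(row.data, data.shape) for row in matrix]
```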

vissl.utils.misc.merge_features(output_dir, split, layer, cfg)[source]

For multi-gpu feature extraction, each gpu saves features corresponding to its share of the data. We can merge the features across all gpus to get the features for the full data.

The features are saved along with the data indexes and labels. The data indexes can be used to sort the data and ensure uniqueness.

We organize the features and targets by the data index of each feature, ensure uniqueness, and return them.

  • output_dir (str) – input path where the features are dumped

  • split (str) – whether the features are train or test data features

  • layer (str) – the features correspond to what layer of the model

  • cfg (AttrDict) – the input configuration specified by user


output (Dict) – contains features, targets, inds as the keys
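The merge logic can be sketched as concatenating the per-GPU shards, then de-duplicating and sorting by data index. The in-memory shard layout below is an assumption for illustration, not VISSL's on-disk format:

```python
import numpy as np

def merge_feature_shards(shards):
    # Each shard mimics one GPU's dump: a dict with "features",
    # "targets" and "inds" arrays (hypothetical layout).
    inds = np.concatenate([s["inds"] for s in shards])
    features = np.concatenate([s["features"] for s in shards])
    targets = np.concatenate([s["targets"] for s in shards])
    # Distributed samplers may pad/duplicate samples, so de-duplicate
    # by data index and sort back into the original dataset order.
    unique_inds, first_pos = np.unique(inds, return_index=True)
    return {
        "inds": unique_inds,
        "features": features[first_pos],
        "targets": targets[first_pos],
    }
```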


Searches for the dataset_catalog.json file that contains information about the dataset paths if set by user.


Performs all_gather operation on the provided tensors. Warning: torch.distributed.all_gather has no gradient.

vissl.utils.perf_stats module

class vissl.utils.perf_stats.PerfTimer(timer_name: str, perf_stats: Optional[PerfStats])[source]

Bases: object

Very simple timing wrapper, with context manager wrapping. Typical usage:

    with PerfTimer("forward_pass", perf_stats):
        # ...

    with PerfTimer("backward_pass", perf_stats):
        # ...

    print(perf_stats.report_str())

Note that timer stats accumulate by name, so you can effectively resume a timer by re-using its name.

You can also use it without the context manager, i.e. via start() / stop() directly.

If the supplied PerfStats is constructed with use_cuda_events=True (the default), then Cuda events will be added to correctly track the time of asynchronous execution of Cuda kernels:

    with PerfTimer("foobar", perf_stats):
        some_cpu_work()
        schedule_some_cuda_work()

In the example above, the "Host" column captures elapsed time from the perspective of the Python process, and the "CudaEvent" column captures elapsed time between the scheduling of Cuda work (within the PerfTimer scope) and the completion of that work, some of which might happen outside the PerfTimer scope.

If perf_stats is None, using PerfTimer does nothing.
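The host-side behavior described above can be sketched as a small context manager that accumulates elapsed time by name. SimplePerfTimer is illustrative only and omits the CUDA-event handling:

```python
import time

class SimplePerfTimer:
    # Minimal stand-in for PerfTimer: a context manager that
    # accumulates elapsed host time by name in a plain dict.
    def __init__(self, timer_name, perf_stats):
        self.timer_name = timer_name
        self.perf_stats = perf_stats  # dict: name -> total seconds

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, *exc_info):
        elapsed = time.perf_counter() - self._start
        # Accumulate by name, so re-using a name resumes the timer.
        self.perf_stats[self.timer_name] = (
            self.perf_stats.get(self.timer_name, 0.0) + elapsed
        )
```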


Start the recording if the PerfTimer should not be skipped and the recording is not already in progress. If using cuda, we record the time of cuda events as well.


Stop the recording and update the recording interval, i.e. the total time elapsed since the beginning of the PerfTimer recording. If using CUDA, we measure the time of cuda events and append it to the cuda interval.


Update the timer. We should only do this if the timer is not skipped and has already been stopped.

class vissl.utils.perf_stats.PerfMetric[source]

Bases: object

Encapsulates numerical tracking of a single metric, with a .update(value) API. Under-the-hood this can additionally keep track of sums, (exp.) moving averages, sum of squares (e.g. for stdev), filtered values, etc.

update(value: float)[source]

Get the mean value of the metrics recorded.
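A hypothetical stand-in showing the kind of tracking described above (a running sum for the mean plus an exponential moving average); PerfMetric's actual internals may differ:

```python
class SimplePerfMetric:
    # Illustrative metric tracker: keeps a running sum/count for
    # the mean and an exponential moving average (EMA) of values.
    def __init__(self, ema_decay: float = 0.9):
        self.ema_decay = ema_decay
        self.total = 0.0
        self.count = 0
        self.ema = None

    def update(self, value: float) -> None:
        self.total += value
        self.count += 1
        # First value initializes the EMA; later values blend in
        # with weight (1 - ema_decay).
        self.ema = (
            value if self.ema is None
            else self.ema_decay * self.ema + (1.0 - self.ema_decay) * value
        )

    def get_mean(self) -> float:
        return self.total / self.count
```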

class vissl.utils.perf_stats.PerfStats(use_cuda_events=True)[source]

Bases: object

Accumulate stats (from timers) over many iterations

update_with_timer(timer: vissl.utils.perf_stats.PerfTimer)[source]

Fancy column-aligned human-readable report. If using Cuda events, calling this invokes cuda.synchronize(), which is needed to capture pending Cuda work in the report.


vissl.utils.slurm module

vissl.utils.slurm.get_node_id(node_id: int)[source]

If using SLURM, we get environment variables like SLURMD_NODENAME, SLURM_NODEID to get information about the current node. Useful to set the node_id automatically.

vissl.utils.slurm.get_slurm_dir(input_dir: str)[source]

If using SLURM, we use the environment variable “SLURM_JOBID” to uniquely identify the current training and append the id to the input directory. This could be used to store any training artifacts specific to this training run.
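Both helpers reduce to reading SLURM environment variables; a sketch, with assumed fallbacks for when the code runs outside SLURM:

```python
import os

def get_node_id(default_node_id: int = 0) -> int:
    # SLURM exports SLURM_NODEID for every node in the allocation;
    # outside SLURM we fall back to the caller-provided id (assumption).
    return int(os.environ.get("SLURM_NODEID", default_node_id))

def get_slurm_dir(input_dir: str) -> str:
    # Append the unique SLURM_JOBID so artifacts from concurrent
    # runs do not collide; unchanged outside SLURM (assumption).
    job_id = os.environ.get("SLURM_JOBID")
    return os.path.join(input_dir, job_id) if job_id else input_dir
```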

vissl.utils.tensorboard module

This script contains some helpful functions to handle tensorboard setup.


Check whether tensorboard is available or not.


tb_available (bool) – based on tensorboard imports, whether tensorboard is available or not.


Get the output directory where the tensorboard events will be written.


cfg (AttrDict) – the user-specified config containing the tensorboard settings, such as the log directory, logging frequency, etc.


tensorboard_dir (str) – output directory path


Construct the Tensorboard hook for visualization from the specified config


cfg (AttrDict) – the user-specified config containing the tensorboard settings, such as the log directory, logging frequency, etc.


SSLTensorboardHook (function) – the constructed tensorboard hook