vissl.utils package

vissl.utils.instance_retrieval_utils.data_util module

vissl.utils.instance_retrieval_utils.data_util.is_oxford_paris_dataset(dataset_name: str)[source]

Computes whether the specified dataset name is a revisited version of the Oxford and Paris datasets. Simply looks for the patterns “roxford5k” and “rparis6k” in the specified dataset_name.

vissl.utils.instance_retrieval_utils.data_util.is_revisited_dataset(dataset_name: str)[source]

Computes whether the specified dataset name is a revisited version of the Oxford and Paris datasets. Simply looks for the patterns “roxford5k” and “rparis6k” in the specified dataset_name.

vissl.utils.instance_retrieval_utils.data_util.is_instre_dataset(dataset_name: str)[source]

Returns True if the dataset name is “instre”. Helper function used in code at several places.

vissl.utils.instance_retrieval_utils.data_util.is_whiten_dataset(dataset_name: str)[source]

Returns True if the specified dataset has the name “whitening”. Users can use any dataset they want for whitening.

vissl.utils.instance_retrieval_utils.data_util.is_copdays_dataset(dataset_name: str)[source]

Returns True if the dataset is Copydays.

vissl.utils.instance_retrieval_utils.data_util.add_bias_channel(x, dim: int = 1)[source]

Adds a bias channel useful during pooling + whitening operation.

vissl.utils.instance_retrieval_utils.data_util.flatten(x: torch.Tensor, keepdims: bool = False)[source]

Flattens B C H W input to B C*H*W output, optionally retains trailing dimensions.

vissl.utils.instance_retrieval_utils.data_util.get_average_gem(activation_maps: List[torch.Tensor], p: int = 3, eps: float = 1e-06, clamp: bool = True, add_bias: bool = False, keepdims: bool = False)[source]

Average Gem pooling of list of tensors. See #gem below for more information.


x (torch.Tensor) – Gem pooled tensor

vissl.utils.instance_retrieval_utils.data_util.gem(x: torch.Tensor, p: int = 3, eps: float = 1e-06, clamp: bool = True, add_bias: bool = False, keepdims: bool = False)[source]

Gem pooling on the given tensor.

  • x (torch.Tensor) – tensor on which the pooling should be done

  • p (int) – pooling exponent. If p=inf, simply perform max_pool2d. If p=1 and the x tensor has grad, simply perform avg_pool2d. Otherwise, perform Gem pooling for the specified p.

  • eps (float) – if clamping the x tensor, use this eps for clamping

  • clamp (bool) – whether to clamp the tensor

  • add_bias (bool) – whether to add the bias channel

  • keepdims (bool) – whether to flatten or keep the dimensions as is


x (torch.Tensor) – Gem pooled tensor
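The role of p described above can be illustrated on a single feature map without any tensor library. The sketch below is a minimal pure-Python illustration of generalized-mean (Gem) pooling, not VISSL's implementation: the helper name `gem_pool_2d` and the nested-list input are assumptions made for this example, and the `add_bias` and `keepdims` options are left out.

```python
import math
from typing import List


def gem_pool_2d(fmap: List[List[float]], p: float = 3.0, eps: float = 1e-6,
                clamp: bool = True) -> float:
    """Generalized-mean pooling of one H x W feature map (hypothetical helper)."""
    values = [v for row in fmap for v in row]
    if math.isinf(p):
        # p = inf degenerates to max pooling.
        return max(values)
    if clamp:
        # Clamp from below so the power and the root stay well defined.
        values = [max(v, eps) for v in values]
    mean_of_powers = sum(v ** p for v in values) / len(values)
    return mean_of_powers ** (1.0 / p)


fmap = [[1.0, 2.0], [3.0, 4.0]]
avg_like = gem_pool_2d(fmap, p=1.0)        # p = 1 is average pooling
gem_3 = gem_pool_2d(fmap, p=3.0)           # lies between the mean and the max
max_like = gem_pool_2d(fmap, p=math.inf)   # p = inf is max pooling
```

Larger values of p weight the strongest activations more heavily, which is why the p=3 result sits between the average and the maximum.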

vissl.utils.instance_retrieval_utils.data_util.l2n(x: torch.Tensor, eps: float = 1e-06, dim: int = 1)[source]

L2 normalize the input tensor along the specified dimension

  • x (torch.Tensor) – the tensor to normalize

  • eps (float) – epsilon to use to normalize to avoid the inf output

  • dim (int) – along which dimension to L2 normalize


x (torch.Tensor) – L2 normalized tensor
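As a rough illustration of the normalization described above, the following pure-Python sketch normalizes a single vector. The helper name `l2_normalize` is hypothetical, and adding eps to the norm (rather than clamping the norm) is an assumption about how the inf output is avoided.

```python
import math
from typing import List


def l2_normalize(vec: List[float], eps: float = 1e-6) -> List[float]:
    """Divide a vector by its L2 norm plus eps (hypothetical helper)."""
    norm = math.sqrt(sum(v * v for v in vec))
    # Adding eps keeps the division finite for an all-zero vector.
    return [v / (norm + eps) for v in vec]


unit = l2_normalize([3.0, 4.0])    # close to [0.6, 0.8]
zeros = l2_normalize([0.0, 0.0])   # stays [0.0, 0.0] instead of dividing by zero
```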

class vissl.utils.instance_retrieval_utils.data_util.MultigrainResize(size: int, largest: bool = False, **kwargs)[source]

Bases: torchvision.transforms.transforms.Resize

Resize with a largest=False argument, allowing images to be resized to a common largest side without cropping. Approach used in the Multigrain paper.

static target_size(w: int, h: int, size: int, largest: bool = False)[source]
class vissl.utils.instance_retrieval_utils.data_util.WhiteningTrainingImageDataset(base_dir: str, image_list_file: str, num_samples: int = 0)[source]

Bases: object

A set of training images for whitening

get_filename(i: int)[source]
class vissl.utils.instance_retrieval_utils.data_util.InstreDataset(dataset_path: str, num_samples: int = 0)[source]

Bases: object

A dataset class that reads and parses the Instre Dataset so it’s ready to be used in the code for retrieval evaluations


Number of images in the dataset


Number of query images in the dataset

get_filename(i: int)[source]

Return the image filepath for the db image

get_query_filename(i: int)[source]

Return the image filepath for the query image

get_query_roi(i: int)[source]

INSTRE dataset has no notion of ROI so we return None.


Return the mean average precision value for both the train and validation splits, given the ranks (scores of the model).

score(scores, verbose=True, temp_dir=None)[source]

For the input scores of the model, calculate the AP metric

class vissl.utils.instance_retrieval_utils.data_util.RevisitedInstanceRetrievalDataset(dataset: str, dir_main: str, num_samples=None)[source]

Bases: object

A dataset class used for the Revisited Instance retrieval datasets: Revisited Oxford and Revisited Paris. The object reads and parses the datasets so it’s ready to be used in the code for retrieval evaluations.

load_config(dir_main, dataset)[source]
get_filename(i: int)[source]

Return the image filepath for the db image

get_query_filename(i: int)[source]

Return the image filepath for the query image


Number of images in the dataset


Number of query images in the dataset

get_query_roi(i: int)[source]

Get the ROI for the query image that we want to test retrieval

score(sim, temp_dir=None)[source]

For the input similarity scores of the model, calculate the mean AP metric and mean Precision@k metrics.

class vissl.utils.instance_retrieval_utils.data_util.InstanceRetrievalImageLoader(S, transforms)[source]

Bases: object

The custom loader for the Paris and Oxford Instance Retrieval datasets.


Apply the pre-defined transforms on the image.


From the filename, load the whitening image and prepare it for use by applying the data transforms.


From the filename, load the db or query image and prepare it for use by applying the data transforms.

load_and_prepare_image(fname, roi=None)[source]

Read the image, get the aspect ratio, and resize such that the largest side equals S. If there is an ROI, adapt the ROI to the new size and crop. Do not rescale the image once again. The ROI format is (xmin, ymin, xmax, ymax).
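The arithmetic behind this step can be sketched independently of any image library. The helper below is hypothetical and only computes sizes and coordinates; the use of `round` for the new dimensions is an assumption, and the actual loader may truncate or round differently.

```python
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax)


def resize_largest_side(width: int, height: int, S: int,
                        roi: Optional[Box] = None):
    """Return the resized (width, height) and the ROI in resized coordinates."""
    ratio = S / max(width, height)
    new_size = (round(width * ratio), round(height * ratio))
    if roi is None:
        return new_size, None
    # Scale every ROI coordinate by the same ratio as the image.
    new_roi = tuple(round(c * ratio) for c in roi)
    return new_size, new_roi


size, roi = resize_largest_side(1000, 500, S=500, roi=(100, 50, 300, 200))
```

Because the ROI is mapped into the resized coordinate system, cropping can happen on the already resized image, with no second rescale.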

load_and_prepare_revisited_image(img_path, roi=None)[source]

Load the image, crop the roi from the image if the roi is not None, apply the image transforms.

class vissl.utils.instance_retrieval_utils.data_util.GenericInstanceRetrievalDataset(data_path: str, num_samples: int = None)[source]

Bases: object

A dataset class for reading images from a folder in the following simple format:
  • image_0.jpg
  • …
  • image_N.jpg

The other datasets are in the process of being deprecated, currently this is available for use as a database or train split.

get_filename(i: int)[source]

Return the image filepath for the db image

get_query_filename(i: int)[source]

Return the image filepath for the query image


Number of images in the dataset


Number of query images in the dataset

get_query_roi(i: int)[source]

GenericDataset does not yet have query_roi support

score(sim, temp_dir=None)[source]

For the input similarity scores of the model, calculate the mean AP metric and mean Precision@k metrics.

class vissl.utils.instance_retrieval_utils.data_util.InstanceRetrievalDataset(path, eval_binary_path, num_samples=None)[source]

Bases: object

A dataset class used for the Instance retrieval datasets: Oxford and Paris. The object reads and parses the datasets so it’s ready to be used in the code for retrieval evaluations.

Credits: # NOQA Adapted by: Priya Goyal (


Number of images in the dataset


Number of query images in the dataset


Load the data ground truth and parse the data so it’s ready to be used.

score(sim, temp_dir)[source]

From the input similarity score, compute the mean average precision

score_rnk_partial(i, idx, temp_dir)[source]

Compute the mean AP for a given single query


Return the image filepath for the db image


Return the image filepath for the query image


Get the ROI for the query image that we want to test retrieval

class vissl.utils.instance_retrieval_utils.data_util.CopyDaysDataset(data_path: str, num_samples: int = None, use_distractors: bool = False)[source]

Bases: object

A dataset class used for the Copydays dataset.

query_splits = ['original', 'strong', 'jpegqual_3', 'jpegqual_5', 'jpegqual_8', 'jpegqual_10', 'jpegqual_15', 'jpegqual_20', 'jpegqual_30', 'jpegqual_50', 'jpegqual_75', 'crops_10', 'crops_15', 'crops_20', 'crops_30', 'crops_40', 'crops_50', 'crops_60', 'crops_70', 'crops_80']
database_splits = ['original']
get_filename(i: int)[source]

Return the image filepath for the db image

get_query_filename(i: int)[source]

Return the image filepath for the query image


Number of images in the dataset


Number of query images in the dataset

get_query_roi(i: int)[source]

Copydays has no concept of ROI.

score(sim, temp_dir=None)[source]

For the input similarity scores of the model, calculate the mean AP metric and mean Precision@k metrics.

vissl.utils.instance_retrieval_utils.evaluate module

vissl.utils.instance_retrieval_utils.evaluate.score_ap_from_ranks_1(ranks, nres)[source]

Compute the average precision of one search.

  • ranks – ordered list of ranks of true positives

  • nres – total number of positives in dataset


ap (float) – the average precision following the Holidays and the INSTRE package

vissl.utils.instance_retrieval_utils.evaluate.compute_ap(ranks, nres)[source]

Computes average precision for given ranked indexes.

  • ranks – zero-based ranks of positive images

  • nres – number of positive images


ap (float) – average precision
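A minimal sketch of average precision computed from zero-based ranks is shown below. It assumes the trapezoidal convention commonly used in Oxford/Paris evaluation scripts (precision is averaged just before and just after each positive); the helper name `average_precision` is hypothetical, and `ranks` is assumed to be sorted in ascending order.

```python
from typing import Sequence


def average_precision(ranks: Sequence[int], nres: int) -> float:
    """Trapezoidal AP from zero-based ranks of the positives (hypothetical helper)."""
    ap = 0.0
    recall_step = 1.0 / nres
    for j, rank in enumerate(ranks):
        # Precision just before and just after the j-th positive is retrieved.
        precision_before = 1.0 if rank == 0 else j / rank
        precision_after = (j + 1) / (rank + 1)
        ap += (precision_before + precision_after) * recall_step / 2.0
    return ap


perfect = average_precision([0, 1, 2], nres=3)   # all positives ranked first
late_hit = average_precision([1], nres=1)        # single positive at rank 1
```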

vissl.utils.instance_retrieval_utils.evaluate.compute_map(ranks, gnd, kappas)[source]

Computes the mAP for a given set of returned results.


Usage:

map = compute_map (ranks, gnd)
-> computes mean average precision (map) only

map, aps, pr, prs = compute_map (ranks, gnd, kappas)
-> computes mean average precision (map), average precision (aps) for each query
-> computes mean precision at kappas (pr), precision at kappas (prs) for each query

Notes:

  1. ranks starts from 0, ranks.shape = db_size X #queries

  2. The junk results (e.g., the query itself) should be declared in the gnd struct array

  3. If there are no positive images for some query, that query is excluded from the evaluation

vissl.utils.instance_retrieval_utils.pca module

vissl.utils.instance_retrieval_utils.rmac module

vissl.utils.instance_retrieval_utils.rmac.normalize_L2(a, dim)[source]

L2 normalize the input tensor along the specified dimension

  • a (torch.Tensor) – the tensor to normalize

  • dim (int) – along which dimension to L2 normalize


a (torch.Tensor) – L2 normalized tensor

vissl.utils.instance_retrieval_utils.rmac.get_rmac_region_coordinates(H, W, L)[source]

Almost verbatim from Tolias et al Matlab implementation. Could be heavily pythonized, but really not worth it… Desired overlap of neighboring regions

vissl.utils.instance_retrieval_utils.rmac.get_rmac_descriptors(features, rmac_levels, pca=None, normalize=True)[source]

RMAC descriptors. Coordinates are retrieved following Tolias et al. L2 normalize the descriptors and optionally apply PCA on the descriptors if specified by the user. After PCA, aggregate the descriptors (sum) and normalize the aggregated descriptor and return.

vissl.utils.svm_utils.evaluate module

vissl.utils.svm_utils.evaluate.calculate_ap(rec, prec)[source]

Computes the AP under the precision recall curve.

vissl.utils.svm_utils.evaluate.get_precision_recall(targets, scores, weights=None)[source]

[P, R, score, ap] = get_precision_recall(targets, scores, weights)

  • targets – number of occurrences of this class in the ith image

  • scores – score for this image

  • weights – 0 or 1, where 0 means we should ignore the sample


  • P, R – precision and recall

  • score – score which corresponds to the particular precision and recall

  • ap – average precision

vissl.utils.svm_utils.svm_trainer module

class vissl.utils.svm_utils.svm_trainer.SVMTrainer(config, layer, output_dir)[source]

Bases: object

SVM trainer that takes care of training (using k-fold cross validation), and evaluating the SVMs

load_input_data(data_file, targets_file)[source]

Given the input data (features) and targets (labels) files, load the features of shape N x D and labels of shape (N,)


During the SVM training, we write the cross-validation AP value for training at each class and cost value combination. We load the AP values and, for each class, determine the cost value that gives the maximum AP. We return the chosen cost values for each class as a numpy matrix.
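The selection step can be sketched as an argmax over costs for each class. The data layout below (a dict mapping class index to a dict of cost value and cross-validation AP) and the helper name `best_cost_per_class` are assumptions for illustration; the trainer itself reads saved AP values and returns a numpy matrix.

```python
from typing import Dict, List


def best_cost_per_class(ap_table: Dict[int, Dict[float, float]]) -> List[float]:
    """For each class, pick the cost whose cross-validation AP is highest.

    ap_table maps class index -> {cost value: cross-validation AP}; this layout
    is hypothetical and chosen only for the example.
    """
    chosen = []
    for cls in sorted(ap_table):
        costs = ap_table[cls]
        chosen.append(max(costs, key=costs.get))
    return chosen


ap_table = {
    0: {0.01: 0.61, 0.1: 0.72, 1.0: 0.70},
    1: {0.01: 0.55, 0.1: 0.54, 1.0: 0.58},
}
chosen_costs = best_cost_per_class(ap_table)
```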

train_cls(features, targets, cls_num)[source]

Train SVM on the input features and targets for a given class. The SVMs are trained for all costs values for the given class. We also save the cross-validation AP at each cost value for the given class.

train(features, targets)[source]

Train SVMs on the given features and targets for all classes and all the costs values.

test(features, targets)[source]

Test the trained SVM models on the test features and targets values. We use the cost per class that gives the maximum cross-validation AP on the training set and load the corresponding trained SVM model for that cost value and class.

Log the test ap to stdout and also save the AP in a file.

vissl.utils.svm_utils.svm_low_shot_trainer module

class vissl.utils.svm_utils.svm_low_shot_trainer.SVMLowShotTrainer(config, layer, output_dir)[source]

Bases: vissl.utils.svm_utils.svm_trainer.SVMTrainer

Train the SVM for the low-shot image classification tasks. Currently, datasets like VOC07 and Places205 are supported.

The trainer inherits from the SVMTrainer class and takes care of training the SVMs, evaluating them, and aggregating the metrics.

train(features, targets, sample_num, low_shot_kvalue)[source]

Train SVM on the input features and targets for a given low-shot k-value and the independent low-shot sample number.

We save the trained SVM model for each combination of cost value, class number, sample number, and k-value.

test(features, targets, sample_num, low_shot_kvalue)[source]

Test the SVM for the input test features and targets for the given low-shot k-value and sample number.

We compute the meanAP across all classes for a given cost value. We get the output matrix of shape (1, #costs) for the given sample_num and k-value and save the matrix. We use this information to aggregate later.

aggregate_stats(k_values, sample_inds)[source]

Aggregate the test AP across all k-values and independent samples.

For each low-shot k-value, we obtain the mean, max, min, std AP value. Steps:

  1. For each k-value, get the min/max/mean/std value across all the independent samples. This results in matrices [#k-values x #classes]

  2. Then we aggregate stats across the classes. For the mean stats in step 1, for each k-value, we get the class which has maximum mean.

vissl.utils.activation_checkpointing module

This module centralizes all activation checkpointing related code. It is a work-in-progress as we evolve the APIs and eventually put this in fairscale so that multiple projects can potentially share it.

vissl.utils.activation_checkpointing.manual_gradient_reduction(model: torch.nn.modules.module.Module, config_flag: bool) → bool[source]

Return if we should use manual gradient reduction or not.

We should use manual DDP if config says so and model is wrapped by DDP.

vissl.utils.activation_checkpointing.manual_sync_params(model: torch.nn.parallel.distributed.DistributedDataParallel) → None[source]

Manually sync params and buffers for DDP.

vissl.utils.activation_checkpointing.manual_gradient_all_reduce(model: torch.nn.parallel.distributed.DistributedDataParallel) → None[source]

Gradient reduction function used after backward is done.

vissl.utils.activation_checkpointing.layer_splittable_before(m: torch.nn.modules.module.Module) → bool[source]

Return if this module can be split in front of it for checkpointing. We don’t split the relu module.

vissl.utils.activation_checkpointing.checkpoint_trunk(feature_blocks: Dict[str, torch.nn.modules.module.Module], unique_out_feat_keys: List[str], checkpointing_splits: int) → Dict[str, torch.nn.modules.module.Module][source]

Checkpoint a list of blocks and return back the split version.

vissl.utils.checkpoint module

vissl.utils.collect_env module


Collect information about the user's system, including cuda, torch, gpus, vissl and its dependencies. Users are strongly recommended to run this script to collect this information if they need debugging help.

vissl.utils.env module

vissl.utils.env.set_env_vars(local_rank: int, node_id: int, cfg: vissl.config.attr_dict.AttrDict)[source]

Set some environment variables like total number of gpus used in training, distributed rank and local rank of the current gpu, whether to print the nccl debugging info and tuning nccl settings.


Registering the right options for the g_pathmgr: Override this function in your build system to support different distributed file system


Print information about user system environment where VISSL is running.


Get the distributed and local rank of the current gpu.

vissl.utils.hydra_config module

vissl.utils.hydra_config.save_attrdict_to_disk(cfg: vissl.config.attr_dict.AttrDict)[source]
vissl.utils.hydra_config.convert_to_attrdict(cfg: omegaconf.dictconfig.DictConfig, cmdline_args: List[Any] = None, dump_config: bool = True)[source]

Given the user input Hydra config and some command line input options to override the config file:

  1. Merge and override the command line options in the config.

  2. Convert the Hydra OmegaConf to an AttrDict structure to make it easy to access the keys in the config file.

  3. Also check that the config version used is compatible and supported in VISSL. In future, we would want to support upgrading the old config versions if we make changes to the VISSL default config structure (deleting, renaming keys).

  4. Infer the values of some parameters in the config file using the other parameter values.

vissl.utils.hydra_config.convert_fsdp_dtypes(config: vissl.config.attr_dict.AttrDict)[source]

Transform configuration types (primitive types) to VISSL specific types


Check if Hydra is available. Simply python import to test.

vissl.utils.hydra_config.get_hydra_version() → Tuple[int, …][source]

Check if Hydra is available. Simply python import to test. Also verifies whether the version is up to date.

vissl.utils.hydra_config.compose_hydra_configuration(overrides: List[str])[source]

Transform the list of overrides provided on the command line to an actual VISSL configuration by merging these overrides with the defaults configuration of VISSL


Supports printing both Hydra DictConfig and also the AttrDict config

vissl.utils.hydra_config.resolve_linear_schedule(cfg, param_schedulers)[source]

For the given composite schedulers, check for each linear schedule whether the linear warmup rule is applicable and necessary when the training uses only 1 node.

We set the end_value = scaled_lr (assuming it’s a linear warmup). In case only 1 machine is used in training, the start_lr = scaled_lr and then the linear warmup is not needed.

vissl.utils.hydra_config.get_scaled_lr_scheduler(cfg, param_schedulers, scaled_lr)[source]

Scale learning rate value for different Learning rate types. See infer_learning_rate() for how the scaled LR is calculated.

Values changed for learning rate schedules:

  1. cosine:

    end_value = scaled_lr * (end_value / start_value)
    start_value = scaled_lr

  2. multistep:

    gamma = values[1] / values[0]
    values = [scaled_lr * pow(gamma, idx) for idx in range(len(values))]

  3. step_with_fixed_gamma:

    base_value = scaled_lr

  4. linear: end_value = scaled_lr

  5. inverse_sqrt: start_value = scaled_lr

  6. constant: value = scaled_lr

  7. composite:

    Recursively call to scale each composition. If the composition consists of a linear schedule, we assume that a linear warmup is applied. If the linear warmup is applied, it’s possible the warmup is not necessary if the global batch_size is smaller than the base_lr_batch_size and in that case, we remove the linear warmup from the schedule.
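The per-schedule rules listed above can be sketched on plain dictionaries. This is an illustrative sketch only: the helper name `scale_schedule` and the dict layout with a "name" key are assumptions, and the composite case (recursion and warmup removal) is deliberately left out.

```python
from typing import Any, Dict


def scale_schedule(schedule: Dict[str, Any], scaled_lr: float) -> Dict[str, Any]:
    """Return a copy of a schedule dict rescaled to scaled_lr (hypothetical helper)."""
    out = dict(schedule)
    name = schedule["name"]
    if name == "cosine":
        # Keep the end/start ratio, then move the start to scaled_lr.
        out["end_value"] = scaled_lr * (schedule["end_value"] / schedule["start_value"])
        out["start_value"] = scaled_lr
    elif name == "multistep":
        values = schedule["values"]
        gamma = values[1] / values[0]
        out["values"] = [scaled_lr * pow(gamma, idx) for idx in range(len(values))]
    elif name == "step_with_fixed_gamma":
        out["base_value"] = scaled_lr
    elif name == "linear":
        out["end_value"] = scaled_lr
    elif name == "inverse_sqrt":
        out["start_value"] = scaled_lr
    elif name == "constant":
        out["value"] = scaled_lr
    else:
        raise ValueError(f"unsupported schedule: {name}")
    return out


cosine = scale_schedule({"name": "cosine", "start_value": 0.1, "end_value": 0.001}, 0.2)
multistep = scale_schedule({"name": "multistep", "values": [0.1, 0.01, 0.001]}, 0.2)
```

Note that for the cosine schedule the new end_value has to be computed from the original start_value before start_value is overwritten.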


1) Assert the Learning rate here. LR is scaled as per to turn this automatic scaling off, set

scaled_lr is calculated as follows, given:

  • base_lr_batch_size = batch size for which the base learning rate is specified

  • base_value = base learning rate value that will be scaled

The current batch size is used to determine how to scale the base learning rate value:

    scale_factor = (batchsize_per_gpu * world_size) / base_lr_batch_size
    if scaling_type is sqrt, scale_factor = sqrt(scale_factor)
    scaled_lr = scale_factor * base_value

We perform this auto-scaling for the head learning rate as well if the user wants to use a different learning rate for the head.

2) Infer the model head params weight decay: if the head should use a different weight decay value than the trunk. If using a different weight decay value for the head, set it here. Otherwise, the same value as the trunk will be automatically used.
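The scaled_lr calculation described above can be written as a short worked example. The function name `compute_scaled_lr` and the "linear" label for the non-sqrt case are assumptions for this sketch; the variable names follow the description.

```python
import math


def compute_scaled_lr(base_value: float, base_lr_batch_size: int,
                      batchsize_per_gpu: int, world_size: int,
                      scaling_type: str = "linear") -> float:
    """Scale a base learning rate by the ratio of global to reference batch size."""
    scale_factor = (batchsize_per_gpu * world_size) / base_lr_batch_size
    if scaling_type == "sqrt":
        scale_factor = math.sqrt(scale_factor)
    return scale_factor * base_value


# Global batch size 64 * 8 = 512 against a reference batch size of 256.
linear_lr = compute_scaled_lr(0.1, 256, batchsize_per_gpu=64, world_size=8)
sqrt_lr = compute_scaled_lr(0.1, 256, batchsize_per_gpu=64, world_size=8,
                            scaling_type="sqrt")
```

With a global batch twice the reference size, the learning rate doubles under linear scaling and grows by a factor of sqrt(2) under sqrt scaling.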


Infer settings for various self-supervised losses. Takes care of setting various loss parameters correctly like world size, batch size per gpu, effective global batch size, collator etc. Each loss has additional set of parameters that can be inferred to ensure smooth training in case user forgets to adjust all the parameters.


Inference for the FSDP settings. Conditions are: 1) use the FSDP task; 2) use a single param group in the optimizer; 3) if AMP is used, it must be PyTorch AMP; 4) if training SwAV, we automatically set the head to the SwAV FSDP head; 5) inference for the FSDP parameters to ensure good convergence.


Infer the values of a few parameters in the config file using the values of other config parameters:

  1. Infer losses.

  2. Auto-scale the learning rate if the user has specified auto scaling to be True.

  3. Infer meter names (model layer name being evaluated) since we support list meters that have multiple outputs and the same target. This is very common in self-supervised learning where we want to evaluate a metric for several layers of the model. VISSL supports running evaluation for multiple model layers in a single training run.

  4. Support multi-gpu DDP eval model by attaching a dummy parameter. This is particularly helpful for multi-gpu feature extraction, especially when the dataset for which features are being extracted is large.

  5. Infer what kind of labels are being used. If the user has specified a labels source, we set LABEL_TYPE to “standard” (also the VISSL default); otherwise, if no label is specified, we set LABEL_TYPE to “sample_index”.

module str, cache_dir: str)str[source]

This implementation downloads the remote resource and caches it locally. The resource will only be downloaded if not previously requested.

Simply create a symlink from a given file1 to file2. Useful during model checkpointing to symlink to the latest successful checkpoint.

, filename, append_to_json=True, verbose=True)[source]

Common i/o utility to handle saving data to various file formats. Supported:

.pkl, .pickle, .npy, .json

Specifically for .json, users have the option to either append (default) or rewrite by passing a Boolean value to append_to_json.

, mmap_mode=None)[source]

Common i/o utility to handle loading data from various file formats. Supported:

.pkl, .pickle, .npy, .json

For the npy files, we support reading the files in mmap_mode. If reading in mmap_mode is not successful, we load the data without mmap_mode.

str)[source]

Make a path absolute, but take into account prefixes like “http://” or “manifold://”.

Create the directory if it does not exist.

Check if an input string is a URL: look for http(s)://, ignoring case.
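A case-insensitive prefix check of this kind can be sketched with a regular expression. The helper name `looks_like_url` is hypothetical, and this is only an illustration of the described check, not the library's code.

```python
import re

# Matches "http://" or "https://" at the start of the string, in any letter case.
_URL_PREFIX = re.compile(r"^https?://", re.IGNORECASE)


def looks_like_url(text: str) -> bool:
    """True if text starts with http:// or https://, ignoring case."""
    return _URL_PREFIX.match(text) is not None


upper = looks_like_url("HTTPS://example.com/model.torch")
local = looks_like_url("/checkpoint/model.torch")
other_scheme = looks_like_url("manifold://bucket/model.torch")
```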

Utility for deleting a directory. Useful for cleaning the storage space that contains various training artifacts like checkpoints, data etc.

Given a file, get the size of the file in MB.

, destination_dir, tmp_destination_dir)[source]

Copy a given input_file from source to the destination directory.

Steps:

  1. We use g_pathmgr to extract the data to a local path.

  2. We simply move the files from the g_pathmgr cached local directory to the user-specified destination directory. We use rsync.

How the destination dir is chosen:

  1. If the user is using SLURM, we set destination_dir = slurm_dir (see get_slurm_dir).

  2. If the local path used by PathManager is the same as the input_file path, and the destination directory is not specified, we set destination_dir = tmp_destination_dir.


  • output_file (str) – the new path of the file

  • destination_dir (str) – the destination dir that was actually used

, destination_dir, num_threads)[source]

Copy contents of one directory to the specified destination directory using the number of threads to speed up the copy. When the data is copied successfully, we create a copy_complete file in the destination_dir folder to mark the completion. If the destination_dir folder already exists and has the copy_complete file, we don’t copy the file.

Useful for copying datasets like ImageNet to speed up the dataloader. Using 20 threads for ImageNet takes about 20 minutes to copy.


destination_dir (str) – directory where the contents were copied

, destination_dir, num_threads, tmp_destination_dir)[source]

Copy data from one source to the other using num_threads. The data to copy can be a single file or a directory. We check what type of data and call the relevant functions.


  • output_file (str) – the new path of the data (could be file or dir)

  • destination_dir (str) – the destination dir that was actually used

, destination_dir, num_threads=40, tmp_destination_dir=None)[source]

Iteratively copy the list of data to a destination directory. Each data to copy could be a single file or a directory.


  • output_file (str) – the new path of the file. If there were no files to copy, simply return the input_files

  • destination_dir (str) – the destination dir that was actually used

vissl.utils.logger module

vissl.utils.logger.setup_logging(name, output_dir=None, rank=0)[source]

Setup various logging streams: stdout and file handlers.

For file handlers, we only setup for the master gpu.


After training is done, we ensure to shut down all the logger streams.


Log nvidia-smi snapshot. Useful to capture the configuration of gpus.


Parse the nvidia-smi output and extract the memory used stats. Not recommended to use.

vissl.utils.misc module


Check if the fairscale version has the ShardedGradScaler() to use with ZeRO + PyTorchAMP


Check if faiss is available with simple python imports.

To install faiss, simply do:

  • If using a pip env: pip install faiss-gpu

  • If using a conda env: conda install faiss-gpu -c pytorch


Check if opencv is available with simple python imports.

To install opencv, simply do: pip install opencv-python regardless of whether using conda or pip environment.


Check if apex is available with simple python imports.


Check if apex is available with simple python imports.


Find the free port that can be used for Rendezvous on the local machine. We use this for 1 machine training where the port is automatically detected.
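A common way to find a free port is to bind a socket to port 0 and let the operating system choose. The sketch below shows that technique with the standard library; the helper name `find_free_local_port` is hypothetical, and whether VISSL uses exactly this approach is an assumption.

```python
import socket


def find_free_local_port() -> int:
    """Ask the OS for an unused TCP port on the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 tells the OS to pick any currently free port.
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


port = find_free_local_port()
```

The port is released when the socket closes, so another process could claim it before the rendezvous starts; this approach is a best effort, not a reservation.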

vissl.utils.misc.get_dist_run_id(cfg, num_nodes)[source]

For multi-gpu training with PyTorch, we have to specify how the gpus are going to rendezvous. This requires specifying the communication method: file, tcp and the unique rendezvous run_id that is specific to 1 run.

We recommend:
  1. for 1-node: use init_method=tcp and run_id=auto

  2. for multi-node, use init_method=tcp and specify run_id={master_node}:{port}

vissl.utils.misc.setup_multiprocessing_method(method_name: str)[source]

PyTorch supports several multiprocessing options: forkserver | spawn | fork

We recommend and use forkserver as the default method in VISSL.

vissl.utils.misc.set_seeds(cfg, dist_rank)[source]

Set the python random, numpy and torch seed for each gpu. Also set the CUDA seeds if the CUDA is available. This ensures deterministic nature of the training.

vissl.utils.misc.set_dataloader_seeds(_worker_id: int)[source]

See: When using “Fork” process spawning, the dataloader workers inherit the seeds of the parent process for numpy. While torch seeds are handled correctly across dataloaders and across epochs, numpy seeds are not. Therefore in order to ensure each worker has a different and deterministic seed, we must explicitly set the numpy seed to the torch seed. Also see


Is faster than np.argwhere. Used in loss functions like swav loss, etc

vissl.utils.misc.merge_features(input_dir: str, split: str, layer: str)[source]
vissl.utils.misc.get_json_catalog_path(default_dataset_catalog_path: str) → str[source]

Gets dataset catalog json file absolute path. Optionally set environment variable VISSL_DATASET_CATALOG_PATH for dataset catalog path. Useful for local development and/or remote server configuration.


Searches for the dataset_catalog.json file that contains information about the dataset paths if set by user.


Performs all_gather operation on the provided tensors. * Warning *: torch.distributed.all_gather has no gradient.

class vissl.utils.misc.set_torch_seed(seed)[source]

Bases: object

vissl.utils.misc.retry(func=None, exception=<class 'Exception'>, n_tries=5, delay=5, backoff=1, logger=False)[source]

Retry decorator with exponential backoff.

  • func (typing.Callable, optional) – Callable on which the decorator is applied, by default None

  • exception (Exception or tuple of Exceptions, optional) – Exception(s) that invoke retry, by default Exception

  • n_tries (int, optional) – Number of tries before giving up, by default 5

  • delay (int, optional) – Initial delay between retries in seconds, by default 5

  • backoff (int, optional) – Backoff multiplier e.g. value of 2 will double the delay, by default 1

  • logger (bool, optional) – Option to log or print, by default False


Decorated callable that calls itself when exception(s) occur.

>>> import random
>>> @retry(exception=Exception, n_tries=4)
... def test_random(text):
...    x = random.random()
...    if x < 0.5:
...        raise Exception("Fail")
...    else:
...        print("Success: ", text)
>>> test_random("It works!")
vissl.utils.misc.flatten_dict(d: dict, parent_key='', sep='_')[source]

Flattens a dict, delimited with a '_'. For example, the input:

    {'top_1': {'res_5': 100}}

will return:

    {'top_1_res_5': 100}
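A recursive flattening of this kind can be sketched in a few lines. The function name `flatten_nested` is hypothetical and the code is an illustration of the described behavior, not a copy of the library function.

```python
from typing import Any, Dict


def flatten_nested(d: Dict[str, Any], parent_key: str = "",
                   sep: str = "_") -> Dict[str, Any]:
    """Flatten nested dicts by joining keys with sep (hypothetical helper)."""
    items: Dict[str, Any] = {}
    for key, value in d.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse, carrying the joined key as the new prefix.
            items.update(flatten_nested(value, new_key, sep=sep))
        else:
            items[new_key] = value
    return items


flat = flatten_nested({"top_1": {"res_5": 100}})
```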


vissl.utils.misc.recursive_dict_merge(dict1, dict2)[source]

Recursively merges dict2 into dict1
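One plausible reading of this merge is sketched below. It assumes that dict1 is modified in place and returned, and that values from dict2 win whenever both sides are not dicts; the helper name `merge_into` is hypothetical.

```python
from typing import Any, Dict


def merge_into(dict1: Dict[str, Any], dict2: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge dict2 into dict1 in place and return dict1."""
    for key, value in dict2.items():
        if isinstance(value, dict) and isinstance(dict1.get(key), dict):
            # Both sides hold a dict: merge one level deeper.
            merge_into(dict1[key], value)
        else:
            # Scalars, lists, and new keys: the value from dict2 wins.
            dict1[key] = value
    return dict1


base = {"optimizer": {"lr": 0.1, "momentum": 0.9}, "epochs": 10}
merged = merge_into(base, {"optimizer": {"lr": 0.2}, "seed": 0})
```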

vissl.utils.perf_stats module

class vissl.utils.perf_stats.PerfTimer(timer_name: str, perf_stats: Optional[PerfStats])[source]

Bases: object

Very simple timing wrapper, with context manager wrapping. Typical usage:

    with PerfTimer('forward_pass', perf_stats):
        # …
    with PerfTimer('backward_pass', perf_stats):
        # …
    print(perf_stats.report_str())

Note that timer stats accumulate by name, so you can effectively resume a timer by re-using its name.

You can also use it without context manager, i.e. via start() / stop() directly.

If supplied PerfStats is constructed with use_cuda_events=True (which is default), then Cuda events will be added to correctly track time of async execution of Cuda kernels:

    with PerfTimer('foobar', perf_stats):
        some_cpu_work()
        schedule_some_cuda_work()

In example above, the “Host” column will capture elapsed time from the perspective of the Python process, and “CudaEvent” column will capture elapsed time between scheduling of Cuda work (within the PerfTimer scope) and completion of this work, some of which might happen outside the PerfTimer scope.

If perf_stats is None, using PerfTimer does nothing.
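The accumulate-by-name and no-op behaviors can be illustrated with a small host-time-only stand-in. The classes `SimpleStats` and `SimpleTimer` below are hypothetical and are not the VISSL classes; they skip CUDA events entirely and only mimic the context-manager pattern described above.

```python
import time
from collections import defaultdict
from typing import Dict, List, Optional


class SimpleStats:
    """Accumulates elapsed host times per timer name (hypothetical stand-in)."""

    def __init__(self) -> None:
        self.samples: Dict[str, List[float]] = defaultdict(list)

    def report_str(self) -> str:
        lines = []
        for name, values in self.samples.items():
            mean_ms = 1000.0 * sum(values) / len(values)
            lines.append(f"{name}: n={len(values)} mean={mean_ms:.3f} ms")
        return "\n".join(lines)


class SimpleTimer:
    """Context manager that records into SimpleStats; a no-op if stats is None."""

    def __init__(self, name: str, stats: Optional[SimpleStats]) -> None:
        self.name = name
        self.stats = stats
        self._start: Optional[float] = None

    def __enter__(self) -> "SimpleTimer":
        if self.stats is not None:
            self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        if self.stats is not None and self._start is not None:
            elapsed = time.perf_counter() - self._start
            self.stats.samples[self.name].append(elapsed)


stats = SimpleStats()
for _ in range(2):
    # Re-using the name adds a second sample to the same entry.
    with SimpleTimer("forward_pass", stats):
        sum(range(1000))
with SimpleTimer("ignored", None):
    pass  # nothing is recorded when stats is None
report = stats.report_str()
```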


Start the recording if the perfTimer should not be skipped or if the recording is not already in progress. If using cuda, we record time of cuda events as well.


Stop the recording and update the recording interval and the total time elapsed since the beginning of the PerfTimer recording. If using CUDA, we measure the time for cuda events and append it to the cuda interval.


Update the timer. We should only do this if the timer is Not skipped and also if the timer has already been stopped.

class vissl.utils.perf_stats.PerfMetric[source]

Bases: object

Encapsulates numerical tracking of a single metric, with a .update(value) API. Under-the-hood this can additionally keep track of sums, (exp.) moving averages, sum of squares (e.g. for stdev), filtered values, etc.

update(value: float)[source]

Get the mean value of the metrics recorded.

class vissl.utils.perf_stats.PerfStats(use_cuda_events=True)[source]

Bases: object

Accumulate stats (from timers) over many iterations

update_with_timer(timer: vissl.utils.perf_stats.PerfTimer)[source]

Fancy column-aligned human-readable report. If using Cuda events, calling this invokes cuda.synchronize(), which is needed to capture pending Cuda work in the report.


vissl.utils.slurm module

vissl.utils.slurm.get_node_id(node_id: int)[source]

If using SLURM, we get environment variables like SLURMD_NODENAME, SLURM_NODEID to get information about the current node. Useful to set the node_id automatically.

vissl.utils.slurm.get_slurm_dir(input_dir: str)[source]

If using SLURM, we use the environment variable “SLURM_JOBID” to uniquely identify the current training and append the id to the input directory. This could be used to store any training artifacts specific to this training run.
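The idea can be sketched with the standard library alone. The helper name `append_slurm_job_id` is hypothetical, and two details are assumptions: that the job id is appended as a path component, and that the input directory is returned unchanged when SLURM_JOBID is not set.

```python
import os


def append_slurm_job_id(input_dir: str) -> str:
    """Append SLURM_JOBID to input_dir when the variable is set."""
    job_id = os.environ.get("SLURM_JOBID")
    if job_id is None:
        # Assumption: outside SLURM the directory is returned unchanged.
        return input_dir
    # Assumption: the job id becomes a sub-directory of input_dir.
    return os.path.join(input_dir, job_id)


os.environ["SLURM_JOBID"] = "12345"   # simulate running inside a SLURM job
inside = append_slurm_job_id("/tmp/run")
del os.environ["SLURM_JOBID"]
outside = append_slurm_job_id("/tmp/run")
```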


Indicates if submitit, the library around SLURM used to run distributed training, is available.

vissl.utils.tensorboard module