vissl.data package¶
-
class
vissl.data.
GenericSSLDataset
(cfg, split, dataset_source_map)[source]¶ Bases:
torch.utils.data.dataset.Dataset
Base Self Supervised Learning Dataset Class.
The GenericSSLDataset class is defined to support reading data from multiple data sources. For example: data = [dataset1, dataset2] and the minibatches generated will have the corresponding data from each dataset.
For this reason, we also support labels from multiple sources. For example targets = [dataset1 targets, dataset2 targets].
In order to support multiple data sources, the dataset configuration always takes list inputs:
DATA_SOURCES, LABEL_SOURCES, DATASET_NAMES, DATA_PATHS, LABEL_PATHS
When there are several data sources, the user can also specify which datasets the transforms should be applied to. By default, the transforms are applied to data from all datasets.
- Parameters
cfg (AttrDict) – configuration defined by user
split (str) – the dataset split for which we are constructing the Dataset object
dataset_source_map (Dict[str, Callable]) –
The dictionary that maps what data sources are supported and what object to use to read data from those sources. For example:
    DATASET_SOURCE_MAP = {
        "disk_filelist": DiskImageDataset,
        "disk_folder": DiskImageDataset,
        "synthetic": SyntheticImageDataset,
    }
-
load_single_label_file
(path)[source]¶ Load a single label file. User-specified numpy label files are supported only when the user specifies a data_filelist source of labels.
To save memory, if mmap_mode is set to True for loading, we first try to load the labels in mmap_mode. If that fails, we simply load the labels without mmap.
-
__getitem__
(idx)[source]¶ Get the input sample for the minibatch for a specified data index. For each data object (if we are loading several datasets in a minibatch), we get a sample consisting of:
    {
        image data,
        label (if applicable), otherwise idx,
        data_valid: 0 or 1, indicating whether the data is a valid image,
        data_idx: index of the data in the dataset, for book-keeping and debugging
    }
Once the sample data is available, we apply the data transform on the sample.
The final transformed sample is returned to be added into the minibatch.
-
get_image_paths
()[source]¶ Get the image paths for all the data sources.
- Returns
image_paths (List[List[str]]) – list containing the list of image paths for each data source.
-
get_available_splits
(dataset_config)[source]¶ Get the available splits in the dataset config. Not specific to the split for which the SSLDataset is being constructed.
NOTE: this method is deprecated.
-
vissl.data.
get_data_files
(split, dataset_config)[source]¶ - Get the path to the dataset (images and labels).
If the user has explicitly specified the data_sources, we simply use those and don’t do lookup in the datasets registered with VISSL from the dataset catalog.
If the user hasn’t specified the path, look for the dataset in the datasets catalog registered with VISSL. For a given list of datasets and a given partition (train/test), we first verify that we have the dataset and the correct source as specified by the user. Then for each dataset in the list, we get the data path (make sure it exists, sources match). For the label file, the file is optional.
Once we have the dataset original paths, we replace the path with the local paths if the data was copied to local disk.
-
vissl.data.
register_datasets
(json_catalog_path)[source]¶ If the json dataset_catalog file is found, we register the datasets specified in the catalog with VISSL. If the catalog also specifies VOC or COCO datasets, we register them.
- Parameters
json_catalog_path (str) – the path to the json dataset catalog
-
class
vissl.data.
VisslDatasetCatalog
[source]¶ Bases:
object
A catalog that stores information about the datasets and how to obtain them. It contains a mapping from strings (which are names that identify a dataset, e.g. “imagenet1k”) to a dict which contains:
mapping of various data splits (train, test, val) to the data source (path on the disk whether a folder path or a filelist)
source of the data (disk_filelist | disk_folder)
The purpose of having this catalog is to make it easy to choose different datasets, by just using the strings in the config.
-
static
register_json
(json_catalog_path)[source]¶ - Parameters
json_catalog_path (str) – path to a .json file that contains the data to be registered
-
static
register_dict
(dict_catalog)[source]¶ - Parameters
dict_catalog (dict) – a dict with the datasets to be registered
-
static
register_data
(name, data_dict)[source]¶ - Parameters
name (str) – the name that identifies a dataset, e.g. “imagenet1k_folder”.
data_dict (dict) – a dict describing the dataset: the mapping of its data splits to the data source, and the source of the data.
vissl.data.collators module¶
-
vissl.data.collators.
register_collator
(name)[source]¶ Registers Self-Supervision data collators.
This decorator allows VISSL to add custom data collators, even if the collator itself is not part of VISSL. To use it, apply this decorator to a collator function, like this:
    @register_collator('my_collator_name')
    def my_collator_name():
        ...
To get a collator from a configuration file, see get_collator().
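The registration pattern described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not VISSL's implementation: the registry dict `_COLLATORS` and the helper `lookup_collator` are hypothetical names, and the real get_collator() may take different arguments.

```python
from typing import Callable, Dict

# Hypothetical registry; VISSL's internal name may differ.
_COLLATORS: Dict[str, Callable] = {}


def register_collator(name: str) -> Callable:
    """Return a decorator that stores the decorated function under `name`."""

    def decorator(func: Callable) -> Callable:
        if name in _COLLATORS:
            raise ValueError(f"Collator '{name}' is already registered")
        _COLLATORS[name] = func
        return func

    return decorator


def lookup_collator(name: str) -> Callable:
    """Hypothetical lookup helper, standing in for get_collator()."""
    return _COLLATORS[name]


@register_collator("my_collator_name")
def my_collator_name(batch):
    # A trivial collator: gather the "data" entry of every sample.
    return {"data": [sample["data"] for sample in batch]}


collated = lookup_collator("my_collator_name")([{"data": 1}, {"data": 2}])
print(collated)
```

Because the decorator returns the function unchanged, the collator stays callable directly as well as through the registry.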
vissl.data.collators.mixup_collator module¶
-
vissl.data.collators.mixup_collator.
multicrop_mixup_collator
(batch)[source]¶ This collator is used to mix up 2 images at a time: 2*N input images become N images. This collator can handle multi-crop input. For each crop, it mixes up the corresponding crop of the next image.
- Input:
    batch: Example
        batch = [
            {"data": [img1_0, …, img1_k], …},
            {"data": [img2_0, …, img2_k], …},
            …
            {"data": [img2N_0, …, img2N_k], …},
        ]
- Returns: Example output:
    output = [
        {
            "data": [
                torch.tensor([img1_2_0, …, img1_2_k]),
                torch.tensor([img3_4_0, …, img3_4_k]),
                …
            ]
        },
    ]
vissl.data.collators.moco_collator module¶
-
vissl.data.collators.moco_collator.
moco_collator
(batch: List[Dict[str, Any]]) → Dict[str, List[torch.Tensor]][source]¶ This collator is specific to MoCo approach http://arxiv.org/abs/1911.05722
The collator collates the batch for the following input (assuming k copies of each image):
- Input:
    batch: Example
        batch = [
            {"data": [img1_0, …, img1_k], …},
            {"data": [img2_0, …, img2_k], …},
            …
        ]
- Returns: Example output:
    output = [
        {
            "data": torch.tensor([img1_0, …, img1_k], [img2_0, …, img2_k]) …
        },
    ]
Dimensions become [num_positives x Batch x C x H x W]
vissl.data.collators.multicrop_collator module¶
-
vissl.data.collators.multicrop_collator.
multicrop_collator
(batch)[source]¶ This collator is used in SwAV approach.
The collator collates the batch for the following input (assuming k copies of each image):
- Input:
    batch: Example
        batch = [
            {"data": [img1_0, …, img1_k], …},
            {"data": [img2_0, …, img2_k], …},
            …
        ]
- Returns: Example output:
    output = [
        {
            "data": torch.tensor([img1_0, …, imgN_0], [img1_k, …, imgN_k]) …
        },
    ]
vissl.data.collators.patch_and_image_collator module¶
-
vissl.data.collators.patch_and_image_collator.
patch_and_image_collator
(batch)[source]¶ This collator is used in PIRL approach.
- batch contains two keys “data” and “label”.
data is a list of N+1 elements. 1st element is the “image” and remainder N are patches.
label is an integer (image index in the dataset)
- We collate this to
    image: batch_size tensor containing images
    patches: N * batch_size tensor containing patches
vissl.data.collators.siamese_collator module¶
-
vissl.data.collators.siamese_collator.
siamese_collator
(batch)[source]¶ This collator is used in Jigsaw approach.
- Input:
    batch: Example
        batch = [
            {"data": [img1,], "label": [lbl1, ]},  # img1
            {"data": [img2,], "label": [lbl2, ]},  # img2
            …
            {"data": [imgN,], "label": [lblN, ]},  # imgN
        ]
    where:
        img{x} is a tensor of size: num_towers x C x H x W
        lbl{x} is an integer
- Returns: Example output:
    output = [
        {
            "data": torch.tensor([img1_0, …, imgN_0]) …
        },
    ]
    where the output is of dimension: (N * num_towers) x C x H x W
vissl.data.collators.simclr_collator module¶
-
vissl.data.collators.simclr_collator.
simclr_collator
(batch)[source]¶ This collator is used in SimCLR approach.
- The collator collates the batch for the following input (each image has k copies):
    input: [[img1_0, …, img1_k], [img2_0, …, img2_k], …, [imgN_0, …, imgN_k]]
    output: [img1_0, img2_0, …, img1_1, img2_1, …]
- Input:
    batch: Example
        batch = [
            {"data": [img1_0, …, img1_k], "label": [lbl1, ]},  # img1
            {"data": [img2_0, …, img2_k], "label": [lbl2, ]},  # img2
            …
            {"data": [imgN_0, …, imgN_k], "label": [lblN, ]},  # imgN
        ]
    where:
        img{x} is a tensor of size: C x H x W
        lbl{x} is an integer
- Returns: Example output:
    output = [
        {
            "data": torch.tensor([img1_0, img2_0, …, img1_1, img2_1, …]) …
        },
    ]
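The output ordering described for simclr_collator (copy 0 of every image first, then copy 1, and so on) can be illustrated with a small sketch. It assumes strings in place of image tensors, and `reorder_copies` is a hypothetical helper; the actual collator also stacks the tensors and handles labels.

```python
from typing import Any, Dict, List


def reorder_copies(batch: List[Dict[str, Any]]) -> List[Any]:
    """Flatten [[img1_0..img1_k], [img2_0..img2_k], ...] in copy-major order.

    The result lists copy 0 of every image first, then copy 1, and so on.
    """
    num_copies = len(batch[0]["data"])
    return [sample["data"][copy] for copy in range(num_copies) for sample in batch]


batch = [
    {"data": ["img1_0", "img1_1"], "label": [0]},
    {"data": ["img2_0", "img2_1"], "label": [1]},
    {"data": ["img3_0", "img3_1"], "label": [2]},
]
ordered = reorder_copies(batch)
print(ordered)
```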
vissl.data.collators.targets_one_hot_default_collator module¶
-
vissl.data.collators.targets_one_hot_default_collator.
convert_to_one_hot
(pos_lbl, neg_lbl, num_classes: int) → torch.Tensor[source]¶ This function converts target class indices to one-hot vectors, given the number of classes.
-> 1 for positive labels, -> 0 for negative and -> -1 for ignore labels.
-
vissl.data.collators.targets_one_hot_default_collator.
targets_one_hot_default_collator
(batch, num_classes: int)[source]¶ The collator collates the batch for the following input:
- Input:
    input: [[img0, …, imgk]]
    label: [
        [[1, 3, 6], [4, 9]],
        [[1, 5], [6, 8, 10, 11]],
        …
    ]
- Output:
    output: [img0, img0, …]
    label: [[0, 1, 0, 1, …, -1, 0, 0, 1], [0, 1, 0, 0, 0, 1, 0], …]
vissl.data.ssl_transforms module¶
-
class
vissl.data.ssl_transforms.
SSLTransformsWrapper
(indices, **args)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
VISSL wraps around transforms so that they work with the multimodal input. VISSL supports batches that come from several datasets and sources. Hence the input batch (images, labels) always is a list.
To apply the user-defined transforms, VISSL takes "indices" as input, which defines which dataset/source entries in the sample the transform should be applied to. For example:
- Assuming the input sample is
    {
        "data": [dataset1_imgX, dataset2_imgY],
        "label": [dataset1_lblX, dataset2_lblY]
    }
  and the transform is:
    TRANSFORMS:
        name: RandomGrayscale
        p: 0.2
        indices: 0
then the transform is applied only on dataset1_imgX. If, however, the indices are either not specified or set to 0, 1, then the transform is applied on both dataset1_imgX and dataset2_imgY.
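The effect of "indices" can be sketched as follows. This is an illustration only, assuming strings in place of images; `apply_on_indices` is a hypothetical helper and not the wrapper's actual code.

```python
from typing import Any, Callable, Dict, List, Optional


def apply_on_indices(
    transform: Callable[[Any], Any],
    sample: Dict[str, List[Any]],
    indices: Optional[List[int]] = None,
) -> Dict[str, List[Any]]:
    """Apply `transform` to the sample["data"] entries selected by `indices`."""
    if indices is None:
        # No indices given: transform the data from every source.
        indices = list(range(len(sample["data"])))
    selected = set(indices)
    data = [
        transform(item) if idx in selected else item
        for idx, item in enumerate(sample["data"])
    ]
    # Return a new dict so the input sample is left untouched.
    return {**sample, "data": data}


sample = {"data": ["dataset1_imgX", "dataset2_imgY"], "label": [3, 7]}
only_first = apply_on_indices(str.upper, sample, indices=[0])
both = apply_on_indices(str.upper, sample)
print(only_first["data"])
print(both["data"])
```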
Since this structure of data is introduced by vissl, the SSLTransformsWrapper takes care of dealing with the multi-modality input by wrapping the original transforms (pytorch transforms or custom transforms defined by user) and calling each transform on each index.
VISSL also supports _TRANSFORMS_WITH_LABELS transforms that modify the label or are used to generate the labels used in self-supervised learning tasks like Jigsaw. When the transforms in _TRANSFORMS_WITH_LABELS are called, the new label is also returned besides the transformed image.
VISSL also supports _TRANSFORMS_WITH_COPIES, which are transforms that generate several copies of an image. Common examples of self-supervised training methods that do this are SimCLR, SwAV and MoCo. When a transform from _TRANSFORMS_WITH_COPIES is used, the SSLTransformsWrapper flattens the transform output. For example, for the input [img1], if we apply ImgReplicatePil to replicate the image 2 times:
    SSLTransformsWrapper(
        ImgReplicatePil(num_times=2), [img1]
    )
will output [img1_1, img1_2] instead of the nested list [[img1_1, img1_2]].
The benefit of this is that the next set of transforms specified by the user can operate on img1_1 and img1_2 directly, since the input is now multi-modal in nature.
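The flattening behaviour can be sketched in a few lines. This is a simplified illustration, assuming strings in place of images; `apply_with_copies` and `replicate_twice` are hypothetical stand-ins for the wrapper and for a replication transform with num_times=2.

```python
from typing import Any, Callable, List


def apply_with_copies(transform: Callable[[Any], List[Any]], data: List[Any]) -> List[Any]:
    """Apply a copy-generating transform and flatten its output."""
    flat: List[Any] = []
    for item in data:
        # Each call returns a list of copies; extend keeps the result flat.
        flat.extend(transform(item))
    return flat


def replicate_twice(img: str) -> List[str]:
    # Stand-in for a replication transform with num_times=2.
    return [f"{img}_1", f"{img}_2"]


flattened = apply_with_copies(replicate_twice, ["img1"])
print(flattened)
```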
VISSL also supports _TRANSFORMS_WITH_GROUPING, which means that a single transform should be applied on the full multi-modal input together instead of separately on each entry. This is a common transform used in BYOL. For example:
    SSLTransformsWrapper(
        ImgPilMultiCropRandomApply(
            RandomApply, prob=[0.0, 0.2]
        ), [img1_1, img1_2]
    )
will apply RandomApply on img1_1 with prob=0.0 and on img1_2 with prob=0.2.
-
__init__
(indices, **args)[source]¶ - Parameters
indices (List[int]) (Optional) – the list of indices on which the transform should be applied. The input is always a list (example: a minibatch of size=2 looks like [[img1], [img2]]). If indices is not specified, the transform is applied to all the multi-modal input.
args (dict) – the arguments that the transform takes
-
__call__
(sample)[source]¶ Apply each transform on the specified indices of each entry in the input sample.
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.SSLTransformsWrapper[source]¶
-
vissl.data.ssl_transforms.
get_transform
(input_transforms_list)[source]¶ Given the list of user-specified transforms, return the torchvision.transforms.Compose() version of the transforms. Each transform in the composition is an SSLTransformsWrapper, which wraps the original transform to handle the multi-modal nature of the input.
vissl.data.ssl_transforms.img_patches_tensor module¶
-
class
vissl.data.ssl_transforms.img_patches_tensor.
ImgPatchesFromTensor
(num_patches=9, patch_jitter=21)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
Create image patches from a torch Tensor or numpy array. This transform was proposed in Jigsaw - https://arxiv.org/abs/1603.09246
- Parameters
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.img_patches_tensor.ImgPatchesFromTensor[source]¶ Instantiates ImgPatchesFromTensor from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ImgPatchesFromTensor instance.
vissl.data.ssl_transforms.img_pil_color_distortion module¶
-
class
vissl.data.ssl_transforms.img_pil_color_distortion.
ImgPilColorDistortion
(strength)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
Apply random color distortions to the input image. There are multiple different ways of applying these distortions. This implementation follows SimCLR - https://arxiv.org/abs/2002.05709. It randomly distorts the hue, saturation and brightness of an image and can randomly convert the image to grayscale.
-
__init__
(strength)[source]¶ - Parameters
strength (float) – A number used to quantify the strength of the color distortion.
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.img_pil_color_distortion.ImgPilColorDistortion[source]¶ Instantiates ImgPilColorDistortion from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ImgPilColorDistortion instance.
-
vissl.data.ssl_transforms.img_pil_gaussian_blur module¶
-
class
vissl.data.ssl_transforms.img_pil_gaussian_blur.
ImgPilGaussianBlur
(p, radius_min, radius_max)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
Apply Gaussian Blur to the PIL image. Takes the blur radius range and the probability of application as parameters.
This transform was used in SimCLR - https://arxiv.org/abs/2002.05709
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.img_pil_gaussian_blur.ImgPilGaussianBlur[source]¶ Instantiates ImgPilGaussianBlur from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ImgPilGaussianBlur instance.
vissl.data.ssl_transforms.img_pil_multicrop_random_apply module¶
-
class
vissl.data.ssl_transforms.img_pil_multicrop_random_apply.
ImgPilMultiCropRandomApply
(transforms: List[Dict[str, Any]], prob: float)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
Apply a list of transforms on multi-crop input. The transforms are randomly applied to each crop using the specified probability. This is used in BYOL https://arxiv.org/pdf/2006.07733.pdf
Multi-crops are several crops of a given image. This is most commonly used in contrastive learning. For example SimCLR, SwAV approaches use multi-crop input.
-
__init__
(transforms: List[Dict[str, Any]], prob: float)[source]¶ - Parameters
transforms (List(transforms)) – List of transforms that should be applied to each crop.
prob (List(float)) – Probability of RandomApply for the transforms composition on each crop. Example: for 2 crops in BYOL, for solarization:
    prob = [0.0, 0.2]
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.img_pil_multicrop_random_apply.ImgPilMultiCropRandomApply[source]¶ Instantiates ImgPilMultiCropRandomApply from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ImgPilMultiCropRandomApply instance.
-
vissl.data.ssl_transforms.img_pil_random_color_jitter module¶
-
class
vissl.data.ssl_transforms.img_pil_random_color_jitter.
ImgPilRandomColorJitter
(strength, prob)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
Apply Random color jitter to the input image. It randomly distorts the hue, saturation, brightness of an image.
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.img_pil_random_color_jitter.ImgPilRandomColorJitter[source]¶ Instantiates ImgPilRandomColorJitter from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ImgPilRandomColorJitter instance.
vissl.data.ssl_transforms.img_pil_random_photometric module¶
-
class
vissl.data.ssl_transforms.img_pil_random_photometric.
ImgPilRandomPhotometric
(p)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
Randomly apply some photometric transforms to an image. This was used in PIRL - https://arxiv.org/abs/1912.01991
- The photometric transforms applied include:
AutoContrast, RandomPosterize, RandomSharpness, RandomSolarize
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.img_pil_random_photometric.ImgPilRandomPhotometric[source]¶ Instantiates ImgPilRandomPhotometric from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ImgPilRandomPhotometric instance.
vissl.data.ssl_transforms.img_pil_random_solarize module¶
-
class
vissl.data.ssl_transforms.img_pil_random_solarize.
ImgPilRandomSolarize
(prob: float)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
Randomly apply solarization transform to an image. This was used in BYOL - https://arxiv.org/abs/2006.07733
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.img_pil_random_solarize.ImgPilRandomSolarize[source]¶ Instantiates ImgPilRandomSolarize from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ImgPilRandomSolarize instance.
vissl.data.ssl_transforms.img_pil_to_lab_tensor module¶
-
class
vissl.data.ssl_transforms.img_pil_to_lab_tensor.
ImgPil2LabTensor
(indices)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
Convert a PIL image to a LAB tensor of shape C x H x W. This transform was proposed in Colorization - https://arxiv.org/abs/1603.08511
The input image is a PIL Image. We first convert it to an HWC tensor with channel order RGB. We then convert RGB to BGR and use OpenCV to convert the image to LAB. The LAB image is an 8-bit image in the range L [0, 255], A [0, 255], B [0, 255]. We rescale it to: L [0, 100], A [-128, 127], B [-128, 127]
The output is an image torch tensor.
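The rescaling step can be worked through on a single pixel. This sketch assumes a plain linear mapping that matches the ranges stated above (L stretched from [0, 255] onto [0, 100], A and B shifted by 128); `rescale_lab_pixel` is a hypothetical helper, and the transform itself operates on whole tensors.

```python
from typing import Tuple


def rescale_lab_pixel(l8: int, a8: int, b8: int) -> Tuple[float, float, float]:
    """Map 8-bit LAB values in [0, 255] to L [0, 100] and A/B [-128, 127]."""
    lightness = l8 * 100.0 / 255.0  # stretch [0, 255] onto [0, 100]
    a = a8 - 128.0  # shift [0, 255] onto [-128, 127]
    b = b8 - 128.0
    return lightness, a, b


print(rescale_lab_pixel(0, 0, 0))
print(rescale_lab_pixel(255, 255, 255))
```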
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.img_pil_to_lab_tensor.ImgPil2LabTensor[source]¶ Instantiates ImgPil2LabTensor from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ImgPil2LabTensor instance.
vissl.data.ssl_transforms.img_pil_to_multicrop module¶
-
class
vissl.data.ssl_transforms.img_pil_to_multicrop.
ImgPilToMultiCrop
(total_num_crops, num_crops, size_crops, crop_scales)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
Convert a PIL image to Multi-resolution Crops. The input is a PIL image and output is the list of image crops.
This transform was proposed in SwAV - https://arxiv.org/abs/2006.09882
-
__init__
(total_num_crops, num_crops, size_crops, crop_scales)[source]¶ Returns total_num_crops square crops of an image. Each crop is a random crop extracted according to the parameters specified in size_crops and crop_scales. For ease of use, one can specify num_crops which removes the need to repeat parameters.
- Parameters
Example usage:
    - (total_num_crops=2, num_crops=[1, 1], size_crops=[224, 96], crop_scales=[(0.14, 1.), (0.05, 0.14)])
      Extracts 2 crops total: 1 of size 224x224 and 1 of size 96x96
    - (total_num_crops=3, num_crops=[1, 2], size_crops=[224, 96], crop_scales=[(0.14, 1.), (0.05, 0.14)])
      Extracts 3 crops total: 1 of size 224x224 and 2 of size 96x96
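How num_crops removes the need to repeat parameters can be sketched as an expansion from per-group settings to one entry per crop. This is an illustration only: `expand_crop_specs` is a hypothetical helper, and it assumes that total_num_crops equals the sum of num_crops.

```python
from typing import List, Tuple


def expand_crop_specs(
    total_num_crops: int,
    num_crops: List[int],
    size_crops: List[int],
    crop_scales: List[Tuple[float, float]],
) -> List[Tuple[int, Tuple[float, float]]]:
    """Return one (size, scale) pair per crop, repeating each group's settings."""
    if not (len(num_crops) == len(size_crops) == len(crop_scales)):
        raise ValueError("num_crops, size_crops and crop_scales must align")
    if sum(num_crops) != total_num_crops:
        # Assumption of this sketch: the total is the sum of the group counts.
        raise ValueError("total_num_crops must equal sum(num_crops)")
    specs: List[Tuple[int, Tuple[float, float]]] = []
    for count, size, scale in zip(num_crops, size_crops, crop_scales):
        specs.extend([(size, scale)] * count)
    return specs


specs = expand_crop_specs(
    total_num_crops=3,
    num_crops=[1, 2],
    size_crops=[224, 96],
    crop_scales=[(0.14, 1.0), (0.05, 0.14)],
)
print(specs)
```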
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.img_pil_to_multicrop.ImgPilToMultiCrop[source]¶ Instantiates ImgPilToMultiCrop from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ImgPilToMultiCrop instance.
-
vissl.data.ssl_transforms.img_pil_to_patches_and_image module¶
-
class
vissl.data.ssl_transforms.img_pil_to_patches_and_image.
ImgPilToPatchesAndImage
(crop_scale_image=(0.08, 1.0), crop_size_image=224, crop_scale_patches=(0.6, 1.0), crop_size_patches=255, permute_patches=True, num_patches=9)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
Convert an input PIL image to Patches and Image. This transform was proposed in PIRL - https://arxiv.org/abs/1912.01991.
- Input:
PIL Image
- Returns
- list containing N+1 elements
zeroth element: a RandomResizedCrop of the image
remainder: N patches extracted uniformly from a RandomResizedCrop
-
__init__
(crop_scale_image=(0.08, 1.0), crop_size_image=224, crop_scale_patches=(0.6, 1.0), crop_size_patches=255, permute_patches=True, num_patches=9)[source]¶ - Parameters
crop_scale_image (tuple of floats) – scale for RandomResizedCrop of image
crop_size_image (int) – size for RandomResizedCrop of image
crop_scale_patches (tuple of floats) – scale for RandomResizedCrop of patches
crop_size_patches (int) – size for RandomResizedCrop of patches
permute_patches (bool) – permute the patches in any order
num_patches (int) – number of patches to create. Should be a perfect square.
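Why num_patches should be a square number becomes clear when the patches are laid out on a uniform grid. The sketch below is an assumption about the geometry only: `patch_grid_boxes` is a hypothetical helper that computes box coordinates, and it does not model any resizing or permutation the transform may apply.

```python
import math
from typing import List, Tuple


def patch_grid_boxes(crop_size: int, num_patches: int) -> List[Tuple[int, int, int, int]]:
    """Split a crop_size x crop_size square into a uniform grid of boxes.

    Returns (left, upper, right, lower) boxes in row-major order.
    """
    grid = math.isqrt(num_patches)
    if grid * grid != num_patches:
        raise ValueError("num_patches must be a perfect square")
    cell = crop_size // grid
    return [
        (col * cell, row * cell, (col + 1) * cell, (row + 1) * cell)
        for row in range(grid)
        for col in range(grid)
    ]


boxes = patch_grid_boxes(crop_size=255, num_patches=9)
print(len(boxes), boxes[0], boxes[-1])
```

With the default crop_size_patches=255 and num_patches=9, each cell of the 3 x 3 grid is 85 x 85 pixels.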
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.img_pil_to_patches_and_image.ImgPilToPatchesAndImage[source]¶ Instantiates ImgPilToPatchesAndImage from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ImgPilToPatchesAndImage instance.
vissl.data.ssl_transforms.img_pil_to_raw_tensor module¶
-
class
vissl.data.ssl_transforms.img_pil_to_raw_tensor.
ImgPilToRawTensor
[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
Convert a PIL image to a raw tensor, for cases where we don’t want the default division by 255 applied by torchvision.transforms.ToTensor()
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.img_pil_to_raw_tensor.ImgPilToRawTensor[source]¶ Instantiates ImgPilToRawTensor from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ImgPilToRawTensor instance.
vissl.data.ssl_transforms.img_pil_to_tensor module¶
-
class
vissl.data.ssl_transforms.img_pil_to_tensor.
ImgToTensor
[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
The Transform that overrides the PyTorch transform to provide better transformation speed.
# credits: mannatsingh@fb.com
vissl.data.ssl_transforms.img_replicate_pil module¶
-
class
vissl.data.ssl_transforms.img_replicate_pil.
ImgReplicatePil
(num_times: int = 2)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
Adds the same image to the batch K times, so that the batch size is now N*K. Use the simclr_collator to convert into batches.
This transform is useful when generating multiple copies of the same image, for example, when training contrastive methods.
-
__init__
(num_times: int = 2)[source]¶ - Parameters
num_times (int) – how many times should the image be replicated.
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.img_replicate_pil.ImgReplicatePil[source]¶ Instantiates ImgReplicatePil from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ImgReplicatePil instance.
-
vissl.data.ssl_transforms.img_rotate_pil module¶
-
class
vissl.data.ssl_transforms.img_rotate_pil.
ImgRotatePil
(num_angles=4, num_rotations_per_img=1)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
Apply rotation to a PIL Image. Samples rotation angle from a set of predefined rotation angles.
Predefined rotation angles are sampled at equal intervals in the [0, 360) angle space where the number of angles is specified by num_angles.
This transform was used in RotNet - https://arxiv.org/abs/1803.07728
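The set of predefined angles can be computed directly from num_angles. In this sketch, `rotation_angles` and `sample_rotation` are hypothetical helpers; returning the index of the sampled angle alongside the angle is an assumption about how a rotation label could be derived.

```python
import random
from typing import List, Tuple


def rotation_angles(num_angles: int) -> List[float]:
    """Angles spaced at equal intervals in [0, 360)."""
    return [360.0 * i / num_angles for i in range(num_angles)]


def sample_rotation(num_angles: int, rng: random.Random) -> Tuple[int, float]:
    """Pick one predefined angle and return (index, angle)."""
    angles = rotation_angles(num_angles)
    index = rng.randrange(num_angles)
    return index, angles[index]


print(rotation_angles(4))
index, angle = sample_rotation(4, random.Random(0))
print(index, angle)
```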
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.img_rotate_pil.ImgRotatePil[source]¶ Instantiates ImgRotatePil from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ImgRotatePil instance.
vissl.data.ssl_transforms.pil_photometric_transforms_lib module¶
-
class
vissl.data.ssl_transforms.pil_photometric_transforms_lib.
TransformObject
[source]¶ Bases:
object
Helper object that prints information about the transformation; other transforms can inherit from it.
-
class
vissl.data.ssl_transforms.pil_photometric_transforms_lib.
RandomValueApplier
(min_v, max_v, root_transform, vtype='float', closed_interval=False)[source]¶ Bases:
vissl.data.ssl_transforms.pil_photometric_transforms_lib.TransformObject
-
__init__
(min_v, max_v, root_transform, vtype='float', closed_interval=False)[source]¶ Applies a transform by sampling a random value between [min_v, max_v]
- Parameters
root_transform (transform object) – transform that will be applied. must accept a value as input.
vtype (string) – value type - either “float” or “int”
closed_interval (bool) – sample from [min_v, max_v] (when True) or [min_v, max_v) when False
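The sampling behaviour described by these parameters can be sketched with the standard library. `sample_value` is a hypothetical helper, not the class's actual method, and the `rng` argument is added here only to make the draw reproducible.

```python
import random
from typing import Optional, Union


def sample_value(
    min_v: float,
    max_v: float,
    vtype: str = "float",
    closed_interval: bool = False,
    rng: Optional[random.Random] = None,
) -> Union[int, float]:
    """Draw a value from [min_v, max_v] (closed) or [min_v, max_v) (half-open)."""
    rng = rng or random.Random()
    if vtype == "int":
        low, high = int(min_v), int(max_v)
        # randint includes `high`; randrange excludes it.
        return rng.randint(low, high) if closed_interval else rng.randrange(low, high)
    if vtype == "float":
        # random() lies in [0, 1). For floats the open/closed distinction is
        # negligible in practice, so this sketch uses the same draw for both.
        return min_v + (max_v - min_v) * rng.random()
    raise ValueError(f"Unsupported vtype: {vtype}")


rng = random.Random(0)
print(sample_value(4, 8, vtype="int", rng=rng))
print(sample_value(0.1, 1.9, vtype="float", rng=rng))
```

A root transform would then be called as root_transform(img, value) with the sampled value.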
-
-
vissl.data.ssl_transforms.pil_photometric_transforms_lib.
Sharpness
(img, v)[source]¶ Applies PIL.ImageEnhance.Sharpness to the image
-
vissl.data.ssl_transforms.pil_photometric_transforms_lib.
Solarize
(img, v)[source]¶ Applies PIL.ImageOps.solarize to the image
-
vissl.data.ssl_transforms.pil_photometric_transforms_lib.
Posterize
(img, v)[source]¶ Applies PIL.ImageOps.posterize to the image
-
vissl.data.ssl_transforms.pil_photometric_transforms_lib.
AutoContrast
(img, _)[source]¶ Applies PIL.ImageOps.autocontrast to the image
-
class
vissl.data.ssl_transforms.pil_photometric_transforms_lib.
RandomSharpnessTransform
(min_v=0.1, max_v=1.9, root_transform=<function Sharpness>, vtype='float')[source]¶ Bases:
vissl.data.ssl_transforms.pil_photometric_transforms_lib.RandomValueApplier
Randomly apply the Sharpness transformation with the random value selected from an interval.
-
class
vissl.data.ssl_transforms.pil_photometric_transforms_lib.
RandomPosterizeTransform
(min_v=4, max_v=8, root_transform=<function Posterize>, vtype='int')[source]¶ Bases:
vissl.data.ssl_transforms.pil_photometric_transforms_lib.RandomValueApplier
-
class
vissl.data.ssl_transforms.pil_photometric_transforms_lib.
RandomSolarizeTransform
(min_v=0, max_v=256, root_transform=<function Solarize>, vtype='int')[source]¶ Bases:
vissl.data.ssl_transforms.pil_photometric_transforms_lib.RandomValueApplier
-
class
vissl.data.ssl_transforms.pil_photometric_transforms_lib.
AutoContrastTransform
[source]¶ Bases:
vissl.data.ssl_transforms.pil_photometric_transforms_lib.TransformObject
Wraps the AutoContrast method
vissl.data.ssl_transforms.shuffle_img_patches module¶
-
class
vissl.data.ssl_transforms.shuffle_img_patches.
ShuffleImgPatches
(perm_file: str)[source]¶ Bases:
classy_vision.dataset.transforms.classy_transform.ClassyTransform
This transform is used to shuffle the list of tensors (usually image patches of shape C x H x W) according to a randomly selected permutation from a pre-defined set of permutations.
This is a common operation used in Jigsaw approach https://arxiv.org/abs/1603.09246
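The shuffling step can be sketched as follows. This is a simplified illustration, assuming strings in place of patch tensors and an in-memory list of permutations rather than a perm_file; `shuffle_patches` is a hypothetical helper, and returning the permutation index as a label is an assumption.

```python
import random
from typing import Any, List, Sequence, Tuple


def shuffle_patches(
    patches: Sequence[Any],
    permutations: Sequence[Sequence[int]],
    rng: random.Random,
) -> Tuple[List[Any], int]:
    """Reorder `patches` with one randomly chosen predefined permutation.

    Returns the shuffled patches and the index of the permutation used.
    """
    perm_index = rng.randrange(len(permutations))
    perm = permutations[perm_index]
    if sorted(perm) != list(range(len(patches))):
        raise ValueError("permutation does not match the number of patches")
    return [patches[i] for i in perm], perm_index


permutations = [(0, 1, 2, 3), (3, 2, 1, 0), (1, 0, 3, 2)]
patches = ["p0", "p1", "p2", "p3"]
shuffled, perm_index = shuffle_patches(patches, permutations, random.Random(0))
print(perm_index, shuffled)
```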
-
__init__
(perm_file: str)[source]¶ - Parameters
perm_file (string) – path to the file containing pre-defined permutations.
-
__call__
(input_patches)[source]¶ The interface __call__ is used to transform the input data. It should contain the actual implementation of data transform.
- Parameters
input_patches (List[torch.tensor]) – list of torch tensors
-
classmethod
from_config
(config: Dict[str, Any]) → vissl.data.ssl_transforms.shuffle_img_patches.ShuffleImgPatches[source]¶ Instantiates ShuffleImgPatches from configuration.
- Parameters
config (Dict) – arguments for the transform
- Returns
ShuffleImgPatches instance.
-
vissl.data.data_helper module¶
-
vissl.data.data_helper.
get_mean_image
(crop_size)[source]¶ Helper function that returns a gray PIL image of the size specified by user.
- Parameters
crop_size (int) – used to generate (crop_size x crop_size x 3) image.
- Returns
img – PIL Image
-
class
vissl.data.data_helper.
StatefulDistributedSampler
(dataset, batch_size=None)[source]¶ Bases:
torch.utils.data.distributed.DistributedSampler
A more fine-grained, stateful data sampler that uses both the training iteration and the epoch for shuffling data. The PyTorch DistributedSampler only uses the epoch for shuffling and always starts sampling from the beginning of the data. When training on very large data, we may train for one epoch only, and when we resume training we want the data sampler to resume from the training iteration.
-
__init__
(dataset, batch_size=None)[source]¶ Initializes the instance of StatefulDistributedSampler. Random seed is set for the epoch set and data is shuffled. For starting the sampling, use the start_iter (set to 0 or set by checkpointing resuming) to sample data from the remaining images.
- Parameters
dataset (Dataset) – Pytorch dataset that sampler will shuffle
batch_size (int) – batch size we want the sampler to sample
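The resume behaviour can be sketched for a single process. This is an illustration under stated assumptions: `resumed_indices` is a hypothetical helper, the `seed` argument and the use of Python's random module are choices of this sketch, and the sharding of indices across distributed replicas is not modeled.

```python
import random
from typing import List


def resumed_indices(
    dataset_size: int,
    epoch: int,
    start_iter: int,
    batch_size: int,
    seed: int = 0,
) -> List[int]:
    """Indices left to sample in `epoch` after `start_iter` batches were consumed."""
    indices = list(range(dataset_size))
    # Seeding with the epoch makes the shuffle reproducible on resume.
    random.Random(seed + epoch).shuffle(indices)
    # Skip the samples already consumed before the checkpoint.
    return indices[start_iter * batch_size:]


full = resumed_indices(dataset_size=10, epoch=0, start_iter=0, batch_size=2)
resumed = resumed_indices(dataset_size=10, epoch=0, start_iter=3, batch_size=2)
print(full)
print(resumed)
```

Because the shuffle depends only on the seed and the epoch, the resumed run sees exactly the tail of the original order.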
-
-
class
vissl.data.data_helper.
QueueDataset
(queue_size)[source]¶ Bases:
torch.utils.data.dataset.Dataset
This class helps deal with invalid images in the dataset by using two queues. One queue is used to enqueue seen and valid images from previous batches. The other queue is used to dequeue. The class is implemented such that the same batch never has duplicate images. If we can’t dequeue a valid image, we return None for that instance.
- Parameters
queue_size – size of the queue (ideally set it to batch_size). Both queues will be of the same size
vissl.data.dataloader_sync_gpu_wrapper module¶
-
class
vissl.data.dataloader_sync_gpu_wrapper.
DataloaderSyncGPUWrapper
(dataloader: Iterable)[source]¶ Bases:
classy_vision.dataset.dataloader_wrapper.DataloaderWrapper
Dataloader which wraps another dataloader and moves the data to the GPU asynchronously, so as to overlap the cost of copying data from CPU to GPU with the previous model iteration.
vissl.data.ssl_dataset module¶
-
class
vissl.data.ssl_dataset.
GenericSSLDataset
(cfg, split, dataset_source_map)[source]¶ Bases:
torch.utils.data.dataset.Dataset
Base Self Supervised Learning Dataset Class.
The GenericSSLDataset class is defined to support reading data from multiple data sources. For example: data = [dataset1, dataset2] and the minibatches generated will have the corresponding data from each dataset.
For this reason, we also support labels from multiple sources. For example targets = [dataset1 targets, dataset2 targets].
In order to support multiple data sources, the dataset configuration always takes list inputs:
DATA_SOURCES, LABEL_SOURCES, DATASET_NAMES, DATA_PATHS, LABEL_PATHS
When there are several data sources, the user can also specify which datasets the transforms should be applied to. By default, the transforms are applied to data from all datasets.
- Parameters
cfg (AttrDict) – configuration defined by user
split (str) – the dataset split for which we are constructing the Dataset object
dataset_source_map (Dict[str, Callable]) –
The dictionary that maps what data sources are supported and what object to use to read data from those sources. For example: DATASET_SOURCE_MAP = {
“disk_filelist”: DiskImageDataset, “disk_folder”: DiskImageDataset, “synthetic”: SyntheticImageDataset,
}
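As a rough illustration of how such a map can be used, the sketch below looks up one reader class per entry of a data source list. The two classes are simplified stand-ins with a made-up two-argument constructor (the real classes take more arguments), and build_data_objects is a hypothetical helper, not part of VISSL.

```python
class DiskImageDataset:
    """Stand-in for illustration only; the real class takes more arguments."""

    def __init__(self, data_source, path):
        self.data_source = data_source
        self.path = path


class SyntheticImageDataset:
    """Stand-in for illustration only."""

    def __init__(self, data_source, path):
        self.data_source = data_source
        self.path = path


DATASET_SOURCE_MAP = {
    "disk_filelist": DiskImageDataset,
    "disk_folder": DiskImageDataset,
    "synthetic": SyntheticImageDataset,
}


def build_data_objects(data_sources, data_paths):
    """Hypothetical helper: create one reader object per configured source."""
    objs = []
    for source, path in zip(data_sources, data_paths):
        if source not in DATASET_SOURCE_MAP:
            raise ValueError(f"Unsupported data source: {source}")
        objs.append(DATASET_SOURCE_MAP[source](source, path))
    return objs
```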
-
load_single_label_file
(path)[source]¶ Load a single label file. Numpy label files are supported only when the user specifies a data_filelist source of labels.
To save memory, if mmap_mode is set to True for loading, we try to load the labels in mmap_mode. If that fails, we simply load the labels without mmap.
-
__getitem__
(idx)[source]¶ Get the input sample for the minibatch for a specified data index. For each data object (if we are loading several datasets in a minibatch), we get the sample: consisting of {
image data,
label (if applicable) otherwise idx
data_valid: 0 or 1 indicating if the data is valid image
data_idx : index of the data in the dataset for book-keeping and debugging
}
Once the sample data is available, we apply the data transform on the sample.
The final transformed sample is returned to be added into the minibatch.
-
get_image_paths
()[source]¶ Get the image paths for all the data sources.
- Returns
image_paths (List[List[str]]) – list containing the list of image paths for each data source.
-
get_available_splits
(dataset_config)[source]¶ Get the available splits in the dataset config. Not specific to the split for which the SSLDataset is being constructed.
NOTE: this is a deprecated method.
vissl.data.disk_dataset module¶
-
class
vissl.data.disk_dataset.
DiskImageDataset
(cfg, data_source, path, split, dataset_name)[source]¶ Bases:
vissl.data.data_helper.QueueDataset
Base Dataset class for loading images from Disk. Can load a predefined list of images or all images inside a folder.
Inherits from QueueDataset class in VISSL to provide better handling of the invalid images by replacing them with the valid and seen images.
- Parameters
cfg (AttrDict) – configuration defined by user
data_source (string) – data source either of “disk_filelist” or “disk_folder”
path (string) –
can be either of the following:
1. A .npy file containing a list of filepaths. In this case data_source = “disk_filelist”
2. A folder such that folder/split contains images. In this case data_source = “disk_folder”
split (string) – specify split for the dataset. Usually train/val/test. Used to read images if reading from a folder path and retrieve settings for that split from the config path.
dataset_name (string) – name of dataset. For information only.
NOTE: This dataset class only returns images (not labels or other metadata). To load labels you must specify them in LABEL_SOURCES (see ssl_dataset.py). LABEL_SOURCES follows a similar convention as the dataset and can either be a filelist or a torchvision ImageFolder compatible folder:
1. Store labels in a numpy file.
2. Store images in a nested directory structure so that the torchvision ImageFolder dataset can infer the labels.
-
__getitem__
(idx)[source]¶ We do delayed loading of data to reduce the memory size due to pickling of dataset across dataloader workers.
Loads the data if not already loaded.
Sets and initializes the queue if not already initialized
Depending on the data source (folder or filelist), get the image. If using the QueueDataset and image is valid, save the image in queue if not full. Otherwise return a valid seen image from the queue if queue is not empty.
vissl.data.synthetic_dataset module¶
-
class
vissl.data.synthetic_dataset.
SyntheticImageDataset
(cfg, path, split, dataset_name, data_source='synthetic')[source]¶ Bases:
torch.utils.data.dataset.Dataset
Synthetic dataset class. The mean image is always returned. This dataset is recommended for testing purposes only.
- Parameters
path (string) – can be “” [not used]
split (string) – specify split for the dataset. Usually train/val/test. Used to read images if reading from a folder path and retrieve settings for that split from the config path [not used]
dataset_name (string) – name of dataset. For information only. [not used]
data_source (string, Optional) – data source (“synthetic”) [not used]
vissl.data.dataset_catalog module¶
Data and labels file for various datasets.
-
class
vissl.data.dataset_catalog.
VisslDatasetCatalog
[source]¶ Bases:
object
A catalog that stores information about the datasets and how to obtain them. It contains a mapping from strings (which are names that identify a dataset, e.g. “imagenet1k”) to a dict which contains:
mapping of various data splits (train, test, val) to the data source (path on the disk whether a folder path or a filelist)
source of the data (disk_filelist | disk_folder)
The purpose of having this catalog is to make it easy to choose different datasets, by just using the strings in the config.
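A name-to-dict registry of this kind can be sketched as below. This is a simplified illustration and not the VisslDatasetCatalog implementation: the class MiniCatalog, its get method, the dataset name and the placeholder paths are all hypothetical, and the exact layout of each per-dataset dict is an assumption.

```python
class MiniCatalog:
    """Hypothetical, simplified registry; not the VisslDatasetCatalog code."""

    _registered = {}

    @staticmethod
    def register_data(name, data_dict):
        # Each dataset name maps to a dict describing its splits.
        if not isinstance(data_dict, dict):
            raise TypeError("data_dict must be a dict")
        MiniCatalog._registered[name] = data_dict

    @staticmethod
    def register_dict(dict_catalog):
        # Register several datasets at once.
        for name, data_dict in dict_catalog.items():
            MiniCatalog.register_data(name, data_dict)

    @staticmethod
    def get(name):
        return MiniCatalog._registered[name]


# Assumed layout: split name -> path on disk (placeholder paths).
MiniCatalog.register_dict(
    {
        "my_dataset_folder": {
            "train": "/path/to/my_dataset/train",
            "val": "/path/to/my_dataset/val",
        }
    }
)
```

A config can then refer to the dataset by the string "my_dataset_folder" alone, and the paths are resolved through the registry.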
-
static
register_json
(json_catalog_path)[source]¶ - Parameters
json_catalog_path – a .json filepath that contains the data to be registered
-
static
register_dict
(dict_catalog)[source]¶ - Parameters
dict_catalog – a dict containing the datasets to be registered
-
static
register_data
(name, data_dict)[source]¶ - Parameters
name (str) – the name that identifies a dataset, e.g. “imagenet1k_folder”.
data_dict (dict) – a dict describing the dataset: the mapping of data splits to data sources and the source of the data, as described for the catalog above.
-
vissl.data.dataset_catalog.
get_local_path
(input_file, dest_dir)[source]¶ If user specified copying data to a local directory, get the local path where the data files were copied.
If input_file is just a file, we return the dest_dir/filename
If the input_file is a directory, then we check if the environment is SLURM and use slurm_dir, or otherwise dest_dir, to look up whether the copy_complete file is available. If available, we return the directory.
If both above fail, we return the input_file as is.
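The three cases above can be sketched as follows. This is a simplified reading of the description, not the VISSL code: the function name local_path_sketch is hypothetical, the SLURM check is replaced by an optional slurm_dir argument, and the final existence check is an assumption about what "fail" means.

```python
import os


def local_path_sketch(input_file, dest_dir, slurm_dir=None):
    """Hypothetical sketch of resolving the local copy of input_file."""
    candidate = None
    if os.path.isfile(input_file):
        # Case 1: a single file maps to dest_dir/filename.
        candidate = os.path.join(dest_dir, os.path.basename(input_file))
    elif os.path.isdir(input_file):
        # Case 2: a directory counts as copied only if the marker file exists.
        local_dir = slurm_dir if slurm_dir is not None else dest_dir
        if os.path.isfile(os.path.join(local_dir, "copy_complete")):
            candidate = local_dir
    # Case 3: fall back to the original path (assumption: also when the
    # candidate does not exist on disk).
    if candidate is not None and os.path.exists(candidate):
        return candidate
    return input_file
```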
-
vissl.data.dataset_catalog.
get_local_output_filepaths
(input_files, dest_dir)[source]¶ If we have copied the files to local disk as specified in the config, we return those local paths. Otherwise return the original paths.
-
vissl.data.dataset_catalog.
check_data_exists
(data_files)[source]¶ Check that the input data files exist. If the data_files is a list, we iteratively check for each file in the list.
-
vissl.data.dataset_catalog.
register_pascal_voc
()[source]¶ Register PASCAL VOC 2007 and 2012 datasets to the data catalog. We first look up the paths of these datasets in the dataset catalog. If the paths exist, we register the datasets; otherwise we remove the voc_data from the catalog registry.
-
vissl.data.dataset_catalog.
register_coco
()[source]¶ Register COCO 2014 datasets to the data catalog. We first look up the paths of these datasets in the dataset catalog. If the paths exist, we register the datasets; otherwise we remove the coco2014_folder from the catalog registry.
-
vissl.data.dataset_catalog.
register_datasets
(json_catalog_path)[source]¶ If the json dataset_catalog file is found, we register the datasets specified in the catalog with VISSL. If the catalog also specifies VOC or COCO datasets, we register them.
- Parameters
json_catalog_path (str) – the path to the json dataset catalog
-
vissl.data.dataset_catalog.
get_data_files
(split, dataset_config)[source]¶ Get the path to the dataset (images and labels).
If the user has explicitly specified the data_sources, we simply use those and don’t do lookup in the datasets registered with VISSL from the dataset catalog.
If the user hasn’t specified the path, look for the dataset in the datasets catalog registered with VISSL. For a given list of datasets and a given partition (train/test), we first verify that we have the dataset and the correct source as specified by the user. Then for each dataset in the list, we get the data path (make sure it exists, sources match). For the label file, the file is optional.
Once we have the dataset original paths, we replace the path with the local paths if the data was copied to local disk.