Instance Retrieval and Copy Detection Benchmarks

Self-supervised models have been shown to achieve state-of-the-art performance on instance retrieval and copy detection. VISSL supports benchmarking on the following datasets: ROxford, RParis, and CopyDays.

Setting Up Datasets

To set up the datasets, please follow the steps below.

Revisited Oxford/Paris

These datasets test instance retrieval on landmarks in Oxford and Paris and are an update of the original Oxford/Paris datasets. For more information about these datasets, see here.

To set up the datasets, VISSL provides convenience scripts for ROxford and RParis. For example:

python extra_scripts/datasets/create_oxford_dataset.py \
    -i /path/to/roxford/ \
    -o /output_path/roxford \
    -d

CopyDays

This dataset tests copy detection performance. For more information about this dataset, see here.

To set up the dataset, please follow these `instructions <https://lear.inrialpes.fr/~jegou/data.php>`_.

Evaluating the Datasets

At a high level, the features for the database, query, and train images are extracted as follows:

  • Step 1: Images are loaded, resized, normalized, and converted to a Tensor.

  • Step 2: Images are fed to the model. Optionally, each image can be fed to the model at multiple scales, e.g. IMG_SCALINGS=[1, 0.5].

  • Step 3 (Optional): Post-processing is applied to the model output: none, gem, or rmac. This option is controlled by FEATS_PROCESSING_TYPE.

  • Step 4 (Optional): The features are L2-normalized. This option is controlled by NORMALIZE_FEATURES.
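The four extraction steps above can be sketched as follows. This is a minimal NumPy sketch, not VISSL's implementation: `resize`, `gem_pool`, and `extract_features` are illustrative helper names, and a nearest-neighbour rescale stands in for the real image pipeline.

```python
import numpy as np

def resize(image, scale):
    # Step 1 stand-in: nearest-neighbour rescale of a (C, H, W) array.
    c, h, w = image.shape
    ys = np.clip((np.arange(max(int(h * scale), 1)) / scale).astype(int), 0, h - 1)
    xs = np.clip((np.arange(max(int(w * scale), 1)) / scale).astype(int), 0, w - 1)
    return image[:, ys][:, :, xs]

def gem_pool(fmap, p=4.0, eps=1e-6):
    # GeM pooling over the spatial dims of a (C, H, W) feature map;
    # p=1 reduces to plain average pooling.
    return (np.clip(fmap, eps, None) ** p).mean(axis=(1, 2)) ** (1.0 / p)

def extract_features(model, image, scalings=(1.0,), processing="", gem_p=4.0,
                     normalize=True):
    feats = []
    for s in scalings:
        fmap = model(resize(image, s))      # Step 2: one forward pass per scale
        if processing == "gem":             # Step 3: optional post-processing
            f = gem_pool(fmap, gem_p)
        else:                               # "" -> global average pooling here
            f = fmap.mean(axis=(1, 2))
        feats.append(f)
    feat = np.mean(feats, axis=0)           # average the per-scale features
    if normalize:                           # Step 4: optional L2 normalization
        feat = feat / np.linalg.norm(feat)
    return feat
```

Note that with `scalings=(1, 0.5)` the same image contributes two forward passes whose pooled features are averaged, mirroring the IMG_SCALINGS option.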

The entire evaluation is as follows:

  • Step 1 (Optional): Extract the train features as above and train PCA/whitening on them. You must set EVAL_DATASET_NAME and TRAIN_PCA_WHITENING: True.

  • Step 2: Extract the database and query features as above.

  • Step 3 (Optional): Apply the PCA/whitening fit in Step 1 to the database and query features.

  • Step 4: For each query image, rank the database according to the SIMILARITY_MEASURE.

  • Step 5: Evaluate using the mean Average Precision (mAP) metric.
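The evaluation steps above can be sketched in NumPy as follows. The helper names are illustrative, not VISSL internals, and the AP computation is a simplified stand-in for the evaluation VISSL performs (e.g. via the compute_ap binary for Oxford/Paris).

```python
import numpy as np

def fit_pca_whitening(train_feats, n_components):
    # Step 1: fit PCA/whitening on the train features (rows = images).
    mean = train_feats.mean(axis=0)
    cov = np.cov(train_feats - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Whitening: scale each principal direction by 1/sqrt(eigenvalue).
    proj = eigvecs[:, order] / np.sqrt(eigvals[order] + 1e-9)
    return mean, proj

def apply_pca(feats, mean, proj):
    # Step 3: project database/query features and re-normalize.
    out = (feats - mean) @ proj
    return out / np.linalg.norm(out, axis=1, keepdims=True)

def rank_database(query, database, measure="cosine_similarity"):
    # Step 4: rank database indices for a single query vector.
    if measure == "cosine_similarity":
        return np.argsort(-(database @ query))   # assumes L2-normalized rows
    return np.argsort(np.linalg.norm(database - query, axis=1))  # "l2"

def average_precision(ranking, positives):
    # Step 5: AP for one query; mAP is the mean of AP over all queries.
    hits, ap = 0, 0.0
    for i, idx in enumerate(ranking):
        if idx in positives:
            hits += 1
            ap += hits / (i + 1)
    return ap / max(len(positives), 1)
```

For example, ranking a database of three feature vectors for one query and scoring the result against its set of positive indices gives the per-query AP; averaging over all queries yields the mAP reported by the benchmark.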

We offer several configs for evaluating these datasets: see eval_resnet_1gpu_roxford.yaml, eval_resnet_1gpu_rparis.yaml, and eval_resnet_1gpu_copydays.yaml.

Here is an example of the config options:

MODEL:
  FEATURE_EVAL_SETTINGS:
    EVAL_MODE_ON: True
    FREEZE_TRUNK_ONLY: True
    EXTRACT_TRUNK_FEATURES_ONLY: True
    SHOULD_FLATTEN_FEATS: False
    # Which feature layer to use to evaluate.
    LINEAR_EVAL_FEAT_POOL_OPS_MAP: [
        ["res5", ["Identity", []]],
    ]
IMG_RETRIEVAL:
  ########################## Dataset Information #############################
  TRAIN_DATASET_NAME: roxford5k
  EVAL_DATASET_NAME: rparis6k
  DATASET_PATH: <enter dataset path>
  # valid only if we are training whitening on the whitening dataset
  WHITEN_IMG_LIST: ""
  # Path to the compute_ap binary to evaluate Oxford / Paris
  EVAL_BINARY_PATH: ""
  # Sets data limits for the number of training, query, and database samples.
  DEBUG_MODE: False
  # Number of training samples to use. -1 uses all the samples in the dataset.
  NUM_TRAINING_SAMPLES: -1
  # Number of query samples to use. -1 uses all the samples in the dataset.
  NUM_QUERY_SAMPLES: -1
  # Number of database samples to use. -1 uses all the samples in the dataset.
  NUM_DATABASE_SAMPLES: -1
  # Whether or not to use distractor images. Distractors should be under DATASET_PATH/distractors dir.
  USE_DISTRACTORS: False
  # IMG_SCALINGS: List[float], where features are extracted at each
  # image scale and averaged. Default is [1], meaning that only the
  # full-size image is processed.
  IMG_SCALINGS: [1]
  # cosine_similarity | l2.
  SIMILARITY_MEASURE: cosine_similarity
  ######################## Features Processing Hypers #######################
  # Resize the larger side of the image to RESIZE_IMG pixels
  RESIZE_IMG: 1024
  # RMAC spatial levels. See https://arxiv.org/pdf/1511.05879.pdf.
  SPATIAL_LEVELS: 3
  # output dimension of PCA
  N_PCA: 512
  # Whether to apply PCA/whitening or not
  TRAIN_PCA_WHITENING: True
  # gem  | rmac | "" (no post-processing)
  FEATS_PROCESSING_TYPE: ""
  # valid only for GeM pooling of features. Note that GEM_POOL_POWER=1 equates to average pooling.
  GEM_POOL_POWER: 4.0
  # Whether or not to crop the query images with the given region of interests --
  # Relevant for Oxford, Paris, ROxford, and RParis datasets.
  # Our experiments with RN-50/rmac show that ROI cropping degrades performance.
  CROP_QUERY_ROI: False
  # Whether or not to apply L2 norm after the features have been post-processed.
  # Normalization is strongly recommended based on our experiments.
  NORMALIZE_FEATURES: True
  ######################## Misc #######################
  # Whether or not to save the retrieval ranking scores (metrics, rankings, similarity scores)
  SAVE_RETRIEVAL_RANKINGS_SCORES: True
  # Whether or not to save the features that were extracted
  SAVE_FEATURES: False
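One practical note on SIMILARITY_MEASURE: once features are L2-normalized (NORMALIZE_FEATURES: True), cosine similarity and l2 distance produce identical rankings, since ||a - b||^2 = 2 - 2 a·b for unit vectors. A quick self-contained check (random unit vectors, not VISSL code):

```python
import numpy as np

rng = np.random.default_rng(0)
db = rng.normal(size=(10, 4))
db /= np.linalg.norm(db, axis=1, keepdims=True)       # NORMALIZE_FEATURES: True
q = rng.normal(size=4)
q /= np.linalg.norm(q)

cos_rank = np.argsort(-(db @ q))                      # SIMILARITY_MEASURE: cosine_similarity
l2_rank = np.argsort(np.linalg.norm(db - q, axis=1))  # SIMILARITY_MEASURE: l2
assert (cos_rank == l2_rank).all()                    # identical rankings
```

So the choice of measure only matters when features are left unnormalized.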

Evaluating ROxford

python tools/run_distributed_engines.py config=benchmark/instance_retrieval/eval_resnet_1gpu_roxford.yaml

Evaluating RParis

python tools/run_distributed_engines.py config=benchmark/instance_retrieval/eval_resnet_1gpu_rparis.yaml

Evaluating CopyDays

python tools/run_distributed_engines.py config=benchmark/instance_retrieval/eval_resnet_1gpu_copydays.yaml