Rotated bounding box detection using Detectron2

This tutorial will show you:

  1. How to produce rotated bounding box labels using labelme (https://github.com/wkentaro/labelme) for custom datasets

  2. How to configure and set up training for rotated bounding box detection using Detectron2

  3. How to visualize predictions and labels

Labelme polygon annotations

We are using labelme (https://github.com/wkentaro/labelme) polygon annotation to label the container ships in the images.

Data labeling

Install labelme

Labelme offers various options for installing the labeling GUI, please refer to the instructions here: https://github.com/wkentaro/labelme

Creating polygons

Start labelme with labelme --autosave --nodata --keep-prev. The GUI allows you to select the images to label one-by-one or based on a directory. It is highly recommended to place the images to label in a single directory, since a json file with the labels will be produced in the same location as the file.

The --autosave flag enables auto saving when moving from image to image The --nodata flag skips saving the actual image data in the json file that is produced for every image. Using --keep-prev can be considered optional, but it is very useful if the images are for example consecutive frames from a video since the option copies the labels from the previously labeled image to the current image.

You can create the polygon annotations with the “Create polygons” option in the GUI. The polygons will be used to determine the minimum-area rotated bounding box in the next step as a part of the model training. The logic to do this is provided out-of-the-box in the stray package that is used in the example scripts.

This is an example of a json file that is produced for each image (in this case the file is called 1.json):

{
  "version": "5.0.1",
  "flags": {},
  "shapes": [
    {
      "label": "ship",
      "points": [
        [
          239.0990099009901,
          420.2970297029703
        ],
        [
          423.25742574257424,
          338.1188118811881
        ],
        [
          444.54455445544556,
          345.049504950495
        ],
        [
          434.64356435643566,
          365.84158415841586
        ],
        [
          253.45544554455444,
          446.53465346534654
        ]
      ],
      "group_id": null,
      "shape_type": "polygon",
      "flags": {}
    }
  ],
  "imagePath": "1.png",
  "imageData": null,
  "imageHeight": 480,
  "imageWidth": 640
}

Installing Detectron2

Before we can visualize the annotations using visualization tools provided by Detectron2 and training the model, we need to install the package. Warning: this step may cause headaches.

Install PyTorch, OpenCV, and Detectron2

Before installing Detectron2, we need to have PyTorch installed. This means that we can’t provide a clean requirements.txt file with all the dependencies as there is no way to tell pip in which order to install the listed packages. Detectron also does not include the dependency in their install requirements for compatibility reasons.

Depending on whether you want to use a CPU or a GPU (if available) with Detectron2, install the proper version from https://pytorch.org/. The Detectron2 installation documentation also offers some background and debug steps if there are issues.

After installing torch, install Detectron2 using the instructions in the Detectron2 installation documentation.

Install rest of the dependencies with pip install -r requirements.txt.

Test the installation and visualize the dataset

Detectron2 rotated bounding boxes

The polygons are used to determine the rotated bounding boxes.

To see if everything works properly, you can run the visualization script (from stray/examples/detectron2) with python visualize_dataset.py <path-to-dataset> to visualize the annotations. As you can see the polygons are turned into rotated bounding boxes in the data loading step.

Training the model

To run the training, run python train_rotated_bbox.py <path-to-dataset> --num-gpus <gpus>. The script and the rotated_bbox_config.yaml file contain various ways to configure the training, see the files for details. By default, the final and intermediate weights of the model are saved in the current working directory (model_*.pth).

Predictions

Detectron2 predictions

Predictions from the trained model.

To visualize predictions from the trained model, run python visualize_predictions <path-to-dataset> --weights <path-to-pth-model>.