Visualize Bounding Boxes for Yolov5

2021-12-13

I previously showed how to visualize bounding boxes from YOLO-format input: https://czarrar.github.io/visualize-boxes/. But what if I want to do something similar while making greater use of the code inside Yolov5? Here is how.

Code

Packages

Let’s load the relevant packages. Note that Yolov5 needs to be on your Python path. I got some help from this GitHub issue: https://github.com/ultralytics/yolov5/issues/5033.

# Make sure yolov5 is in your path
import sys
sys.path.append("yolov5")

from utils.plots import Annotator
from utils.general import xywhn2xyxy  # signature: xywhn2xyxy(x, w=640, h=640, padw=0, padh=0)

from PIL import Image
import pandas as pd
import numpy as np

Inputs

We can first load in our sample image and box coordinates in Yolo format. Click on the following links for the input image and for the input coordinates.

# My inputs
im1 = Image.open("data/sample2/z_generated_00001.jpg")
df1 = pd.read_csv("data/sample2/z_generated_00001.txt", sep=" ", header=None, names=['class','x','y','w','h'])

The coordinates in df1 should look something like the table below. This follows the Yolo format described here: https://roboflow.com/formats/yolov5-pytorch-txt.

ind  class  x         y         w         h
0    0      0.701563  0.626389  0.596875  0.747222
1    3      0.882812  0.327778  0.234375  0.311111

Plotting

Prepare Coordinates: We drop the class column and convert the remaining columns to a numpy array. Then we convert the format. The Yolo format is center x, center y, width, and height, with each value normalized by the image size (x and width by the image width, y and height by the image height). In this case, the image is 640x360 pixels. We convert to xyxy, that is, top-left x, y and bottom-right x, y in un-normalized (raw pixel) values.
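To make the conversion concrete, here is a minimal numpy-only sketch of the arithmetic. The function name xywhn_to_xyxy is hypothetical and is not part of Yolov5; it only mirrors the conversion described above for an image without padding, using the two sample rows from df1.

```python
import numpy as np

def xywhn_to_xyxy(xywhn, w, h):
    """Convert normalized (center x, center y, width, height) rows
    to pixel (x1, y1, x2, y2) rows for an image of size w x h."""
    xywhn = np.asarray(xywhn, dtype=float)
    xyxy = np.empty_like(xywhn)
    xyxy[:, 0] = w * (xywhn[:, 0] - xywhn[:, 2] / 2)  # top-left x
    xyxy[:, 1] = h * (xywhn[:, 1] - xywhn[:, 3] / 2)  # top-left y
    xyxy[:, 2] = w * (xywhn[:, 0] + xywhn[:, 2] / 2)  # bottom-right x
    xyxy[:, 3] = h * (xywhn[:, 1] + xywhn[:, 3] / 2)  # bottom-right y
    return xyxy

# The two sample rows from df1, without the class column
sample = np.array([
    [0.701563, 0.626389, 0.596875, 0.747222],
    [0.882812, 0.327778, 0.234375, 0.311111],
])
pixel_boxes = xywhn_to_xyxy(sample, w=640, h=360)
print(np.round(pixel_boxes))
```

For the 640x360 sample image, the two boxes round to (258, 91, 640, 360) and (490, 62, 640, 174).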

Plot: We use the Annotator class provided by Yolov5. We pass it the input image as a contiguous numpy array and set pil=True so that the drawing is done with PIL. We then plot our two boxes (you could instead loop over the boxes and look up each label in a list of class names).

Using Annotator:

xywhn = df1.drop(['class'], axis=1).to_numpy()
box = xywhn2xyxy(xywhn, w=640, h=360) # default width and height are 640

ann = Annotator(np.ascontiguousarray(im1), pil=True)
ann.box_label(box[0,:], label='Slide', color=(255, 0, 0)) # red
ann.box_label(box[1,:], label='Camera', color=(0, 0, 255)) # blue
ann.im.show()
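The loop alternative could be sketched roughly as follows. The helper label_boxes and the CLASS_NAMES and CLASS_COLORS dictionaries are hypothetical names introduced for illustration; only class ids 0 (Slide) and 3 (Camera) come from the example, and the helper assumes the object passed in exposes the same box_label(box, label=..., color=...) method used above.

```python
# Hypothetical lookup tables covering only the class ids that appear in df1
CLASS_NAMES = {0: "Slide", 3: "Camera"}
CLASS_COLORS = {0: (255, 0, 0), 3: (0, 0, 255)}  # red, blue

def label_boxes(annotator, boxes, class_ids, names=CLASS_NAMES, colors=CLASS_COLORS):
    """Draw one labelled box per row by calling annotator.box_label in a loop."""
    for box, class_id in zip(boxes, class_ids):
        class_id = int(class_id)  # pandas/numpy integers -> plain int dict key
        annotator.box_label(box, label=names[class_id], color=colors[class_id])
    return annotator
```

With the variables above, this would be called as label_boxes(ann, box, df1['class']) in place of the two explicit box_label calls, and it should draw the same two labelled boxes.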

The result should look like: