Object detection task#

Object detection#

An object detection task consists of drawing bounding boxes around one or more objects in an image. A bounding box provides the coordinates that localize an object of interest in an image or a video frame. Each bounding box encloses one object and has a label attached to it.
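As a minimal sketch of what a labeled bounding box carries, the snippet below stores corner coordinates together with a label. The `BoundingBox` class, its field names, and the pixel-coordinate convention are assumptions made for illustration, not a format defined by this documentation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in pixel coordinates plus a label (hypothetical structure)."""

    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str

    @property
    def width(self) -> float:
        return self.x_max - self.x_min

    @property
    def height(self) -> float:
        return self.y_max - self.y_min

    @property
    def center(self) -> Tuple[float, float]:
        # Midpoint of the box, often used as the object's location.
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


# Example values are made up for illustration.
box = BoundingBox(x_min=40, y_min=60, x_max=200, y_max=180, label="car")
print(box.label, box.width, box.height, box.center)
```

The coordinates localize the object, and the label says what it is; both parts together make up one annotation.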

Object detection for aerial imagery#

Object detection for aerial images requires a different convolutional neural network architecture than a standard object detection task. From the user's perspective, the main difference is the annotation tool; you can learn more about the tool here.

With standard bounding boxes, it is difficult to estimate the orientation of objects captured from a top view and to localize them accurately. Rotated bounding boxes solve this problem. To understand the difference, look at the example below:
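The difference can also be shown numerically. The sketch below assumes a rotated box is described by a center, a width, a height, and an angle in degrees; this parameterization and the function names are illustrative assumptions, not the format used by the annotation tool. It computes the four corners of the rotated box and then the smallest upright box that still contains them.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def rotated_box_corners(cx: float, cy: float, width: float, height: float,
                        angle_deg: float) -> List[Point]:
    """Return the four corners of a box rotated by angle_deg around its center."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = width / 2, height / 2
    corners = []
    for dx, dy in [(-half_w, -half_h), (half_w, -half_h),
                   (half_w, half_h), (-half_w, half_h)]:
        # Standard 2D rotation of the corner offset, then shift to the center.
        x = cx + dx * cos_t - dy * sin_t
        y = cy + dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners


def enclosing_upright_box(corners: List[Point]) -> Tuple[float, float, float, float]:
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) around the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


# A long, narrow object (for example a ship seen from above) rotated by 45 degrees.
corners = rotated_box_corners(cx=200, cy=150, width=100, height=20, angle_deg=45)
x_min, y_min, x_max, y_max = enclosing_upright_box(corners)
rotated_area = 100 * 20
upright_area = (x_max - x_min) * (y_max - y_min)
print(f"rotated box area: {rotated_area}, upright box area: {upright_area:.0f}")
```

For this made-up object, the upright box covers several times the area of the rotated one, so most of it is background, and it says nothing about the object's orientation.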

[Figure: Bounding Box vs. Rotated Bounding Box]

Object detection vs classification#

Object detection and classification are not the same; classification is one part of object detection.

Object detection combines classification and localization to determine the position of classified objects in an image or video frame. Drawing bounding boxes is localization, whereas applying labels to those bounding boxes is classification.
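The split between the two parts can be sketched conceptually as follows. Everything here is a toy assumption: `detect`, `toy_localize`, and `toy_classify` are hypothetical stand-ins, the image is a small grid of grayscale values, and real detection networks often predict boxes and labels jointly rather than in two separate steps.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max edges exclusive
Image = Sequence[Sequence[int]]  # grayscale pixels as rows of intensities


def detect(image: Image,
           localize: Callable[[Image], List[Box]],
           classify: Callable[[Image], str]) -> List[Tuple[Box, str]]:
    """Pair every localized box with the label assigned to its contents."""
    detections = []
    for box in localize(image):
        x_min, y_min, x_max, y_max = box
        crop = [row[x_min:x_max] for row in image[y_min:y_max]]
        detections.append((box, classify(crop)))
    return detections


def toy_localize(image: Image) -> List[Box]:
    # Stand-in for localization: returns fixed boxes instead of finding objects.
    return [(0, 0, 2, 2), (2, 2, 4, 4)]


def toy_classify(crop: Image) -> str:
    # Stand-in for classification: labels a crop by its mean intensity.
    pixels = [p for row in crop for p in row]
    return "bright" if sum(pixels) / len(pixels) > 127 else "dark"


image = [
    [250, 240, 10, 20],
    [245, 255, 15, 5],
    [12, 8, 30, 25],
    [3, 9, 20, 35],
]
detections = detect(image, toy_localize, toy_classify)
print(detections)
```

Each entry in the result holds a box (the localization part) and a label (the classification part).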

Examples of leveraging object detection#

  • Security System: facial recognition

  • Industrial: a factory safety system detecting shop floor employees moving through different zones and triggering a warning when a forbidden zone is entered

  • Street Scene: an autonomous vehicle recognizing pedestrians’ presence and assessing their location, to decide if they are at a safe distance
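To make the industrial example more concrete, here is a minimal sketch of a forbidden-zone check applied to detection results. It assumes detections arrive as `(box, label)` pairs with boxes given as `(x_min, y_min, x_max, y_max)` in pixels, that the zone is an upright rectangle in the same coordinates, and that a person's position is approximated by the bottom-center of their box. The function names and all values are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Detection = Tuple[Box, str]              # (box, label)


def foot_point(box: Box) -> Tuple[float, float]:
    """Bottom-center of a box, used as a rough estimate of where a person stands."""
    x_min, _, x_max, y_max = box
    return ((x_min + x_max) / 2, y_max)


def point_in_zone(point: Tuple[float, float], zone: Box) -> bool:
    x, y = point
    zx_min, zy_min, zx_max, zy_max = zone
    return zx_min <= x <= zx_max and zy_min <= y <= zy_max


def zone_warnings(detections: List[Detection], forbidden_zone: Box) -> List[Box]:
    """Return the boxes of detected people standing inside the forbidden zone."""
    return [
        box
        for box, label in detections
        if label == "person" and point_in_zone(foot_point(box), forbidden_zone)
    ]


# Made-up detections for a single video frame.
detections = [
    ((100, 50, 140, 200), "person"),    # stands inside the zone
    ((400, 60, 440, 210), "person"),    # stands outside the zone
    ((120, 150, 180, 220), "forklift"), # inside the zone, but not a person
]
forbidden_zone = (80, 180, 300, 300)

for box in zone_warnings(detections, forbidden_zone):
    print(f"WARNING: person at {box} entered the forbidden zone")
```

The check relies on both outputs of object detection: the label selects which objects matter, and the box coordinates decide whether the rule is violated.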