Object Localization
Object Localization is the process of locating an object in an image, and creating a bounding box around the object once localized.
CNNs are commonly used for object classification. However, the same network can also perform localization at the same time.
To do this, we must add four parameters, b_x, b_y, b_h and b_w, to the network's output alongside the softmax class scores, where b_x and b_y are the coordinates of the center of the required bounding box and b_h and b_w are its height and width respectively.
Note that the training images must contain bounding boxes too (with the 4 parameters) so as to be able to learn the parameters.
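Annotation tools often store a box by its corner coordinates rather than by its center and size. The following is a minimal sketch of converting such an annotation into the four parameters described above. The function name, the corner-format input, and the normalization by image width and height are assumptions made for illustration, not something these notes prescribe.

```python
def corners_to_center_box(x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a corner-format box (in pixels) to center/size form.

    Returns (b_x, b_y, b_h, b_w), each divided by the image size so the
    values fall in [0, 1]. Normalizing is an assumed convention here.
    """
    b_x = (x_min + x_max) / 2 / img_w   # horizontal center
    b_y = (y_min + y_max) / 2 / img_h   # vertical center
    b_h = (y_max - y_min) / img_h       # box height
    b_w = (x_max - x_min) / img_w       # box width
    return b_x, b_y, b_h, b_w


# Hypothetical annotation: a 100x200 pixel box inside a 200x400 image.
box = corners_to_center_box(50, 100, 150, 300, img_w=200, img_h=400)
```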
In fact, every training image has the following vector associated with it:
[p, c, b_x, b_y, b_h, b_w]
where p=1 if there is an object in the image, c is the label of the object, and b_x, b_y, b_h, b_w are the four bounding-box parameters.
If p=0 (no object in the image), then the vector becomes [0, ?, ?, ?, ?, ?] where ?s denote "don't-care" values.
(c can be one-hot encoded).
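The label vector and the "don't-care" rule can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions: the helper names are hypothetical, c is one-hot encoded (so the vector is longer than six entries), the don't-care slots are filled with zeros as placeholders, and a simple squared error stands in for whatever loss is actually used in training.

```python
def make_target(num_classes, label=None, box=None):
    """Build [p, c_1..c_K, b_x, b_y, b_h, b_w] for one training image.

    label=None means the image contains no object (p=0). The remaining
    entries are then don't-care values; zeros are only placeholders.
    """
    if label is None:
        return [0.0] * (1 + num_classes + 4)
    one_hot = [1.0 if k == label else 0.0 for k in range(num_classes)]
    return [1.0] + one_hot + list(box)


def localization_loss(y_true, y_pred):
    """Squared-error loss that skips the don't-care entries when p=0."""
    if y_true[0] == 0:
        # No object: only the prediction of p itself is penalized.
        return (y_true[0] - y_pred[0]) ** 2
    # Object present: every component contributes to the loss.
    return sum((t - q) ** 2 for t, q in zip(y_true, y_pred))


with_object = make_target(3, label=1, box=(0.5, 0.5, 0.2, 0.3))
no_object = make_target(3)
```

When p=0, whatever the network outputs for the class and box entries has no effect on the loss, which is what treating them as "don't-care" values means in practice.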