A Scale Balanced Loss for Bounding Box Regression

Degang Sun; Yang Yang; Min Li; Jian Yang; Bo Meng; Ruwen Bai; Linghan Li; Junxing Ren

doi:10.1109/ACCESS.2020.3001234

Degang Sun, Yang Yang, Min Li^*, Jian Yang, Bo Meng, Ruwen Bai, Linghan Li, Junxing Ren

^*Corresponding author for this work

School of Optics and Photonics

Research output: Contribution to journal › Article › peer-review

13 Citations (Scopus)

Abstract

Object detectors typically use bounding box regressors to improve the accuracy of object localization. Currently, the two types of bounding box regression loss are $\ell_n$-norm-based and intersection over union (IoU)-based. However, we found that both types of loss have drawbacks. First, for $\ell_n$-norm-based loss, large-scale objects are more likely to incur a larger penalty than smaller ones when calculating localization errors, which causes regression loss imbalance. Second, $\ell_n$-norm-based loss is symmetric, so when the predicted bounding boxes are in certain symmetrical relationships (i.e., the symmetric trap), the regression loss remains unchanged. Third, for IoU-based loss, in some cases the overlap area and the union area do not change as the shape or relative position of the two bounding boxes changes (i.e., the area maze). To address these problems, we propose the scale balanced loss ($\mathcal{L}_{SB}$), which is asymmetric, position-sensitive, and scale-invariant. First, to obtain scale invariance, it is designed as a fraction that eliminates the scale information contained in the numerator and denominator. Second, by incorporating the Euclidean distance between different corner points instead of the area, $\mathcal{L}_{SB}$ is sensitive to changes in the coordinates of any corner point, which solves the area maze problem. Finally, by incorporating the diagonals of the overlap and the smallest enclosing rectangle, this fraction is no longer symmetric, which solves the symmetric trap problem. To validate the proposed algorithm, we replaced the $\ell_n$-norm-based loss of YOLOv3 and SSD with $\mathcal{L}_{GIoU}$ and $\mathcal{L}_{SB}$ and evaluated their performance on the Pascal Visual Object Classes and Microsoft Common Objects in Context benchmarks. The final results show that $\mathcal{L}_{SB}$ improves their average precision at different IoU thresholds and scales. We envision that this regression loss can also improve the performance of other visual tasks.
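
The drawbacks named in the abstract can be illustrated with a small numerical sketch. The code below is not the authors' implementation and does not define $\mathcal{L}_{SB}$ (its formula is not given in this record). It only uses a plain $\ell_1$ loss and a standard IoU on hand-picked boxes in (x1, y1, x2, y2) form. The box values, the helper names `l1_loss`, `iou`, and `scaled`, and the reading of the symmetric case as two mirror-image predictions are all assumptions made for illustration.

```python
def l1_loss(pred, target):
    """Sum of absolute coordinate differences between two boxes (x1, y1, x2, y2)."""
    return sum(abs(p - t) for p, t in zip(pred, target))


def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def scaled(box, k):
    """Multiply every coordinate of a box by k (hypothetical helper)."""
    return tuple(k * c for c in box)


gt = (0.0, 0.0, 10.0, 10.0)      # illustrative ground-truth box
pred = (1.0, 1.0, 11.0, 11.0)    # illustrative prediction, shifted by 10% of the side

# 1) Scale imbalance: the same relative error costs 10x more l1 loss
#    on a 10x larger object, while the IoU is unchanged.
l1_small = l1_loss(pred, gt)
l1_large = l1_loss(scaled(pred, 10), scaled(gt, 10))
iou_small = iou(pred, gt)
iou_large = iou(scaled(pred, 10), scaled(gt, 10))

# 2) Symmetry (assumed reading): two mirror-image predictions receive
#    exactly the same l1 loss.
left = (-1.0, 0.0, 9.0, 10.0)
right = (1.0, 0.0, 11.0, 10.0)
l1_left = l1_loss(left, gt)
l1_right = l1_loss(right, gt)

# 3) Area-only behaviour: predictions of equal area lying inside the
#    ground truth give the same IoU regardless of position or shape.
inside = [(1.0, 1.0, 5.0, 5.0), (4.0, 4.0, 8.0, 8.0), (1.0, 1.0, 3.0, 9.0)]
ious_inside = [iou(b, gt) for b in inside]

print("l1 small vs large:", l1_small, l1_large)
print("IoU small vs large:", iou_small, iou_large)
print("l1 left vs right:", l1_left, l1_right)
print("IoU for boxes inside gt:", ious_inside)
```

In this toy setup the $\ell_1$ loss grows with object size and cannot tell the two mirrored predictions apart, and the IoU cannot tell the three enclosed predictions apart. These are the kinds of cases the abstract says a scale-invariant, position-sensitive, asymmetric loss is meant to separate.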

Original language	English
Article number	9112187
Pages (from-to)	108438-108448
Number of pages	11
Journal	IEEE Access
Volume	8
DOIs	https://doi.org/10.1109/ACCESS.2020.3001234
Publication status	Published - 2020

Keywords

Object detection
bounding box
regression loss
scale imbalance

Access to Document

10.1109/ACCESS.2020.3001234

Cite this

@article{53102ec4a91c456fa8dcfce41b7f1222,

title = "A Scale Balanced Loss for Bounding Box Regression",

abstract = "Object detectors typically use bounding box regressors to improve the accuracy of object localization. Currently, the two types of bounding box regression loss are $\ell_n$-norm-based and intersection over union (IoU)-based. However, we found that both types of loss have drawbacks. First, for $\ell_n$-norm-based loss, large-scale objects are more likely to incur a larger penalty than smaller ones when calculating localization errors, which causes regression loss imbalance. Second, $\ell_n$-norm-based loss is symmetric, so when the predicted bounding boxes are in certain symmetrical relationships (i.e., the symmetric trap), the regression loss remains unchanged. Third, for IoU-based loss, in some cases the overlap area and the union area do not change as the shape or relative position of the two bounding boxes changes (i.e., the area maze). To address these problems, we propose the scale balanced loss ($\mathcal{L}_{SB}$), which is asymmetric, position-sensitive, and scale-invariant. First, to obtain scale invariance, it is designed as a fraction that eliminates the scale information contained in the numerator and denominator. Second, by incorporating the Euclidean distance between different corner points instead of the area, $\mathcal{L}_{SB}$ is sensitive to changes in the coordinates of any corner point, which solves the area maze problem. Finally, by incorporating the diagonals of the overlap and the smallest enclosing rectangle, this fraction is no longer symmetric, which solves the symmetric trap problem. To validate the proposed algorithm, we replaced the $\ell_n$-norm-based loss of YOLOv3 and SSD with $\mathcal{L}_{GIoU}$ and $\mathcal{L}_{SB}$ and evaluated their performance on the Pascal Visual Object Classes and Microsoft Common Objects in Context benchmarks. The final results show that $\mathcal{L}_{SB}$ improves their average precision at different IoU thresholds and scales. We envision that this regression loss can also improve the performance of other visual tasks.",

keywords = "Object detection, bounding box, regression loss, scale imbalance",

author = "Degang Sun and Yang Yang and Min Li and Jian Yang and Bo Meng and Ruwen Bai and Linghan Li and Junxing Ren",

note = "Publisher Copyright: {\textcopyright} 2013 IEEE.",

year = "2020",

doi = "10.1109/ACCESS.2020.3001234",

language = "English",

volume = "8",

pages = "108438--108448",

journal = "IEEE Access",

issn = "2169-3536",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - A Scale Balanced Loss for Bounding Box Regression

AU - Sun, Degang

AU - Yang, Yang

AU - Li, Min

AU - Yang, Jian

AU - Meng, Bo

AU - Bai, Ruwen

AU - Li, Linghan

AU - Ren, Junxing

PY - 2020

Y1 - 2020

N2 - Object detectors typically use bounding box regressors to improve the accuracy of object localization. Currently, the two types of bounding box regression loss are $\ell_n$-norm-based and intersection over union (IoU)-based. However, we found that both types of loss have drawbacks. First, for $\ell_n$-norm-based loss, large-scale objects are more likely to incur a larger penalty than smaller ones when calculating localization errors, which causes regression loss imbalance. Second, $\ell_n$-norm-based loss is symmetric, so when the predicted bounding boxes are in certain symmetrical relationships (i.e., the symmetric trap), the regression loss remains unchanged. Third, for IoU-based loss, in some cases the overlap area and the union area do not change as the shape or relative position of the two bounding boxes changes (i.e., the area maze). To address these problems, we propose the scale balanced loss ($\mathcal{L}_{SB}$), which is asymmetric, position-sensitive, and scale-invariant. First, to obtain scale invariance, it is designed as a fraction that eliminates the scale information contained in the numerator and denominator. Second, by incorporating the Euclidean distance between different corner points instead of the area, $\mathcal{L}_{SB}$ is sensitive to changes in the coordinates of any corner point, which solves the area maze problem. Finally, by incorporating the diagonals of the overlap and the smallest enclosing rectangle, this fraction is no longer symmetric, which solves the symmetric trap problem. To validate the proposed algorithm, we replaced the $\ell_n$-norm-based loss of YOLOv3 and SSD with $\mathcal{L}_{GIoU}$ and $\mathcal{L}_{SB}$ and evaluated their performance on the Pascal Visual Object Classes and Microsoft Common Objects in Context benchmarks. The final results show that $\mathcal{L}_{SB}$ improves their average precision at different IoU thresholds and scales. We envision that this regression loss can also improve the performance of other visual tasks.

KW - Object detection

KW - bounding box

KW - regression loss

KW - scale imbalance

UR - http://www.scopus.com/inward/record.url?scp=85086991556&partnerID=8YFLogxK

U2 - 10.1109/ACCESS.2020.3001234

DO - 10.1109/ACCESS.2020.3001234

M3 - Article

AN - SCOPUS:85086991556

SN - 2169-3536

VL - 8

SP - 108438

EP - 108448

JO - IEEE Access

JF - IEEE Access

M1 - 9112187

ER -
