TY - GEN
T1 - UnitBox
T2 - 24th ACM Multimedia Conference, MM 2016
AU - Yu, Jiahui
AU - Jiang, Yuning
AU - Wang, Zhangyang
AU - Cao, Zhimin
AU - Huang, Thomas
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/10/1
Y1 - 2016/10/1
AB - In current object detection systems, deep convolutional neural networks (CNNs) are used to predict the bounding boxes of object candidates and have gained performance advantages over traditional region proposal methods. However, existing deep CNN methods treat the object bounds as four independent variables, each regressed separately with the ℓ2 loss. This oversimplified assumption contradicts the well-established observation that those variables are correlated, and it results in less accurate localization. To address this issue, we first introduce a novel Intersection over Union (IoU) loss function for bounding box prediction, which regresses the four bounds of a predicted box as a whole unit. Combining the IoU loss with deep fully convolutional networks, we introduce UnitBox, which performs accurate and efficient localization, is robust to objects of varied shapes and scales, and converges quickly. We apply UnitBox to the face detection task and achieve the best performance among all published methods on the FDDB benchmark.
KW - Bounding Box Prediction
KW - IoU Loss
KW - Object Detection
UR - http://www.scopus.com/inward/record.url?scp=84994560549&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84994560549&partnerID=8YFLogxK
U2 - 10.1145/2964284.2967274
DO - 10.1145/2964284.2967274
M3 - Conference contribution
AN - SCOPUS:84994560549
T3 - MM 2016 - Proceedings of the 2016 ACM Multimedia Conference
SP - 516
EP - 520
BT - MM 2016 - Proceedings of the 2016 ACM Multimedia Conference
PB - Association for Computing Machinery, Inc
Y2 - 15 October 2016 through 19 October 2016
ER -