UnitBox: An advanced object detection network

Jiahui Yu, Yuning Jiang, Zhangyang Wang, Zhimin Cao, Thomas Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In present object detection systems, deep convolutional neural networks (CNNs) are utilized to predict bounding boxes of object candidates, and have gained performance advantages over traditional region proposal methods. However, existing deep CNN methods assume the object bounds to be four independent variables, which can be regressed separately by the ℓ2 loss. Such an oversimplified assumption is contrary to the well-received observation that those variables are correlated, resulting in less accurate localization. To address this issue, we first introduce a novel Intersection over Union (IoU) loss function for bounding box prediction, which regresses the four bounds of a predicted box as a whole unit. By taking advantage of the IoU loss and deep fully convolutional networks, UnitBox is introduced, which performs accurate and efficient localization, is robust to objects of varied shapes and scales, and converges fast. We apply UnitBox to the face detection task and achieve the best performance among all published methods on the FDDB benchmark.
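
The contrast the abstract draws between a per-coordinate ℓ2 loss and a loss on the box as a whole can be illustrated with a small sketch. It is not the authors' implementation: the `-ln(IoU)` form, the `(top, bottom, left, right)` distance parameterization measured from a pixel inside both boxes, and the names `iou_loss` and `l2_loss` are assumptions made here for illustration and are not stated in the abstract.

```python
import math


def iou_loss(pred, target, eps=1e-12):
    """Assumed form: -ln(IoU) of two boxes.

    Each box is (top, bottom, left, right): non-negative distances from one
    shared pixel to the four sides, so both boxes contain that pixel and
    both are assumed to have positive area.
    """
    pt, pb, pl, pr = pred
    tt, tb, tl, tr = target
    pred_area = (pt + pb) * (pl + pr)
    target_area = (tt + tb) * (tl + tr)
    # Both boxes contain the shared pixel, so the overlap extends
    # min(pred, target) in each of the four directions.
    inter_h = min(pt, tt) + min(pb, tb)
    inter_w = min(pl, tl) + min(pr, tr)
    inter = inter_h * inter_w
    union = pred_area + target_area - inter
    iou = inter / union
    # Clamp to avoid log(0) when the boxes do not overlap at all.
    return -math.log(max(iou, eps))


def l2_loss(pred, target):
    """Sum of squared errors, treating the four bounds independently."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


# The same 20% relative error at two object scales.
small_target, small_pred = (10, 10, 10, 10), (8, 8, 8, 8)
big_target, big_pred = (100, 100, 100, 100), (80, 80, 80, 80)

for name, pred, target in [("small", small_pred, small_target),
                           ("big", big_pred, big_target)]:
    print(name, "IoU loss:", round(iou_loss(pred, target), 4),
          "l2 loss:", l2_loss(pred, target))
```

In this toy case both predictions have an IoU of 0.64 with their targets, so the IoU-based loss is identical at the two scales, while the ℓ2 loss is 100 times larger for the bigger object. This is one plausible reading of the abstract's robustness claim, not a reproduction of the paper's experiments.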

Original language: English (US)
Title of host publication: MM 2016 - Proceedings of the 2016 ACM Multimedia Conference
Publisher: Association for Computing Machinery, Inc
Pages: 516-520
Number of pages: 5
ISBN (Electronic): 9781450336031
DOIs: https://doi.org/10.1145/2964284.2967274
State: Published - Oct 1 2016
Event: 24th ACM Multimedia Conference, MM 2016 - Amsterdam, Netherlands
Duration: Oct 15 2016 → Oct 19 2016

Publication series

Name: MM 2016 - Proceedings of the 2016 ACM Multimedia Conference

Other

Other: 24th ACM Multimedia Conference, MM 2016
Country: Netherlands
City: Amsterdam
Period: 10/15/16 → 10/19/16

Fingerprint

Neural networks
Face recognition
Object detection

Keywords

  • Bounding Box Prediction
  • IoU Loss
  • Object Detection

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Yu, J., Jiang, Y., Wang, Z., Cao, Z., & Huang, T. (2016). UnitBox: An advanced object detection network. In MM 2016 - Proceedings of the 2016 ACM Multimedia Conference (pp. 516-520). (MM 2016 - Proceedings of the 2016 ACM Multimedia Conference). Association for Computing Machinery, Inc. https://doi.org/10.1145/2964284.2967274

UnitBox: An advanced object detection network. / Yu, Jiahui; Jiang, Yuning; Wang, Zhangyang; Cao, Zhimin; Huang, Thomas.

MM 2016 - Proceedings of the 2016 ACM Multimedia Conference. Association for Computing Machinery, Inc, 2016. p. 516-520 (MM 2016 - Proceedings of the 2016 ACM Multimedia Conference).

Yu, J, Jiang, Y, Wang, Z, Cao, Z & Huang, T 2016, UnitBox: An advanced object detection network. in MM 2016 - Proceedings of the 2016 ACM Multimedia Conference. MM 2016 - Proceedings of the 2016 ACM Multimedia Conference, Association for Computing Machinery, Inc, pp. 516-520, 24th ACM Multimedia Conference, MM 2016, Amsterdam, Netherlands, 10/15/16. https://doi.org/10.1145/2964284.2967274
Yu J, Jiang Y, Wang Z, Cao Z, Huang T. UnitBox: An advanced object detection network. In MM 2016 - Proceedings of the 2016 ACM Multimedia Conference. Association for Computing Machinery, Inc. 2016. p. 516-520. (MM 2016 - Proceedings of the 2016 ACM Multimedia Conference). https://doi.org/10.1145/2964284.2967274
Yu, Jiahui; Jiang, Yuning; Wang, Zhangyang; Cao, Zhimin; Huang, Thomas. / UnitBox: An advanced object detection network. MM 2016 - Proceedings of the 2016 ACM Multimedia Conference. Association for Computing Machinery, Inc, 2016. pp. 516-520 (MM 2016 - Proceedings of the 2016 ACM Multimedia Conference).
@inproceedings{268e88c096014d088550b3f9b077dc48,
title = "{UnitBox}: An advanced object detection network",
abstract = "In present object detection systems, deep convolutional neural networks (CNNs) are utilized to predict bounding boxes of object candidates, and have gained performance advantages over traditional region proposal methods. However, existing deep CNN methods assume the object bounds to be four independent variables, which can be regressed separately by the ℓ2 loss. Such an oversimplified assumption is contrary to the well-received observation that those variables are correlated, resulting in less accurate localization. To address this issue, we first introduce a novel Intersection over Union (IoU) loss function for bounding box prediction, which regresses the four bounds of a predicted box as a whole unit. By taking advantage of the IoU loss and deep fully convolutional networks, UnitBox is introduced, which performs accurate and efficient localization, is robust to objects of varied shapes and scales, and converges fast. We apply UnitBox to the face detection task and achieve the best performance among all published methods on the FDDB benchmark.",
keywords = "Bounding Box Prediction, IoU Loss, Object Detection",
author = "Jiahui Yu and Yuning Jiang and Zhangyang Wang and Zhimin Cao and Thomas Huang",
year = "2016",
month = "10",
day = "1",
doi = "10.1145/2964284.2967274",
language = "English (US)",
series = "MM 2016 - Proceedings of the 2016 ACM Multimedia Conference",
publisher = "Association for Computing Machinery, Inc",
pages = "516--520",
booktitle = "MM 2016 - Proceedings of the 2016 ACM Multimedia Conference",

}

TY - GEN

T1 - UnitBox

T2 - An advanced object detection network

AU - Yu, Jiahui

AU - Jiang, Yuning

AU - Wang, Zhangyang

AU - Cao, Zhimin

AU - Huang, Thomas

PY - 2016/10/1

Y1 - 2016/10/1

AB - In present object detection systems, deep convolutional neural networks (CNNs) are utilized to predict bounding boxes of object candidates, and have gained performance advantages over traditional region proposal methods. However, existing deep CNN methods assume the object bounds to be four independent variables, which can be regressed separately by the ℓ2 loss. Such an oversimplified assumption is contrary to the well-received observation that those variables are correlated, resulting in less accurate localization. To address this issue, we first introduce a novel Intersection over Union (IoU) loss function for bounding box prediction, which regresses the four bounds of a predicted box as a whole unit. By taking advantage of the IoU loss and deep fully convolutional networks, UnitBox is introduced, which performs accurate and efficient localization, is robust to objects of varied shapes and scales, and converges fast. We apply UnitBox to the face detection task and achieve the best performance among all published methods on the FDDB benchmark.

KW - Bounding Box Prediction

KW - IoU Loss

KW - Object Detection

UR - http://www.scopus.com/inward/record.url?scp=84994560549&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84994560549&partnerID=8YFLogxK

U2 - 10.1145/2964284.2967274

DO - 10.1145/2964284.2967274

M3 - Conference contribution

AN - SCOPUS:84994560549

T3 - MM 2016 - Proceedings of the 2016 ACM Multimedia Conference

SP - 516

EP - 520

BT - MM 2016 - Proceedings of the 2016 ACM Multimedia Conference

PB - Association for Computing Machinery, Inc

ER -