TY - GEN
T1 - Improving Real-Time Construction Equipment Detection by Learning to Correct False Positives
AU - Tang, Shuai
AU - Chen, Peng
AU - Yu, Liang
AU - Golparvar-Fard, Mani
N1 - Publisher Copyright: © 2020 American Society of Civil Engineers.
PY - 2020
Y1 - 2020
AB - Detecting construction equipment such as trucks, excavators, and mobile cranes from surveillance cameras is a vital part of robust and stable construction safety monitoring, and a robust, real-time detection system is key to its success. In this paper, we investigate YOLOv3, a real-time object detector, for detecting common vehicles, i.e., cars, buses, and trucks, as a preliminary study to gain insight into improving construction equipment detection. We choose YOLOv3 for its fast inference speed, moderate model size, and efficient performance. Our major findings are: (1) a pretrained general-purpose YOLOv3 equipment detection model is adequate but often misses equipment at the far end of the camera view; (2) on traffic scenes chosen to resemble construction sites, our best YOLOv3 model reaches 65.8% mean average precision (mAP) on the test set; (3) the YOLOv3-based equipment detection model has difficulty transferring to novel scenes: when trained on 4 scenes and tested on 3 other scenes, test mAP drops to 31%; (4) systematic confusion with the background results in many false-positive vehicle detections. We improve overall detection mAP by training a graph convolutional network (GCN) model to predict whether detections are false positives. This GCN model improves equipment detection mAP from 65.8% to 69.3%.
UR - http://www.scopus.com/inward/record.url?scp=85096800799&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096800799&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85096800799
T3 - Construction Research Congress 2020: Computer Applications - Selected Papers from the Construction Research Congress 2020
SP - 1300
EP - 1309
BT - Construction Research Congress 2020
A2 - Tang, Pingbo
A2 - Grau, David
A2 - El Asmar, Mounir
PB - American Society of Civil Engineers
T2 - Construction Research Congress 2020: Computer Applications
Y2 - 8 March 2020 through 10 March 2020
ER -