TY - JOUR
T1 - End-to-end vision-based detection, tracking and activity analysis of earthmoving equipment filmed at ground level
AU - Roberts, Dominic
AU - Golparvar-Fard, Mani
N1 - Funding Information:
The authors would like to acknowledge the financial support of National Science Foundation (NSF) Grants 1544999 and 1446765. The support of Juan Carlos Niebles in elaborating the first version of our activity analysis method is very much appreciated, as is the support of many members of the Real-time and Automated Monitoring and Control (RAAMAC) lab, including former graduate students Arsalan Heydarian, Milad Memarzadeh, and Ruxiao Bao in collecting, annotating, and labeling videos, and Jun Young Gwak in helping refine the activity analysis module. The support of our industry partners in consenting to our requests to access and record their jobsites is also very much appreciated. Finally, the authors would like to thank Hongjo Kim from Yonsei University for providing access to the AIM dataset. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF, industry partners, or professionals mentioned above.
Publisher Copyright:
© 2019
PY - 2019/9
Y1 - 2019/9
N2 - This paper presents a new benchmark dataset for validating vision-based methods that automatically identify visually distinctive working activities of excavators and dump trucks from individual frames of a video sequence. Our dataset consists of 10 videos of interacting pairs of construction equipment filmed at ground level, with accompanying ground truth annotations. These annotations consist of per-equipment and per-frame bounding boxes with associated identities and activity labels. Our videos depict an excavator interacting with one or more dump trucks. We also propose a deep learning method based on Convolutional Neural Networks (CNNs) for detecting and tracking objects. The tracking trajectories are fed into a Hidden Markov Model (HMM) that automatically discovers and assigns activity labels for any observed object. Our HMM method leverages trajectories to train a Gaussian Mixture Model (GMM), with which we estimate the probability density function of each activity using Support Vector Machine (SVM) classifiers. The proposed HMM also models activity duration and the transitions between activities. We show that our method can accurately distinguish between individual equipment working activities. Results show 97.43% detection Average Precision (AP) for excavators and 75.29% AP for dump trucks, as well as cross-category tracking accuracy of 81.94% and tracking precision of 87.45%. Separate experiments show activity analysis accuracy of 86.8% for excavators and 88.5% for dump trucks. These results show that our method can accurately conduct activity analysis and can be fused with methods that detect motion trajectories to scale to the needs of practical applications.
AB - This paper presents a new benchmark dataset for validating vision-based methods that automatically identify visually distinctive working activities of excavators and dump trucks from individual frames of a video sequence. Our dataset consists of 10 videos of interacting pairs of construction equipment filmed at ground level, with accompanying ground truth annotations. These annotations consist of per-equipment and per-frame bounding boxes with associated identities and activity labels. Our videos depict an excavator interacting with one or more dump trucks. We also propose a deep learning method based on Convolutional Neural Networks (CNNs) for detecting and tracking objects. The tracking trajectories are fed into a Hidden Markov Model (HMM) that automatically discovers and assigns activity labels for any observed object. Our HMM method leverages trajectories to train a Gaussian Mixture Model (GMM), with which we estimate the probability density function of each activity using Support Vector Machine (SVM) classifiers. The proposed HMM also models activity duration and the transitions between activities. We show that our method can accurately distinguish between individual equipment working activities. Results show 97.43% detection Average Precision (AP) for excavators and 75.29% AP for dump trucks, as well as cross-category tracking accuracy of 81.94% and tracking precision of 87.45%. Separate experiments show activity analysis accuracy of 86.8% for excavators and 88.5% for dump trucks. These results show that our method can accurately conduct activity analysis and can be fused with methods that detect motion trajectories to scale to the needs of practical applications.
KW - Activity analysis
KW - Convolutional Neural Networks
KW - Deep learning
KW - Earthmoving operations
KW - Hidden Markov Models
UR - http://www.scopus.com/inward/record.url?scp=85065825942&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065825942&partnerID=8YFLogxK
U2 - 10.1016/j.autcon.2019.04.006
DO - 10.1016/j.autcon.2019.04.006
M3 - Article
AN - SCOPUS:85065825942
SN - 0926-5805
VL - 105
JO - Automation in Construction
JF - Automation in Construction
M1 - 102811
ER -