TY - JOUR
T1 - Synthesizing Pose Sequences from 3D Assets for Vision-Based Activity Analysis
AU - Torres Calderon, Wilfredo
AU - Roberts, Dominic
AU - Golparvar-Fard, Mani
N1 - Publisher Copyright:
© 2020 American Society of Civil Engineers.
PY - 2021/1/1
Y1 - 2021/1/1
N2 - In recent years, computer vision algorithms have been shown to effectively leverage visual data from jobsites for video-based activity analysis of construction equipment. However, earthmoving operations are restricted to site work and surrounding terrain, and the presence of other structures, particularly in urban areas, limits the number of viewpoints from which operations can be recorded. These considerations lower the degree of intra-activity and inter-activity category variability to which such algorithms are exposed, hindering their potential for generalizing effectively to new jobsites. In addition, training computer vision algorithms typically relies on large quantities of hand-annotated ground truth. These annotations are burdensome to obtain and can offset the cost-effectiveness gained from automating activity analysis. The main contribution of this paper is a means of inexpensively generating synthetic data, based on virtual, kinematically articulated three-dimensional (3D) models of construction equipment, to improve the capabilities of vision-based activity analysis methods. The authors introduce an automated synthetic data generation method that outputs two-dimensional (2D) pose sequences corresponding to simulated excavator operations that vary according to camera position relative to the excavator and to activity length and behavior. The presented method is validated by training a deep learning-based method on the synthesized 2D pose sequences and testing it on pose sequences corresponding to real-world excavator operations, achieving 75% precision and 71% recall. This exceeds the 66% precision and 65% recall obtained when training and testing the deep learning-based method on the real-world data via cross-validation. Limited access to reliable amounts of real-world data incentivizes using synthetically generated data for training vision-based activity analysis algorithms.
UR - http://www.scopus.com/inward/record.url?scp=85095970460&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85095970460&partnerID=8YFLogxK
U2 - 10.1061/(ASCE)CP.1943-5487.0000937
DO - 10.1061/(ASCE)CP.1943-5487.0000937
M3 - Article
AN - SCOPUS:85095970460
SN - 0887-3801
VL - 35
JO - Journal of Computing in Civil Engineering
JF - Journal of Computing in Civil Engineering
IS - 1
M1 - 937
ER -