TY - JOUR
T1 - Machine Learning-Based Risk Analysis for Construction Worker Safety from Ubiquitous Site Photos and Videos
AU - Tang, Shuai
AU - Golparvar-Fard, Mani
N1 - Funding Information:
We would like to thank RAAMAC lab students for their suggestions and support of the present study. We thank Professor Jun Yang for granting access to her construction worker activity video dataset. This material is based in part upon work supported by the National Science Foundation (NSF) under Grants CMMI 1446765 and CMMI 1544999. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.
Publisher Copyright:
© 2021 American Society of Civil Engineers.
PY - 2021/11/1
Y1 - 2021/11/1
N2 - This paper proposes a new method for single-worker severity level prediction from already collected site images and video clips. Onsite safety observers often assess workers' severity levels during construction activities. While risk analysis is key to improving long-term construction site safety, omnipresent monitoring remains time-consuming and costly to implement. The recent growth of visual data actively captured on construction sites has opened a new opportunity to increase the frequency of worker safety monitoring. This paper shows that a comprehensive vision-based assessment is the most informative way to automatically infer worker severity level from images, and presents efficient computer vision models to conduct this risk analysis. The method is validated on a challenging image dataset, the first of its kind. Specifically, the proposed method detects and evaluates the worker state from visual data, defined by (1) worker body posture, (2) the usage of personal protective equipment, (3) worker interactions with tools and materials, (4) the construction activity being performed, and (5) the presence of surrounding workplace hazards. To estimate the worker state, a multitask recognition model is introduced that recognizes objects, activity, and keypoints from visual data simultaneously, taking 36.6% less time and 40.1% less memory while achieving performance comparable to a system running individual models for each subtask. Worker activity recognition is further improved with a spatio-temporal graph neural network model that uses recognized per-frame worker activity, detected bounding boxes of tools and materials, and estimated worker poses. Finally, severity levels are predicted by a classifier trained on a dataset of construction worker images accompanied by ground-truth severity level annotations. On a test dataset assembled from real-world projects, the severity level prediction model achieves 85.7% cross-validation accuracy for a bricklaying task and 86.6% for a plastering task, demonstrating the potential for near-real-time worker safety detection and severity assessment.
AB - This paper proposes a new method for single-worker severity level prediction from already collected site images and video clips. Onsite safety observers often assess workers' severity levels during construction activities. While risk analysis is key to improving long-term construction site safety, omnipresent monitoring remains time-consuming and costly to implement. The recent growth of visual data actively captured on construction sites has opened a new opportunity to increase the frequency of worker safety monitoring. This paper shows that a comprehensive vision-based assessment is the most informative way to automatically infer worker severity level from images, and presents efficient computer vision models to conduct this risk analysis. The method is validated on a challenging image dataset, the first of its kind. Specifically, the proposed method detects and evaluates the worker state from visual data, defined by (1) worker body posture, (2) the usage of personal protective equipment, (3) worker interactions with tools and materials, (4) the construction activity being performed, and (5) the presence of surrounding workplace hazards. To estimate the worker state, a multitask recognition model is introduced that recognizes objects, activity, and keypoints from visual data simultaneously, taking 36.6% less time and 40.1% less memory while achieving performance comparable to a system running individual models for each subtask. Worker activity recognition is further improved with a spatio-temporal graph neural network model that uses recognized per-frame worker activity, detected bounding boxes of tools and materials, and estimated worker poses. Finally, severity levels are predicted by a classifier trained on a dataset of construction worker images accompanied by ground-truth severity level annotations. On a test dataset assembled from real-world projects, the severity level prediction model achieves 85.7% cross-validation accuracy for a bricklaying task and 86.6% for a plastering task, demonstrating the potential for near-real-time worker safety detection and severity assessment.
UR - http://www.scopus.com/inward/record.url?scp=85113714960&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85113714960&partnerID=8YFLogxK
U2 - 10.1061/(ASCE)CP.1943-5487.0000979
DO - 10.1061/(ASCE)CP.1943-5487.0000979
M3 - Article
AN - SCOPUS:85113714960
SN - 0887-3801
VL - 35
JO - Journal of Computing in Civil Engineering
JF - Journal of Computing in Civil Engineering
IS - 6
M1 - 04021020
ER -