TY - JOUR
T1 - Segmentation and recognition of highway assets using image-based 3D point clouds and Semantic Texton Forests
AU - Golparvar-Fard, Mani
AU - Balali, Vahid
AU - De La Garza, Jesus M.
N1 - Publisher Copyright: © 2014 American Society of Civil Engineers.
PY - 2015/1/1
Y1 - 2015/1/1
N2 - Efficient data collection for high-quantity, low-cost highway assets such as road signs, traffic signals, light poles, and guardrails is critical to the operation, maintenance, and preservation of transportation infrastructure systems. Despite its importance, the current practice of highway asset data collection is time-consuming, subjective, and potentially unsafe. The high volume of data that needs to be collected can also negatively impact the quality of the analysis. To address these limitations, this paper proposes a new set of algorithms for semantic segmentation and recognition of highway assets using video frames collected from a car-mounted camera. The proposed algorithms (1) take the captured frames and, using a pipeline of structure from motion and multiview stereo, reconstruct a three-dimensional (3D) point cloud model of the highway and surrounding assets; (2) segment each geo-registered two-dimensional (2D) video frame at the pixel level, using a Semantic Texton Forest classifier, based on the shape, texture, and color of the highway assets; and (3) based on the results of the 2D segmentation and a new voting scheme, assign each reconstructed 3D point in the cloud to one asset category and color-code it accordingly. The resulting augmented reality environment, which integrates the color-coded point clouds with the geo-registered video frames, enables a user to conduct visual walk-throughs and query different categories of assets. Experiments were performed on a challenging video data set containing sequences filmed from a moving car on a 2.2-mi-long, two-lane highway research facility. Average accuracies of 76.50% and 86.75% in segmentation and pixel-level recognition, respectively, across 12 asset categories show the promise of this approach for segmentation and recognition of highway assets from image-based 3D point clouds. The approach also enables future algorithmic developments for 3D localization of traffic signs and other assets detected using state-of-the-art vision-based methods.
KW - High-quantity low-cost assets
KW - Image-based 3D reconstruction
KW - Segmentation
KW - Semantic Texton Forest
UR - http://www.scopus.com/inward/record.url?scp=84920830702&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84920830702&partnerID=8YFLogxK
U2 - 10.1061/(ASCE)CP.1943-5487.0000283
DO - 10.1061/(ASCE)CP.1943-5487.0000283
M3 - Article
AN - SCOPUS:84920830702
SN - 0887-3801
VL - 29
JO - Journal of Computing in Civil Engineering
JF - Journal of Computing in Civil Engineering
IS - 1
M1 - 04014023
ER -