TY - JOUR
T1 - Segmentation and recognition of roadway assets from car-mounted camera video streams using a scalable non-parametric image parsing method
AU - Balali, Vahid
AU - Golparvar-Fard, Mani
N1 - Publisher Copyright: © 2014 Elsevier B.V. All rights reserved.
PY - 2015/1
Y1 - 2015/1
N2 - This paper presents a non-parametric image parsing method for segmentation and recognition of roadway assets such as traffic signs, traffic lights, pavement markings, and guardrails from 2D car-mounted video streams. The method scales easily to thousands of video frames captured during data collection and does not need training. Instead, it retrieves a set of the most relevant video frames (e.g. highway vs. secondary road), which serve as candidates for superpixel-level annotation. It then obtains superpixels from the video frames and, using the retrieval set, encodes their visual characteristics with a histogram of different shape, appearance, and color descriptors. Neighborhood contexts are incorporated through Markov Random Field (MRF) optimization, and two types of labels, semantic (e.g. guardrail) and geometric (e.g. horizontal), are simultaneously assigned to the superpixels. We introduce a new dataset from I-57 together with its ground truth and present experimental results on both the I-57 and SmartRoad datasets. Experimental results with an average accuracy of 88.24% for recognition and 82.02% for segmentation show that our local visual features provide acceptable performance, while the method overall does not require any significant supervised training. This scalable method has the potential to reduce the time and effort required for developing road inventories, especially for assets such as guardrails and traffic lights that are not typically considered in 2D asset recognition methods.
KW - High-quantity low-cost highway assets
KW - Parsing
KW - Recognition
KW - Segmentation
UR - http://www.scopus.com/inward/record.url?scp=84922569843&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84922569843&partnerID=8YFLogxK
U2 - 10.1016/j.autcon.2014.09.007
DO - 10.1016/j.autcon.2014.09.007
M3 - Article
AN - SCOPUS:84922569843
SN - 0926-5805
VL - 49
SP - 27
EP - 39
JO - Automation in Construction
JF - Automation in Construction
IS - PA
ER -