TY - GEN
T1 - Text Query based Traffic Video Event Retrieval with Global-Local Fusion Embedding
AU - Nguyen-Ho, Thang Long
AU - Pham, Minh Khoi
AU - Nguyen, Tien Phat
AU - Nguyen, Hai Dang
AU - Do, Minh N.
AU - Nguyen, Tam V.
AU - Tran, Minh Triet
N1 - This research is supported by Vingroup Innovation Foundation (VINIF) in project code VINIF.2019.DA19 and National Science Foundation (NSF) under Grant No. 2025234. Tien-Phat Nguyen was funded by Vingroup JSC and supported by the Master, PhD Scholarship Programme of Vingroup Innovation Foundation (VINIF), Institute of Big Data, code VINIF.2021.ThS.JVN.04.
PY - 2022
Y1 - 2022
N2 - Retrieving event videos from textual descriptions is a promising research topic in the fast-growing data field. As traffic data grows every day, intelligent traffic management systems that work in conjunction with humans are essential to speed up the search. We propose a multi-module system that delivers accurate results while also meeting the objectives of explainability and scalability. Our solution uses a rule-based approach that considers neighboring entities related to the mentioned object, so that an event can be represented by the relationships among multiple objects. Our retrieval method combines a modified version of the Alibaba solution with the post-processing techniques of the HCMUS method from AI City Challenge 2021 to boost the explainability of the obtained results. Because traffic data is vehicle-centric, we apply two language and image modules to analyze the input data and obtain the global properties of the context and the internal attributes of the vehicle. We introduce a one-on-one dual training strategy for each representation vector to optimize the interior features for the query. Finally, a refinement module gathers previous results to enhance the final retrieval result. We benchmarked our approach on the data of the AI City Challenge 2022 and obtained competitive results with an MRR of 0.3611. We were ranked in the top 4 on 50% of the test set and in the top 5 on the full set.
AB - Retrieving event videos from textual descriptions is a promising research topic in the fast-growing data field. As traffic data grows every day, intelligent traffic management systems that work in conjunction with humans are essential to speed up the search. We propose a multi-module system that delivers accurate results while also meeting the objectives of explainability and scalability. Our solution uses a rule-based approach that considers neighboring entities related to the mentioned object, so that an event can be represented by the relationships among multiple objects. Our retrieval method combines a modified version of the Alibaba solution with the post-processing techniques of the HCMUS method from AI City Challenge 2021 to boost the explainability of the obtained results. Because traffic data is vehicle-centric, we apply two language and image modules to analyze the input data and obtain the global properties of the context and the internal attributes of the vehicle. We introduce a one-on-one dual training strategy for each representation vector to optimize the interior features for the query. Finally, a refinement module gathers previous results to enhance the final retrieval result. We benchmarked our approach on the data of the AI City Challenge 2022 and obtained competitive results with an MRR of 0.3611. We were ranked in the top 4 on 50% of the test set and in the top 5 on the full set.
UR - http://www.scopus.com/inward/record.url?scp=85137793645&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137793645&partnerID=8YFLogxK
U2 - 10.1109/CVPRW56347.2022.00353
DO - 10.1109/CVPRW56347.2022.00353
M3 - Conference contribution
AN - SCOPUS:85137793645
T3 - IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
SP - 3133
EP - 3140
BT - Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022
PB - IEEE Computer Society
T2 - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022
Y2 - 19 June 2022 through 20 June 2022
ER -