TY - GEN
T1 - Weakly Supervised Two-Stage Training Scheme for Deep Video Fight Detection Model
AU - Qi, Zhenting
AU - Zhu, Ruike
AU - Fu, Zheyu
AU - Chai, Wenhao
AU - Kindratenko, Volodymyr
N1  - ACKNOWLEDGMENT: This work utilized resources supported by the National Science Foundation's Major Research Instrumentation program, grant #1725729, as well as the University of Illinois at Urbana-Champaign.
PY - 2022
Y1 - 2022
N2  - Fight detection in videos is an emerging deep learning application given today's prevalence of surveillance systems and streaming media. Previous work has largely relied on action recognition techniques to tackle this problem. In this paper, we propose a simple but effective method that solves the task from a new perspective: we design the fight detection model as a composition of an action-aware feature extractor and an anomaly score generator. Also, considering that collecting frame-level labels for videos is too laborious, we design a weakly supervised two-stage training scheme, where we utilize a multiple-instance-learning loss calculated on video-level labels to train the score generator, and adopt the self-training technique to further improve its performance. Extensive experiments on a publicly available large-scale dataset, UBI-Fights, demonstrate the effectiveness of our method, and its performance on the dataset exceeds several previous state-of-the-art approaches. Furthermore, we collect a new dataset, VFD-2000, that specializes in video fight detection, with a larger scale and more scenarios than existing datasets. The implementation of our method and the proposed dataset are available at https://github.com/Hepta-Col/VideoFightDetection.
AB  - Fight detection in videos is an emerging deep learning application given today's prevalence of surveillance systems and streaming media. Previous work has largely relied on action recognition techniques to tackle this problem. In this paper, we propose a simple but effective method that solves the task from a new perspective: we design the fight detection model as a composition of an action-aware feature extractor and an anomaly score generator. Also, considering that collecting frame-level labels for videos is too laborious, we design a weakly supervised two-stage training scheme, where we utilize a multiple-instance-learning loss calculated on video-level labels to train the score generator, and adopt the self-training technique to further improve its performance. Extensive experiments on a publicly available large-scale dataset, UBI-Fights, demonstrate the effectiveness of our method, and its performance on the dataset exceeds several previous state-of-the-art approaches. Furthermore, we collect a new dataset, VFD-2000, that specializes in video fight detection, with a larger scale and more scenarios than existing datasets. The implementation of our method and the proposed dataset are available at https://github.com/Hepta-Col/VideoFightDetection.
KW - Computer Vision
KW - Self-Training
KW - Video Anomaly Detection
KW - Video Fight Detection
KW - Weakly Supervised Learning
UR - http://www.scopus.com/inward/record.url?scp=85156095188&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85156095188&partnerID=8YFLogxK
U2 - 10.1109/ICTAI56018.2022.00105
DO - 10.1109/ICTAI56018.2022.00105
M3 - Conference contribution
AN - SCOPUS:85156095188
T3 - Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
SP - 677
EP - 685
BT - Proceedings - 2022 IEEE 34th International Conference on Tools with Artificial Intelligence, ICTAI 2022
A2 - Reformat, Marek
A2 - Zhang, Du
A2 - Bourbakis, Nikolaos G.
PB - IEEE Computer Society
T2 - 34th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2022
Y2 - 31 October 2022 through 2 November 2022
ER -