TY - GEN
T1 - Discriminative Appearance Modeling with Multi-track Pooling for Real-time Multi-object Tracking
AU - Kim, Chanho
AU - Li, Fuxin
AU - Alotaibi, Mazen
AU - Rehg, James M.
N1 - Funding Information:
This work was supported in part by NIH award 1R24OD020174-01A1 and DARPA contract N66001-19-2-4035. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Defense Advanced Research Projects Agency (DARPA).
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
AB - In multi-object tracking, the tracker maintains in its memory the appearance and motion information for each object in the scene. This memory is utilized for finding matches between tracks and detections, and is updated based on the matching. Many approaches model each target in isolation and lack the ability to use all the targets in the scene to jointly update the memory. This can be problematic when there are similar-looking objects in the scene. In this paper, we solve the problem of simultaneously considering all tracks during memory updating, with only a small spatial overhead, via a novel multi-track pooling module. We additionally propose a training strategy adapted to multi-track pooling which generates hard tracking episodes online. We show that the combination of these innovations results in a strong discriminative appearance model under the bilinear LSTM tracking framework, enabling the use of greedy data association to achieve online tracking performance. Our experiments demonstrate real-time, state-of-the-art online tracking performance on public multi-object tracking (MOT) datasets. The code and trained models are available at https://github.com/chkim403/blstm-mtp.
UR - http://www.scopus.com/inward/record.url?scp=85113928531&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85113928531&partnerID=8YFLogxK
U2 - 10.1109/CVPR46437.2021.00943
DO - 10.1109/CVPR46437.2021.00943
M3 - Conference contribution
AN - SCOPUS:85113928531
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 9548
EP - 9557
BT - Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
PB - IEEE Computer Society
T2 - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
Y2 - 19 June 2021 through 25 June 2021
ER -