TY - GEN
T1 - Two body problem
T2 - 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
AU - Jain, Unnat
AU - Weihs, Luca
AU - Kolve, Eric
AU - Rastegari, Mohammad
AU - Lazebnik, Svetlana
AU - Farhadi, Ali
AU - Schwing, Alexander G.
AU - Kembhavi, Aniruddha
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/6
Y1 - 2019/6
N2 - Collaboration is a necessary skill to perform tasks that are beyond one agent's capabilities. Addressed extensively in both conventional and modern AI, multi-agent collaboration has often been studied in the context of simple grid worlds. We argue that there are inherently visual aspects to collaboration which should be studied in visually rich environments. A key element in collaboration is communication that can be either explicit, through messages, or implicit, through perception of the other agents and the visual world. Learning to collaborate in a visual environment entails learning (1) to perform the task, (2) when and what to communicate, and (3) how to act based on these communications and the perception of the visual world. In this paper we study the problem of learning to collaborate directly from pixels in AI2-THOR and demonstrate the benefits of explicit and implicit modes of communication to perform visual tasks. Refer to our project page for more details: https://prior.allenai.org/projects/two-body-problem.
AB - Collaboration is a necessary skill to perform tasks that are beyond one agent's capabilities. Addressed extensively in both conventional and modern AI, multi-agent collaboration has often been studied in the context of simple grid worlds. We argue that there are inherently visual aspects to collaboration which should be studied in visually rich environments. A key element in collaboration is communication that can be either explicit, through messages, or implicit, through perception of the other agents and the visual world. Learning to collaborate in a visual environment entails learning (1) to perform the task, (2) when and what to communicate, and (3) how to act based on these communications and the perception of the visual world. In this paper we study the problem of learning to collaborate directly from pixels in AI2-THOR and demonstrate the benefits of explicit and implicit modes of communication to perform visual tasks. Refer to our project page for more details: https://prior.allenai.org/projects/two-body-problem.
KW - Categorization
KW - Recognition: Detection
KW - Retrieval
KW - Scene Analysis and Understanding
KW - Visual Reasoning
UR - http://www.scopus.com/inward/record.url?scp=85078761435&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85078761435&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2019.00685
DO - 10.1109/CVPR.2019.00685
M3 - Conference contribution
AN - SCOPUS:85078761435
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 6682
EP - 6692
BT - Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
PB - IEEE Computer Society
Y2 - 16 June 2019 through 20 June 2019
ER -