TY - JOUR
T1 - Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents
AU - Patel, Shivansh
AU - Wani, Saim
AU - Jain, Unnat
AU - Schwing, Alexander
AU - Lazebnik, Svetlana
AU - Savva, Manolis
AU - Chang, Angel X.
N1 - We thank the anonymous reviewers for their suggestions and feedback. This work was funded in part by a Canada CIFAR AI Chair, a Canada Research Chair and NSERC Discovery Grant, and enabled in part by support provided by WestGrid and Compute Canada. This work is supported in part by NSF under Grant #1718221, 2008387, 2045586.
PY - 2021
Y1 - 2021
N2 - Communication between embodied AI agents has received increasing attention in recent years. Despite its use, it is still unclear whether the learned communication is interpretable and grounded in perception. To study the grounding of emergent forms of communication, we first introduce the collaborative multi-object navigation task ‘CoMON.’ In this task, an ‘oracle agent’ has detailed environment information in the form of a map. It communicates with a ‘navigator agent’ that perceives the environment visually and is tasked to find a sequence of goals. To succeed at the task, effective communication is essential. CoMON hence serves as a basis to study different communication mechanisms between heterogeneous agents, that is, agents with different capabilities and roles. We study two common communication mechanisms and analyze their communication patterns through an egocentric and spatial lens. We show that the emergent communication can be grounded to the agent observations and the spatial structure of the 3D environment.
UR - http://www.scopus.com/inward/record.url?scp=85132862826&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85132862826&partnerID=8YFLogxK
U2 - 10.1109/ICCV48922.2021.01565
DO - 10.1109/ICCV48922.2021.01565
M3 - Conference article
AN - SCOPUS:85132862826
SN - 1550-5499
SP - 15933
EP - 15943
JO - Proceedings of the IEEE International Conference on Computer Vision
JF - Proceedings of the IEEE International Conference on Computer Vision
T2 - 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021
Y2 - 11 October 2021 through 17 October 2021
ER -