TY - JOUR
T1 - Last-Mile Embodied Visual Navigation
AU - Wasserman, Justin
AU - Yadav, Karmesh
AU - Chowdhary, Girish
AU - Gupta, Abhinav
AU - Jain, Unnat
N1 - We thank the reviewers for suggesting additional experiments to make the work stronger. JW and GC are supported by ONR MURI N00014-19-1-2373. We are grateful to Akihiro Higuti and Mateus Valverde for help with the physical robot, Dhruv Batra for helping broaden the scope, Jae Yong Lee for help with the geometric vision formulation, Meera Hahn for assistance in reproducing NRNS results, and Shubham Tulsiani for helping ground the work better in 3D vision. A big thanks to our friends who gave feedback to improve the submission draft: Homanga Bharadhwaj, Raunaq Bhirangi, Xiaoming Zhao, and Zhenggang Tang.
PY - 2023
Y1 - 2023
AB - Realistic long-horizon tasks like image-goal navigation involve exploratory and exploitative phases. Assigned with an image of the goal, an embodied agent must explore to discover the goal, i.e., search efficiently using learned priors. Once the goal is discovered, the agent must accurately calibrate the last-mile of navigation to the goal. As with any robust system, switches between exploratory goal discovery and exploitative last-mile navigation enable better recovery from errors. Following these intuitive guide rails, we propose SLING to improve the performance of existing image-goal navigation systems. Entirely complementing prior methods, we focus on last-mile navigation and leverage the underlying geometric structure of the problem with neural descriptors. With simple but effective switches, we can easily connect SLING with heuristic, reinforcement learning, and neural modular policies. On a standardized image-goal navigation benchmark [1], we improve performance across policies, scenes, and episode complexity, raising the state-of-the-art from 45% to 55% success rate. Beyond photorealistic simulation, we conduct real-robot experiments in three physical scenes and find these improvements to transfer well to real environments. Code and results: https://jbwasse2.github.io/portfolio/SLING.
KW - AI Habitat
KW - Embodied AI
KW - Perspective-n-Point
KW - Robot Learning
KW - Sim-to-Real
KW - Visual Navigation
UR - http://www.scopus.com/inward/record.url?scp=85164947085&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85164947085&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85164947085
SN - 2640-3498
VL - 205
SP - 666
EP - 678
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 6th Conference on Robot Learning, CoRL 2022
Y2 - 14 December 2022 through 18 December 2022
ER -