Learning exploration policies for navigation

Tao Chen, Saurabh Gupta, Abhinav Gupta

Research output: Contribution to conference › Paper

Abstract

Numerous past works have tackled the problem of task-driven navigation, but how to effectively explore a new environment to enable a variety of downstream tasks has received much less attention. In this work, we study how agents can autonomously explore realistic and complex 3D environments without the context of task rewards. We propose a learning-based approach and investigate different policy architectures, reward functions, and training paradigms. We find that policies with spatial memory, bootstrapped with imitation learning and then fine-tuned with coverage rewards derived purely from on-board sensors, can be effective at exploring novel environments. We show that our learned exploration policies explore better than classical approaches based on geometry alone and generic learning-based exploration techniques. Finally, we also show how such task-agnostic exploration can be used for downstream tasks. Videos are available at: https://sites.google.com/view/exploration-for-nav/.
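
To make the coverage-reward idea concrete, below is a minimal sketch of how such a reward could be computed; it is not the paper's actual implementation. It assumes the agent maintains a 2D grid of map cells observed so far and receives, at each step, the number of newly observed cells as reward. All names here (GRID_SIZE, mark_observed_cells, coverage_reward) are illustrative assumptions, not identifiers from the paper.

    import numpy as np

    GRID_SIZE = 512  # side length of the agent's map, in cells (assumed)

    def mark_observed_cells(seen, sensor_points):
        # seen:          (GRID_SIZE, GRID_SIZE) boolean map of cells seen so far
        # sensor_points: (N, 2) array of sensor returns projected into map coordinates
        idx = np.clip(sensor_points.astype(int), 0, GRID_SIZE - 1)
        seen[idx[:, 0], idx[:, 1]] = True
        return seen

    def coverage_reward(seen, sensor_points):
        # Per-step reward = count of cells observed for the first time this step.
        before = int(seen.sum())
        seen = mark_observed_cells(seen, sensor_points)
        return float(seen.sum() - before), seen

    # Usage: start with an empty map and accumulate reward as the agent moves.
    seen = np.zeros((GRID_SIZE, GRID_SIZE), dtype=bool)
    points = np.random.rand(100, 2) * GRID_SIZE  # stand-in for projected depth readings
    r, seen = coverage_reward(seen, points)

Because the reward depends only on what the agent's own sensors have revealed so far, it requires no task specification or external supervision, which is what makes the resulting exploration policy task-agnostic.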

Original language: English (US)
State: Published - Jan 1 2019
Externally published: Yes
Event: 7th International Conference on Learning Representations, ICLR 2019 - New Orleans, United States
Duration: May 6 2019 – May 9 2019

Conference

Conference: 7th International Conference on Learning Representations, ICLR 2019
Country: United States
City: New Orleans
Period: 5/6/19 – 5/9/19

Fingerprint

Navigation, Exploration policies, Reward, Spatial memory, Coverage, Imitation, Learning, Sensors, Geometry, Training paradigms

ASJC Scopus subject areas

  • Education
  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics

Cite this

Chen, T., Gupta, S., & Gupta, A. (2019). Learning exploration policies for navigation. Paper presented at 7th International Conference on Learning Representations, ICLR 2019, New Orleans, United States.
