A robotic learning system must be robust to:
- the difference between an offline training dataset and the real world,
- non-stationary changes in the real world (e.g., visual distractors, which it should ignore).
They consider the problem of navigating to a user-specified goal in a previously unseen environment.
Broadly, their method, Rapid Exploration Controller for Outcome-driven Navigation (RECON), consists of three separate stages:
- learning a subgoal representation from offline data using a Deep Variational Information Bottleneck architecture,
- using subgoals generated by the previously trained goal model to collect data and build a map in the new environment,
- navigating to subgoals in the new environment.
Note that exploration and navigation are executed at the same time.
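
To make the interplay of the last two stages concrete, here is a minimal sketch of an explore-while-navigating loop on a toy grid world. It is not the authors' implementation: the learned goal model is replaced by a hypothetical stub (`propose_subgoal`) that simply picks the least-visited adjacent cell, the robot is assumed to always reach its subgoal, and all names (`TopologicalMap`, `explore_and_navigate`, the grid itself) are assumptions made for illustration.

```python
import heapq
import random


class TopologicalMap:
    """Graph whose nodes are visited places and whose edges carry a travel cost."""

    def __init__(self):
        self.edges = {}  # node -> {neighbor: cost}

    def add_edge(self, a, b, cost=1.0):
        self.edges.setdefault(a, {})[b] = cost
        self.edges.setdefault(b, {})[a] = cost

    def shortest_path(self, start, goal):
        """Dijkstra over the stored graph; returns a list of nodes or None."""
        if start not in self.edges or goal not in self.edges:
            return None
        dist, prev, queue = {start: 0.0}, {}, [(0.0, start)]
        while queue:
            d, node = heapq.heappop(queue)
            if node == goal:
                path = [node]
                while node in prev:
                    node = prev[node]
                    path.append(node)
                return path[::-1]
            if d > dist[node]:
                continue  # stale queue entry
            for nxt, cost in self.edges[node].items():
                if d + cost < dist.get(nxt, float("inf")):
                    dist[nxt] = d + cost
                    prev[nxt] = node
                    heapq.heappush(queue, (d + cost, nxt))
        return None


def propose_subgoal(position, visits, size, rng):
    """Hypothetical stand-in for the goal model: the least-visited adjacent cell."""
    x, y = position
    candidates = [
        (x + dx, y + dy)
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
        if 0 <= x + dx < size and 0 <= y + dy < size
    ]
    fewest = min(visits.get(c, 0) for c in candidates)
    return rng.choice([c for c in candidates if visits.get(c, 0) == fewest])


def explore_and_navigate(start, goal, size=3, max_steps=10_000, seed=0):
    """Move subgoal by subgoal; every move also adds an edge to the map."""
    rng = random.Random(seed)
    graph = TopologicalMap()
    graph.edges.setdefault(start, {})
    visits = {start: 1}
    position, steps = start, 0
    while position != goal and steps < max_steps:
        subgoal = propose_subgoal(position, visits, size, rng)
        graph.add_edge(position, subgoal)  # data collection doubles as map building
        position = subgoal                 # assumption: the subgoal is always reached
        visits[position] = visits.get(position, 0) + 1
        steps += 1
    return graph, (steps if position == goal else None)


graph, steps = explore_and_navigate(start=(0, 0), goal=(2, 2))
path = graph.shortest_path((0, 0), (2, 2))
print(f"goal reached after {steps} steps; stored route has {len(path)} nodes")
```

The point of the sketch is that there is no separate mapping phase: the same moves that search for the goal also grow the graph, and once the goal has been found the stored graph can be reused to plan a direct route to it.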