A robotic learning system must be robust to

  1. the difference between an offline training dataset and the real world,
  2. non-stationary changes in the real world (e.g., visual distractors that should be ignored).

The authors consider the problem of navigating to a user-specified goal in a previously unseen environment.

Broadly, their method, Rapid Exploration Controller for Outcome-driven Navigation (RECON), consists of three stages:

  1. learning a compressed subgoal representation from offline data using a Deep Variational Information Bottleneck architecture (a sketch follows this list),
  2. using subgoals sampled from the previously trained goal model to collect data in the new environment and build a map of it,
  3. navigating to goals in the new environment using that map.
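
As a rough sketch of stage 1 under stated assumptions: the goal observation is compressed through a variational information bottleneck into a latent `z`, whose KL divergence to a unit-Gaussian prior is penalized while a head predicts actions and distance-to-goal from the current observation and `z`. The module shapes, the flat vector observations standing in for images, and the loss weighting `beta` are illustrative assumptions, not the paper's implementation.

```python
# Minimal VIB goal-representation sketch (illustrative, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class VIBGoalModel(nn.Module):
    def __init__(self, obs_dim=64, latent_dim=32, action_dim=2):
        super().__init__()
        # Encoder q(z | goal): outputs mean and log-variance of a Gaussian.
        self.goal_encoder = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, 2 * latent_dim),
        )
        # Head conditioned on the current observation and the latent goal.
        self.head = nn.Sequential(
            nn.Linear(obs_dim + latent_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim + 1),  # predicted action + distance-to-goal
        )

    def encode(self, goal):
        mu, log_var = self.goal_encoder(goal).chunk(2, dim=-1)
        std = torch.exp(0.5 * log_var)
        z = mu + std * torch.randn_like(std)  # reparameterization trick
        kl = 0.5 * (mu.pow(2) + log_var.exp() - log_var - 1).sum(-1)
        return z, kl

    def forward(self, obs, goal):
        z, kl = self.encode(goal)
        out = self.head(torch.cat([obs, z], dim=-1))
        action, dist = out[..., :-1], out[..., -1]
        return action, dist, kl


def vib_loss(model, obs, goal, action_target, dist_target, beta=1e-3):
    # Supervised prediction term plus the information-bottleneck penalty.
    action, dist, kl = model(obs, goal)
    pred_loss = F.mse_loss(action, action_target) + F.mse_loss(dist, dist_target)
    return pred_loss + beta * kl.mean()
```

Because the latent prior is a unit Gaussian, sampling from it later gives "imagined" subgoals for exploration even in places the robot has never seen.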

Note that exploration and navigation are executed at the same time, as sketched below.
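
The interleaving could look roughly like the loop below: latent subgoals are sampled from the prior for exploration, observations reached along the way are added as nodes of a graph-structured map, and the policy switches to the encoded user goal once the model predicts it is close. The environment interface (`env.reset`/`env.step`), the callables `policy`, `sample_latent_goal`, `encode_goal`, `dist_fn`, and the thresholds are all hypothetical placeholders, not the authors' API.

```python
# Sketch of interleaved exploration (stage 2) and goal reaching (stage 3).
import networkx as nx


def explore_and_navigate(env, policy, sample_latent_goal, encode_goal,
                         dist_fn, user_goal_obs, max_steps=1000,
                         node_spacing=2.0, reach_thresh=0.5):
    """Explore with sampled latent subgoals, build a map, reach the user goal."""
    graph = nx.Graph()
    obs = env.reset()
    graph.add_node(0, obs=obs)
    last_node = 0

    goal_latent = encode_goal(user_goal_obs)   # compressed user-specified goal
    subgoal = sample_latent_goal()             # exploratory subgoal from the prior

    for _ in range(max_steps):
        # Chase the user goal if the model predicts it is nearby,
        # otherwise keep pursuing the sampled exploratory subgoal.
        target = goal_latent if dist_fn(obs, goal_latent) < node_spacing else subgoal
        obs, done = env.step(policy(obs, target))

        # Add a node (and an edge to the previous node) whenever the estimated
        # distance to the last node exceeds the spacing, keeping the map sparse.
        if dist_fn(graph.nodes[last_node]["obs"], encode_goal(obs)) > node_spacing:
            new_node = graph.number_of_nodes()
            graph.add_node(new_node, obs=obs)
            graph.add_edge(last_node, new_node)
            last_node = new_node
            subgoal = sample_latent_goal()     # resample after reaching a frontier

        if done or dist_fn(obs, goal_latent) < reach_thresh:
            break                              # user goal reached (or episode ended)
    return graph
```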