This paper proposes a hierarchical RL architecture consisting of two modules: a meta controller that reads the current subtask instruction and generates the next subtask's parameters, and a parameterized skill policy that generalizes across different picking tasks by using analogy-making.
The overall task is still specified by giving an instruction sequence that lists all subtasks to perform.
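A minimal sketch of the two-module hierarchy described above, assuming a discrete instruction vocabulary and a flat observation vector; the module names, dimensions, and heads here are illustrative assumptions, not the paper's actual implementation, and the analogy-making objective itself is only indicated, not implemented.

```python
import torch
import torch.nn as nn


class MetaController(nn.Module):
    """Reads the current instruction and emits the next subtask's parameters
    (here: logits over action types and over target objects)."""

    def __init__(self, vocab_size, embed_dim, n_action_types, n_targets):
        super().__init__()
        self.instr_embed = nn.EmbeddingBag(vocab_size, embed_dim)
        self.action_head = nn.Linear(embed_dim, n_action_types)
        self.target_head = nn.Linear(embed_dim, n_targets)

    def forward(self, instruction_tokens):
        h = self.instr_embed(instruction_tokens)
        return self.action_head(h), self.target_head(h)


class ParameterizedSkill(nn.Module):
    """Low-level policy conditioned on the observation and the subtask
    parameters chosen by the meta controller; the shared embedding space for
    subtask parameters is where an analogy-making regularizer would act."""

    def __init__(self, obs_dim, n_action_types, n_targets, param_dim,
                 n_primitive_actions):
        super().__init__()
        self.action_type_embed = nn.Embedding(n_action_types, param_dim)
        self.target_embed = nn.Embedding(n_targets, param_dim)
        self.policy = nn.Sequential(
            nn.Linear(obs_dim + 2 * param_dim, 128),
            nn.ReLU(),
            nn.Linear(128, n_primitive_actions),
        )

    def forward(self, obs, action_type, target):
        params = torch.cat([self.action_type_embed(action_type),
                            self.target_embed(target)], dim=-1)
        return self.policy(torch.cat([obs, params], dim=-1))


if __name__ == "__main__":
    meta = MetaController(vocab_size=50, embed_dim=32,
                          n_action_types=3, n_targets=10)
    skill = ParameterizedSkill(obs_dim=64, n_action_types=3, n_targets=10,
                               param_dim=16, n_primitive_actions=5)
    instruction = torch.randint(0, 50, (1, 4))        # one tokenized instruction
    act_logits, tgt_logits = meta(instruction)        # meta picks subtask parameters
    action_type = act_logits.argmax(dim=-1)
    target = tgt_logits.argmax(dim=-1)
    obs = torch.randn(1, 64)
    primitive_logits = skill(obs, action_type, target)  # skill acts given those parameters
    print(primitive_logits.shape)                      # torch.Size([1, 5])
```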