Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning
Same idea as Skew-Fit: State-Covering Self-Supervised Reinforcement Learning. Both favor less-visited achieved goals when choosing what to pursue next, which amounts to maximizing the entropy $H(S)$ of the visited-state distribution.
But the way this paper organizes its subgoal exploration algorithm is really great.
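
To make the shared idea of favoring less-visited goals concrete, here is a minimal sketch. It assumes a buffer of achieved goals stored as a NumPy array and a plain Gaussian kernel density estimate. The function names, the `bandwidth` value, and the `alpha` exponent are illustrative assumptions, not the papers' actual implementations. The first selector picks the single lowest-density achieved goal, and the second reweights goals by density raised to a negative power so that rare goals are sampled more often.

```python
import numpy as np


def kde_density(points, queries, bandwidth=0.5):
    """Unnormalized Gaussian kernel density of `queries` under `points`.

    points:  array of shape (n_points, dim)
    queries: array of shape (n_queries, dim)
    """
    # Pairwise squared distances, shape (n_queries, n_points).
    diffs = queries[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    return np.mean(np.exp(-sq_dists / (2.0 * bandwidth ** 2)), axis=1)


def select_min_density_goal(achieved_goals, bandwidth=0.5):
    """Return the achieved goal with the lowest estimated density."""
    density = kde_density(achieved_goals, achieved_goals, bandwidth)
    return achieved_goals[np.argmin(density)]


def skewed_sampling_weights(achieved_goals, alpha=-1.0, bandwidth=0.5):
    """Sampling weights proportional to density ** alpha (alpha < 0).

    A negative alpha upweights rarely visited goals, which pushes the
    sampled goal distribution toward uniform (higher entropy).
    """
    density = kde_density(achieved_goals, achieved_goals, bandwidth)
    weights = density ** alpha
    return weights / weights.sum()


# Toy buffer: a dense cluster near the origin plus one rarely visited goal.
rng = np.random.default_rng(0)
cluster = rng.normal(loc=0.0, scale=0.1, size=(50, 2))
outlier = np.array([[10.0, 10.0]])
buffer_goals = np.concatenate([cluster, outlier], axis=0)

next_goal = select_min_density_goal(buffer_goals)
weights = skewed_sampling_weights(buffer_goals)
```

In this toy buffer the isolated goal has the lowest density, so it is the one selected and it also receives the largest sampling weight. A real agent would refit the density as the buffer grows, so the frontier of rarely achieved goals keeps moving outward.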