Notes on Implicit Behavioral Cloning

Key idea

Behavioral cloning (BC) policies are often represented by explicit, continuous feed-forward models that map directly from input observations $o \in O$ to output actions $a \in A$:

$$\hat{a} = F_{\theta}(o)$$

This work reformulates BC by composing argmin with an energy function $E_{\theta}$ to represent the policy implicitly:

$$\hat{a} = \arg\min_{a \in A} E_{\theta}(o, a)$$
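To make the contrast concrete, here is a minimal sketch of the two policy forms. Everything in it is an assumption for illustration: the quadratic `energy` function, the linear `explicit_policy`, and the uniform sampling range are hypothetical stand-ins, not the models or the optimizer used in the paper. The argmin is approximated by sampling candidate actions and keeping the one with the lowest energy.

```python
import numpy as np

def energy(obs, actions):
    """Hypothetical energy E(o, a): lowest when the action equals 2 * obs."""
    return np.sum((actions - 2.0 * obs) ** 2, axis=-1)

def explicit_policy(obs):
    """Explicit form: the model maps the observation straight to an action."""
    return 2.0 * obs

def implicit_policy(obs, low=-5.0, high=5.0, num_samples=4096, seed=0):
    """Implicit form: approximate argmin_a E(o, a) over sampled candidates."""
    rng = np.random.default_rng(seed)
    # Draw candidate actions uniformly from an assumed action box [low, high].
    candidates = rng.uniform(low, high, size=(num_samples, obs.shape[0]))
    energies = energy(obs, candidates)
    # Return the candidate with the lowest energy.
    return candidates[np.argmin(energies)]

obs = np.array([0.5, -1.0])
a_explicit = explicit_policy(obs)
a_implicit = implicit_policy(obs)
```

With this toy energy both forms should land near the same action; the implicit form only differs in that the action is found by a search at inference time rather than produced by a single forward pass.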

The model is then trained with various energy-based model (EBM) training methods.
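As one example of an EBM training objective, an InfoNCE-style contrastive loss pushes the energy of the demonstrated action down relative to a set of sampled negative actions. The sketch below is illustrative only: the function name `info_nce_loss` and the toy energy values are assumptions, and these notes do not say which objective was actually used.

```python
import numpy as np

def info_nce_loss(e_pos, e_neg):
    """-log( exp(-E_pos) / (exp(-E_pos) + sum_j exp(-E_neg_j)) )."""
    # Lower energy means higher score, so the logits are negated energies.
    logits = -np.concatenate([[e_pos], e_neg])
    # Subtract the maximum for numerical stability before exponentiating.
    logits = logits - logits.max()
    log_prob_pos = logits[0] - np.log(np.sum(np.exp(logits)))
    return -log_prob_pos

# Demonstrated action has low energy, negatives have high energy: small loss.
loss_good = info_nce_loss(0.1, np.array([3.0, 4.0, 5.0]))
# Demonstrated action has high energy, negatives have low energy: large loss.
loss_bad = info_nce_loss(5.0, np.array([0.1, 0.2, 0.3]))
```

Minimizing this loss over the parameters of the energy function would make the demonstrated action the likely argmin at inference time.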

What can you learn?

From explicit to implicit. If a method or optimization objective can be transferred from an explicit formulation to an implicit one, that transfer can be enough for a paper.

Drawbacks

None identified so far.
