FF's Notes

Gradient Temporal-Difference

May 19, 2021
rl

This method cannot be applied directly to neural networks because it involves a projection onto the space spanned by the basis functions.
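To make the projection concrete, here is a minimal sketch of the weighted projection used in the linear setting, assuming a fixed feature matrix and a state weighting. The names `Phi`, `d`, `projection_matrix`, and the toy numbers are illustrative assumptions, not from the note. With a neural network there is no fixed `Phi`, so this closed form is not available.

```python
import numpy as np

def projection_matrix(Phi, d):
    """Weighted projection onto the column span of Phi.

    Phi: (n_states, n_features) feature matrix, assumed full column rank.
    d:   (n_states,) positive state weighting (hypothetical values below).
    """
    D = np.diag(d)
    # Pi = Phi (Phi^T D Phi)^{-1} Phi^T D
    return Phi @ np.linalg.solve(Phi.T @ D @ Phi, Phi.T @ D)

# Toy example: 3 states, 2 basis functions (illustrative numbers only).
Phi = np.array([[1.0, 0.0],
                [1.0, 1.0],
                [1.0, 2.0]])
d = np.array([0.5, 0.3, 0.2])
Pi = projection_matrix(Phi, d)

v = np.array([1.0, -2.0, 4.0])  # arbitrary value vector outside span(Phi)
v_proj = Pi @ v                 # closest point in span(Phi) under the d-weighted norm
```

Applying `Pi` twice changes nothing, and the residual `v - v_proj` is orthogonal to every basis function under the `d`-weighted inner product.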

Moreover, GTD for neural networks has been done only for policy evaluation, not control.

In addition, "semi-gradient" TD works faster in practice on almost all problems.
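For comparison, the two update rules can be sketched side by side for linear features. This assumes the TDC variant of gradient TD; the function names, step sizes, and the single toy transition are illustrative assumptions, and the sketch shows only the form of the updates, not their relative speed.

```python
import numpy as np

def td0_step(theta, phi, r, phi_next, alpha, gamma):
    """One semi-gradient TD(0) update with linear features."""
    delta = r + gamma * theta @ phi_next - theta @ phi  # TD error
    return theta + alpha * delta * phi

def tdc_step(theta, w, phi, r, phi_next, alpha, beta, gamma):
    """One TDC (gradient TD) update; w is the auxiliary weight vector."""
    delta = r + gamma * theta @ phi_next - theta @ phi  # TD error
    # Same as TD(0) plus a correction term involving the auxiliary weights.
    theta_new = theta + alpha * (delta * phi - gamma * phi_next * (phi @ w))
    # Auxiliary weights track the expected TD error given the features.
    w_new = w + beta * (delta - phi @ w) * phi
    return theta_new, w_new

# Hypothetical transition: two features, reward 1.
phi, phi_next, r = np.array([1.0, 0.0]), np.array([0.0, 1.0]), 1.0
alpha, beta, gamma = 0.1, 0.5, 0.9

theta_td = td0_step(np.zeros(2), phi, r, phi_next, alpha, gamma)
theta_tdc, w = tdc_step(np.zeros(2), np.zeros(2), phi, r, phi_next, alpha, beta, gamma)
# Second TDC step on the same transition: the correction term is now nonzero.
theta_tdc2, w2 = tdc_step(theta_tdc, w, phi, r, phi_next, alpha, beta, gamma)
```

With the auxiliary weights at zero, the first TDC step coincides with the TD(0) step. The extra weight vector and second step size are the additional bookkeeping that gradient TD carries.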