FF's Roam Notes


Gradient Temporal-Difference

Jun 05, 2025

  • rl

Gradient TD methods cannot be applied directly to neural networks because their derivation involves a projection onto the space spanned by fixed basis functions, which assumes linear function approximation.
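To make the projection concrete, here is a sketch of the objective gradient-TD methods minimize under linear function approximation, using standard (assumed) notation: $\Phi$ is the feature matrix, $D$ the diagonal matrix of the state distribution, and $T^\pi$ the Bellman operator.

```latex
% Mean-squared projected Bellman error (MSPBE)
J(\theta) = \lVert V_\theta - \Pi T^\pi V_\theta \rVert_D^2,
\qquad
\Pi = \Phi \left( \Phi^\top D \Phi \right)^{-1} \Phi^\top D .
```

With a nonlinear $V_\theta$ there is no fixed feature matrix $\Phi$, so $\Pi$ is no longer a fixed linear operator, which is why the derivation does not carry over directly to neural networks.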

Moreover, GTD with neural networks has so far been applied only to policy evaluation, not to control.

In practice, "semi-gradient" TD also learns faster on almost all problems.

Reference

  • http://incompleteideas.net/Talks/gradient-TD-2011.pdf

  • https://www.reddit.com/r/reinforcementlearning/comments/9dh1th/why_are_gradient_td_methods_not_used_in_deep_rl/

