Policy gradient formalism

A course from National Research University Higher School of Economics
Practical Reinforcement Learning
117 ratings
Course 4 of 7 in the Advanced Machine Learning Specialization
From this lesson
Policy-based methods
We spent the previous three modules on value-based methods: learning state values, action values, and the like. Now it is time to look at an alternative approach that doesn't require you to predict all future rewards in order to learn something.

Meet the instructors

  • Pavel Shvechikov
    Researcher at HSE and Sberbank AI Lab
    HSE Faculty of Computer Science
  • Alexander Panin
    Lecturer
    HSE Faculty of Computer Science
