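About this course
139,090 recent views

100% online

Start right away and learn on your own schedule.

Course 1 of 4

Flexible deadlines

Reset deadlines to fit your schedule.

Intermediate level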

Probabilities & Expectations, basic linear algebra, basic calculus, Python 3.0 (at least 1 year), implementing algorithms from pseudocode.

Approx. 19 hours to complete

Suggested: 4-6 hours/week...

English

Subtitles: English

What you will learn

  • Formalize problems as Markov Decision Processes

  • Understand basic exploration methods and the exploration / exploitation tradeoff

  • Understand value functions, as a general-purpose tool for optimal decision-making

  • Know how to implement dynamic programming as an efficient solution approach to an industrial control problem

Skills you will gain

Artificial Intelligence (AI), Machine Learning, Reinforcement Learning, Function Approximation, Intelligent Systems

Syllabus - What you will learn from this course

Week 1
1 hour to complete

Welcome to the Course!

4 videos (total 20 min), 2 readings
4 videos
Course Introduction (5 min)
Meet your instructors! (8 min)
Your Specialization Roadmap (3 min)
2 readings
Reinforcement Learning Textbook (10 min)
Read Me: Pre-requisites and Learning Objectives (10 min)
7 hours to complete

The K-Armed Bandit Problem

8 videos (total 46 min), 3 readings, 2 quizzes
8 videos
Learning Action Values (4 min)
Estimating Action Values Incrementally (5 min)
What is the trade-off? (7 min)
Optimistic Initial Values (6 min)
Upper-Confidence Bound (UCB) Action Selection (5 min)
Jonathan Langford: Contextual Bandits for Real World Reinforcement Learning (8 min)
Week 1 Summary (3 min)
3 readings
Module 2 Learning Objectives (10 min)
Weekly Reading (30 min)
Chapter Summary (30 min)
1 practice exercise
Exploration/Exploitation (45 min)
Week 2
4 hours to complete

Markov Decision Processes

7 videos (total 36 min), 2 readings, 2 quizzes
7 videos
Examples of MDPs (4 min)
The Goal of Reinforcement Learning (3 min)
Michael Littman: The Reward Hypothesis (12 min)
Continuing Tasks (5 min)
Examples of Episodic and Continuing Tasks (3 min)
Week 2 Summary (1 min)
2 readings
Module 3 Learning Objectives (10 min)
Weekly Reading (30 min)
1 practice exercise
MDPs (45 min)
Week 3
3 hours to complete

Value Functions & Bellman Equations

9 videos (total 56 min), 3 readings, 2 quizzes
9 videos
Value Functions (6 min)
Rich Sutton and Andy Barto: A brief History of RL (7 min)
Bellman Equation Derivation (6 min)
Why Bellman Equations? (5 min)
Optimal Policies (7 min)
Optimal Value Functions (5 min)
Using Optimal Value Functions to Get Optimal Policies (8 min)
Week 3 Summary (4 min)
3 readings
Module 4 Learning Objectives (10 min)
Weekly Reading (30 min)
Chapter Summary (13 min)
2 practice exercises
Value Functions and Bellman Equations (45 min)
Value Functions and Bellman Equations (45 min)
Week 4
7 hours to complete

Dynamic Programming

10 videos (total 72 min), 3 readings, 2 quizzes
10 videos
Iterative Policy Evaluation (8 min)
Policy Improvement (4 min)
Policy Iteration (8 min)
Flexibility of the Policy Iteration Framework (4 min)
Efficiency of Dynamic Programming (5 min)
Warren Powell: Approximate Dynamic Programming for Fleet Management (Short) (7 min)
Warren Powell: Approximate Dynamic Programming for Fleet Management (Long) (21 min)
Week 4 Summary (2 min)
Congratulations! (3 min)
3 readings
Module 5 Learning Objectives (10 min)
Weekly Reading (30 min)
Chapter Summary (30 min)
1 practice exercise
Dynamic Programming (45 min)
4.8
122 reviews

Top reviews from Fundamentals of Reinforcement Learning

By PV, Nov 10th 2019

I understood all the necessary concepts of RL. I've been working on RL for some time now, but thanks to this course, now I have more basic knowledge about RL and can't wait to watch other courses

By AB, Sep 7th 2019

Concepts are a bit hard, but it is nice if you understand it well, especially the Bellman and dynamic programming.

Sometimes, visualizing the problem is hard, so need to thoroughly get prepared.

Instructors

Martha White

Assistant Professor
Computing Science

Adam White

Assistant Professor
Computing Science

About University of Alberta

UAlberta is considered among the world’s leading public research- and teaching-intensive universities. As one of Canada’s top universities, we’re known for excellence across the humanities, sciences, creative arts, business, engineering and health sciences....

About Alberta Machine Intelligence Institute

The Alberta Machine Intelligence Institute (Amii) is home to some of the world’s top talent in machine intelligence. We’re an Alberta-based research institute that pushes the bounds of academic knowledge and guides business understanding of artificial intelligence and machine learning....

About the Reinforcement Learning Specialization

The Reinforcement Learning Specialization consists of 4 courses exploring the power of adaptive learning systems and artificial intelligence (AI). Harnessing the full potential of artificial intelligence requires adaptive learning systems. Learn how Reinforcement Learning (RL) solutions help solve real-world problems through trial-and-error interaction by implementing a complete RL solution from beginning to end. By the end of this Specialization, learners will understand the foundations of much of modern probabilistic artificial intelligence (AI) and be prepared to take more advanced courses or to apply AI tools and ideas to real-world problems. This content will focus on “small-scale” problems in order to understand the foundations of Reinforcement Learning, as taught by world-renowned experts at the University of Alberta, Faculty of Science. The tools learned in this Specialization can be applied to game development (AI), customer interaction (how a website interacts with customers), smart assistants, recommender systems, supply chain, finance, oil & gas pipelines, industrial control systems, and more....

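Frequently Asked Questions

  • Once you enroll for a Certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the course. Your electronic Certificate will be added to your Accomplishments page, where you can print it or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center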