About this Course
67,991 recent views

Course 3 of 4

100% online

Start instantly and learn on your own schedule.

Flexible deadlines

Reset deadlines to fit your schedule.

Intermediate level

Probabilities & Expectations, basic linear algebra, basic calculus, Python 3.0 (at least 1 year), implementing algorithms from pseudocode.

Approx. 18 hours to complete

Suggested: 4-6 hours/week...

English

Subtitles: English

Skills you will gain

Artificial Intelligence (AI), Machine Learning, Reinforcement Learning, Function Approximation, Intelligent Systems

Syllabus - What you will learn from this course

Week 1
1 hour to complete

Welcome to the Course!

2 videos (Total 12 min), 2 readings
2 videos
Meet your instructors! (8 min)
2 readings
Read Me: Pre-requisites and Learning Objectives (10 min)
Reinforcement Learning Textbook (10 min)
6 hours to complete

On-policy Prediction with Approximation

13 videos (Total 69 min), 1 reading, 2 quizzes
13 videos
Generalization and Discrimination (5 min)
Framing Value Estimation as Supervised Learning (3 min)
The Value Error Objective (4 min)
Introducing Gradient Descent (7 min)
Gradient Monte Carlo for Policy Evaluation (5 min)
State Aggregation with Monte Carlo (7 min)
Semi-Gradient TD for Policy Evaluation (3 min)
Comparing TD and Monte Carlo with State Aggregation (4 min)
Doina Precup: Building Knowledge for AI Agents with Reinforcement Learning (7 min)
The Linear TD Update (3 min)
The True Objective for TD (5 min)
Week 1 Summary (4 min)
1 reading
Weekly Reading: On-policy Prediction with Approximation (40 min)
1 practice exercise
On-policy Prediction with Approximation (30 min)
Week 2
8 hours to complete

Constructing Features for Prediction

11 videos (Total 52 min), 1 reading, 2 quizzes
11 videos
Generalization Properties of Coarse Coding (5 min)
Tile Coding (3 min)
Using Tile Coding in TD (4 min)
What is a Neural Network? (3 min)
Non-linear Approximation with Neural Networks (4 min)
Deep Neural Networks (3 min)
Gradient Descent for Training Neural Networks (8 min)
Optimization Strategies for NNs (4 min)
David Silver on Deep Learning + RL = AI? (9 min)
Week 2 Review (2 min)
1 reading
Weekly Reading: On-policy Prediction with Approximation II (40 min)
1 practice exercise
Constructing Features for Prediction (28 min)
Week 3
8 hours to complete

Control with Approximation

7 videos (Total 41 min), 1 reading, 2 quizzes
7 videos
Episodic Sarsa in Mountain Car (5 min)
Expected Sarsa with Function Approximation (2 min)
Exploration under Function Approximation (3 min)
Average Reward: A New Way of Formulating Control Problems (10 min)
Satinder Singh on Intrinsic Rewards (12 min)
Week 3 Review (2 min)
1 reading
Weekly Reading: On-policy Control with Approximation (40 min)
1 practice exercise
Control with Approximation (40 min)
Week 4
6 hours to complete

Policy Gradient

11 videos (Total 55 min), 1 reading, 2 quizzes
11 videos
Advantages of Policy Parameterization (5 min)
The Objective for Learning Policies (5 min)
The Policy Gradient Theorem (5 min)
Estimating the Policy Gradient (4 min)
Actor-Critic Algorithm (5 min)
Actor-Critic with Softmax Policies (3 min)
Demonstration with Actor-Critic (6 min)
Gaussian Policies for Continuous Actions (7 min)
Week 4 Summary (3 min)
Congratulations! Course 4 Preview (2 min)
1 reading
Weekly Reading: Policy Gradient Methods (40 min)
1 practice exercise
Policy Gradient Methods (45 min)

Instructors

Martha White

Assistant Professor
Computing Science

Adam White

Assistant Professor
Computing Science

About the University of Alberta

UAlberta is considered among the world’s leading public research- and teaching-intensive universities. As one of Canada’s top universities, we’re known for excellence across the humanities, sciences, creative arts, business, engineering and health sciences....

About the Alberta Machine Intelligence Institute

The Alberta Machine Intelligence Institute (Amii) is home to some of the world’s top talent in machine intelligence. We’re an Alberta-based research institute that pushes the bounds of academic knowledge and guides business understanding of artificial intelligence and machine learning....

About the Reinforcement Learning Specialization

The Reinforcement Learning Specialization consists of 4 courses exploring the power of adaptive learning systems and artificial intelligence (AI). Harnessing the full potential of artificial intelligence requires adaptive learning systems. Learn how Reinforcement Learning (RL) solutions help solve real-world problems through trial-and-error interaction by implementing a complete RL solution from beginning to end. By the end of this Specialization, learners will understand the foundations of much of modern probabilistic AI and be prepared to take more advanced courses or to apply AI tools and ideas to real-world problems. This content will focus on “small-scale” problems in order to understand the foundations of Reinforcement Learning, as taught by world-renowned experts at the University of Alberta, Faculty of Science. The tools learned in this Specialization can be applied to game development (AI), customer interaction (how a website interacts with customers), smart assistants, recommender systems, supply chain, industrial control systems, finance, oil & gas pipelines, and more....

Frequently Asked Questions

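  • Once you enroll for a certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer-reviewed assignments can only be submitted and reviewed after your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.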

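  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the course. Your electronic Course Certificate will be added to your Accomplishments page, where you can print it or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.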

More questions? Visit the Learner Help Center.