182,923 recent views

## 10%

100% online

Prerequisites: probabilities and expectations, basic linear algebra, basic calculus, Python 3.0 (at least 1 year of experience), and implementing algorithms from pseudocode.

### What you will learn

• Formalize problems as Markov Decision Processes

• Understand basic exploration methods and the exploration / exploitation tradeoff

• Understand value functions, as a general-purpose tool for optimal decision-making

• Know how to implement dynamic programming as an efficient solution approach to an industrial control problem

### Skills you will gain

Artificial Intelligence (AI), Machine Learning, Reinforcement Learning, Function Approximation, Intelligent Systems

Week 1

## Welcome to the Course!

4 videos (20 minutes total), 2 readings
4 videos
Course Introduction (5 min)
2 readings
Reinforcement Learning Textbook (10 min)
Read Me: Pre-requisites and Learning Objectives (10 min)

## An Introduction to Sequential Decision-Making

8 videos (46 minutes total), 3 readings, 2 quizzes
8 videos
Learning Action Values (4 min)
Estimating Action Values Incrementally (5 min)
Optimistic Initial Values (6 min)
Upper-Confidence Bound (UCB) Action Selection (5 min)
Jonathan Langford: Contextual Bandits for Real World Reinforcement Learning (8 min)
Week 1 Summary (3 min)
3 readings
Module 1 Learning Objectives (10 min)
Chapter Summary (30 min)
1 practice exercise
Sequential Decision-Making (45 min)
Week 2

## Markov Decision Processes

7 videos (36 minutes total), 2 readings, 2 quizzes
7 videos
Examples of MDPs (4 min)
The Goal of Reinforcement Learning (3 min)
Michael Littman: The Reward Hypothesis (12 min)
Examples of Episodic and Continuing Tasks (3 min)
Week 2 Summary (1 min)
2 readings
Module 2 Learning Objectives (10 min)
1 practice exercise
MDPs (45 min)
Week 3

## Value Functions & Bellman Equations

9 videos (56 minutes total), 3 readings, 2 quizzes
9 videos
Value Functions (6 min)
Rich Sutton and Andy Barto: A brief History of RL (7 min)
Bellman Equation Derivation (6 min)
Why Bellman Equations? (5 min)
Optimal Policies (7 min)
Optimal Value Functions (5 min)
Using Optimal Value Functions to Get Optimal Policies (8 min)
Week 3 Summary (4 min)
3 readings
Module 3 Learning Objectives (10 min)
Chapter Summary (13 min)
2 practice exercises
[Practice] Value Functions and Bellman Equations (45 min)
Value Functions and Bellman Equations (45 min)
Week 4

## Dynamic Programming

10 videos (72 minutes total), 3 readings, 2 quizzes
10 videos
Iterative Policy Evaluation (8 min)
Policy Improvement (4 min)
Policy Iteration (8 min)
Flexibility of the Policy Iteration Framework (4 min)
Efficiency of Dynamic Programming (5 min)
Warren Powell: Approximate Dynamic Programming for Fleet Management (Short) (7 min)
Warren Powell: Approximate Dynamic Programming for Fleet Management (Long) (21 min)
Week 4 Summary (2 min)
Congratulations! (3 min)
3 readings
Module 4 Learning Objectives (10 min)