Welcome to the third course of the Coursera specialization in machine learning in finance.

The course will be on reinforcement learning, one of the most exciting topics in all of machine learning.

What is even more exciting is that this course will talk exclusively about reinforcement learning in finance from the very beginning.

This means that we will not spend any time looking at reinforcement learning algorithms outside of finance.

Many general reinforcement learning courses, which are designed for people who work in robotics, start presenting reinforcement learning algorithms with very simple problems such as the tic-tac-toe game, maze problems, or [inaudible] problems.

In the interest of time,

we will bypass such examples and start studying reinforcement learning directly within financial models.

This will take us to our goals much faster.

If you are specifically interested in using reinforcement learning to solve, for example, maze problems, you will understand how that is done as well, after we introduce reinforcement learning models directly in the financial context.

This course will continue from where we left off in the previous course, but this time we will go into more detail and at a different pace.

And this is because in this course we will be looking at the details of reinforcement learning algorithms.

We will have a bit more math in this course, so please bear with me on that.

If you find yourself confused by some mathematical details at some point, please recall what we discussed about math and reading papers in our introductory course.

If you find some points difficult to grasp in real time, you can always slow down the playback speed, or stop and look at the formulas, or derive some of the formulas that I will be showing.

I will also provide links to the literature that might be helpful for understanding the lectures and doing the homework and the course project.

So, here is what we will be doing in this course.

We will start in the first week with one of the most classical problems of quantitative finance, namely, the problem of pricing financial options on stocks, which are a type of financial derivative.

This is a topic with a huge literature, starting with the celebrated papers by Black and Scholes and by Merton.

If you're not familiar with options,

I will provide background material on that as well.

Now, the reason we start reinforcement learning in finance with options rather than stocks is that, from the RL perspective, the problem we will be solving with an option in the first week is simpler.

In the first week, we will consider a modification of the Black-Scholes model to a discrete-time formulation.

And as we will see later, such models can be reformulated as a Markov decision process (MDP) model with a certain reward function. As we will see, pricing and management of an option amount to solving such an MDP problem.

And this is unlike the original Black-Scholes model where, as we will see later, there is no reward to maximize.

In the second week,

we will consider a dynamic programming approach to solving this MDP problem.

We will build a numerical computation scheme that you will compare with the Black-Scholes model in your homework.
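As a point of reference for that comparison, here is a minimal sketch of the closed-form Black-Scholes price for a European call option, using only the Python standard library. The parameter names (S, K, T, r, sigma) are the usual textbook conventions, not anything specific to this course's homework.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call_price(S, K, T, r, sigma):
    """Black-Scholes price of a European call.

    S: spot price, K: strike, T: time to maturity (years),
    r: risk-free rate, sigma: volatility.
    """
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# Example: at-the-money call, one year to maturity.
print(round(bs_call_price(100, 100, 1.0, 0.05, 0.2), 4))
```

A discrete-time numerical scheme like the one built in week two should converge to values close to this closed-form price as the time grid is refined.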

Then, in the third week,

we will learn about some of the main algorithms of reinforcement learning.

We will first introduce Q-learning, one of the most famous algorithms of RL.

And then we will introduce its extension, called fitted Q iteration.

In your homework for this week, you will work with these algorithms and see how they can be used to price an option in a reinforcement learning way.
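To give a flavor of what Q-learning does before we study it formally, here is a minimal sketch of the tabular Q-learning update rule on a toy two-state, two-action problem. The toy environment is purely hypothetical and is not a finance model from this course; it only illustrates the update Q(s, a) ← Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).

```python
import random

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One Q-learning step: move Q(s, a) toward the bootstrapped target."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])

# Toy example: 2 states, 2 actions; only action 1 in state 0 pays reward 1.
random.seed(0)
Q = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(500):
    s = random.randrange(2)
    a = random.randrange(2)           # explore uniformly at random
    r = 1.0 if (s == 0 and a == 1) else 0.0
    s_next = random.randrange(2)      # toy random transitions
    q_learning_update(Q, s, a, r, s_next)

# After training, the rewarded action should look better in state 0.
print(Q[0][1] > Q[0][0])
```

The same update, with states and rewards coming from a discrete-time option-hedging model instead of a toy environment, is the idea behind pricing an option "in a reinforcement learning way."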

The fourth week of this course will be devoted to applications of reinforcement learning to dynamic management of stock portfolios.

And here we are talking not about just one,

but a few classical quantitative finance problems,

such as the Markowitz optimal investment portfolio and optimal stock trading.

And as we will see, these problems are more complex than the problem of pricing a single option, and they require some new methods.

And such methods are needed to deal with the high-dimensional state and action spaces in such problems.

These methods include inverse reinforcement learning, as well as other methods, for example, a method called G-learning, and some other interesting and useful approaches.

We will study these methods in the last week, and you will apply them later in order to learn optimal trading in your final course project.

So, good luck with this course and let's get started.