Now, let's summarize what we learned in the four weeks of this course.

What we did in this course was take

two of the most fundamental problems of Quantitative Finance,

and show that not only can Reinforcement Learning be used for these problems,

but that these problems themselves can be

formulated as Reinforcement Learning tasks.

We started with the problem of option pricing and hedging,

and showed how this problem can be

reformulated as a discrete-time Markov Decision Process.

Then we learnt about methods of Reinforcement Learning,

such as Q-Learning and Fitted Q-Iteration,

and saw how they apply to this model.

We found that the whole scheme, in

the case of a single option, is computationally rather simple.

To compute the optimal price and hedge of an option,

you only need a bunch of linear regressions and nothing else.
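To make the "bunch of linear regressions" point concrete, here is a minimal sketch of one backward step of Fitted Q-Iteration. All the data here is synthetic and the basis choice is illustrative, not the one from the course: the Q-function is expanded in basis functions of the state, and the expansion coefficients are found by ordinary least squares, i.e. a plain linear regression.

```python
import numpy as np

rng = np.random.default_rng(0)

n_paths, n_basis = 500, 4
X = rng.normal(size=n_paths)  # state variable on simulated Monte Carlo paths

def basis(x):
    # Simple polynomial basis 1, x, x^2, x^3; a real model might use splines.
    return np.vander(x, n_basis, increasing=True)  # shape (n_paths, n_basis)

# Regression target: reward at time t plus the (discounted) next-step value.
# Here it is just a synthetic linear signal with noise, for illustration.
y = 1.0 + 0.5 * X + rng.normal(scale=0.1, size=n_paths)

A = basis(X)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # the "linear regression"
Q_hat = A @ coef                              # fitted Q-values at time t
print(coef)
```

Repeating this regression backward in time, one step per time slice, is the whole Fitted Q-Iteration scheme in this setting.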

The model learns directly from trading data for

a replicating portfolio of such an option, because it is a Reinforcement Learning model.

We used Model-Based Reinforcement Learning,

where the reward function is a quadratic function of its argument,

and this is what made the model very tractable.
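A quick sketch of why a quadratic reward buys tractability (the coefficients below are made up for illustration): if the expected reward in the action is quadratic with a negative quadratic term, the optimal action is found analytically by setting the derivative to zero, so no search or function approximator is needed.

```python
import numpy as np

# Illustrative quadratic reward in the action a:
#   R(a) = c0 + c1 * a - 0.5 * c2 * a**2,  with c2 > 0,
# so dR/da = c1 - c2 * a = 0 gives the closed-form maximizer a* = c1 / c2.
c0, c1, c2 = 0.2, 1.5, 3.0
a_star = c1 / c2  # optimal action in closed form

# Numerical sanity check against a brute-force grid search.
grid = np.linspace(-5.0, 5.0, 100001)
R = c0 + c1 * grid - 0.5 * c2 * grid**2
print(a_star, grid[np.argmax(R)])
```

Because the maximization is a one-line linear solve at every time step, the whole backward recursion stays a sequence of regressions and closed-form updates.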

In particular, we do not need

fancy function approximation methods such as neural networks for such simple settings.

Then, in this week of the course,

we looked at a different and very large class of portfolio optimization problems.

Here again, we first built

a simple portfolio model that can be applied to all such tasks,

including optimal portfolio liquidation,

optimal portfolio investment, and index tracking.

And using this model,

we learnt new topics in Reinforcement Learning itself,

including stochastic policies and Entropy-Regularized Reinforcement Learning.

Such Entropy-Regularized Reinforcement Learning is very useful for learning in

noisy environments, which is almost always the case in Finance.
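One way to see what a stochastic policy looks like in this setting is the Boltzmann (softmax) distribution over actions, which is the form the optimal policy takes in entropy-regularized RL. The Q-values and temperatures below are invented for illustration; the temperature plays the role of the weight on the entropy bonus.

```python
import numpy as np

def boltzmann_policy(q_values, temperature):
    """Stochastic policy pi(a) proportional to exp(Q(a) / T)."""
    z = np.asarray(q_values, dtype=float) / temperature
    z -= z.max()          # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()    # normalize to a probability distribution

q = [1.0, 1.2, 0.8]       # toy Q-values for three actions
for T in (0.1, 1.0, 10.0):
    print(T, np.round(boltzmann_policy(q, T), 3))
```

At low temperature the policy concentrates on the greedy action; at high temperature it spreads probability mass across actions, which keeps the policy from overfitting to noise in the observed rewards.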

And finally, we saw how we can do Inverse

Reinforcement Learning with the same dynamic portfolio model,

and how we can use it to infer market views and values of

private signals, much in the spirit of the famous Black-Litterman model.

Because the model is quite simple,

we again managed to proceed without

invoking sophisticated function approximation methods such as neural networks.

As we will see in our next course,

other problems of Reinforcement Learning in Finance

do use neural networks,

and in particular deep neural networks, producing Deep Reinforcement Learning.

We'll talk more about Deep Reinforcement Learning

in the next course of this specialization.

And for now, I wish you good luck for your course project,

and see you in the next course.