In this module, we're going to walk through and identify some

implementation difficulties with mean-variance portfolio selection.

We're going to walk through three main ideas: one, what happens when there are parameter errors. Two, what happens when you want negative exposure but have to avoid short positions.

And three, what happens when variance is not really the best measure for risk.

There are many aspects of the implementation details of mean-variance

that one could focus on. We chose to focus on the three most

important ones. The first has to do with parameter estimation. The parameters that go into a mean-variance portfolio selection problem are, in practical situations, never known.

The true mean vector and the true covariance matrix of the assets are unknown. All we have is historical data, and we

will have to estimate these parameters using these historical returns.

And as a consequence, we end up making statistical errors.

For the mean vector, the data is often sufficient, but when you start estimating

the covariance matrix, the data is never sufficient.

The reason is that this covariance matrix has on the order of d squared independent parameters. In order to have sufficient data to

estimate these d squared parameters, you have to collect returns over a very long

period, and over this long period, the market parameters shift.
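To put a rough number on this argument, here is a small sketch; the asset count d = 100 and the monthly sampling are assumptions for illustration:

```python
# Count the parameters a mean-variance model needs.
d = 100                        # number of assets (assumed for illustration)
mean_params = d                # one mean per asset
cov_params = d * (d + 1) // 2  # distinct entries of a symmetric d x d covariance matrix
print(mean_params, cov_params)  # 100 5050

# Even 10 years of monthly returns give only 120 observations of the
# d-dimensional return vector, far fewer than the 5050 covariance
# parameters, and over 10 years the market parameters shift.
```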

So, you are sort of playing a game where you are never able to get enough data to estimate these parameters sufficiently. Moreover, the portfolio that you compute turns out to be very sensitive to estimation errors, and we'll focus on this in one of the modules. We're going to show you why this happens, how you could correct for it, and what the current state of the art is for taking estimates and constructing portfolios from them.

We're also going to focus on how one gets negative exposures. The Excel module that goes with the mean-variance theoretical module showed you that, very often, the optimal portfolio has short positions.

Taking on short positions is very dangerous, particularly because it has an unlimited downside: the price could suddenly jump very high, and you would end up losing a lot of money on the short positions. It's for this reason that short selling is often not allowed for wealth managers. One way to get negative exposure is to use

a leveraged exchange-traded fund, or leveraged ETF.

But if you use leveraged ETFs, you have to be very careful.

And in one of the modules, we're going to focus on how ETFs work, what the difficulties associated with them are, and how you should interpret their returns.

Finally, we're going to talk about whether variance itself is a good measure for

risk. Mean-variance portfolio selection focuses

on variance as the risk measure or, equivalently, volatility as the risk

measure. Does it make sense to use this risk

measure? What are the limitations of variance? What can you do to mitigate some of these limitations? That is going to be the focus of another module.

In this module, we will mainly focus on the issues associated with parameter

estimation. And the starting point of this module is

that the true parameters that we are after, which are the mean vector and the covariance matrix of the assets, are never known.

And we are going to use historical returns to compute estimates for the mean return and the covariance matrix. And the easiest way to do that is to

estimate the mean return by the sample average of the returns over some period N.

Once you have the sample average for the mean, you can compute the covariance

matrix by substituting the estimated mean in place of the true mean, to get an estimate for the covariance. What I've done on the plot that goes on

this slide is I simulated the returns using the mean vector and the covariance

matrix given in the spreadsheet that goes with these modules. I simulated 60 months

of data. And using those 60 months of data, I

estimated the mean. Each of the green dots on this plot is an estimated value of the mean using one particular simulation of 60 months' worth

of data. I'm only plotting the estimated mean for

asset 1 and asset 2. The point that I want you to focus on is

that the estimated mean can often be very far away from the true mean.

The true mean has been plotted on this plot with a red square.

Here's where the true mean is. This is a valid estimated mean generated

from 60 months of data. And as you can notice, it's very, very far

away from what the true mean is going to be.
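The experiment just described can be sketched in a few lines; the true mean vector and covariance matrix below are illustrative assumptions, since the spreadsheet's values are not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed "true" monthly parameters for two assets (illustrative only).
mu_true = np.array([0.01, 0.005])
Sigma_true = np.array([[0.0025, 0.0005],
                       [0.0005, 0.0016]])

# Simulate 60 months of returns and estimate the parameters from them.
N = 60
returns = rng.multivariate_normal(mu_true, Sigma_true, size=N)
mu_est = returns.mean(axis=0)              # sample average of the returns
Sigma_est = np.cov(returns, rowvar=False)  # substitutes mu_est for the true mean

print(mu_est)
```

Re-running this with different seeds reproduces the scatter of green dots on the slide: each run is one estimated mean, and it can land far from mu_true.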

What we do know is that if I estimate the mean and I construct the 95% confidence

interval around it, so here is one particular value of the estimated mean,

here is the 95% confidence interval around it.

And because we are talking about two assets, this interval becomes an ellipse.

It's a 95% confidence ellipse. Then, with probability 0.95, the true mean

lies in the ellipse. So, in this particular case, the true mean

barely makes itself into the 95% ellipse. So, the question you should ask yourself

is, does parameter error matter? And in this slide, I want to tell you that

parameter error is often very serious for mean-variance portfolio selection.

And what I'm describing on this slide is the same experiment that I described in the

last slide, taken one step further. I estimated the mean and the covariance

matrix using 60 months of data. So, I take one sample from all those green

dots that I showed you in that slide. I have a mean vector.

I have a covariance matrix. So, I can construct an efficient frontier

using that data. I'm going to call that the estimated

frontier. So, the green line here on this slide,

this one, is the estimated frontier. It's the frontier that has been computed

using an estimate for the mean and estimate for the covariance matrix.

The blue line is the true frontier. This is the frontier corresponding to the

unknown true mean and the unknown true covariance matrix.

The red line is labelled the realized frontier.

What that means is I take a frontier portfolio, on the green estimated

frontier, compute the true mean return on that portfolio and the true volatility of

the portfolio and plot it. And the line that I get from doing that is

the red line. So, this diamond here actually gets moved

to this diamond when you replace the estimated mean with the true mean and the

estimated covariance matrix with the true covariance matrix.

And as you can notice, there is a big gap between what the estimated return on that

portfolio is going to be and what the true return on that portfolio is.
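A sketch of how one point on the realized frontier is obtained: compute a frontier portfolio from the estimated parameters, then evaluate it under the true ones. The three-asset parameters here are assumptions for illustration, not the lecture's data.

```python
import numpy as np

def frontier_portfolio(mu, Sigma, r):
    """Weights minimizing x' Sigma x subject to mu @ x = r and sum(x) = 1,
    found by solving the KKT (Lagrangian) linear system; shorting allowed."""
    d = len(mu)
    A = np.zeros((d + 2, d + 2))
    A[:d, :d] = 2 * Sigma
    A[:d, d], A[:d, d + 1] = mu, np.ones(d)
    A[d, :d], A[d + 1, :d] = mu, np.ones(d)
    b = np.concatenate([np.zeros(d), [r, 1.0]])
    return np.linalg.solve(A, b)[:d]

# Assumed true parameters and a 60-month simulated sample.
mu_true = np.array([0.06, 0.04, 0.05])
Sigma_true = np.diag([0.04, 0.02, 0.03])
rng = np.random.default_rng(1)
R = rng.multivariate_normal(mu_true, Sigma_true, size=60)
mu_est, Sigma_est = R.mean(axis=0), np.cov(R, rowvar=False)

x = frontier_portfolio(mu_est, Sigma_est, r=0.055)  # on the estimated frontier
print("estimated return:", mu_est @ x)   # 0.055 by construction
print("realized return: ", mu_true @ x)  # typically falls short of 0.055
```

Repeating this for a range of target returns r traces out the estimated frontier; evaluating each portfolio under mu_true and Sigma_true traces out the realized frontier.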

The estimated return is around 6.4%, and the true return, or the realized return if

you were to use that portfolio in the market, would be close to 4.4%, a good 2%

drop. Why does this happen?

Is this generic or did it happen just for one of the samples?

In this slide, I'm plotting the estimated frontiers corresponding to 5 different

simulation runs. I simulated 60 months of data 5 different

times, computed the estimated mean and the estimated covariance matrix, and I've

plotted the corresponding estimated frontier.

The green lines on this plot are five different estimated frontiers and as you

can see, these frontiers are extremely unstable.

Not only are the frontiers unstable, the difference between the estimated frontiers and the realized frontiers can also be very large.

So, we want to understand why this happens.

Why is there such a big gap between what the estimated frontier promises and what is actually realized? Why is the estimated frontier so unstable

and is there anything that we can do to remove this gap and remove this

instability? Why is parameter error so serious?

In order to understand this, let's walk through a very simple example.

Suppose I have two identical assets with mean mu, variance sigma squared, and correlation equal to 0. Then the optimal investment for these two assets would be to take a half position in asset 1 and a half position in asset 2.

That's what would give you the least volatility.

Suppose now that the estimates for these returns are slightly off their true

values. So, I estimate the return on asset 1 to be

slightly larger than the true value so it's mu plus epsilon.

I estimate the mean return on asset 2 to be slightly smaller than the true value, mu

minus epsilon. So, on average, I'm making zero error.

On average, the estimator is very good. So, if you were thinking about the

properties of a statistical estimator, you would say that whatever estimator is being used here is pretty good. Across the assets, you're not making a lot

of error. But the problem with mean-variance

portfolio selection is that after I estimate these parameters, I'm going to

optimize my portfolio using these parameters.

So, what happens? I've estimated that the return on asset 1

is slightly larger than the return on asset 2.

And therefore, I will overweight asset 1 as compared to asset 2.

If I'm allowed short positions, then I'm going to short asset 2 and actually start

investing more, take more leverage on asset 1.

But this is precisely the wrong thing to do.

If I take the portfolio that I compute which overweights asset 1 and underweights

asset 2 and put it into the market, I would get a return where the overweighted

asset is going to perform worse than expected: the realized return is going to be mu, below mu plus epsilon. And the underweighted asset, which is asset

2, will perform better than expected. So, instead of having a return mu minus

epsilon, this asset, asset 2, is going to have a return mu, which is an epsilon larger than the expected return, which is mu minus epsilon.

And this performance, this gap between the estimated performance and the realized

performance will become worse as more and more shorting is allowed.
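The two-asset thought experiment can be made concrete. The numbers mu, sigma, epsilon, and the risk-aversion parameter gamma below are assumed, and the unconstrained mean-variance optimum is used so that shorting is allowed.

```python
import numpy as np

# Two identical assets: true mean mu, variance sigma^2, zero correlation.
mu, sigma, eps, gamma = 0.05, 0.20, 0.01, 0.25  # all assumed for illustration
mu_true = np.array([mu, mu])
mu_est = np.array([mu + eps, mu - eps])  # off by +/- eps; zero error on average
Sigma = np.diag([sigma**2, sigma**2])

# Optimum of  mu_est @ x - (gamma/2) x @ Sigma @ x  subject to sum(x) = 1:
# x = (1/gamma) Sigma^{-1} (mu_est - nu * 1), with nu set by the budget.
inv = np.linalg.inv(Sigma)
ones = np.ones(2)
nu = (ones @ inv @ mu_est - gamma) / (ones @ inv @ ones)
x = inv @ (mu_est - nu * ones) / gamma

print("weights:         ", x)            # [1.5, -0.5]: long asset 1, short asset 2
print("estimated return:", mu_est @ x)   # 0.07, inflated by the errors
print("realized return: ", mu_true @ x)  # 0.05, just mu: the gap is pure estimation error
```

Decreasing gamma allows more leverage, and the gap between the estimated and the realized return grows with it, which is exactly the effect described above.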

This is what accounts for the big difference between the estimated

performance and the realized performance. The main difficulty is, we take the

estimated parameters and then optimize. And this optimization procedure inflates

or maximizes the statistical errors in the parameters.

There is a quote which sort of sums up the situation.

Mean-variance results in error-maximizing, investment-irrelevant portfolios.

So, we have to do something in order to make mean-variance portfolio selection

practical. So, one idea that might come out of

looking at this slide is that the performance becomes worse as we allow more

leverage. So perhaps, the idea would be to limit

short positions, not allow short positions at all.

And then, let's see what happens to the performance.

In this slide, I'm plotting what happens to the estimated frontier, which is the

green line, and the realized frontier, which is the red line, when you have a

no-short sales constraint. And as you can see, the realized frontier becomes very unstable: it has a large part of the curve down here which is actually inefficient. And the reason behind this is that the

feasible region for the portfolios now has a corner.

So, if this is x1 and that is x2, you want x1 and x2 to be greater than or equal to 0, so

you end up getting a corner in the feasible region and this corner causes

problems in portfolio selection; it causes instabilities.

As you add more constraints, maybe you have some asset or sector constraints, maybe you have constraints on how much money a particular sector can have, and so on.

All of these become linear constraints. All of these induce more corners and more

instabilities. If you want to get at what the no-short sales constraint was doing, which is to limit leverage, the better thing to do is to directly put a constraint on leverage. And if you put a constraint on leverage,

you end up getting the performance shown in this curve.

Now, the realized performance of the portfolio is pretty close to the true performance. The gap between these two is small.

But the gap between what is expected and what is realized is still very

large. So, I expect to perform on the green line

based on the data, but the realized performance is

going to be the red line. Remember, this blue line is actually not

known in practice so even though the true performance and the realized performance

are very close, I have no way of knowing how well I'm performing.

So, leverage constraints do work well in practice but still, the estimated frontier

is very bad, and so there needs to be some work in trying to bring that down.

The state of the art right now is something called robust portfolio

selection. In robust portfolio selection, what one does is remove the target return constraint, which is imposed with respect to the estimated value of the mean, and replace it with a target return constraint with respect to the worst possible mean in the confidence region.

So, let Sm denote the confidence region for the mean.

A few slides back, I showed you that the confidence region is an ellipse.

So, instead of using a target return constraint, which says to take the estimated value of mu transpose x and insist that it should be greater than or equal to r, we're going to replace it with these constraints.

And what do these constraints say? They say: you choose your portfolio x, and the

return that you're going to get is going to be the worst possible return in the

confidence region. Any point in the confidence region is

possible and, therefore, this worst return is something that you could possibly see

in the market. And now, instead of that constraint on the

target return, I'm going to put a constraint that this minimum value must be

greater than or equal to r. I can do portfolio selection with this

constraint. It's a little bit harder but not much

harder. And now, the picture I end up getting

looks like the plot here. The estimated frontier starts coming down.
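For an ellipsoidal confidence region, the worst-case mean return appearing in this constraint has a closed form, which is what makes the robust problem tractable. A sketch with assumed numbers; kappa sets the radius of the confidence region:

```python
import numpy as np

def worst_case_return(x, mu_est, Sigma_est, N, kappa):
    """Minimum of mu @ x over the ellipsoid
    { mu : (mu - mu_est)' (Sigma_est/N)^{-1} (mu - mu_est) <= kappa^2 }.
    Closed form: mu_est @ x - kappa * sqrt(x' (Sigma_est/N) x)."""
    return mu_est @ x - kappa * np.sqrt(x @ (Sigma_est / N) @ x)

# Assumed illustrative numbers, not the lecture's data.
mu_est = np.array([0.06, 0.04])
Sigma_est = np.array([[0.04, 0.01],
                      [0.01, 0.02]])
x = np.array([0.5, 0.5])
kappa = np.sqrt(5.99)  # 95% region: chi-squared quantile, 2 degrees of freedom

wc = worst_case_return(x, mu_est, Sigma_est, N=60, kappa=kappa)
print(wc)  # strictly below the estimated return mu_est @ x = 0.05

# The robust target-return constraint requires wc >= r instead of
# mu_est @ x >= r, which is what drags the estimated frontier down.
```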

Why does this happen? This happens because now, I'm putting the

worst case. So, I have the estimated value of mu; this

could be the estimated portfolio performance.

But because now I have put the worst case constraint, this gets dragged down.

The realized performance also becomes bigger than expected.

So, that starts getting pulled up and therefore, the gap between these two

starts to become very small. There are issues with this technology.

You can sometimes get portfolios which are not very interpretable, and therefore this technology is having a little difficulty getting traction.

But over time, either this technology directly or some version of it is likely to become very practical. All of these methods were focused on

trying to improve the optimization strategy.

There is a flip side to this methodology, where one tries to improve the estimation

strategy. So, here are some methods that people have used to improve parameter estimates. Among the most popular are the so-called shrinkage methods. And what one does in these shrinkage

methods is that one shrinks the estimate toward some global quantity.

These were introduced by Charles Stein in 1961.

There's a paper by James and Stein, and more recently, Ledoit and Wolf have

extended this to the case of covariance matrices, among other settings.

So, let's take the case of the mean. Earlier, I would have estimated each of

the asset means separately. So, mu est i stands for the estimated mean

for asset i. Now, in the shrinkage technology, instead

of just estimating this asset mean, I'm also going to estimate a global mean,

global average mean on the assets. And there is a reason why I put this

estimation outside this bracket. And the reason for that is when I estimate

this quantity, I don't simply take the estimate for all of the d assets and add

them up. I assume that all the assets have the same

expected mean and use the data of all the assets to estimate that mean.

As a result, I have more data when I'm estimating the total mean, than when I'm

estimating a given asset's mean. As a result, I expect that the error in

this global mean is smaller. So, error is smaller.

And the error is larger in individual means.

Now, this shrunk estimate, what it does is it takes the estimate for a particular asset and the estimate for the global one, let's just call it mu bar, and it moves along the line between them by some amount alpha. When alpha is equal to 1, it's up here.

When alpha is equal to 0, it's down here. For some intermediate value of alpha

between 0 and 1, it's some point over here.

This one has a very small error. That one has a bigger error.

And when you shrink, you end up getting that the error at this point would be

smaller. The tradeoff is, as you decrease alpha and start coming closer to the global mean, you have less information about what the particular asset is going to do, but you have less statistical error. As you increase alpha, you have more information about what the asset is going to do, but you have more estimation error. So, somewhere in between is the best

thing. The next expression is the same kind of idea applied to the covariance matrix. So here, this shouldn't say estimated, but shrunk, and this should say estimated down here.

So, we have a shrunk estimate for the covariance.
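Both shrinkage estimators can be sketched together. The shrinkage weight alpha is held fixed here as an assumption, whereas methods like Ledoit-Wolf choose it from the data:

```python
import numpy as np

def shrink_mean(returns, alpha):
    """Shrink each asset's sample mean toward the grand mean mu_bar,
    estimated by pooling the data of all the assets."""
    mu_est = returns.mean(axis=0)   # per-asset means: larger error
    mu_bar = returns.mean()         # one pooled mean: smaller error
    return alpha * mu_est + (1 - alpha) * mu_bar

def shrink_cov(returns, alpha):
    """Shrink the sample covariance toward a target in which every
    asset has the same variance (and zero correlation)."""
    S = np.cov(returns, rowvar=False)
    target = np.mean(np.diag(S)) * np.eye(S.shape[0])
    return alpha * S + (1 - alpha) * target

# Illustrative data: 60 months, 3 assets.
rng = np.random.default_rng(2)
R = rng.normal(0.01, 0.05, size=(60, 3))
print(shrink_mean(R, alpha=0.5))  # halfway between each asset's mean and mu_bar
print(shrink_cov(R, alpha=0.5))
```

At alpha = 1 you recover the individual estimates; at alpha = 0 everything collapses to the global quantity, which is exactly the tradeoff described above.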

All it does is takes the estimated value for the covariance matrix and shrinks it

towards another covariance matrix where all the assets have the same volatility or

the same variance. Again, the idea is the same: if I want

to compute one variance for all the assets, I have a lot more data, I can

estimate it better, and if I shrink the estimated covariance matrix towards this

global covariance matrix, I end up getting a better estimate, meaning an estimate

with lower errors. Another way to improve parameter estimates

is to use subjective views and the most popular way of doing that is the so-called

Black-Litterman method. Recently, people have been starting to use non-parametric, nearest-neighbor-like methods to estimate performance, and this is because people have started moving away from parametric models like mean-variance and towards more data-driven models. And the idea here is to observe the

current return, r, go back into the past, and find all those times t where the

return is close to the current return. So, this is the current return, this is

the return at some point t in the past. You want to make sure that it's pretty

close to the current return. And for all those times t, find out what

happened at time t plus 1 and use that as a sample of what is going to happen in the

future. These non-parametric methods are currently

at a very theoretical level, but there is a possibility that these methods will

provide a better way of doing portfolio selection in the future.
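The nearest-neighbor idea just described can be sketched as follows; k, the number of neighbors, and the synthetic data are assumptions:

```python
import numpy as np

def nearest_neighbor_samples(returns, k):
    """Find the k past times t whose return vector is closest (Euclidean
    distance) to the current return, and collect what happened at each
    t + 1 as samples of what may happen next."""
    current = returns[-1]
    past = returns[:-1]
    # Distances from every past time t (that still has a successor
    # inside `past`) to the current return.
    dists = np.linalg.norm(past[:-1] - current, axis=1)
    neighbors = np.argsort(dists)[:k]  # indices of the k closest times t
    return past[neighbors + 1]         # the returns at t + 1

# Illustrative data: 200 periods of returns on 2 assets.
rng = np.random.default_rng(3)
R = rng.normal(0.0, 0.05, size=(200, 2))
future_samples = nearest_neighbor_samples(R, k=10)
print(future_samples.shape)  # (10, 2): ten samples of next-period returns
```

These samples can then stand in for a parametric model of the next period's return distribution.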