In this module we're going to discuss a very topical problem. Namely, how should average returns be computed? We'll see why this is an important question and why there can be different answers to it. We'll also see which answer is more appropriate for investors.

Here's an example. Suppose an investment fund delivers the following performance. In year 1, it returned 20%. In year 2, it returned minus 10%. What is the average annual return of the fund? Well, it's going to be 20% minus 10%, divided by 2, which is equal to 5%. So we can say the average annual return is 5% here.

But suppose I change things just a little bit. Actually, I won't change anything, I'm just going to give you a little bit more information. Consider this example. The exact same fund, the exact same performance in years 1 and 2, plus 20% and minus 10%. But now I also tell you how many dollars were invested in the fund. In year 1 there was 1 million dollars invested in the fund. In year 2 there was 10 million dollars invested in the fund. Now I'm going to ask you the same question. What is the average annual return of the fund?

Well, it's not clear any longer, because there are two possibilities. We could say it's 5% as before, where we just take the average of 20 and minus 10. Or we could choose to compute a dollar-weighted average return. If you look at this, you can see there's 1 million dollars which received a return of 20%, and there's another 10 million dollars which received a return of minus 10%. So if I look at the average return to each dollar, then the dollar-weighted number is the correct answer. It's 1 million times 20%, minus 10 million times 10%, all divided by the total of 11 million, and I get minus 7.27%. So in this case the average annual return is actually much smaller.

So we have 5%, or minus 7.27%. And the question is, which return is more compelling, if any? Why is this important? Well, it is important because investors care about returns to their dollars.
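The two answers above can be checked with a few lines of Python. This is only a minimal sketch of the arithmetic in the example: the function names `simple_average` and `dollar_weighted_average` are made up for illustration, and the inputs are just the two years of returns and dollars invested quoted above.

```python
def simple_average(returns):
    """Plain average of the per-year returns, ignoring dollars invested."""
    return sum(returns) / len(returns)


def dollar_weighted_average(returns, dollars):
    """Average return per dollar: each year's return is weighted
    by the number of dollars invested in that year."""
    total_dollars = sum(dollars)
    return sum(r * d for r, d in zip(returns, dollars)) / total_dollars


# Figures from the example: +20% on $1 million, then -10% on $10 million.
returns = [0.20, -0.10]
dollars = [1_000_000, 10_000_000]

print(f"simple average:          {simple_average(returns):.2%}")
print(f"dollar-weighted average: {dollar_weighted_average(returns, dollars):.2%}")
```

The first line printed is the 5% figure and the second is the minus 7.27% figure, so the same two years of performance give averages of opposite sign depending on the weighting.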
And so in fact you could argue that at the aggregate level, investors should care much more about a dollar-weighted return, in which case the dollar-weighted number is the more significant one. To emphasize this claim, consider the following two situations. If you're an aggregate investor, in other words if you take all investors together and ask them which they would prefer, would they prefer situation 1 or situation 2? In situation 1, 1 million dollars was invested in year 1 and earned 20%, and 10 million dollars was invested in year 2 and earned minus 10%. Situation 2 is the reverse: 10 million dollars in year 1, earning 20%, and 1 million dollars in year 2, losing 10%. Well, I think investors in aggregate would far prefer situation 2, because of what happens to their dollars. Investors care about dollars invested and what's going to happen to those dollars. They don't necessarily care about an average annual return of 5% if, in years where the returns were very high, they didn't have many dollars invested, and in years where the returns were very low, they had lots of dollars invested. What they care about is the return on their dollars.

Here's another reason why investors should care about the total number of dollars invested. In financial markets, expected returns often decrease as the dollars invested increase. This is because the liquidity of a market, or the so-called capacity of a trading strategy, is not unbounded. Now this isn't always obvious to the small investor, who only invests in liquid markets and therefore does not move the market. What I'm getting at here is that a small investor might buy some shares in an S&P 500 ETF, or maybe some foreign exchange. Those markets are extremely liquid, so a small investor trading in those markets is not going to move them.
In other words, the act of their trading is not going to have an impact on the market price of those securities. This is not true in general for large investors. The larger they are, the more they tend to move a market. The more illiquid the market, the more they move it. In this case the cost per security increases with the number of securities they buy, and the price they receive per security decreases with the number of securities they sell. So this implies that returns decrease on average as dollars invested increase.

Let me give you a simple example, a gambling example. Suppose we've got 2 teams, team A and team B. Let's suppose that the odds of team A beating team B are 50%, and the odds of team B beating team A are 50%. And let's suppose that the market agrees on these odds. Maybe you're going to Vegas and you want to bet on team A versus team B, and you see these odds in the casino. You, however, think that the probability that team A will win is 75% and that team B will win is 25%. So in this situation you'd like to bet on team A. But you won't be able to bet an unlimited amount. Maybe it's not Vegas, maybe your friend is giving you these odds of 50% and 50%, but they tell you, sure, you can bet, but I'm not going to accept a bet of more than $10. Well, in that case the most you can bet is $10. So this is a very illiquid market. There's not much capacity in the market. The capacity is $10, after which there's no ability to trade anymore.

Believe it or not, financial markets behave like that as well. The more you trade in some of these markets, especially if you're a big investor, the more you move the market against you. And so what happens is you tend to see decreasing returns to dollars invested.

Now, the question of how to compute average returns is important. Depending on how you answer it, certain types of investing can seem much more or much less attractive.
An example of this is the hedge fund industry. In aggregate, hedge funds would prefer to report average returns over time, and in fact they do so. Now that's not to say the hedge funds are being dishonest. They're certainly not. One can just view it as good marketing. Every industry markets itself, and the hedge fund industry is no different. So if they wish to report their returns as average returns over time, then that's fair enough. However, we as investors should be aware of this, and be aware that from our perspective we care more about average net returns per dollar invested. If we measure returns this way, we might get a far different average return than that reported by, say, the hedge fund industry. It's important to be aware of this, because there are very different ways of computing returns, and you get very different answers depending on how you compute them. This has actually caused some controversy and debate. There are some financial blogs out there that discuss this topic. A nice blog and a nice discussion of this topic can be found at this URL here, and I'd encourage you to take a look at it and read the discussion.

Here's another problem with averages. It's not a financial example, but it is a nice example because it demonstrates how easily people can be confused by the way a question is worded. Sometimes the confusion becomes very apparent once it's explained, but in everyday conversation these issues go right by us, and we don't notice that we're calculating the wrong quantity.

So here's a question. Suppose I wish to estimate the average number of children per family in the US. To compute an estimate I do the following. I sample n people randomly, where n may be a very large number. Maybe it's 1,000 or 10,000 or 50,000. For the i th person I determine x i, which is the number of siblings that person has. My estimate, c hat say, is then given by the following. C hat is going to be the sum of the x i's plus 1.
So, this extra 1 is for the person that I sampled. The number of children in that family will be the number of siblings plus the person I sampled. So that's x i plus 1, and then I divide by n. That's my estimate of the average number of children per family in the US. Now let's ignore any minor problems that you might see with this sampling scheme. There's a bigger question here. Does the sampling scheme have a fundamental problem? If so, in what way will c hat be biased? And how does this problem compare to the average return problem? These are the questions we're interested in.

To explain why there's a problem with c hat, consider the following situation. Let's assume there's a universe of 5 families, and for each family we have the number of children. Family number 1, we'll assume, has 4 children. Family number 2 has 3 children. Family number 3 has 2 children, family number 4 has 3 children, and family number 5 has 0 children. So this is our universe. The total number of children is 12, and so the average number of children per family is 12 over 5, which is equal to 2.4. This is the correct answer. 2.4 is the average number of children per family in this universe.

But if I use the sampling scheme from the previous slide, where I sample by child, I'm going to get a different answer. To see this, note the following. There's a total of 12 children. So if I sample by child, then 4 out of 12 times I'm going to sample 1 of the 4 children in family 1. Each of these children will report 3 siblings, which plus themselves gives 4. I've got 2 families with 3 kids, so that's a total of 6 kids. So 6 out of 12 times I'm going to sample a child from one of those two families, and each of those children will say there are 3 kids in their family, including themselves. 2 out of the 12 times I'm going to sample 1 of the 2 children in family 3.
And each of these 2 children will say they have 1 sibling. So 1 plus themselves equals 2, and I get an answer of 2 here. And then 0 out of 12 times I'll sample from family 5, which has no children, so it contributes 0. So I'll get a total equal to, let's see, 16 plus 18 is 34, and 34 plus 4 is 38, over 12. And 38 over 12 is equal to 3 and 1 6th. So the way I compute the average here, sampling by child, I'm going to get an average of 3 and 1 6th, and this is the wrong number. The average I want is 2.4.

So what I've done here is calculate the average incorrectly. I want to know the average number of children per family, so what I should be doing is sampling by family, which is effectively what I did when I computed 12 over 5. Instead, with the sampling scheme I gave you on the previous slide, I'm sampling by person, or by child if you like. By doing that, I get a number that's too large, and that's how c hat is biased. I'm more likely to sample children from large families, as we saw here, so those families will be over-represented, and we'll see a higher average as a result. We get 3 and 1 6th in this case. In fact, an easy way to see this is to note that families with 0 children will never be sampled. If we're ignoring all families with 0 children, it should be clear that our bias is upwards.

And here's another problem that has been very topical recently. It concerns the controversy surrounding waiting times to get through immigration at Heathrow Airport in London. This was a big news story last year, when many people entering Heathrow Airport had to wait a very long time to get through immigration. A lot of newspapers were writing about this problem at the time. It was definitely a source of controversy in Britain.
And so people were interested in estimating the average waiting time of travelers at immigration at Heathrow Airport. One way in which this average waiting time was estimated was as follows. Sample 1 person every hour, compute that person's waiting time, and then take the average over all these people. So maybe there are 16 hours in a day, and we get x 1 up to x 16. We sample 1 person from each hour, find their waiting time, take the average, and then report this as the average waiting time to get through immigration. The question here is, is this a good scheme? I'm not going to answer this question, but you can think about it. I will give you a hint: it's a bad scheme. It has a fundamental problem which is similar to the problem on the previous slide, where we discussed ways to compute the average number of children per family.
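Going back to the children-per-family example, the gap between 2.4 and 3 and 1 6th can be reproduced with a short Python sketch. It uses the same universe of 5 families with 4, 3, 2, 3 and 0 children. The function names `per_family_average` and `per_child_average` are made up for illustration, and the second function computes the expected value of c hat under the assumption that every child is equally likely to be sampled.

```python
from fractions import Fraction

# Universe from the example: number of children in families 1 to 5.
children_per_family = [4, 3, 2, 3, 0]


def per_family_average(families):
    """Sample by family: every family counts once."""
    return Fraction(sum(families), len(families))


def per_child_average(families):
    """Sample by child (assumed equally likely): a family with k children
    is picked k times out of the total, and each pick reports
    siblings + 1 = k, so the family contributes k * k to the numerator."""
    total_children = sum(families)
    return Fraction(sum(k * k for k in families), total_children)


by_family = per_family_average(children_per_family)
by_child = per_child_average(children_per_family)

print(by_family, float(by_family))  # 12/5, i.e. 2.4
print(by_child, float(by_child))    # 19/6, i.e. 3 and 1/6
```

The family with 0 children adds nothing to either the numerator or the denominator of the per-child average, which is the same observation as before: it can never be sampled, so the estimate is pushed upwards.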