Hi. In this set of lectures we're talking about prediction. And here's the idea: we

want to think of individual people making predictions based on models. Those models

can be based on categories or linear models or Markov models, any of the models

you've learned in this class you could use to make some sort of prediction. And what

we want to talk about is where collective wisdom can come from. So if we have a

whole bunch of people using a whole bunch of different models, how does that enable

the crowd of models to do better at making sense of the world, making an accurate

prediction? Now, the essence of our argument is going to be something called

The Diversity Prediction Theorem. And, the Diversity Prediction Theorem is gonna

relate the wisdom of the crowd to the wisdom of the individual. So, in other

words the accuracy of the crowd, in relationship to the accuracy of the

individuals. Now one logic that should come to you right away is if I had more

accurate individuals, I should also get a more accurate crowd. But a logic that

might not come to you right away is that if I had a more diverse crowd, I should

also get more accuracy. So we can think of it like this: the crowd's accuracy is

going to depend on the individual accuracy plus the diversity. Now the question is

how much these things matter: how much does it depend on individual accuracy, and

how much does it depend on diversity? That's what we want to use a model to figure

out. So let's first do an example just to get our bearings in

terms of what it means for a crowd to make a mistake versus an individual to make a

mistake and what diversity is. So we'll do an extremely simple example. We have three

people: Amy, Belle, and Carlos. And let's suppose that they're predicting the number of

people that come to our diner on a particular day for lunch. And so Amy

predicts it's gonna be ten, Belle predicts it's gonna be sixteen, and Carlos predicts

it's gonna be 25. Now if I add these up, I get 51, and if I divide by three, I

get an average value of seventeen. So the crowd predicts seventeen. Now let's

suppose the actual value is eighteen. Now, whether the crowd is

accurate isn't going to matter for the math, but just for the purpose of the

example I'm gonna make it so the crowd actually does pretty well. I just wanna

work through the logic. So the first thing I wanna do is figure out how

accurate are these people. Well what I can do is I can compute the error of each

person. So remember, the true value was eighteen; that's the number of people that

showed up. And I can ask, what's the error of each individual? And remember, we

computed errors by looking at variation, that is, squared error. So Amy predicted ten, the

truth was eighteen, so her squared error is 64. Belle predicted sixteen, the true

value is eighteen, so her squared error is four. Carlos predicted 25, the true value

is eighteen, so his squared error is 49. And if I add all those up, I get 117, and if

I divide that by three, I get the average error, which is 39. So on average, these

people have a squared error of 39. Some people, like Belle, are really accurate: her error is only

four. Other people, like Amy, are off by quite a bit: her error is 64. But

the average is 39. So this sort of gives us a sense of how accurate the individuals

are. The individuals have squared errors of 64, four, and 49, for an average

of 39. Now we can ask, how accurate was the crowd? Remember, the crowd predicted

seventeen, because that was the average prediction of the three people. The

true value is eighteen, so the crowd's squared error is only one. So the crowd

here is, if you notice, better than anybody in it. So, we get "the wisdom of

crowds." Well, let's try and think about why that makes sense, and to do that we're

going to look at diversity. So diversity is the variation in the

predictions. So how do we compute the variation in the predictions? We look at each

person's prediction and its distance from the mean prediction: not from the true

value, the mean prediction. So the mean prediction was seventeen, so Amy's

contribution to this sort of total variation of predictions is ten minus

seventeen, squared, which is 49. Belle's is sixteen minus

seventeen, squared, which is one, and Carlos's is 25 minus seventeen,

squared, which is gonna be 64. Now if I add all these up, I get 114, and if I divide

by three, I get 38. So the diversity of these predictions is 38. Well, notice this.
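
The arithmetic for the diner example can be checked with a short Python sketch. The variable names here are illustrative and are not from the lecture; the numbers are the three predictions and the true value given above.

```python
# Worked check of the diner example: three predictions and the true value.
predictions = {"Amy": 10, "Belle": 16, "Carlos": 25}
truth = 18

values = list(predictions.values())
n = len(values)

# The crowd's prediction is the average of the individual predictions.
crowd = sum(values) / n

# Squared error of the crowd's prediction.
crowd_error = (crowd - truth) ** 2

# Average of the individuals' squared errors (distance from the truth).
average_error = sum((s - truth) ** 2 for s in values) / n

# Diversity: average squared distance from the crowd's prediction.
diversity = sum((s - crowd) ** 2 for s in values) / n

print(crowd, crowd_error, average_error, diversity)  # 17.0 1.0 39.0 38.0
```

Swapping in any other predictions or true value is a quick way to see how the three quantities move together.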

The crowd's error was one. The average error was 39 and the diversity was 38. So

if I look at that, I get one equals 39 minus 38. The crowd's error in this case equals

the average error minus the diversity. Now, I just set this example up, but it turns out that's

always true. This is what the diversity prediction theorem says: that the crowd's

error equals the average error minus the diversity. Now, this isn't some, you know,

feel-good saying. This is a mathematical fact. This is an identity. So no

assumptions have to be made here. There's no opposite [inaudible]. This is

just true. If I have a set of predictions, it will always be the case that the error

of the crowd, that is, the squared error of the average prediction,

is going to equal the average squared error of the people in

that crowd, minus the diversity of their predictions. Now the way to write that

formally is like this. Now this looks pretty scary, but let's just walk through

it. So let's let c be the crowd's prediction. Let theta be the truth, so theta is

equal to the true value. And so this first term, c minus theta, squared, is the crowd's

squared error. So it's the distance from the crowd to the truth. Let's let s_i

here equal individual i's prediction. And so this next term

is i's prediction minus the truth, squared. And then we sum that

all up over all the individuals, and we divide by n, the number of individuals. So

that's just gonna be the average error. So crowd error equals average error

minus... now we take each person's prediction minus the crowd's prediction,

which is c, remember, because c is the crowd, and we square it. So this tells us how far people are

from the crowd on average. We sum those up and divide by n. So we get the crowd's

error equals the average error minus diversity. Now if you take this equation

and expand all the terms and cancel everything out, you'll see that it's an

identity. It's a mathematical identity. So it's always true. Crowd's error equals

average error minus diversity. Let me give a famous example to sort of drive this home.
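
Before that example, here is the identity just described, written out in symbols: c is the crowd's prediction, theta is the true value, s_i is individual i's prediction, and n is the number of individuals. The second display is a sketch of the "expand and cancel" step, added here for reference rather than taken from the lecture slides.

```latex
\[
\underbrace{(c-\theta)^2}_{\text{crowd error}}
= \underbrace{\frac{1}{n}\sum_{i=1}^{n}(s_i-\theta)^2}_{\text{average error}}
- \underbrace{\frac{1}{n}\sum_{i=1}^{n}(s_i-c)^2}_{\text{diversity}},
\qquad c=\frac{1}{n}\sum_{i=1}^{n}s_i .
\]

% Expansion: write s_i - \theta = (s_i - c) + (c - \theta) and square.
\[
\frac{1}{n}\sum_{i=1}^{n}(s_i-\theta)^2
= \frac{1}{n}\sum_{i=1}^{n}(s_i-c)^2
+ 2(c-\theta)\,\frac{1}{n}\sum_{i=1}^{n}(s_i-c)
+ (c-\theta)^2 .
\]
% The middle term vanishes because \sum_i (s_i - c) = 0 by the definition of c,
% which leaves: average error = diversity + crowd error.
```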

So in a book called The Wisdom of Crowds by Jim Surowiecki, he talks about the 1906

West of England Fat Stock and Poultry Exhibition. At this exhibition, 787

people guessed the weight of a steer. Their average guess was, I think, 1,197 pounds; the

actual weight of the steer was 1,198 pounds. So they're only off by a pound. So you're

looking at it and saying, oh my gosh, that's amazing, that's the wisdom of crowds. But

let's think about it, what's going on? We've got a bunch of predictions, there's

a true value, there's an average value, so our theorem, this diversity prediction

theorem, must hold. And, in fact, if you take Galton's data and you plug it all

in, here's what you get. The crowd's error is actually a little bit less than a

pound: it's 0.6. The average error is 2,956. Now, wait, that seems crazy, 'cause

remember, the steer only weighs 1,100 pounds. So if this thing weighs 1,100

pounds, how could they be off by 2,956? Well, remember, these are squared errors, so if I

square 50 I get 2,500 and if I square 60 I get 3,600. So this is probably 55, 56

squared. Something like that. Well, that makes sense because people could probably

guess the weight of a steer within about 55, 56 pounds. Well, why is that? Well

think about it. A steer's five times the size of a person. If you can guess the

weight of a person within about ten pounds, you can probably guess the weight

of a steer to within about 50 pounds. So what you've got is, you

know, people who are reasonably good at guessing the weight of steers. They're not

geniuses, but they're also not crazy. They're not guessing 15,000 pounds. So

these are reasonably knowledgeable people who, for whatever reason, are

making these errors of about 55, 56 pounds. Now what's interesting is that their

diversity is 2,955. So what you get is: the crowd is wise because they're

moderately accurate, they're off by 55, 56 pounds, and they're also diverse, and

it's that accuracy plus diversity that makes the crowd do so well. Now if you

think about this book, The Wisdom of Crowds, you can see a bunch of

examples where that's the case. Let's think about it in the context of our theorem.

So we've got crowd error equals average error minus diversity. Now in this book,

Surowiecki says, here's what matters: diversity matters a lot. Well, why does

diversity matter a lot if we're looking at the wisdom of crowds? Let's see;

actually, the math will tell us why. If you make it into the book, The Wisdom of

Crowds, what has to be true? The crowd error has to be small. The collective error has to be

small. So if the collective error isn't small, it doesn't make the book: it's not

the wisdom of crowds, it's the madness of crowds. So for the wisdom of crowds to

exist, the collective error has to be small. Let's think what

else has to be true. The average error has to be fairly large. Why does it have to be

fairly large? If the average error is small, that means it was an easy thing to predict;

everybody can pretty much get it right. So if it's interesting enough to make a book

called The Wisdom of Crowds, where the crowd is smart and the people aren't, well, if

the people are not smart, that means the average error has to be large. Well, if you've

got something small equal to something large minus something else, this other thing has

to be large, which means diversity has to be large. So when Surowiecki walks through

all these examples and he looks at what's going on, he says, look there's a lot of

diversity. And diversity seems to be a key component in the wisdom of crowds. You

want to encourage people to think about the world in different ways if you wanna

get the wisdom of crowds. And our model explains why that's the case. It's

collective error equals average error minus diversity. If people aren't that

smart, average error is gonna be big. If you want the crowd to be smart, the only

way to get it is by having that crowd be diverse. So if we look back at the

Galton example, where we get 0.6, 2,956, and 2,955, we see that's exactly the

case: small crowd error, you know, fairly large average error because it's not an

easy thing to do, and then high diversity. And if you take examples of wisdom of

crowds from all over the place, you'll see they all look like this. They look exactly

like this: small crowd error, large individual error, large diversity.

The question is, how do you get that diversity? Well, you get that

diversity by people using different categorizations, different linear

models. Maybe people are using entirely different models. Maybe one person's

using a Markov model and one person's using a diffusion model. Maybe one

person's using a linear model. Maybe one person's got a nonlinear term in their model.

There are a lot of different variables. So what you get is this heterogeneity in how

we see the world: in the boxes we use, and the variables we use, and the models that

we construct. That gives diversity to these collective predictions. And

those diverse collective predictions, then, lead to accurate

crowds, provided you've got reasonably accurate people who are reasonably

diverse. And what we've learned, by constructing a very simple model of that

predictive task, is where the wisdom of crowds comes from. And we've learned that

individual ability and collective diversity matter equally; they're equal

partners. So if someone were to say to you, where does the wisdom of crowds come

from? You could say, well, it comes from, you know, reasonably smart people who are

diverse. And you could also ask, where does the madness of crowds come from? How

could it be that a crowd could get something totally wrong? Well, that's not

hard either, 'cause crowd error equals average error minus diversity. Well, if I

want this to be large, if I want large collective error, then I need large average

error, 'cause I need people to, on average, be getting things wrong, and I need diversity

to be small. So the madness of crowds comes from like-minded people who are all

wrong, and again, the equation gives us that result. Alright, thank you.
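
The contrast between a wise crowd and a mad crowd can be illustrated with a small Python sketch. The two crowds and the true value below are made-up numbers chosen for illustration, not data from the lecture, and `decompose` is a hypothetical helper name. Both crowds have the same average individual error; only their diversity differs.

```python
def decompose(predictions, truth):
    """Return (crowd_error, average_error, diversity), all as squared errors."""
    n = len(predictions)
    crowd = sum(predictions) / n
    crowd_error = (crowd - truth) ** 2
    average_error = sum((s - truth) ** 2 for s in predictions) / n
    diversity = sum((s - crowd) ** 2 for s in predictions) / n
    return crowd_error, average_error, diversity

truth = 100  # hypothetical true value

# Hypothetical crowds: every individual is off by 20 in both cases.
diverse_crowd = [80, 120, 80, 120]        # errors point in different directions
like_minded_crowd = [120, 120, 120, 120]  # everyone is wrong in the same way

wise = decompose(diverse_crowd, truth)
mad = decompose(like_minded_crowd, truth)

print(wise)  # (0.0, 400.0, 400.0)
print(mad)   # (400.0, 400.0, 0.0)
```

In the first crowd the diversity cancels the average error completely; in the second there is no diversity, so the crowd is exactly as wrong as its members.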