Okay, so we've seen a bunch of different centrality measures, so let's take a look

at an application which can begin to distinguish between them.

And let me emphasize just from the start that when we're doing this application,

it's not designed to show that one centrality measure is always better than

another, but just to show that in a particular

context, we can actually say something systematically about which ones seem to

be working better than others in making some predictions.

And the question that we're going to be looking at is diffusion, and we're

going to be looking at the first contact points in a process.

So there was, in this case, a diffusion process that was started, and we

have some idea of which points in the network were first contacted.

And then we see what the diffusion looks like, and we have a bunch of different

networks going on. And we can try and compare across the

different networks and say, how does the centrality of the nodes predict

how successful the diffusion would be? The eventual diffusion.

So let me put this in context. This is part of a joint project that I've

been involved with for a number of years with Abhijit Banerjee, Arun G.

Chandrasekhar, and Esther Duflo. And what we were looking at in particular

was diffusion of microfinance in 75 rural villages in Karnataka, which is in

southern India. These were villages that were fairly

remote and isolated from outside loan availability initially, and a particular

bank, BSS, entered 43 of these villages and offered microfinance to them.

And we went in and surveyed the villages and mapped out social networks before

the lending agency went into these villages, and then we tracked the

microfinance participation over time. So we've got diffusion over time.

And we know the initial points that they touched, who they

first told about microfinance. So the bank would come into a village, and

what they did in each

village was identify a particular set of people that they should talk to first:

shopkeepers, teachers, self-help group leaders, people that they thought might

be well connected in a village. And then they told those people, look, we're

going to come in and we're going to offer loans.

We'll be back in a couple of weeks. Tell your friends about it and have them

spread the news, and then in a couple of weeks we'll come back and tell you more

about it. And then over time they kept coming back

every two weeks, and then people could join the loan program and so forth.

And across these different villages, in some villages, they would get an eventual

participation rate in the loan program in, say, the mid-40s.

About 44% was the highest of any village.

The lowest of any village was about 7%. And one thing we can ask is, did it

matter which points, which people they talked to in a village first?

So it might be that in village number one the teacher's a very central individual,

but it happens that in village number 12 the teacher's not a very central

individual. So if you talk to the teacher in both

villages, then in one village you're talking to a very central individual, and in

another village you're talking to a not very central individual.

Then, does that make a difference in terms of what the eventual microfinance

participation rate was? Does it make a difference in how much news

got out? So we have 43 different villages and we

can look at how central those nodes are and we can use different notions of

centrality that we've looked at and see which ones work well and which ones

don't. So, just to picture Karnataka here:

the slide got a little distorted, but this is the area of

Karnataka here. It's all within a couple of

hundred kilometers of Bangalore in southwestern India.

And when we look at the different villages, in each village we mapped out a

full series of networks. So this one is: if you had to borrow 50 rupees for a day, who

would you borrow them from? So we've got a borrowing question.

If we blow this up a little bit so you get a better picture, then what we've

got is each little collection of dots here is a household, and the arrows

indicate who they said they would borrow from. So somebody in this household said

they would borrow from somebody in this household and so forth, so we end up with

a borrowing network. We asked a series of different questions; we actually have 13

different networks in total. Who do you go to temple with?

Who would you go to for advice? Who comes to you to borrow kerosene?

Who would you go to in an emergency for medical help?

So we have a whole series of different questions. And we can then aggregate these

up and say that two households are connected,

that they could talk to each other, if they answered yes to any of these questions.

And we can work with the networks in different ways, but let's take

an undirected version of this, where we aggregate things at the household level and say

that two households are connected if they either borrow kerosene or would go to

each other for medical help, or would borrow rupees from each other, et cetera,

et cetera. Okay, so we've got networks.
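
Just to make that aggregation step concrete, here is a minimal sketch in Python. It assumes each survey question is stored as a list of (household, named household) pairs; the function name `aggregate_undirected` and the toy household labels are made up for illustration and are not from the study's data or code.

```python
from itertools import chain


def aggregate_undirected(layers):
    """Union several directed edge lists into one undirected adjacency map.

    Two households are linked if either one named the other in any layer.
    """
    adjacency = {}
    for source, target in chain.from_iterable(layers):
        if source == target:
            continue  # skip answers that point back to the same household
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)
    return adjacency


# Hypothetical toy answers: one edge list per survey question.
borrow_money = [("h1", "h2"), ("h2", "h3")]
borrow_kerosene = [("h3", "h1")]
medical_help = [("h4", "h2"), ("h2", "h1")]

network = aggregate_undirected([borrow_money, borrow_kerosene, medical_help])
```

Here `network["h2"]` ends up containing h1, h3, and h4, because a link in either direction on any question counts as a connection.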

We've got a lot of other information: demographics,

the microfinance participation over time, number of households and their

composition, age, gender, subcaste, religion, profession, education levels, a

bunch of other things we can control for: caste information, wealth variables,

participation rates in self-help groups, ration cards, voting behavior,

and a whole series of other things. Okay?

So now we want to see whether centrality makes a difference in the

diffusion of this loan program. And so what we can begin to do is start

with, say, degree centrality. So, here, if this were what

we saw in a village, then this individual and this

individual would be the most central individuals in the village.

And if you hit those individuals, you would expect to reach more people just

because they have higher degree. So one hypothesis is that

in villages where the first contacted individuals have more connections, so

higher degree centrality, there should be a better spread of information

about microfinance, and more people knowing should lead to

higher participation. So basically, high degree centrality of the first nodes

should mean high microfinance participation.
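
As a rough sketch of the quantity in this hypothesis, the average degree of the first contacted individuals in one village could be computed like this. The adjacency-map layout, the function names, and the tiny example village are all assumptions for illustration, not the study's actual data.

```python
def degree(adjacency, node):
    """Number of distinct neighbors of a node in an undirected network."""
    return len(adjacency.get(node, ()))


def average_leader_degree(adjacency, leaders):
    """Mean degree over the first contacted individuals (the leaders)."""
    if not leaders:
        raise ValueError("need at least one leader")
    return sum(degree(adjacency, leader) for leader in leaders) / len(leaders)


# Hypothetical village: the teacher has three links, the shopkeeper has one.
village = {
    "teacher": {"a", "b", "c"},
    "shopkeeper": {"a"},
    "a": {"teacher", "shopkeeper"},
    "b": {"teacher"},
    "c": {"teacher"},
}

avg = average_leader_degree(village, ["teacher", "shopkeeper"])  # (3 + 1) / 2
```

Doing this once per village gives one number per village, which is what would be compared against that village's eventual participation rate.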

Okay, so what do we see in the data? Here is the average degree of the first

contacted individuals, which we call Leaders here.

So these are the degrees of the first contacted teachers, self-help group

leaders, and shopkeepers in the village. And here, on this axis, is the eventual

participation rate of the village. So, each one of these dots is a village.

So for instance, this village had a 7% participation rate.

So fairly low participation. And the average degree of the leaders was

about 17. This village over here had an average degree

of leaders of about 21, and a participation rate of 44%.

And so we've got a bunch of things. If you fit a best fit line through this,

actually it doesn't look like there's any relationship.

And if anything, the slope is actually negative.

So it doesn't appear as if degree centrality really captures what's going

on. Okay, so maybe we need another centrality

measure. Let's have a look. Again,

when we talked about eigenvector centrality, we realized that looking at

degree doesn't tell a lot of the story because it doesn't capture how well you

are positioned in a network. And so if we look at eigenvector

centrality, where we have the centrality being proportional to the sum of the

centralities of your neighbors, then we are getting something which reflects this

better connectedness, as we talked about in the last lecture.
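
To see that definition at work, here is a minimal sketch that computes eigenvector centrality by power iteration. It assumes a connected, undirected network stored as an adjacency map in which every neighbor also appears as a key; the function name and the five-node path example are made up for illustration. The update adds each node's own score before summing its neighbors', which is a common trick to keep the iteration from oscillating and does not change the vector it converges to.

```python
import math


def eigenvector_centrality(adjacency, iterations=1000, tol=1e-10):
    """Scores proportional to the sum of the neighbors' scores.

    Power iteration on (A + I): the added identity avoids oscillation on
    bipartite graphs and leaves the leading eigenvector of A unchanged.
    """
    nodes = sorted(adjacency)
    score = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        new = {n: score[n] + sum(score[m] for m in adjacency[n]) for n in nodes}
        norm = math.sqrt(sum(value * value for value in new.values()))
        new = {n: value / norm for n, value in new.items()}
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score


# Hypothetical path network 1 - 2 - 3 - 4 - 5.
path = {
    "1": {"2"},
    "2": {"1", "3"},
    "3": {"2", "4"},
    "4": {"3", "5"},
    "5": {"4"},
}

centrality = eigenvector_centrality(path)
```

Nodes 2, 3, and 4 all have degree 2 here, but node 3 gets the highest score because its neighbors are themselves better positioned, which is exactly the distinction degree misses.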

Okay, so let's have a look and see if eigenvector centrality does a

better job. So, revisit our hypothesis.

In villages where the first contacted people have higher eigenvector

centrality, there should be a better spread of information about microfinance.

And more people knowing should lead to higher participation.

So let's have a look. And indeed, when we now take

the average eigenvector centrality of the leaders

and plot that against the participation rate on this other axis,

we get a significantly positive and strong relationship.

So having better placed leaders in terms of eigenvector centrality does a

reasonably good job of predicting the eventual microfinance participation,

whereas the degree centrality didn't seem to pick things up.

And the idea here is, why is eigenvector centrality working better?

Because this communication is a repeated process.

You tell your friends. They have to tell their friends.

And so forth. So if you have well-positioned friends,

and they have well-positioned friends, that is good for diffusion.

And eigenvector centrality is measuring that, whereas degree centrality is not.

You can do the regression:

regress microfinance participation on a series of variables.

If we look at the eigenvector centrality of the leaders compared to the degree of the

leaders and regress microfinance participation on these variables,

we get a positive and significant relationship between the eigenvector centrality of the

leaders and microfinance participation, and a slightly negative and insignificant

relationship for the degree centrality. So indeed, eigenvector centrality seems

to be doing a better job. We can look at a bunch of

different notions, so here we look at regressing microfinance participation on different

notions of centrality: eigenvector centrality, degree, closeness, Bonacich,

betweenness. Here, what we're also doing is

controlling not only for the centrality, but also keeping track of other things.

You know, some villages are going to be larger, so they might have larger numbers

of people. Some might have more people who

participate in self-help groups, which

means they're already more prone to be borrowing and lending from each other.

We have variables on savings, we have

caste variables; we can look at a whole series of different variables and control

for those, and that takes some of those things out.

And again, eigenvector centrality does well. Now degree turns out to be positive once

we control for these variables, but still insignificant compared to its standard

error. Eigenvector centrality is the one which

turns out to be positive and significant; the other ones turn out not to be

significant. So you know, this is just one

application, but it's one application where, if we have a very particular

question in mind and we look at which of the centrality measures correlates with

the eventual outcome, eigenvector centrality is the one that's correlating in a

positive way, and the other ones are not correlating significantly once we've

controlled for a bunch of other variables.
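
For anyone who wants to see the mechanics of that kind of regression, here is a minimal sketch of ordinary least squares with two regressors and an intercept, in plain Python. The numbers are constructed, not the study's data: participation is built to depend only on the leaders' eigenvector centrality, so we know what the fit should recover. The helper names `solve` and `ols` are made up, and the sketch reports coefficients only; judging significance would also need standard errors, which are not computed here.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with pivoting."""
    size = len(matrix)
    rows = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(size):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][size] / rows[i][i] for i in range(size)]


def ols(regressors, outcome):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    design = [[1.0] + list(row) for row in regressors]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, outcome)) for i in range(k)]
    return solve(xtx, xty)


# Constructed villages: (leaders' eigenvector centrality, leaders' degree).
villages = [(0.05, 10), (0.10, 14), (0.08, 20), (0.12, 12), (0.15, 18)]
# Participation is built from eigenvector centrality alone, by design.
participation = [0.05 + 2.0 * eig for eig, _ in villages]

intercept, eig_coef, degree_coef = ols(villages, participation)
```

On this constructed data the fit puts essentially all of the weight on eigenvector centrality and essentially none on degree, which mirrors the qualitative pattern described here without saying anything about the real estimates.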

So this just gives us an idea that these things are measuring different aspects of

the network and sometimes one can be a better predictor than another.

Now, exactly what the causation here is, we can tell stories. I can explain that it

probably has to do with communication, and better connected friends lead to better

communication and so forth, and eigenvector centrality's picking that up.

But this is observational data, so we're not sure what the

causation is. But we do see that different

measures are picking up different things in the data, and that's going to be important.

Now again I want to emphasize here that this does not mean eigenvector centrality

should be your only centrality measure. It just means that in this particular

application, where we were looking at a very specific type of diffusion, it seemed to

be a better correlate than these other standard measures of centrality. And

it depends on which application you're looking at: betweenness seemed

to do a little better at explaining what was possibly going on in the Florentine

marriage data. So depending on which application you're

looking at, it might demand a different centrality measure.