0:00

Hi folks. So we're back again, and now we're

looking at eigenvector-based centrality measures, and and again, as, in terms of

what we've looked at for characterizing the position of a node in a network,

we've looked at things like degree. or, you know, local clustering, distance

to other nodes and then in terms of measuring the centrality, influence and

power, one difficulty is that when we've looked at things like degree centrality

it doesn't necessarily capture the importance of the node's friends, so you

know, when we look at this picture, for instance...

the, there, there's sort of two aspects to it.

One is that we're, you know, this node might be much further, from a lot of

nodes than this one. So, things like de, decay and closeness

centrality can capture the fact that it's sort of out in the outskirts.

But another thing is that when we look at the friends of this particular node.

So if we look at the connections of this node they each have degree 2 themselves

whereas this node's friends have degree 6 and 7.

And so in some sense they are better connected than than this other node's

friends. And so the idea of,of eigenvector

centrality is that your importance comes from being connected to other important.

Nodes, and in particular here, if we look at a centrality notion where the

centrality is proportional to the sum of the neighbor centralities, then we're

defining eigenvector centrality. So the node i's centrality is

proportional to the centrality Of the nodes to which I is connected.

So that gives us a, a, an equation which says that CI is dependent on the sum of

the CJs, where what's going to happen here is if GIJ is equal to one, then we

get. The c, j being counted and if it's equal

to 0 then it's not counted so we're just counting the centrality's of the friends

you're connected to. Okay, what's what's interesting here now

is that this is a system of equations and a system of unknowns.

So we have That's in order to figure out what Ci is we have to figure out what all

the Cj's are and so this is a set of equations and a set of unknowns and so

basically what we have is that the vector C is equal to sum a times the matrix g

times the vector C. And so this is what's known as an

Eigenvector. Right?

So it's vector where you multiply the matrix by that vector and you get back

something which is proportional to the original vector you started with.

So this a is just some proportional constant which tell you how you're

scaling and that's what's known as an Eigenvector.

Now we'll say a little more about that in a moment.

but what we can begin to see is that if we calculate the eigenvector centrality

of these, so solving that system of equations and unknowns and looking for

non negative solutions what we'll find is this one gets a 0.31.

This one get a 0.11. This one is essentially three times as

important according to this Eigenvector Centrality measure of importance.

Okay, so what's happening here is you're getting value from connections to others,

it's a self-referential concept. So, we have to figure out how to solve

that. And in particular what we have is we have

this equation which we just wrote the Eigenvector centrality then of a network

g is proportional to itself times g. And one thing about Eigenvectors is

there's many possible solutions. So if we have n equations and n unknowns.

generally there can be upton N different solutions.

And what we tend to do is look for the one with the largest eigenvalue and

there's a theorem in matrix algebra known as the Perron-Frobenius Theorem, which

says that if we're dealing with a non negative matrix.

Then the eigenvector associated with the largest eigenvalue is going to have

nonnegative values. and in particular in a lot of contexts

we'll be looking at the other eigenvectors wont make sense that might

have some negative values or even nonreal values.

Use. so the conventional beats, so look for

the one which has all, non negative real values.

That's going to be the one with the largest Eigen value.

And, you know, there's a variety of programs that will calculate Eigen

vectors for you, so if you have a matrix...

representing your, your network. You can plug them into different programs

and at least get an approximation of the largest the eigenvalue.

the eigenvector associated with the largest eigenvalue.

then we can sort of normalize the, the entries to sum to one so any eigenvector

if, if you multiply it by two then that's also an eigen vector.

So, so these are all going to be taken up to.

re-scale things and so if we normalize these things and sum to 1 and then, then,

we'll have a well defined definition. Okay so if we begin to look at you know

the Eigen vector centrality of different farmers and the peasant nation data.

Actually these are not quite normalized. but here, you know, we see that again the

Medici come out higher than other families.

Not as strikingly high but that's going to give us a different picture of,

of what's going on in this, in this kind of setting.

now when we talk about eigenvector centrality, there's a number of, of

different concepts that are related to it.

Google page rank, so didn't got for little page, was the system that sort of

helped Google become the search engine that, that really dominated the market

early on. And the idea was that, that when you

tighten something and it had a bunch of pages, that it could show what it have,

whatever term you to type in on it. it had to pick one and the way that it

sort of picked the ordering originally was that it would look for the ones with

the largest eigenvector centrality. So it was trying to look for pages that

were connected to by pages that were important and that was a sign of its

important. And no it's that.

Algorithm is now much more complicated and is evolved over time to do all kinds

of other kind, you know, things people are manipulating the algorithm.

It also could take into account what it thinks people might want and so forth.

but the basic idea here was eigenvector centrality to begin with.

Something known as the random surfer model as well, which is.

You know, suppose you hopped on the web and you just started picking pages at

random. And then so you go to a page and click on

one of it's links at random, go to that page, click on a link at raondom and so

forth. And then look at the fraction of time you

would spend on each page as you went through this process over time.

That's going to be related to an eigenvector calculation.

So eigenvector calculations are ones that encapsulate this idea of, of important

connections to other nodes and then being connected to by big important ones helps

bring more traffic to you. we can go back to the the network we had

before and you know we can put in eigenvector centrality measures, and now

node 3 actually comes out ahead of node 4.

so, so you know depending on which way you do these things, you get different

measures. One last measure which is going to be

very useful and is related to eigenvector centrality is what's known as Bonacich

centrality or Katz- Bonacich centrality. It actually builds on a, a definition

originally by Katz that was later extended by Bonacich, and the idea is

that you give each nodes on base value. And the base value is some some number a

times the degree of the node. Okay, so, so if I have degree 10, and ,

and a is, is a half, then I start with a value of 5.

And so everybody starts with some value proportional to their degree.

And now you add in all paths of length 1 from i to some j times b times j's base

value. So I get some fraction of my neighbors

value. So let's say that b is a third so now

I've got a base value, plus I add in 1 3rd times my friends space values.

Then I go to length 2 and I add in b squared times those peoples values.

So this is you know 1/2 times my friends or sorry 1/3 times my friends values now

this is going to be 1/9 Times friends of friends values, right,.

So I can go out length 2, blocks of length 2.

Some of them might come back to me. so it, does sort of counts these things

doubly. But what it's doing is just counting, so

I get a times my degree which is just g times the vector of 1.

Then I get b times walks of length 2, remember g squared gives us walks of

length 2. so I'm getting b times the connections of

length 2, I have times their base values. Then b squared times length 3 and so

forth. So what we end up with is this

calculation where we're sort of raising g to successive powers, and then weighting

that by some level b. And if we normalize the a here to 1, it's

just multiplying the whole thing. Then basically, you know, this thing is

going to be like a sum of a series, and I'm getting more Value from inderect

walks times the value of those nodes. If you want this thing to ber convergent

so that actually gives you a number, b's going to have to be relatively small in

order for that to be finite. it's going to have to depend on the

magnitude of the matrix. the, the largest Eigen value.

but a small enough B will make sure that this thing converges.

So when we look and Bonacich Centrality, then we can go through and calculate this

for different values. and you know go through and for instance

then, the Medici again appear more important that some of the other

families. so, you know, this kind of calculation

gives you something like Eidenvector centrality, and in fact for b being close

enough to 1 over the, largest eigen value of the matrix, then it's going to begin

to approach eigen centrality, but more generally it can diverge from them.

And depending on what bs you work with you are going to get different notions,

so here if we normalize a to 1 and work with different bs we can get different

values of Bonacich centrality out, and it's going to be another measure of

centrality. Basically we've go a whole sequence of

different measures of centrality. Some of them are going to work better in

some contexts than others. Exactly how we do that and which ones are

going to be right for what contexts depends on, on what we're studying.

And again this sort of four different ideas that we've gone through degree is

just getting basic connectedness of a node.

How, how many neighbors does it have. Then there's things like how close are

you, how easily can you access other nodes.

Then there's connector based definitions. So how important are you as an

intermediary. And then there are these sets of

influence, prestige, and eigenvectors sorts of things that are capturing not

only who you know, but weighting by their importance and doing sort of iterative

calions of calculations that will capture importance of the value of your friends

as measuring your own importance. so what we'll do is, is now we'll have a,

a quick look at one application, which will distinguish which one of these

centrality measures does well. And which ones don't, in a particular

setting. and then we'll start to move on and, and

try and understand a little bit more about formation of networks.