0:01

Hello again, folks. So, let's talk a little bit about other kinds of growing random network models. In particular, we're going to talk about a hybrid model that spans between the uniform-at-random models and the preferential attachment model. The idea here is that, again, we're looking at random network models, and now what we want to do is try to match degree distributions that are out there in the real world.

We have a variety of degree distributions out there in the real world. Some of them look more like uniform at random, some look more like preferential attachment. But we want a way of actually fitting a model so that we can say that this one looks like, say, entirely preferential attachment, or 80% preferential attachment, or some other percentage. Then we have some idea of exactly what is going on in generating different degree distributions, and the idea here is that we'll be able to fit these models to data.

So again, when we looked back at the Notre Dame World Wide Web data and so forth, we ended up working with preferential attachment. Preferential attachment gave us a curve which has a slope of minus 3 and would be linear in a log-log plot. At least to our eyeballs, that looks like it might match this data pretty well, but we'd have to actually test it carefully to see whether or not that's a good match for what's actually going on in those data.

What's interesting, though, is when we look at, say, co-authorship data. This is data from the economics co-authorship data sets of Goyal, van der Leij and Moraga-González. And in particular, we see that there are fat tails, right? So we've got more high degrees and more low degrees than under uniform at random, but this is hardly linear, right? If we drew a line through here, it's not linear. So we've got something which looks like it has fat tails, but it's not well fit by a power distribution. And so, can we say a little bit about what's going on in terms of how this network might have been formed through this kind of growing process? Again, that's a natural kind of model to look at, because co-authors are coming in over time, forming new relationships as they're born, and gaining relationships over time as they mature.

Okay, so what we're going to do is look at a simple hybrid model. This hybrid model has been discussed in different papers; I'm going to talk about a version that I did with Brian Rogers in 2007. It's a very simple version that will allow us to really span between these extremes and then go directly and fit to data. And the idea is the simplest one you could imagine for forming a hybrid of these. What we actually do in the paper is a little richer, but I'm just going to simplify it for the sake of our analysis here.

What we're going to do is take the new links that are formed at any point in time and form some fraction of them, a fraction a, uniformly at random; the other 1 minus a will be found via searching the neighborhoods of friends, okay? So what does that mean? That means there's a bunch of nodes that are already out there. Let's suppose that they've formed some network, so they're already existing, and a new node is born. That new node first forms some links uniformly at random. 'Kay, so let's see, let's do this with blue. So it forms a couple of links uniformly at random.

Â 4:03

And then what it does is it picks new nodes to link to by looking at the friends of these. So maybe it identifies this node, and then forms some additional links. Instead of searching uniformly at random, what it does is it meets the friends of friends. Okay, so it begins to look at the friends of the existing ones, and then forms links to those nodes. So it picks some nodes uniformly at random, and then also searches the neighborhoods of those nodes. For instance, you go onto the World Wide Web, you start looking through some webpages, and then you start clicking on links from those pages and finding those. Or you meet somebody and then you meet some of their friends. So this is a very natural process. And what we're going to do is have one fraction formed uniformly at random, and then the rest, the 1 minus a fraction, formed via meeting friends of friends. And what I want to convince you of is that that second part is going to look like preferential attachment. So that will give us an explanation for how preferential attachment might actually be working. Why would we have preferential attachment?

Okay, so let's think about meeting friends of friends. So a newborn node forms a times m of its links uniformly at random, and then it also meets 1 minus a times m of those nodes' neighbors and attaches to those nodes, too. And what's important is that the distribution of neighbors' degrees is not the same as the degree distribution, even with independent link formation. A neighbor is more likely to be of high degree. Let's go through the logic of why that is true.

So let's think of a network. Suppose that we are in a situation where half of the nodes have degree k and half of the nodes have degree 2k, okay? So some of them are twice as connected as others. Let's just do a simple thought experiment. Pick a link in that network at random, and then pick a node at one end of it. Okay. Now, there's a 2/3 chance that the node you'll find is actually of degree 2k and a 1/3 chance that the node is of degree k. So it's actually twice as likely that you're going to find the node with the higher degree as compared to the node with the lower degree. Even though there are half as many of each, half the population is of each type, they're equally likely in the society, through this process you're actually going to find the higher-degree nodes with more probability. Why? Because they have more links. If you find any node on the end of one of their connections, if you find one of their neighbors, you find them. Right? So here, if you select a link and then you select a node on one end of it: if I have twice as many links, it's twice as easy to find me in that network as it is to find somebody who has half as many links.
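That 2/3-versus-1/3 claim is easy to verify in a quick simulation. Here's a minimal sketch (my own illustration, not code from the lecture): picking a uniformly random link and then a random end of it is the same as drawing a node with probability proportional to its degree.

```python
import random

def fraction_high_degree(k=3, n=50, trials=200_000, seed=0):
    """Half the nodes have degree k, half degree 2k. Choosing a random
    link and then a random end of it selects a node with probability
    proportional to its degree, i.e. a degree-weighted draw."""
    rng = random.Random(seed)
    degrees = [k] * n + [2 * k] * n          # indices >= n have degree 2k
    picks = rng.choices(range(2 * n), weights=degrees, k=trials)
    return sum(p >= n for p in picks) / trials

print(fraction_high_degree())  # close to 2/3, not 1/2
```

The degree-2k nodes show up about two-thirds of the time even though they are only half the population.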

And so if you're looking at my friends and then trying to find any of my friends: if I've got twice as many friends, I'm twice as easy to find in that system as somebody who has half as many friends, okay? So the idea is that if you're actually looking through a network and finding people by looking at nodes and then looking at their friends, you're going to be more likely to find higher-degree nodes, and then we're going to have a form of preferential attachment. Nodes that have more friends are going to be easier to find, they're going to get more friends through this process, and it continues. Okay, so it's a very simple, intuitive, and natural process that will explain why we get preferential attachment. And then, as we vary this parameter a, we can have things look like a network formed just uniformly at random, or, if we're doing things mainly through the network search process, we're going to tend to have most of the links formed in ways that favor better-connected nodes.

Â [BLANK_AUDIO]

So, what we're going to do again: randomly find a node, then randomly pick one of the nodes it's attached to. That's the finding-friends-of-friends bit. And the chance of finding some node via this second part of the procedure is proportional to its degree: we find it if we find one of the other nodes that attach to it. So if it had twice as many people attached to it, it's twice as easy to find. Okay, so what does this model look like, then? A fraction a of the links forms uniformly at random, and 1 minus a via preferential attachment. So we can go back to our mean-field approximation and do exactly what we did before.

We're going to still have the starting condition that when you're born, you form m links. And now, a fraction a of my new links over time are going to look like the uniformly-at-random part, m over t, and a fraction 1 minus a are going to look like they did in preferential attachment, which is proportional to my degree over 2t:

d d_i(t)/dt = a m/t + (1 − a) d_i(t)/(2t).

(Right, so in the preferential attachment part there is an m on top, from the m search links, and a 2tm on the bottom, the total degree in the society; I've divided out the m's.) So we've got this simple differential equation. I'm not going to go through solving differential equations; this is a relatively standard differential form, and you can just check that if you differentiate the solution with respect to time, you get this equation back, and that it matches the initial starting condition d_i(i) = m. The solution to that differential equation with that starting condition is

d_i(t) = (m + 2am/(1 − a)) (t/i)^((1−a)/2) − 2am/(1 − a).
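You can check this numerically. Below is a small sketch (mine, not the lecture's) that integrates the mean-field equation d d_i(t)/dt = a m/t + (1 − a) d_i(t)/(2t) with a simple Euler scheme, starting from d_i(i) = m, and compares the result to the closed-form solution:

```python
def closed_form(t, i, m, a):
    # Closed-form expected degree, written with the shorthand x = 2/(1-a),
    # so a*m*x = 2am/(1-a) and 1/x = (1-a)/2.
    x = 2 / (1 - a)
    return (m + a * m * x) * (t / i) ** (1 / x) - a * m * x

def euler(i, T, m, a, steps=200_000):
    # Straightforward Euler integration of dd/dt = a*m/t + (1-a)*d/(2t).
    d, t = float(m), float(i)
    h = (T - i) / steps
    for _ in range(steps):
        d += h * (a * m / t + (1 - a) * d / (2 * t))
        t += h
    return d

m, a, i, T = 2, 0.5, 10.0, 1000.0
print(closed_form(T, i, m, a), euler(i, T, m, a))  # the two agree closely
```

The two numbers match to within the Euler discretization error, which is the check the lecture describes: differentiating the solution gives back the equation.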

Okay? So now we have a very nice system. Again, what this does is tell us what a node's expected degree is at any point in time, as a function of how much time has gone by and what its birth date was. And therefore we can then solve this to figure out which nodes are going to have degree less than some amount at any time. That gives us the degree distribution, so it's exactly the same procedure we've been doing a couple of times before.

And so here, the nodes that have expected degrees less than d at some time are the ones for which the equation for their degree is less than d at that time. To simplify the expression, choose the notation x = 2/(1 − a), which makes the equation just a little easier to read. So the critical i is the one for which this holds with equality: if you solve for the i whose nodes are exactly at degree d, they solve

(m + amx)(t/i)^(1/x) − amx = d.

And remember, F(d) is going to be the fraction of nodes that were born after the one with this critical i, the one at exactly that degree d. And so, if you substitute in that i, what do we get? We get

F(d) = 1 − ((m + amx)/(d + amx))^x, where x = 2/(1 − a).

So this is a slightly more complicated expression than we had before, and it looks partly like what we had: we've got a d in the denominator raised to some power, and the power here depends on the parameter a. And you can check, when you look at this equation, that as a gets near 1, so that all of the links are being formed uniformly at random, this is nearly a negative exponential distribution. Whereas as a gets close to 0, x just becomes a 2, the amx terms disappear, and we get our 1 minus (m over d) squared; differentiating that, the density function has a power of minus 3. So this will look exactly like preferential attachment.
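Here is a tiny numerical sketch of those two limits (my own illustration, with arbitrary parameter values): near a = 0 the distribution matches the preferential-attachment form 1 − (m/d)², and near a = 1 the upper tail falls off far faster than that power law, as a negative exponential would.

```python
def F(d, m, a):
    # Mean-field degree distribution of the hybrid model.
    x = 2 / (1 - a)
    return 1 - ((m + a * m * x) / (d + a * m * x)) ** x

m = 2
# a near 0: essentially pure friends-of-friends search,
# so F(d) is close to the preferential-attachment CDF 1 - (m/d)^2
print(F(20, m, 1e-9), 1 - (m / 20) ** 2)   # both about 0.99
# a near 1: mostly uniform at random, so the upper tail is far thinner
# than the power-law tail (m/d)^2
print(1 - F(20, m, 0.9), (m / 20) ** 2)
```

Note also that F(m) = 0 for any a, since no node ever has expected degree below the m links it is born with.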

So this spans between those different distributions, and as we change a, we're going to end up with a distribution which looks either like this negative exponential, or begins to spin out and look more like a power-law distribution as we let a become close to zero and almost all the links form via the friends of friends, which gives us preferential attachment.

Okay, let's simulate these things just to check. So this is after a thousand periods, simulating when m is equal to 2 and a is equal to a half. And you can check either the log of the distribution function versus the log of the degree, or the log of the frequency. Now, we know that at the extremes the expected-degree approximation is a good match for the actual degree distribution; here, in the middle, it seems to be fitting well too. So the simulated degree distributions are coming out well compared to the predicted degree distribution. So when you simulate with a equal to a half, you get a good match.
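For readers who want to replicate that kind of check, here is a minimal simulation of the hybrid process as described above (my sketch, not the code behind the lecture's figures): each newborn node forms m links; each link goes to a uniformly random existing node with probability a, and otherwise to a random neighbor of a uniformly random existing node.

```python
import random

def simulate(m=2, a=0.5, periods=1000, seed=1):
    rng = random.Random(seed)
    # Seed with a small clique so the earliest nodes have neighbors to search.
    nbrs = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    for new in range(m + 1, periods):
        targets = set()
        while len(targets) < m:
            anchor = rng.randrange(new)            # a random existing node
            if rng.random() < a:
                targets.add(anchor)                # uniform-at-random link
            else:
                # friends-of-friends: link to one of the anchor's neighbors
                targets.add(rng.choice(sorted(nbrs[anchor])))
        nbrs[new] = targets
        for t in targets:
            nbrs[t].add(new)
    return [len(v) for v in nbrs.values()]

degrees = simulate()
print(sum(degrees) / len(degrees))  # mean degree is close to 2*m = 4
```

Plotting the log of the empirical frequencies of `degrees` against log degree is then directly comparable to the F(d) derived above.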

If you simulate with a equal to 1, so it's uniformly at random, it's a good match. Interestingly, when a is 0 the simulations are not exactly on, but this is a case where we know the proof actually shows that you get a good match in the limit. It just means that for pure preferential attachment, you're going to need more than a thousand periods before this thing begins to be a good approximation.

Okay, so we've solved for a degree distribution, and what we can do next, now that we've got a hybrid degree distribution that spans between these different extremes, is start taking it to data and begin to back out what that a is in the data. How many links look like they're being formed more at random? How many look like they're being formed through the network search process, meaning meeting friends of friends? And then see whether we see differences in different data sets.

Â