0:00

So we're back and now let's take a very close look at Preferential Attachment.

Â And begin to understand how it generates links and degree distributions.

Â So, Preferential Attachment so the idea now is newborns again are going to form m

Â links to existing nodes. born one each period, so time just keeps

Â track of birth dates. And you could adjust it however you like.

Â so the important thing now is the formation process is going to be

Â different. So at sometime t, the number of links

Â that are going to be in the system are going to tm links, right there.

Â T nodes born it reached form m links so tm in total.

Â And the total degree is then 2 times tm, right?

Â So each link has two different nodes that are going to get to count it as a link.

Â So we have 2tm as a total number of, of degrees in the system at any point in

Â time. And the probability of attaching to a

Â given node is proportional to it's degree compared to the overall degree in the

Â society. So nodes that have very high degree, are

Â going to have higher chances. And so in particular, the probability of

Â attaching to node i, is going to be it's degree divided by 2tm.

Â Which is very different than the m over t that we had before.

Â So before, every node that was out there had an equal chance of getting a new

Â link. Now the chance of getting a new link is

Â proportional to your degree, compared to the overall degree in the society at that

Â time period, okay. So this is the proportional attachment

Â rule. The Preferential Attachment for which

Â this model is known. Okay, so let's do a mean field

Â approximation. So again we're going to do a continuous

Â time of approximation. We're going to be really finding the

Â degree of the distribution of expected degrees.

Â And then to check whether this actually matches the real degree distribution.

Â We could simulate the model, and see whether this is giving us a good,

Â approximation. There have been theorems now which show

Â that the mean field approximation of this particular model works well.

Â when we get to richer models that are different from Preferential attachment.

Â And the exponential model, then, we're not going to know whether or not the, the

Â actual degree distribution is well matched by this approximation.

Â But we can always check and simulate the model and then make sure that the, that

Â what we're getting out is the, the closed form solution from this kind of system is

Â not too different than what happens from the simulations.

Â Okay Distribution of Expected Degrees. So again, when you're born, you form m

Â links. So that's the initial starting condition.

Â How does your degree change? Well, you're getting new links at a rate

Â which is proportional to how your degree compares to 2t over m.

Â There's m new links being formed per unit of time.

Â Therefore this differential has this equation, and we can cross out the m's in

Â the numerator and the denominator. And so what we'll end up with is a simple

Â differential equation. Right, so we end up with the derivative

Â with the degree, with respect to time, is just looking like the, the actual degree

Â over 2t. And we have a starting condition,

Â differential equation, this one's even easier than the last differential

Â equation to solve, in fact. So the degree looks like m times the time

Â over the birth date raised to the one half power.

Â So square root of t over i. So a simple square root equation that

Â gives what the degree should look like over time.

Â Okay, so we've got a very simple equation for the degrees.

Â And we can compare that. So let's go back to the, situation we

Â looked at before where people were forming links uniformly at random.

Â Let's again have everybody form m equals 20 links when they're born.

Â And let's compare what happens when they're forming the links uniformly at

Â random compared to preferential attachment.

Â And what we see is that the Preferential Attachment is in some senses a steeper

Â curve here. And it ends up with more degree, more

Â high degree notes. The older ones are, are higher degree

Â than the, the uniform at random. And then the younger ones are actually

Â lower degree. It's harder for them to get new links,

Â because they have lower degree. And the older ones now have an added

Â advantage, not only of being older and they've gained things.

Â But they're, also you gain things proportional to how large you already

Â are. So the older ones are even easier to find

Â and easier to attach to, under Preferential Attachment.

Â So they get this extra bump where we see the extra bump in the curve.

Â So this is going to give us the fat tails.

Â So the fact that we've got this extra bit here, and this bit lower means we're

Â going to have more with high degree, more with low degree.

Â And that's going to generate the, the power law that we're looking for.

Â 5:08

Okay so now, degrees in Preferential Attachment.

Â Let's figure out the degree distribution. We can do the same kind of calculation we

Â did before. Which are the nodes with degree less than

Â 35 at some time. Well, we just solve for this curve.

Â Which are the ones that has the degree exactly 35.

Â It's going to be all the ones that are younger than that.

Â So born after that time period, whatever that time was.

Â And then we can figure what's the fraction that have that?

Â So we'll be able to figure out our distribution function Fof d, the same

Â technique we used before. But now with just a slightly different

Â equation that we're using to solve that same system.

Â So remember you're degree at time t is m times t over i to the half.

Â So the nodes with degree less than 35, are going to be the ones for which this

Â function is less than 35. So 35, has to be bigger than our m here

Â is 20, t equals 100. So, if you solve that out, and solve that

Â for i, you get that i has to be bigger than 32.7.

Â So now, this is 32.7 as opposed to the 42 something that we had before.

Â So, what's the fraction that have degree less than 35 at this time.

Â Well, it's going to be 100 minus 32.7 over 100, and then we can solve that out.

Â [BLANK_AUDIO] Right, so so roughly something like 68% of the nodes are

Â going to have that property. so when we look at this, just solving it,

Â generically. Again we've got this equation.

Â the ones that have degree less than d are going to be the ones where this, equation

Â is less than d. And therefore these are the i's that are

Â greater than t times m squared over d squared.

Â And now then to solve for f this, right, so we've got the, the fraction these are

Â the i's beyond some level. So going back we're want to find the,

Â what's the fraction of i's above some degree d.

Â Well we're just going to take the t minus these i's over t, that'll give us the F,

Â right. So F is just t minus this compared to t,

Â which works out to be 1 minus m squared over d squared.

Â So we have a very simple equation for the degree distribution.

Â So our degree distribution looks like 1 minus m over d squared.

Â And if you take the derivative of that and look at what's the density function

Â for the distribution generated by preferential attachment.

Â It's 2m squared over d cubed. So, this is proportional to d to the

Â minus 3, right? So, we've got 2m squared, so this is our

Â constant, times d to the minus 3. So, now we have a power law, we have

Â exactly something which looks like a power law.

Â And indeed when we take logs of each side we get the log of f of d looks like log

Â of 2m squared minus 3 log of d. So we've got a nice linear equation in,

Â in log, log plots. Which is exactly the, power law that we

Â were looking for. And in particular here, one thing that's

Â sort of interesting, when you look at the coefficient that came out, it's exactly

Â 3. so why 3?

Â well, that came from the fact that the change in degree, over time had a 2 in

Â the denominator. And then when we do integration and so

Â forth we came out with a 3. but basically that's, is, is coming from

Â the fact that there's a certain number of links present in this society.

Â And if you want to vary this, you can actually produce different variations on

Â this coefficient. And the way that you can produce

Â different variations on this coefficient is just having the number of links being

Â formed at any point in time. either grow or shrink over time.

Â So you can have the population, the number of newborn nodes not be a constant

Â one per period but changing over time. And that'll allow you to control this,

Â this variable here. So, so a slightly richer variation of

Â Preferential Attachment can adjust that. There's actually an exercise in the book

Â I wrote that shows you how to do that. But basically the idea here is that you

Â can get different variations here by just changing the rate at which the system

Â grows. But the important thing is that the fact

Â that the older nodes are gaining, things that are richer, faster and faster time,

Â meant that, that we end up wth these fat tails and power law.

Â Okay, so we've, we've gone through a couple of different growing systems.

Â What we'll do next is try and enrich this a little bit more to span between these

Â sort of uniform at random and Preferential Attachment, and then that

Â will allow us to actually take these degree distributions.

Â And take them to data, and see which ones actually match the data.

Â So which is, is the actual Preferential Attachment a good match for the, for the

Â world wide web data? does, which ones match the romance

Â networks etcetera. So can we find variations on these models

Â that actually match the observed data.

Â