0:03

Hi folks, it's time to talk a little bit about some observations of actual diameters in the world. If you recall, we talked about diameters of random graphs of a particular form, and we found that for large enough graphs, with degrees that weren't too large or too small, log n over log d was an approximation of the average path length and diameter, where n is the number of nodes and d is the average degree.

Let's do a rough back-of-the-envelope calculation, so you can pull out your calculators. Say that the world population these days is somewhere between 6 and 7 billion; let's take 6.7 billion as an estimate. And let's suppose that you just count the friends and relatives that you talk to on a reasonably regular basis; let's take 50 as the average number of people that somebody might talk to on a regular basis.

Now do log of 6.7 billion over log of 50. What do you end up with? Roughly 6. So this is the six degrees of separation that is often talked about: the idea that to get from any individual to any other individual in the world, you actually don't need a lot of hops. You can get there fairly efficiently.
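
If you'd rather not reach for a calculator, the same back-of-the-envelope estimate can be sketched in a few lines of Python. The variable names are just illustrative; the two numbers are the ones assumed above (6.7 billion people, 50 regular contacts each).

```python
import math

world_population = 6.7e9   # population estimate used above
friends_per_person = 50    # assumed average number of regular contacts

# log(n) / log(d): the random-graph approximation of average path length.
estimate = math.log(world_population) / math.log(friends_per_person)
print(f"log(n) / log(d) = {estimate:.2f}")  # a bit under 6 hops
```

The base of the logarithm doesn't matter here, since it cancels in the ratio.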

So let's take a look at some data and see if those kinds of numbers are actually observed. What I want to look at is what's known as the Add Health data, the adolescent health data set. It was collected in the 1990s through interviews at a bunch of high schools in the United States. There's network data for a lot of these high schools: students were asked to name their friends, and those friendships were recorded. The schools actually varied quite a bit in racial composition, in size (how many students are in the school), and in a bunch of other things.

So the networks have some variation, and we can see whether the diameters in these networks look like the log n over log d that we found in the estimate. So let's have a quick peek at some data.

So this is the average shortest path, plotted for the giant component, versus log n over log d. This is from 84 high schools for which there's fairly complete network data, and it's from work I did with Ben Golub. When you look at this graph, what do we have? On the x-axis we have log n, so the log of the number of people in the high school, divided by the log of the average number of friends that they had in the network. And on the other axis is the actual average shortest path. If the theorem is true, then all of these points should lie on the 45-degree line, and they're actually remarkably close for real data.

The spread that we get in terms of log n over log d and average shortest path fits fairly well. Indeed, for the smaller schools you have shorter average path lengths, and for the larger schools you have larger ones. They match up very well: log n over log d seems to be fairly accurate.
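
To get a feel for this kind of comparison without the school data, here is a minimal sketch on a simulated random graph. This is not the Add Health analysis; the graph model, the function names, and the parameters (300 nodes, average degree 8) are all assumptions made for illustration. It builds a random graph, finds the giant component, computes the average shortest path there by breadth-first search, and sets it next to log n over log d.

```python
import math
import random
from collections import deque


def random_graph(n, avg_degree, seed=0):
    """G(n, p) random graph with p set so the expected degree is avg_degree."""
    rng = random.Random(seed)
    p = avg_degree / (n - 1)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def bfs_distances(adj, source):
    """Hop distance from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def giant_component(adj):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for v in adj:
        if v not in seen:
            comp = set(bfs_distances(adj, v))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best


def average_shortest_path(adj, nodes):
    """Mean hop distance over ordered pairs of distinct nodes in a
    connected node set with at least two nodes."""
    total = sum(sum(bfs_distances(adj, v).values()) for v in nodes)
    k = len(nodes)
    return total / (k * (k - 1))


def compare(n, avg_degree, seed=0):
    """Return (log n / log d, measured average shortest path)."""
    adj = random_graph(n, avg_degree, seed)
    realized_d = sum(len(nbrs) for nbrs in adj.values()) / n
    predicted = math.log(n) / math.log(realized_d)
    actual = average_shortest_path(adj, giant_component(adj))
    return predicted, actual


predicted, actual = compare(n=300, avg_degree=8, seed=1)
print(f"log n / log d = {predicted:.2f}, average shortest path = {actual:.2f}")
```

Repeating `compare` for several sizes and degrees and plotting one number against the other gives a simulated version of the scatter plot described above.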

Now some other curious numbers that are out there in the world. Erdos had a large number of co-authors: 509 co-authors, and he wrote more than 1,400 papers in his life. And so mathematicians like to count their Erdos number. You count how many co-authorship links it takes you to reach Erdos. Erdos had a co-author, they co-authored with somebody else, and so forth, and you can find what your own Erdos number is.
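
Counting links like this is just a breadth-first search outward from Erdos. Here is a minimal sketch on a made-up co-authorship graph; every name other than Erdos is hypothetical, and the function name `erdos_numbers` is only for illustration.

```python
from collections import deque


def erdos_numbers(coauthors, root="Erdos"):
    """Breadth-first search: number of co-authorship links from root."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        person = queue.popleft()
        for other in coauthors.get(person, ()):
            if other not in dist:
                dist[other] = dist[person] + 1
                queue.append(other)
    return dist


# Hypothetical toy data: each person maps to their list of co-authors.
coauthors = {
    "Erdos": ["Alice", "Bob"],
    "Alice": ["Erdos", "Carol"],
    "Bob": ["Erdos"],
    "Carol": ["Alice", "Dave"],
    "Dave": ["Carol"],
    "Eve": [],  # no co-authors, so no path to Erdos
}

numbers = erdos_numbers(coauthors)
print(numbers)
# {'Erdos': 0, 'Alice': 1, 'Bob': 1, 'Carol': 2, 'Dave': 3}
```

Anyone with no chain of co-authors leading back to Erdos, like Eve here, simply doesn't show up in the result.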

Interestingly enough, there was an auction in 2004 of a co-authorship with a person named William Tozier. This was on eBay. His Erdos number was 4, so if you won the auction he would put your name on a paper with him, and that would make your Erdos number 5. The winner actually paid more than a thousand dollars to have their name on a paper with Tozier and end up with an Erdos number of 5. So that's just an interesting curiosity.

When we look at average degree, one thing that's going to be important is that as the density of the network changes, we're going to end up changing the average path length.

And interestingly enough, networks do come in very different varieties in terms of average degree. For instance, these high school friendship networks have on average 6.5 connections per individual. There's a paper by Bearman, Moody and Stovel looking at romantic relationships in some of these high schools. There, people had on average about 0.8 of a relationship during the time period. Then there's data from work I did with Abhijit Banerjee, Arun Chandrasekhar and Esther Duflo on borrowing money, kerosene and rice from other individuals in small rural villages in India: the average number of other households that a given household borrows from is 3.2. In various co-authorship studies, depending on whether you're looking at economics, biology, math or physics, you see different numbers of co-authors that people typically have in, say, a decade or some period, varying from just under 2 to over 15.5 if people work in larger teams. So you see different numbers of co-authors. People always ask about Facebook: the Facebook number is about 120.
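
To see how much the average degree matters, here is a rough sketch that plugs the degrees just listed into log n over log d. The network size of 1,000 nodes is a made-up figure held fixed for comparison, and the function name is only illustrative. Note that the approximation has nothing to say when the average degree is 1 or less, as with the 0.8 for romantic relationships, since log d is then zero or negative, so the sketch returns None in that case.

```python
import math


def predicted_path_length(n, avg_degree):
    """log n / log d, or None when the average degree is at most 1."""
    if avg_degree <= 1:
        return None
    return math.log(n) / math.log(avg_degree)


n = 1000  # hypothetical network size, the same for every row

# Average degrees quoted above.
average_degrees = {
    "romantic relationships": 0.8,
    "village borrowing": 3.2,
    "high school friendships": 6.5,
    "co-authorship (large teams)": 15.5,
    "Facebook": 120,
}

estimates = {name: predicted_path_length(n, d) for name, d in average_degrees.items()}
for name, value in estimates.items():
    shown = "n/a" if value is None else f"{value:.2f}"
    print(f"{name:30s} d = {average_degrees[name]:>6}  log n / log d = {shown}")
```

With the size held fixed, the denser networks come out with the shorter predicted path lengths.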

So you see different connectivities in these graphs, and that's going to lead to different properties. Some of them are going to have shorter average path lengths; other ones are going to have larger ones. So whenever we're looking at a given problem or a given context, it's important to define the network carefully, because these networks are going to have different properties depending on whether we're looking at a borrowing network, a collaboration network, something like Facebook, where a friendship just means you have a link to somebody else's page, or friendships, romances, and various other kinds of things. There's a whole series of different kinds of ties we might define, and they're going to have different network properties.
