0:58

So the application of network modeling to complex systems started in 1736 with Euler,

who founded the field of graph theory

to solve the Seven Bridges of Königsberg problem.

A lot of the advancement in graph theory and

in applying network analysis to complex systems was made by

two Hungarian mathematicians,

Paul Erdős and Alfréd Rényi, in the late 1950s.

What they did was study the properties of random networks.

So their famous random network model is like placing some

buttons on a table and then connecting those buttons at random.

And they tried to understand the properties of those networks,

of those graphs.

And one of the properties is that nodes in those graphs have an average connectivity,

2:10

and that connectivity has a distribution: if you plot the number of connections per node

against the frequency of nodes having that specific number of connections,

it follows a Poisson distribution, which is similar to the normal distribution.

It's a bell-shaped curve distribution.
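To make this concrete, here is a minimal sketch, not from the lecture itself, of the Erdős–Rényi G(n, p) model using only the Python standard library; the node count and edge probability are arbitrary illustration values.

```python
import random
from collections import Counter

def erdos_renyi(n, p, seed=0):
    """Generate a G(n, p) random graph: each of the n*(n-1)/2
    possible edges is included independently with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

# The degree distribution of a G(n, p) graph clusters around the
# mean degree p*(n-1): the bell-shaped, Poisson-like curve that
# Erdos and Renyi described.
adj = erdos_renyi(1000, 0.01)
degrees = [len(nbrs) for nbrs in adj.values()]
mean_degree = sum(degrees) / len(degrees)
hist = Counter(degrees)
print(mean_degree)            # close to 0.01 * 999, i.e. about 10
print(hist.most_common(3))    # most frequent degrees sit near the mean
```

With n = 1000 and p = 0.01, almost every node ends up with a degree near 10; there are no extreme hubs, which is exactly the property the power-law networks discussed later violate.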

Erdős also made famous contributions to graph coloring, the area behind the classic question of

how many colors you would need in order to distinctly color the regions on a map.

So in the late 90s there were two seminal

publications that really made a huge impact on this

idea of applying network analysis to real world complex systems.

The first paper was published in Nature by Steven Strogatz and Duncan Watts.

And what they showed is that real-world systems follow this Small-World topology.

So what does that mean?

And this actually has a very specific mathematical definition.

So real-world networks are not random, not like the Erdős–Rényi model networks.

They actually have these properties: a high clustering coefficient and

a relatively short characteristic path length.

3:32

So those are two network properties that measure the level of clustering and

the average distance between pairs of nodes.
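As a small illustration, not from the lecture, the characteristic path length can be computed with breadth-first search over all node pairs; the 5-node ring below is a made-up example whose answer is easy to verify by hand.

```python
from collections import deque

def shortest_path_length(adj, source):
    """Breadth-first search: hop distance from source to every
    reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def characteristic_path_length(adj):
    """Average shortest-path distance over all connected node pairs."""
    total, pairs = 0, 0
    for u in adj:
        dist = shortest_path_length(adj, u)
        for v, d in dist.items():
            if v != u:
                total += d
                pairs += 1
    # each unordered pair is counted twice, but the ratio is unchanged
    return total / pairs

# A 5-node ring: from each node the distances are 1, 1, 2, 2,
# so the characteristic path length is 1.5.
ring = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(characteristic_path_length(ring))   # 1.5
```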

So Steven Strogatz and

Duncan Watts analyzed three complex systems abstracted to networks.

One is the movie actor database, IMDb.

They also analyzed the power grid system, with

its transformers and generators, and

they analyzed the neuronal connectivity map of the worm C. elegans.

This worm is the only organism whose entire neuronal connectivity

we have mapped; it has about 300 neurons whose connections we know exactly.

So as you see, they compared the actual L, which is the characteristic path

length, to the L that is expected for random networks of the same size,

as well as the actual C, which is the clustering coefficient,

to the clustering coefficient expected in a random network of the same size.

So how do you compute the clustering coefficient and

the characteristic path length?

So what the clustering coefficient means

is that it's a measure of how connected a node's neighbors are,

that is, whether the direct neighbors of a node are connected to each other.

And in this particular example you can see that the red

node has four neighbors, and in the first example the clustering

coefficient is C = 1: all of the neighbors are connected.

There are six possible connections among those four nodes.

And in the first example on the left, all of them are present.
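The example just described can be sketched in a few lines of Python; the graph below is a hypothetical stand-in for the slide, a red node (node 0) whose four neighbors are all connected to each other.

```python
def clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that are themselves
    connected: C = (links among neighbors) / (k*(k-1)/2)."""
    neighbors = list(adj[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if neighbors[j] in adj[neighbors[i]])
    return links / (k * (k - 1) / 2)

# Node 0 has four neighbors; all 6 possible links among them are
# present, so C = 1, as in the slide's first example.
complete = {0: {1, 2, 3, 4},
            1: {0, 2, 3, 4},
            2: {0, 1, 3, 4},
            3: {0, 1, 2, 4},
            4: {0, 1, 2, 3}}
print(clustering_coefficient(complete, 0))   # 1.0
```

Removing links among the neighbors lowers C toward 0, which is what the other panels on such a slide typically show.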

6:20

So in this model they started with a regular lattice and then gradually randomized

the connections, taking a link and rewiring it somewhere else.

So they extracted one of those links, and then gradually randomized the network.

If you do that process, starting with this regular lattice and going all the way to

a random network, somewhere in the middle you're going to get a Small-World network.

So as they go through this process of randomizing

those links, from the regular lattice to the completely random network,

they measure the clustering coefficient and

the characteristic path length every few steps.

And what they found is that somewhere in the middle you are still retaining

the high clustering while your path length drops very rapidly, and

this explains, or fits, the real-world topology.

So just think of it this way: there are shortcuts in real-world networks

that connect modules that are highly clustered.
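The rewiring procedure just described can be sketched as follows; this is a minimal, assumption-laden version of the Watts–Strogatz construction (ring lattice start, each edge rewired with probability p), with illustrative parameter values.

```python
import random

def watts_strogatz(n, k, p, seed=0):
    """Start from a ring lattice where each node links to its k
    nearest neighbors (k even), then rewire each lattice link to a
    random target with probability p. Small p keeps clustering high
    while the shortcuts collapse the path length: the Small-World
    regime in between lattice (p=0) and random (p=1)."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):                      # build the ring lattice
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j); adj[j].add(i)
    for i in range(n):                      # rewire with probability p
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            if j in adj[i] and rng.random() < p:
                new = rng.randrange(n)
                while new == i or new in adj[i]:
                    new = rng.randrange(n)
                adj[i].remove(j); adj[j].remove(i)
                adj[i].add(new); adj[new].add(i)
    return adj

sw = watts_strogatz(100, 4, 0.1)
# Rewiring moves links but never adds or removes them,
# so the link count stays n*k/2 = 200.
print(sum(len(v) for v in sw.values()) // 2)
```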

So in this paper that was published in Science in 1999, Albert and

Barabási noticed that the real-world network connectivity distribution is not

like the Poisson distribution that was reported by Erdős and Rényi in 1959.

It actually follows a power law distribution.

That is,

the connectivity distribution has a power-law-shaped density histogram.

And this is what is shown here.

And then, like Watts and Strogatz,

they also had a model that can generate those power-law-

distributed networks, which they called Scale-Free Networks.

So, let's look at how the model works.

Instead of starting with all the nodes and

all the links and gradually shuffling the links,

the Rich-Get-Richer model, a network growth model that can

generate scale-free networks, or networks that have a power law distribution,

starts with very few nodes that are randomly connected.

So here we have an example of five nodes,

with some random links that connect them.

Then you start adding nodes and links to the system.

So you're introducing nodes,

growing the network until it reaches your desired size.

And that growth model has some basic rules for how the new node

that is introduced to the network

will be connected to the other nodes.

If we are interested in generating a randomly connected

network, we will just randomly connect a new node and grow

the network by adding nodes and randomly connecting those nodes to the network.

However, with the Rich-Get-Richer model, we connect the new incoming nodes

by assigning each existing node a probability of being connected to the new node

based on that node's already existing connectivity degree.
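The growth rule just described can be sketched as a preferential-attachment model; this is a hedged, minimal version of the Rich-Get-Richer (Barabási–Albert style) procedure, with illustrative sizes.

```python
import random

def rich_get_richer(n, m, seed=0):
    """Growth model: start from a small fully connected core, then
    attach each new node to m existing nodes chosen with probability
    proportional to their current degree (preferential attachment)."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(m + 1)}
    for i in range(m + 1):                  # small connected seed
        for j in range(i + 1, m + 1):
            adj[i].add(j); adj[j].add(i)
    # a list with each node repeated once per link: sampling
    # uniformly from it implements degree-proportional attachment
    stubs = [i for i in adj for _ in adj[i]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        adj[new] = targets
        for t in targets:
            adj[t].add(new)
        stubs.extend(targets)               # targets gained a link each
        stubs.extend([new] * m)             # new node enters with m links
    return adj

net = rich_get_richer(2000, 2)
degrees = sorted((len(v) for v in net.values()), reverse=True)
print(degrees[0], degrees[-1])   # a large hub versus the minimum degree m
```

Early nodes keep accumulating links while late arrivals stay at degree m, which is exactly what produces the heavy power-law tail.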

One of the things that Barabási showed is the importance of hub nodes.

Most of the nodes in the network have very few connections, either one or

two connections.

And then there are those highly connected nodes, the hubs, although there

are not that many of them compared to the lowly connected nodes.

There are a lot more of these hubs compared to the hubs that

9:52

you can find in a random network.

And those hubs are important components in the network.

If you destroy those hubs obviously the network will fall apart faster.
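That fragility can be demonstrated on a toy, entirely made-up hub-dominated network: removing low-degree nodes barely matters, while removing the hubs shatters the network.

```python
from collections import deque

def largest_component(adj):
    """Size of the biggest connected component, found by BFS."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        queue, comp = deque([start]), 0
        seen.add(start)
        while queue:
            u = queue.popleft()
            comp += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, comp)
    return best

def remove_nodes(adj, victims):
    """Return a copy of the network with the victim nodes deleted."""
    victims = set(victims)
    return {u: {v for v in nbrs if v not in victims}
            for u, nbrs in adj.items() if u not in victims}

# Toy network: two hubs (0 and 1) bridging twenty low-degree leaves.
adj = {i: set() for i in range(22)}
adj[0].add(1); adj[1].add(0)
for leaf in range(2, 12):
    adj[0].add(leaf); adj[leaf].add(0)
for leaf in range(12, 22):
    adj[1].add(leaf); adj[leaf].add(1)

print(largest_component(adj))                        # 22: one connected piece
print(largest_component(remove_nodes(adj, [2, 12]))) # 20: leaves barely matter
print(largest_component(remove_nodes(adj, [0, 1])))  # 1: hubs gone, all isolated
```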

And what they showed in these two papers that were published in Nature

is that when you knock down genes that are highly connected,

the viability of the organism is going to be

much lower than if you had knocked down genes that are lowly connected.

So in this review article that Barabási wrote many years ago,

he compared the view of a random network to a scale-free network.

One of the criticisms about the scale-free model is that when you look at many

complex systems, you see those power law distributions a lot.

This is because complex systems are made of diverse components, diverse agents.

And those different types of agents will have different properties;

when you are comparing a lot of things that are different, that are diverse,

heterogeneous,

then you will get those power law distributions.

So, at least as the opponents of

the scale-free model suggest, this is not necessarily

a feature of the structure of the system by design, but

just a consequence of looking at a very diverse system.

If you compare people's heights you will get a bell-shaped curve distribution.

However, if you compare all animals' heights and weights you will

probably get a power law, because you are comparing so many diverse things.

So now I'm going to turn my camera around and I'll show you what I'm looking at.

And this is the northern part of Manhattan, in New York City.

And you can see that there are many buildings of many different heights

that I'm looking at.

And in this particular case, if you plotted the distribution of

the heights of those buildings, you probably also would

obtain a power law distribution, because those buildings are very different.

But if you only picked a specific type of building, say

only the five-floor buildings,

you'll see that you can obtain a normal,

bell-shaped curve distribution, and not necessarily a power law distribution.

One thing to keep in mind about the scale-free distribution

is that there is sometimes a drop in the hubs.

That is, you cannot find infinitely large hubs.

So although the power law curve would suggest that there are very large hubs,

sometimes we see a drop in the size of hubs.

And this can be explained by two mechanisms.

An interesting paper that came after the publications of Albert and

Barabási suggested an explanation for that drop in the distribution of hubs at

the tail of the scale-free distribution.

And this, like the buildings that we just looked at,

suggests that there could be some type of limit on how tall a building can be.

They modeled two types of mechanisms that can explain this drop of the tail

in the hubs. One suggests that the hub nodes age,

and as they age they lose the capacity to accept new

links in the Rich-Get-Richer model.

The other is that there is some type of physical limit,

a maximum capacity to which a hub can grow, and

once it has reached that maximum physical capacity the hub will stop growing.
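A capacity limit of that kind is easy to bolt onto the growth model. The sketch below is a hypothetical illustration of the idea, not the actual model from that paper: nodes that hit a hard capacity simply stop accepting new links, which truncates the hub tail.

```python
import random

def capacity_limited_growth(n, m, capacity, seed=0):
    """Rich-Get-Richer growth with one twist: a node that has
    reached a maximum capacity stops accepting new links, cutting
    off the tail of the hub-size distribution. Hypothetical sketch
    of the capacity mechanism discussed above."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(m + 1)}
    for i in range(m + 1):                  # small connected seed
        for j in range(i + 1, m + 1):
            adj[i].add(j); adj[j].add(i)
    for new in range(m + 1, n):
        # only nodes below capacity remain eligible targets;
        # among them, attachment is still degree-proportional
        eligible = [u for u in adj if len(adj[u]) < capacity]
        weights = [len(adj[u]) for u in eligible]
        adj[new] = set()
        while len(adj[new]) < m:
            t = rng.choices(eligible, weights=weights)[0]
            # re-check capacity: a node may have filled up mid-loop
            if t not in adj[new] and len(adj[t]) < capacity:
                adj[new].add(t); adj[t].add(new)
    return adj

net = capacity_limited_growth(500, 2, capacity=30)
print(max(len(v) for v in net.values()))   # can never exceed the capacity of 30
```

Without the capacity check this reduces to the plain Rich-Get-Richer model; with it, the largest hubs saturate, reproducing the observed drop at the tail.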

We will talk more later about other mechanisms that can generate

scale-free networks.

[MUSIC]