0:00

Hi, this is Mat again. And now we're ready to start talking

about how we can represent networks and talk about their properties.

And in terms of simplifying the complexity that we're facing in

describing networks we're going to think about different kinds of patterns that

might be. So one is going to be that there's global

patterns of networks. So overall big picture items.

So, how are the different constructiveness of individuals

distributed through the society? Are there some people who are really well

connected and, and serve as hubs and other people who are not so well

connected, or is everything very evenly distributed, so that's something.

Path links. How far does it take to get on an average

from one node to another. There's going to be segregation patterns,

so if we begin to think of nodes as having characteristics, say if people's

ages, their genders and so forth, do we see separations between nodes of

different types. Or are things fairly well integrated.

local patterns. do we see tight clusters of nodes that

are all tightly connected to each other, cliques.

Do we see that if I'm connected to somebody else and they're connected to a

third person that I also tend to be connected to that third person.

Transitivity. do relationships have friends in common?

Are they supported? so we'll look at local patterns, little,

you know, zooming in on particular parts. Well also talk about nodes positions in

networks so how central is somebody, how influential are they, what does their

neighborhood structure look like? So there will be different ways that

we'll be talking about networks overall and we'll also be looking at nodes in

positions in networks and other kinds of things so there's going to be a series of

definitions that we have to go through to keep track of these things.

So the basics in terms of representing networks in, there's going to be some set

of nodes, vertices, players, depending on what literature you're looking at these

will have different names. I might go back and forth between the

names and the course. Sometimes I'll call them nodes, sometimes

I'll call them agents, sometimes I'll call them players, vertices, etcetera.

But there's, tend to be some finite number, little and will be our typical

characterization of how many there are. And one basic way of representing a

network is just by what's called the adjacency matrix.

So it's going to be a matrix of zeroes and ones so an n by n matrix, and what it

indicates is 2 nodes connected to each others.

So gij equals 1 indicates a link or a tie or an edge between 2 nodes i and j.

Now generally in less otherwise stated in the course we'll be dealing with ones

which are not directed, so if i is connected to j, than j is connected to i.

So if we're friends with each other, we're both friends it's a mutual

relationship. Or if we're allies with each other, we're

both allied to each other. It can't be that one country is ally to

another, and not vice versa. So we'll tend to look at it that, that

way and we'll tend to, to work generically with zeroes and ones as, as

the structure of it. So there's either a relationship or not.

But we could allow for directed networks. We could allow for weighted networks.

so when we looked at the financial relationships, so the amount of debt that

was held inside one country of the sovereign debt of another country.

That's going to tend to be a directed network and it's going to be a weighted

network. So it's going to have different

intensities on things. when we think about an alternative

notation it's going to be very useful for representing networks.

Sometimes we'll just use instead of thinking of an adjacency matrix we'll

think of, of representing a graph or network.

By just listing all the relationships that are present.

So a notation will, will have, is it will say that i and j is in g, if that means

that there's a link between nodes i and j.

So very simple, succinct notation will just keep track of the sets of links that

are present and depending on whether these are ordered or not ordered they'll

be directed or not directed. so a network is going to be a pair of a

set of nodes. And either an adjacency matrix, or a list

of all the links that are present depending on the particular context.

Which was it's easiest to represent the network.

Okay. so, basic definitions.

1 thing we're going to want to keep track of is, sort of.

ways of navigating through a network, other known as walks or paths.

a walk in a network from, say, node i1 to node, ik, is going to be a sequence of

nodes. i2, i2, blah, blah, blah, through ik, and

the sequence of links, i1 is connected to i2, i2 to i3, and so forth, ending up at

k, such that each one of those links is in the network g, okay?

So a walk in a network from one node to another is a sequence of links that take

you from that first node to the last node.

so [INAUDIBLE] often in, in this kind of setting it's just going to be convenient

to represent it just as the corresponding sequence of nodes such that we know that

each subsequence pair are connected in the network g.

So in terms of a picture well before, before I go to a picture, let's also talk

about paths, cycles and, and geodesics. So a walk is going to be some set of, of

links, each connecting to another node but they can possibly cycle back on each

other. A path is going to be a walk where each

node is distinct. So we start at i1.

We go to a new node, i2. Then we go to some node, i3, which is not

in the previous sequence and so forth. Until, eventually, we reach ik.

A cycle is going to be a walk where we end up back at the node that we started

at. So the n node is the same as the starting

node. one other term that's going to be very

useful, geodesic. A geodesic is a shortest path between two

nodes. And shortest means we just count how many

links there are in that path and we want to find what's the fewest number of

links we need to go from. Node say 1 to node 7.

How many links do we need to get from one to another?

So in terms of pictures here's a set of networks on 7 nodes and the first one

represents a path and a walk. From node 1 to node 7, right?

So we go from 1 to 2, 2 to 3, right? So we start 1 to 2, 2 to 3, 3 to 4, 4 to

5, 5 to 6, 6 to 7. Okay, so that's that's going to be a path

which is also in this case, a walk. if we look at a walk that's not a path we

could have from 1 to 2, 2 to 3, 3 to 4, 4 to 5, 5 back to 3, so now we've hit 3

twice and then gone to 7. So that's a walk that's not a path.

And if we wanted the geodesic in this case from 1 to 7.

Then the shortest path would actually have gone just 1, 3 to 7, right, so this

is a path which is longer than a geodesic.

When we look at cycles, 1 to 2, 2 to 3, 3 to 1, that's a cycle, it's also known as

a simple cycle because it just, the only node that repeats itself is the first and

last node. And a cycle and a walk, which is a more

complicated one, is we could go from 1 to 2, 2 to 3, 3 to 4, 4 to 5, 5 back to 3,

back to 1. We cycled, but we actually cycled through

3 and 1 in that case. So it's a more complicated cycle.

So these are different terms, we'll talk about cycles as we go through.

They're going to be important. Paths are going to be important.

Walks are going to be important,. So these are distinct items in that works

and it's going to be important to keep these definitions and this terminology in

our heads as we go through. So if we want to count how many walks

there are in a network. Let's look at a simple example here on 4

nodes. So this is the network we're looking at.

So 1 is connected to to 2 and 4, 3 is connected just to 4, 4 is connected to

everybody. So that is represented here, in this

particular adjacency matrix. And generally, we're not going to have

nodes connected to each other. So when we look at the diagonal, we'll

have diagonal being 0's. there'll be some applications where it's

going to be useful to have nodes connected to each other, or to

themselves. For instance, if.

If we're doing learning, I might be able to learn from myself that will be one

application where we'll have non zero entries on the diagonal.

But generally we're going to have situations where this will be the kind

of, matrix we're looking at. Now if we square this matrix, so we raise

it to the second power, so what we're looking at here is, so there's a

relationship between 1 and 2 captured by this entry right here of a 1.

And so that indicates that 1 is connected to 2.

And so if we ask how many ways there are, how many walks of length 2 there are from

i to j. Then, the, when we square the matrix,

that actually gives us the answer of exactly how many walks there are of

length 2 from node, whatever to whatever. So this is the number of walks of length

1. This is the number of walk, walks of

length 2. If you take it to the 3rd power, you'll

get the number of, of walks of length 3. And here it says that there's two

different walks of going from node 1 to node 1.

I could go up to 2 and then back. I could go to 4 and back, and you get

that by looking, so why is the 2 coming out in this part of the matrix?

When you multiply the matrix it says I could go from 1 to 2 and then 2 back to

1. So that gives us one entry.

If I could go from 1 to 4, and then 4 back to 1, and when you multiply the

matrix, it'll pick up the 1 times the 1 plus the 1 times the 1.

It gives us a 2 in this particular entry. So when you multiply the matrix, it gives

you how many walks of different path times.

So now we've got you know path of length 2 from each node to each other node.

so it's impossible to get from node 3 to node 4 in a walk of length 2.

All the other ones you can get by actually except, and also from 4 to 3.

But all the other ones are possible. And if we keep raising these to powers.

Then we end up with you know, how many walks of length 3 from node i to node j?

And so forth. And so this is a very useful thing to

keep track of because it's going to help is in defining centrality, it's going to

help us in keeping track of diffusion processes, and other kinds of things

where we look at information moving subsequently from node to node in a

network. Okay.

So another thing that's going to be very important to keep track of in a, in a

network is its component structure. What are its components, these are the

connected sub graphs that make up a network.

So we'll say that a network N, g is connected if there's a path between every

two nodes. And a component of a network's going to

be a maximally connected subgraph. What do we mean by a, a maximally

connected subgraph? We're going to mean that the we look at a

subgraph that's a subset of nodes and a subset of the, of the links in the, the

network, so that we have those being, of course, bonding subsets of the original

nodes and original set of links. We want this to be path connected.

So from every node i in N prime, and every node g in g in N prime there exists

a path in G prime connecting those 2. So this is a connected subgraph.

And if we look at somebody, some node who's in N prime and any link in the

original network then any link that we find in the original network if there's

some node j at the end of it, then ij has to be in g prime as well.

Okay. And so what does that, what does that

mean? in terms of.

A picture when we look at components then this part of the network is a component.

But this part of the network is not a component.

Why is it not a component? It doesn't satisfy that last part of a

definition. Because we've got 5 in there, 5 is

connected to 1 and yet we didn't include 1 in our, our little picture here.

So we have to include that and, and so it's a maximally connected subgraph that

would you have include 1, 4, 3 and 5. And it would have to include all the

links there. So component is going to be a maximally

connected subgraph. Another component here it would be 2,

another component 6 and 10 together with air link And then, 7, 8 and 9 together

with air links. So this is a network with four different

components and we can keep track of those.

So a simple concept that's going to be useful in understand how, again,

diffusion works, learning, etcetera, in understanding how separate the network is

in terms of different component structures.

[BLANK_AUDIO]

Okay, so that, that does it for this lecture.

our next lecture is going to start looking at path lengths and diameters.