0:01

Hi folks.

So this is a lecture which is designed for people who haven't had much

background in social networks or working with representations.

So if you have had a course before, say Michael Kearns' course,

and you've seen that material

and feel comfortable with it, then you can skip this one.

But this is for people who really just want to get the real basics down of how we

represent these things.

And so, well, I'm going to start with just the basic different types of networks.

So generally we have some set of nodes, vertices, agents, actors, players.

So those are the objects that we'll be connecting via links and ties.

And between those individuals we have edges, links, ties,

you'll see many different terminologies in the literature,

depending on whether you're coming from graph theory or sociology, or economics.

And they come in different flavors.

So it might be that they have intensity, they could be weighted.

So it might be, for instance,

that we're keeping track of how many hours two people spend together in a week.

And then it could go anywhere from 0 to whatever,

7 times 24, which is 168.

And then we could also keep track of things like,

how much of one country's GDP is traded with another.

It could also be something where instead of having intensity,

we're just keeping track of whether there's a relationship or not.

And we could think of those as unweighted.

So whether two people have written an article together,

if they're researchers, or whether two people are friends on some social platform.

That's either true or not true.

And they come in flavors that could be directed or undirected.

So it could be that we're looking at mutual relationships, like co-authorships,

being relatives, being married, or it could be something where there's a one-way

relationship, so one web page can link to another without being linked back to.

It could also be that one article cites another, or you could follow

somebody on a social media platform without having them follow you.

So there's different situations and

we'll represent these depending on exactly what we're looking at in different ways.

So if we are thinking of a simple undirected network, then we can keep

track of this either as what's known as the adjacency matrix.

So that's the g here which is g for graph generally.

2:26

What we're going to do, is we will put in a 1 if the two individuals are connected.

If i and j are linked, then that entry will be a 1.

And it's undirected, so it's symmetric.

So here 1 and 2 are linked together, and

we have that g12 is a 1 and g21 is a 1.

And that's indicating that these two individuals are linked together.

So sometimes we'll represent these things graphically,

sometimes we'll represent them in terms of adjacency matrix.

It can also be quite handy to just list the links.

So for instance, just keep track of a list of which pairs are linked together.

And in terms of notation, it will be a little easier not to carry around the set-bracket

notation all the time, but just to say 12 are linked, 14 are linked, 24, 34.
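As a concrete sketch, the four-node example with links 12, 14, 24, and 34 could be stored both ways, say in Python (node labels here match the lecture's 1-indexed picture):

```python
# Link list for the undirected example: pairs 12, 14, 24, 34.
links = [(1, 2), (1, 4), (2, 4), (3, 4)]

# Build the 4x4 adjacency matrix g from the link list.
n = 4
g = [[0] * n for _ in range(n)]
for i, j in links:
    g[i - 1][j - 1] = 1  # shift to 0-indexed rows/columns
    g[j - 1][i - 1] = 1  # undirected, so the matrix is symmetric

print(g[0][1], g[1][0])  # g12 and g21 are both 1
```

Either object identifies the same network; the matrix just makes the symmetry explicit.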

And why would we want to represent things in this way?

Well when we start working with really large data sets, so

something that has millions or tens of millions or more nodes,

then keeping track of a matrix becomes cumbersome.

So it's going to be hard to put that into memory in a computer.

Whereas if we just keep track of the list of links, that

economizes quite a bit on what you have to keep track of.
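To get a feel for the economy, a back-of-the-envelope count (with hypothetical sizes) might look like:

```python
# Hypothetical sizes: ten million nodes, thirty million links.
n = 10_000_000
m = 30_000_000

matrix_entries = n * n  # a full adjacency matrix needs n*n entries
list_entries = 2 * m    # a link list needs two endpoints per link

# The matrix representation is over a million times larger here.
assert matrix_entries > 1_000_000 * list_entries
```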

And also in terms of our representation, sometimes it will be easier just to

think of sets of links rather than thinking of matrices.

But other times the linear algebra associated with matrices would be

quite handy.

So we'll flip back and forth between these different representations, and

g sometimes will represent a set of links, sometimes it will represent a matrix.

Sometimes it will represent things graphically.

We'll be flipping back and forth between all three in this course.

4:03

Okay, now in terms of directed networks, it's a similar situation but

now we don't have the symmetry, so it's possible that 1 is connected to 2, right?

So there's a directed link from 1 to 2, but the link from 2 to 1 is not present.

So we don't have the reverse relationship.

And so we still have an adjacency matrix.

We can still keep track of things.

But now when we keep track of this g, the order of things makes a difference.

So before if it was undirected, 14 meant 1 and 4 were linked together.

Now we have to keep track if there's a link between 1 and 4 in one direction,

and also back in the other direction, but

between 1 and 2 it only goes in one direction.

So now there can be asymmetries, and

the order of the actual nodes in a link makes a difference.

And there are different conventions, but

basically we'll be thinking of this as, say, the out-degree: an arrow out from node 1.

So 1 is following 2, 1 is connected to 2, but 2 is not connected to 1.

So there's not something going back into node 1.

And that will be the terminology we'll use in

terms of a link going out from the node, or a link coming in to a node.
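The same bookkeeping for the directed case can be sketched in Python: here 1 points to 2 but 2 does not point back, and 1 and 4 point to each other, as in the lecture; the remaining links are illustrative assumptions.

```python
# Directed links as ordered pairs: (source, target). Order matters now.
n = 4
directed_links = [(1, 2), (1, 4), (4, 1), (2, 4), (3, 4)]

g = [[0] * n for _ in range(n)]
for i, j in directed_links:
    g[i - 1][j - 1] = 1  # row = source, column = target

# Out-degree: arrows going out of a node (row sums).
out_degree = [sum(row) for row in g]
# In-degree: arrows coming in to a node (column sums).
in_degree = [sum(g[i][j] for i in range(n)) for j in range(n)]

print(out_degree[0], in_degree[0])  # node 1: two links out, one link in
```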

5:20

Okay, we can also have weighted directed networks.

And when we get to talking about learning especially,

then we'll be looking at this sort of network.

And in particular, what does this look like?

What does this represent?

So if we look at the picture here on our right,

what we have is a series of different nodes that have weights between them.

So in particular for instance node 1 is putting a third of its weight on 2,

a third of its weight on 3 and a third back.

So it weights everybody equally.

And we can think of this as if I'm paying attention to opinions and

I'm trying to form my own opinion.

I might be weighting the information that comes in from each person.

Here, 1 is very democratic and is weighting everybody equally. 2 is

slightly different: 2 puts a weight of one-half on 1 and one-half on itself.

So 2 is not paying attention to 3 at all, but it's putting equal weight on 1 and itself.

When we look at 3 in this case, then we see that 3 is putting

most of the weight on him or herself, and only a quarter on 2.

And there's again, going to be an adjacency matrix in this case.

And here this is what we would call row stochastic.

And the situation where we're thinking really of weights that sum up to 1, so

they're all nonnegative,

somewhere between 0 and 1.

And when we look at the weights in this case, they sum to 1 across each row.

So sometimes we will have weighted, directed networks.

We might be interested in stochastic ones.

So stochastic here means that the sum of the gij across j is equal to 1 for each i.

So we have that row stochastic matrix.
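The three-node weight matrix described above can be written out and checked for row stochasticity, for instance:

```python
from fractions import Fraction as F

# Row i holds node i's weights: 1 weights everyone equally,
# 2 splits its weight between 1 and itself (nothing on 3),
# and 3 puts three-quarters on itself and a quarter on 2.
T = [
    [F(1, 3), F(1, 3), F(1, 3)],
    [F(1, 2), F(1, 2), F(0)],
    [F(0),    F(1, 4), F(3, 4)],
]

# Row stochastic: entries lie between 0 and 1, and each row sums to 1.
assert all(0 <= w <= 1 for row in T for w in row)
assert all(sum(row) == 1 for row in T)
```

Exact fractions are used here so the row sums come out to exactly 1 rather than a floating-point approximation.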

7:10

It might also be that the weighted directed network doesn't sum to 1.

So for instance here if we're thinking of the number of hours that different nodes

spend with each other.

1 is spending seven hours with 2 a week,

2 is spending five hours a week following 1.

1 is spending two hours following 3 and so forth, or we could have hours here or we

could think of this as numbers of emails that are sent in one direction or another.

And they don't have to be reciprocal.

It could be that 2 sends emails to 1 and 1 sends more emails back to 2,

and here 3 sends them to 2, but 2 doesn't send them back.
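A weighted, directed matrix like the hours example need not be symmetric, and its rows need not sum to 1. A sketch, where values beyond those stated above are filled in as assumptions:

```python
# W[i][j] = hours per week that node i+1 spends on node j+1
# (the 7, 5, and 2 come from the lecture; the 3 is an assumption).
W = [
    [0, 7, 2],  # 1 spends 7 hours on 2 and 2 hours on 3
    [5, 0, 0],  # 2 spends 5 hours on 1, none on 3
    [0, 3, 0],  # 3 spends hours on 2, but 2 sends nothing back
]

# Not symmetric, and rows need not sum to 1.
assert W[0][1] != W[1][0]
print(sum(W[0]))  # total hours node 1 spends: 9
```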

So these are all different representations depending on which applications we'll

be looking at.

We'll be switching back and forth, and

of course one limitation here is that all of these are static pictures.

So they're fixed matrices, they're fixed graphs, they're static in time.

Dynamics, we'll talk about a little bit in the course and

we'll be using these over time.

They might be changing, and so

we might keep track of these with time subscripts as well.

Okay, so this is just sort of basic background of how we represent things, and

now we'll talk a little bit more about representing networks and

what we can do with these kinds of representations.