0:00

So now we have done the first attempt at computing pairwise distancing using

Â brute force.

Â We saw it from an efficiency perspective, that's not good.

Â Then we came up with that algorithm that uses basically breadth-first search, and

Â we showed that it is actually efficient, right?

Â That the running time is linear in the number of edges.

Â Quadratic in the number of nodes in the worst case.

Â And now that we have, we have come up with, with an efficient algorithm, we now

Â go back to our original question, which is again, we want to, if we are given a graph

Â that corresponds to our network, Facebook network, computer network and so on.

Â We want to check if it is, if it satisfies or exhibits the small world effect.

Â And for that, remember that the computers or

Â the computational solution was, given the graph,

Â let's compute all pairwise distances and plot the distribution of these distances.

Â Because once we plot the distribution of these distances, we can look at what type

Â of distances we see, what is the frequency of every distance, and so on.

Â For example, if, if all the distances between every pair of nodes,

Â we see distance one, this, then we, we know that it is very, we have very,

Â very short distances.

Â And if we find lots of pairs whose distances are very large, then we know,

Â we don't have a small world [INAUDIBLE] with that, with that graph.

Â Okay?

Â So now, we have an algorithm based on B F S.

Â We are going to use it to compute what we're going to call to compute

Â the distance distribution.

Â Right? So now we can introduce,

Â we can come up with this algorithm, compute.

Â 1:47

Okay?

Â So we want to compute distance distribution.

Â What is the input to the, to this algorithm?

Â Now, we have been dealing with algorithm that took also as part of

Â the input a node.

Â And we computed distances from this node.

Â But when we are looking at this question of the small world phenomenon,

Â we want to compute pair-wise distances between every pair of nodes.

Â So there is no specific node here.

Â We want to compute for every pair of nodes.

Â So the input is

Â only a graph, okay?

Â What is the output?

Â The output is the frequencies of each possible distance.

Â Now how many possible distances before we, we write this we have to

Â figure out how many possible distances can we have in a graph.

Â 2:38

We have talked about this before that if we have n nodes,

Â then the maximum possible distance that we can have,

Â the maximum possible distance that we can have is n minus 1.

Â So if you have only one node in the graph,

Â the distances you're going to see are zero.

Â If you have two nodes, either there is no path between them and the,

Â all the distances are zero, or they're connected by an edge and

Â the distance is 1, which is 2 minus 1.

Â If you have three nodes, for any pair of nodes,

Â the maximum distance that you can have, is two.

Â If you have n, it is n minus 1.

Â So basically we need to return for n minus 1 possible values.

Â We need to return the frequency with which we see each one of these values, okay?

Â So for distance zero, how many pairs of nodes we see at distance zero?

Â For distance one, how many pairs of nodes we see distance one?

Â For distance N minus 1, how many pairs of nodes have distance N minus 1.

Â So the output is, I'm going to store all these in an array.

Â I'm going to call it F for frequency, for example.

Â It's going to be indexed from zero to size of V, minus 1.

Â This notation is going to say if is an array is indexed from zero to,

Â to big v minus 1, the size of v minus 1.

Â Where I'm going to say that if of, of D for example,

Â is the frequency of distance D, okay?

Â With a frequency with which distance D

Â appears over all possible pairs of nodes, okay?

Â So, how would we compute this this output from a graph?

Â So the way to do it, is that again usually algorithms have to initialize things.

Â That before I start computing anything, what are the frequencies?

Â Before I look at the graphs, I have to initialize the,

Â the, this, this array for D is 0 to [SOUND].

Â For every possible distance, I'm going to say that ori, initially, for

Â that distance the frequency is going to be zero.

Â I haven't touched the graph yet, I haven't computed any pairwise distance.

Â I know that every possible distance now appears with frequency,

Â with frequency zero.

Â 5:07

After this, what do we want to do?

Â We want to compute pairwise distance between every pair of nodes.

Â So the way we are going to do it, since we have seen BFS.

Â How does BFS work?

Â BFS takes a node, and compute the distance from that node to every other node.

Â So now, if I want to compute the pairwise distance of

Â every pair of nodes that involve node i, I can run BFS when, for

Â starting from node i, compute the distance to every other one.

Â Now I know the pairwise distance between i and every other node.

Â But I'm not done because now I have to compute the dis,

Â pair-wise distance from every node to every other node.

Â So the way to do this, is by basically going, looping for each node I in V, okay?

Â We can say do here.

Â For each one of them, we are going to do BFS on G and Node I.

Â 6:05

When we run BFS, so I'm calling a function here, BFS,

Â that we had introduced before, we're going to call BFS on G and i.

Â Remember, what does BFS return?

Â It returns it the distances from i to every other node,

Â and I'm going to tell it to return something like D.

Â 7:59

Okay?

Â So for each i, I do BFS from i to

Â every other node computing the distance from i to each one of them.

Â Then for each these, for these distances,

Â I take them one at a time, and I update the frequency of the distances.

Â So if I see dist of j is five, then I add one to f of five.

Â If I see for another node,

Â dist of j is three, I add one to f of three, and I keep updating them.

Â Now once we are done with the first node i, then we go to another node, and

Â do the same thing, another node, we do the same thing.

Â Once I'm done here, I can return f, right?

Â 8:40

So this function here, this algorithm is going to run BFS from every node

Â in the graph, thus computing the pairwise distances between i and every other node.

Â And looping over all possible nodes is going to do it for every pair of nodes.

Â And I will update the, the distances in F,

Â their frequencies, I always add one every time I see it and return F.

Â Notice that I am indexing this F by this dist of J because dist of J is

Â the distance, it's the value of a distance to node J, so I can use it as the index.

Â This dist of g, what are the possible values of dist of g?

Â It can be zero, one, two, three, all the way to the size of v minus 1.

Â So I'm not going to be exceeding or doing anything wrong with indexing this array.

Â Okay?

Â So this is an algorithm that now after, after this

Â sequence of attempts at computing distances, analyzing efficiency and so on.

Â Now we finally have an algorithm that is going to make use of an efficient

Â algorithm for computing distances, that will basically tabulate, or

Â the distance distribution for us, okay?

Â Now that we have done this, what is the running time, how long does it take?

Â And again, now it is very simple because we have done this exercise before.

Â For D equals 0 to V minus 1, do this.

Â If I assume the graph is given by its adjacentcy list,

Â it has N nodes, M edges and connected and so on.

Â This is going to do on the order of N operations, right?

Â For every node is going to, is going to update this value, so

Â it's on the order of N operations.

Â This is going to be on the order of N operations.

Â This here, is order of n.

Â 10:24

Some might think it takes one operation because I'm just calling a function and

Â returning a value, and this is a common mistake.

Â No it doesn't do one operation, it takes as long as BFS takes.

Â Okay? How long does BFS take on a graph that has

Â n nodes and m edges?

Â As we said, if I assume the number of edges is greater than the number of nodes,

Â this takes O of m operations.

Â And this here takes O of n.

Â 11:12

Now this is the dominant term so we can remove this.

Â And as I said if m is greater than n, if m is greater than n, I can remove this n and

Â basically we get an O of N times m times, okay.

Â So we will get here.

Â Again I can drop this n because if m is greater than n,

Â I end up with n times m, and this will be the running time.

Â So when again, when the graph is very dense, m is a node of n squared,

Â in the worst case it's going to be cubic in the number of nodes, okay?

Â Which is still very good algorithm for computing all pairwise distances in

Â an undirected graph with no weights on its edges.

Â So we have this and again for, if the graph is not very dense,

Â it's going to be actually much better than cubic.

Â Okay?

Â So now, again we started with the problem of is the,

Â is the world small for a given network?

Â We for, we, we understood the problem, we formulated it computationally and

Â then we started attempting to come up with algorithm.

Â We came up with brute force algorithm,

Â showed it's not the efficient, came up with BFS, show that it is efficient.

Â Once we have shown that it is efficient,

Â now I can make use of this function and to, to solve our original problem.

Â And we get a very efficient solution for it.

Â