0:36

So that's just a way to encode an ordering, and then here's really the important property: every directed edge of G goes forward in the ordering. That is, if (u,v) is a directed edge of the directed graph G, then the f value of the tail should be less than the f value of the head. Put another way, the head of a directed edge has the higher f value as you traverse the edge in the correct direction.
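To make the definition concrete, here is a minimal Python sketch (my illustration, not from the lecture) that checks this forward-edge property for a labeling f, with the graph given as an adjacency-list dict:

```python
# Hypothetical helper: verify the defining property of a topological
# ordering -- for every directed edge (u, v), f(u) < f(v).

def is_topological_ordering(graph, f):
    """graph: dict mapping each vertex to its list of out-neighbors.
    f: dict mapping each vertex to its label (position)."""
    return all(f[u] < f[v] for u in graph for v in graph[u])


# The four-vertex example from the lecture: edges s->v, s->w, v->t, w->t.
graph = {"s": ["v", "w"], "v": ["t"], "w": ["t"], "t": []}
print(is_topological_ordering(graph, {"s": 1, "v": 2, "w": 3, "t": 4}))  # True
print(is_topological_ordering(graph, {"s": 2, "v": 1, "w": 3, "t": 4}))  # False: (s,v) goes backward
```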

Let me give you an example just to make this more clear. So suppose we have this very simple directed graph with four vertices. Let me show you two different, totally legitimate topological orderings of this graph. So the first thing you could do is label s 1, v 2, w 3, and t 4. Another option would be to label them the same way, except you swap the labels of v and w; so if you want, you can label v 3 and w 2. So again, what these labelings are really meant to encode is an ordering of the vertices. The blue labeling you can think of as encoding the ordering in which we put s first, then v, then w, and then t; whereas the green labeling can be thought of as the same ordering of the nodes, except with w coming before v. What's important is that the pattern of the edges is exactly the same in both cases, and in particular all of the edges go forward in the ordering. So in either case we have s with edges from s to v and from s to w.

Â 2:11

So that looks the same way pictorially whichever order v and w are in, and then symmetrically there are edges from v and w to t. So you'll notice that no matter which order we put v and w in, all four of these edges go forward in each of these orderings. Now if you tried to put v before s, it wouldn't work, because the edge from s to v would be going backward if v preceded s. Similarly, if you put t anywhere other than the final position, you would not have a topological ordering. So in fact, these are the only two topological orderings of this directed graph; I encourage you to convince yourself of that.

Now, who cares about topological orderings? Well, this is actually a very useful subroutine. It comes up in all kinds of applications; really, whenever you want to sequence a bunch of tasks when there are precedence constraints among them. By a precedence constraint I mean that one task has to be finished before another. You can think, for example, about the courses in some kind of undergraduate major, like a computer science major. Here the vertices are going to correspond to all of the courses, and there's a directed edge from course A to course B if course A is a prerequisite for course B, that is, if you have to take it first. So then, of course, you'd like to know a sequence in which you can take these courses, so that you always take a course after you've taken its prerequisites. And that's exactly what a topological ordering will accomplish. So it's reasonable to ask: when does a directed graph have a topological ordering, and when a graph does have such an ordering, how do we get our grubby little hands on it? Well, there's a very clear necessary condition for a graph to have a topological ordering, which is that it had better be acyclic. Put differently, if a directed graph has a directed cycle, then there is certainly no way it is going to have a topological ordering.

Â 6:24

By assumption there's no sink vertex, so this isn't a sink vertex, so there's an outgoing arc; let's follow it. We get to some other node. That also has an outgoing arc; let's follow that, and so on. So we just keep following outgoing arcs, and we can do this as long as we want, because every vertex has at least one outgoing arc. Well, there's a finite number of vertices; say this graph has n vertices. So if we follow n arcs, we are going to see n+1 vertices. So by the pigeonhole principle, we're going to have to see a repeat: if we see n+1 vertices among only n distinct vertices, we're going to see some vertex twice. So for example, maybe after I take the outgoing arc from this vertex, I get back to this one that I saw previously. Well, what have we done? What happens when we get a repeated vertex? By tracing these outgoing arcs and repeating a vertex, we have exhibited a directed cycle. And that's exactly what we're assuming doesn't exist; we're talking about directed acyclic graphs. So, put differently, we just proved that a graph with no sink vertex has to have a directed cycle. So a directed acyclic graph therefore has to have at least one sink vertex.
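This argument translates directly into a little procedure. Here is a hedged Python sketch (my illustration, under the assumption of an adjacency-list dict): keep following outgoing arcs; in a DAG you can never repeat a vertex, so within n steps you must get stuck at a sink.

```python
def find_sink(graph):
    """Return some sink vertex of a directed acyclic graph.
    graph: dict mapping each vertex to its list of out-neighbors."""
    v = next(iter(graph))              # start at an arbitrary vertex
    seen = {v}
    while graph[v]:                    # v still has an outgoing arc...
        v = graph[v][0]                # ...so follow it
        if v in seen:                  # a repeat would exhibit a directed cycle
            raise ValueError("graph has a directed cycle")
        seen.add(v)
    return v                           # no outgoing arc: v is a sink
```

On the four-vertex example with edges s->v, s->w, v->t, w->t, the walk gets stuck at t, the unique sink.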

Â 8:39

So let v be a sink vertex of G; if there are many sink vertices, we pick one arbitrarily. We set v's label to be the maximum possible, so if there are n vertices, we're going to put it in the nth position. And now we just recurse on the rest of the graph, which has only n-1 vertices.
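As a sketch, this recursive approach might look like the following in Python (an illustration, not the lecture's own code, assuming an adjacency-list dict; deletion is simulated by a shrinking `remaining` set rather than an actual recursive call):

```python
def topo_sort_by_sinks(graph):
    """Assign labels n down to 1 by repeatedly plucking off a sink
    of the residual graph."""
    remaining = set(graph)
    f = {}
    label = len(graph)
    while remaining:
        # a sink of the residual graph: no outgoing arc into `remaining`
        sink = next(v for v in remaining
                    if not any(w in remaining for w in graph[v]))
        f[sink] = label          # put it in the last unfilled position
        remaining.remove(sink)   # "delete" the vertex and its edges
        label -= 1
    return f
```

Because the sink of the residual graph is chosen arbitrarily, different runs can produce different (but all valid) topological orderings, just as in the example.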

So how would this work in the example on the right? Well, in the first iteration, or the first outermost recursive call, the only sink vertex is this rightmost one, circled in green. There are four vertices, so we're going to give it the label 4. Then, having labeled it 4, we delete that vertex and all the edges incident to it, and we recurse on what's left of the graph, so that would be the leftmost three vertices plus the leftmost two edges. Now, this graph has two sink vertices after we've deleted 4 and everything incident to it: both this top vertex and this bottom vertex are sinks in the residual graph. So in the next recursive call, we can choose either of those as our sink vertex. Because we have two choices, that generates two topological orderings; those are exactly the ones that we saw in the example. But if, for example, we choose this one to be our sink vertex, then it gets the label 3. Then we recurse just on the northwestern-most two edges. This vertex is the unique sink in that graph, so it gets the label 2. And then we recurse on the one node that remains in the graph, and it gets the label 1.

So, why does this algorithm work? Well, there are just two quick observations we need. First of all, we need to argue that it makes sense, that in every iteration, or in every recursive call, we can indeed find a sink vertex to assign to the final position that's still unfilled. And the reason for that is just that if you take a directed acyclic graph and you delete one or more vertices from it, you're still going to have a directed acyclic graph, right? You can't create cycles by just getting rid of stuff; you can only destroy cycles, and we started with no cycles. So throughout all the intermediate recursive calls we have no cycles, and so by our first observation there is always a sink.

Â 13:32

And if you find any vertex that you haven't seen before, you immediately start recursing on that node. So I said the first modification we need is to embed this into an outer for loop to ensure that every single node gets labeled. So I'm going to call that subroutine DFS-Loop. It does not take a start vertex. Initialization: all nodes start out unexplored. And we're also going to keep track of a global variable, which I'll call current_label. This is going to be initialized to n, and we're going to count down each time we finish exploring a new node. And these will be precisely the f values; these will be exactly the positions of the vertices in the topological ordering that we output.

In the main loop we're going to iterate over all of the nodes of the graph; so for example, we just do a scan through the node array. As usual, we don't want to do any work twice, so if a vertex has already been explored in some previous invocation of DFS, we don't search from it. This should all be familiar from our embedding of breadth-first search in a for loop when we computed the connected components of an undirected graph. And if we get to a vertex v of the graph that we haven't explored yet, then we just invoke DFS in the graph with that vertex as the starting point. So the final thing I need to add is to tell you what the f values are, what the actual assignments of vertices to positions are. And as I foreshadowed, we're going to use this global current_label variable, and that will have us assign vertices to positions from right to left, very much mimicking what was going on in our recursive solution, where we plucked off sink vertices one at a time. So, when's the right time to assign a vertex its position? Well, it turns out the right time is when we've completely finished with that vertex, when we're about to pop the recursive call corresponding to that vertex from the stack.
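Putting the pieces together, here is one possible Python rendering of DFS-Loop (a sketch, assuming an adjacency-list dict; the names `explored` and `current_label` follow the lecture's terminology):

```python
def topological_sort(graph):
    """DFS-Loop: DFS from every not-yet-explored vertex, assigning each
    vertex its f value only when we finish with it, while a global
    current_label counts down from n to 1."""
    explored = set()
    f = {}
    current_label = len(graph)         # initialized to n

    def dfs(u):
        nonlocal current_label
        explored.add(u)
        for v in graph[u]:
            if v not in explored:
                dfs(v)
        f[u] = current_label           # u is completely finished
        current_label -= 1

    for u in graph:                    # outer loop over all nodes
        if u not in explored:          # don't do any work twice
            dfs(u)
    return f


# Running example (the outer loop happens to consider v first):
graph = {"v": ["t"], "s": ["v", "w"], "w": ["t"], "t": []}
print(topological_sort(graph))  # {'t': 4, 'v': 3, 'w': 2, 's': 1}
```

Note that for very deep graphs a recursive DFS can hit Python's recursion limit; an explicit stack avoids that, but the recursive version matches the lecture's presentation most directly.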

Â 15:30

And that's it; that is the entire algorithm. So the claim concerns the f values produced, which you'll notice are going to be the integers from n down to 1, because DFS will eventually be called exactly once on every vertex, and each vertex gets some integer assignment at the end. Everybody is going to get a distinct value; the largest one is n and the smallest one is 1. The claim is that this is a topological ordering. Clearly this algorithm is just as blazingly fast as DFS itself, with just a trivial amount of extra bookkeeping.

Let's see how it works on our running example. So let's just say we have this four-node directed graph that we're getting quite used to. This has four vertices, so we initialize the current_label variable to be equal to 4. So let's say that in the outer DFS loop, we start somewhere like the vertex v; notice that in this outer for loop, we wind up considering the vertices in a totally arbitrary order. So let's say we first call DFS from this vertex v. What happens? Well, the only place you can go from v is to t, and then at t, there's nowhere to go. So we recursively call DFS at t; there are no edges to go through, we finish the for loop, and so t is going to be assigned an f value equal to the current label, which is n, and here n is the number of vertices, which is 4. So f(t), the label our t gets, is 4. Now we're done with t, so we decrement the current label and backtrack to v. Back at v, there are no more outgoing arcs to explore, so its for loop finishes.

So we're done with v in depth-first search, and it gets the new current label, which is now 3; and again, having finished with v, we decrement the current label, which is now down to 2. So now we go back to the outer for loop; maybe the next vertex we consider is the vertex t. But we've already been there, so we don't bother to DFS from t. And then maybe after that we try it on s; so maybe s is the third vertex that the for loop considers. We haven't seen s yet, so we invoke DFS starting from the vertex s. From s there are two arcs to explore. The one to v: v we've already seen, so nothing's going to happen with the arc (s,v). But on the other hand, the arc (s,w) will cause us to recursively call DFS on w. From w, we try to look at the arc from w to t, but we've already been to t, so we don't do anything. That finishes up w, so depth-first search finishes at the vertex w, and w gets the assignment of the current label, so f(w) = 2. We decrement the current label; now its value is 1. Now we backtrack to s; we've already considered all of s's outgoing arcs, so we're done with s. It gets the current label, which is 1. And this is indeed one of the two topological orderings of this graph that we exhibited a couple of slides ago.

So that's the full description of the algorithm and how it works in a concrete example. Let's just discuss its key properties: its running time and its correctness. As far as the running time goes, the running time of this algorithm is linear; it's exactly what you'd want it to be. And the reason the running time is linear is for the usual reasons that these search algorithms run in linear time. You're explicitly keeping track of which nodes you've been to so that you don't visit them twice, so you only do a constant amount of work for each of the n nodes. And in a directed graph, you actually look at each edge only once, when you visit the tail of that edge, so you only do a constant amount of work per edge as well.

Of course, the other key property is correctness. That is, we need to show that you are guaranteed to get a topological ordering. So what does that mean? That means every arc travels forward in the ordering. So if (u,v) is an edge, then f(u), the label assigned to u by this algorithm, is less than the label assigned to v. The proof of correctness splits into two cases, depending on which of the vertices u or v is visited first by depth-first search. Because of our for loop, which iterates over all of the vertices of the graph G, depth-first search is going to be invoked exactly once from each of the vertices. Either u or v could be first; both are possible. So first let's assume that u was visited by DFS before v. So then what happens? Well, remember what depth-first search does: when you invoke it from a node, it's going to find everything findable from that node. So if u is visited before v, that means v hasn't been explored yet, so it's a candidate for being discovered. Moreover, there's an arc straight from u to v, so certainly DFS invoked at u is going to discover v. Furthermore, the recursive call corresponding to the node v is going to finish, that is, get popped off the program stack, before that of u. The easiest way to see this is just to think about the recursive structure of depth-first search. When you call depth-first search from u, that recursive call is going to make further recursive calls to all of the relevant neighbors, including v, and u's call is not going to get popped off the stack until v's has been popped beforehand. That's because of the last-in, first-out nature of a stack, or of a recursive algorithm.
