0:00

[SOUND].

So here in lecture 12.5, we're going to finish our analysis of the logic side,

the gate-level delay analysis side of static climbing analysis and we're going

to look at exactly how do I compute the ATS, the RATS, and the slacks.

In a realistic kind of design I've got a million gates or five million gates or

ten million gates. I need to be able to do one quick pass

through this logic or a couple of quick passes through this logic and calculate

everything. So we're going to show you a nice

recursive kind of algorithm which in a forward pass through the graph can label

the arrival times and a backwards pass in the graph can label the required arrival

times can then use the ats and the rats to calculate the slacks.

And get you immediately the worse path, the most violating path if you have a

timing problem. The next thing we're going to do which is

kind of surprising and kind of cool, is that we're going to show you that the

exact same idea for maze routing, the way-front expansion heap, cost-based

expansion can be used to actually report worst case paths, worst delay paths in

path order. So, you know the sort of thing that

happens in a real design is you got ten million gates and you got 30,000 paths

that are violating your timing. What do you want to know?

You want to know the worst one first. What do you want to know next?

You want to know the second worst one, and then the third worst one, and on to

the 29,99ninth worst one. You want them in order.

How do you find them in order? And the answer is an algorithm that looks

exactly like a cost based maze routing. So let's go look at how you calculate the

ATS, how you calculate the RATS, how you calculate the SLACKS, and how you report

the paths. How you enumerate the paths worse, in

worst case order to finish up our discussion of logic side timing.

So let's look for a second and what the, you know, the most typical kind of static

timing analysis problem there is. So you know, you give me your logic, you

tell me your o'clock period. So let's say it's a nanosecond.

So this is a gigahertz clock. And you ask me to answer this question -

what are all the to slow path? Okay, that violate timing.

And the most useful answer is that I report the paths in order from the

slowest to the fastest. Right?

So, in other words, I'm enumerating those paths in, in delay order.

So, not only do I have to be able to tell you the worst path.

I need to be tell you all the worst paths because it doesn't help, right, if you

just tell me one of them. Right so, you know in a big logic network

with, you know, millions and millions of gates, you know, the first time you

design it, the first time you synthesize it, the first time you optimize it you

know, I mean you could have tens of thousands paths that are in violation.

And is just as interesting as side, remember we talking about the fact that

we assumed something was statically sensitizable and there are lots of false

paths. Really in a real world met off in the

case that the designers are the people who are putting the systems you know

specifications together that actually know where are lot of the false paths

are. So if you're actually to really look at a

static timing analysis, kind of the input you know, you might actually have a list

of 30,000 paths that you are telling the tool don't find this please, or if you

find this, and I know you'll probably find this, don't report this to me please

because honestly I don't care, it's not a real path.

Please move on and give me the next path. So, you know?

That, that sort of happens. But you have millions of logic gates in

your, you know? In your, in your network.

You have tens of thousands of paths in violation.

The most useful thing to do is to report them to me in the order of worst to least

violating. So that's the other thing we need to keep

in mind is how to enumerate the problems. So what do we need?

We need to calculate all of the arrival times, the ATs.

We need to calculate all the required arrival times, the RATs.

We need to calculate all the Slacks. And we need to do this very efficiently

because these graphs are gigantic. They're really enormous.

And we also have to figure out- How to enumerate the violating paths in worst

delay order. Okay?

So this is what we want to figure out how to do.

So this is actually all sort of interesting that the delay graph.

And a little bit of more sort of mechanics on this graph will actually let

us do all of this stuff. So, one approach, this is an easy

approach, I like this approach because it is easy to explain is that we do

Topological sorting on the delay graph. And so the way to think about this is

that I am only going to do one recursive thing on the graph which is I am going to

sort it into the nodes into a special order and then pretty much all the

processing, the actual calculation stuff I am going to do on the graph, is really

not going to be recursive. This is going to be sort of manipulating

the nodes. So, a topological sort sorts the vertices

in the clay graph onto one single ordered list with the essential property that if,

if there's an edge from a node P to a node S, P has to appear before S in the

sorted order. So I've got a little graph example here.

There's a source node and then there's nodes B and C...

D and E in sync, so the source has an edge of 3 to B and an edge for 4 to C.

B connects to D with a 5 and E with an 11.

C connects to E with a 9, D connects to sync with a 6, E connects to sync with a

15. A topological sort is a list of the nodes

source B C D E sync in the order so that if there's an edge From one node to the

next, that node appeasr before the next. So for example B has an edge that goes to

D. So B has to appear before D in the order.

And D has an edge that goes to the SNK, so D has to appear before the SNK in the

order. And a topological sort is just a very

simple algorithm that comes up with this order.

So for example, SRC,B,D,C,E,SNK is a, is a legitimate order, SRC,B,C,D,E,SNK is a

legitimate topological sort orders, SRC,B,C,E,D,SNK, SRC,C,B,D,E,SNK,

SRC,C,B,E,D,SNK, those are all. Legitimate topological sorting orders.

And this is a really simple algorithm you can go type topological sorting and

wikipedia comes back immediately and says, oh this is a famous thing, and

wikipedia actually comes back with a little bit of pseudocode and honestly

this is nothing more than a depth first search walk on the graph with a little

bit of processing. So this is a very, very, very easy thing

to do. So, let's assume that you've actually

topologically sorted the graph. Right?

How do you compute the arrival time? So, I've got my little picture again from

the previous couple of lectures ago, of computing the arrival time.

So there's a source node. There's a set of highlighted nodes called

the predecessors. Pred n of node n.

One of them is distinguished with a p, right, and they all have a delay delta

from say from node p to node n. And then there is the remaining side of

the graph. So there are the successor nodes to node

n, which one of them is highlighted with an s, and the successor nodes all have

paths to the sync, but those are all in gray.

What I really care about is node n, and it's predecessors, the predecessors

event. All right, so, if I have topologically

sorted the graph, there is a very simple algorithm, you just walk though the list,

right? So to compute the ATs, the first thing

you do is you set the AT of the source to zero because that's what it is and then

for each node in, in topologically sorted order, the first thing you do is you set

the AT to negative infinity because you're calculating a max and you just

need the first one to. You know, never be the right answer.

And then for each node in the predecessors, right, so you look at all

the nodes in the predecessor set of node n you just calculate.

You know the AT is the max over the AT of node n.

Now this is the running maximum, and the AT of the predecessor node p, plus the

RAT, plus the delay from node p to m. you just simply walk through the node

form the source to to the sync, knowing that you're visiting the nodes in an

order. such that every tiem you get to a node

liek n you knwo youve alresd processed arrival times of predators and you can

calculate straightforwardly this is pretty clse to the definiton of the at we

gave you know 20 slides earlier this is really simple this is why we like

topological soritng it makes it very simple.

How do you calculate the RATs? Well, you know, more or less the same

thing, but, you know, you, sort of, pretend that you’re going in the opposite

direction. So it’s, kind of, like, you’re pretending

all the edges are reversed, that they point from the sink to the source.

So you, you know, you’re just going to walk the graph backwards, which means you

walk the topological sorting order in the other order.

So how do you calculate the RATs? again, assuming you have a topological

sorting you set the RAT of the sync to be cycle time because that's the right

answer. And then for each node and in reverse

topological sorted order. Okay.

you, walk backwards through the graph. and then you look at, in this case, you

look at all the successors, right? So, you look at node n.

You look at all the successor nodes, which you know have been visited already.

Because you've got the. Topological sorting order correct, you'll

look at each successor node S and you calculate the rat with the formula.

The Rat is the minimum of whatever the rat is, because you know you're starting,

this is the running rat computation, or the rat of S the successor minus the edge

delay, Delta from N to S... So all pretty simple, all pretty straight

forward. If you can topologically sort the graph

and you can do that with a very simple depth for a search, then you get this

list with you know, several million nodes in it.

You just walk it straight in order. And you know that every time that you

visit a node in the forward order for the AT, you've already calculated the

predessesor so you can just use the formula.

And you know that when you walk the list in the reverse order, you've already

calculated the RAT for the successor and you can just use the formula.

It's beautiful and simple and easy. Now, another thing that's really not at

all obvious and is kind of cool, is that you can use the slack for path reporting.

Right, so remember one of the things I said was I really want to enumerate the

pads in worst delay order because then I could actually go ask a question like

what are all the pads where the delay greater than 12, right?

In order, I can focus on sort of the biggest problems first, you can use the

slack for path reporting. The slack property is that all the nodes

in the longest path have the same slack Right?

So the surprising result is you can actually use this to find the N worst

paths even though you didn't trace them all.

You can actually enumerate them, one at a time in order, with a really easy

algorhythm. So, let's just actually just sort of put

all the values on this little graph. So this is the same graph I showed

previously. Source b d c e sync source connects with

a 3 to B, 4 to C, B goes to D with a 5, B goes to E with an 11.

C goes to D, C goes to E with a 9, D goes to sync with a 6, E goes to sync with a

15. The cycle time is 29.

Just made it up. The ATs are: the AT for the source is

zero, the AT for B is three, the AT for eight is D, the AT for D is eight, the AT

for C is four, the AT for E is 14, and the sync the AT is 29.

What about the rats? I'm going to do this backwards.

The RAT for the sink is 29. The RAT for D is 23.

The RAT for E is 14. The RAT for B is three.

The RAT for C is five. The RAT for source is zero.

And what are the slacks? The slack is zero at the source, zero for

B. One for C, 15 for D, zero for E, zero for

the sink. Right, so, we can do a forward pass to

get the ATs, a backward pass to get the RAT's.

While we're doing the RAT's we can also' do the Slacks.

So one forward pass to the graph, one backward pass to the graph, all the AT's,

all the RAT's, all the Slacks. Now, there's a surprising and simple

algorithm for, actually, doing the N worst path reporting.

Now, what's really amazing, is that it's basically Maze routing.

And, one of the reasons for talking about the Maze routing stuff first, is that,

you get the, sort of, conceptual framework, and I can do this in one

slide. Right, so we're going to find the N worst

paths by evolving partial paths just like we evolved partial paths in the maze

router, and each partial path stores three things, the path itself, the delay

on this path and the slack of the final node on the path, and we're going to

store the partial paths on a heap just like maze routing.

Only the thing that we're going to use to index the heap is the slack value.

Right, so in[UNKNOWN] routing we were indexing on either the length, you know

the path length, or the estimated path length if we had a predictor, we're just

going to do this on the slack. So we're going to sort so that the path

with the worst slack endpoint is always on the top.

That means you know, we're going to have the heap.

Always make sure that the partial path with the most smallest slack, and that

might even mean a slack with the biggest negative, the most negative number is on

the top. And initially the heat just contains the

two point maze routing, And in the algorithm, the rest of the algorithm

looks very simple and familiar like maze routing.

There's an expansion step, you pop the partial path off the top of the heap.

It has the most negative smallest slack, or at least the smallest slack.

you ask the question have I reached the target?

Is this the end of the synch? If so correct congratulations print out

the path and throw the note away. Otherwise when you expand the path which

means you pull it out of the heap you reach all the other things that are one

edge away. So you add each successor note to make

new partial paths for all of the ed notes that are connected by a edge, and you

push them back on the heap with. A little data structure, that is, the

path, the delay, and the slack value of the, of that new node.

The node at the end of the partial path. And then you just keep going.

Really, it is really just that simple. let's do a little simple example.

So I've got my little graph again. Source, B C, D E Sink.

And I've got all the slack values labelled.

Slack is zero for the source, zero for B, one for C, 15 for D, zero for E and zero

for the sink. And the heap starts as if you will the

source cell just like the source ceill for a maze router.

and it starts, with the source cell, the delay to the source cell, which in this

case is zero, and the slack at the source cell, which is zero.

And so, we're just going to push that thing in the heap.

It's really very much like maze routing, and I've got this highlighted, in the

graph and we're going to see the highlights as these little green boxes

that are going to be trying to show the show the paths.

So this is a path with eactly one cell in it, which is the source.

Right? The delay on this path is 0 because it's

the source and the slack on this path is 0.

And what we're going to do is evolve paths so we're going to actually going to

use this You know, just like May's routing, we're going to be reaching our

neighbors. And you know, you shouldn't be surprised

that we're going to be reaching B and reaching C in the next slide.

We're going to be calculating some stuff. We're going to be calculating some

partial paths. We're going to be shoving things back in

the heap. The heap is going to be driving the

stuff. We pop things off the heap.

We use this to sort of expand, define, define new paths and we throw the cell

away. So, right now it's just like MACE

routing. There's one cell in the heap, it's the

source. The delay is zero.

The slack is zero and let's remember that what we, what we do to sort of make this

process go forward is we index on the slack.

Okay, so what happens next? Well, we take the source path and we pop

it off of the heap, and we use that to reach b and c.

And so we actually get 2 new paths, right.

We get the source to b path, and we get the source to c path.

And the source to b path has the length of 3.

And the source to c path has the length of 4 but we put them indexed in the heap.

On the slack. And so, it is the zero slack of the b

node which is the end of the source to b path.

That's the minimum of the heap. That's the thing that's the top of the

heap. So there's 2 paths in the heap right now.

Source to b, source to c. But the thing that's on the top of the

heap. Is the 0 slack path source to be.

So, what are we going to do, we are going to take that path off, we are going to

pop it, we are going to expand it, we are going to use it to reach it neighbors

Okay. So, going forward we are expanding the

source to B path and we are going to use that to reach it's neighbors.

So, what are we going to get, we are going to reach D and so we are going to

get source B to D path. And then we're going to also reach E, so

we're going to get a source to be to E path, and so I'm still highlighting these

things with kind of green so you can get sense.

There's a source to be to D path and the slack value on the end of that one is 15.

There's a source to be to E path and the slack value is 0, and there's a source to

C path... And the slack value in that is one.

And so when you put those paths in the heap source to be to E source to be to D

source to C, again indexed on the slack value.

So what's on the top? The source to B to E path because it has

the smallest slack. So what are we going to do?

We're going to take the thing off the top of the heap, the source to B to E path,

we're going to pop it, we're going to expand it, that's what we're going to do

next. So going forward, the source to B to E

path is being expanded, we've popped it out of the heap, and we're going to reach

what? We're going to reach it's neighbor which

is the sink, and so we get a source to B to E to sink path Oh hey, the sink is

like the target in May's routing. Hooray.

Only I've, I haven't routed a wire, I have found a path.

I print it out. I print out source BE SNK with a delay.

That delay is 29, and then I throw that away.

And I go back, and I look at okay, what's the next most the next smallest slack

path in the heap, and the answer is it's the source to c path, right, it's this

path the source to c path. So I'm going to pop that off the heap and

I'm going to expand it, just like maze routing, I'm going to use it to reach its

neighbors. Alright, so, that's going to happen next.

What happens? I'm expanding the source-to-C path, and I

reach E. Now, one of the things to note, is that,

in this particular algorithm I'm going to reach e again.

So there's none of this, you know? A bit that you mark that says, oh, I've

reached it, and I don't need to reach it again, you know?

And, sort of, the maze routing with a consistent path, with a consistent cost

function. Look, we're looking for longest delay

paths in the graph. We don't care if a bunch of them go

through the same gate. In fact, you know?

A bunch of them might go through the same gate.

That's just[INAUDIBLE] . That's just the way it works.

So we pop the source to C Path, and we expand it.

What do we expand it to? We expand it to E, okay.

And so we take it, and we push it back in.

What do we push it back in on? We push it back in on the slack value,

alright. The slack value is zero.

So we get a source to see the E path with a slack value of zero.

What else is sitting around in the heap? What else is sitting around in the heap

is a source to B to D path. With a slack of 15.

Okay, what do we do? Go back to the heap, pop the smallest

cell off the top. The smallest cell, you know the cell with

the worst slack value, the slack is zero. We pop the source CE path, and we expand

it. So, pop the source CE path, and what

happens? We expand it.

When we expand it we reach the SNK. Oh, hey!

It's like routing the net again. We have a source to C to SNK path that's

like reaching the target. We print the delay out.

What's the delay on this one? The delay on this one is 28.

And then we throw it away. Just like you expand a cell, you throw it

away in the maze routing. Okay.

Then we go back, and what's still sitting around in the heap?

And the answer is, what's sitting around in the heap is a source to b to d path.

And that has a slack of 15. It turns out, that's the only thing in

the heap right now. So, we are going to expand that.

We're going to pop it off the heap and we're going to expand it and see what

happens. So, we expand it and we're gonnas pop the

source BD path and we're going to reach the sink again.

Hurray! Okay, what happens?

Now, we reach the sink with a path delay of 14.

Great. We print out the source B D sink path of

14, and we go back, and we say, all right, what else is in the heap?

And the answer is, oh, nothing. The heap is empty.

Or in this particular case, like, oh, okay.

We've printed all the paths for this particular graph, which if you look at

it, makes sense. There's only three ways you can get from

the source to the sink in this graph. Look at it, SRC BD SNK, SRC BE SNK, SRC

CE SNK, and look, my little maze routing kind of algorithm where I evolve the

partial path. I index it by the slack on the end node

of the partial path. I push it into the heap, like reaching

things. I pop it out and expand it just like maze

routing I acknowledge the fact that I will reach the same node multiple times.

I don't worry about it. For this particular algorithm I can print

out for you all of the paths in the worst delay order and believe me when you have

millions and millions of gates and you know zillions of paths and you know that,

you know probably the first 35000 paths are screwed up, right, because you know

your timing is not optimized yet and you just want to print them out in the order

of worst to least worst. The fact that you can do this super fast.

Is wonderful. Right?

It's a beautiful, simple algorithm. And it's one of the reasons that I talk

about maze routing before I talk about this stuff, because it really feels just

like maze routing. So, with that.

you know? We basically, we come to the end of our,

of our static timing analysis. Static timing analysis is a very

important step in the design of complex ASICs.

It's what's called a sign off step. And what that means is you don't get to

fabricate. Or you don't get to go to the next step

of the design process until you pass this.

So people spend a lot of time Qualifying their design, optimizing their design,

changing their design, fixing their design so that it passes the static

timing analysis sign-off. You can do it early in the process when

you just have the logic aids and you can either know nothing about the wires, or

really lousy models for the wires. The more information you have about your

gates, the more information you have about your wires.

The more accurate your, static timing analysis can be.

Many big ideas, you know, to get to this point in our lecture.

Gate level delay models matter. And I'm sorry to say, they can just be

really complex in the real world. It's just the way the physics work.

Logical analysis of the graph is not what we're doing.

We're doing topological path analysis. We're assuming all paths are

sensitizable. No paths are false.

Honestly, in real static timing, somebody who's really smart and knows a lot about

the logic, will tell you that they're just some false paths.

And they'll put them in the file, and they'll say, please don't report these.

We both know they're false. You build delay graph calculate the ATs

the RATs and the slacks recursively i gave you a really simple algorithm were

you topologically sort the graph from the source to the sin walk from the source to

the sink calculate the ATs walk from the sink to the source you calculate the rats

and the slacks and then you can do this cool path reporting stuff.

The concept of slack is very, very big it's, you know, it's, it's an incredibly,

incredibly important idea because it drives the path of numeration process and

because it tells you how much trouble you're in on a gate by gate basis in a

gigantic network. All right, so it lets you locate the

worst paths and the problem gates, and an idea surprisingly like maze routing lets

you enumerate the paths in worst delay order, which is really a very, very

pretty algorithm. Finally the static timing world is so

big, and the world of timing is so complicated, I have to admit but just,

you know a few of the things I didn't tell you.

I didn't tell you anything about static timing analysis with sequential elements.

I pretended the flip-flops didn't exist, and that's not really good enough.

how do you model the flip-flops or latches so you can verify, for example,

that the setup in the hole time, are not being violated?

you know, for a design that has a million flip-flops in it?

it turns out there's just some more tricks with the delay graph.

it's very interesting. You can take the delay graph, and you add

some edges to it that represent the timing things.

Turns out you add some loops to the graph, which is a little disconcerting

But it turns out you add some loops to the graph you calculate some ATs and some

RATs. And some things and then you check the

values on some edges and sort of like the slacks if something bad happens to the

number you know you have a problem. there's a whole other kind of analysis

that I didn't show you. What i showed you is technically called

Late Mode Timing analysis, so it is, I worry about the longest path through the

graph. I worry about the fact that my clock is

in gigahertz, I have one nanosecond, one thousand picoseconds to get, you know,

from the input of the logic to the output of the logic and that's what we worried

about... there's early mode timing.

If you have things like transparent latches, where you're worried about the

shortest paths, and, you know, maybe, you know, going around the loop from the

storage element through the logic more than once in one clock period.

There's some fancy kinds of timing that use these sorts of sophisticated

analysis. There's a whole other set of formulas.

They look very much like the late mode, but they're just a little bit harder to

understand if you're not working with some of the more fancy timing problems.

There's also incremental static timing analysis.

So, you know, in practice you have a million logic gates, and you change

10,000 of them because some logic designer fixes something, you really

don't want to redo the whole static timing analysis.

You just want to go in and change the stuff that changes.

there's some very very nice algorithms that can just incrementally look at, what

changed in your logic. And so, what's going to change in your

static timing analysis? You know?

What are the, the ATs that can change? What are the RATs that can change?

What are the, what are the, what are the slacks that can change?

Don't update all 20 million of them if you only have to update 500,000 of them.

so they're very pretty algorithms that do that and again just didn't have time to

talk about it, static timing analysis is a huge, huge part of the way people

actually build, build chips and so you now know the logic side of things.

What we need to talk about next is the geometry side because you know what, you

know what's between all those gates... Wires, and wires have their own

complicated timing behavior. So let's go look at those next.

[SOUND]