So, here in Lecture 12.2, we're going to start our exploration of the logic side of timing analysis. So, what do we need to know?
We need to know what the basic assumptions are about the timing universe, and the first assumption is going to be synchronous logic:
things leave storage elements, like flip flops, go through a big block of logic, and return to storage elements like flip flops.
And in this lecture, we're really just going to talk about the combinational side of things. Next, we have to ask: well, where's
the delay? And the first thing we're going to have to talk about is what delay models for individual gates look like.
And one of the things that's a little bit surprising is that they can be really complicated.
So we're going to talk about how complicated they can be, but then we're going to restrict things to a sort of simple,
reasonably realistic universe of delay models that are commercially viable, but simple enough that we can actually do
some examples. And then we are going to talk about the
fact that in the actual way we do timing for logic, we stop looking at the logic.
We stop looking at anything that looks like Boolean algebra.
And we just look at these things as big complicated graphs.
And so, we're going to talk about why, from a complexity point of view, we do
something called a topological timing analysis, and not logical timing
analysis. So, let's go start looking at sequential
things, the combinational parts thereof, the delays through the gates, and
topological timing analysis. >> So, when we say that we're
interested in doing timing analysis at the logic level, what are we
actually talking about? Well, our goal is to verify the timing
behavior of our logic design. So, here's the scenario.
I give you a gate-level netlist and I give you some timing models of the gates
and, maybe after the placement and the routing, I give you some timing models of
the wires. And you have tools in place that can tell
me the following answers. when signals arrive at various points in
the network, the longest delay is through the gate level network.
does the network satisfy timing requirements?
So, suppose I tell you that I want this chip to run at 1 gigahertz, which means
there's one nanosecond between the edges of the clock that control the flip flops.
Is it the case that all of the logic, the combinational logic is such that, if a
signal enters a block of combinational logic, it arrives at the other side no more than one
nanosecond later? All right.
Those are the kinds of questions I want answered.
And if we do this analysis on our logic and it turns out that the answer is: oh
yeah, the logic's too slow, I can't get all of the paths through the logic in 1
nanosecond, some of them are 1.05 nanoseconds...
then where do I look? A modern design has millions and
millions and millions of gates. It would be great if the analysis
techniques came back and pinpointed exactly where my problems are.
And so, I'm going to show you some techniques that can answer all of those
questions. And in particular, and maybe in a
surprising way, answer the question: exactly where's my problem?
What should I go focus on to fix? Now, the unfortunate thing is
that, given the way the electrical and physical models work,
a lot of this delay behavior is just complicated in the real world.
So, we're going to talk about that for a little bit,
and we're going to talk about how we simplify it for the purposes of this lecture.
First, however, I just want to do a few acknowledgments.
Very early versions of this lecture used some material from my friends Karem
Sakallah, who is now at the University of Michigan, and Tom Szymanski, who at the
time was at AT&T Bell Labs. And this version has benefited
extensively from input from my friend Dave Hathaway at IBM.
Dave is actually the principal designer of EinsTimer, which is IBM's production
static timing tool. So, every processor, every ASIC,
every big chip that gets built by IBM runs through Dave's EinsTimer
tool, which does very sophisticated static timing analysis.
And the current version also benefited from versions of these lectures
that were taught by John Cohn, my former Ph.D.
student, at IBM, and by Dave, who were teaching this material to some folks at the
University of Vermont and also some folks at IBM.
So, lots of thanks to everybody for giving me lots of
useful feedback and lots of useful criticism that very much helped
the quality of this lecture. I just want to acknowledge all of them
for the help. So, let's talk about analyzing the
performance of a design. The first thing I have to
assume, and this is really important, is that the design is synchronous.
And so, that means all of the storage is in explicit sequential elements.
So, you know, things like flip flops. The simplest way to draw this sort of
thing is that there is a whole bunch of flip flops,
and they are at, if you like, the start of the combinational logic, the input.
And then there's a whole bunch of flip flops that are at the outputs of the
combinational logic. And there's a common clock
connecting all of those things. And you know, the clock edge comes along.
I'm going to write this carefully:
the clock edge launches the data out of the flip flops into the combinational logic, and
so it goes through the delays. Right?
It takes however much time it takes to get through the logic.
And then it arrives at some flip flops, where we hope it is captured.
And so, I'm just writing "capture". And we often draw the clock in
kind of a very special way. Right?
So, there's just one cycle of the clock.
We often talk about the launch edge of the clock and the capture
edge of the clock, assuming that we are actually talking about something like
a positive edge-triggered D flip flop.
And although this is a highly stylized kind of diagram, please
just be aware that this is just a finite state machine.
Right? Flip flops and some logic
that goes between the flip flops.
It's not necessarily the case that the flip flops on the left are different than
the flip flops on the right. I mean, we
really could just have a flip flop that I can draw over here, with a D
input and a Q output, and a place where the clock goes in.
And the output comes from Q, it goes to this cloud of
logic here, and it goes back into the D input.
Those are the kinds of circuits we're actually talking about.
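The feedback structure just described, a register whose Q output runs through a cloud of combinational logic and back to its D input, can be sketched in a few lines. This is a hypothetical illustration, not anything from a real tool; the 3-bit counter merely stands in for the combinational cloud.

```python
# A minimal sketch of the synchronous model just described: flip flops
# hold state, combinational logic computes the next value from Q, and
# each clock edge captures that value back into the flip flops.

def comb_logic(state: int) -> int:
    """Stand-in for the combinational cloud: here, a 3-bit counter's
    next-state function. Any pure function of the state would do."""
    return (state + 1) % 8

def run_clock_cycles(initial_state: int, cycles: int) -> int:
    state = initial_state          # value held in the flip flops
    for _ in range(cycles):
        d = comb_logic(state)      # signal propagates through the logic...
        state = d                  # ...and the clock edge captures it at D
    return state

print(run_clock_cycles(0, 5))      # -> 5
```

The point of the sketch is only the structure: all state lives in the explicit register, and the logic between Q and D is purely combinational.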
We're just not going to be talking about any of the subtle timing things that happen
right at the inputs or the outputs of the flip flop.
I'll mention that again when we get to the end.
There are ways of incorporating all of those effects into the
models that I'm showing you. We just really don't have time to talk
about that stuff. So here's a question you're very possibly thinking, if you haven't
encountered this kind of timing analysis in a real
commercial ASIC design scenario: can we just simulate this stuff?
You know, we have great simulation tools.
We have Verilog simulation tools, and we have VHDL simulation tools, and we have
SystemC and all these other great things.
If I want to know how fast the logic goes, can't I just simulate it
really, really hard? Run a really large
number of inputs into it and see how slow it is?
And the problem is that what logic simulation does is
determine how the system will behave. It simulates the logical functions,
so it gives the most accurate answer when you have good simulation
models. But it's practically impossible for it to give a
complete answer, especially with respect to timing.
In order to be really confident that I understand what the
worst-case delay of a big block of a few million gates is, I need some
exponential number of inputs, because I don't want to just know that, for all of
the inputs I tried, the delay is such and such.
I need to know that under any possible scenario of inputs, the delay will never
be longer than some number. And you just can't get that sort of
guarantee from simulation; you need a different kind of
technique. So there's no way I can examine all
possible input vectors with all possible relative timing.
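To put a rough number on that blow-up, here is a back-of-the-envelope sketch of the counting argument. The 64-input block is purely illustrative, and note that delay depends on input *transitions*, so an exhaustive approach would really need ordered pairs of vectors, which is even worse.

```python
# Why exhaustive timing simulation is hopeless: the vector counts grow
# exponentially. Figures are illustrative, not from any real design.

def exhaustive_vectors(n_inputs: int) -> int:
    """Distinct input vectors for an n-input combinational block."""
    return 2 ** n_inputs

def exhaustive_transitions(n_inputs: int) -> int:
    """Delay depends on input transitions: ordered pairs of vectors."""
    return exhaustive_vectors(n_inputs) ** 2

# Even a modest 64-input block is far out of reach:
print(exhaustive_vectors(64))      # 18446744073709551616, about 1.8e19
print(exhaustive_transitions(64))  # about 3.4e38 vector pairs
```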
And there's some nasty stuff that happens at the nanoscale, with
manufacturing imperfections changing the timing behavior of transistors
that are a couple hundred atoms across.
We just need a whole different solution. So, simulation is great.
We rely on it for functional correctness. We cannot rely on it for this kind of
timing. We need a whole different technology.
And it needs to be not only different, it needs to be fast,
because we're going to do this a lot. So, first, the basic model for our timing
is that we know something about the clock cycle, right?
I need to know how fast this thing is supposed to run in order to understand
whether it's running fast enough or whether it's got a problem and it's slow.
So, just as a concrete example,
let's say I assert that the clock is 1 gigahertz, which means there is one
nanosecond between the clock edges. And I'm just going to draw the little
picture over here. So, here's my clock: there's
a positive, up-going edge, and then it goes over, it goes down, it
goes back up...
those clock edges is 1 nanosecond. And so, again, I've got my diagram of
the kind of logic that I'm looking to analyze.
There's a bunch of flip flops going in on the left of this logic and a bunch of
flip flops on the output of this logic. The flip flops on the left are launching
data into the logic. The flip flops on the right are capturing
data from the logic. Like I said before, they might not
be different flip flops, but this is
conceptually a nice way to think about this.
So, what do I know? I know that for this logic to work
successfully, the longest delay through
this network of logic must be shorter than 1 nanosecond.
And so, I'm just going to put a great big arrow over the top
of the logic. I know that when things show up at the
output of the flip flops, they had better be able to get through that big gray
cloud of logic in less than 1 nanosecond, because 1 nanosecond later the positive
edge of the clock comes along again, grabs the output of that logic, and
captures it in the flip flops. So, I had better be able to get through that
logic in less than a nanosecond. That's the kind of question that we're
going to answer. You give me a million gates of logic.
You ask: so, how fast can it go? You tell me: actually, I'd like it to go
at a gigahertz, please. I'll analyze it, and I'll come back and
I'll be able to tell you things like: yes, I can verify that all the
paths are shorter than 1 nanosecond. Or: no, these 35,000 paths are longer than
one nanosecond, and, by the way, here are your problem points.
These are the places in your logic you really ought to go look.
If you fix these, maybe you can fix the whole thing.
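The kind of analysis being promised here can be sketched as a longest-path computation over the gate network, viewed purely as a graph. This is a toy illustration of the idea, not a real static timer; the gate names, delays, and connectivity are invented.

```python
# A toy longest-path timing check: treat the gate network as a directed
# acyclic graph, compute the latest arrival time at each gate output in
# topological order, and compare the worst arrival to the clock period.

from collections import defaultdict

def longest_delays(gates, fanout):
    """gates: {name: delay_ns}; fanout: {name: [successor names]}.
    Returns the latest arrival time at each gate's output."""
    indeg = defaultdict(int)
    for g, succs in fanout.items():
        for s in succs:
            indeg[s] += 1
    ready = [g for g in gates if indeg[g] == 0]   # driven only by flip flops
    arrival = {}
    while ready:                                  # Kahn-style topological order
        g = ready.pop()
        preds = [p for p in gates if g in fanout.get(p, [])]
        arrival[g] = max((arrival[p] for p in preds), default=0.0) + gates[g]
        for s in fanout.get(g, []):
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    return arrival

# Four gates, delays in nanoseconds; g1 fans out to g2 and g4:
gates = {"g1": 0.3, "g2": 0.4, "g3": 0.1, "g4": 0.5}
fanout = {"g1": ["g2", "g4"], "g2": ["g3"], "g4": ["g3"]}
arrival = longest_delays(gates, fanout)
clock_period_ns = 1.0
print(max(arrival.values()) <= clock_period_ns)   # True: worst path is 0.9 ns
```

Note what the sketch does not do: it never evaluates a Boolean function. That is exactly the topological (rather than logical) flavor of analysis the lecture is building toward, and it is also why the arrival times can pinpoint which gates sit on the slowest paths.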
So, that's what we're about here. What do we need to do this?
Well, the first thing we need is gate delay models, right?
So I've got another picture of a cloud of logic here,
and I've got a bunch of AND gates, kind of in a row,
with wires connecting the AND gates.
I've got a little fan-out here: one AND gate feeds a couple of AND gates.
And at the top there's a big question mark that says: so, what's the network delay?
Well, in order to answer that sort of question, the most
straightforward thing we've first got to be able to answer is: what's the
delay of one gate? All right?
So, I've got one gate here that's sort of highlighted, and I'm just going to say, well,
let's say that the delay through that gate is a number, and that number is
delta. Right?
That delta is probably measured in picoseconds,
thousandths of a nanosecond, these days.
And I'd like to be able to answer the question:
how long does it take to get through one gate?
If you give me a million gates, maybe I can figure the network out,
if I can figure out how long it takes to get through one gate.
And you might think, okay, how hard can that be?
And gosh, the answer is really just surprisingly hard, surprisingly,
amazingly complex. So I'm going to give you just a sort of
high-level tour of what's going on here, without a lot of details,
just so that we can get to the interesting heart of the
problem. Okay.
So, what matters when we're talking about logic delay?
Well, the gate type affects the logic delay.
Not all gates are created equal. So, I've got a picture of an
AND gate here and I've got a picture of an OR gate, and the little picture says
that the AND gate's delay, delta, is not equal to the OR gate's
delay. That certainly makes some sense.
You know, different gates have different transistor-level electrical
contents. You expect maybe an inverter is pretty quick
as a gate; you expect maybe a great big exclusive-OR gate, with a lot of
transistor-level content, is kind of slow.
Maybe an AND-OR-Invert or an OR-AND-Invert gate is kind of slow.
Yes, correct. All right.
So, what kind of gate it is, that affects the delay.
You have a few thousand gates in your technology library,
and they've all got potentially different delays.
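At its very simplest, the "different gates, different delays" idea is just a lookup table keyed by cell type. This is a sketch only: real technology libraries are far richer than a single number per gate, and these cell names and picosecond values are made up for illustration.

```python
# A tiny sketch of a gate delay model: a technology library mapping
# each cell type to one delay number. Values are invented.

LIBRARY_DELAY_PS = {
    "INV":   20,   # an inverter is pretty quick...
    "AND2":  45,
    "OR2":   50,   # ...and note the AND and OR delays differ
    "XOR2":  90,   # a big exclusive-OR is kind of slow
    "AOI21": 80,   # AND-OR-Invert, also on the slow side
}

def gate_delay_ps(gate_type: str) -> int:
    """Look up the delay of one gate instance by its library cell type."""
    return LIBRARY_DELAY_PS[gate_type]

# With single-number delays, a chain of gates is just a sum of cell delays:
chain = ["AND2", "XOR2", "INV"]
print(sum(gate_delay_ps(g) for g in chain), "ps")  # 155 ps
```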