
So here we are in Lecture 4.2. We're going to continue working on

computational boolean algebra, and we're going to continue our discussion of

boolean satisfiability, or SAT. In the last lecture, we introduced

Conjunctive Normal Form, or CNF, as a representation scheme.

And now we're going to show how does one actually start with a set of clauses in a

CNF style and answer the question as to whether it's satisfiable or not.

And it turns out, we shouldn't be surprised.

It's another recursive sort of computation.

But more importantly, there's a very powerful and important iterative sort of

computation that says, more or less, let's just assume that a particular variable has

a value, and let's just see what happens. When you have a boolean equation with

many, many variables and many, many, many clauses, it turns out you can get most of

the work done with this simple procedure. And this is called the boolean constraint

propagation, or BCP. And what we're going to talk about next is

how BCP works. So in the last lecture, we said that

boolean satisfiability had two big parts. It had a decision part where we more or

less arbitrarily decide to set some values to variables.

And we go simplify the CNF form and we see what happens.

Can we satisfy it or are we conflicted or what happens?

The problem is that's only going to go so far.

And it turns out that, because some variables have values, you can deduce that

other variables must have values. And it turns out that in a really, really

big instance of a CNF for a really big boolean function, there is an enormous

amount of this deduction that is possible. In fact, this is where really all of the

deep work happens in boolean satisfiability.

And this stuff has a name. It's called boolean constraint

propagation. So, given a set of fixed variable

assignments, what else can you deduce about the necessary assignments to other

variables? Well, you can do that by propagating

constraints. Now, there is a very famous strategy for

this, called the Unit Clause Rule, and gosh, this is like 50 years old.

A clause is said to be unit if it has exactly one unassigned literal, right?

And so a unit clause has exactly one way that you can satisfy it, and so what that
means is that you have to assign the lone unassigned literal the polarity that makes the clause 1.
You don't have a choice; you don't get to arbitrarily pick the polarity of that
literal, and that forced choice is called an implication.
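As a concrete sketch of the unit clause rule, here is a small illustration in Python. The encoding is my own convention, not something from the lecture: a clause is a list of signed integers, +v for variable v and -v for its complement, and a partial assignment is a dict from variable to True/False.

```python
def clause_status(clause, assignment):
    """Classify a clause under a partial assignment.

    Returns one of:
      ("satisfied", None)  - some literal already evaluates to 1
      ("unit", lit)        - exactly one unassigned literal remains
      ("conflict", None)   - every literal evaluates to 0
      ("unresolved", None) - two or more literals still unassigned
    """
    unassigned = []
    for lit in clause:
        var = abs(lit)
        if var not in assignment:
            unassigned.append(lit)
        elif assignment[var] == (lit > 0):
            return ("satisfied", None)   # this literal is a 1
    if not unassigned:
        return ("conflict", None)
    if len(unassigned) == 1:
        # Unit clause: the lone literal is forced to be true.
        return ("unit", unassigned[0])
    return ("unresolved", None)

# The lecture's little phi = (a + c)(b + c)(not a + not b + not c),
# with a=1, b=2, c=3, after deciding a = 1 and b = 1:
phi = [[1, 3], [2, 3], [-1, -2, -3]]
assignment = {1: True, 2: True}
print([clause_status(cl, assignment) for cl in phi])
# -> [('satisfied', None), ('satisfied', None), ('unit', -3)]
```

The third clause comes back unit on the literal -3, which is exactly the implication from the lecture: c must be 0.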

So here is a little boolean function, phi = (a + c)(b + c)(not a + not b + not c),
and let us assume that we have decided to make these choices.

So a is a 1 and b is a 1. So what do we know?

In the first clause, a is 1 and so that clause is satisfied.

I can stop worrying about it. And in the second clause, b is 1.

That clause is satisfied, I can also stop worrying about it.

But in the third clause, the a term becomes a 0.

The b term becomes a 0. And oh look.

What has happened? This clause has become unit.

And it's unit, because there's exactly one unassigned literal, which in this case is

c bar. So what do I know, I know that c must be

0. It has to be a zero, because the only way

that I can satisfy this clause is if c is a 0.

And then that clause becomes a 1. If I really want to satisfy this boolean

function, because of the decisions I have made for a and b, I have deduced

that c, by implication, must be a 0. It turns out that in gigantic CNFs for

real industrial circuits, you get to deduce a zillion of these unit clauses.

You get to deduce a zillion variable values; it's really what drives the SAT
process forward in any real SAT engine. Let's go take a look at how that

might work. So I have got a little example here, and

this is an example from a paper, which is worth checking out, by my friends João
Marques-Silva and Karem Sakallah, done when they were both at the University

of Michigan. Marques is now in Dublin.

So, phi, my little boolean function, has 9
clauses, omega 1 through omega 9:
omega 1 = (not x1 + x2), omega 2 = (not x1 + x3 + x9),
omega 3 = (not x2 + not x3 + x4), omega 4 = (not x4 + x5 + x10),
omega 5 = (not x4 + x6 + x11), omega 6 = (not x5 + not x6),
omega 7 = (x1 + x7 + not x12), omega 8 = (x1 + x8),
omega 9 = (not x7 + not x8 + not x13).
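To make the example concrete, here is one way to write these nine clauses down in Python. The signed-integer encoding (+v for variable xv, -v for its complement, a common DIMACS-style convention) is my own choice for illustration, not something from the lecture:

```python
# phi's nine clauses in signed-integer form: +v is variable xv,
# -v is its complement.
omega = [
    [-1, 2],        # omega 1 = (not x1 + x2)
    [-1, 3, 9],     # omega 2 = (not x1 + x3 + x9)
    [-2, -3, 4],    # omega 3 = (not x2 + not x3 + x4)
    [-4, 5, 10],    # omega 4 = (not x4 + x5 + x10)
    [-4, 6, 11],    # omega 5 = (not x4 + x6 + x11)
    [-5, -6],       # omega 6 = (not x5 + not x6)
    [1, 7, -12],    # omega 7 = (x1 + x7 + not x12)
    [1, 8],         # omega 8 = (x1 + x8)
    [-7, -8, -13],  # omega 9 = (not x7 + not x8 + not x13)
]

def evaluate(cnf, assignment):
    """Evaluate a CNF under a complete assignment (dict var -> bool)."""
    return all(any(assignment[abs(l)] == (l > 0) for l in cl) for cl in cnf)

# Setting x1 = 0 satisfies omega 1 and omega 2 immediately; with x8 = 1
# and everything else 0, every clause checks out:
model = {v: False for v in range(1, 14)}
model[8] = True
print(evaluate(omega, model))   # True: phi is in fact satisfiable
```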

Gosh. It's not easy to tell if these things are

satisfiable, is it? And if it's hard for a function
that has 13 variables, imagine what it's like for a function with 13,000 variables.

This is why we need an algorithmic, a computational approach to asking and

solving these problems. So how do you start?

Let's start by just, somewhat arbitrarily, doing a partial assignment: x9, x10, x11,
x12, and x13 are assigned the values shown here; x9, x10, x11 are 0, and x12 and x13 are 1.

We're going to go look at the clauses in the function, and we're going to just

simplify them by inserting those values. And so here we go.

I've inserted values for 9, 10, 11, 12 and 13.

Okay, so what happens? And the answer is that not much happens.
I cannot satisfy any of these clauses, there are no conflicts in these clauses,
and, basically, there are no unit clauses here, which is sort of the
interesting part of this exercise. I can't really do anything, so pretty much
I can say, now what? And the answer is, I guess I'm going to have to go decide
arbitrarily to set some more variable values, because nothing interesting
happened here. I really
haven't been able to conclude anything. Alright, so what are we going to do next?

We're going to choose to assign a variable to a value and we're just going to choose

to assign x1 to be a 1. So, I'm going to go back to my boolean

equation here and I've, I've got all the previous assignments from the last round

greyed out and I'm going to assign these values.

And what happens is, in clause omega 1 and omega 2, the x1 term becomes a 0.

And in clause omega 7 and omega 8, the x1 term becomes a 1.

Oh, this is good, actually. This is helpful because, in these first
two clauses, and I'm going to circle this here, things
have become unit. In clause omega 1, x2 stands alone; it's the only variable
without a value. I know what it has to be: it's got to be 1.
And in omega 2, x3 is the only variable without a value, everything
else is 0, and so I know x3 has got to be a 1.

So, my unit clause rule tells me, hey, I have figured something out.
x2's got to be a 1 and x3's got to be a 1, or I can't make any further progress.
And similarly, I've got a couple of clauses, x1 in omega 7 and x1 in omega 8.
Hey, those two clauses are satisfied. I can stop worrying about them.

That's great. So hey, I feel like I actually made some

progress. Two of my clauses fell off my list of

things I have to worry about. Omega 7 and omega 8.

Two more of my clauses fell off my list because they were unit.

I know what happens to them: x2 and x3 are a
1. Let's go forward and see what happens.

So here we are. I have again grayed out all of the

assignments that we got up to this point, and I am about to actually set x2 to 1 and

x3 to 1. So, let's see what happens.

And this is what happens, you know: in clause omega 1 and clause omega 2, x2 and x3
respectively become 1 and, you know, each is satisfied, no surprise.
But in clause omega 3, the x2 and x3 terms now take the value 0, and x4 has been exposed as a
unit literal. This clause is unit; I don't have a
choice. I now know exactly what has to happen to

x4: x4 has to be a 1. This is just an example of the fact that
assigning a few values to variables means that a whole lot
of other variables must take values if we are to proceed toward a satisfying

solution. And in these gigantic industrial scale SAT

problems, the BCP just goes on and on; it's really the deep
algorithmic heart of these techniques.

So x4 is a 1; we have discovered that clause omega 3 is unit.

That's great. That tells us something.

We can go forward and we can simplify. So, let's see what happens.

So we're about to set x4 to be equal to 1 and again I have grayed out all of the

previous assignments. And so if I go in and I set x4 to 1, well,
clause omega 3 is satisfied, no surprise. Clause omega 4 is now unit, because the not x4 term is a
0; x5 is exposed as something that must have
a value. And in clause omega 5, the not x4 term is also a 0.
The literal x6 is exposed as a value that I now know.

So I have now discovered, that x5 and x6, we can deduce that they must, by

implication, have values. They have to be 1 and 1 respectively.

X5 has gotta be a 1, and x6 has gotta be a 1.

Great, I feel like I'm making progress, you know?

I continue to uncover clauses that are a unit.

I make the assignment that makes that clause satisfied and I uncover other

clauses that are a unit and it just keeps going.

It's like, wow, this thing is simplifying itself.

So let's go make that assignment. Omega 4 and omega 5 are now unit clauses.

We know what x5 and x6 have to be. Let's go do it.

So, moving forward, we are about to assign x5 and x6 to both values 1, respectively.

And let's see what happens. Well, in clause omega 4, x5 is a 1.

Omega 4 is satisfied. In clause omega 5, x6 is 1.

It's satisfied. And something bad happened.

Oh, look at what happened in clause omega 6.

Not x5 is a 0. Not x6 is a 0.

That's not happy. All right?

Clause omega 6 has now become unsatisfied. Right?

Clause omega 6 is a conflict: because of all of the implications made to get to this
point, where other clauses had to have certain variables take certain values in order
for them to be satisfied, clause omega 6 is necessarily conflicted.
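The whole chain we just walked through, x1 = 1 forcing x2 and x3, then x4, then x5 and x6, and finally the omega 6 conflict, can be reproduced with a small BCP loop. This is my own illustrative Python (clauses as signed integers, +v for xv, -v for its complement), not code from the lecture:

```python
def bcp(cnf, assignment):
    """Repeatedly apply the unit clause rule until nothing changes.

    cnf: list of clauses; a clause is a list of signed ints (+v / -v).
    assignment: dict var -> bool, mutated in place with implications.
    Returns "conflict" if some clause has every literal false, else "ok".
    """
    changed = True
    while changed:
        changed = False
        for clause in cnf:
            if any(assignment.get(abs(l)) == (l > 0)
                   for l in clause if abs(l) in assignment):
                continue                      # clause already satisfied
            free = [l for l in clause if abs(l) not in assignment]
            if not free:
                return "conflict"             # every literal is false
            if len(free) == 1:                # the unit clause rule fires
                assignment[abs(free[0])] = free[0] > 0
                changed = True
    return "ok"

omega = [[-1, 2], [-1, 3, 9], [-2, -3, 4], [-4, 5, 10], [-4, 6, 11],
         [-5, -6], [1, 7, -12], [1, 8], [-7, -8, -13]]
# The decisions from the lecture: x9 = x10 = x11 = 0, x12 = x13 = 1, then x1 = 1.
a = {9: False, 10: False, 11: False, 12: True, 13: True, 1: True}
result = bcp(omega, a)
print(result)                  # "conflict": the implication chain hits omega 6
print(a[2], a[3], a[4], a[5])  # the deduced values: all True (i.e. all 1)
```

No search at all happened here: one decision on x1 plus pure unit propagation deduced five variables and exposed the conflict.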

Oops. Sorry.

This is the way this works. This is why this sort of problem is

recursive. You go until either, something happy

happens like, you find an assignment that satisfies it or you go until something

unhappy happens like, oops, I got a conflict.

I guess I'm going to have to undo some decision very early on to make some

forward progress. So pretty much, you know, when you run a
SAT solver, when BCP in the heart of the SAT engine finishes a pass, there are
really three things that can happen. I could have found a SAT assignment of all
the clauses I was working on, so I could return a 1. I could have uncovered a conflict, so I
have to back up, undo some decision, and try again. Or neither of those happened, and I have
to go decide another variable and keep going.
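That decide-propagate-backtrack loop can be condensed into a few lines of Python. This is a toy, self-contained sketch of the recursion in my own code (clauses as signed integers, +v for variable xv, -v for its complement), not a competitive solver:

```python
def dpll(cnf, assignment):
    """Return a satisfying dict var -> bool, or None if unsatisfiable."""
    assignment = dict(assignment)
    # --- boolean constraint propagation: apply the unit clause rule ---
    while True:
        unit = None
        for clause in cnf:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue                    # clause already satisfied
            free = [l for l in clause if abs(l) not in assignment]
            if not free:
                return None                 # conflict: all literals false
            if len(free) == 1:
                unit = free[0]
                break
        if unit is None:
            break
        assignment[abs(unit)] = unit > 0    # an implication, not a decision
    # --- decision: pick any unassigned variable and try both values ---
    for clause in cnf:
        for l in clause:
            if abs(l) not in assignment:
                v = abs(l)
                for value in (True, False):
                    result = dpll(cnf, {**assignment, v: value})
                    if result is not None:
                        return result
                return None                 # both branches failed: backtrack
    return assignment                       # every clause is satisfied

omega = [[-1, 2], [-1, 3, 9], [-2, -3, 4], [-4, 5, 10], [-4, 6, 11],
         [-5, -6], [1, 7, -12], [1, 8], [-7, -8, -13]]
model = dpll(omega, {})
print(model is not None)    # True: the lecture's phi is satisfiable
```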


This has a really famous name. This is DPLL.

So this is Davis-Putnam-Logemann-Loveland. So Davis and Putnam published the basic

recursive framework for solving boolean satisfiability in 1960.

Wow, that was a long time ago. And then later Davis, Logemann and

Loveland found this smarter boolean constraint propagation, and they,

basically they proposed the unit-clause rule way back in 1962, so slightly over 50

years ago. And amazingly enough, it's 50 years
old, but it's still the heart and soul of every modern SAT engine.

And it's often called Davis-Putnam, DP, in honor of the first paper or, inaccurately,

Davis-Putnam-Logemann-Loveland. The four of them never actually did

publish all this stuff together. And the big ideas are: a complete,
systematic search of the variable assignments, with a recursive framework for
doing that; the CNF form as a useful representation for efficiency, because you can tell
immediately that the thing can't be satisfied when one of the clauses is a 0;
and BCP with unit clauses to make the search stop earlier, resolving more
assignments without recursing more. And so this is really, really famous

stuff. It's so famous, it's got its own Wikipedia

page, hey, hey. So that's definitely worth checking out
if you want to know a little bit about the history, and in fact a lot of
these folks, despite the fact that they published the stuff in the early 60's,
are still alive. They're getting up there, but
they're still around; you can even go look at their individual biographies on
Wikipedia, which is kind of interesting. Now, SAT has had huge, just huge, huge

progress over the past 20 years. So DPLL is where the stuff starts.

But SAT has been the subject of intense work.

And unfortunately, just given the rather compressed time constraints of
a MOOC, I don't really have the couple of weeks it
would take to talk about all the interesting stuff that goes on.

But there are really four things going on. There are efficient data structures for

the clauses so that you can search them fast.

All right. There are efficient variable-selection

heuristics so that you can decide to select a variable that has a likelihood of

giving you a lot of boolean constraint propagation action, you know, probably

going to create a lot of clauses that go unit and simplify.
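For a flavor of what such a heuristic can look like, here is a sketch in the spirit of the classic DLIS (Dynamic Largest Individual Sum) rule: pick the literal that occurs in the most not-yet-satisfied clauses, hoping its assignment satisfies many clauses at once and triggers lots of unit propagation. The code and signed-integer clause encoding are my own illustration, not the lecture's:

```python
from collections import Counter

def pick_branch_literal(cnf, assignment):
    """DLIS-style decision heuristic: return the literal appearing in
    the most clauses that are not yet satisfied, or None if every
    variable is already assigned or every clause is satisfied."""
    counts = Counter()
    for clause in cnf:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue                      # clause already satisfied
        for l in clause:
            if abs(l) not in assignment:
                counts[l] += 1
    if not counts:
        return None                       # nothing left to decide
    return counts.most_common(1)[0][0]

omega = [[-1, 2], [-1, 3, 9], [-2, -3, 4], [-4, 5, 10], [-4, 6, 11],
         [-5, -6], [1, 7, -12], [1, 8], [-7, -8, -13]]
print(pick_branch_literal(omega, {}))
# one of the literals tied at count 2: not x1, not x4, or x1
```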

And that's sort of like, wow, that's a cool idea, how do you do that?

Well, there's some really interesting heuristics that let you do that.

So you can find lots of implications. And there's some amazing, efficient BCP

mechanisms. And the reason is that SAT spends almost

all of its time inside BCP. Because you spend all of your time finding

the clauses that become a unit, finding the variable in them that are unit,

assigning them their implied values and then going back to the clause database and

figuring out what you have to assign it to next.

You need a very, very lightweight touch.

You need to make sure you don't look at a clause that can't possibly be unit.

And you need to do the least possible computational work on the clauses that are
unit to figure out what to do. There's some really cool data structures

that let you do that. And then there's actually some things

called learning mechanisms. They're called conflict learning.

You can, as you are proceeding down the recursive search, find patterns of
variables that can never lead to satisfiability.

You can learn them, and what learning them means is that you can actually create a
new clause that captures the structure of these variables, and you
can add it back to the database, so that you will never try to find that
pattern again. You can avoid the inefficiency of constantly cycling over
these subsets of variables that must avoid certain patterns of their 1's and 0's.
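As a tiny illustration of the idea, here is a drastic simplification of what real conflict analysis does (real solvers analyze an implication graph; this toy version just negates the decisions that were in force at the conflict). The code and the signed-integer clause convention are my own, not from the lecture:

```python
def learn_clause(decisions):
    """Turn the decisions in force at a conflict into a learned clause.

    decisions: dict var -> bool. The learned clause is the negation of
    the conjunction of those decisions, in signed-integer form
    (+v for variable v, -v for its complement), so any future search
    path repeating this exact combination fails immediately.
    """
    return [-v if value else v for v, value in decisions.items()]

# In the lecture's example, deciding x1 = 1 with x9 = x10 = x11 = 0 forced
# the omega 6 conflict through BCP (x12 and x13 never entered the chain),
# so a solver could learn the clause (not x1 + x9 + x10 + x11):
learned = learn_clause({1: True, 9: False, 10: False, 11: False})
print(learned)   # [-1, 9, 10, 11]
```

Added to the clause database, this clause becomes unit as soon as the search sets any three of those four variables the same bad way again, steering BCP away from the dead end.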

A very interesting set of algorithms. The result is that there are good SAT codes, and
they can do huge problems. What's a huge problem?

A huge problem has 50,000 variables, a huge problem has 25 million literals, a

huge problem has 50 million clauses. You might wonder why I said, hey, why can't I build
a BDD for each of the clauses to tell whether it's equal to 0 or equal to 1.
This is why: nobody wants to build 50 million BDDs.

All right. That's just not happening.

So the result of these very cool efficiency mechanisms over the last 20

years or so has been some amazing SAT codes.

You can actually just go out and find these things.

There are lots of them. There are many good SAT solvers available
online, open source. MiniSAT, from the two Niklases in Sweden, is a
very famous, tiny, minimalistic solver code.
Very easy to grab and go put on your personal machine if you have one.

We are actually going to use this one in our MOOC.

We're putting this out there on the Coursera servers for you to play with.

Chaff, from Sharad Malik and his students at Princeton University.

Grasp, from Marques-Silva and Sakallah at the University of Michigan.

If you go do a little Googling, you can actually go find all
of these things. And you know, it might be
interesting for you to do so. Boy, there's just lots and lots of

information on SAT out there on the web. I would encourage you to go look around.

There's, you know, a lot of interesting information you can read.

But this is going to be our discussion of, sort of, what happens inside of SAT.

What we're going to do next is ask the question, well, we have an engineering

view of the universe where VLSI people were trying to design, you know, big

blocks of gates. Now what do we do?

How do we take something that looks like a real logic design, and apply SAT to it to

do something useful for us? So, let's go talk about that next.