Turing's work about computability

really taught us about the difference between things that

we can compute and things that we

absolutely cannot compute with any computer we can imagine.

We didn't pay any attention at all to efficiencies,

to how difficult it is to compute things.

Today, we're going to address that part of the situation.

So, we're going to start by just addressing some reasonable questions

about things that we might want to compute the same way that we did with computability.

So, we talked last time about

the fundamental questions that were addressed by Turing's work,

like what is a general purpose computer?

Are there limits on what we can do with digital computers?

And we found that there were with the halting problem and several other examples.

Today what we're going to talk about is,

are there limits with what we can do in

the machines that we can actually build in this universe?

That's the focus of today's lecture.

This question was actually first raised by Gödel,

in a so-called "lost letter" to von Neumann.

And actually John Nash also asked this question

in another so-called "lost letter" to the NSA of

course both these letters were later found.

Then one of the first breakthroughs were due to Michael Rabin and Dana Scott,

who introduced the idea of nondeterminism.

That I will talk about in more detail.

But really the main question was posed in the 70s work by Dick Karp,

Steve Cook, and independently Leonid Levin.

And that's really the question that we're going to talk about.

And unlike in the last lecture,

this question is still unknown.

So, we don't know the answer to it or who's going to provide the answer.

This is a fascinating context for such an important question that's

still unresolved many decades after it was first proposed.

So, to get thinking about it let's start

with a well-known problem that's known to be difficult.

It's called the traveling salesperson problem or TSP.

And the idea is, you're given a set of N cities,

and you're given distances between each pair of cities,

and you given a threshold distance M. And so the problem is,

you want to know is there a tour that goes through all the cities,

going back to the beginning that's of length less than M?

This can be formulated in different ways,

but we're going to think about this particular one just now.

So clearly, one thing that we might do is to so-called exhaustive search.

Where we try all N factorial orderings of

the cities to look for a tour of length less than

M. And you can imagine that's not going to work for

a very large set of cities for N very large.

So that's the kind of problem that we're going to be addressing.

And so how difficult can that be?

This is just excerpts from a blog recently found on the internet.

And someone says, "Well,

if one took the 100 largest cities in the US and wanted to travel them all,

what's the distance of the shortest route?"

Of course if you can solve this one with a threshold of M you

can use binary search to find the shortest one.

So, it's an equivalent problem in many ways.

So, "I'm sure there's a simple answer.

Anyone want to help? A quick Google revealed nothing about this."

And then the next thing is, "Well,

I don't think there's any substitute for doing it manually.

Google the cities, pull out your map and get to work.

It shouldn't take longer than an hour.

Edit: I didn't realize this was a standardized problem."

Well it may be a standardized problem,

but there's another edit needed here.

"Well writing to a program to solve the problem would

take 5 or 10 minutes for an average programmer."

Well, maybe.

Maybe a little more than that,

but certainly all of you could write programs to solve this.

"However, the amount of time the program would need to run it is well a long, long time."

Well, "My Garmin could probably solve this for you."

Or my cell phone nowadays, my maps application.

Then there's an edit. "Well probably not.

Someone needs to write a distributed computing program to solve this in my opinion."

Those are the types of things they well it can't be that difficult, can it really?

Well let's do a little bit of analysis.

So, in order to think about this from the right perspective,

I want you to imagine a giant computing device.

We'll try to make this the most powerful computing device that we can imagine.

So we call it an UBERcomputer.

So, it's got a processor on every electron in the universe.

So, they are trying to get the biggest computer we can imagine.

And let's say that each processor,

there's one on every electron in the universe,

will say that each one of them has the power of the fastest supercomputer we know today.

And that each one of them is working the lifetime of the universe.

So you take your fastest supercomputer,

you put one on every electron in the universe

and you have it work for the lifetime of the universe.

So, and this is just conservative estimates and you can use other estimates if you want,

but let's say 10 to the 79th electrons in the universe.

Maybe a supercomputer does 10 to

the 13th instructions per second or make it 10 to the 20th whatever you want.

Say the age of the universe in second is 10 to the 17th,

so maybe that's an underestimate maybe 10 to the 30th.

But let's stick with these.

And then we say, okay,

so suppose we use this algorithm

of trying every arrangement of the cities and looking for the smallest.

Could our computer solve it using that algorithm?

And the answer is not even close.

100 factorial is more than 10 to the 157th power.

And if you have every electron in the universe with

a supercomputer running for the age of the universe,

then you only get to about 10 to the 109th.

So, not only are you not going to solve it using that method with the UBERcomputer,

but you'd need 10 to the 48th UBERcomputers.

So, this is a difficult problem,

you're not going to solve it with a computer using that method.

And it's really important lesson to have in

mind when you're thinking about the topic of this lecture.

The whole point is that exponential growth

when you have a number to your problem size is the amount of time that you're taking,

that the technological change is irrelevant.

The exponential growth is so huge as to

take anything that you might think about technology totally out of the question.

So, we have questions like,

are we going to be able to solve this problem?

Are there better computers?

And that's the kind of question that we want to address.

It has nothing to do with the technology that we're using really.

So, these are the kinds of questions that come up.

Which algorithms are going to be useful to solve practical problems?

So, we'll just use a very,

very simple model of computation to try to address this question just to get started.

The same way that we did with theoretical.

With the question of whether we can compute or not,

we try to take the most basic model that we can to get going.

And so, what we're going to think about is the idea of a polynomial time algorithm.

So, poly-time means that the running time is less

than a times N to the b for some constants a and b.

And so, we talked when addressing real applications problems

about the need to get solutions that are faster than N squared,

or N cubed, and so forth.

Those are all fine but for this purpose we're saying even if it's N to the hundredth,

we're interested because what we want to avoid is exponential time.

It's definitely not polynomial time where the running time is proportional to C to the N,

for any constant C bigger than one.

So, our separation is between polynomial time and exponential time.

And again, the specific computer is generally not relevant because when you

simulate one computer on another it generally uses only a polynomial factor.

That's called the extended Church-Turing thesis and we'll talk about that in a minute.

So, in the context of this lecture,

what we want to say is that an algorithm is efficient if it's

guaranteed to run in polynomial time no matter what the input is.

We're just going to use the word efficient and we're going to

look for efficient algorithm for

problems that efficient algorithms for solving them.

Now, outside this lecture we might

say guaranteed polynomial time or sometimes they just say polynomial time.

We want to make sure that it's polynomial time for all inputs,

but for shorthand we're just going to use the word efficient.

And understand that we're separating those kinds of

algorithms for ones that might be slow that might take exponential time.

And the basic question we have is we have a lot of

practical problems and we want to know if we can find efficient algorithms.

It seems like a reasonable question that we should be able to answer. That's our topic.

Turn around that and say which problems can we solve in practice

and those are the ones that we know efficient algorithms for those problems.

If we have a problem that we want to solve or we have an efficient algorithm,

fine and we're going to say for

the purposes of this lecture that those are the ones that we can work on,

implementing algorithms and solving and so forth.

On the other hand,

if we can show that there's no efficient algorithm to solve a problem,

we're going to call it intractable.

This is analogous to what we did for computability.

We could talk about the idea of something that can be solved with the Turing machine,

or we talk about something that we can't solve with

any Turing machine that's computability, not computability.

We want to have the same idea,

same kind of separation between problems that are

intractable and problems that we know efficient algorithms for.

I'd like an easy way to know whether a problem's intractable,

that's what we're going to talk about today.

So for example, sorting, that's not intractable.

We have algorithms that are guaranteed to take time proportional to some polynomial,

and it doesn't matter that we have a faster one like n log n,

as long as it's any polynomial at all we're fine for this lecture.

But for the TSP for example,

nobody knows the efficient algorithm,

but we also don't have a proof that there's no efficient algorithm.

That's the issue that we want to talk about.

So to get started,

we're going to talk about four basic fundamental problems.

So first one is probably familiar from secondary school, LSOLVE.

So we want to solve simultaneous linear equations.

So we have some unknown values and we have some equations that

those values have to satisfy simultaneously.

The variables are real numbers and so there's

an example of three equations and three variables.

And you learned in school how to develop a solution for that,

in this case the solution is this one,

x one equals a half,

x two equals a half.

So x one plus x two equals one,

3x one plus 15x two that's 18 and a half that's nine

and then the other one you can check adds up to a half.

So you maybe familiar with Gaussian elimination which can solve that problem.

There's similar one called LP or Linear Programming.

Again the variables are real numbers,

but now we're going to make the equations that they have to satisfy inequalities.

So here's four inequalities and actually six and three unknowns.

So we have a bunch of variables and we have inequalities and we

want them all to simultaneously satisfy those inequalities.

In this case the bottom one says that the solution should all be positive values.

They're all real numbers and then you can plug in these values and you

can check that that solution satisfies those inequalities.

That's called linear programming.

How do we get to that solution.

And then there's one called Integer Linear Programming,

which is the same problem except that the variables have to be either zero or one,

one of the two values,

that's Integer Linear Programming.

And there's another one called Satisfiability.

And now our variables are Boolean values either true or false,

and then we've got simultaneous Boolean sums that have to be satisfied simultaneously.

So that is not x one or x two is true and so forth.

And that's an example and then there's a solution.

The first one is satisfied because x two is

true and the second is satisfied because x zero is false,

so not x zero is true.

And the last one's satisfied because x one is true.

So again given the equations,

develop the solution that's the problem.

Now these problems are very similar and they're actually

all very important problem solving models with lots of practical applications.

You take a practical problem and formulate it very often in one of

these regimes and then the solution is going to tell

us what we need to do in some kind of practical situation.

Many, many applications for each one of these.

So it's a pretty reasonable question to ask is,

do we have efficient algorithms for solving these things?

They look so similar,

what's the story on these things.

And as soon as computation came around,

people started looking at using computers to solving these problems.

And it's very difficult to tell

which ones are intractable and which ones have efficient algorithms for solving them.

So LSOLVE solving simultaneous equations?

Yes, in fact we do.

Gaussian elimination itself doesn't quite work in

this model because the numbers are

real we have to represent them with bits in a computer,

but still with an appropriate modification,

Gaussian elimination can guarantee to solve

these LSOLVE problems simultaneously equations in polynomial time,

no matter what the input is.

Linear Program? The answer is also yes,

but that was decades after the problem was first formulated and was open for

a long time and people debated whether or not there might

exist fission algorithm for linear programming.

And then the ellipsoid algorithm was an amazing tour de force in the 1980s.

Was a proof that linear programming could be solved in polynomial-time.

And actually in the years since,

that's been developed and once the idea was known other polynomial time algorithms

were developed that are actually useful in

practical situations are important part of our computational infrastructure now.

But Integer Linear Programming,

nobody knows polynomial-time algorithm and nobody knows whether one exists.

In Satisfiability the same,

no polynomial-time algorithm known and nobody knows whether one exists.

These are quite reasonable, similar,

interesting, important practical questions and we don't know the answer.

That's the basic question,

can we find efficient algorithms for Integer Programming and Satisfiability?

Or can we prove that no such algorithms exists?

Those are the kind of questions that we're going to be talking about in this lecture.

So the idea is intractability.

Let's do the definitions and then talk about it some more.

So we said that an algorithm is efficient if

its running time is given to be polynomial for all inputs.

We're going to call a problem intractable if there's no efficient algorithm to solve it.

We can prove there's no way that you can solve it a guaranteed polynomial time algorithm,

we're going to call it intractable.

The converse of that is a problem is tractable if we have an efficient algorithm.

So of course we're interested in

knowing whether or not a problem is tractable or intractable.

And again Turing taught us something fundamental about computation

by identifying a problem and we

might want to solve and showing that it's not possible to solve it.

And then by knowing that,

then we could see other problems that were later reduced to

that problem and it helped us understand more about computation.

So it's a pretty reasonable question to say,

can we do something similar for intractability?

So with Turing we had to decidable and undecidable,

wow we have tractable versus intractable.

So the first question is, well,

there's a lot of problems not just Integer Linear Programming and

Satisfiability but many problems that

we don't know efficient algorithms for solving them.

So first thing that comes up is,

can we prove one of them to be intractable?

That's the critical question that we

face and that's what we're going to start laying the groundwork for next.

So this is a profound connection to the real world

that's analogous to the Church-Turing thesis.

It's called the Extended Church-Turing thesis.

And the idea of this is Church-Turing says

that any computing device that we built is going to have same power as a Turing machine.

What this one says is that the resources used by

all reasonable machines are going to be within a polynomial factor of one another.

Of course you can conjure up some machine that

must take a huge amount of time for simple task,

but the kind of reasonable computational models that people have developed,

the machines that have been designed,

always are within a polynomial factor.

Again. This is a thesis,

it's not a theorem, it's not subject to proof.

Somebody maybe, can come up with a computer that can't be

simulated in polynomial-time and on any other computer in it.

But the question is,

is this a new model of computation or is this something new about the universe.

So always, we use simulation to prove a polynomial bound.

So for example, we're going to look later at

a simple machine that is like your computer processor.

And we're going to run a Java program to simulate the operation of that machine,

or you could write a program called the

compiler that translates a Java program into that machine.

So that proves that those things are equivalent and the cost

of these computations is definitely bounded by a polynomial,

it's easy to show.

And that's been true for, again,

every reasonable machine model that people have been develop.

So this validates the focus on

the distinction between exponential in polynomial-time algorithms.

So polynomial-time on one machine,

going to be polynomial-time in all the machines that we can think of,

and that's a whole different thing than exponential time on a machine.

And it really enables us to rigorously study efficiency in

the same way we are able to rigorously study computability.

And that's where we're going to go in this lecture.