0:00

A greedy heuristic for the knapsack problem is pretty good, but let's work a

little harder and do quite a bit better. We're going to accomplish a quite

ambitious goal. The knapsack problem, being NP-complete,

is not we expect going to admit an exact polynomial time algorithm.

Let's shoot for the next best thing, arbitrarily close approximation.

That is, let's accept from a user error parameter epsilon, and, return the

solution whose value is guaranteed to be at least ibe minus epsilon times that of

an optimal knapsack solution. So, a client who wants to be guaranteed

that this heuristic outputs a solution with a 99% of optimal just has to set

epsilon to be 0.01. If this sounds a little too good to be

true, there is a catch. The running time is going to increase as

we take epsilon to be smaller and smaller.

But what this algorithm exports to the client is an accuracy running time

trade-off. That is, it exports via this parameter

epsilon a knob that the user can turn wherever they want.

If you want more accuracy and are willing to wait a little bit longer, you set

epsilon to be epsilon to be smaller. If you want something really fast and

you're okay with a bit of inaccuracy, you take espsilon to be bigger.

And, this is as good as it gets, for Np-complete problems.

Assuming P is not equal to NP, there's no way to solve it exactly in polynomial

time, the next best thing you can have, hope for, is arbitrarily close

approximation, in polynomial time. That is what we will get for the napsack

problem using a dynamic programming base heuristic.

Sadly, the knapsack problem is the exception that proves the rule. Most

NP-complete problems, it is believed you cannot get an arbitrarily close

approximation. One concrete example is the vertex cover

problem, we discussed exact, expnential time

algorithms for vertex cover in a seperate video and we've been talking about

polynominal time heuristics for vertex cover, and there are some good ones. You

can get within a factor two of an optimum vertex cover in polynomial time, but, it

is believed you cannot get arbitrarily close approximation of the optimal vertex

cover in polynomial time. In fact, if you came up with such an algorithm, that

would imply p equals NP. While there are some other problems out

there like knapsack that admit arbitrarily close approximation, there's

many more out there which are like vertex cover where, assuming p not equal to NP,

it is not possible to get arbitrarily close approximation in polynomial time.

But this is an algorithms class, we should stay positive and focus on what

you can do. So let's move on to the arbitrarily close

approximation for knapsack. So the high-level idea is both very cool

and also very natural. What we're going to do is we're going to

take the knapsack instance that we're given, so items that have values and

weights, and some kapsack capacity. And we're going to massage it a little

bit, not too much, just a little bit but we

want to put it squarely in one of our computationally tractable special cases,

and then, we're going to solve this transformed version of the original

input. Now, because we're not solving the

problem that we were given, we're not going to get the right answer.

But as long as we didn't transform it too much, we can hope that the optimal

solution to the transformed problem is a pretty good approximation of the original

problem that we were give. Well, what's a computationally tractable

special case of the knapsack problem? To date, we do know one, but we only know

one so far, which is if we assume that the sizes of the items in the knapsack

capactiy are integers, we can go back to our dynamic programing solution for the

knapsack problem which runs in time, n, the number of items time capital W, the

knapsack capacity. So if it were the case of the knapsack

capacity W wasn't too big, if for example, it was polynomial and the number

of items, n, then this dynamic programing algorithm would run in polynomial time.

For technical reasons, which I'll say a little bit more about later, this

computationally tractable special case is not well-suited for deployment in a

dynamic programming heuristic. Instead, we're going to want to develop a

second similar but different computationally tractable special case.

The second special case is when the sizes and the knapsack capacity can be

arbitrary, but where the item values are integers that are not too big. This

assumption can also be exploited by a dynamic programing algorithm. As we'll

show in a separate video, there is a different dynamic programming

algorithm for the knapsack problem, whose running time is N squared times the

largest item value, N squared times Vmax.

So from this second dynamic programming algorithm, we can extract a second

computationally tractable special case of the knapsack problem.

Namely if all of the item values are small integers, say a polynomial in the

number of items n, then this dynamic programming algorithm already gives us a

polynomial time algorithm for instances of that form.

Armed with the second computationally tractable special case, we can now return

to the high-level idea at the beginning of this slide and execute it.

Here's how we do it. Our knapsack algorithm is given some

knapsack instance, the item values need not be small integers.

Now we massage the instance a little bit. We transform it, so that it becomes an

instance that we know how to solve in polynomial time.

How do we do that? We force the item values to be small integers.

How do you take arbitrary numbers and make sure that they're small numbers?

Well, the first thing to try is drop the low order bits and that's essentially

what we're going to do. Our algorithm then, solves this

transformed instance, by construction, the instance has small item value so we

can solve it in polynomial time using our second dynamic programing algorithm.

Now, we're probably not going to get an optimal solution to the original

instance. After all, we solved the wrong instance.

Unless we transform the values before invoking our dynamic programming

algorithm. But, the hope, and this is just a hope,

this is definitely going to require a proof. The hope is that we didn't loose

too much information just by throwing out some lower order bits.

So, by virtue of computing a solution which is 100% optimal, for the almost

correct problem, maybe that solution is also almost optiomal for the fully

correct correct problem. So here is the algorithm in full detail.

It really only has two steps. Fist, we do the transformation.

We round the item values to the small integers,

then we invoke the second dynamic programming algorithm on the transformed

instance. Precisely, here is how we round each item

value v sub i. To begin, we decrease vi to the nearest

multiple of a parameter m. I'm going to leave this parameter m

unspecified at the moment and therefore this algorithm will be undetermined.

Once we start analyzing the algorithm, we'll see what the parameter m has to be

as a function of epsilon. So don't forget that epsilon is the

accuracy parameter supplied by our client.

If the client sets a given value of epsilon, what it means is that our

responsibility is to output a feasible solution with value at least one minus

epsilon times that of an optimal solution.

So the smaller the epsilon, the more accurate our algorithm needs to be.

When we round an item value vi down to the nearest multiple of m, we're

decreasing it by an amount somewhere between 0 and m.

If vi was already a multiple of m, we don't decrease it at all.

If vi was just below a multiple of m, then we wind up decreasing it by

essentially m. Therefore, the bigger the m, the more

information we're throwing out about our instance and the less we should expect

our accuracy. That is, the smaller value of epsilon the

more accuracy demanded by our client, the smaller we're going to have to take our

value of m. Now, once we've made everything a

multiple of m, we may as well just scale down,

we may as well divide everything by m, we're going to get integers.

The integers that we get after rounding down to the nearest multiple of m and

then dividing by m are going to be called the V hats.

An alternative succinct way to describe the V hats is V hat i is by definition,

Vi/m rounded down. And so these funny brackets in blue on

the right, that's called the floor operator,

that takes in a real number and returns the biggest integer that's less than or

equal to the input. Thus, what is the transformation? In

effect, we take the item values that were given, we divide them by this parameter m

and then we also round down just to make sure we're dealing with integers.

Step 2 of our heuristic is to just invoke the second dynamic programming algorithm

to solve exactly the transformed instance.

That is, we feed into that second dynamic programming algorithm, the instance with

N items, the sizes are exactly the same as in the knapsack instance we were given

originally. The knapsack capacity, capital W, is

exactly the same we were given originally, but we feed in the values, V

hat one through V hat n. That is, we solve it with respect to the

transformed values, not with respect to the original values,

the Vi's. Let's make two quick observations before

we conclude. So first of all, remember the dynamic programming algorithm, the

second one for knapsack, the running time is n squared times the largest item

value, and of course, here we're running the dynamic programming algorithm with

these transformed item values the V hats. So therefore, if the V hat are big, then

this running time is slow, if the V hats are small, then this

running time is going to be reasonable. Now, let's remember the definition of the

V hats. They're essentially the original item

values divided by this magical parameter m, which we haven't yet specified.

But qualitiatively, if m is big, the V hats are going to small and the running

time is going to be fast. On the other hand, if m is small, the V

hats are going to be big and this algorithm is going to be slow.

That's how m controls the running time of this dyanamic programming base heuristic.

Also, remember we argued how intutitively it seemed to be controlling the accuracy.

The more you jack up m, the more information you lose when you round the

Vs and transform them into the V hats, and so, the less accuracy you're going to

have. So bigger m means, on the one hand,

faster running time, but at the cost of worst accuracy.

So this parameter m, really a knob that we can tune to figure out whether we want

a faster algorithm or higher accuracy. The non-trivial statement which we need

to prove in the next video is that there's a sweet spot for m.

That you can simultaneously get a pretty good running time, poloynomial running

time and excellent accuracy. So one trivial observation, but that's

still really nice, is that this dynamic programming based heuristic is guaranteed

to produce a feasible solution. Whatever output of items we get in step

2, it's guaranteed to fit inside the knapsack.

The reason for that is we pass to the dynamic programming algorithm in step 2,

fully accurate sizes, the original Wi's, and similarly a fully accurate knapsack

capacity, capital W. Therefore, the feasible solutions for the

transform knapsack instance are exactly the same as they were in the original

knapsack instance. This explains why we needed to develop a

second dynamic programming algorithm for knapsack, rather than trying to use the

the first one we developed. Thankfully there is this second

computation retractable special case where the items have small sizes and that

turns out to be exactly what you want to make this two-step recipe work.

The analysis is coming right up.