0:00

Here is the analysis. Step 1 of our algorithm sorts the items

in non-increasing order of value per size.

Step 2 of our greedy heuristic picks the maximum prefix of items who fit into the

kapsack. Let's say that maximum prefix comprises

the first k items. The k plus first item is the first one

that does not fit in the knapsack given the previous selections.

So here is the cool part. Remember we added this step 3 to our

greedy algorithm so that it considered two different candidate solutions and

picked the better of them. The first candidate solution was just the

outcome of step two. So therefore, we can definitely say the

output of our three-step greedy algorithm is at least as good as the total value of

these first k items chosen in step 2. The second candidate considered in step

three of our greedy algorithm is the maximum value item all by itself.

So one such item all by itself that would be considered by the greedy algorithm was

this k plus first item, the item on which we got stuck in step 2.

So therefore, whatever the value of our output of our three-step greedy algorithm

is, it's definitely at least the value of item k plus one in isolation.

So let's add these two inequalities together.

On the left hand side, what do we get? We get double the value of our three-step

greedy algorithm. On the right hand side, we get the first

k items and also the item k plus one. That is, we get the total value of the

first k plus one items. So now, we're in a position to connect

this inequality to the fractional greedy solution.

So remember, what is the greedy fractional solution, well, it packs the

first k items, and then the item in which it gets stuck, namely, item k plus one,

it fills up the knapsack fully with a suitable fraction of item k plus one, and

the value that the algorithm gets is prorated according to the fraction that

it fits in. But what we have here on the right-hand

side in green is even better than the greedy fractional solution.

This is the first k plus one items, 100% of them.

The greedy fractional solution has the first k times, 100%, and some fraction of

item k plus one. So, the value of the fractional greedy

solution is even worse than the right-hand side here.

And the whole point of our thought experiment was to argue that the greedy

fractional solution is even better than optimal,

that has value at least as much as every feasible knapsack solution.

Divide both sides by two and you'll see that we proved the theorem.

The output of the three-step greedy algorithm is guaranteed to be at least

50% that of an optimal solution. That's cool.

We have a non-trivial worst-case performance guarantee for our simple and

blazingly fast three-step greedy heuristic for the knapsack problem.

But, maybe you were hoping to do a little bit better than within 50% of an optimal

solution. Maybe in the back of your mind, you were

really rooting for something like 90% or even better.

There's a few roads we can take that might lead to better performance

guarantees. The best case scenario is if we could

just sharpen our analysis of this greedy heuristic, that is, don't change the

algorithm, don't make any new assumptions, just have a better proof and

maybe the 50% one prove to some other number.

That would be the best case. The second thing we could do is if we

really love this algorithm, for example, it being so fast, we could try to

identify additional assumptions on instances that allow us to prove better

performance guarantees than this 50% guarantee which holds for all instances.

The third thing we're going to try to do is just come up with a better algorithm,

that in fact, has better performance guarantees, so what I want to show you on

this slide is that the first case, the best case scenario of just sharpening the

analysis of this greedy algorithm, is not feasible The analysis cannot be

sharpened. [INAUDIBLE] the 50% can be realized for

certain instances. Then, we'll tackle the other 2

approaches, and we will in fact get better performance guarantees, either

under extra assumptions about the instances, or alternatively, using a more

sophisticated algorithm. So here's an example that shows that

there really are knapsack instances on which this 3 step greedy algorithm might

obtain merely 50% of the value of an optimal solution.

In this example, we're going to set the knapsack capacity, capital W, to be 1000.

There's going to be three items. Item one will have a value of 502 and a

size of 501. The second and third items are going to

be identical, both are going to have value 500,

both are going to have size 500. So what is the greedy algorithm going to

do? Well, in step 1, it sorts the items according to the value for, per size

ratio, and you'll notice that the first item has a ratio slightly bigger than

one, so it's going to be considered first

before items two or three. In step 2 of the greedy algorithm, it

packs item one in the knapsack, that leaves only 499 units of residual

capacity, not enough room for either item two or item three.

So step 2 of the greedy algorithm just outputs the first item by itself.

Step three of the algorithm considers the max value item in isolation,

that again is item one. For this reason, the greedy algorithm

will output the solution which is just item one for a value of 502.

And I'm sure you've noticed that there's a better solution if you [INAUDIBLE] item

one and instead choose items two and three.

They both fit in a knapsack, they fill it fully,

for a combined value of 1000, which is essentially twice as big as the value of

the greedy solution. So what does this example teach us? Well,

it argues that if we want perfomance guarantees better than 50%, we have one

of two options. First, if we really want to keep

analyzing our three-step greedy algorithm, we're going to have to make

extra assumptions about the instances. In order to prove performance guarantee

is better than 50%. There are some instances out there where

the performance is bad as 50% of optimal. The second approach we can take is to

look at better algorithms which might have better guarantees on worst case

instances. So in the next slide, I'll show you a

refined analysis of the three-step greedy algorithm, conditions under which it

performs much better than 50%. In a separate sequence of videos, we'll

develop a dynamic programming base heuristic,

which is guaranteed to have much better performance, on all instances.

So there's a few different assumptions you can impose on knapsack instances that

will enable you to prove better performace gurantees for greedy

heuristics. Let me just show you one, that's both

simple and also it's a condition that's met in many knapsack instances that arise

in practice. So let's focus on knapsack instances

where no item is especially big compared to the knapsack capacity.

For concreteness, let's say that the size of every item is at most 10% of the

knapsack capacity, capital W. As you'll see, there's nothing special

about the number 10%. I'm just using that to keep the

discussion concrete. So I claim that without changing our

algorithm at all using exactly the same three-step greedy algorithm.

Actually, here, we can even get away with the original two-step greedy algorithm.

We can prove performance guarantee is much better than 50% as long as the

instance satisfies this property. Why is this assumption useful? Well, lets

think about step 2 of our greedy algorithm.

So we've sorted the items in non-increasing order, of value per size.

We're packing the items one at time. Now, at some point we get stuck, there's

some item k plus one where if we tried to put it in the knapsack, we would overflow

the capacity. Well, by assumption, this k plus one'th

item, its size is at most 10% of the original knapsack capacity.

So if sticking in item k plus one would cause it to overflow, that means there's

less than 10% of the knapsack capacity currently available.

That is, the knapsack is currently 90% full, or more with the items that we've

already packed in so far. So this is sounding pretty good.

Our greedy criterion ensures that whatever fraction of knapsack we wind up

using, we're using in the most valuable possible way and we just noticed that

this assumption on the instance, implies that we use almost all of the knapsack.

To make this intutition precise, we comapre the output of our greedy

algorith, even just step two, to our favorite hypothetical benchmark,

mainly, the greedy fractional solution. What's the difference between these two

solutions? Well, they're almost the same, the only difference is that the greedy

fractional solution gets to pack that last bit of the knapsack, which we know

is at least 10%, with a suitable fraction of item k plus one.

So, the value that we're missing in our solution is that final 10% at most of the

greedy fractional solution. Now in the greedy fractional solution,

the items are packed in decreasing order of bag per buck and a value per size.

So, this last 10% of the greedy fractional solution is also the least

important. It's making the least valuable use of capacity.

So this final at most 10% of the greedy fractional solution can account for, at

most, 10% of its overall value. So whatever we're missing in our solution

for our greedy algorithm, it's at most 10% of the overall value of the greedy

fractional solution. That is, we're getting at least 90% of

the value of the greedy fractional solution. And we know from our thought

experiment that the greedy fractional solution is even better than optimal, is

at least as good as any feasible knapsack solution, therefore, the output of our

greedy algorithm, even if we don't use the third step, is 90% as good as the

value of an optimal solution. All of this reasoning holds equally well

with a number different than 10%, so for example if you only knew that

every size was at most 20% of the knapsack, then you'd get that the greedy

solution would be at least 80% of optimal.

On the other hand if you knew that every item had size at the most 1% of the

knapsack capacity then just the two-step greedy algorithm would get you within

99%.