And by the way, just as an aside, when I talk about this with students,

some of them have told me it's pretty amazing

how I can tell the story both ways:

why we might want higher precision, or why we might want higher recall.

And the story actually does seem to work both ways.

But beyond the details of this particular algorithm, the more general principle is this:

depending on whether you want higher precision with lower recall,

or higher recall with lower precision,

you can end up predicting y=1 when h(x) is greater than some threshold.
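To make that concrete, here is a minimal Python sketch of the thresholding rule; h_x is a made-up list of predicted probabilities, not values from the lecture:

```python
# Sketch of the thresholding rule: predict y=1 when h(x) > threshold.
# h_x is a hypothetical array of predicted probabilities from some classifier.
h_x = [0.95, 0.40, 0.70, 0.05, 0.99]

def predict(h_x, threshold):
    # Raising the threshold favors precision; lowering it favors recall.
    return [1 if p > threshold else 0 for p in h_x]

print(predict(h_x, 0.5))  # moderate threshold -> [1, 0, 1, 0, 1]
print(predict(h_x, 0.9))  # conservative: predict 1 only when very confident -> [1, 0, 0, 0, 1]
```

Sweeping the threshold up or down is all it takes to move along the precision-recall trade-off.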

And so in general, for most classifiers there is going

to be a trade-off between precision and recall, and

as you vary the value of this threshold that we drew here,

you can actually plot out a curve that trades off precision and recall.

A value up here would correspond to a very high value of

the threshold, maybe a threshold of 0.99.

So that's saying, predict y=1 only if we're more than 99% confident,

that is, at least a 99% probability that y=1.

So that would give you high precision, but relatively low recall.

Whereas a point down here

would correspond to a much lower value of the threshold,

maybe 0.01, meaning, when in doubt at all, predict y=1. And if you do that,

you end up with a much lower precision, higher recall classifier.

And as you vary the threshold, if you want, you can actually trace out a curve for your

classifier to see the range of different values you can get for precision and recall.
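The threshold sweep described above can be sketched as follows; y_true and h_x here are made-up labels and probabilities for illustration only:

```python
# Sketch: trace out the precision-recall trade-off by sweeping the threshold.
# y_true and h_x are hypothetical labels and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
h_x    = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]

def precision_recall(y_true, h_x, threshold):
    y_pred = [1 if p > threshold else 0 for p in h_x]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    return precision, recall

# Higher thresholds push precision up and recall down:
for threshold in [0.2, 0.5, 0.8]:
    p, r = precision_recall(y_true, h_x, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
# threshold=0.2: precision=0.50, recall=0.80
# threshold=0.5: precision=0.60, recall=0.60
# threshold=0.8: precision=1.00, recall=0.20
```

Plotting the (recall, precision) pairs from such a sweep gives exactly the curve described here.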

And by the way, the precision-recall curve can take many different shapes.

Sometimes it will look like this, sometimes it will look like that;

the exact shape depends on the details of the classifier.

So, this raises another interesting question, which is:

is there a way to choose this threshold automatically?

Or, more generally, if we have a few different algorithms, or a few different

ideas for algorithms, how do we compare different precision-recall numbers?

Concretely, suppose we have three different learning algorithms.

Or actually, maybe these are not three different algorithms; maybe they are

the same algorithm, but just with different values for the threshold.

How do we decide which of these algorithms is best?

One of the things we talked about earlier is the importance of a single real number

evaluation metric.

And that is the idea of having a number that just tells you how well is your

classifier doing.

But by switching to the precision recall metric we've actually lost that.

We now have two real numbers.

And so we often end up facing situations like this:

if we're trying to compare Algorithm 1 and

Algorithm 2, we end up asking ourselves, is a precision of 0.5 and

a recall of 0.4 better or worse than a precision of 0.7 and a recall of 0.1?

And if every time you try out a new algorithm you have to sit around

and think, well, maybe 0.5/0.4 is better than 0.7/0.1, or maybe not, I don't know.

If you end up having to sit around making these decisions by hand,

that really slows down your process for deciding

what changes are useful to incorporate into your algorithm.
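As an illustration of what a single-number summary buys you, here is a sketch using one common way of combining the two numbers, the harmonic mean of precision and recall (the F score); the two precision/recall pairs are the hypothetical ones from the comparison above:

```python
# Sketch: one common single real-number summary is the harmonic mean
# of precision and recall (the F score). Applied here to the two
# hypothetical algorithms discussed in the text.
def f_score(precision, recall):
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

algo1 = f_score(0.5, 0.4)  # = 0.4 / 0.9, about 0.444
algo2 = f_score(0.7, 0.1)  # = 0.14 / 0.8 = 0.175
print(algo1 > algo2)  # True: Algorithm 1 scores higher on this summary
```

With a summary like this, comparing a new algorithm against an old one becomes a single automatic comparison rather than a judgment call each time.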