This is actually a key public health issue, so you've probably seen in the news that there have been questions about the value of mammograms in detecting disease, and about whether they detect life-threatening disease versus cases that aren't necessarily life threatening.

Similarly, you've probably heard about this for prostate cancer screening, and in both of these cases you have a fairly rare disease. Even though the screening mechanisms are relatively good, it's very hard to know whether the false positives make up a large fraction of the total number of positives that you're getting.

For continuous data, you don't have quite so simple a scenario, where there are only two possible outcomes and only two types of errors you can make.

The goal here is to see how close you are to the truth.

And so, one common way to do that, is with something called mean squared error.

And so the idea is, you have a prediction from your model or your machine learning algorithm for every single sample that you're trying to predict.

And you also maybe know the truth for those samples, say in a test set.

So what you do is, you calculate

the difference between the prediction and the truth.

And you square it, so the numbers are all positive.

And then you average those squared distances between the prediction and the truth.
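As a minimal sketch of this calculation, here is mean squared error in plain Python; the prediction and truth values below are hypothetical example data, not from the lecture.

```python
def mean_squared_error(predictions, truth):
    # For each sample, take the difference between the prediction and the
    # truth, square it so every term is positive, then average the squares.
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, truth)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical predictions and known truth (say, from a test set).
predictions = [2.5, 0.0, 2.0, 8.0]
truth = [3.0, -0.5, 2.0, 7.0]
print(mean_squared_error(predictions, truth))  # → 0.375
```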

The one thing that's a little bit difficult about

interpreting this number is that you squared this distance,

and so, it's a little bit hard to interpret

on the same scale as the predictions or the truth.

And so what people often do is they take the square root of that quantity.

So here, underneath the square root sign, is the same number; it's just the average squared distance between the prediction and the truth.

And then you take the square root of that number, and that gives you the root mean squared error.
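The square-root step can be sketched the same way; again, the values here are hypothetical examples.

```python
import math

def root_mean_squared_error(predictions, truth):
    # Same average of squared distances as the mean squared error...
    mse = sum((p - t) ** 2 for p, t in zip(predictions, truth)) / len(truth)
    # ...then taking the square root puts the error back on the same
    # scale as the predictions and the truth.
    return math.sqrt(mse)

print(root_mean_squared_error([2.5, 0.0, 2.0, 8.0], [3.0, -0.5, 2.0, 7.0]))
```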

And this is probably the most common error measure that's used for continuous data.

So for continuous data, people often use either the mean squared error or the root mean squared error.

But it often doesn't work when there are a lot of outliers.

Or the values of the variables can have very different scales.

Because, it will be sensitive to those outliers.

So, for example, if you have one really, really large value.

It might really raise the mean.

Instead, what people often use is the median absolute deviation.

So in that case, they take the median of the distances between the observed value and the predicted value, and they take the absolute value instead of squaring.

And so again, that makes all of the distances positive, but it's a little bit more robust to the size of those errors.
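To see that robustness in a small sketch: one wildly wrong prediction inflates the mean squared error but barely moves the median absolute deviation. The data below are made up purely for illustration.

```python
import statistics

def median_absolute_deviation(predictions, truth):
    # Median of the absolute distances between observed and predicted values;
    # the absolute value (instead of squaring) still makes every distance
    # positive, but the median ignores a few extreme errors.
    return statistics.median(abs(p - t) for p, t in zip(predictions, truth))

truth = [1.0, 2.0, 3.0, 4.0, 5.0]
predictions = [1.1, 2.1, 2.9, 4.2, 50.0]  # last prediction is a big outlier

mse = sum((p - t) ** 2 for p, t in zip(predictions, truth)) / len(truth)
mad = median_absolute_deviation(predictions, truth)

print(mse)  # hundreds: dominated by the single outlier
print(mad)  # about 0.1: barely affected by it
```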