Case Study: Predicting House Prices

A course from the University of Washington

Machine Learning: Regression

3575 ratings

From this lesson

Assessing Performance

Having learned about linear regression models and algorithms for estimating the parameters of such models, you are now ready to assess how well your chosen method should perform in predicting new data. You are also ready to select amongst possible models to choose the best-performing one.

This module is all about these important topics of model selection and assessment. You will examine both theoretical and practical aspects of such analyses. You will first explore the concept of measuring the "loss" of your predictions, and use this to define training, test, and generalization error. For these measures of error, you will analyze how they vary with model complexity and how they might be utilized to form a valid assessment of predictive performance. This leads directly to an important conversation about the bias-variance tradeoff, which is fundamental to machine learning. Finally, you will devise a method to first select amongst models and then assess the performance of the selected model.

The concepts described in this module are key to all machine learning problems, well beyond the regression setting addressed in this course.

- Emily Fox, Amazon Professor of Machine Learning, Statistics

- Carlos Guestrin, Amazon Professor of Machine Learning, Computer Science and Engineering

[MUSIC]

So, instead of using training error to assess our predictive performance,

what we'd really like to do is analyze something that's called generalization or

true error.

So, in particular, we really want an estimate of what the loss is

averaged over all houses that we might ever see in our neighborhood.

But really, in our dataset we only have a few examples of houses that were sold.

But there are lots of other houses that are in our neighborhood that we don't have

in our dataset, or other houses that you might imagine having been sold.

Okay, so to compute this estimate over all houses that we might see in our dataset,

we'd like to weight these house pairs, so the pair of house attributes and

the house's sale price,

by how likely that pair is to have occurred in our dataset.

So to do this we can think about defining a distribution,

in this case over the square feet of houses in our neighborhood.

So what this little cartoon is trying to show

is a distribution over the real line of square feet.

But you can think of it as, in a sense, just a really dense histogram,

counting how many houses we might see with a given square footage, for

every possible square footage value.

Okay and so what this picture is showing is a distribution that says

we're very unlikely to see houses with very small or

low number of square feet, very small houses.

And we're also very unlikely to see really, really massive houses.

So there's some bell curve to this, there's some sweet spot of kind of typical

houses in our neighborhood, and then the likelihood drops off from there.

Likewise what we can do is define a distribution that says for

a given square footage of a house,

what's the distribution over the sales price of that house?

So let's say the house has 2,640 square feet.

Maybe I expect the range of house prices to be somewhere

between $680,000 and maybe $950,000.

That might be a typical range.

But of course, you might see much lower-valued or higher-valued houses,

depending on the quality of that house.

And that's what this distribution here is representing.
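The two distributions just described (one over square feet, one over sale price given square feet) can be imitated in a small simulation. This is only a sketch under made-up assumptions: the lecture gives no parameters, so the Gaussian shapes, the names `SQFT_MEAN`, `SQFT_STD`, `PRICE_PER_SQFT`, `PRICE_STD`, and the helper `sample_houses` are all hypothetical and chosen purely for illustration.

```python
import numpy as np

# Hypothetical numbers for illustration only; none of them come from the lecture.
SQFT_MEAN, SQFT_STD = 2500.0, 600.0   # assumed bell curve over square feet
PRICE_PER_SQFT = 300.0                # assumed average price per square foot
PRICE_STD = 60_000.0                  # assumed spread of price at a fixed size


def sample_houses(n, rng):
    """Draw n (square feet, price) pairs: first x ~ p(x), then y ~ p(y | x)."""
    sqft = rng.normal(SQFT_MEAN, SQFT_STD, size=n)
    price = rng.normal(PRICE_PER_SQFT * sqft, PRICE_STD)
    return sqft, price


rng = np.random.default_rng(0)
sqft, price = sample_houses(100_000, rng)

# A dense histogram of the samples approximates the bell curve over square feet.
counts, edges = np.histogram(sqft, bins=50)

# Prices of simulated houses close to 2,640 square feet approximate the
# distribution over sale price for a house of that size.
near = np.abs(sqft - 2640) < 25
print("houses near 2,640 sq ft:", int(near.sum()))
print("middle 95% of their prices:", np.percentile(price[near], [2.5, 97.5]))
```

Sampling the size first and then the price given the size is one simple way to produce (square feet, price) pairs that are weighted by how likely they are, which is exactly the weighting the generalization error needs.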

Okay, so formally when we go to define our generalization error,

we're saying that we're taking the average value of our loss

weighted by how likely those pairs were in our dataset.

So specifically we estimate our model parameters on our training data set so

that's what gives us w hat.

That defines the model we're using for prediction, and

then we have our loss function, assessing the cost of predicting

f sub w hat at our square footage x when the true value was y.

And then what we're gonna do is we're gonna average over all possible (x, y) pairs,

but weighted by how likely they are according to those distributions over

square feet and value given square feet.
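Written as a formula, the definition just described might look like the following. The density notation p(x) for square feet and p(y | x) for sale price given square feet is an assumption introduced here; the lecture states the definition in words and points to a slide.

```latex
\text{generalization error}
  \;=\; \mathbb{E}_{x,y}\!\left[ L\big(y,\, f_{\hat{w}}(x)\big) \right]
  \;=\; \int\!\!\int L\big(y,\, f_{\hat{w}}(x)\big)\; p(y \mid x)\, p(x)\; dy\, dx
```

Here \hat{w} is estimated from the training data and then held fixed, L is the loss function, and the average runs over all possible (x, y) pairs rather than over the houses in the training set.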

Okay, so let's go back to these plots of looking at error versus model complexity.

But in this case let's quantify our generalization error

as a function of this complexity.

And to do this, what I'm showing by this crazy blue region here,

which has different gradations going from white to darker blue,

is the distribution of houses that I'm likely to see in my dataset.

So, this white region here (and

now we just made it not white, but hopefully you can still see it),

these are the houses that I'm very, very likely to see, and

then as I go further away from this I get to less likely

house sale prices given a specific square foot value.

And so what I'm gonna do when I think about generalization

error is I'm gonna take my fitted function, where, remember, this green

line was fit on the training data, which are these blue circles.

And then I'm gonna say, how well does it predict houses in this shaded blue region,

weighted by how likely they are, how close to that white region.

If you imagine in 3D, there are these distributions popping up

off of this shaded grey and shaded blue area.

Maybe I can try and draw it.

Maybe the distribution at a given square foot,

okay that doesn't look good at all, let me try and do it again.

Then it looks something like this for the houses with x sub t square feet.

And so when I think about how well my prediction is doing at x sub t, this x here,

I'm looking at the difference between this and all points along this line,

weighted by how likely they are in the general population of houses I might see.

And then I do that across this entire region of possible square feet.
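The procedure just described can be sketched in a simulated world where the distribution of houses is known because we invent it. Everything below is a hypothetical illustration: the Gaussian distributions, the constants, the helpers `true_mean_price` and `sample_houses`, squared error as the loss, and the single square footage `x_t` are all assumptions, not values from the lecture. The averages over many fresh draws are Monte Carlo approximations of the weighted averages described above.

```python
import numpy as np

# Hypothetical simulated world; none of these numbers come from the lecture.
SQFT_MEAN, SQFT_STD = 2500.0, 600.0
PRICE_STD = 60_000.0


def true_mean_price(sqft):
    """Assumed average sale price for a given square footage."""
    return 300.0 * sqft


def sample_houses(n, rng):
    """Draw n (square feet, price) pairs from the assumed distributions."""
    sqft = rng.normal(SQFT_MEAN, SQFT_STD, size=n)
    price = rng.normal(true_mean_price(sqft), PRICE_STD)
    return sqft, price


rng = np.random.default_rng(1)

# Estimate the parameters (w hat) of a line on a small training set.
x_train, y_train = sample_houses(20, rng)
slope, intercept = np.polyfit(x_train, y_train, deg=1)


def predict(sqft):
    return slope * sqft + intercept


# Expected squared-error loss at one square footage x_t: compare the single
# prediction with many prices drawn from the distribution of price given x_t.
x_t = 2640.0
y_at_xt = rng.normal(true_mean_price(x_t), PRICE_STD, size=100_000)
loss_at_xt = np.mean((y_at_xt - predict(x_t)) ** 2)

# Doing the same across the whole range of square feet: average the loss
# over fresh (x, y) pairs drawn according to how likely they are.
x_new, y_new = sample_houses(200_000, rng)
gen_error_estimate = np.mean((y_new - predict(x_new)) ** 2)

train_error = np.mean((y_train - predict(x_train)) ** 2)
print("training error:               ", train_error)
print("expected loss at x_t:         ", loss_at_xt)
print("approx. generalization error: ", gen_error_estimate)
```

The training error uses only the 20 houses the line was fit to, while the other two numbers average over houses the fitted line has never seen.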

Okay, so what I see here is this constant model, which

really doesn't approximate things well except maybe in this region here.

So overall it has a reasonably high generalization error and

I can go to my more complex model, just fitting a line through the data.

And I see I have better performance, but it's still not doing great in these regions.

So my generalization error dropped a bit. And when I get to this higher-complexity

quadratic fit, things are starting to look a bit better, maybe not great out in these

regions here. So again, the generalization error drops.

Then I get to this much higher order polynomial, and

when we were looking at training error, the training error was lower, right?

But now, when we think about generalization error, we actually see that

the generalization error is gonna go up relative to the simpler model,

because if we look at this region here, it's doing really horribly.

So, we might get a generalization error that's actually larger than the quadratic,

and then we can fit an even higher-order polynomial, and we get this really,

really crazy fit.

And it's doing horribly basically everywhere, except maybe at these very,

very small little regions where it's doing okay.

So in this case we get dramatically bad generalization error.

Okay, so this is starting to match a lot more of our intuition

behind what might be a good fit to this data.

So, let's think about just drawing the curve over

all possible models now that we've fit these few specific points.

So our generalization error

in general will have some shape where it's going down.

And then we get to a point where the error starts increasing.

Sorry, that should have been a smoother curve.

The error starts increasing because we're getting to these

overly complex models that fit the training data really well but

don't generalize to other houses that we might see.
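The shape of this curve can be reproduced in a simulation, but only because the simulation invents the true distribution of houses. The sketch below is hypothetical throughout: the units (thousands of square feet, thousands of dollars), the function `true_mean_price`, the noise level, the training set size, and the polynomial degrees are all assumptions made for illustration. The generalization error is approximated by averaging squared error over a large number of fresh simulated houses.

```python
import numpy as np
from numpy.polynomial import Polynomial

NOISE_STD = 60.0  # assumed price spread at a fixed size, in thousands of dollars


def true_mean_price(x):
    """Assumed average price (thousands of dollars) at x thousand square feet."""
    return 100.0 + 250.0 * x + 20.0 * (x - 2.5) ** 2


def sample_houses(n, rng):
    """Draw n (size, price) pairs from the assumed distributions."""
    x = rng.normal(2.5, 0.6, size=n)
    y = rng.normal(true_mean_price(x), NOISE_STD)
    return x, y


def mse(model, x, y):
    """Average squared-error loss of a fitted model on the pairs (x, y)."""
    return float(np.mean((y - model(x)) ** 2))


def error_curves(degrees, n_train=15, n_fresh=200_000, seed=0):
    """Training error and approximate generalization error for each degree."""
    rng = np.random.default_rng(seed)
    x_train, y_train = sample_houses(n_train, rng)
    x_fresh, y_fresh = sample_houses(n_fresh, rng)  # stand-in for "all houses"
    train_err, gen_err = [], []
    for d in degrees:
        # Least-squares polynomial fit; Polynomial.fit rescales x internally,
        # which keeps the higher-degree fits numerically stable.
        model = Polynomial.fit(x_train, y_train, deg=d)
        train_err.append(mse(model, x_train, y_train))
        gen_err.append(mse(model, x_fresh, y_fresh))
    return train_err, gen_err


degrees = [0, 1, 2, 10]  # constant, line, quadratic, high-order polynomial
train_err, gen_err = error_curves(degrees)
for d, tr, ge in zip(degrees, train_err, gen_err):
    print(f"degree {d:2d}: training error {tr:12.1f}   "
          f"approx. generalization error {ge:16.1f}")
```

In this toy setup the training error can only go down as the degree grows, since each model contains the simpler ones, while the approximate generalization error is free to go back up once the polynomial starts chasing the noise in the 15 training houses.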

But importantly,

in contrast to training error, we can't actually compute generalization error,

because everything was relative to this true distribution,

the true way in which the world works:

how likely houses are to appear in our dataset over all possible square feet and

all possible house values.

And of course, we don't know what that is.

So, this is our ideal picture or our cartoon of what would happen.

But we can't actually go along and compute these different points.

[MUSIC]