Case Study: Predicting House Prices

A course from the University of Washington

Machine Learning: Regression

From this lesson

Assessing Performance

Having learned about linear regression models and algorithms for estimating the parameters of such models, you are now ready to assess how well your considered method should perform in predicting new data. You are also ready to select amongst possible models to choose the best performing.

This module is all about these important topics of model selection and assessment. You will examine both theoretical and practical aspects of such analyses. You will first explore the concept of measuring the "loss" of your predictions, and use this to define training, test, and generalization error. For these measures of error, you will analyze how they vary with model complexity and how they might be utilized to form a valid assessment of predictive performance. This leads directly to an important conversation about the bias-variance tradeoff, which is fundamental to machine learning. Finally, you will devise a method to first select amongst models and then assess the performance of the selected model.

The concepts described in this module are key to all machine learning problems, well beyond the regression setting addressed in this course.

- Emily Fox, Amazon Professor of Machine Learning, Statistics
- Carlos Guestrin, Amazon Professor of Machine Learning, Computer Science and Engineering

[MUSIC]

And I want to talk about this notion of overfitting, because this is something that we've talked about before in the course. I want to formalize it, and we're going to discuss it a lot more in the remainder of this course.

Okay, so the notion of overfitting is this. Suppose you have some model, let's say a model here with parameters w hat, so this model has some complexity and some associated estimated parameters, w hat. Well, this model is overfit if there exists another model with estimated parameters, I'll just call them w prime, so let's just say some other point here, such that two conditions hold.

So, one, the training error of w hat is less than the training error of w prime. But on the other hand, two, the true error of w hat is greater than the true error of w prime.
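The two conditions can be written compactly as follows. This is only a restatement of the spoken definition; the layout and the use of $\hat{w}$ and $w'$ for "w hat" and "w prime" are notational assumptions, not something shown in the transcript.

```latex
\text{A model with estimated parameters } \hat{w} \text{ is overfit if there exists a model with estimated parameters } w' \text{ such that}
\begin{align*}
\text{(1)}\quad & \text{training error}(\hat{w}) < \text{training error}(w') \\
\text{(2)}\quad & \text{true error}(\hat{w}) > \text{true error}(w')
\end{align*}
```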

Okay, so this might not seem that intuitive, but let me go through it in terms of this picture here, which shows exactly what these points, one and two, are saying. There is a wide range of models that have true error larger than, for example, this w prime here. But the ones that are overfit are the ones that also have smaller training error. These are the ones that are really, really highly fit to the training data set but don't generalize well. Whereas the other points, on the other half of this space, are the ones that are not really well fit to the training data and also don't generalize well.

Okay, so this is formally our notion of what an overfitted model is.
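As a rough illustration of the two conditions, here is a minimal Python sketch. Everything specific in it is an assumption made for illustration and does not come from the lecture: the sine-shaped ground truth, the noise level, the polynomial degrees 3 and 12, the helper names `make_data`, `mse`, and `is_overfit`, and the use of a large fresh sample as a stand-in for the true error, which cannot be computed exactly in practice.

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_function(x):
    # Hypothetical ground truth, chosen only for this illustration.
    return np.sin(2 * np.pi * x)


def make_data(n, rng, noise=0.3):
    # Draw n noisy observations of the hypothetical ground truth on [0, 1].
    x = rng.uniform(0.0, 1.0, size=n)
    y = true_function(x) + rng.normal(0.0, noise, size=n)
    return x, y


def mse(model, x, y):
    # Mean squared error of a callable model on the data (x, y).
    return float(np.mean((model(x) - y) ** 2))


def is_overfit(w_hat, w_prime, train, proxy):
    # Condition 1: w_hat has smaller training error than w_prime.
    cond1 = mse(w_hat, *train) < mse(w_prime, *train)
    # Condition 2: w_hat has larger (approximate) true error than w_prime.
    cond2 = mse(w_hat, *proxy) > mse(w_prime, *proxy)
    return cond1 and cond2


rng = np.random.default_rng(0)
train = make_data(15, rng)        # small training set
proxy = make_data(100_000, rng)   # large fresh sample approximating true error

w_prime = Polynomial.fit(*train, deg=3)   # lower-complexity model
w_hat = Polynomial.fit(*train, deg=12)    # higher-complexity model

print("training error:", mse(w_hat, *train), "vs", mse(w_prime, *train))
print("approx. true error:", mse(w_hat, *proxy), "vs", mse(w_prime, *proxy))
print("w_hat overfit relative to w_prime:", is_overfit(w_hat, w_prime, train, proxy))
```

Note that the check is always relative to a particular comparison model: `is_overfit` only reports whether the chosen `w_prime` is a witness for the definition, so a `False` result does not rule out some other model satisfying both conditions.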

[MUSIC]