Case Study: Predicting House Prices

A course from the University of Washington

Machine Learning: Regression

From this lesson

Ridge Regression

You have examined how the performance of a model varies with increasing model complexity, and can describe the potential pitfall of complex models becoming overfit to the training data. In this module, you will explore a very simple, but extremely effective technique for automatically coping with this issue. This method is called "ridge regression". You start out with a complex model, but now fit the model in a manner that not only incorporates a measure of fit to the training data, but also a term that biases the solution away from overfitted functions. To this end, you will explore symptoms of overfitted functions and use this to define a quantitative measure to use in your revised optimization objective. You will derive both a closed-form and gradient descent algorithm for fitting the ridge regression objective; these forms are small modifications from the original algorithms you derived for multiple regression. To select the strength of the bias away from overfitting, you will explore a general-purpose method called "cross validation". You will implement both cross-validation and gradient descent to fit a ridge regression model and select the regularization constant.
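As a rough sketch of the "measure of fit plus a bias term" idea described above, the standard ridge regression objective can be written as follows. The symbols are assumptions introduced here for illustration: N observations, targets y_i, model predictions ŷ_i(w), coefficient vector w, and a tuning parameter λ ≥ 0 that sets the strength of the penalty.

```latex
\[
\text{cost}(\mathbf{w})
  = \underbrace{\sum_{i=1}^{N} \bigl(y_i - \hat{y}_i(\mathbf{w})\bigr)^2}_{\text{fit to the training data}}
  \;+\; \lambda \, \underbrace{\lVert \mathbf{w} \rVert_2^2}_{\text{penalty on large coefficients}}
\]
```

With λ = 0 this reduces to ordinary least squares; larger values of λ pull the coefficients toward zero.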

- Emily Fox, Amazon Professor of Machine Learning, Statistics

- Carlos Guestrin, Amazon Professor of Machine Learning, Computer Science and Engineering

[MUSIC]

So, in particular we can also face this issue of overfitting when we get lots and

lots of inputs.

That represents a very flexible model that can run into the same issues that we saw

in our demo for polynomial regression.

Or more generally, we can say just if we have lots of features.

So we'll say that capital D is very large.

And this could be different functions of our input.

But when you include lots and lots of these functions of our inputs,

in our regression model then again we're in this place where the model has

a lot of flexibility to explain the data and we're subject to becoming overfit.

But this issue of overfitting with respect to increasing model complexity

is really relative to how much data that we have.

So let's talk about overfitting as a function of the number of

observations that we have.

As well as a function of the number of inputs.

Or the complexity of the model.

So in particular if we have very few observations, so

N is small, then our models can rapidly become overfit to the data.

Because we have only a few points and as we're increasing in

our model complexity like the order of the polynomial,

it becomes very easy to hit all of our observations, but

in between where we have those observations, things can go very wild.

On the other hand, if we have lots and lots and lots of observations, even with

really, really complex models, we're not gonna as quickly become

overfit because we have dense observations across our input,

so the function is pinned down basically everywhere.

In this example as a function of square feet.

And it's not able to hit every observation,

it's not able to do these really crazy wiggly things.
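The contrast between few and many observations can be sketched with a small experiment. Everything in it is an assumption made for illustration, not something from the lecture: the "true" function is sin(3x), a fixed perturbation of ±0.5 that alternates in sign stands in for noise so the run is reproducible, the model is a degree-5 polynomial, and the helper names (`fit_polynomial`, `make_data`, and so on) are hypothetical. The same model is fit once to 6 observations and once to 200.

```python
import math

def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (X^T X) w = X^T y."""
    m = degree + 1
    # Augmented matrix [X^T X | X^T y] for monomial features x^0 .. x^degree.
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)]
         + [sum((x ** i) * y for x, y in zip(xs, ys))]
         for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, m):
            factor = A[r][col] / A[col][col]
            for c in range(col, m + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    w = [0.0] * m
    for i in reversed(range(m)):
        s = A[i][m] - sum(A[i][j] * w[j] for j in range(i + 1, m))
        w[i] = s / A[i][i]
    return w

def predict(w, x):
    return sum(c * x ** k for k, c in enumerate(w))

def rmse(w, xs, ys):
    return math.sqrt(sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

def true_f(x):
    # Assumed ground truth, chosen only for this illustration.
    return math.sin(3 * x)

def make_data(n):
    # n evenly spaced inputs on [-1, 1]; alternating +/-0.5 stands in for noise.
    xs = [-1 + 2 * i / (n - 1) for i in range(n)]
    ys = [true_f(x) + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]
    return xs, ys

DEGREE = 5
# Dense, noise-free grid used to measure how far the fit strays from the truth.
test_xs = [-1 + 2 * i / 400 for i in range(401)]
test_ys = [true_f(x) for x in test_xs]

results = {}
for n in (6, 200):
    xs, ys = make_data(n)
    w = fit_polynomial(xs, ys, DEGREE)
    results[n] = (rmse(w, xs, ys), rmse(w, test_xs, test_ys))
    print(f"N={n:3d}  train RMSE={results[n][0]:.4f}  error vs. true function={results[n][1]:.4f}")
```

With 6 observations the degree-5 polynomial passes through every training point, so its training error is essentially zero, yet it strays far from the true function in between. With 200 observations the same model cannot chase every point, and it stays close to the true function.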

Okay.

So, on the other hand when we have just one input

like number of square feet of a house in order to avoid overfitting,

we need to have observations that are very dense across number of square feet.

So we need to have lots of representative examples of square feet and

house value pairs.

So this is actually pretty hard to do, to have lots of

examples of houses of every possible square footage that you might see.

So this is already a hard problem, but

it becomes even harder when I increase the number of inputs in my model.

So, for example, just think of a model where I have square feet and

number of bathrooms.

And I want to cover all possible combinations of those two inputs

in order to provide representative examples and avoid overfitting.

Well that's really really hard.
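One way to see why covering all combinations gets harder is to count them. The sketch below assumes, purely for illustration, that each input is divided into a fixed number of bins; the bin count and the function names are hypothetical. The number of distinct combinations grows exponentially with the number of inputs, so a fixed number of observations can cover only a shrinking fraction of them.

```python
def cells_to_cover(bins_per_input, num_inputs):
    """Number of distinct input combinations when each input is split into bins."""
    return bins_per_input ** num_inputs

def max_coverage(num_observations, bins_per_input, num_inputs):
    """Upper bound on the fraction of combinations that can contain an observation."""
    return min(1.0, num_observations / cells_to_cover(bins_per_input, num_inputs))

# Hypothetical setup: 10 bins per input and 1,000 observed houses.
for d in (1, 2, 3, 5, 10):
    print(f"inputs={d:2d}  combinations={cells_to_cover(10, d):>11d}  "
          f"max coverage={max_coverage(1000, 10, d):.6f}")
```

The bound is optimistic, since real observations tend to cluster rather than spread evenly over the combinations.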

[MUSIC]
