Case Study: Predicting House Prices

A course from University of Washington

Machine Learning: Regression

From this lesson

Ridge Regression

You have examined how the performance of a model varies with increasing model complexity, and can describe the potential pitfall of complex models becoming overfit to the training data. In this module, you will explore a very simple, but extremely effective technique for automatically coping with this issue. This method is called "ridge regression". You start out with a complex model, but now fit the model in a manner that not only incorporates a measure of fit to the training data, but also a term that biases the solution away from overfitted functions. To this end, you will explore symptoms of overfitted functions and use this to define a quantitative measure to use in your revised optimization objective. You will derive both a closed-form and gradient descent algorithm for fitting the ridge regression objective; these forms are small modifications from the original algorithms you derived for multiple regression. To select the strength of the bias away from overfitting, you will explore a general-purpose method called "cross validation". You will implement both cross-validation and gradient descent to fit a ridge regression model and select the regularization constant.

- Emily Fox, Amazon Professor of Machine Learning, Statistics
- Carlos Guestrin, Amazon Professor of Machine Learning, Computer Science and Engineering

[MUSIC]

>> Okay, let's talk about this in the context of the bias-variance trade-off.

And what we saw is when we had very large lambda,

we had a solution with very high bias, but low variance.

And one way to see this is to think about what happens when we crank lambda

all the way up to infinity. In that limit, we get coefficients shrunk to zero,

and clearly that's a model with high bias but low variance.

In fact it has zero variance: it doesn't change no matter what data you give me.

On the other hand, when we had very small lambda,

we have a model that is low bias, but high variance.

And to see this, think about setting lambda to zero, in which case we get out just

our old solution, our old least squares or minimizing residual sum of squares fit.

And there we see that for

higher-complexity models, clearly you're gonna have low bias but high variance.

So what we see is this lambda tuning parameter controls our model

complexity and controls this bias-variance trade-off.
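The two limits described above can be checked numerically. What follows is a minimal sketch, not the course's own code: it assumes the standard closed-form ridge solution, solving (XᵀX + λI)w = Xᵀy, on a small synthetic dataset. The helper names `ridge_closed_form`, `sample_dataset`, and `coefficient_spread`, the data-generating model, and the choice to penalize every coefficient (there is no separate intercept) are all assumptions made for illustration.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Return the w minimizing ||y - X w||^2 + lam * ||w||^2."""
    n_features = X.shape[1]
    # Solve (X^T X + lam * I) w = X^T y rather than forming an explicit inverse.
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def sample_dataset(rng, n=30):
    """Draw a hypothetical training set from a fixed linear model plus noise."""
    X = rng.normal(size=(n, 5))
    true_w = np.array([2.0, -1.0, 0.5, 0.0, 3.0])
    y = X @ true_w + rng.normal(scale=0.5, size=n)
    return X, y

rng = np.random.default_rng(0)
X, y = sample_dataset(rng)

# lambda = 0 recovers the ordinary least squares fit.
w_least_squares = np.linalg.lstsq(X, y, rcond=None)[0]
w_lambda_zero = ridge_closed_form(X, y, 0.0)

# A huge lambda shrinks every coefficient toward zero.
w_lambda_huge = ridge_closed_form(X, y, 1e12)

def coefficient_spread(lam, n_datasets=200):
    """Refit on many fresh training sets; return the summed coefficient variance."""
    fits = np.array(
        [ridge_closed_form(*sample_dataset(rng), lam) for _ in range(n_datasets)]
    )
    return fits.var(axis=0).sum()

# Small lambda: the fit moves a lot from dataset to dataset (high variance).
spread_small_lambda = coefficient_spread(0.0)
# Large lambda: the fit barely moves, but it sits far from the true weights (high bias).
spread_large_lambda = coefficient_spread(1e6)
```

Comparing `spread_small_lambda` with `spread_large_lambda` gives a rough feel for the variance side of the trade-off, while the distance of `w_lambda_huge` from the true weights reflects the bias side.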

Okay, so let's return to our polynomial regression demo, but

now using ridge regression, and see if we can ameliorate the issues of

overfitting as we vary the choice of lambda.

And so we're going to explore this ridge regression solution for

a couple different choices of this lambda tuning parameter.
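As a rough stand-in for the kind of experiment described here, the sketch below fits a high-degree polynomial with ridge regression for several values of lambda. The noisy sine data, the degree of 16, the specific lambda values, and the helper names `polynomial_features` and `ridge_fit` are assumptions chosen for illustration; they are not taken from the lecture's demo.

```python
import numpy as np

def polynomial_features(x, degree):
    """Columns x^0, x^1, ..., x^degree for a 1-D input array."""
    return np.vander(x, degree + 1, increasing=True)

def ridge_fit(features, y, lam):
    """Closed-form ridge solution; for simplicity the intercept is penalized too."""
    n_cols = features.shape[1]
    return np.linalg.solve(
        features.T @ features + lam * np.eye(n_cols), features.T @ y
    )

# Hypothetical training data: a noisy sine curve on [0, 1].
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, size=30))
y = np.sin(4.0 * x) + rng.normal(scale=0.3, size=30)

degree = 16
features = polynomial_features(x, degree)

lambdas = [1e-9, 1e-3, 1.0, 100.0]
coef_norms = []
train_rss = []
for lam in lambdas:
    w = ridge_fit(features, y, lam)
    residuals = y - features @ w
    coef_norms.append(float(np.linalg.norm(w)))
    train_rss.append(float(residuals @ residuals))
    print(f"lambda={lam:g}  ||w||={coef_norms[-1]:.3g}  training RSS={train_rss[-1]:.3g}")
```

Under these assumptions, the coefficient norm should fall as lambda grows while the training residual sum of squares rises, which is the pattern the bias-variance discussion predicts. Very large coefficients at tiny lambda are one of the symptoms of overfitting that the penalty is meant to suppress.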

[MUSIC]