你是否好奇数据可以告诉你什么？你是否想在关于机器学习促进商业的核心方式上有深层次的理解？你是否想能同专家们讨论关于回归，分类，深度学习以及推荐系统的一切？在这门课上，你将会通过一系列实际案例学习来获取实践经历。在这门课结束的时候，

Loading...

来自 University of Washington 的课程

机器学习基础：案例研究

7232 个评分

你是否好奇数据可以告诉你什么？你是否想在关于机器学习促进商业的核心方式上有深层次的理解？你是否想能同专家们讨论关于回归，分类，深度学习以及推荐系统的一切？在这门课上，你将会通过一系列实际案例学习来获取实践经历。在这门课结束的时候，

从本节课中

Regression: Predicting House Prices

This week you will build your first intelligent application that makes predictions from data.<p>We will explore this idea within the context of our first case study, predicting house prices, where you will create models that predict a continuous value (price) from input features (square footage, number of bedrooms and bathrooms,...). <p>This is just one of the many places where regression can be applied.Other applications range from predicting health outcomes in medicine, stock prices in finance, and power usage in high-performance computing, to analyzing which regulators are important for gene expression.</p>You will also examine how to analyze the performance of your predictive model and implement regression in practice using an iPython notebook.

- Carlos GuestrinAmazon Professor of Machine Learning

Computer Science and Engineering - Emily FoxAmazon Professor of Machine Learning

Statistics

[MUSIC]

Okay, so the issue we're facing here with this crazy 13th order polynomial fit

is something called overfitting.

So in particular, what we've done is we've taken a model and really,

really, really honed in to our actual observations, but

it doesn't generalize well to thinking about new predictions.

And so the issues actually go beyond just making really crazy predictions.

And we're gonna discuss this in a lot more detail in the regression course.

But I wanna mention that this is a real problem with any machine learning model or

statistical model that you might consider.

So in these cases, we wanna fit a model to data, but

we don't want that model to be so specified exactly to the one data set that

we have that it doesn't generalize well to new observations we might get.

Okay, so let's go back to this 13th order degree polynomial fit.

And a question is, do we actually believe this?

Do we believe that this might be a reasonable fit to the data?

And I think as I alluded to before, probably not.

So although it minimizes the residual sum of squares,

it ends up leading to very bad predictions.

Because I'm sitting here and thinking, well, this quadratic fit that we had,

even though it didn't minimize my residual sum of squares as much as that 13th

order polynomial, it still, my gut feeling is, somehow this is a better model.

Okay, so a question is, what's going on here, and

how do we think about choosing the right model order or model complexity?

Well, what we want, is we want good predictions.

Of course, that's what we're aiming for.

But we can't actually observe the future.

Right, so we can't actually observe that prediction that we want to make and

say did we do a good job or not until we actually go ahead and do it.

So when we're thinking about choosing our model,

somehow we have to work just with the data that we have.

So how can we think about trying to choose a good model in this case?

Well, what we can do is we can think about simulating predictions.

So we're gonna take our data set that we have,

and we're gonna remove some of the houses.

So those are the grayed-out houses here.

These are gonna be removed temporarily.

And we're gonna fit our model on the remaining houses.

So all of these guys we're gonna use to fit our model using

exactly the kind of methods that we talked about before.

And then what we're gonna do is we're gonna predict.

So I'll go through an erase these x's now and put question marks.

And say from the model that I just learned on the circled houses,

what values do I predict for these question marks?

And then I can compare to the actual observed values,

because these houses are in my data set.

Okay, so I can use this as a proxy for doing the types of

real predictions that I wanna do on data that I haven't yet collected.

Of course, this type of method only is gonna work well if I have enough

observations to think about fitting on versus testing my predictions on.

Okay, so let's introduce a little bit of terminology.

Well, the houses that we use to fit our model, we call that the training set.

And the houses that we're using as a proxy for

our predictions, those that we're holding out, we call the test set.

Okay, so let's dig a little bit more into how we're gonna do this analysis.

And the first thing that we can do is look at something called the training error.

So we're gonna examine every house in our test data set.

So let's look at this red color here.

So all of our training houses are represented

with these blue circles here, and these are the only houses

we're gonna look at when we're thinking about defining our training error.

So in particular,

we're gonna look at what are the errors that we make on these houses?

So this is just the residual sum of squares on the houses in our training

data set, and that's called the training error.

So in particular, the training error looks exactly like what we had for

our residual sum of squares calculation, but

we're only including the houses that are present in our training data set.

Okay, so then for any given model, such as a linear fit through the data,

quadratic fit, or so on, what we can do is we can think about estimating our model

parameters as those that minimize the training error.

So that's equivalent to what we talked about before of minimizing the residual

sum of squares.

But again, here we're only looking at the houses in our training data set.

Okay, so then that's how we get our estimated w hat,

our estimated model parameters.

But then what we wanna do is we wanna take these estimated model parameters, and

we wanna say how good are we doing?

And remember what we said, what we're gonna do,

is we're gonna look at our held out observations, okay?

So here, these gray circles are the houses that are in our test set.

So these are houses that were not used to fit this model.

And we're gonna say how well are we predicting these actual house sales?

Okay, and so what were our predictions?

Well, remember when we thought about making a prediction,

we just used the value of the fit, so just the point on the line.

So to assess how well we're predicting these held-out observations,

our test data, we're gonna look at something that, again,

looks exactly like residual sum of squares.

But it's called our test error,

where we take these estimated model parameters w hat, and we sum

over our residual sum of squares over all houses that are in our test data set.

Okay, so that's our test error.

But what we can think about is we can think about how does test error and

training error vary as a function of the model complexity?

[MUSIC]