案例学习：预测房价

Loading...

来自 University of Washington 的课程

机器学习：回归

4074 个评分

案例学习：预测房价

从本节课中

Multiple Regression

The next step in moving beyond simple linear regression is to consider "multiple regression" where multiple features of the data are used to form predictions. <p> More specifically, in this module, you will learn how to build models of more complex relationship between a single variable (e.g., 'square feet') and the observed response (like 'house sales price'). This includes things like fitting a polynomial to your data, or capturing seasonal changes in the response value. You will also learn how to incorporate multiple input variables (e.g., 'square feet', '# bedrooms', '# bathrooms'). You will then be able to describe how all of these models can still be cast within the linear regression framework, but now using multiple "features". Within this multiple regression framework, you will fit models to data, interpret estimated coefficients, and form predictions. <p>Here, you will also implement a gradient descent algorithm for fitting a multiple regression model.

- Emily FoxAmazon Professor of Machine Learning

Statistics - Carlos GuestrinAmazon Professor of Machine Learning

Computer Science and Engineering

[MUSIC]

Well, up to this point in this course we've assumed that we just have

a single input, like square feet, and

we're trying to use that to predict the output, like the value of the house.

But often we have lots of different attributes, different inputs,

that we might wanna use for this task of predicting some output.

And we talked about this in the first course, where we said, even

if we have a more complicated function of the relationship between square feet and

house value, like this quadratic function instead of the simple line.

Well that still might not be a great predictive model, because you might go in

to the data set and say, oh, well there's this other house, that even though it has

very similar square footage to my house, it's just fundamentally a different house.

In this example, this house only has this pink house that I'm showing here,

only has one bathroom but my house has three bathrooms.

So my house of course should have a higher value than this house

with just one bathroom.

So what we're saying is that we need to add more input to our regression model.

So instead of just recording square feet and using that to predict the house value,

well I'm going to record other inputs as well.

For example, I'm going to record the number of bathrooms in the house and

I'm going to use both of these two inputs to predict my house price.

In particular, in this higher dimensional space, I'm going to fit

some function that models the relationship between number of square feet and

number of bathrooms and the output, the value of the house.

And so in particular one simple function that I can think about is just

modeling this function as w0 plus

w1 times the number of square feet plus w2 times the number of bathrooms.

So in that picture, we just talked about square feet and

number of bathrooms as the inputs that we're looking at for our regression model.

But of course, associated with any house, there's lots of different attributes and

lots of things that we can use as inputs to our regression model.

Well this notion of having multiple inputs in our regression model

is not just useful for housing data.

Like we mentioned at the beginning of this module, there are lots and

lots of applications that use this idea of multiple regression.

And I'm just gonna talk through one that I think is really cool, and

that's reading your mind.

What could be cooler than that?

So, you get some brain scan, maybe it's MEG or FMRI, but

the point is that there's some recording of your brain activity and

in this case it's going to be in response to seeing some word or some image or

something like this and what we get out is an image of your brain.

And this brain is divided into what are called voxels,

you can think of them as pixels but they're these little volumetric regions

where we have intensities associated with each one of those different voxels and

we're going to use these intensities as our multiple inputs.

So these are going to be the different features that are going into our

model to predict whether you felt very sad or

very happy, in response to whatever you were shown.

So we're going to have some data where people were shown an image and

they respond on the scale about whether they were very sad or very happy and

there is some scale in which they respond to this question.

And so what we have are these pairs of brain images and happiness responses.

So the happiness responses are output, the brain image is our input to this

regression model, and again, just to emphasize, the place where this multiple

regression comes in, is the fact that we don't have just one input.

We have a whole collection of inputs,

the whole set of different voxels in your brain.

The intensities associated with those are what we're using to try and

find the relationship between this brain image and the response of happiness.

Okay so to be clear, the way in which we read your mind is we can think about

taking your brain scan and predicting whether you feel happy or sad.

[MUSIC]