案例学习：预测房价

Loading...

来自 华盛顿大学 的课程

机器学习：回归

3449 评分

案例学习：预测房价

从本节课中

Simple Linear Regression

Our course starts from the most basic regression model: Just fitting a line to data. This simple model for forming predictions from a single, univariate feature of the data is appropriately called "simple linear regression".<p> In this module, we describe the high-level regression task and then specialize these concepts to the simple linear regression case. You will learn how to formulate a simple regression model and fit the model to data using both a closed-form solution as well as an iterative optimization algorithm called gradient descent. Based on this fitted function, you will interpret the estimated model parameters and form predictions. You will also analyze the sensitivity of your fit to outlying observations.<p> You will examine all of these concepts in the context of a case study of predicting house prices from the square feet of the house.

- Emily FoxAmazon Professor of Machine Learning

Statistics - Carlos GuestrinAmazon Professor of Machine Learning

Computer Science and Engineering

[MUSIC]

Okay, so that's one way to find the maximum or

minimum of a function depending on which scenario we're in.

It's just taking the derivative of the function, setting it equal to zero.

And that solution will be unique assuming we're either in this convex or

concave situation.

But there are other methods for finding the maximum or minimum.

So, if we're looking at these concave situations and

our interest is in finding the max over all w of g(w) one thing we can look at is

something called a hill-climbing algorithm.

Where it's going to be an integrative algorithm where we start

somewhere in this space of possible w's and

then we keep changing w hoping to get closer and closer to the optimum.

Okay, so, let's say that we start w here.

And a question is well should I increase w,

move w to the right or should I decrease w and

move w to the left to get closer to the optimal.

Well, what is the optimal?

The optimal is this point here.

So, if I had this picture in front of me,

I know that I should be moving to the right.

But mathematically what's a way to say that?

What's a way to say that where increasing that we know.

Let me write this down.

How do we know whether

to move w to the right or left

and equivalently meaning increase or

decrease the value of w.

Well, what I can do is I can look at the function at w and

I can take the derivative and if the derivative is positive like it is here,

this is the case where I want to increase w.

But what if on the other hand I had started the algorithm over here

at this value of w?

Would I want to move increasing w or move to the left decreasing w?

Well, in this case, I'd want to decrease w.

And again, if I look at the derivative, I see in this case,

the derivative is negative.

So, we can actually divide the space.

Where on the left of the optimal, we have that

the derivative of g with respect to w is greater than 0.

And these are the cases where we're gonna wanna increase w.

And on the right-hand side of the optimum we have

that the derivative of g with respect to w is negative.

And these are cases where we're gonna wanna decrease w.

And what about if I'm exactly at the optimum, which maybe I'll call w Star?

Do I want to move to the left or the right?

Well, neither.

I want to stay exactly where I am.

That's the maximum.

And what happens in this case?

Well, if I look at the derivative at the optimum,

we know that the derivative is 0.

So, again, the derivative is telling me what I wanna do.

I don't wanna change w at all.

So, this hill climbing algorithm, the way we can write it is, we say.

While our algorithm has not converged.

So, while not converged, if I can spell converged.

I'm gonna take my previous w, where I was at iteration t.

So this is an iteration counter.

And I'm gonna move in the direction

indicated by the derivative.

So, if the derivative of the function is positive,

I'm going to be increasing w, and if the derivative is negative,

I'm going to be decreasing w, and that's exactly what I want to be doing.

But instead of moving exactly the amount specified by the derivative at that point,

we can introduce something, I'll call it ada.

And ada is what's called a step size.

It says when I go, so let me just complete this statement here.

So, it's a little bit more interpretable.

So, when I go to compute my next w value,

I'm gonna take my previous w value and I'm going to move and

amount based on the derivative as determined by the step size.

Okay so let's look at a picture of how this might work.

So, let's say I happen to start on this left hand side at this w value here.

Compute the derivative, and I take a step.

Determined by that step size.

And at this point the derivative is pretty large.

This function's pretty steep.

So, I'm going to be taking a big step.

Then, I compute the derivative.

I'm still taking a fairly big step.

I keep stepping increasing.

What I mean by each of these is I keep increasing w.

Keep taking a step in w.

Going, computing the derivative and

as I get closer to the optimum The size of the derivative has decreased, and so

if I assume that eda is just some constant we'll get back to this in a couple slides.

But if I assume that there is just a fixed step size that I am taking.

Then, I'm gonna be decreasing how much I'm moving.

As I get to the optimal.

And if I end up on the other side of the optimum, then

the derivative there is negative, it's going to push me back towards the optimum.

So, eventually I'm gonna converge to the optimum itself.

And note that if I'm close to the optimum, I'm not gonna jump very far.

Again because when you're close to the optimum, that derivative Is really,

really small.

So this term here is gonna be really small and I'm not gonna be changing w very much.

[MUSIC]