0:25

Let's take the example of predicting the price of a house.

Suppose you have two features: the frontage of the house and the depth of the house.

So, here's the picture of the house we're trying to sell.

So, the frontage is defined as this distance: basically the width of the lot that you own. And the depth of the house is how deep your property is. So there's the frontage and there's the depth, and these two features are called frontage and depth.

You might build a linear regression model like this, where frontage is your first feature x1 and depth is your second feature x2. But when you're applying linear regression, you don't necessarily have to use just the features x1 and x2 that you're given.

What you can do is actually create new features by yourself.

So, if I want to predict the price of a house, what I might do instead is decide that what really determines the size of the house is the area of the land that I own. So, I might create a new feature. I'm just going to call this feature x, which is frontage times depth. It's frontage multiplied by depth, because that's the land area that I own. And I might then select my hypothesis as that, using just one feature, which is my land area. Because the area of a rectangle is, you know, the product of the lengths of its sides. So, depending on what insight you might have into a particular problem, rather than just taking the features x1 and x2 that we happened to have started off with, sometimes by defining new features you might actually get a better model.
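As a minimal sketch of this idea (the numbers below are made up for illustration), here is the area feature being constructed from frontage and depth, then fit with one-variable linear regression in closed form, in plain Python:

```python
# Hypothetical lot dimensions and prices (in $1000s) -- illustrative only.
frontage = [50.0, 60.0, 70.0, 80.0]
depth    = [100.0, 110.0, 120.0, 130.0]
price    = [250.0, 330.0, 420.0, 520.0]

# New feature x: the land area, frontage times depth.
area = [f * d for f, d in zip(frontage, depth)]

# Closed-form least squares for h(x) = theta0 + theta1 * x.
n = len(area)
mean_x = sum(area) / n
mean_y = sum(price) / n
theta1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(area, price))
          / sum((x - mean_x) ** 2 for x in area))
theta0 = mean_y - theta1 * mean_x

def predict(f, d):
    """Predict price from frontage and depth via the derived area feature."""
    return theta0 + theta1 * (f * d)
```

The model never sees frontage or depth separately; it sees only the single derived feature.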

Closely related to the

idea of choosing your features

is this idea called polynomial regression.

Let's say you have a housing price data set that looks like this.

Then there are a few different models you might fit to this. It doesn't look like a straight line fits this data very well. So maybe you want to fit a quadratic model instead, where you think the price is a quadratic function of the size, and maybe that'll give you, you know, a fit to the data that looks like that.

But then you may decide that your quadratic model doesn't make sense, because a quadratic function eventually comes back down, and, well, we don't think housing prices should go down when the size gets too large.

So then maybe we might choose a different polynomial model and instead use a cubic function, where we now have a third-order term. And if we fit that, maybe we get this sort of model, and maybe the green line is a somewhat better fit to the data, because it doesn't eventually come back down.

So how do we actually fit a model like this to our data?

Using the machinery of multivariate linear regression, we can do this with a pretty simple modification to our algorithm.

We know the form of the hypothesis looks like this, where we say h of x is theta 0 plus theta 1 x1 plus theta 2 x2 plus theta 3 x3.

And if we want to fit this cubic model that I have boxed in green, what we're saying is that to predict the price of a house, it's theta 0 plus theta 1 times the size of the house, plus theta 2 times the square of the size of the house, so this term is equal to that term, and then plus theta 3 times the cube of the size of the house, and that's the third term.

In order to map these two definitions to each other, the natural way to do that is to set the first feature x1 to be the size of the house, set the second feature x2 to be the square of the size of the house, and set the third feature x3 to be the cube of the size of the house.

And, just by choosing my

three features this way and

applying the machinery of linear

regression, I can fit this

model and end up with

a cubic fit to my data.
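Sketched in code (with synthetic prices generated from a known cubic, just so we can see the machinery recover it), the feature mapping plus ordinary least-squares linear regression looks like this:

```python
import numpy as np

# Hypothetical house sizes; prices generated from a known cubic for illustration.
size = np.array([10.0, 15.0, 20.0, 25.0, 30.0])
price = 2.0 + 0.5 * size + 0.1 * size**2 + 0.01 * size**3

# Design matrix: a column of ones for theta0, then the three chosen
# features x1 = size, x2 = size^2, x3 = size^3.
X = np.column_stack([np.ones_like(size), size, size**2, size**3])

# Solve for theta = (theta0, theta1, theta2, theta3) by least squares --
# this is just linear regression applied to the new features.
theta, *_ = np.linalg.lstsq(X, price, rcond=None)
```

Nothing about the algorithm changed; only the inputs did, which is the whole point.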

I just want to point out one

more thing, which is that

if you choose your features

like this, then feature scaling

becomes increasingly important.

So if the size of the house ranges from one to a thousand square feet, say, then the size squared of the house will range from one to one million, the square of a thousand. And your third feature x3, which is the size cubed of the house, will range from one to ten to the ninth. So these three features take on very different ranges of values, and it's important to apply feature scaling, if you're using gradient descent, to get them into comparable ranges of values.
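A quick sketch of those ranges and of mean normalization (one common way to do feature scaling; the grid of sizes here is arbitrary):

```python
import numpy as np

# Sizes from 1 to 1000 square feet, as in the example above.
size = np.linspace(1.0, 1000.0, 100)

# The three polynomial features span roughly 10^3, 10^6, and 10^9.
X = np.column_stack([size, size**2, size**3])

# Mean-normalize each column so gradient descent sees comparable scales:
# subtract the per-feature mean, divide by the per-feature standard deviation.
mu = X.mean(axis=0)
sigma = X.std(axis=0)
X_scaled = (X - mu) / sigma  # each column now has mean 0 and std 1
```

After scaling, all three features live on the same footing regardless of whether they started near a thousand or near a billion.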

Finally, here's one last example

of how you really have

broad choices in the features you use.

Earlier we talked about how a quadratic model like this might not be ideal because, you know, maybe a quadratic model fits the data okay, but the quadratic function eventually goes back down, and we really don't want it to predict housing prices that go down as the size of the house increases.

But rather than going to a cubic model there, you maybe have other choices of features, and there are many possible choices.

But just to give you one more example, another reasonable choice might be to say that the price of a house is theta 0 plus theta 1 times the size, plus theta 2 times the square root of the size, right?

So the square root function is this sort of function, and maybe there will be some values of theta 0, theta 1, and theta 2 that will let you take this model and get a curve that looks like that: one that, you know, goes up, but sort of flattens out a bit and doesn't ever come back down.

And so, by having insight into, in this case, the shape of the square root function and the shape of the data, and by choosing different features, you can sometimes get better models.
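As a sketch of this last model (again with synthetic prices generated from known parameters, purely for illustration), using size and the square root of size as the two features:

```python
import numpy as np

# Hypothetical sizes (chosen as perfect squares so sqrt is exact);
# prices generated from a known model that rises and flattens.
size = np.array([100.0, 400.0, 900.0, 1600.0, 2500.0])
price = 50.0 + 0.02 * size + 3.0 * np.sqrt(size)

# Features: 1 (for theta0), x1 = size, x2 = sqrt(size).
X = np.column_stack([np.ones_like(size), size, np.sqrt(size)])
theta, *_ = np.linalg.lstsq(X, price, rcond=None)

# Predictions from the fitted model.
preds = X @ theta
```

Because both size and sqrt(size) are increasing, the fitted curve keeps rising as the house gets bigger; it never turns back down the way a quadratic does.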

In this video, we talked about polynomial regression.

That is, how to fit a

polynomial, like a quadratic function,

or a cubic function, to your data.

We also threw out this idea that you have a choice in what features to use, so that instead of using the frontage and the depth of the house, maybe you can multiply them together to get a feature that captures the land area of a house.

In case this seems a little bit bewildering, with all these different feature choices, how do you decide what features to use?

Later in this class, we'll talk about some algorithms for automatically choosing what features to use, so you can have an algorithm look at the data and automatically choose for you whether you want to fit a quadratic function, or a cubic function, or something else.

But until we get to those algorithms, for now I just want you to be aware that you have a choice in what features to use, and that by designing different features you can fit more complex functions to your data than just a straight line. In particular, you can fit polynomial functions as well, and sometimes, with appropriate insight into the features, you can get a much better model for your data.