And so that's gradient descent, and we've just run it and

gotten a good fit to my data set of housing prices.

And you can now use it to predict, you know,

if your friend has a house size 1250 square feet,

you can now read off the predicted value and tell them that maybe they could

get, I don't know, $250,000 for their house.

Finally, just to give this another name, it turns out that the algorithm

that we just went over is sometimes called batch gradient descent.

And it turns out in machine learning, I don't know, I feel like us machine learning

people were not always great at giving names to algorithms.

But the term batch gradient descent refers to the fact that

in every step of gradient descent, we're looking at all of the training examples.

So in gradient descent, when computing the derivatives,

we're computing a sum over all of the training examples.

So every step of gradient descent we end up computing something like this that sums

over our m training examples, and so the term batch gradient descent refers to

the fact that we're looking at the entire batch of training examples.
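To make that concrete, here is a minimal sketch of batch gradient descent for linear regression with one feature. The parameter names w and b and the learning rate alpha are illustrative choices, not anything fixed by the lecture; the key point is that each derivative sums over all m training examples before the parameters are updated.

```python
def batch_gradient_descent(x, y, alpha=0.01, num_iters=1000):
    """Fit y ≈ w*x + b by batch gradient descent (illustrative sketch)."""
    m = len(x)
    w, b = 0.0, 0.0
    for _ in range(num_iters):
        # Every step looks at the entire batch of m training examples:
        # the derivatives are averages of the per-example error terms.
        dj_dw = sum((w * x[i] + b - y[i]) * x[i] for i in range(m)) / m
        dj_db = sum((w * x[i] + b - y[i]) for i in range(m)) / m
        # Update both parameters simultaneously.
        w -= alpha * dj_dw
        b -= alpha * dj_db
    return w, b
```

On a small data set that lies exactly on a line, this converges to that line's slope and intercept given a suitable learning rate and enough iterations.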

And again, it's really not a great name, but

this is what machine learning people call it.

And it turns out that there are other versions of gradient descent that

are not batch versions, but that instead

do not look at the entire training set and

look at small subsets of the training set at a time.

And we'll talk about those versions later in this course as well.
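As a rough preview, and purely as a sketch of the idea rather than of any algorithm covered so far, one such update step might look at only a small random subset of the examples. The function name, the batch_size parameter, and the sampling scheme here are all my assumptions for illustration.

```python
import random

def minibatch_step(x, y, w, b, alpha=0.01, batch_size=2):
    """One gradient step using a small random subset of the training set
    (a sketch of the idea, not the course's exact algorithm)."""
    idx = random.sample(range(len(x)), batch_size)
    # Derivatives are averaged over only batch_size examples, not all m.
    dj_dw = sum((w * x[i] + b - y[i]) * x[i] for i in idx) / batch_size
    dj_db = sum((w * x[i] + b - y[i]) for i in idx) / batch_size
    return w - alpha * dj_dw, b - alpha * dj_db
```

Repeating such steps still moves the parameters toward a good fit, but each step is cheaper because it touches only a few examples.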

But for now, using the algorithm we just learned about, that is, batch gradient

descent, you now know how to implement gradient descent for linear regression.