Okay, so that was just a little detour into contour plots,
so that I can talk about the gradient descent algorithm, which is the analogous
algorithm to what I called the hill descent algorithm in 1D.
But, in place of the derivative of the function,
we've now specified the gradient of the function.
And other than that, everything looks exactly the same.
So what we're doing is,
we now have a vector of parameters, and we're updating them all at once.
We're taking our previous vector and
we're updating it
with some eta times our gradient, which is also a vector.
So, it's just the vector analog of the hill descent algorithm.
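
To make that update concrete, here is a minimal sketch in Python with NumPy. It assumes the step is subtracted (we are descending toward a minimum), that eta is a fixed step size, and that we stop once the gradient is close to zero. The names gradient_descent, example_grad, eta, tol and max_iter, and the bowl-shaped example function, are hypothetical and chosen only for illustration.

```python
import numpy as np

def gradient_descent(grad, w0, eta=0.1, tol=1e-6, max_iter=10_000):
    """Repeat w <- w - eta * grad(w) until the gradient is nearly zero."""
    w = np.asarray(w0, dtype=float)
    for _ in range(max_iter):
        g = grad(w)                   # gradient: a vector, same shape as w
        if np.linalg.norm(g) < tol:   # assumed stopping rule: small gradient
            break
        w = w - eta * g               # update every parameter at once
    return w

def example_grad(w):
    # Gradient of the assumed example g(w) = (w[0] - 1)^2 + 2 * (w[1] + 3)^2
    return np.array([2.0 * (w[0] - 1.0), 4.0 * (w[1] + 3.0)])

w_hat = gradient_descent(example_grad, w0=[0.0, 0.0])
print(w_hat)
```

With a 1D input, grad returns a single number and this reduces to the hill descent update; the only change in the vector case is that w and the gradient are arrays.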
But I wanna show this a little bit in pictures here.