So far in this module I've introduced you to the concept of Power Series, and
then shown you how we can use them to approximate functions.
We've also seen how Power Series can then be used to give you some indication
of the size of the error that results from using these approximations,
which is very useful when applying numerical methods.
The last idea that we're going to very briefly look at on this topic
is upgrading the Power Series from the one-dimensional version that we've seen so
far to its more general multivariate form.
Just to recap the notational options from the previous video:
We saw that we can re-express the Taylor series from a form that emphasises
building the approximation of a function at a point p to a totally equivalent form
that emphasises using that function to evaluate other points
that are a small distance delta x away.
Just to make sure you can see that they're the same, we can write out the first four
terms, one above the other, to make the comparison clear.
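For reference, the first four terms of each form, written out, are:

$$f(x) = f(p) + f'(p)\,(x - p) + \frac{f''(p)}{2!}(x - p)^2 + \frac{f'''(p)}{3!}(x - p)^3 + \dots$$

$$f(x + \Delta x) = f(x) + f'(x)\,\Delta x + \frac{f''(x)}{2!}\Delta x^2 + \frac{f'''(x)}{3!}\Delta x^3 + \dots$$

Relabelling the expansion point p as x, and the evaluation point x as x + Δx, turns the first form into the second.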
In this video, we are going to continue using the delta x notation
as it's more compact, which will come in handy when we're in higher dimensions.
We won't be working through the multivariate Taylor series derivation
in any detail, as all I really want you to take away from this video is what
a multidimensional Power Series will look like, both as an equation and in a graph.
So keeping in mind our one-dimensional expression, let's start by looking at
the two-dimensional case, where f is now a function of the two variables x and y.
So our truncated Taylor series expressions will enable us to approximate the function
at some nearby point x plus delta x, y plus delta y.
Let's look at the simple two-dimensional Gaussian function, which many of you may
be familiar with from studying the normal distribution in statistics.
It's a nice, well-behaved function with a simple maximum at the point (0,0).
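If you'd like to play with this surface yourself, here is a minimal sketch in Python, assuming the simple unnormalised form f(x, y) = e^(-(x² + y²)) rather than the full normal distribution density:

```python
import numpy as np

def f(x, y):
    """Unnormalised 2D Gaussian with a single maximum at (0, 0)."""
    return np.exp(-(x**2 + y**2))

# Sample the bell-shaped surface on a grid (e.g. for a surface plot).
xs, ys = np.meshgrid(np.linspace(-2, 2, 101), np.linspace(-2, 2, 101))
zs = f(xs, ys)
print(zs.max())  # 1.0 -- the peak height, attained at the grid point (0, 0)
```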
Now in a one-dimensional analysis, our Power Series will give us an expression for
a one-dimensional function.
However, in the two-dimensional case our Power Series would of course now give us
a two-dimensional function, which we would more intuitively refer to as a surface.
Just as with the one-dimensional case, our zeroth order approximation
is simply the value of the function at that point applied everywhere,
which means in 2D, this would just be a flat surface.
So a zeroth order approximation at the peak would look like this.
And a zeroth order approximation somewhere on the side of the bell
would look a bit more like this.
This is fairly straightforward, but
now let's think about the first order approximation.
Drawing the analogy from the 1D case again,
the first order approximation should have a height and a gradient.
So we're still expecting a flat surface, but
this time it can be tilted at an angle.
Now if we are calculating it at the peak, it wouldn't be a very exciting case
as it's a turning point and so the gradient is zero.
However, let's look again at the point on the side of the slope.
Taking the first order approximation here gives us a surface with the same height
and gradient at the point.
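We can check this numerically with the hypothetical Gaussian from earlier: its gradient vanishes at the peak but not at a point on the slope.

```python
import numpy as np

def grad_f(x, y):
    """Analytic gradient of f(x, y) = exp(-(x**2 + y**2))."""
    g = np.exp(-(x**2 + y**2))
    return np.array([-2 * x * g, -2 * y * g])

print(grad_f(0.0, 0.0))  # zero gradient -> the first order surface is flat at the peak
print(grad_f(1.0, 0.0))  # about [-0.736, 0] -> a tilted plane on the side of the bell
```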
Finally, let's look at the second order approximation for these two points.
We're now expecting some kind of parabolic surface.
However, if I calculate it at the peak, nothing seems to appear.
But that's just because we've created a paraboloid sitting inside our bell-shaped
surface, so we need to look from underneath to even see it.
Finally, if we plot the second order approximation at a point on
the side of the bell curve,
you can see that we end up with some kind of interesting saddle function.
But it's still fairly clear even by eye that the gradient and
the curvature are matching up nicely at that point.
However, this approximation is evidently not useful
outside of a fairly small region around the point.
We're now going to take a look at how we would write expressions for
these functions.
So if we want to build a Taylor series expansion of the two-dimensional function
f at the point (x, y), and then use it to evaluate the function at the nearby
point (x + Δx, y + Δy), our zeroth order approximation
is just a flat surface with the same height as the function
at our expansion point.
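Written out, that is just:

$$f(x + \Delta x,\, y + \Delta y) \approx f(x, y)$$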
The first order approximation incorporates the gradient information
in the two directions.
And once again, we are thinking about how rise equals the gradient times the run.
Notice here for compactness, I'm using the partial symbol with the subscript
to signify a derivative with respect to a certain variable.
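With that notation, the first order approximation reads:

$$f(x + \Delta x,\, y + \Delta y) \approx f(x, y) + \partial_x f(x, y)\,\Delta x + \partial_y f(x, y)\,\Delta y$$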
If we look now at the second order approximation, we can see
that the coefficient of one-half still remains, as per the one-dimensional case.
But now we have three terms, all of which are second derivatives.
The first has been differentiated with respect to x twice.
The last with respect to y twice.
And the middle term has been differentiated with respect to each
of x and y.
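Written out, and noting that the cross term picks up a factor of 2 because differentiating with respect to x then y gives the same result as y then x, and both orderings appear in the derivation:

$$f(x + \Delta x,\, y + \Delta y) \approx f + \partial_x f\,\Delta x + \partial_y f\,\Delta y + \frac{1}{2}\big(\partial_{xx} f\,\Delta x^2 + 2\,\partial_{xy} f\,\Delta x\,\Delta y + \partial_{yy} f\,\Delta y^2\big)$$

where all the derivatives are evaluated at the expansion point (x, y).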
Now there are of course higher order terms, but
we've already got more than enough to play with here.
So let's look again at the first order term.
It's the sum of the products of the first derivatives with the step sizes, but
hopefully this will remind you of our discussion of the Jacobian.
So we can actually re-express this as just the Jacobian multiplied by a vector
containing delta x and delta y, which we can write as delta bold x,
where the bold x signifies a vector containing those two terms.
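That is, the first order term can be written as:

$$J\,\Delta\mathbf{x} = \begin{bmatrix}\partial_x f & \partial_y f\end{bmatrix}\begin{bmatrix}\Delta x \\ \Delta y\end{bmatrix} = \partial_x f\,\Delta x + \partial_y f\,\Delta y$$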
Finally, if we look at the second order term in the same way,
we notice that these second derivatives can be collected into a matrix,
which we've previously defined as our Hessian.
Then, to make the sum we need,
we multiply the Hessian by our delta x vector, and
premultiply by the transpose of the delta x vector, and that's it.
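Putting all three orders together gives the compact form:

$$f(\mathbf{x} + \Delta\mathbf{x}) \approx f(\mathbf{x}) + J\,\Delta\mathbf{x} + \frac{1}{2}\,\Delta\mathbf{x}^{\mathrm{T}}\,H\,\Delta\mathbf{x}$$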
So we now have a nice compact expression for
the second order multivariate Taylor series expansion, which brings together so
much of our calculus and linear algebra skills, and makes use of the Jacobian and
Hessian concepts which we defined earlier in the course.
And of course, although we've been talking about the two-dimensional case
in this video so far, we could actually have
any number of dimensions contained in our J, H, and delta x terms.
So this immediately generalises from 2D to multidimensional hypersurfaces,
which is pretty cool.
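To make this concrete, here is one more minimal sketch, again assuming the hypothetical unnormalised Gaussian from earlier, that builds the second order approximation from analytic Jacobian and Hessian expressions and checks it against the true function near an expansion point:

```python
import numpy as np

def f(v):
    """Unnormalised 2D Gaussian, f(x, y) = exp(-(x^2 + y^2))."""
    x, y = v
    return np.exp(-(x**2 + y**2))

def jacobian(v):
    """Analytic Jacobian of f: the row vector of first partial derivatives."""
    x, y = v
    return np.array([-2 * x, -2 * y]) * f(v)

def hessian(v):
    """Analytic Hessian of f: the matrix of second partial derivatives."""
    x, y = v
    return np.array([[4 * x**2 - 2, 4 * x * y],
                     [4 * x * y,    4 * y**2 - 2]]) * f(v)

def taylor2(p, dv):
    """Second order Taylor approximation of f evaluated at p + dv."""
    return f(p) + jacobian(p) @ dv + 0.5 * dv @ hessian(p) @ dv

p = np.array([1.0, 0.5])       # a point on the side of the bell
dv = np.array([0.05, -0.05])   # a small step away from p

print(taylor2(p, dv))  # ~0.27111 -- the approximation...
print(f(p + dv))       # ~0.27117 -- ...closely matches the true value
```

The same `taylor2` line works unchanged in any number of dimensions, which is exactly the generalisation described above.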
See you in the next one.
[MUSIC]