Hello, everyone, and welcome back. This is Thistleton and Sadigov. Today, we're going to continue with ARIMA processes. Until now, we have seen autoregressive processes, we have seen moving average processes, and we have also seen mixed ARMA processes, which means that in those processes there are some autoregressive terms and some moving average terms. Today, we are going to describe autoregressive integrated moving average models. In other words, we will have one more addition, one more update, to our previous model: an autoregressive part, a moving average part, and an integrated part in the middle. We will learn how to rewrite autoregressive integrated moving average models, in other words ARIMA processes, using backshift and difference operators. Let's remember ARMA processes. An ARMA(p,q) process is defined as the following. There are p autoregressive terms, because Xt is regressed on the previous p values of the same time series, and Xt also depends on the previous q noises, plus a noise term for the current time. So one part gives the moving average terms and the other part the autoregressive terms. And we learned how to write this in polynomial notation, so we can put the autoregressive terms on the left and keep the moving average terms on the right, and we will have the notation phi(B)Xt = beta(B)Zt. Phi(B) here is the autoregressive polynomial, beta(B) here is the moving average polynomial. Now, if you think of z as a complex number, we would like beta(z) and phi(z) to have complex roots that lie outside of the unit circle, so that our process will be stationary and invertible. Of course, not every real-life time series is stationary. Sometimes we do have non-stationary time series, and it is possible that this non-stationarity comes from a systematic change, the trend of the time series.
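The stationarity and invertibility conditions above can be sketched numerically. This is only an illustrative sketch, not from the lecture: the ARMA(2,1) coefficients below are invented for the example, and we simply check whether all roots of phi(z) and beta(z) lie outside the unit circle.

```python
import numpy as np

# Hypothetical ARMA(2,1): X_t = 1.5 X_{t-1} - 0.75 X_{t-2} + Z_t + 0.4 Z_{t-1}
# AR polynomial: phi(z)  = 1 - 1.5 z + 0.75 z^2
# MA polynomial: beta(z) = 1 + 0.4 z
phi = [1, -1.5, 0.75]   # coefficients in increasing powers of z
beta = [1, 0.4]

def roots_outside_unit_circle(poly_increasing_powers):
    # numpy.roots expects coefficients in decreasing powers, so reverse first
    roots = np.roots(poly_increasing_powers[::-1])
    return bool(np.all(np.abs(roots) > 1))

print("stationary:", roots_outside_unit_circle(phi))   # True for this phi
print("invertible:", roots_outside_unit_circle(beta))  # True for this beta
```

For this choice, phi(z) has the complex pair 1 ± 0.577i with modulus about 1.15, and beta(z) has the single root -2.5, so both checks pass.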
So what we would like to do first, before we try to fit ARMA models to our time series, is to somehow remove the trend. How are we going to remove the trend? We are going to use the difference operator, which is basically one minus the backshift operator B. Let me just go back to the random walk model that we saw at the beginning of the course. If I subtract, Xt minus Xt minus one, that's the difference operator applied to Xt. I can write Xt minus one as BXt, so this can be written as (1 - B)Xt. So, for example, if you look at the random walk, which is basically the previous step plus some noise, then we can take the Xt minus one to the left and write this as delta Xt equals Zt, which is a stationary process. So, if the process Xt is autoregressive integrated moving average of order (p,d,q), realize that we now have a new parameter d, then what we mean is the following. We have Yt, which is Xt with the difference operator applied d many times; in other words, Yt equals (1 - B)^d Xt. After we difference the time series d many times, Yt is an ARMA process of order p and q. So whenever Yt here is ARMA(p,q), then Xt, the original time series, is ARIMA(p,d,q), and d is the number of times that we take the difference. In other words, we can write this ARIMA process in polynomial notation: we're going to have phi(B), and instead of Yt we have delta^d Xt, the difference operator applied d many times, equal to beta(B)Zt. Or, instead of delta as the difference operator, we can write this as (1 - B)^d, B being the backshift operator. Now, usually this order of differencing is not too much; you usually take one or two differences. Over-differencing may introduce artificial dependence which did not exist in the first place. We're going to look at our ACF; the ACF itself might also tell us that maybe differencing is needed.
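The random walk calculation above can be checked directly. As a small numpy sketch with simulated data (not from the lecture): the random walk Xt = Xt-1 + Zt is non-stationary, but applying (1 - B) once recovers the white noise Zt exactly.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
z = rng.standard_normal(n)   # white noise Z_t
x = np.cumsum(z)             # random walk: X_t = X_{t-1} + Z_t

dx = np.diff(x)              # first difference: (1 - B) X_t = X_t - X_{t-1}

# Differencing undoes the cumulative sum: dx[t] == z[t+1] for every t
print(np.allclose(dx, z[1:]))   # True
```

So one application of the difference operator (d = 1) turns this non-stationary series into stationary noise, which is exactly the ARIMA(0,1,0) case.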
Realize that if you look at the polynomial phi(z)(1 - z)^d, even though phi(z) might not have a complex root inside the unit circle, including the boundary, (1 - z)^d has a unit root with multiplicity d. So, in other words, the ACF of this process will decay very, very slowly. Once you see a very slowly decaying ACF, that is also a suggestion that maybe we have to do some differencing. Now, later on, we are going to actually model real-life data sets. So, basically, we're going to go through this checklist in a way; this is going to be our guide. If there is a trend, that will suggest differencing. If there is variation in the variance, right, if the variance is different in one part of the time series from another part of that time series, which means it's not stationary, we have to use some kind of transformation to stabilize the variance, and the common transformation is the logarithm. And then, sometimes, we will need the differencing. So if you take the logarithm and then difference, that whole thing is called, in financial time series, the log-return of the time series. We're going to look at the ACF, the autocorrelation function, which might suggest the order q of the moving average terms for us. We're going to look at the PACF of the differenced or transformed time series, and that might suggest the order p of the autoregressive terms. Then, once we have a lot of models that we can play with, we will somehow have to choose one of those models. Right. And what are going to be our criteria? We're going to have more than one criterion. I know that you have already seen the Akaike Information Criterion, so AIC is one of the things we're going to look at; we hope to get the smallest AIC. We also want to get the smallest sum of squared errors, so we are going to look at that. And then I'm going to introduce one more measure, in a way, to select a model, and this measure is going to be the Ljung-Box Q-statistic.
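The "slowly decaying ACF suggests differencing" diagnostic above can be illustrated with a hand-rolled sample ACF. This is a sketch on simulated data, and `sample_acf` is a hypothetical helper (a real analysis would use a package's ACF function), but it shows the pattern: a random walk's sample ACF stays near 1 for many lags, while the differenced series drops to near zero after lag 0.

```python
import numpy as np

def sample_acf(x, max_lag):
    # Sample autocorrelations: autocovariances normalized by the lag-0 value
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    c0 = np.dot(x, x) / n
    return np.array([np.dot(x[: n - k], x[k:]) / (n * c0)
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(1000))   # non-stationary random walk

print(sample_acf(walk, 5))           # decays very slowly: differencing needed
print(sample_acf(np.diff(walk), 5))  # near zero after lag 0: looks like noise
```

The first printed row stays close to 1 across all lags; the second starts at 1 (lag 0) and is small thereafter, which is the signature of white noise.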
The Ljung-Box Q-statistic I'm going to talk about next lecture. Basically, by looking at these three, we'll try to select our model. Once we have our model, we're going to go through the estimation and try to fit the model to our time series. So what have we learned? We have learned how to describe autoregressive integrated moving average models, and we have learned how to rewrite autoregressive integrated moving average models using backshift and difference operators.
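The AIC-based model selection described above can be sketched in miniature. This is only an illustration, not the lecture's procedure: `fit_ar_ols` is a hypothetical helper that fits AR(p) by ordinary least squares and scores it with the Gaussian AIC approximation n·log(SSE/n) + 2(p + 1), whereas real software would use maximum likelihood. The data are simulated from an invented AR(2) model, so the order-2 fit should win.

```python
import numpy as np

def fit_ar_ols(x, p):
    # Fit AR(p) by least squares: regress x[t] on x[t-1], ..., x[t-p]
    n = len(x)
    X = np.column_stack([x[p - j - 1 : n - j - 1] for j in range(p)])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = float(np.sum((y - X @ coef) ** 2))
    n_eff = len(y)
    # Gaussian AIC approximation (sample sizes differ slightly across p,
    # so this comparison is approximate)
    aic = n_eff * np.log(sse / n_eff) + 2 * (p + 1)
    return coef, sse, aic

# Simulate an AR(2) process (coefficients invented for the example)
rng = np.random.default_rng(1)
n = 500
z = rng.standard_normal(n)
x = np.zeros(n)
for t in range(2, n):
    x[t] = 1.5 * x[t - 1] - 0.75 * x[t - 2] + z[t]

_, sse1, aic1 = fit_ar_ols(x, 1)
_, sse2, aic2 = fit_ar_ols(x, 2)
print(f"AR(1): AIC={aic1:.1f}   AR(2): AIC={aic2:.1f}")
# Here the true order wins: AR(2) has both the smaller SSE and the smaller AIC.
```

In practice we would compare several candidate (p,d,q) combinations the same way, alongside the sum of squared errors and the Ljung-Box statistic mentioned above.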