Next, we are going to learn how to analyze time series data. One way to do that is to exploit an inherent property of almost all time series data: at any point in time, a data point depends, slightly or strongly, on the previous value or values. Let's see what that means next.

Now, let us look at autocorrelation, which is an important property of almost all time series data. Unlike the data we use in linear regression, time series observations occur at different times. One may wonder whether a time series has some relationship to previous versions of itself. Correlation is a great way to measure this relationship. Before we run a linear regression, we can tell whether two variables are related by calculating the correlation coefficient between them. In linear regression, the observations are paired, so there is just one way to compute the classical correlation coefficient. This correlation is known as the Pearson correlation coefficient, named after the statistician Karl Pearson. The same idea applies to time series: a series can correlate with itself.

You may have seen the movie The Truman Show, starring Jim Carrey. In this movie, he is the star of an around-the-clock reality television show, but he doesn't know it. In one classic scene, he sits in his car and observes all the events that occur around him. He notices that after several minutes, all the activities repeat themselves: the same cars driving, the same people biking, the same people talking. This is an example of a time series correlating with itself.

How does this work with a set of training data? Let's do the following experiment. Suppose we have the returns of Apple for six months. We would have about 125 data points. We will form two series from these. We form the first series by excluding the last five points; we'll call this X. We form the second series by excluding the first five points; we'll call this Y. That is, we make X by taking the first 120 data points.
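The lagging experiment above can be sketched in a few lines of numpy. This is a minimal sketch; since we don't have the actual Apple data here, synthetic returns stand in for the 125 daily data points.

```python
import numpy as np

# Hypothetical stand-in for ~6 months (125 days) of daily Apple returns;
# in practice these would come from real price data.
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=125)

lag = 5
x = returns[:-lag]   # exclude the last 5 points -> first 120 points
y = returns[lag:]    # exclude the first 5 points -> last 120 points

# Pearson correlation between the series and its 5-day lagged copy
autocorr_5 = np.corrcoef(x, y)[0, 1]
print(len(x), len(y), round(autocorr_5, 3))
```

Note that both slices have the same length, 120, which is exactly what the correlation calculation requires.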
We make Y by taking the last 120 data points. We call this lagging. To compute the five-day autocorrelation, we must exclude five points from each series. Remember, to correlate, we always need the same number of points in both series. When we calculate a five-day autocorrelation using 125 days of data, we have 125 minus 5, or 120, total pairs. We then proceed to compute the correlation just as if these were two entirely different series.

We may wonder, though: why not try an eight-day autocorrelation? Here, we form our X by taking the first 117 points, we form our Y by taking the last 117 points, and then we correlate these. We come to realize that there are many different autocorrelations we can calculate. We can calculate the one-day autocorrelation, the two-day, the three-day, and so on, all the way up to a number of days slightly smaller than the number of training points we have.

Practically speaking, let's calculate the autocorrelation of daily SPY returns from lag 0 to lag 10. This is real data. You can see from the graph that there is a correlation of about 30 percent at a seven-day lag. We can even include the lag-zero correlation. Why would we do that? To emphasize that the correlation of a series with itself is one. We can then draw a plot of these autocorrelations. This is known as the ACF chart, as you can see on the slide.

In the last slide, we discussed autocorrelation. Correlation has no direction: the correlation between X and Y is the same as the correlation between Y and X. Regression, however, is different. When we run a regression, there is a specific direction. The regression of Y on X differs from the regression of X on Y. In regression, direction matters. In ARIMA modeling, we will start by choosing an order of lag to regress on. We can gain insight here from our autocorrelation plot, which gives us a sense of which lags matter. An AR process is one in which autoregression occurs.
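Computing the ACF from lag 0 to lag 10 is just the lagging trick repeated for each lag. Here is a minimal sketch; again, synthetic returns stand in for the actual SPY data, and lag 0 is set to 1 by definition.

```python
import numpy as np

rng = np.random.default_rng(1)
series = rng.normal(0.0, 0.01, size=125)  # stand-in for daily SPY returns

def autocorr(s, lag):
    """Pearson correlation between a series and its lagged copy."""
    if lag == 0:
        return 1.0  # a series is perfectly correlated with itself
    return np.corrcoef(s[:-lag], s[lag:])[0, 1]

# The ACF: one autocorrelation per lag, from 0 to 10
acf = [autocorr(series, k) for k in range(11)]
print([round(v, 2) for v in acf])
```

Plotting these eleven values against their lags produces the ACF chart described on the slide.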
Our goal is to find the time lag that best captures the order of such an AR process. This is not a one-step procedure; it's an iterative process. Not very clear, is it? Let's look at a specific example.

Here's a sample time series generated with a lag of 1, shown to the right of the screen. We don't know what lag means yet, but the next step will make that clear. The variable on the y-axis is what we are trying to model. The x-axis, as usual, is the time period, which is what we expect in a time series.

Here, we show you six charts of the same series as before. The first chart, on the top left, plots each value of Y against the value of Y one period earlier. Similarly, the second chart plots Y against the Y values two periods prior, and so on. Notice how tightly packed the points are in the top-left chart, where Y is plotted against its values one time period ago. This tight correlation slowly disperses as we move to the right and down. In the final chart, at the bottom right, we see that there is a lot of dispersion. This means that the first chart shows a high correlation of Y with its values one period prior, and that longer lags don't have such a high correlation. This tells us that the time series was generated by an autoregressive process of lag 1. Wow, that wasn't so hard, was it? Let's look at some more examples.

Here are two more examples. The chart on the left represents a time series with lag equal to 1, and the chart on the right represents one with lag equal to 2. Notice that you can't tell much from looking at the charts themselves. You have to do something similar to what we did earlier, that is, correlate Y with its values at lag 1, lag 2, and so on, in order to determine what the lag is. Knowing the lag tells us that we can use prior values to predict future values. What else can we use to predict future values?
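The lag-1 behavior described above can be reproduced numerically. The sketch below simulates a simple AR(1) process (the coefficient 0.9 and series length are assumptions for illustration, not values from the lecture) and checks that, as the scatter charts suggested, the lag-1 autocorrelation is high while longer lags are weaker.

```python
import numpy as np

rng = np.random.default_rng(2)
phi = 0.9   # assumed AR(1) coefficient, chosen for illustration
n = 500
y = np.zeros(n)
eps = rng.normal(size=n)
for t in range(1, n):
    # Each value depends on the one before it, plus noise
    y[t] = phi * y[t - 1] + eps[t]

def autocorr(s, lag):
    """Pearson correlation between a series and its lagged copy."""
    return np.corrcoef(s[:-lag], s[lag:])[0, 1]

a1 = autocorr(y, 1)
a5 = autocorr(y, 5)
print(round(a1, 2), round(a5, 2))
```

The lag-1 autocorrelation comes out close to 0.9 and decays at longer lags, mirroring the dispersion pattern across the six scatter charts.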