So, we can figure out our weights matrix and so now,

i ranges from zero to nine,

j ranges from zero to nine.

And so basically, we're going to have i is one,

I'm sorry, it ranges from one to nine, because the counties are numbered from one, so i is one and j is one.

Then we fill out our formula.

We do the estimations and we just walk through

this value, and what this is going to do is, Moran's I is going to give

us a global measurement

of whether the pattern expressed in the underlying dataset for example,

average income is clustered, dispersed or random.

So, whether a map might look like this versus this.

And given a set of features and an associated attribute,

this global Moran's value I is going to indicate that.

So, values near one indicate clustering,

values near negative one indicate dispersion, values near zero indicate a random pattern,

and then we can also calculate a Z score to indicate whether we

can reject the null hypothesis that there's no spatial clustering.
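
As a rough illustration of filling out the formula, here is a minimal sketch of the standard global Moran's I, written in plain Python. The four-region layout and the values are made up for illustration and are not the counties from the lecture; the function name morans_i is also just a placeholder.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n-by-n weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]        # deviation of each region from the mean
    s0 = sum(sum(row) for row in weights)   # sum of all the weights
    # Weighted cross-products of deviations over every pair (i, j)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den

# Hypothetical layout: four regions in a row, each adjacent to the next one
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

clustered = morans_i([1, 1, 5, 5], w)   # similar values sit side by side
dispersed = morans_i([1, 5, 1, 5], w)   # high and low values alternate
print(clustered, dispersed)
```

In this toy layout the side-by-side pattern gives a positive value and the alternating pattern gives negative one, which matches the clustered versus dispersed reading described above.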

So, we can even determine if there is spatial clustering in the data just by

using this sort of formula and we don't have to necessarily even program this by hand.

This is available in Python,

R, and other tools like that.

It's just a matter of thinking about,

how do we fill in all of the different variables?

And the most important one is the weights matrix.

You have to decide: do we want to use

queen's contiguity for connection, so is one connected to two,

four, and five, or is one only connected to two and four?
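
To make that choice concrete, here is a small sketch that builds both kinds of weights matrix. It assumes a three-by-three grid of regions numbered one to nine, row by row, which is consistent with the neighbors mentioned here but is still an assumption about the layout. The helper names grid_weights and neighbors are hypothetical.

```python
def grid_weights(rows, cols, queen=False):
    """Binary contiguity weights for a rows-by-cols grid numbered row by row."""
    n = rows * cols
    w = [[0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # a region is not its own neighbor
                    if not queen and dr != 0 and dc != 0:
                        continue  # rook: skip regions that only touch at a corner
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        w[r * cols + c][rr * cols + cc] = 1
    return w

def neighbors(w, region):
    """1-based labels of the regions connected to a 1-based region."""
    return [j + 1 for j, wij in enumerate(w[region - 1]) if wij]

rook = grid_weights(3, 3)
queen = grid_weights(3, 3, queen=True)
rook_1 = neighbors(rook, 1)     # shared edges only
queen_1 = neighbors(queen, 1)   # shared edges and corners
print(rook_1, queen_1)
```

Under the rook rule, region one connects only to two and four; under the queen rule it also picks up five, which it touches at a corner.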

Do I want to use distance for my weights matrix?

So I have some sort of decay function, so the further apart the centroids are, the smaller the weight.

So one could still be connected to nine but it may be a weight of

point five as opposed to a weight of one in the case of five.
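
One possible decay function is inverse distance between centroids, sketched below. The centroid coordinates assume the same three-by-three grid with unit spacing, and inverse distance is only one of several decay choices, so the exact weights differ from the one and point five used as an example here. What carries over is the ratio: nine is twice as far from one as five is, so it gets half the weight.

```python
import math

def inverse_distance_weights(centroids):
    """Weights that decay as 1 / distance between region centroids."""
    n = len(centroids)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # leave the diagonal at zero
                w[i][j] = 1.0 / math.dist(centroids[i], centroids[j])
    return w

# Assumed centroids: a 3-by-3 grid with unit spacing, numbered row by row
centroids = [(col, row) for row in range(3) for col in range(3)]
w = inverse_distance_weights(centroids)

w_1_5 = w[0][4]   # region one to region five (diagonal neighbor)
w_1_9 = w[0][8]   # region one to region nine (opposite corner)
print(w_1_5, w_1_9, w_1_9 / w_1_5)
```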

So, we have to think about how to calculate the weights matrix and fill in all these values.

Now, I mentioned a Z score.

So, a Z score is a test statistic we use to

test a null hypothesis, and it's associated with the normal distribution.

Basically, the Z score measures how many standard deviations a value is from the mean.

So, we're trying to determine whether this distribution of

patterns could have occurred by chance, at sort of a 95 percent confidence level.

What I mean is, if I had a bunch of income data distributed randomly,

how likely is it that this pattern I'm seeing would have occurred by chance or not?

And so Z scores allow us to test this,

get a measure of how significant this

Moran's I might have been or how unlikely it might have been to occur.
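
Here is a hedged sketch of one common way to get that Z score: compare the observed Moran's I with its expected value and variance under the normality assumption. Statistical packages often use random permutations of the data instead, so treat this as an illustration and not necessarily the method used in the lecture. The four-region layout and values are hypothetical.

```python
import math

def morans_i(values, w):
    """Global Moran's I (same standard formula, repeated to keep this self-contained)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(map(sum, w))
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * num / sum(d * d for d in dev)

def morans_z(values, w):
    """Z score of Moran's I under the normality assumption."""
    n = len(values)
    s0 = sum(map(sum, w))
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2
                   for i in range(n) for j in range(n))
    # Row sum plus column sum for each region, squared and added up
    s2 = sum((sum(w[i]) + sum(w[j][i] for j in range(n))) ** 2
             for i in range(n))
    expected = -1.0 / (n - 1)   # expected I when there is no spatial pattern
    variance = ((n * n * s1 - n * s2 + 3 * s0 * s0)
                / ((n * n - 1) * s0 * s0)) - expected ** 2
    return (morans_i(values, w) - expected) / math.sqrt(variance)

# Hypothetical layout: four regions in a row, each adjacent to the next one
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
z_clustered = morans_z([1, 1, 5, 5], w)
print(z_clustered)
```

With only four regions the Z score comes out at about 1.73, which is inside plus or minus 1.96, so even this tidy-looking pattern would not let us reject the null hypothesis at the 95 percent level.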

And so, if we're given a bunch of regions with a bunch of measurements,

we can calculate our Moran's I.

So, here's an example of the spatial distribution in X and Y.

So, at the location this is one,

two, three, four.

One, two, three.

So at X is one, Y is one, we had no values there.

At X is one, Y is two, we have 4.55.

At X is one, Y is three, we have 5.54, and we can walk through

our Moran's I calculation to do this.

And we also have to calculate the spatial weights matrix as well.

So, we have 10 different spatial regions.

So we have a 10 by 10 weights matrix.

So, box number two is next to number one,

is next to number four, I'm sorry.

Box number one is next to two and is next to four.

Notice, we didn't make it next to three, so we're doing what we call rook's contiguity.

Okay. Now, let's walk through number two together.