Then each one of the rows has the same discussion, and the same decision,
based on what that factor is and the way it is subgrouped.
So depending on what type of data you have for your inputs and your outcomes,
only a certain set of tools will fit those data items.
For example, if I have a variable outcome and a variable input, think back
to a scatter plot where it's just a bunch of dots and it looks like the Milky Way.
Now correlation and
regression are the perfect tools to do an analysis on that type of data.
Each one of these tools has its own concepts of hypothesis testing.
There is going to be a null hypothesis or an assumption.
For example, with correlation and regression, the null hypothesis is, again,
that nothing is going on, nothing is happening.
It may look like there is a picture there.
It may look like there is a belt on a huntsman up in the night sky.
But the correlation and regression null hypothesis says no,
there is nothing going on.
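To make that null hypothesis concrete, here is a minimal sketch in Python. It is not the method used in the course: it swaps in a permutation test (shuffle the outcomes many times and see how often chance alone produces a correlation as strong as the observed one) so that it runs with the standard library only. The data values and the names `pearson_r` and `permutation_p_value` are made up for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate how often shuffled data looks at least as correlated.

    Under the null hypothesis (nothing is going on), the pairing of
    inputs and outcomes is arbitrary, so shuffling the outcomes should
    produce correlations about as strong as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)

# Hypothetical variable input and variable outcome.
inputs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
outcomes = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

r = pearson_r(inputs, outcomes)
p = permutation_p_value(inputs, outcomes)
print(f"r = {r:.3f}, permutation p-value = {p:.4f}")
```

A small p-value says the pattern would be hard to get by chance, so you would reject the idea that nothing is going on. A large one says the picture in the dots may just be the Milky Way.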
Or, lower in the left-hand corner, you have the t-test and F-test.
Those are individual tools: the t-test asks whether there is a difference between
the means of subgroups, and the F-test asks about the spread or variation in subgroups.
Some of them have p-values and
others have tests that give you outcomes and concerns on that data.
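As a rough sketch of what those two tools compute, the Python below works out a t statistic for the difference in subgroup means and an F ratio for the difference in subgroup variation. It assumes Welch's unequal-variance form of the t statistic, which the lecture does not specify, and it stops at the test statistics: turning them into p-values needs the t and F distributions, which are left out here. The subgroup data and function names are hypothetical.

```python
import math
import statistics

def welch_t_statistic(a, b):
    """t statistic for a difference in means (Welch's form, an assumption)."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))

def f_ratio(a, b):
    """Ratio of sample variances, larger over smaller, so it is always >= 1."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return max(var_a, var_b) / min(var_a, var_b)

# Hypothetical measurements from two subgroups.
shift_a = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
shift_b = [10.9, 11.4, 10.6, 11.8, 10.2, 11.1]

t_stat = welch_t_statistic(shift_a, shift_b)
f_stat = f_ratio(shift_a, shift_b)
print(f"t statistic = {t_stat:.2f}, F ratio = {f_stat:.2f}")
```

A t statistic far from zero points to a difference in means, and an F ratio far above one points to a difference in spread. In both cases the null hypothesis is that the subgroups are the same.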
And we're going to go through one of them right now.
You'll see control charts showing up in both of the top two boxes.
It doesn't matter whether you have variable or
attribute data, you still can have a set of control charts
that look for trends and behaviors in that type of data.
In a control chart,
I'm just going to plot each one of the data items spaced over time.
And when you go left to right they have to be time ordered.
It's not going to be like a Pareto where you can just have the tallest to smallest.
No, it's very important that not only are they time ordered, but
that they are spaced properly, with uniform intervals between them.
If you look closely at this chart, after you've plotted all the points,
typically the system does it, or you might do it by hand,
the center line is going to be the average of all of those data points.
It shows the central behavior of that data set.
But then you have upper and lower bounds, where you're asking how far out
a new data point would have to be, compared to all the others, to be too unique,
so that it would violate that null hypothesis of nothing's going on.
In a control chart, the null hypothesis is that the process is very stable, and
the data is not going to go beyond plus or minus three standard deviations.
When you have three standard deviations plus or minus, greater
than 99% of the expected behavior (about 99.7% for normally distributed data)
falls within those lines and boundaries.
So anything that goes beyond the limits on the top or the bottom doesn't mean that it was
particularly good or bad; all it really says is this is unique.
This is different.
This is something special.
And this is something that violates my concept of the null hypothesis,
that nothing is going on that's particularly different.
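The control chart logic described here can be sketched in a few lines of Python. This is a simplified illustration, not a full control chart implementation: it assumes the limits are set from a baseline of time-ordered points using the plain sample standard deviation, whereas many real charts estimate the spread from moving ranges or subgroup ranges instead. The data and the names `control_limits` and `out_of_control` are hypothetical.

```python
import math
import statistics

def control_limits(baseline):
    """Return (lower, center, upper) using the mean plus or minus 3 standard deviations."""
    center = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)  # assumption: plain sample standard deviation
    return center - 3 * sigma, center, center + 3 * sigma

def out_of_control(points, lower, upper):
    """Return (index, value) for every point outside the limits."""
    return [(i, v) for i, v in enumerate(points) if v < lower or v > upper]

# Hypothetical time-ordered, evenly spaced baseline measurements.
baseline = [50.2, 49.8, 50.1, 49.9, 50.3, 50.0, 49.7, 50.2, 50.1, 49.9]
lower, center, upper = control_limits(baseline)

# New points arriving after the limits were set.
new_points = [50.0, 50.4, 52.5, 49.8]
flags = out_of_control(new_points, lower, upper)

# Share of a normal distribution that falls within 3 standard deviations.
coverage = math.erf(3 / math.sqrt(2))

print(f"center = {center:.2f}, limits = ({lower:.2f}, {upper:.2f})")
print(f"flagged points: {flags}")
print(f"expected share inside the limits: {coverage:.4f}")
```

A flagged point is not labeled good or bad by this check. It is only marked as different enough from the baseline that the "nothing is going on" assumption no longer holds for it.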