In 1988, creative EU bureaucrats in Brussels created EU resolution 1677-88. The resolution said that for a cucumber to be of the highest quality standard, its curvature should not exceed ten millimeters. Had a cucumber an eleven millimeter curvature, it was no longer in the highest quality level; nine millimeters, however, was fine. Now, this little anecdote clearly tells us something about the creativity of politicians in coming up with stupid laws. But more to the point of this session, it reminds us that often a matter of good or bad is not as binary as it sounds. Oftentimes, there's an underlying parameter that can be measured, be it a geometry, a weight, or a temperature. And this underlying measurement has to be compared with some specification. This is the basic idea behind Six Sigma. Now, I have to confess that I felt it was below my academic dignity to go to the local market and purchase 50 cucumbers only to take out a ruler and measure the curvature. Instead, to illustrate the concept of Six Sigma, I did something much sweeter. Sweeter, though potentially unhealthier. I went to the local grocery store and bought 50 bags of M&M's. I then took out a high-resolution scale to measure the weight of these 50 bags. Here's the data that I got. Take a look at this data. For example, we can compute the average of the weights in the sample. This was about 50 grams, which, I have to confess, was significantly higher than what the company has labeled on the back. That was about 47. You also see that the standard deviation is about 1.1. To get a visual of the distribution, we can use the histogram analysis in Excel. For that, go to Data Analysis. Click on Histogram. Enter the data here as the input range. As the bin range, enter a number that is slightly below the lowest value and a number that is slightly above the highest value. And then click on OK. Now, you get a chart that looks like this.
This is a histogram of the weights, and surprise, surprise, it's interesting to see how the laws of statistics kick in. The distribution here is almost normal-like, with most of the values being here, somewhere in the middle around the mean, and some of the bags being extremely light or extremely heavy. We will use this statistical distribution in our calculations in a moment to figure out how many defects there will be in a very large sample. Now, how should we define an M&M bag as defective? Having eaten them all, I have to confess that all of them were quite yummy. And so I have a hard time speaking about defects. But let's say, for the sake of argument, that we have a specification that a bag should have at least 48 grams of chocolate in it. We refer to this number as the LSL, which stands for the lower specification limit. Similarly, we speak of the USL, the upper specification limit, as the number that, if exceeded, would make us call the bag defective. So, let's say for the sake of argument that the bag is defective if it has more than 52 grams of chocolate in there. With these two numbers, I can compute a number that is known as the capability score. The capability score, also known as the Cp score, looks at the ratio between the width of the specification interval and six times the standard deviation in the process. This number tells us the capability of the process. Notice that we can increase process capability in one of two ways. Either we make the specifications more forgiving: there's a higher capability if you assume that the specifications were 47 to 53. Or we reduce the standard deviation. Both of these would reduce the likelihood of a defect. Let me illustrate the idea of the capability score on the following slide. Just as a reminder, this is the definition of the capability score: the upper specification limit minus the lower specification limit, divided by six times the standard deviation.
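For anyone who wants to check the definition outside of a spreadsheet, here is a minimal sketch in Python. The function name capability_score and its parameter names are made up for this illustration. The specification limits (48 and 52, or 47 and 53) and the standard deviation of 1.1 come from the session; the halved standard deviation of 0.55 is a purely hypothetical value, used only to show the second way of raising the score.

```python
def capability_score(lsl, usl, sigma):
    """Cp = (USL - LSL) / (6 * sigma), as defined in the session."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if usl <= lsl:
        raise ValueError("USL must be larger than LSL")
    return (usl - lsl) / (6.0 * sigma)


# The M&M example: specification 48 to 52 grams, standard deviation 1.1.
cp_current = capability_score(48.0, 52.0, 1.1)

# Way 1: more forgiving specifications (47 to 53), same process.
cp_wider_spec = capability_score(47.0, 53.0, 1.1)

# Way 2: same specifications, less variation.
# 0.55 is a hypothetical value, not a number from the session.
cp_less_variation = capability_score(48.0, 52.0, 0.55)

print(f"current:         {cp_current:.2f}")
print(f"wider spec:      {cp_wider_spec:.2f}")
print(f"less variation:  {cp_less_variation:.2f}")
```

Both changes raise the score relative to the current process, which is the point made above: a wider specification or a smaller standard deviation each make a defect less likely.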
Now, imagine two distributions. One of them has a density function that has a somewhat larger variance. And you notice here how you can go three standard deviations from the mean of the distribution before you hit the specification limit. Now, in the lower case, you see a distribution that has a lower variance, and you need to go six standard deviations before you incur a defect. Now, clearly, defects are less likely in the lower case. There is simply less probability mass in the tails here. So this suggests that we can translate the capability score of a distribution into the probability of defects. Let me illustrate this calculation by going back into our spreadsheet. So how likely is it that we encounter a bag of M&M's that is heavier than 52 grams? What's the probability that the bag is too heavy? I can get to this by using the normal distribution function in Excel, and looking at the 52 grams relative to a distribution with 50 as the mean and 1.1 as the standard deviation. The probability that the weight stays below this is about 96.6%. So the probability that the bag is too heavy is simply one minus that, which is 3.4%. Next, let's ask ourselves, what's the probability that the bag is too small, that it is too light? Well, for that, I have to look at the normal distribution again, this time at the lower specification limit of 48, with 50 as the mean and 1.1 as the standard deviation. And this is equal to 3.3%. Now, for a defect, the bag needs to be either too heavy or too light, and so the sum of those two is simply the probability of a defect. Now, I can take this number and multiply it by, say, a million units in the production run to get a number that is known as the ppm, the parts per million. So we have 67,818 defective parts per million parts. So you notice that the capability score is around 0.6, as we just saw in the M&M example. This equates to a defect probability of around 0.067, or 6.7 percent.
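The spreadsheet calculation just described can be sketched in Python as follows. This is an illustration under stated assumptions, not the spreadsheet itself: the function name defect_probabilities is made up here, and the inputs are the rounded values quoted in the session (mean 50, standard deviation 1.1). Because the spreadsheet works with the unrounded sample statistics, this sketch lands close to, but not exactly on, the 67,818 parts per million quoted above.

```python
from statistics import NormalDist


def defect_probabilities(mean, sigma, lsl, usl):
    """Return (P(too light), P(too heavy)) for a normally distributed weight."""
    dist = NormalDist(mu=mean, sigma=sigma)
    p_too_light = dist.cdf(lsl)        # probability of falling below the LSL
    p_too_heavy = 1.0 - dist.cdf(usl)  # probability of exceeding the USL
    return p_too_light, p_too_heavy


# Rounded values from the session; the real sample statistics differ slightly.
p_light, p_heavy = defect_probabilities(mean=50.0, sigma=1.1, lsl=48.0, usl=52.0)

# A bag is defective if it is too light or too heavy, so the two add up.
p_defect = p_light + p_heavy
ppm = p_defect * 1_000_000

print(f"P(too light) = {p_light:.4f}")
print(f"P(too heavy) = {p_heavy:.4f}")
print(f"defects per million = {ppm:,.0f}")
```

With a mean of exactly 50, the two tails come out identical. The small gap between 3.4% and 3.3% in the session suggests the sample mean was not exactly 50.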
Or, put differently, about 67,000 defects in a million parts. Now, in this table, we show the relationship between the capability score and defect probabilities. For example, at a capability score of one, you can go three sigmas from the mean to either side before hitting the specification limits. And we have a defect probability of 0.0027. Put differently, you're going to have 2,700 defects per million parts. Now, where do these numbers come from? Let's first look at the three sigma process. What's the defect probability? Well, I have to go three standard deviations before I hit the specification limit. Let's assume the underlying distribution is a standard normal distribution, that is, one with a mean of zero and a standard deviation of one. So for a defect, I have to hit the number three. So I'm looking at the normal distribution with mean zero and standard deviation one, evaluated at three, which gives a value of 99.865%. So one minus that probability tells me the probability of the part being too large. This is also the probability of the part being too small, so, assuming symmetry, I can just double this number, and that gives you the 0.0027 that you saw on the earlier table. Now let's move this further and look at a six sigma process. With six sigma, we have to go six standard deviations, and you notice that this number here becomes ridiculously small. That is a little hard to interpret, so let's not look at it as a probability, but as defects per million parts. So we have to multiply this by a million, and we see that the number here is roughly 0.002. In other words, you're going to have two defects per billion parts. So a quality target is typically expressed in defect probabilities or parts per million. We see in this table that this can be matched to a capability score. This allows me to ask myself, for a given specification limit, what is the amount of variability in the process that I can tolerate before violating my quality goal? Let me go back to the example of the M&M's.
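The mapping from capability score to defects per million can be reproduced with a short Python sketch. It rests on the same assumptions as the calculation above: the process is normally distributed and centered between the two specification limits, so a capability score of Cp means the limits sit 3 times Cp standard deviations from the mean. The function name defects_per_million is made up for this illustration.

```python
import math


def defects_per_million(cp):
    """Two-sided defect rate in ppm for a centered, normal process."""
    if cp <= 0:
        raise ValueError("cp must be positive")
    # Distance from the mean to each specification limit, in standard deviations.
    z = 3.0 * cp
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    # This covers both tails at once: too small and too large.
    p_defect = math.erfc(z / math.sqrt(2.0))
    return p_defect * 1_000_000


for cp in (1.0, 2.0):
    print(f"Cp = {cp:.1f}: {defects_per_million(cp):.4g} defects per million")
```

For a capability score of one this gives roughly 2,700 defects per million, and for a score of two roughly 0.002 per million, that is, about two per billion. Using the complementary error function rather than one minus the cumulative probability keeps the tiny six sigma tail numerically clean.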
So we said we had the USL minus the LSL, divided by six sigma, and, if I'm aiming for a six sigma operation, that would have to equate to two. Now, in our example, the difference between the USL and the LSL was four. So that gives me an equation that I can solve: four divided by six sigma is equal to two. And so, in other words, sigma is equal to one-third. So, just to gain some confidence in our calculations, let's go back to our Excel spreadsheet and take the current empirically observed standard deviation of 1.1 and replace it with our new goal of a standard deviation of one-third. All the numbers recompute, and you see that the parts per million go down to 0.002, which is the two parts per billion that we talked about. In 2009, the EU bureaucrats finally decided to cancel resolution 1677-88. That makes you hopeful that one day they can also resolve the euro crisis. But more to the point of this session, we saw that variation exists almost everywhere, even in a highly industrialized product such as packaged M&M chocolate. We saw a considerable amount of variation from one package to the other, but would you really call the package a defect just because it has one extra gram of chocolate in it? That is a matter of the product specification. We saw how the Cp score measures the process capability by looking at the variation that you have in the process relative to the allowable variation that you get out of the specifications. We also saw that the capability score tells you how many defects you are likely to make in a thousand or in a million parts, and that's a really good metric to track over time to see whether you are getting better, or across your suppliers to see who is giving you the highest quality product.
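The tolerable-variability calculation from this session, solving (USL - LSL) / (6 sigma) = 2 for sigma and then rechecking the defect rate, can be sketched in Python as follows. The function name max_tolerable_sigma is made up for this illustration, and the check assumes a normal distribution with a mean of exactly 50, the rounded value from the session.

```python
from statistics import NormalDist


def max_tolerable_sigma(lsl, usl, target_cp):
    """Solve (USL - LSL) / (6 * sigma) = target_cp for sigma."""
    if target_cp <= 0:
        raise ValueError("target_cp must be positive")
    return (usl - lsl) / (6.0 * target_cp)


# Six sigma target (capability score of two) with the 48 to 52 gram specification.
sigma_goal = max_tolerable_sigma(lsl=48.0, usl=52.0, target_cp=2.0)

# Recheck the defect rate with the new standard deviation,
# assuming the mean stays at the rounded value of 50 grams.
dist = NormalDist(mu=50.0, sigma=sigma_goal)
p_defect = dist.cdf(48.0) + (1.0 - dist.cdf(52.0))
ppm_goal = p_defect * 1_000_000

print(f"tolerable sigma = {sigma_goal:.4f} grams")
print(f"defects per million at that sigma = {ppm_goal:.4f}")
```

The standard deviation comes out as one-third of a gram, and the defect rate drops to roughly 0.002 per million, matching the two parts per billion from the spreadsheet.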