In this next lecture section, we're going to concern ourselves with summarizing binary data. Binary data are data that take on only one of two values, essentially yes or no. Does a person have a certain disease, yes or no? Does the subject have a certain characteristic, yes or no? Do they engage in a certain behavior like smoking, yes or no? Numerically, we tend to assign a value of one to a yes outcome and a value of zero to a no. What we're going to see is that summarizing binary outcomes in any single sample is a lot easier than summarizing continuous data. When we summarized continuous data, we had different numbers representing different characteristics of the sample: numbers for center, numbers for spread, numbers for other locations. With binary data, all of that information is contained in one single summary statistic for any single sample: the sample proportion, which we'll show is essentially the mean of values that can only be zero or one. So, you might be deceived into thinking that binary data is easier to deal with than continuous data, but when we start comparing binary outcomes between two or more samples, we're going to see that the situation gets more difficult. When we're comparing two samples, we have only two numbers: the proportion with the outcome in one sample and the proportion with the outcome in the second sample. There are several different ways we can compare these numbers, and while all give the same general result, they can look very different numerically. So, one of the things we'll do is compare the proportions on the absolute scale: we'll take the difference of the two proportions. Another measure will have us take the two proportions we've summarized on two samples and, instead of taking the difference, take the ratio.
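To make the idea of a proportion as a mean of zeros and ones concrete, here is a minimal Python sketch. The data are made up purely for illustration and do not come from any study, and the names `sample_proportion`, `group_a`, and `group_b` are hypothetical. It computes the sample proportion in each of two samples and then compares them on the absolute scale (difference) and the relative scale (ratio).

```python
from statistics import mean

def sample_proportion(values):
    """Proportion of 1s in a list of 0/1 values, which is simply their mean."""
    return mean(values)

# Hypothetical data: 1 = yes (has the outcome), 0 = no.
group_a = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]   # 3 of 10 have the outcome
group_b = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]   # 1 of 10 has the outcome

p_a = sample_proportion(group_a)
p_b = sample_proportion(group_b)

difference = p_a - p_b   # comparison on the absolute scale
ratio = p_a / p_b        # comparison on the relative scale (undefined if p_b is 0)

print(f"p_a = {p_a:.2f}, p_b = {p_b:.2f}")
print(f"difference = {difference:.2f}, ratio = {ratio:.2f}")
```

With these made-up numbers the difference is 0.20 while the ratio is 3.00: the same two proportions, pointing in the same direction, but summarized by numbers that look quite different.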
We'll see that, while both of these measures will agree in terms of the direction of association, they can look very different numerically, and we'll talk about situations where one is reported and the other is not, which can be misleading in terms of what the story is. Then finally, we'll talk about a third comparison measure, another relative one, something that's seen in a lot of epidemiological studies, called the odds ratio. While that really isn't the favored measure of association except for a very specific type of study called a case-control study, it will still arise in work we do later in the course, so I'd like to define it here, as it syncs up with these other measures for summarizing binary data. So, I look forward to doing this with you. Onward and upward.
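As a small preview of the odds ratio, here is a hedged Python sketch. It assumes the standard definition of the odds of an outcome with proportion p as p / (1 - p); the function names `odds` and `odds_ratio` and the two proportions are hypothetical, chosen only for illustration.

```python
def odds(p):
    """Odds of an outcome whose proportion is p: p / (1 - p). Undefined if p is 1."""
    return p / (1 - p)

def odds_ratio(p1, p2):
    """Ratio of the odds in sample 1 to the odds in sample 2."""
    return odds(p1) / odds(p2)

# Hypothetical proportions of the outcome in two samples.
p1, p2 = 0.30, 0.10

result = odds_ratio(p1, p2)
print(f"odds ratio = {result:.2f}")
```

For these illustrative proportions the odds ratio comes out to about 3.86, which points in the same direction as the ratio of the proportions themselves (0.30 / 0.10 = 3) but is not the same number.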