In this next set of lectures we'll take on the method called simple logistic regression, and we'll try to give it a similar treatment to the one we gave simple linear regression in the last section. So let's first give an overview of this method called simple logistic regression. You've already seen some allusion to it in the first section of lecture set one, where we tried to unify all of the regression methods we'll be using in this course, but now we're going to drill down in much more detail.

In this set of lectures we will develop a framework for simple logistic regression, which is a regression model for relating a binary outcome, as opposed to the continuous measure we had with linear regression, to a single predictor. Just as we saw in linear regression, the predictor can be binary, categorical, or continuous, so the only difference in the choice of linear regression versus logistic regression is what kind of outcome we have: are we modeling the mean of a continuous outcome, in which case we'd use linear regression, or are we summarizing a binary outcome via a proportion, in which case we'd use logistic regression?

For logistic regression, though, the left-hand side of the equation is not the most immediately useful summary measure; the equation is a bit more convoluted than what we have with linear regression. What we have to do is convert the binary one/zero outcome, through a few steps, into the log odds of the outcome for different groups, grouped by the values of our predictor x1. So what we're modeling with logistic regression, on the left-hand side, is the log odds that our binary outcome is equal to one, and that's modeled as a linear function of our predictor x1: an intercept plus a slope times x1. The log odds that y equals one is the log of p over one minus p, where p is the proportion, or probability, that y equals one. As noted in the previous section, x1 can be binary, nominal categorical, or continuous.

So let's just talk about this idea of the outcome being a one/zero variable; we're modeling the log odds that this outcome is equal to one. If y equals one indicates that a child is breastfed at the time of the study and zero if not, then our outcome is breastfeeding status, and what we'd be modeling is the log odds that y equals one, in other words the log odds of being breastfed. If y is an indicator of whether the subject has a disease at the time of the study, one if they have the disease and zero if not, then the outcome we'll be modeling is transformed to the log odds of having the disease, the log odds that y equals one, and that's what we would model as a linear function of our predictor. If y equals one when a person is readmitted to the hospital within one month of being discharged, and zero if they are not readmitted, then the log odds that y equals one is the log odds of being readmitted within one month.

As with everything else we have done thus far, we will only be able to estimate the regression equation from a sample of data, and we will ultimately need to make inferential statements about it via confidence intervals and hypothesis tests. To indicate that we are working with estimates, I'm going to put hats over the beta naught and the beta one, just as we did with linear regression.
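To make the notation explicit, the estimated model described above can be written as (the hats indicating sample-based estimates):

\[
\ln\!\left(\frac{\hat{p}}{1-\hat{p}}\right) \;=\; \hat{\beta}_0 + \hat{\beta}_1 x_1
\]

where \(\hat{p}\) is the estimated proportion (probability) of subjects with \(y = 1\) among those with a given value of \(x_1\), so the left-hand side is the estimated log odds that \(y = 1\) for that group.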
Again, I keep alluding to the scale we put things on, the log odds that y equals one, and it sounds like a strange scale. We start with something that's a one/zero; we can turn it into a proportion, which is the most common summary measure, the proportion of persons with y equal to one at different levels of x1; then that's converted to the odds, the proportion over one minus itself; and then we take the log of that, which is ultimately what we model as a linear function of our predictor. In a subsequent section of this lecture set we will talk about the reason for this choice of scaling, but for now I'm just going to ask you to take it on faith.

Once the equation has been fit, if you specify a value of x1, the resulting logistic regression equation can be used to estimate the log odds of our binary outcome y for a group of subjects with that value of x1. I have an equation here, an intercept value plus a slope times x1. If I plug a specific value of x1 into that and do the computation, I'll get a single number, and that will be the estimated log odds of the outcome occurring, the log odds that y equals one, for that given value of x1. Once we have an estimated log odds we can then convert it into an estimated probability or proportion, and we'll show how to do that in the last section of this lecture set. So this equation will give us the predicted log odds for any group given its x value, but that can be converted back into an estimated probability or proportion.

So let's talk about the slope and intercept here: how do we interpret those in a scientific context? Well, as we saw with other types of regression, namely linear, the intercept still estimates the value of whatever is on the left-hand side when x1, our predictor, is equal to zero. It will have a different meaning in different situations depending on whether our predictor is continuous, binary, or categorical, but we can say this generically for now: the intercept is the estimated log odds of the outcome occurring, the log odds that y equals one, when our predictor is equal to zero. The slope continues to be what we've generically described it as: the change in whatever is on the left-hand side for a one-unit change in the predictor x1, or in other words the difference in the left-hand side for two groups of subjects who differ by one unit in x1. But the left-hand side here is the log odds. So beta one hat is the change in the log odds for a one-unit change in x1, or in other words the difference in the log odds for a one-unit difference in x1.

So what does this mean? What is a difference in log odds, and how can we interpret that? It doesn't sound like a very useful measure, but let's look at it mathematically for a moment. I'm going to specify, generically, two values of x1 that differ by one unit: x1 equal to a, and x1 equal to a plus one, where a could be any number. If a were 10, then we would be comparing a group with an x1 value of 11 to a group with an x1 value of 10, for example. Plugging these into the general logistic regression equation, the log odds of the outcome occurring when x1 is equal to a plus one is beta naught hat plus beta one hat times that value a plus one. Similarly, for the group with an x1 value one unit less, an x1 value of a, the log odds is the intercept beta naught hat plus the slope beta one hat times a.
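As a concrete illustration of plugging a value of x1 into a fitted equation and converting the resulting log odds back to a proportion, here is a minimal sketch in Python. The coefficient values are made up for illustration only and are not taken from any of the course datasets.

```python
import numpy as np

# Hypothetical estimated coefficients, for illustration only
# (not from any dataset used in these lectures).
beta0_hat = -1.2   # estimated intercept: log odds that y = 1 when x1 = 0
beta1_hat = 0.4    # estimated slope: difference in log odds per 1-unit difference in x1

def estimated_log_odds(x1):
    """Estimated log odds that y = 1 for a group with predictor value x1."""
    return beta0_hat + beta1_hat * x1

def estimated_proportion(x1):
    """Convert the estimated log odds back to an estimated probability/proportion."""
    log_odds = estimated_log_odds(x1)
    return 1.0 / (1.0 + np.exp(-log_odds))

print(estimated_log_odds(2))    # estimated log odds for the group with x1 = 2
print(estimated_proportion(2))  # corresponding estimated proportion with y = 1
```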
If we actually take the difference between these two expressions, the intercept cancels, beta one hat times a cancels, and all we're left with is that single value of beta one hat. So beta one hat is the difference in the log odds: the log odds that y equals one for x1 equal to a plus one, minus the log odds that y equals one for x1 equal to a, the difference in log odds for that one-unit difference in x1. But we know, by the properties of logarithms, that a difference of logs, the log of one thing minus the log of another, can be re-expressed as the log of the ratio of those two things. So beta one hat itself can be re-expressed as the log of the ratio of the odds that y equals one for the first group divided by the odds that y equals one for the second group with an x1 value one unit less. In other words, it's the log of a ratio of odds, a log odds ratio. So the slope is very close to a measure of association we've worked with before; it's just one scale removed: it's the natural log of an odds ratio.

We've just laid this out in generic terms. We're going to look at multiple examples in the next several sections, focusing on how to interpret the slopes and intercepts, and their transformed versions on the exponentiated scale, with real data examples. We'll also show how to handle the uncertainty in the estimates and create confidence intervals for odds ratios and odds, and how to convert the results from the resulting equation back to the probability, or proportion, scale.

But just to reiterate the basics: simple logistic regression is a method for relating a binary outcome to a single predictor that can be binary, categorical, or continuous, and again we'll consider all three situations with examples from real data in the upcoming sections. The slope estimate beta one hat (I say estimates, plural, because when we have a categorical predictor there will be more than one slope) has a log odds ratio interpretation, and this slope can be exponentiated, or anti-logged, to estimate an odds ratio which compares the odds that the outcome y equals one for two groups who differ by one unit in x1. The intercept estimate beta naught hat is the estimated log odds that y equals one for the group with x1 equal to zero, and if we want to put this on the odds scale we can exponentiate this result.
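Written out, the algebra described above is:

\[
\hat{\beta}_1 \;=\; \left[\hat{\beta}_0 + \hat{\beta}_1 (a+1)\right] - \left[\hat{\beta}_0 + \hat{\beta}_1 a\right]
\;=\; \ln\!\big(\widehat{\text{odds}}_{\,x_1 = a+1}\big) - \ln\!\big(\widehat{\text{odds}}_{\,x_1 = a}\big)
\;=\; \ln\!\left(\frac{\widehat{\text{odds}}_{\,x_1 = a+1}}{\widehat{\text{odds}}_{\,x_1 = a}}\right)
\]

so that \(e^{\hat{\beta}_1}\) is the estimated odds ratio for a one-unit difference in \(x_1\), and \(e^{\hat{\beta}_0}\) is the estimated odds that \(y = 1\) when \(x_1 = 0\).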
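For those who want to see this end to end in software, here is a minimal sketch that fits a simple logistic regression to simulated data and exponentiates the estimates. The use of Python's statsmodels library and the simulated coefficient values are assumptions made for illustration; they are not part of the lecture materials.

```python
import numpy as np
import statsmodels.api as sm

# Simulate a binary outcome from a known logistic model
# (purely illustrative data; the true coefficients below are arbitrary).
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
p = 1 / (1 + np.exp(-(-0.5 + 0.8 * x1)))   # true probability that y = 1
y = rng.binomial(1, p)

# Fit the simple logistic regression of y on x1
X = sm.add_constant(x1)                    # adds the intercept column
fit = sm.Logit(y, X).fit(disp=0)
beta0_hat, beta1_hat = fit.params

print(np.exp(beta0_hat))       # estimated odds that y = 1 when x1 = 0
print(np.exp(beta1_hat))       # estimated odds ratio for a 1-unit difference in x1
print(np.exp(fit.conf_int()))  # 95% confidence intervals on the odds / odds ratio scale
```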