In this video, we discuss the widely used logistic regression model. As we will see, logistic regression generalizes the idea of linear regression to situations where the outcome variable is categorical. We will focus on the case of binary outcomes, typically denoted by zero and one. We also call the two outcomes failure, typically denoted by zero, and success, typically denoted by one. The failure and success terminology is used for convenience, and its specific meaning depends on the problem context. The logistic regression model can also be generalized to handle situations with more than two outcomes.

Before discussing logistic regression, we would like to understand whether we can use linear regression for classification. Mathematically, we can model the probability of success for the outcome variable Y as a linear function of the predictor variable X, that is, P(Y = 1 | X) = β0 + β1 X. Even though this model is sometimes used in practice, it has a fundamental problem: the predicted probability can be above one or below zero. Therefore, in general, linear regression is not a good model for predicting binary outcomes.

In logistic regression, instead of modeling the probability as a linear function, we model it using the logistic function, P(Y = 1 | X) = e^(β0 + β1 X) / (1 + e^(β0 + β1 X)). Here, e is Euler's number, the base of the natural logarithm. In both the numerator and the denominator, we use an exponential term of the predictor variable. Note that the exponential term is always positive. This guarantees that the ratio is always positive, and because the denominator is larger than the numerator, the ratio is also always below one. Furthermore, when the exponential term is small, the ratio is close to zero. As the value of the exponential term increases, the ratio increases. When the value of the exponential term is very large, the ratio approaches one. In this manner, we ensure that the function always produces a valid probability.

Here is an illustration of a logistic function, which is S-shaped. Note that the function is increasing in the value of the predictor variable X. For small and large values of X, the curve is rather flat and the values are close to zero and one, respectively.
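As a rough sketch of the contrast just described, the snippet below evaluates a linear probability model and the logistic function at a few X values. The coefficient values and the function names are assumptions made up for illustration; they do not come from the lecture.

```python
import math


def linear_probability(x, beta0, beta1):
    """Linear model for P(Y = 1 | X = x); nothing keeps it inside [0, 1]."""
    return beta0 + beta1 * x


def logistic_probability(x, beta0, beta1):
    """Logistic model: exp(z) / (1 + exp(z)), where z = beta0 + beta1 * x."""
    z = beta0 + beta1 * x
    # Use an algebraically equivalent form for z >= 0 so exp() cannot overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical coefficients, chosen only to make the contrast visible.
BETA0, BETA1 = 0.5, 0.3

for x in (-10, -2, 0, 2, 10):
    linear = linear_probability(x, BETA0, BETA1)
    logistic = logistic_probability(x, BETA0, BETA1)
    print(f"x={x:>4}  linear={linear:6.3f}  logistic={logistic:6.3f}")
```

With these assumed coefficients, the linear model leaves the zero-to-one range at both ends of the X values, while the logistic values stay strictly between zero and one and flatten out toward the extremes.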
For X values in the middle range, the value increases more steeply in X.

After a little algebra, we can show that the exponential term in the logistic function equals the ratio of the probabilities of the two outcomes. This ratio is called the odds in statistics. The odds are the ratio between the probability of success, where the target variable equals one, and the probability of failure, where the target variable equals zero. Therefore, by using the logistic function to model the probability of a particular outcome, we are effectively modeling the odds of the event using the exponential term.

To understand the concept of odds, let's discuss two simple examples. When we say that a football team has 50-50 odds of winning the game, we are saying that the chances of winning and losing are the same. Therefore, the probability of winning is one half. Now, suppose we believe the team has three-to-one odds of winning a game. What is the probability of winning then? Based on the definition of odds, the probability of winning is three times that of losing. Therefore, the probability of winning is three divided by three plus one, that is, 3 / (3 + 1), which is 0.75.

Returning to the model, when we take the logarithm of both sides of the odds equation, we arrive at a linear function of X. The logarithm of the odds is often called the log odds, so we are modeling the log odds as a linear function: log(odds) = β0 + β1 X. Based on this understanding, the coefficients can be properly interpreted. In particular, β1 (beta one) is the increase in the log odds for a unit increase in X.

Logistic regression should be contrasted with linear regression. In linear regression, the target variable is directly related to the predictor variables. In logistic regression, we take a probabilistic view of the two outcomes of a binary variable and relate the log odds of one outcome to a linear function of the predictor variables.
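To make the odds arithmetic and the coefficient interpretation concrete, here is a minimal sketch. The helper names and the coefficient values are assumptions introduced for illustration only.

```python
import math


def odds_from_probability(p):
    """Odds = P(success) / P(failure)."""
    return p / (1.0 - p)


def probability_from_odds(odds):
    """Invert the odds definition: p = odds / (odds + 1)."""
    return odds / (odds + 1.0)


def logistic_probability(x, beta0, beta1):
    """P(Y = 1 | X = x) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


# The two football examples: 50-50 odds and three-to-one odds.
print(probability_from_odds(1.0))  # prints 0.5
print(probability_from_odds(3.0))  # prints 0.75

# Hypothetical coefficients, used only to check the interpretation of beta one.
BETA0, BETA1 = -1.0, 0.8

p_at_2 = logistic_probability(2.0, BETA0, BETA1)
p_at_3 = logistic_probability(3.0, BETA0, BETA1)

# Change in log odds when X goes from 2 to 3 (a unit increase).
increase = math.log(odds_from_probability(p_at_3)) - math.log(
    odds_from_probability(p_at_2)
)
print(increase)  # matches BETA1 up to floating-point rounding
```

The last value shows that a unit increase in X shifts the log odds by beta one, whichever starting value of X we pick.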