In the last part of this module, I will talk to you about Bayes rule. You probably know Bayes rule. You may be saying, where is Bayes rule coming from? I thought we were done with that, no more statistics. It turns out that most of the time, what we are intuitively doing is applying some prior beliefs. One approach is to start with a completely clean slate. The other says, okay, most of the time when I drive in Champaign, I see mostly Hondas these days, and after Hondas I see Jaguars. No, I'm kidding. So basically, we have an idea of what kinds of cars we see on the road. That's a kind of prior probability. Then we see something flashing past us at high speed and say, "Hey, let me update it." I know which cars are more popular, but nothing popular can move that fast and make that kind of noise. I think this is a Maserati, and therefore I start assigning different probabilities: the probability that it's a Honda is low. Maybe it's a BMW, but probably it's a Mustang. I don't know. You car lovers shouldn't blame me if I'm comparing a Mustang to a Maserati, okay? So the usual decision-making is that we have prior ideas, we collect some data, we update, and we come up with a new belief about the classification.

A very classic example is the game with the doors, right? The prize is behind one of three doors, so your prior probability is a one-in-three chance for each door, if you knew nothing, right? Now, you get new data. You select a door, and the game show host says, "Just wait a minute, I'll give you some more information." The game show host, of course, knows which door has the prize behind it, and remember, only one of them does. He or she opens another door, shows that it's empty, there's no prize, and says, "Would you like to change your mind?"
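The car-spotting update described above can be written out with Bayes rule directly. Here is a minimal sketch; the prior and likelihood numbers are made up purely for illustration:

```python
# A minimal sketch of the Bayes update described above, with invented numbers.
# Prior: how common each car is on the road (hypothetical values).
prior = {"Honda": 0.6, "BMW": 0.3, "Maserati": 0.1}

# Likelihood: probability of observing "something very fast and very loud"
# given each car (again, made up for illustration).
likelihood = {"Honda": 0.05, "BMW": 0.2, "Maserati": 0.9}

# Bayes rule: posterior is proportional to prior * likelihood, then normalize.
unnorm = {car: prior[car] * likelihood[car] for car in prior}
total = sum(unnorm.values())
posterior = {car: p / total for car, p in unnorm.items()}

print(posterior)  # the Maserati, though rare, now has the highest probability
```

Notice how the data moves the belief: the Maserati starts with the lowest prior, but the observation is so much more likely under "Maserati" that it ends up with the highest posterior.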
If you don't believe me, do the math: based on this new data, it's twice as likely that the prize is behind the other door than the door you chose, simply because the game show host didn't have a choice. If the prize were really behind that door, the host couldn't open it and show it to you, right? So here's the way to remember it: we had a prior probability, we got some new data, and we updated it to a posterior probability. That's what Bayes rule is all about: using the data we have to home in, narrow down, and improve our classification.

Let's apply it to a very simple example, and I have to tell you that I made it up. The company is called DH and it offers service contracts for home appliances. It runs many campaigns in a year, and my friend Deepak, the Vice President for Marketing, has been looking at the data and wants to fine-tune the next campaign. My research assistant Hema, actually a data science consultant, suggests, "Let's use a Bayesian approach to say which prospects are more likely to accept the plan." Deepak says, "I don't understand. How do I use a Bayesian approach?" So she says, let me illustrate. Let's take a small dataset: 20 rows, five features. Each row is a customer, so I have data about 20 customers from the past. I know what job they had: unknown, blue collar, white collar, retired, student, and so on. Their marital status: married, single, or divorced. Their education: primary, secondary, or tertiary. Whether they have ever defaulted on a loan, and you will notice a few people have. And whether they own a house. Along with this, we have information on whether they bought the product, the service plan we were trying to sell: yes or no. Then Deepak says, "What can you do with this?" Well, many things. You could run one of the methods that we have already covered.
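If you'd rather check the two-thirds claim empirically than do the algebra, here is a quick simulation sketch. The door encoding, trial count, and seed are my own choices:

```python
import random

# A quick simulation of the game-show (Monty Hall) argument above.
def play(switch, trials=100_000, rng=random.Random(42)):
    wins = 0
    for _ in range(trials):
        prize = rng.randrange(3)   # door hiding the prize
        choice = rng.randrange(3)  # contestant's initial pick
        # Host opens a door that is neither the pick nor the prize.
        opened = next(d for d in range(3) if d != choice and d != prize)
        if switch:
            # Switch to the one remaining unopened door.
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += (choice == prize)
    return wins / trials

print(play(switch=False))  # close to 1/3
print(play(switch=True))   # close to 2/3
```

The staying player only wins when the initial pick was right, which happens a third of the time; the switching player wins in exactly the other two thirds.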
You could try to think about which of those methods would fit this well, but let's use a Bayesian approach. So Deepak says, "Okay. Here is a new person. I don't know whether they'll buy the product or not, but I want to know the probability that this person will buy it." He gives this data to Hema and says, "Hema, can you figure it out?" This person is blue collar, I know; they've filled in a form with all their details. This person is single, has tertiary education, probably a PhD, has never defaulted on a loan, and still rents. Okay, fine. So one simple way is to look at all the people who have bought the product and see if anyone similar to this person is among them. You go down the table, and lo and behold, there is such a person: blue collar, single, tertiary education, has never defaulted on a loan, doesn't own a house, and has bought the product. So, wow, there's one person with the same characteristics who has bought this product, so I think I should offer it to this person. Well, I could do this, right? But if your data is very, very large, this kind of matching is going to take a long time, and maybe you don't have that time. In the example I'm going to give you, one of the uses of this would be to classify an email as spam or not spam.
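The counting idea Hema describes can be pushed one step further into a naive Bayes classifier, which scales to large tables where exact row matching would be too slow. Here is a minimal sketch; the five-row table below is invented for illustration and is not the actual 20-row dataset:

```python
from collections import Counter, defaultdict

# Invented mini-table in the shape Hema describes: five features per customer,
# plus whether they bought the service plan. These rows are made up.
rows = [
    # (job, marital, education, defaulted, owns_house) -> bought?
    (("blue-collar", "single", "tertiary", "no", "no"), "yes"),
    (("white-collar", "married", "secondary", "no", "yes"), "no"),
    (("retired", "divorced", "primary", "yes", "yes"), "no"),
    (("student", "single", "tertiary", "no", "no"), "yes"),
    (("blue-collar", "married", "secondary", "no", "yes"), "no"),
]

class_counts = Counter(label for _, label in rows)
feat_counts = defaultdict(Counter)  # (feature_index, label) -> value counts
for feats, label in rows:
    for i, v in enumerate(feats):
        feat_counts[(i, label)][v] += 1

# Number of distinct values each feature takes, used for Laplace smoothing.
n_values = [len({feats[i] for feats, _ in rows}) for i in range(5)]

def posterior(feats):
    """P(bought | feats), assuming the features are conditionally independent."""
    scores = {}
    for label, n in class_counts.items():
        p = n / len(rows)  # prior: fraction of customers with this label
        for i, v in enumerate(feats):
            # Add-one smoothing so an unseen value doesn't zero out the product.
            p *= (feat_counts[(i, label)][v] + 1) / (n + n_values[i])
        scores[label] = p
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}

# Deepak's new prospect: blue collar, single, tertiary, never defaulted, rents.
new_person = ("blue-collar", "single", "tertiary", "no", "no")
print(posterior(new_person))
```

Instead of hunting for an exact matching row, the classifier multiplies per-feature counts, so training is a single pass over the data and prediction touches only the count tables. This same machinery, with words as features, is what makes naive Bayes a classic choice for the spam-filtering example mentioned above.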