So, quick quiz again. Which of these features are numeric? Note that it's not that non-numeric features can't be used, it's just that we need to find a way to represent them in a numeric form. So, here again, we're trying to predict the number of coupons that are going to be used when we're looking at different features of that discount coupon.

So, the percent value of the discount, for example, say, you have 10 percent off, 20 percent off, is this numeric? Yeah, sure. And it has a meaningful magnitude: a 20 percent coupon is worth twice as much as a 10 percent discount coupon. So, this is not a problem at all, and the percent value is a meaningful numeric input as well.

Now, the size of the coupon, number two. Suppose I defined it as four square centimeters, super small, 24 square centimeters, and then 48 square centimeters. Is this numeric? Sure, and you can potentially relate the different sizes to the magnitude. But it's also unclear whether or not the magnitudes are meaningful. If this was an ad we were placing, like a banner ad, larger ads are typically better, and you could argue that that would make sense for magnitude. But if it's a physical coupon, like something that goes out in your newspaper, then you have to wonder whether or not a 48 square centimeter coupon is actually twice as good as the 24 square centimeter coupon.

So, let's change the problem a little bit. Suppose we defined the coupon as small, medium, and large. At this point, are small, medium, or large numeric? No, not at all. So, look. I'm not saying you can't have categorical variables as inputs to neural networks, you can. It's just that you can't use small, medium, or large directly. You have to do something smart to them, and we'll look at this in a little bit. So, you just have to find a different way to represent them in numeric form, and we'll take a look at how to do that shortly. First, though, let's take the third.
The font of an advertisement, Arial 18, Times New Roman 24, is this numeric? No. How do you convert Times New Roman to numeric? Well, you could say that Arial is number one, Times New Roman is number two, Roboto is number three, Comic Sans is number four, etc., etc., but that's a number code. The codes don't have meaningful magnitudes. If we said Arial is one and Times New Roman is two, Times New Roman isn't twice as good as Arial. So, the meaningful magnitude part is really, really important.

Next up, the color of the coupon, red, black, blue, green, et cetera. Again, those aren't numeric; they don't have meaningful magnitudes. Now, we could come up with numbers, like an RGB value or hex codes, but they're not going to be meaningful numerically. If I subtract two colors and the difference between them is three, and I subtract two other colors and the difference between them is also three, does that mean these two differences are equal? No. And that's a problem.

Next up, item category, one for dairy, two for deli, three for canned goods. No. Again, these are categorical. They're not numeric. So again, here, we are not saying that you can't use non-numeric values, we're just saying that we need to do something to them, and we'll look at the things that we need to do to them shortly.

So, as an example, suppose you have words in an NLP, or Natural Language Processing, system. The thing that you do to the words to make them numeric is that you typically run something like word2vec, or word to vector. It's a very standard technique, and you basically take your words and apply this technique to them, so that each word becomes a vector.
And at the end of the word2vec process, when you look at these vectors, they are such that if you take the vector for man and the vector for woman and you subtract them, the difference that you get is going to be very similar to the difference you'd get if you took the vector for king and the vector for queen and subtracted them. Interesting, right? That's exactly what word2vec does.

So, changing an input variable that's not numeric to be numeric is not a simple matter, it's a lot of work, but it can be done. Well, you could just go ahead and throw some random encoding in there like one, two, three, four, five, but your ML model is not going to be as good as if you started with a vector encoding that's nice enough to understand the context of things like male, female, man, woman, king, and queen. So, that's what we're talking about when we say that you need to have numeric features and they need to have those meaningful magnitudes. They need to be useful. You need to be able to do arithmetic operations on them. You need to find vector representations in such a way that these kinds of qualities exist for you.

And one of the ways you can do these things automatically is by using processes called auto-encoding or embedding. Or oftentimes, for example, if you're doing natural language processing, word2vec already exists and you have dictionaries that are already available to you. And more commonly, that's what you'll use. When you go ahead and use one of these dictionaries to take your text and convert it into vectors, you just go off and use them. No problem. You won't actually have to build the mapping yourself from something that's non-numeric into numeric. These things already exist. But if they don't exist, you may have to build that mapping yourself.
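
To make the subtraction idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two-dimensional vectors are written by hand and are not real word2vec output, and the names `toy_vectors`, `arbitrary_codes`, and `subtract` are hypothetical. It only shows why a vector encoding with meaningful magnitudes supports arithmetic, while a random one, two, three, four number code does not.

```python
# Hand-made 2-D "embeddings", invented for this illustration only.
# Dimension 0 loosely stands for royalty, dimension 1 for gender.
toy_vectors = {
    "man":   [0.0, 1.0],
    "woman": [0.0, -1.0],
    "king":  [1.0, 1.0],
    "queen": [1.0, -1.0],
}

# An arbitrary number code, in the spirit of "Arial is one,
# Times New Roman is two": the numbers carry no meaning.
arbitrary_codes = {"man": 1, "woman": 2, "queen": 3, "king": 4}


def subtract(a, b):
    """Element-wise difference of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]


# With the toy vectors, the man/woman gap matches the king/queen gap.
vector_gap_people = subtract(toy_vectors["man"], toy_vectors["woman"])
vector_gap_royals = subtract(toy_vectors["king"], toy_vectors["queen"])

# With the arbitrary codes, the same two gaps disagree.
code_gap_people = arbitrary_codes["man"] - arbitrary_codes["woman"]
code_gap_royals = arbitrary_codes["king"] - arbitrary_codes["queen"]

print("vector gaps equal:", vector_gap_people == vector_gap_royals)
print("code gaps equal:  ", code_gap_people == code_gap_royals)
```

Real word vectors have hundreds of dimensions and the two differences come out similar rather than exactly equal, so in practice you would compare them with a similarity measure instead of `==`.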