[MUSIC] So what's the algorithm here? Okay, so to summarize, this algorithm, which is called the ID3 algorithm, assumes all attributes are discrete. We can discretize continuous attributes by binning them into particular intervals, and I'll give you a method to do that. Then you choose the attribute with the highest information gain, and create branches, one for each value of that attribute. Then you partition the examples based on the attributes selected so far, and you repeat with the remaining attributes on the remaining set of examples. Okay, and so you recursively apply this. And the stopping conditions are: when there are no more examples left, or when all the examples along a branch are assigned the same label, cuz there's no more decision to make at that point.

Okay, and so in this way, you can apply this algorithm and build up a decision tree on your data set. And as you can see, this isn't too hard to implement. Maybe the trickiest thing, if you're not familiar with it, is this notion of information gain, which we went through. But other than that, I hope you can imagine writing a program that would do this. There's a little bit of bookkeeping to do in building up this data structure, but it's not too complicated.

Okay, so what are some problems with this decision tree algorithm? Well, one is that it's expensive to train. You have to keep inspecting these big chunks of the data set every time you make a decision. It's also prone to overfitting, which we haven't drilled into. In general, what overfitting means is that you do really, really well on your training data, but you do poorly on test data. Which is to say that the model doesn't generalize, okay. And so you can imagine that you get decision trees that are very, very good at making very precise decisions about the training set. But you give it one new example it hasn't seen before, and it puts it into some bucket that may or may not be right.
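The recursive procedure described above can be sketched in Python. This is a minimal illustration, not the lecture's own code: the function names (`entropy`, `info_gain`, `id3`), the dict-based tree representation, and the toy data set are all my choices.

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(examples, labels, attr):
    # Entropy of the labels minus the weighted entropy after splitting on attr.
    n = len(labels)
    remainder = 0.0
    for v in set(ex[attr] for ex in examples):
        subset = [lab for ex, lab in zip(examples, labels) if ex[attr] == v]
        remainder += (len(subset) / n) * entropy(subset)
    return entropy(labels) - remainder

def id3(examples, labels, attrs):
    # Stopping conditions: every example on this branch has the same label,
    # or there are no attributes left to split on (fall back to majority vote).
    if len(set(labels)) == 1:
        return labels[0]
    if not attrs:
        return Counter(labels).most_common(1)[0][0]
    # Choose the attribute with the highest information gain ...
    best = max(attrs, key=lambda a: info_gain(examples, labels, a))
    tree = {"attr": best, "branches": {}}
    # ... create one branch per observed value, partition the examples,
    # and recurse on the remaining attributes.
    for v in set(ex[best] for ex in examples):
        idx = [i for i, ex in enumerate(examples) if ex[best] == v]
        tree["branches"][v] = id3([examples[i] for i in idx],
                                  [labels[i] for i in idx],
                                  [a for a in attrs if a != best])
    return tree

# A toy data set: "outlook" separates the labels perfectly, "windy" is noise.
examples = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain",  "windy": "no"},
    {"outlook": "rain",  "windy": "yes"},
]
labels = ["yes", "yes", "no", "no"]
tree = id3(examples, labels, ["outlook", "windy"])
```

The bookkeeping really is modest: each node is just a dict naming the split attribute and mapping each value to a subtree, and a leaf is just a label.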
Okay, so there are ways around this. You can try to prune branches of the tree that are over-specialized to particular conditions. Okay, and so remove or aggregate subtrees that don't provide much discriminatory power. But there's another way that we're gonna talk about in the next segment that's very, very powerful, okay. [MUSIC]
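One standard way to make "remove subtrees that don't provide much discriminatory power" concrete is reduced-error pruning on a held-out validation set; the lecture doesn't specify this particular scheme, so treat the sketch below as one possibility, assuming the nested-dict tree format of an ID3 implementation (a leaf is a label, an internal node maps attribute values to subtrees).

```python
from collections import Counter

def predict(tree, example, default):
    # Walk a dict-based decision tree; unseen attribute values fall back
    # to a default label.
    while isinstance(tree, dict):
        tree = tree["branches"].get(example.get(tree["attr"]), default)
    return tree

def prune(tree, val_examples, val_labels):
    # Reduced-error pruning: bottom-up, replace a subtree with the majority
    # validation label whenever the leaf does at least as well as the subtree
    # on the held-out examples that reach this node.
    if not isinstance(tree, dict):
        return tree
    for v in list(tree["branches"]):
        idx = [i for i, ex in enumerate(val_examples)
               if ex.get(tree["attr"]) == v]
        tree["branches"][v] = prune(tree["branches"][v],
                                    [val_examples[i] for i in idx],
                                    [val_labels[i] for i in idx])
    if not val_labels:
        return tree  # no held-out data reaches this node; leave it alone
    majority = Counter(val_labels).most_common(1)[0][0]
    leaf_hits = sum(lab == majority for lab in val_labels)
    tree_hits = sum(predict(tree, ex, majority) == lab
                    for ex, lab in zip(val_examples, val_labels))
    return majority if leaf_hits >= tree_hits else tree

# An over-specialized split on a noise attribute collapses to a leaf ...
overfit = {"attr": "noise", "branches": {"a": "yes", "b": "no"}}
pruned = prune(overfit, [{"noise": "a"}, {"noise": "b"}], ["yes", "yes"])

# ... while a genuinely discriminative split survives.
useful = {"attr": "outlook", "branches": {"sunny": "yes", "rain": "no"}}
kept = prune(useful, [{"outlook": "sunny"}, {"outlook": "rain"}], ["yes", "no"])
```

Because the comparison uses data the tree never trained on, splits that only memorized training noise tend to get collapsed, while splits that generalize are kept.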