[MUSIC] Welcome to class. This week's topic is trees. Now we're not talking about biological trees, like the larch or the horse chestnut. We're talking about computer data structures called trees. In this lecture, I'll walk you through the recursive definition of what a tree is, and I'll show you some examples of some Python code that build various types of trees. The thing to take away from this lecture is that trees exploit the power of recursion. In previous lectures, we've talked about recursion. We've seen it. We can, kind of, approach some problems either iteratively or recursively. For trees, recursion is going to be the to go. It's going to make our code simple and clean and, hopefully, that'll point will come across as we go through this lecture. So, let's define a tree. So a tree is going to be a data structure that consists of a collections of nodes and edges. We're going to define it recursively. The tree has one particular node called the root node that's distinguished and it has an associated value and also a list of references to collection of sub trees and thus the recursion. A tree is defined in terms of subtrees. Then the root nodes of those sub-trees are the children of the root note. Likewise, those child notes have a parent, the parent is the root. So there is a parent child relationship here and those correspond to edges that join various nodes in the tree. There's a second condition that's a little more subtle, and that says that each node in the tree has exactly one parent, unless it's the root, in which case it has no parent. That's going to guarantee that our tree is hierarchical; that none of the sub-trees overlap. Let's look at an example real quick here. So here's an image of a tree. Rooted trees are typically drawn with the root at the top. So here A is the root. And A has two sub trees, the tree BCD and the tree EFGHI. Notice this tree with route E has three sub trees, FG, the tree H and the tree I. An algorithmic thinking you'll also see trees as an instance of a graph, we have nodes and we have edges. This particular case, these are called un-rooted trees, and they typically have in nodes and in minus one edges with the property that any pair of nodes was always connected by a sequence of edges. We're going to look at them exclusively as recursive data structures here. So, we're going to use rooted trees for the rest of this class. All right, I promised you some python code that build trees, so here it is here's a class definition for a tree, and it's going to be recursive. Here's the initializer it takes. Two things. It takes a value. It's the data we're going to attach to the root node and a list of children. Those children are going to be trees. We store them in two class fields here. The string method is interesting to look at. Notice what we're going to do is we're going to produce kind of a string representation for the tree and we're going to kind of draw it somewhat like a list and the way this works is. That we compute as a string representation for the value. And then we go through and take each of the, children, and we compute their string representation. And notice that those are trees this is a recursive algorithm. So were going to recursively compute string representations for the sub trees, then we're going to simply go through kind of a pin all these together to form a string representation for our tree. We also have, let's see, we have a getter here that's going to give us back the value and then we have a generator that gives us access to all the children. So let's go back now and look at a few more properties of trees and then we'll look at methods that can actually computer those properties. Our code built a tree recursively. Our tree consisted of a root node which had a value. Plus a list of sub trees. And those trees themselves have sub trees, and so forth. So where does that recursion stop? Well, recursion stops when we have a tree whose root has no children. So there's a no, there's a name for a node which has no children, it's called a leaf node. If a node has one or more children, it's called an internal node. So, you should remember those two names. If you go back to our example up above here, we have, this tree has internal nodes a, b, e, and f. And its leaf nodes are c, d, g, h, and i. With this distinction between leaf. In internal nodes, we can now go through and construct some recursive methods for computing important properties of the tree. For example, we would like to know how many nodes are in a tree. Well, let's see. If the root node of the tree is a leaf, then there's one node in the tree. Otherwise. Well, there's the root node plus all the number of nodes of all the subtrees. What if we just wanted to count the number of leaves. We could go through and say, well if the, if this root node of the tree is a leaf it's one. Otherwise, we just ignore the current internal node and take the sum of the leaves for each the subtree. Probably even something more important is we can talk about the height of the tree. What is the height? It's simply the length of the longest sequence of edges from the root to a leaf. You can compute the height recursively. If the root note is a leaf, then the height is zero. Otherwise it's an internal node, well we can return one plus the maximum the heights of the sub-trees, so for example we could scroll up here and we would see that the height of this tree is three because we could go a, e, f, g. Alright, let's go back and look at our python code now. Alright let's return back to our python implementation of the tree class. So, here I have implementations of the three methods that I just talked about, and here's a method for computing the number of nodes in a tree. Notice it initializes the answer to one and then just simply implements the answer by the number of children each of the subtrees. A simple recursive algorithm. The same thing for computing the number of leaves. We check to see if the tree has no children, if it does, we return one. Otherwise we just return the sum of all the number of leaves for the sub-trees. hide, again, has a very nice, recursive definition. Let's look at some examples here. I've got some example code that we can run, so. Let's run those. So we can see here that I've built a tree called A, and a tree called B. I've printed those out, printed those out. Here's the list representation. I've gone through and I've built a new tree CAB, b creating a route node labeled C. And then having two sub-tree's labeled A and B. And here you can see what the string representation looks like and develop a tree the five nodes. Probably more interesting this tree here my tree it's actually the tree that corresponds to the example in the notes. It has nine nodes and most importantly we can test out the various methods here. This tree has nine nodes it has five leaves. And it had hide three. Probably from your perspective, more interestingly, I can go through and uncomment this. And I have some code, that I've written in Code Sculptor, that actually draws trees. And surprise, surprise, my drawing code is recursive. If you're interested in how to draw tress, you can look at this week's practice exercise and I'll walk you through that process. OK, let's look at a more interesting examples of tree that should actually strike close to home to you. So trees are clearly an elegant recursive data structure. But, maybe at this point you're asking, what are they good for. We've done some kind of maybe mathematical things here. What can we use a tree to accomplish? In the homework we'll look at an application from genealogy, and this weeks mini project you'll look at an application from game search. But let me just show you a really simple example that kind of comes up in computing. here, this is idle. We're sitting here at a prompt and we can type in an arithmetic expression and it'll evaluate it for us so we could say. One plus two times three, hit Enter, and voila, it came back with nine. So, how did Python know to give back the answer nine? Somehow it took the string that you types in, the string... Open paren, open paren one plus two close paren times three close paren. And it did something with you to give back the number nine. What it did is it parsed this expression to create a tree. A tree that represents this arithmetic expression. It then evaluated that tree using a very simple recursive algorithm. We're not going to talk about how to do this parsing, it turns out the parsing actually involves recursion, but it's a little bit more advanced than we're ready for at this point in the class, but what I will do is I'll show you a way to build a sub class of tree that corresponds to an arithmetic expression. And I'll show you how to evaluate arithmetic expressions recursively. All right, let's look at another piece of Python code that allows us to represent and evaluate arithmetic expression. Here it is. The core of it is, class definition, for an arithmetic expression. It's a subclass of tree. We'll get back to this dictionary in just a second here. The trees that we're going to build here are going to be a special type of tree. They're going to be full binary trees. That means that every node in the tree has either zero children or two children. If it's a leaf, it has zero children, then it's going to always have a value corresponds to a number. Okay, if it's an internal node, it's going to always have two children and it's going to have a value that corresponds to an arithmetic operator. And we can actually use this dictionary operators up here to look up the corresponding function that would take the values attached to the sub trees and compute the value attached to the root. So there really kind of two interesting methods attached to this. So we can just click both of them here. There is a string method. This string method is designed to print things out not the way we did for the previous tree class. But to print things out the way arithmetic expressions are printed out. So those here we use opening and closing parenthesis. Probably more importantly, what we do is we compute the string representation for the first the left child. That's done recursively. Then we basically build the string representation for the arithmetic operator. Then we build the string representation for the second to right sumptering. So notice this recursive algorithm is going to produce an expression that's going to look exactly like the way we're used to seeing arithmetic expressions printed. Probably the more interesting method is evaluate. Evaluate is very simple here. It simply says if we're at a leaf node check to see if the numbers has actually a dot in it. If it's a dot it's a floating point number so we cast it to be a float. Otherwise we cast that stream to be an int. Here's the case where a net internal node. The internal node has two children, kind of a left subtree and a right subtree. We use evaluate to compute, convert those into a value recursively. Then, we look up the function corresponding to the operator at that node, and apply that function to the left and right value. So this returns back the value of the tree. Let's do some examples very quickly here. So I'm going to fire it up, and if you see what comes out here. I've got a tree, this is a tree corresponding to value one. This corresponds to value two, this corresponds to value three. If I evaluate those I get back the numbers one, two, and three. I can build. A tree corresponding to one plus two by just using our initializer here. And then finally, I can build a tree corresponding to one plus two times three. For example, here is the string representation for that tree. If you notice, the value for that is one plus two times three. It actually comes out to be exactly nine. We could draw that tree, and you'll see exactly what you would expect. So this is a simple example of how we can use recursion in trees to model arithmetic expression. It turns out that trees are a very important data structure inside of computer science. Okay, I'll see you next week.