0:01

Now, we're going to take a look at ordered symbol table operations using the binary

search tree data structure as the underlying implementation. Well, each one

of these operations are fairly straight forward but just to check our ability to

manipulate this data structure, we'll take a look at each. Suppose, we wanted to find

the minimum key in a binary search tree, or the maximum key. Well, just looking at

one example you can see almost immediately what to do to find the minimum, we move

left from the root until we find a null key, that's where the smallest key in the

data structure is. To find the maximum, we move right from the root, until we find a

null key. What about floor and ceiling? Well, those are a little bit more

complicated and we'll have to, not quite the same as in the ordered array for the

binary search so we have to do a little bit more work for that. So just for

example, let's take a look at the problem of computing the floor. So, what we want

to find is so say, we're seeking the floor of G. So, that's the largest key in the

data structure that is less than G. In this case, the answer is E. So let's just

take a look at what we have to do in the tree, the path we have to take in the tree

to figure that out. Well so, we are looking for the largest key that's less

than G. And have S well, that key is definitely going to be in the left

subtree. Its not going to be bigger than S because S is bigger than G so we go to the

left. So now, we are sitting at E. And so what's the largest key that's less than G

in this, in this tree here. Well, it might be E but there's no way it's to the left

of E because those keys are all smaller than E and therefore smaller than G. So, E

is a key candidate. But it might also be in the right so we move to the right in

this case. Alright [cough]. So that's [cough] if K is equal to the key at the

root, the floor of K is K. If K is less than the key, it roots i n the left

subtree. That's the one we just did. If it's greater than the key at the root. The

floor of K is in the right subtree, if there is any key smaller than K in the

right subtree. So, in this case, there's no key smaller than G in the right

subtree. So therefore, the answer is E. So, our code has to check for these three

cases. And here's the code that does it. It's not that much code. It's just

complicated code to understand. So if we find our key, that's the floor. If we're

going to the left we find the floor, the one on the left. And in on the right we

have to do a, a little bit of tricky code to make sure that we return the floor on

the right subtree, if there's some tree there. If there's, if there's no node

there then, then, then we, we return the root itself. So, that's a, a

implementation that, that code is definitely tricky and a similar code for

ceiling. So now, what about operations like rank and select? How many keys are

there less than a given key? And, give us the seventh largest key to facilitate

implementing those operations and also size all we do is keep an extra field in

each node, which is the number of the nodes in the subtree rooted at that node.

So, this tree has got eight nodes in it. This subtree has six nodes in it and so

forth. And those counts are going to not only enable us to immediately implement

the size function, just return the count at the root but also, they'll give us good

implementations of rank and select. So, let's look at those now. So, we add

account field to every node and then to implement the size function well, if it's

null, we return zero. So a client might call us for a null tree or [cough] or an

empty tree. Otherwise we return, x.count, which is the number of nodes in that, in

that subtree by definition. The way we maintain, there's a number of ways we can

maintain the thing but the one that we'll adopt un iformly because it adapts to more

complicated situations is just before we're done with the put operation we'll

say, okay we've done all our work and before we return the pointer to the given

subtree we're going to take the size of what's on the left and the size of what's

on the right and add one for us and that's going to be our count. So, whether or not

there was a new node added we don't have to test for that this recursively takes

care of the problem of maintaining the size in every node when there's a new node

inserted. And, it also handles more general situations, as we'll see later on.

So, that's how to maintain size. So now, how do we implement rank? Well, it's a

little like floor. It's an easy recursive algorithm, but there are three cases. So

let's look at the, at the three cases. So, we want to know the number of keys less

than K. So [cough] we're going to have a recursive algorithm for our given key. So,

let's, one of the easy ones is, if our key is equal to the, if were [cough] to the,

the key at the current node then the number of keys less than our key is the

size of the left subtree of that node. So, if we're looking for the rank of E say,

how many keys are there less than E there's exactly two, that's by definition

in the data structure that's the number of keys that are less than E. So, that's that

one for rank. What about the [cough] starting at the root if we have the case

where E is less than S. So, the rank of E in this whole tree is the same as the rank

of E in the left subtree. So, there's that case and then if we're going to the right,

then we have to add one for the root and one for the left subtree of the root and

then find the rank of us on the right. So, that's an easy recursive algorithm for

finding out the rank. And it's definitely an instructive exercise to check that you

believe that, that method works. The other thing we h ave to do is iteration. And

iteration is a fundamental operation on tree structure. And it's based on so

called in-order traversal. And that's also a simple recursive algorithm. Traverse the

left subtree enqueue the key, traverse the right subtree. So, to iterate we're going

to maintain a queue of keys. And then, we're going to call this recursive

in-order [cough] method. And that method is going to add all the keys in the tree

to the queue. And then we'll just return that queue. And that's, a queue is an

iterable data structure, and the client can iterate that. And, in order, it's just

a simple recursive method. Put everybody to the left on the queue then put the root

on the queue, then put everybody to the right on the queue. And to believe this

method, you just have to think recursively and prove by induction that this in order

method puts all the keys in the data structure on the queue in their natural

order. First, it puts all the ones to the left on the queue. If that, that happens

in their natural order, then the next thing that has to appear is the key at the

root. And then if the ones on the right go in their natural order, and then, by

induction, they're all in their natural order. That's a very simple implementation

of an iterator for these symbol table with comparable keys. So we have to again prove

that property by induction. And that's easy to do. The diagram at the right gives

another simple way to look at it pictorially. All the keys that are smaller

on the left we are going to put them out, and then we put the key at the root and

then we put all the keys on the right out in order. And then that key is going to

have all those keys in order by induction. So, here's the operation summary for

ordered symbol table. And the quick summary is that every one of those

operations, while ordered iteration is optimal, it just gets them in linear time.

And all the res t of'em take time proportional to the height of the tree.

Now, if the, the keys are inserted in random order, we know that height by

analysis, is going to be proportional to log N. Or if it's some application where

the order of insertion of the keys is well modeled by random order and that's not

unusual at all. A binary search tree is a simple and extremely effective data

structure that can support all of these operations in a quickly, much better than

binary search in an ordered array which is not dynamic and slow for insertion. So,

that's a look at binary search tree implementations of ordered operations when

keys are comparable.