So now we're gonna look at a very general paradigm called the Merkle-Damgard paradigm, that's used for constructing collision-resistant hash functions. Before we do that, let me just remind you what our goals are. So as usual we say that H is a hash function that maps some large message space into a small tag space. And we say that a collision for a hash function is basically a pair of distinct messages that happen to map to the same value under this hash function. And our goal is to build collision-resistant hash functions, namely functions where it's hard to find even a single collision: even though we know many collisions exist, no efficient algorithm can output even a single one. So we're gonna construct these collision-resistant hash functions in two steps. The first step, which we're gonna do in this segment, is to show that if you give me a collision-resistant hash function for short messages, we can extend it and build a collision-resistant hash function for much, much longer messages. In the next segment, we'll actually build collision-resistant hash functions for short messages. Okay, so let's look at the construction. The construction is actually very general, and in fact all standard collision-resistant hash functions follow this paradigm. It's actually a very elegant paradigm, and let me show you how it works. So here we have our function h, which we're gonna assume is a collision-resistant hash function for small-sized inputs. The way we're gonna use this little function h, sometimes called a compression function, is we're gonna take a big message M and break this message into blocks. And then we use a fixed value called the IV. The IV is fixed forever: it's basically embedded in the code and in the standards, it's just a fixed value that's part of the definition of the function. And then what we do is we apply the small compression function h to the first message block along with this IV. What comes out of that is what's called a chaining variable, which gets fed into the next compression function, which compresses the next message block along with the previous chaining variable, and out comes the next chaining variable, and the next message block is compressed, and so on and so forth until we reach the final message block. And for the final message block, the one special thing that we do is that we must append this padding block, this PB, which stands for padding block (and I'll explain what the padding block is in just a second). But after we append the padding block we again compress the last chaining variable with the last message block, and the output of that is the actual output of the hash function. So just to summarize, in symbols: we have this compression function h that takes elements in T, the same T as the tag space of the hash function, plus message blocks from the block space X, and outputs the next chaining variable. So as I said, this is what the compression function does. And then from this compression function we're able to build a hash function H for large inputs, namely inputs made up of at most L blocks from X. And of course these are variable-length inputs, so we could have different-length messages being given as input to this function H, and what comes out of it is basically a tag in the tag space T. So the only thing left to define is the padding block. The padding block is actually very important, as we're gonna see in just a minute.
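To make the chain concrete, here is a minimal sketch in Python. Everything in it is an illustrative placeholder rather than any standard's API: the block size and IV are made up, compress stands in for the little compression function h we'll build in the next segment, and pad is the padding step we define next.

```python
# A minimal sketch of the Merkle-Damgard chain just described.
# All names here (BLOCK_SIZE, IV, compress, pad, md_hash) are illustrative.

BLOCK_SIZE = 64        # bytes per message block (64 is what SHA-256 uses)
IV = b"\x00" * 32      # the fixed initial value, baked into the standard

def compress(chaining: bytes, block: bytes) -> bytes:
    """Stand-in for the collision-resistant compression function h."""
    raise NotImplementedError("built in the next segment")

def pad(message: bytes) -> bytes:
    """Append the padding block (defined next) so the length becomes a
    multiple of BLOCK_SIZE."""
    raise NotImplementedError

def md_hash(message: bytes) -> bytes:
    chaining = IV
    padded = pad(message)
    # Feed each block, together with the previous chaining variable, into h.
    for i in range(0, len(padded), BLOCK_SIZE):
        chaining = compress(chaining, padded[i:i + BLOCK_SIZE])
    return chaining    # the last chaining variable is the hash output
```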
What the padding block is, is basically a one followed by zeros (1000...0) that denotes the end of the actual message. And then the most important part is that we encode the message length in this padding block. The message length field is basically fixed to be 64 bits, so in these hash functions the maximum message length is two to the sixty-four minus one bits; the message length fits into a 64-bit field. An upper bound of two to the sixty-four minus one bits on the message length is actually sufficient to handle all of the messages we're gonna throw at it. Okay, so that's the padding block (there's a concrete sketch of this padding step a little further below), and of course the question is: what do we do if the message length really is a multiple of the compression function's block length? Where are we gonna fit the padding block? And the answer, as usual, is basically: if there's no space for the padding block in the last block of the message, then we're gonna have to add another dummy block and stick the padding block in there, and of course put the 1000...0 in the right place. Okay, so the point is that it's very, very important that the padding block contains the message length, as we'll see in just a minute. So this is the Merkle-Damgard construction. The last piece of terminology I'll mention is that we have these chaining variables: H0 is the initial value, then H1, H2, H3, up to H4 in the figure, which is the actual final output of the hash function. So as I said, all standard hash functions follow this paradigm for constructing a collision-resistant hash function from a compression function. The reason this paradigm is so popular is because of the following theorem, which says basically that if the little compression function is collision resistant, then the big Merkle-Damgard hash function is also collision resistant. So, in other words, if we want to build collision-resistant functions for large inputs, all we have to do is build compression functions that are collision resistant. So let's prove this theorem. It's an elegant proof and it's not too difficult. The way we're gonna prove it is using the contrapositive: if you can find me a collision on the big hash function, then we can deduce a collision on the little compression function. Therefore, if little h is collision resistant, so is the big H. So suppose you give me a collision on the big hash function, namely two distinct messages M and M' that happen to hash to the same output; we're going to use M and M' to build a collision on the little compression function. So the first thing is we have to remember how the Merkle-Damgard construction works and, in particular, let's assign names to the chaining variables when we hash M versus when we hash M'. So here are the chaining variables that come up when we hash the message M: H0 is the fixed IV that starts the whole process, H1 would be the result of compressing the first message block with the IV, and so on and so forth, until H t+1 is the final chaining variable, which is the final output of the Merkle-Damgard chain. And then similarly for M', let's define H0' to be the first chaining variable, H1' to be the result after compressing the first message block of M', and so on and so forth, up until we get to H' r+1, which is the final hash of the message M'. Now, you notice the lengths of the messages M and M' don't have to be the same. In particular, M has length t, whereas M' has length r.
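Before we dive into comparing the two chains, here is the padding step from a moment ago as a concrete sketch. It assumes a 64-byte block and a 64-bit big-endian length field; the exact byte and bit conventions are illustrative, not taken from any particular standard, and the names match the sketch above.

```python
# A sketch of the padding block: a one bit, then zeros, then the message
# length in bits in a 64-bit field.

BLOCK_SIZE = 64  # bytes

def pad(message: bytes) -> bytes:
    bit_length = 8 * len(message)       # must be at most 2**64 - 1
    padded = message + b"\x80"          # the 1000...0 marker starts here
    # Pad with zero bytes until exactly 8 bytes remain in the current block
    # for the length field. If the message filled its last block, this loop
    # naturally spills into an extra dummy block, as described above.
    while len(padded) % BLOCK_SIZE != BLOCK_SIZE - 8:
        padded += b"\x00"
    return padded + bit_length.to_bytes(8, "big")
```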
Now, because the collision occurred, we know that these two values here are the same: H(M) is equal to H(M'). In other words, the last chaining variables are equal to one another. So now let's look carefully at how H t+1 and H' r+1 were calculated. Well, H t+1 is calculated as the compression function applied to the previous chaining variable along with the last message block of M, including the padding block. You'll notice here I'm assuming that the padding block fits in the last message block of M; even if the padding block sits in its own block, it's not going to make any difference. So for simplicity, let's assume that the padding block fits in the last message block of capital M. So, by compressing the last message block with the padding block and the last chaining variable, we obtain H t+1 and, similarly, the same thing happens with M': by compressing the last message block and the last chaining variable, we obtain H' r+1. And since these two values are equal, all of a sudden we have a candidate collision for the compression function. Here, let me circle the candidate collision: this is one part of it and this is the other part of it. Okay, so we have two inputs to the compression function that happen to produce the same output. The only way we would not get a collision is if the inputs happen to be the same. In other words, what we're seeing here is: if the inputs to the compression function are distinct, then we get a collision for little h. So let's write it out more precisely: if H t is not equal to H' r, or M t is not equal to M' r, or the padding blocks are distinct, then we have a collision for the compression function h and we're done; we can stop. So the only way we need to continue is if this three-way disjunction doesn't hold. So what does it mean that this disjunction doesn't hold? Well, the only reason we need to continue is if the second-to-last chaining variables are equal, the last blocks of the messages are equal, and the two padding blocks are equal. So what does it mean that the two padding blocks are equal? Remember that the message lengths were encoded in the padding blocks. So, in particular, this means that the length of M and the length of M' are the same, namely t is equal to r. So from now on I can assume t is equal to r; wherever I wrote r, I'm just gonna write t. But now we can apply exactly the same argument to the second-to-last chaining variables. In other words, how was H t computed? Well, H t is computed by compressing the previous chaining variable, namely H t-1, with the second-to-last message block. And similarly, H' t was computed, you notice I replaced r by t, by compressing the previous chaining variable along with the second-to-last message block of M'. And because these two are equal, now we get another candidate collision for the compression function. In other words, if H t-1 is not equal to H' t-1, or M t-1 is not equal to M' t-1, then basically we have a collision for h, and we can stop, we're done. So now, the only case where we need to continue is if this condition over here doesn't hold, namely if H t-1 is equal to H' t-1 and M t-1 is equal to M' t-1 (and we already know that M t is equal to M' t). If it so happens that these conditions hold, well, you can see that we can continue to iterate.
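Written out in symbols (under the assumption above that the padding block fits in the last block of each message), the step we just performed is:
\[
H(M) = H(M') \;\Longrightarrow\; h\big(H_t,\; M_t \,\|\, \mathrm{PB}\big) \;=\; h\big(H'_r,\; M'_r \,\|\, \mathrm{PB}'\big),
\]
so either the two inputs differ,
\[
\big(H_t,\; M_t \,\|\, \mathrm{PB}\big) \;\neq\; \big(H'_r,\; M'_r \,\|\, \mathrm{PB}'\big),
\]
and we have an explicit collision for \(h\), or they agree component by component,
\[
H_t = H'_r, \qquad M_t = M'_r, \qquad \mathrm{PB} = \mathrm{PB}' \;\Longrightarrow\; t = r,
\]
and we repeat the argument one position earlier in the chain.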
In other words, we can now apply exactly the same argument to H t-1 and H' t-1, and so, iterating this again and again, we can work our way all the way to the beginning of the message. When we reach the beginning of the message, one of two things must have happened: either we found a collision for the compression function, or it so happens that all the message blocks of M and M' are the same, in other words, for all i, M i is equal to M' i. But since we already proved the messages have the same length, this means that M is actually equal to M'. And that contradicts the fact that you gave me a collision to begin with. So, in other words, this condition over here can't actually happen, because it contradicts the fact that M and M' are a collision for the big Merkle-Damgard function H. As a result, we must find a collision for the compression function: as we work our way from the end of the message all the way to the beginning, at some point we must find a collision for little h. Okay, so this basically completes the proof of the theorem; I can put a little QED box here. So what this proof shows is that if the little compression function h is collision resistant, then the big Merkle-Damgard function H must also be collision resistant. So, again, the reason people like this construction is that it shows that to construct big hash functions, it suffices to construct just compression functions for small inputs, and we're gonna do that in the next segment.
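To make the reduction concrete, here is the extraction argument as a Python sketch. It reuses the illustrative IV, BLOCK_SIZE, compress, and pad names from the earlier sketches; it is just the proof turned into code, not an attack on any real hash function.

```python
# Given two distinct messages that collide under the big hash, walk the two
# Merkle-Damgard chains backwards from the end and return an explicit
# collision for the little compression function h.

def chain_values(message: bytes) -> list[bytes]:
    """All chaining variables H0 = IV, H1, H2, ... for a message."""
    padded = pad(message)
    values = [IV]
    for i in range(0, len(padded), BLOCK_SIZE):
        values.append(compress(values[-1], padded[i:i + BLOCK_SIZE]))
    return values

def extract_h_collision(m1: bytes, m2: bytes):
    """Assumes m1 != m2 and md_hash(m1) == md_hash(m2); returns a pair of
    distinct (chaining variable, block) inputs on which h collides."""
    p1, p2 = pad(m1), pad(m2)
    blocks1 = [p1[i:i + BLOCK_SIZE] for i in range(0, len(p1), BLOCK_SIZE)]
    blocks2 = [p2[i:i + BLOCK_SIZE] for i in range(0, len(p2), BLOCK_SIZE)]
    c1, c2 = chain_values(m1), chain_values(m2)
    i, j = len(blocks1) - 1, len(blocks2) - 1
    # Invariant: compress applied to the two pairs below gave equal outputs.
    while i >= 0 and j >= 0:
        if (c1[i], blocks1[i]) != (c2[j], blocks2[j]):
            # Distinct inputs, equal outputs: a collision for h.
            return (c1[i], blocks1[i]), (c2[j], blocks2[j])
        # Equal inputs imply equal padding blocks (hence equal lengths) or
        # equal previous chaining variables, so step back one block.
        i, j = i - 1, j - 1
    # Unreachable for a genuine collision: if every step matched, the two
    # messages and their lengths were identical, contradicting m1 != m2.
    raise ValueError("m1 and m2 must be distinct colliding messages")
```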