0:00
So our goal for this segment is to build secure compression functions. So
compression functions are collision resistant. So just to remind you where we
are we looked at this Merkle Damguard construction which takes a little
compression function and builds from it a hash function for much, much larger
inputs. We proved this cute theorem that in fact if a little compression function h
is collision resistant then plugging in into the Merkle Damguard construction
gives us a collision resistant hash function for larger inputs. So now in this
segment our goal is basically to build a compression function that is collision
resistant. So we're going to see a couple of constructions. And so the first
question that comes to mind is well, can we build compression functions from
primitives that we already have? In particular, we spent a lot of work to
build block cyphers and the question is can we build compression functions from
block cyphers. And the answer is yes. And let me show you how to do it. So assume we
have a certain block cypher here that operates on N bits blocks, so the input is
an N bits, output is N bits. And then there's this classic construction called a
Davies-Meyer construction which I wrote down in symbols here. Basically says that
what you would do is, given the message block and the chaining variable, all we do
is we encrypt the chaining variable using the message block as the key. And then we
kind of do one more xor on the output. So this might seem a little bizarre, because
remember the message block is something that's completely under the control of the
adversary. He's trying to find the collision so he can choose the message
blocks however he wants. And yet we're using this message block as a key into a
block cypher. But nevertheless, we can argue that this construction, at least
when E is what's called an ideal cypher, we can argue that this construction is in
fact as collision resistant as possible. So let me state the theorem. The theorem
states that, as I said, if E is an ideal block cypher, meaning that it's a random
collection of K random permutations from 01 to the N to 01 to the N. Then under
that assumption that E's an ideal block cypher, in fact, finding the collision for
this compression function H takes time, two to the N over two. In particular, we
can show that anyone who is finding collisions has to evaluate the encryption
and decryption functions at least 2 to the N over 2 times. And if you think
about what that means, since the output of this compression function is only N bytes
long, we know that there's always a generic birthday attack that finds
collisions in time 2 to the N over 2. So basically this theorem says that this
collision resistant function is as collision resistant as possible. We can
find the collision in time 2 to the N over 2 using the birthday attack but
there is no algorithm that will do better than 2 to the n over 2. So this
is actually a very common compression function used in collision resistance
hashing in fact of a SHA functions all used Davies-Mayer. It turns
out that there is something special about the Davies-Mayer construction that
makes collision resistant. If you just try to guess the construction very likely you
will end up with something that is not collision resistant. And so let me ask you
the following puzzle. Suppose we actually define the compression function as
follows, namely all we do is we encrypt the chaining variable H using the current
message block as the key. So the difference is that we dropped this 'xor' H
in Davies-Mayer, we simply ignored it. So it's not there. And the puzzle for you
is show me that this compression function then is actually not collision resistant.
So, let's see, so we're trying to build a collision, namely a distinct pair of HM
and H' M' that happen to collide under this later function H. And my
question to you is how would you do it? So I'm already going to tell you that you're
going to choose H, M, and M' arbitrarily. The only requirement is that
M and M' are distinct. And then my question is, how would you construct an H'
that would cause a collision to pop out? So the answer is the first choice and
an easy way to see it is if we apply the encryption function using M' to both
sides. Then we get that this is basically E of M' applied to H', right.
this is what we get by applying encryption using M' to the left hand side. And
if we imply encryption using M' to the right hand side, the decryption
operator cancels out and we simply are left with the encryption of M, H, which is
exactly the collision that we wanted to find. So there. You can see that it's
basically by using the decryption function D, it's very easy to build collisions for
this compression function. Now I should tell you that in fact Davies-Meyer is not
unique. There are other ways to build collision resistant compression functions
from block ciphers. For example, here's a method called Miyaguchi Preneel. Miyaguchi
Preneel actually is used in WHIRLPOOL hash function that we saw earlier. Here is
another method that happens to result in a collision resistant compression function.
And there are twelve variants like this playing with XORs and placing the
variables in different slots that will actually give a collision resistant
mechanism. But there are also many, many variants of this like we saw in the
previous puzzle that are not collision resistant. So here's. Another example,
that's not collision resistant. And I'm gonna leave it as a homework problem to
figure out a collision for this construction. So now, basically, we have
all the ingredients to describe the [inaudible] 256 hash function. I'll tell
you that it's a Merkel-Damgard construction, exactly as the one that we
saw before. It uses a Davies-Mayer compression function. And so the only
question is, what's the underlying block cipher for Davies-Mayer? And that block
cipher is called SHACAL-2. And I'll just tell you it's parameters. It uses a
512 bit key. And remember the key is taken from the message block. So, this is really
what the message block is. And it so happens to be 512 bits. Which means the
SHA-256 will process its input message 512 bits at a time. And in the
block size, for this block cipher is 256 bits. And these are the chaining variable.
So this would be H i-1. And this would be successive chaining variable.
So now you have a complete understanding of how SHA-256 works.
Module of this cipher SHACAL-2 I'm not going to describe here.
So as I said, one class of compression functions is built from block cyphers. It turns out there's another class of
compression functions that's built using hard problems from number theory, and I
want to very briefly show you one example. And we call these compression functions
provable because if you can find the collision on this compression function
then you're going to be able to solve a very hard number theoretic problem which
is believed to be intractable. And as a result, if the number theory problem is
intractable, the resulting compression function is provably a collision
resistant. So here's how this compression function works. Basically we're going to
choose a large prime piece, so this is a gigantic prime, something like 700 digits,
2,000 bits. And then we're going to choose two random values, U and V, between
one and P. And now let's define the compression function as follows. It takes
two numbers between 0, and p-1, and it's gonna output one number between
0, and p-1. So it's compression ratio is 2 to 1. And takes two
numbers. And outputs one number. In the range 0 to p-1.
And it does it basically by computing double exponentiation. It computes u to the H times v to the n.
And the theorem, which, right now, I'm just gonna state as a fact. This fact actually
turns out to be fairly straightforward to prove, and we're gonna do it later on when
we get to the number theoretic part of the course. The fact is, that if you can find
a collision for this compression function, then you can solve a standard heart
problem in number theory called a discreet log problem. Everyone believes discrete
log is hard, and as a result, this compression function is provably collision
resistant. So you might ask me why do people not use this compression function
in practice. Why would we not use this for SHA-256? And the answer is that this
gives very slow performance in comparison to what you get from a block cipher. So
this is not really used for any compression functions. However, if for
some reason you really only want to, say, MAC or sign. Just one long message, and
you have a whole day to do it, then certainly you can use this type of
compression function. And even though it's slow, you'll get something that's provably
collision resistant. Okay, so that's the end of this segment. And now we're finally
ready to talk about HMAC, which we're gonna do in the next segments.