The next thing I want to do is show you the general attack on collision-resistant hash functions. If you remember, when we talked about block ciphers we saw a general attack on block ciphers which we called exhaustive search, and that attack forced the key size for a block cipher to be 128 bits or more. Similarly, for collision resistance there is a general attack called the birthday attack, which forces the output of collision-resistant hash functions to be longer than a certain bound. So let me show you the attack, and we will see what those bounds come out to be. Here is the general attack that works on arbitrary collision-resistant hash functions. We have our supposedly collision-resistant hash function H, and let's suppose that it outputs n-bit values; in other words, the output space is roughly of size 2^n. Now, the message space is going to be much, much larger than n bits; let's just say that the messages being hashed are, say, a hundred times n bits long. I want to show you an algorithm that can find a collision for this hash function H in time roughly 2^(n/2), so roughly the square root of the size of the output space. Here is how the algorithm works: we choose 2^(n/2) random messages in the message space; call them m_1 up to m_{2^(n/2)}. Because the messages themselves are much bigger than n bits (they are a hundred times n bits), it is very likely that all these messages are distinct; they will be distinct with high probability. For each one of these messages we apply the hash function and obtain the tag t_i, so the t_i's are n-bit strings. And now we look for a collision among the t_i's; in other words, we find an i and a j such that t_i equals t_j.
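The steps just described can be sketched in a few lines of Python. This is a toy illustration, not a real attack: I use SHA-256 truncated to n = 32 bits as a stand-in n-bit hash so it runs quickly, and the names `toy_hash` and `birthday_attack` are my own.

```python
import hashlib
import os

def toy_hash(msg: bytes, n_bits: int = 32) -> bytes:
    """SHA-256 truncated to n bits, standing in for an n-bit hash H."""
    return hashlib.sha256(msg).digest()[: n_bits // 8]

def birthday_attack(n_bits: int = 32):
    """Find a colliding pair for toy_hash in roughly 2^(n/2) evaluations."""
    while True:
        seen = {}  # tag -> message that produced it
        # Step 1: hash 2^(n/2) random messages, each much longer than
        # n bits, so they are all distinct with high probability.
        for _ in range(2 ** (n_bits // 2)):
            m = os.urandom(100 * n_bits // 8)
            t = toy_hash(m, n_bits)
            # Step 2: look for i != j with t_i == t_j.
            if t in seen and seen[t] != m:
                return seen[t], m
            seen[t] = m
        # No collision in this batch (happens with probability about 1/2):
        # go back to step 1 and try a fresh batch of messages.

m1, m2 = birthday_attack(32)
assert m1 != m2 and toy_hash(m1) == toy_hash(m2)
```

With n = 32 each batch is only 2^16 = 65,536 hash evaluations, so the collision appears in well under a second; the same logic applied to a 160-bit hash would need about 2^80 evaluations.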
Once we have done that, we have basically found a collision because, as we said, with very high probability m_i is not equal to m_j, but the hash of m_i is equal to the hash of m_j, and therefore we have found a collision on the function H. Now, if it so happens that we look through all 2^(n/2) of the t_i's and do not find a collision, we go back to step one and try another set of 2^(n/2) messages. So the question is how well this will work; in other words, how many times do we have to iterate this process until we actually find a collision? I want to show you that in fact the number of iterations is going to be very, very small, which means that this algorithm will find a collision in time roughly proportional to 2^(n/2). To analyze this attack, I have to tell you a little bit about the birthday paradox. I imagine some of you have already heard of the birthday paradox. Here it is stated as a theorem, and I want to prove it to you, because everybody should see a proof of the birthday paradox at least once in their lives. So here it is: imagine we have n random variables r_1 to r_n taking values in the interval 1 to B. The crucial assumption is that these n samples r_1 to r_n are independent of one another. They also happen to be identically distributed; for example, they might all be uniform in the interval 1 to B, but again they would be independently uniform variables. Now, it so happens that if we set n to be about the square root of B (to be precise, 1.2 times the square root of B), in other words if we take roughly the square root of B samples from the interval 1 to B, then the probability that two of those samples are the same is at least one half. It turns out, in fact, that the uniform distribution is the worst case for the birthday paradox.
In other words, if the distribution from which the r_i's are sampled is non-uniform, then in fact fewer than 1.2 times the square root of B samples are needed; the uniform distribution is the worst case. So we are going to prove this for the uniform distribution, and that basically also proves it for all other distributions, but the proof I want to show you here holds just for the uniform distribution. Okay, so let's do the proof; it is actually not difficult at all. We are asking: what is the probability that there exist i not equal to j such that r_i equals r_j? Let's negate that, so it becomes one minus the probability that for all i not equal to j we have r_i not equal to r_j; this means that no collision occurred among the n samples we chose. Let's write this out more precisely, as one minus a product of terms. When we choose r_1, it is the first one we choose, so it is not going to collide with anything. But now look at what happens when we choose r_2. Let me ask you: what is the probability that r_2 does not collide with r_1? Well, r_1 takes up one slot, so there are B minus 1 slots such that if r_2 falls into one of them it does not collide with r_1. In other words, the probability that r_2 does not collide with r_1 is B minus 1 slots divided by all B possible slots. Similarly, when we pick r_3, what is the probability that r_3 does not collide with either r_1 or r_2? Again, r_1 and r_2 take up two slots, so there are B minus 2 slots remaining for r_3; if it falls into any one of those B minus 2 slots, it does not collide with either r_1 or r_2. I imagine you see the pattern now: for r_4, the probability of not colliding with r_1, r_2, or r_3 is B minus 3 over B.
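As a sanity check on this product of probabilities, here is a small simulation with toy parameters of my own choosing (B = 50, n = 10), comparing the exact product Pr[no collision] = (1 - 1/B)(1 - 2/B)...(1 - (n-1)/B) against an empirical estimate:

```python
import random

B, n = 50, 10  # small toy parameters

# Exact Pr[no collision] = prod_{i=1}^{n-1} (1 - i/B), as derived above.
exact = 1.0
for i in range(1, n):
    exact *= 1 - i / B

# Monte-Carlo estimate: draw n independent uniform samples, check distinctness.
random.seed(1)
trials = 200_000
hits = sum(
    len(set(random.randrange(B) for _ in range(n))) == n
    for _ in range(trials)
)
estimate = hits / trials

assert abs(exact - estimate) < 0.01  # the formula and the experiment agree
```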
And so on and so forth, until we get to the very last sample r_n. What is the probability that r_n does not collide with the previous r_i's? Well, there are n minus 1 slots taken up by the previous samples, so if r_n falls into any of the remaining B minus n plus 1 slots, it does not collide with any of the previous r_1 to r_{n-1}. Now, notice that the reason I was able to multiply all these probabilities together is exactly because the r_i's are all independent of one another; it is crucial for this step that the r_i's are independent. Let me rewrite this expression a little bit: I will write it as one minus the product, as i goes from 1 to n minus 1, of (1 minus i over B). All I did was rewrite it as one big product instead of writing the terms individually. Now I am going to use the standard inequality that says that for any positive x, 1 minus x is less than e to the minus x. That is easy to see: if you look at the Taylor expansion of e to the minus x, it is 1 minus x plus x squared over 2 minus and so on. So 1 minus x is what you get by dropping the latter part of the Taylor expansion, which happens to be positive, and as a result the left-hand side is smaller than the right-hand side. So let's plug this inequality in. What do we get? We get that the collision probability is greater than one minus the product, as i goes from 1 to n minus 1, of e to the minus i over B; I simply plugged in x equals i over B for each one of those terms. And the nice thing about exponentials is that when we multiply them, the exponents add. As a result, this is simply equal to one minus e to the power of minus one over B times the sum, as i goes from 1 to n minus 1, of i. All I did was take the minus one over B out of the exponent, and we are left with the simple sum of the integers from 1 to n minus 1.
And the sum of the integers from 1 to n minus 1 is simply n times (n minus 1) over 2, which we can approximate by n squared over 2. So what I get at the end is one minus e to the power of minus n squared over 2B; I simply approximated the sum by n squared over 2. Very good. So what do we do about n squared over 2B? Well, we can compute exactly what it is from the choice of n in the theorem. Let's look at n squared over 2: it is 1.2 squared over 2 (1.2 squared is 1.44, divided by 2 is 0.72) times the square root of B squared, which is B. So n squared over 2 is 0.72 times B, and as a result n squared over 2B is just 0.72. So we get 1 minus e to the power of minus 0.72. Now you just plug this into your calculator and you get 0.53, which is certainly bigger than one half. This proves the birthday paradox, and notice that it was crucial to assume that the samples are independent of one another. We only proved the theorem for the uniform distribution, but as I said, it is actually fairly easy to argue that any distribution that is away from the uniform distribution will produce a collision with even fewer samples. So with 1.2 times the square root of B samples, the theorem certainly also holds for non-uniform distributions. The reason this is called a paradox is that the square root function grows very slowly. In particular, let's try to apply this to birthdays. Suppose we have a number of people in a room, and assume that their birthdays are independent of one another and uniform in the interval 1 through 365. Then what the birthday paradox says is that we need roughly 1.2 times the square root of 365 people, which comes out to about 23, and then with probability one half, two of them will actually have the same birthday.
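Written out, the whole chain we just walked through looks like this (the last step approximates n(n-1)/2 by n^2/2, which is accurate for large B):

```latex
\Pr\bigl[\exists\, i \neq j : r_i = r_j\bigr]
  = 1 - \prod_{i=1}^{n-1}\left(1 - \frac{i}{B}\right)
  \;\geq\; 1 - \prod_{i=1}^{n-1} e^{-i/B}
  = 1 - e^{-\frac{1}{B}\sum_{i=1}^{n-1} i}
  = 1 - e^{-\frac{n(n-1)}{2B}}
  \;\approx\; 1 - e^{-\frac{n^2}{2B}}

\text{For } n = 1.2\sqrt{B}:\qquad
  \frac{n^2}{2B} = \frac{1.44\,B}{2B} = 0.72,
  \qquad 1 - e^{-0.72} \approx 0.53 > \tfrac{1}{2}.
```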
The reason it is called a paradox is that the number 23 seems really small to people, and yet, by the theorem we just proved, with probability one half there will be two people with the same birthday. By the way, the intuition for why this fact is true is that the collision count is really looking at all possible pairs of people. If you have square-root-of-B people, then there are roughly the square root of B squared pairs, which is roughly B pairs total. Each pair collides with probability one over B, and if you have B pairs, it is likely that one of the pairs will collide. So hopefully this gives you the intuition for where the square root comes from: it comes from the fact that if you have n people in the room, there are roughly n squared possible pairs. I should say, by the way, that even though the birthday paradox is often applied to birthdays, birthdays are actually not uniform at all. If you look at people's actual birthdays, you see that there is a very clear bias towards being born in September, and for some bizarre reason there is also a bias towards being born on a Tuesday. So when we apply the birthday paradox to real birthdays, the actual bound is going to be smaller than 1.2 times the square root of B, because, as we said, for non-uniform distributions you need even fewer samples before you get a collision. Now let me show you a graph of the birthday paradox; it is quite interesting how it behaves. Here I set B to be a million; in other words, we are picking random uniform samples in the range 1 to a million. The x-axis is the number of samples n that we pick, and the y-axis is the probability that we get a collision among those n samples. We see that the probability reaches one half at around 1.2 times the square root of B, which here is roughly twelve hundred.
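The exact collision probability is easy to compute directly; a short sketch (`collision_prob` is my own helper name) confirming that 23 people cross the one-half mark while 22 do not:

```python
def collision_prob(n: int, B: int) -> float:
    """Exact Pr[two of n independent uniform samples in {1..B} collide]."""
    no_collision = 1.0
    for i in range(1, n):
        no_collision *= 1 - i / B
    return 1 - no_collision

# Birthdays: B = 365. 23 people suffice, 22 do not.
assert collision_prob(23, 365) > 0.5   # about 0.507
assert collision_prob(22, 365) < 0.5   # about 0.476
```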
And you see that if we look at exactly the square root of B, the probability of a collision is around 0.4. Notice that the probability goes up to one extremely fast: for example, already at roughly two times the square root of B, the probability of a collision is about 90 percent. Similarly, when we go below the square root of B, the probability of a collision very quickly drops to zero. So there is a kind of threshold phenomenon around the square root of B: the probability of a collision goes to one very quickly above the square root of B, and drops to zero very quickly below it. So now we can analyze our attack algorithm. Let me remind you: we chose 2^(n/2) random elements in the message space and hashed them. So we started off with a uniform distribution on the message space; after hashing, we end up with some distribution over tags. This distribution over tags might not be uniform, but we don't care; the point is that all these tags t_1 to t_{2^(n/2)} are independent of one another. So if we choose 2^(n/2), or more precisely 1.2 times 2^(n/2), tags, the probability that a collision exists among them is roughly one half. So let me ask you: in that case, how many times do we have to iterate this algorithm before we actually find a collision? Well, since each iteration finds a collision with probability one half, we have to iterate about two times in expectation. As a result, the running time of this algorithm is basically 2^(n/2) evaluations of the hash function. Notice also that this algorithm takes a lot of space, but we are going to ignore the space issue and just focus on the running time. So this is kind of cool: it says that if your hash function has n-bit outputs, there will always be an attack algorithm that runs in time 2^(n/2).
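The same exact formula illustrates the threshold behavior around the square root of B for B equal to a million (`collision_prob` is again my own helper, recomputing the product from the proof):

```python
def collision_prob(n: int, B: int) -> float:
    """Exact Pr[two of n independent uniform samples in {1..B} collide]."""
    no_collision = 1.0
    for i in range(1, n):
        no_collision *= 1 - i / B
    return 1 - no_collision

B = 1_000_000
root = 1_000  # sqrt(B)

assert 0.35 < collision_prob(root, B) < 0.45   # around 0.4 at sqrt(B)
assert collision_prob(2 * root, B) > 0.85      # shoots toward 1 above sqrt(B)
assert collision_prob(root // 2, B) < 0.15     # drops toward 0 below sqrt(B)
assert collision_prob(1200, B) > 0.5           # crosses 1/2 near 1.2*sqrt(B)
```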
So, for example, if we output 128-bit outputs, then a collision can be found in time 2^64, which is not considered sufficiently secure. This is why collision-resistant hash functions generally are not going to output 128 bits. Let me show you some examples. The first three are actually federal standards: SHA-1, SHA-256, and SHA-512. The fourth one is an example from the designers of AES, called Whirlpool. You can see that SHA-1 outputs 160 bits, and as a result there is a generic attack on it that runs in time 2^80. SHA-256 outputs 256 bits, so the generic attack runs in time 2^128, and so on and so forth. I also listed the speed of these algorithms, and you notice that the bigger the output, the slower the algorithm; so there is a performance penalty, but nevertheless there is quite a bit of benefit in terms of security. Now, you notice that SHA-1 is actually greyed out. Although nobody has found collisions for SHA-1 yet, it is still recommended that in new projects, and hopefully in the programming projects in this class, you don't use SHA-1; instead, use SHA-256. In particular, there is actually a theoretical collision finder for SHA-1 that works in time 2^51, so it is probably just a matter of time until someone finds an actual collision for SHA-1 and kills it altogether. The current situation is that SHA-1 is still a collision-resistant hash function, because nobody has found collisions for it, but it is probably just a matter of time, a few months or a few years, until a collision is found. As a result, in new products and new projects SHA-1 should not be used; use only SHA-256, or, if you want to be even more cautious, SHA-512. So this is the end of this segment, and next we are going to turn to building collision-resistant hash functions.
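The relationship between output length and generic attack cost can be restated in a couple of lines (the output sizes are the standard ones just listed):

```python
# Output length n fixes the generic birthday-attack cost at about 2^(n/2).
standards = {"SHA-1": 160, "SHA-256": 256, "SHA-512": 512, "Whirlpool": 512}

for name, n in standards.items():
    print(f"{name}: {n}-bit output, generic collision attack ~ 2^{n // 2}")

# SHA-1's generic bound of 2^80 is the one considered insufficient today.
assert standards["SHA-1"] // 2 == 80
assert standards["SHA-256"] // 2 == 128
```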