Now that we know what MACs are, let's go ahead and build our first secure MACs. First I want to remind you that a MAC is a pair of algorithms. The first is a signing algorithm that's given a message and a key will generate a corresponding tag And the second is verification algorithms are given a key message and a tag while outputs zero or one depending on whether the tag is valid or not. And we said that a MAC is secure if it is existentially unforgeable under a chosen message attack. In other words, the attackers allow to mount a chosen message attack where he can submit arbitrary messages of his choice and obtain the corresponding tags for those messages, and then despite the ability to generate arbitrary tags The attacker cannot create a new message-tag pair that was not given to him during the chosen message attack. Okay so we've already seen this definition in the last segment and now the question is how do we build secure MACs? So the first example I want to give you is basically showing that any secure PRF directly gives us a secure MAC as well. So let's see how we do it. So suppose we have a pseudo random function so the pseudo random function takes and inputs X and outputs in Y. And let's define the following MAC. So the way we sign a message 'M' is basically by simply evaluating the function at the point 'M' So the tag for the message M is simply the value of the function at the point M and then the way we verify a message to that pair is by recomputing the value of the function at the message M and checking whether that is equal to the tag given to us. We say yes if so and we reject otherwise. So here you have in pictures basically that when Alice wants to send a message to Bob she computes a tag by the value of PRF and then she appends this tag to the message, Bob receives the corresponding tab pair, he recomputes the value of the function and tests whether the given tag is actually equal to the value of the function at the point M. So let's look at a bad example of this instruction. And so suppose that we have a pseudo-random function that happens to output only ten bits. Okay, so this is a fine pseudo-random function and it just so happens that for any message 'M' it only outputs ten bits value. My question to you is if we use this function 'F' to construct a MAC, is that going to be a secure MAC? So the answer is no, this MAC is insecure. In particular because the tags are too short. So consider the following simple adversary. What the adversary will do is simply choose an arbitrary message M and just guess the value of the MAC for that particular message. Now, because the tag is only ten bits long, the adversary has a chance of one in two to the ten in guessing the MAC correctly. In other words, the advantage of the guessing adversary, one distinctly guesses a random tag for a given message. That adversary is going to have an advantage against the mac that's basically one over two to the ten which is one over a thousand twenty four and that's definitely non negligible. So the adversary basically will successfully forge the mac on a given message with probability one on a thousand which is insecure. However it turns out this is the only example that where things can go wrong, only when the output of the function is small can things go wrong. If the output of the PRF is large Then we get a secure MAC out of this function. And let's state the security theorem here. So suppose we have a function 'F' that takes messages in 'X' and outputs tags in 'Y', then the MAC that's derived from this PRF in fact is a secure MAC. In particular, if you look at the security theorem here, you'll see very clearly the era bounds, in other words since the PRF is secure we know that this quantity here is negligible. And so if we want this quantity to be negligible, this is what we want, we want to say that no adversary can defeat the MAC 'I sub F', that implies that we want this quantity to be negligible, in other words we want the output space to be large. And so for example, taking a PRF that outputs eighty bits is perfectly fine. That will generate an eighty bit MAC and therefore the advantage of any adversary will be at most one over two to the eighty. So now the proof of this theorem is really simple, so lets just go ahead and do it. So in fact lets start by saying that suppose instead of a PRF we have a truly random function from the message space to the tag space so this is just a random function from X to Y that's chosen at random from the set of all such functions. Now lets see if that function can give us a secure mac. So what the adversary says is, 'I want a tag on the message M1'. What he gets back is the tag which just happens to be the function evaluated at the point M1. Notice there is no key here because F is just a truly random function from X to Y. And then the adversary gets to choose a message from M2 and he obtains the tag from M2, he choose M3, M4 up to MQ and he obtains all the corresponding tags. Now his goal is to produce a message tag pair and we say that he wins, remember that this is an existential forgery, in other words first of all T has to be equal to F of M This means that 'T' is a valid tag for the message 'M'. And second of all, the message 'M' has to be new. So the message 'M' had better not be one of M-one to M-Q. But let's just think about this for a minute, what does this mean? So the adversary got to see the value of a truly random function at the points M-one to M-Q and now he?s suppose to predict the value of this function as some new point, M However, for a truly random function, the value of the function at the point M is independent of its value at the points M-1 to M-Q. So the best the adversary can do at predicting the value of the function at the point M is just guess the value, because he has no information about F of M. And as a result his advantage if he guesses the value of the function at the point M he'll guess it right with probability exactly one over Y. And then the tag that he produced will be correct with probability exactly one over Y. Okay, again he had no information about the value of the function of M and so the best he could do is guess. And if he guesses, he'll get it right with probability one over Y. And now of course, because the function capital F is a pseudo random function The adversary is gonna behave the same whether we give him the truly random function or the pseudo random function. The adversary can't tell the difference and as a result even if we use a pseudo random function, the adversary is gonna have advantages at most one over Y in winning the game. Okay, so you can see exactly the security theorem, let's go back there for just a second. Essentially this is basically why we got an error term of one over Y because of the guessing attack and that's the only way that the attacker can win the game. So now that we know that any secure PRF is also a secure MAC, we already know that we have our first example MAC. In particular, we know that AES, or at least we believe that AES is a secure PRF, therefore, AES since it takes sixteen byte inputs, right the message space for AES is 128 bits, which is sixteen bytes. Therefore the AES cipher essentially gives us a MAC that can match messages that are exactly sixteen bytes. Okay, so that's our first example of a MAC. But now the question is if we have a PRF for small inputs like AES that only acts on sixteen bytes, can we build a MAC for big messages that can act on gigabytes of data? Sometimes I call this the McDonald's problem. Basically given a small MAC and we build a big MAC out of it. In other words, given a MAC for small messages and we build a MAC for large messages. So we're gonna look at two constructions for doing so. The first example is called a CBC MAC that again takes PRF for small messages as input and produces a PRF for very large messages. As outputs. The second one we'll see is HMAC which does the same thing again takes a PRF for small inputs and generates a PRF for very large inputs. Now the two are used in very different contexts. Cbc-mac is actually very commonly used in the banking industry. For example, there's a system called the Automatic Clearing House, ACH, which banks use to clear checks with one another and that system, CBC-MAC, is used to ensure integrity of the checks as they're transferred from bank to bank. On the Internet, protocols like SSL and IPsec and SSH, those all use HMAC for integrity. Two different MACs and were gonna discuss them in the next couple of segments. And as I said were also gonna start from a PRF for small messages and produce PRF for messages that are gigabytes long and in particular they can both be substantiated with AES as the underlying cipher. So the last comment I want to make about these PRF based MACs is that in fact their output can be truncated. So suppose we have a PRF that outputs N bit outputs. So again for AES this would be a PRF that outputs 128 bits as outputs. Its an easy lemma to show that in fact if you have an N bit PRF if you truncated, in other words if you only output first key bits The result is also a secure PRF and the intuition here is if the big PRF outputs N bits of randomness for any inputs that you give to the PRF then certainly chopping it off and truncating it to T bits is still gonna look random. The attacker now gets less information so his job of distinguishing the outputs from random just became harder. In other words, if the N bit PRF is secure, then the T less than N bit PRF, the truncated PRF, would also be secure. So this is an easy lemma and since any secure PRF also gives us a secure MAC, what this means is if you give me a MAC that's based on a PRF and what I can do is I can truncate it to W bits, however, because of the error term in the MAC based PRF theorem we know that truncating to W bits will only be secure as long as one over two to the W is negligible. So if you truncate the PRF to only three bits, the resulting MAC is not going to be secure. However, if you truncate it to say 80 bits or maybe even 64 bits, then the resulting MAC is still gonna be a secure MAC. Okay, so the thing to remember here is that even though we use AES to construct larger PRFs and the output of these PRFs is gonna be 128 bits, it doesn't mean that the MAC itself has to produce 128 bit tags We can always truncate the outputs to 90 bits or 80 bits, and as a result, we would get still secure MACs but now the output tag is gonna be more reasonable size and doesn't have to be the full 128 bits. Okay, so in the next segment we're gonna look at how the CBC-MAC works.