In the last segment, we looked at a padding Oracle attack that completely
breaks an authenticated encryption system. I hope this attack convinces you that you
shouldn't implement authenticated encryption on your own cause you might end
up exposing yourself to a padding oracle attack or a timing attack or any other
such attack. Instead you should be using standards like GCM or any other of the
standardized authenticated encryption modes as implemented in many crypto
libraries. In this segment, I'm gonna show you another very clever attack on an
authenticated encryption system. And I hope after you see this attack, you'll be
completely convinced not to invent and implement your own authenticated
encryption systems. But instead, always use one of the standard schemes, like GCM
or others. So this particular attack that I want to show you is an attack on the SSH
binary packet protocol. So SSH is a standard secure remote shell application
that uses a protocol between a client and the sever. It has a key exchange
mechanism and once two sides exchange keys, SSH uses what's called the binary packet
protocol to send messages back and forth between the client and the server. Now
here is how SSH works, so recall that SSH uses what we called encrypt-and-MAC. Okay
so technically what happens is every SSH packet begins with a sequence number, and
then the packet contains the packet length, the length of the CBC pad, the
actual payload follows, then the CBC pad follows. Now this whole red block here is
CBC encrypted also with a chained IV, so this is also vulnerable to the CPA attacks
that we discussed before. But nevertheless, this whole red packet is
encrypted using CBC encryption. And then the entire clear text packet is MAC-ed.
And the MAC is sent in the clear, along with the packet. So I want you to remember
that the MAC is computed over plain text packets, and then the MAC is sent in the
clear. This is what we call encrypt-and-MAC. And we said that this is not a good
way to do things, because MACs have no confidentiality requirements. And by sending
the MAC of the clear text in the clear, you might be exposing information about
the clear text. But this is not the mistake that I want to show you here. I
want to show you a much more clever attack. So first, let's look at how decryption
works in SSH. So what happens is, first of all, the server decrypts the encrypted
packet length field only. So it only decrypts these particular first few bytes.
Then it will go ahead and read from the network, as many bytes as specified in the
packet length field. It's gonna decrypt the remaining cipher text blocks using CBC
decryption. Then, once it's recovered the entire SSH packet, it will go ahead and
check the MAC of the plain text, and report an error if the MAC happens to be
invalid. Now the problem here is that the packet length field is decrypted and then
used directly to determine the length of the packet before any authentication has
taken place. In fact, it's not possible to verify the MAC of the packet length field
because we haven't recovered the entire packet yet and as a result we cannot check
the MAC. But nevertheless the protocol uses the packet length before verifying that the MAC
is valid. So it turns out this introduces a very, very cute attack. And I'm only
gonna describe a very simplified version of this attack, just to get the idea
across. So here's the idea. Suppose the attacker intercepted a particular cipher
text block, namely the direct AES encryption of the message block M. And now
he wants to recover this M. And I emphasize that this intercepted
cipher text is only one block length. It's one AES block. So here's what the
attacker is gonna do. Well, he's gonna send a packet to the server that starts
off as normal. It's basically, starts off with a sequence number and then he's going
to inject his captured cipher text as the first cipher text block that's sent to the
server. Now, what is the server going to do? The server is gonna decrypt the first
few bytes of this first AES block and he's going to interpret the first few bytes as
the length fields of the packet. The next thing that's gonna happen is, the server
is gonna expect this many bytes, before it checks that the MAC is valid. And so what
the attacker is gonna do, is, he's gonna feed the server one byte at a time. So the
server will read one byte, and then another byte, and then another byte.
Eventually, the server will read as many bytes as the length field specifies, at
which point, it will check that the MAC is valid. And of course, the attacker was
just feeding the server junk bytes. And as a result, the MAC is not gonna verify, and
the server will send a MAC error. But you realize that what happened here, the
attacker was counting how many bytes it sent to the server. And so it knows
exactly how many bytes were sent at the time that it receives the MAC error from
the server. So that tells it that the decryption of the first 32 bits of its
cypher text C are exactly equal to the number of bytes that were sent before it
saw the MAC error. So this is a very clever attack. So let me say it one more
time to make sure this is clear. So again, the attacker has a one block cipher text C
that it wants to decrypt and let's pretend that when C is decrypted the 32 most
significant bits of the plain text happen to be the number five. In this case, what
the attacker will see, is the following behavior. The server is gonna decrypt the
challenge block c and he's gonna obtain the number five as the length field. So,
now, the attacker is gonna feed the server one byte at a time and after the attacker
feeds the server five bytes the server says, hey, I've just recovered the entire
packet, let me check the MAC. The MAC is likely to be false and, then, it will
send, bad MAC error. So after five bytes are read off the network the attacker is
gonna see a bad MAC error and then the attacker learns that the most significant
32 bits of the decrypted block is equal to the number five. So there. So, you just
learned the 32 most significant bits of C. So this is a very significant attack,
because the attacker just learned 32 bits of the decrypted cipher text block. And
since he can apply this attack to any cipher text block he wants, he can
basically learn the first 32 bits of every cipher text block in a very long message.
So what just happened here? Well, there are actually two things that were wrong in
this design. The first one is the decryption operation is non-atomic. In
other words, the decryption algorithm doesn't take a whole packet as input, and
respond with a whole plain text as output, or with the word reject. Instead, the
decryption algorithm partially decrypts the cipher text, namely to obtain the
length field, and then it waits to recover as many bytes as needed and then it
completes the decryption process. So these nonatomic decryption operations are fairly
dangerous, and generally, they should be avoided. In this example, this nonatomic
decryption happens to break authenticated encryption. The other problem that
happened is that the length field was used before it was properly authenticated. And
this is another issue that should never be done. So the encryption field should never
be used before the field is actually authenticated. So let me ask you, if you
had the option of redesigning SSH what is the minimum change that you would do to
make SSH resist this attack? And let me tell you that multiple answers might be
correct. One option is to send a length field in the clear, just as in the case of
TLS. In this case, there's no opportunity for an attacker to submit chosen cipher
text attack, because, well, the length field is never decrypted. And so there's
no decryption taking place that the attacker can abuse. Replacing encrypt-and-MAC
by encrypt-then-MAC doesn't have any impact because this attack would apply
either way. The problem is that the length field is used before it's authenticated
and that would have to happen either way. So a better mode of encryption doesn't
actually help. Another option is to MAC the length field separately so that now
the server can read the length field, check that the MAC for just the length
field is valid, and then it would know how many subsequent bytes to read before
checking MAC field on the entire packet. The last option is actually one that
works, but is terribly inefficient, and it would expose the server to a denial of
service attack, so I'm not going to mark it as a valid answer. So the main lesson
to remember is, don't implement or design your own authenticated encryption system.
Just use the standards like GCM. But if for some reason, you can't use the
standards, and you have to implement your own authenticated encryption system, then
use encrypt-then-MAC. And make sure not to repeat the mistakes of the last two
segments, namely don't use a length field before the length field is authenticated.
And more generally, don't use any decrypted field before that field is
authenticated. Okay so this is the end of our discussion of authenticated
encryption. I wanted to point out a couple of papers on authenticated encryption that
you could use as further reading. The first one is a very cute one on the order
of encryption and authentication that talks about whether one should do encrypt-then-MAC
or MAC-then-encrypt and it shows that one is correct and one is
incorrect. It's a good read and there's a lot of information in that paper. The
second discusses OCB mode, which is a very efficient way of building authenticated
encryption. In particular, it looks at a variant of OCB with associated data as we
discussed when we described OCB. The last three papers are attack papers. The first
one describes the padding oracle attack that we discussed in the last segment.
This one here describes the length attack that we just described in this segment.
And the last one describes a number of attacks on encryptions that just use CPA
security without adding integrity. So this last paper actually provides a number of
good examples for why CPA security by itself should never, ever, ever be used
for encryption. Remember the only thing you're allowed to use is authenticated
encryption for confidentiality. Or if all you need is integrity with no
confidentiality then you just use a MAC.