0:00

In this segment, we're gonna look at the concept of deterministic encryption that

often comes up in practice. When I say deterministic encryption system, I mean an

encryption system that will always map given message to exactly the same cipher

text. So if we encrypt the same message three times, every time we'll get exactly

the same cipher text. So there are no nonces involved here. Literally this is

just a consistence encryption scheme that will always output the same cipher text

given a particular message. So let's see where this comes up in practice and in

particular, I want to show you the case of look-ups into an encrypted database. So

the settings are imagine we have a server here that is going to store information

inside of an encrypted database. So what it will store is records, and every record

has an index and some data that's stored inside of the record. Now, the first thing

the server's gonna do is, it's gonna encrypt this record. So here's a record

became encrypted and you notice that the index became encrypted with K1 and the

data is encrypted with K2 and then the encrypted record is sent over to the

database. And the same thing happens to many records so that the database overall

holds many, many encrypted records. Well, again, you can imagine that the index is

encrypted using the key K1 and then the data and the records is encrypted using

the key K2. Now, if encryption is deterministic, the nice thing about that

is that, at a later time, when the server wants to retrieve a record from the

database, all he needs to do is send to the database an encryption of the index

that the server is interested in. So here, it would send an encryption of the index,

Alice. That again, becomes encrypted, and the resulting cipher text is identical to

the cipher text that was generated when the record was first written to the

database. And the database can then, you know, find the record that has this

encrypted index in it, and then send the result back to the server. The nice thing

about this is that now the database is completely in the dark as to what data is

stored in the database and it doesn't even know what records are being retrieved by

the server since all it sees are basically requests for encrypted entices. And so

this deterministic encryption mechanism. Let's just do a quick look up in the

database given an encrypted index and we're guaranteed that because of the

deterministic encryption property that the index is going to be encrypted in exactly

the same way as if was when the record was created. And so this should be disturbing

to many of you because we previously said that deterministic encryption simply

cannot be chosen plain text secure. The problem is that an attacker can look at

different cipher text and if he sees the same cipher text twice he knows that the

underlying encrypted messages must also be the same. So in other words, by looking at

cipher text, he can learn something about the corresponding plain text because every

time he sees the same cipher text twice, he knows that the underlying messages are

equal. In practice, this leads to significant attacks, and particularly when

the message space is small. For example, if we're transmitting single bytes across

the network, such as, keystrokes that are being transmitted one keystroke at a time.

Then, in fact, an attacker can simply build a dictionary of all possible cipher

texts. If it's only single bytes, then there will only be 256 possible cipher

texts. And then, simply by learning what the decryptions of those cipher texts are,

he can actually figure out all future cipher texts, simply by looking them up,

in this dictionary. And so there are many cases where the message being so small,

where this, deterministic encryption, simply is insecure. Now concretely, in the

case of an encrypted database, what the attacker would see is if there are two

records that happen to have the same cipher text in the index position then now

he knows that those two records correspond to the same index. So again, even though

he doesn't know what the index is, he learns something about the corresponding

plain text. I wanted to briefly remind you that, formally, we say that deterministic

encryption cannot be CPA secure by describing an adversary that wins the CPA

game. The chosen plain text attack game, and let me quickly remind you how that

works. The game starts by the adversary sending two messages, M zero and M zero.

And remember that, in this game, the adversary is always going to be given the

encryption of the left message or the encryption of the right message. In this

case, since he used the same message in both cases, both on the left and on the

right, he's simply gonna get the encryption of the message M zero. In the

next step, he's gonna send the messages M zero, M1. And now he's either gonna get

the encryption of M zero, or the encryption of M1. And his goal is to tell

which one he got. But because the encryption is deterministic, this is very

easy for him to do. All he has to do is just test whether C is equal to C zero.

And if C is equal to C zero, then he knows that he got the encryption of M zero. If C

is not equal to C zero, he knows that he got the encryption of M1. So he can output

zero. If C is equal to C0 and output one, if C is not equal to C0 and his advantage

in this in this particular game would be one. So it, it would be a high, as high an

advantage as possible which means that he attacker completely defeats chosen

plain text security. Okay so, this is just a formal way of saying that the attacker

basically learns more information about the plain text than he should. So, the

question is, what do we do? And it turns out the solution is basically to restrict

the type of messages that can be encrypted under a single key. And so, the idea is to

say that suppose the encryptor never ever, ever encrypts the same message under a

single key. In other words the message key pair is always different and never

repeats. So that for every single encryption, either the message changes, or

the key changes, but, or both change. But it can't be that we encrypt the same

message twice under the same key. So why would this ever happen? Well it turns out

there are very natural cases where this happens. For example, if the messages

happen to be random. Say you, the encryptors encrypting keys and those keys,

you know, say are 128 the keys so, they'll never actually with very high

probability, they'll never repeat. In this case when we're encrypting keys, in fact,

is very likely that all the messages that are encrypted under one master key are

always distinct because, again, these keys are very unlikely to ever repeat. The

other reason why messages would never repeat is simply because of some structure

in the message space. For example, if all we're encrypting are unique user IDs. So

imagine, in the database example, the index corresponds to a unique user ID. And

if there's exactly one record in the database for each user, that says that

every encrypted record basically contains an encrypted index, where the index is

distinct for all records in the database. Okay so these are two reasons why messages

might never repeat. And this is a fairly reasonable thing that actually does happen

quite often in practice. So now if the messages never repeat, now maybe we can

actually define security and actually give constructions to satisfy our definitions.

So this motivates the concept of deterministic chosen plain text attacks and

let me explain what they mean. So as usual we have a cipher defined as an encryption

on a decryption algorithm. So we have a key space, a message space and a cipher

text space and we're gonna define security just as normal using two experiments.

Experiment zero and experiment one. And the game actually looks very similar, it's

almost an identical game to the standard chosen playing test attack game where

basically the attacker issues queries, so you can see these queries consist of pairs

of messages, M0 and M1. They, as usual they have to be the same length and for

each query the attacker either gets the encryption of M0 or the encryption of M1

and the attacker can do this again and again. He can do this Q times, and at the

end of the game he's supposed to say whether he got the encryptions of the left

messages or the encryptions of the right messages. So this is the standard chosen

plain text attack game, but now there's one extra caveat. Which is to say that, if

the bit is equal to zero, if B is equal to zero. In other words, the attacker always

sees the encryption of the left messages, then it so happens that the left messages

must all be distinct. In other words, he will never get to see the encryption of

the same message twice, because these left messages are always distinct. So if the

bit B is equal to zero, again, he'll never see. The same message encrypted under the

same key. Because, again, these zero messages, M1 zero to MQ zero, are all

distinct. Similarly, we require that all the one messages are also distinct. And so

that, again, if B happens to be equal to one, the attacker will never see two

messages encrypted under the same key. Okay? So this requirement that the, all

these messages are distinct, and similarly, all these Q messages are

distinct. Basically captures this use case that the encryptor will never encrypt the

same message multiple times using one particular key. And as a result, the

attacker simply can't ask for, the encryption of the same message multiple

times using the same key. Now, I should point out as you go back to our, to the

original attack, here. So, here we go back to our CPA attack on deterministic

encryption, you notice that here, in fact, this is not a deterministic CPA game,

because, here, the attacker did ask for the same message m0 to be encrypted twice.

So, in fact, this attack would be an illegal attack under the deterministic CPA

game. And by ruling out this attack we actually make it plausible that we might

be able to give constructions that are deterministic CPA secure. And so as usual

we say if the adversary cannot distinguish experiment zero, when he's given the

encryption of the left messages, from experiment one, when he's given the

encryption of the right messages, then the scheme is semantically secure. Under a

deterministic CPA attack. Okay. So as usual, we ask for what's the probability

that the adversary outputs one in experiment zero? What's the probability

that the outputs one in experiment one? Then if these probabilities are close then

his advantage in attacking the scheme is negligible. And if that's true for all

efficient adversaries then we say that the scheme is semantically secure under

deterministic CPA attack. So the first thing I want to do is show you the cipher

block training with a fixed IV, in fact, not deterministic CPA secure. And this a

common mistake that's used in practice. There are many products that should be

using a cipher that's deterministic CPA secure, but instead, they just use CBC

with a fixed IV and assume that's a secure mechanism. And in fact, it's not and I

want to show you why. So suppose we have a PRP that happens to operate on N bits

blocks. And we're going to use this PRP in CBC mode. So, you know, if there are five

blocks in the message then this PRP E will be called five times to encrypt each one

of those blocks. Okay. So here's how the attack's going to work. Well, the first

thing the adversary is going to do is he's going to ask for the encryption of the

message as 0N, 1N. In other words, the first block is all zeros and the second

block is all ones. So the left message is equal to the right message here which

means that he just wants the encryption of this 0N, one to the N message. So let's

see what the cipher text looks like. Well, for completeness I'm gonna write the IV,

the fixed IV, as the first element in the ciphertext. And if you think about how

CBC works and the second element in the ciphertext is gonna be basically the

encryption of the IV XOR the first block of the message. Well in our case the

first block of the message is all zeros so really all we're encrypting is just a

fixed IV. So the second block of the ciphertext is simply gonna be the

encryption of this fixed IV. So next, what the adversary's gonna do is, now he's

gonna output two messages that are a single block. So the first message is

gonna be, the message on the left is gonna be the all zeroes block. And the message

on the right is gonna be all ones block. So observe that this is a valid query,

because messages on the left are all distinct. And the messages on the right

are also all distinct. So these are all valid deterministic CPA queries. Now, what

does the attacker get in response? Well, what he'll get in response is the

following. If he gets the encryption of the message on the left. Then, well,

what's the encryption of the one block message, zero to the N? It's simply the

encryption of the fixed IV, just as we saw before. However, if he's getting the

encryption of the message on the right, well, that's gonna be the encryption of 1 XOR

the fixed IV. That's the definition of, CBC encryption. So know, you can see

basically how the attack is going to proceed. You notice, if I, here I'll use a

different color here. You notice if these two blocks happen to be the same, then we

know that he received the encryption of the message on the left, in other words B

is equal to zero. If they're not the same then he knows that B is equal to one Okay,

so he's gonna output zero if, you know, C of one, which is this block, is equal to

C1 of one, which is this block, and he's gonna output one otherwise. So this

basically shows that CBC with a fixed IV is completely insecure. Basically the

first block can be very easily attacked. And, in fact, if the messages are short

you can again build dictionaries and completely break systems that kind

of view CBC with a fixed IV hoping that it's gonna be a deterministic CPA secure.

So don't do that. We're actually gonna see secure deterministic CPA constructions in

the next segment. So I'll say one more time, if you need to encrypt the database

index in a consistent manner, don't use CBC with a fixed IV to do it. Use the

schemes that I'm gonna show you in the next segment. And so let me ask you the

same question about counter mode with a fixed IV. Is this gonna be a deterministic

CPA secure system or not? And here, I'm reminding you what counter mode with a

fixed IV is. Basically, we concatenate the fixed IV. Fixed IV plus one, Fixed IV plus

L. We encrypt all these counters. We concatenate the results, we encrypt the

message to get the cipher text. So if you think this is gonna be deterministic CPA

secure. So I hope everybody said no, because basically this is a onetime

padding encryption, and if we use this one time pad to encrypt distinct messages,

basically we'll be getting a two time pad. And to see more precisely, here I wrote

down the, deterministic CPA game. So, you notice what the attacker will do is he

would start off by asking for the encryption of the message m. So he would

get here the message m XOR the encryption of the fixed iv and then he's gonna ask

for some two distinct messages m0 and m1 that's different from m. So, m, m0 and m1

are a three are all three are distinct messages. And then what he'll get is the

encryption of mb and now he can simply mount the standard, kinda two time pad

attack. And if this equality here of c XOR c prime is equal to m XOR m zero, he

knows that c prime is the encryption of m zero. And otherwise he knows it's the

encryption of M1. So, again, he will completely win this game with advantage

equals to one. Okay, so again deterministic CPA with a fixed IV is also

completely insecure as a deterministic CPA cipher. So, don't use any of those

schemes, instead let's use the schemes I'm going to describe in the next segment.