0:10

designed. But before we do that, I want to, first of all, define more precisely what a

cipher is. So first of all, a cipher is actually, remember a cipher is made up of

two algorithms. There's an encryption algorithm and a decryption algorithm. But

in fact, a cipher is defined over a triple. So the set of all possible keys,

which I'm going to denote by script K, and sometimes I'll call this the key space,

it's the set of all possible keys. There's this set of all possible messages and this

set of all possible ciphertexts. Okay, so this triple in some sense defines the

environment over which the cipher is defined. And then the cipher itself is a

pair of efficient algorithms E and D. E is the encryption algorithm; D is the

decryption algorithm. Of course, E takes keys and messages. And outputs ciphertexts.

And the decryption algorithm takes keys and ciphertexts. Then outputs messages.

And the only requirements is that these algorithms are consistent. They satisfy

what's called the correctness property. So for every message in the message space.

And every key. In the key space, it had better be the case that if I encrypt the

message with the key K and then I decrypt using the same key K I had better get back

the original message that I started with. So this equation here is what's called the

consistency equation and every cipher has to satisfy it in order to be a cipher

otherwise it's not possible to decrypt. One thing I wanted to point out is that I

put the word efficient here in quotes. And the reason I do that is because efficient

means different things to different people. If you're more inclined towards

theory, efficient means runs in polynomial time. So algorithms E and D have to run in

polynomial time in the size of their inputs. If you're more practically

inclined, efficient means runs within a certain time period. So for example,

algorithm E might be required to take under a minute to encrypt a gigabyte of

data. Now either way, the word efficient kind of captures both notions and you can

interpret it in your head whichever way you'd like. I'm just going to keep

referring to it as efficient and put quotes in it as I said if you're theory

inclined think of it as polynomial time, otherwise think of it as

concrete time constraints. Another comment I want to make is in fact algorithm E.

It's often a randomized algorithm. What that means is that as your encrypting

messages, algorithm E is gonna generate random bits for itself, and it's going to

use those random bits to actually encrypt the messages that are given to it. On the

other hand the decrypting algorithm is always deterministic. In other words given

the key and the ciphertext output is always the same. Doesn't depend on any

randomness that's used by the algorithm. Okay, so now that we understand what a

cipher is better, I want to kind of show you the first example of a secure cipher.

It's called a one time pad It was designed by Vernam back at the beginning of the

twentieth century. Before I actually explain what the cipher is, let's just

state it in the terminology that we've just seen. So the message space for the

Vernam cipher for the one-time pad is the same as the ciphertext space which is

just the set of all ended binary strings. This, this just means all sequences of

bits, of zero one characters. The key space is basically the same as the message

space which is again simply the embed of all binary strings. So a key in the one

time pad is simply a random big string, it's a random sequence of bits. That's as

long as the message to be encrypted, as long as the message. Okay, now that we've

specified kind of what's the cipher's defined over we can actually specify how

the cipher works and it's actually really simple. So essentially the ciphertexts.

Which is the result of encrypting a message with a particular key, is simply

the XOR of the two. Simply K XOR M. [inaudible] see a quick example of

this. Remember that XOR, this plus with a circle. XOR means addition

modulo two. So if I take a particular message, say, 0110111. And it take a

particular key, say 1011001. When I compute the encryption of the message

using this key, all I do is I compute the XOR of these two

strings. In other words, I do addition module or two bit by bit. So I get one,

one, zero, one, one, one, zero. That's a ciphertext. And then how do I decrypt? I

guess they could do kind of the same thing. So they decrypt a cipher text using

a particular key. What I do is I XOR the key and the ciphertext again. And so all

we have to verify is that it satisfies the consistency requirements. And I'm going to

do this slowly once and then from now on I'm going to assume this is all, simple to

you. So we're gonna make, we're gonna have to make sure that if I decrypt a cipher

text, that was encrypted using a particular key, I had better get. Back the

message M. So what happens here? Well, let's see. So if I look at the encryption

of k and m, this is just k XOR m by definition. What's the decryption of k XOR

m using k? That's just k XOR (k XOR m). And so since I said that XOR is

addition modulo two, addition is associative, so this is the same as (k XOR k)

XOR m, which is simply as you know k XOR k is a zero, and zero XOR anything

is simply m. Okay, so this actually shows that the one-time pad is in fact a cipher,

but it doesn't say anything about the security of the cipher. And we'll talk

about security of the cipher in just a minute. First of all, let me quickly ask

you a question, just to make sure we're all in sync. Suppose you are given a

message m and the encryption of that message using the one time pad. So all

you're given is the message and the cipher text. My question to you is, given this

pair m and c, can you actually figure out the one-time pad key that was used in the

creation of c from m?

So I hope all of you realize that in fact, given the message in

the cipher text it's very easy to recover what the key is. In particular the key is

simply M XOR C. Then we'll see that if it's not immediately obvious to you we'll

see why that's, the case, a little later in the talk, in the lecture. Okay alright

so the 1-time pad is a really cool from a performance point of view all you're doing

is you exo-ring the key in the message so it's a super, super fast. Cypher for

encrypting and decrypting very long messages. Unfortunately it's very

difficult to use in practice. The reason it's difficult to use is the keys are

essentially as long as the message. So if Alice and Bob want to communicate

securely, so you know Alice wants to send a message end to Bob, before she begins

even sending the first bit of the message, she has to transmit a key to Bob that's as

long as that message. Well, if she has a way to transmit a secure key to Bob that's

as long as the message, she might as well use that same mechanism to also transmit

the message itself. So the fact that the key is as long as the message is quite

problematic and makes the one-time pad very difficult to use in practice.

Although we'll see that the idea behind the one-time pad is actually quite useful

and we'll see that a little bit later. But for now I want to focus a little bit on

security. So the obvious questions are, you know, why is the one-time pad secure?

Why is it a good cypher? Then to answer that question, the first thing we have to

answer is, what is a secure cipher to begin with? What is a, what makes cipher

secure? Okay, so the study, security of ciphers, we have to talk a little bit

about information theory. And in fact the first person, to study security of ciphers

rigorously. Is very famous, you know, the father of information theory, Claude

Shannon, and he published a famous paper back in 1949 where he analyzes the

security of the one-time pad. So the idea behind Shannon's definition of security is

the following. Basically, if all you get to see is the cypher text, then you should

learn absolutely nothing about the plain text. In other words, the cypher text

should reveal no information about the plain text. And you see why it took

someone who invented information theory to come up with this notion because you have

to formulize, formally explain what does information about the plain text actually

mean. Okay so that's what Shannon did and so lemme show you Shannon's definition,

I'll, I'll write it out slowly first. So what Shannon said is you know suppose we

have a cypher E D that's defined over triple K M and C just as before. So KM and

C define the key space, the message space and the cypher text space. And when we say

that the cypher text sorry we say that the cypher has perfect secrecy if the

following condition holds. It so happens for every two messages M zero and M1 in

the message space. For every two messages the only requirement I'm gonna put on

these messages is they have the same length. Yeah so we're only, we'll see why

this requirement is necessary in just a minute. And for every cyphertext, in the

cyphertext space. Okay? So for every pair of method messages and for every cipher

text, it had better be the case that, if I ask, what is the probability that,

encrypting N zero with K, woops. Encrypting N zero with K gives C, okay? So

how likely is it, if we pick a random key? How likely is it that when we encrypt N

zero, we get C. That probability should be the same as when we encrypt N1. Okay, so

the probability of encrypting n one and getting c is exactly the same as the

probability of encrypting n zero and getting c. And just as I said where the

key, the distribution, is over the distribution on the key. So, the key is

uniform in the key space. So k is uniform in k. And I'm often going to write k arrow

with a little r above it to denote the fact that k is a random variable that's

uniformly sampled in the key space k. Okay, this is the main part of Shannon's

definition. And let's think a little bit about what this definition actually says.

So what does it mean that these two probabilities are the same? Well, what it

says is that if I'm an attacker and I intercept a particular cypher text c, then

in reality, the probability that the cypher text is the encryption of n zero is

exactly the same as the probability that it's the incryption of n one. Because

those probabilities are equal. So if I have, all I have the cypher text C that's

all I have intercepted I have no idea whether the cypher text came from M zero

or the cypher text came from M one because again the probability of getting C is

equally likely whether M zero is being encrypted or M one is being encrypted. So

here, we have the definition stated again. And I just wanna write these properties

again more precisely. So let's write this again. So what [inaudible] definition

means is that if I am given a particular cipher text, I can't tell where it came

from. I can't tell if it's, if the message that was encrypted. Is either N zero or N

one and in fact, this property is true for all messages. For all these N zero, for

all N zero and N ones. So not only can I not tell if'c' came from N zero or N one,

I can't tell if it came from N two or N three or N four or N five because all of

them are equally likely to produce the cypher text'c'. So what this means really

is that if you're encrypting messages with a one time pad then in fact the most

powerful adversary, I don't really care how smart you are, the most powerful

adversary. Can learn nothing about the plain text, learns nothing about the plain

text. From the cypher text. So to say it in one more way, basically what this

proves is that there's no, there's no cypher text-only attack on a cypher that

has perfect secrecy. Now, cypher attacks actually aren't the only attacks possible.

And in fact, other attacks may be possible, other attacks may be possible.

13:24

Such that, well. If I encrypt. And with k I get c. So I literally count the number

of keys and I divide by the total number of keys. Right? That's what it means, that

if I choose a random key, that key maps m to c. Right. So it's basically the number

of key that map n to c divided by the total number of keys. This is its

probability. So now suppose that we had a cypher such that for all messages and all

cypher texts, it so happens that if I look at this number, the number of k, k, and k,

such that e, k, m is equal to c. In other words, I'm looking at the number of keys

that map m to c. Suppose this number happens to be a constant. So say it

happens to be two, three, or ten or fifteen. It just hap, happens to be an

absolute constance. If that's the case, then by definition, for all n0 and n1 and

for all c, this probability has to be the same because the denominator is the same,

the numerator is the same, it's just as constant, and therefore the probability is

always the same for all n and c. And so if this property is true, then the cypher has

to have, the cypher has perfect secrecy. Okay, so lets see what can we say about

this quantity for the one time pad. So the sec-, so, the question to you is, if I

have a message in a cipher-text, how many one time pad keys are there [inaudible]

map, this message ends, so the [inaudible] C? So, in other words, how many keys are

there, such that M XOR K is equal to C? So I hope you've all answered one. And

let's see why that's the case. For the one time pad, if we have that, the encryption

of K of M under K is equal to C. But basically, well, by definition, that

implies that K XOR M is equal to C. But that also simply says that K has to equal

to M XOR C. Yes, I just X over both sides by M and I get that K must equal the

M XOR C. Okay? So what that says is that, for the one time pad, in fact, the

number of keys, in K, shows the EKM, is equal to C. That simply is one, and this

holds for all messages in cipher text. And so, again, by what we said before, it just

says that the one time pad has, perfect secrecy. Perfect secrecy and that

completes the proof of this [inaudible] very, very simple. Very, very simple

lemma. Now the funny thing is that even though this lemma is so simple to

prove in fact it proves a pretty powerful statement again. This basically says for

the one time [inaudible] there is no cypher text only attack. So, unlike the

substitution cipher, or the vigenere cipher, or the roller machines, all those

could be broken by ciphertext-only attack. We've just proved that for the one-time

pad, that's simply impossible. Given the cyphertext, you simply learn nothing about

the plaintext. However, as we can see, this is simply not the end of the story. I

mean, are we done? I mean, basically, we're done with the course now, cuz we

have a way. To encrypt, so that an attacker can't recover anything about our

method. So maybe we're done with the course. But in fact, as we'll see, there

are other attacks. And, in fact, the one time pad is actually not such a

secure cipher. And in fact, there are other attacks that are possible, and we'll

see that shortly. Okay? I emphasize again the fact that it has perfect secrecy does

not mean that the one time pad is the secure cypher to use. Okay. But as we said

the problem with the one time pad is that the secret key is really long. If you had

a way of. Communicating the secret key over to the other side. You might as well

use that exact same method to also communicate the message to the other side,

in which case you wouldn't need a cypher to begin with. Okay? So the problem again

is the one time pad had really long keys and so the obvious question is are there

other cyphers that has perfect secrecy and possibly have much, much shorter keys?

Well, so the bad news is the Shannon, after proving that the one-time pad has

perfect secrecy, proved another theorem that says that in fact if a cypher has

perfect secrecy, the number of keys in the cypher must be at least the number of

messages that the cypher can handle. Okay, so in particular, what this means is if I

have perfect secrecy. Then necessarily the number of keys, or rather the length of my

key, must be greater than the length of the message. So, in fact, since the one

time pad satisfies us with equality, the one time pad is an optimal, cipher that

has perfect secrecy, okay? So basically, what this shows is that this is an

interesting notion. The one time pad is an interesting cipher. But, in fact, in

reality, it's actually quite hard to use. It's hard to use in practice, again,

because of these long keys. And so this notion of perfect secrecy, even though

it's quite interesting, basically says that it doesn't really tell us the

practical cyphers are going to be secure. And we're gonna see, but, as we said, the

idea behind the one time pad is quite good. And we're gonna see, in the next lecture,

how to make that into a practical system.