[SOUND] Hi, in this video we're going to study the RSA cryptosystem, which allows Alice and Bob to establish some secure communication even thought Eve is always listening. And in case Alice and Bob don't share a secret key yet. So RSA is called so because it was invented by Rivest, Shamir and Adleman in 1978. And there is a really interesting story about how several such group systems were invented relatively the same time. But some of them were classified and this one was public and it was patented. And although it was not supported by the government gradually it became the most frequently used. And now programs based on RSA are among the most frequently run programs in the world. Can read that interesting story more in Wikipedia and we're going to describe the system itself. So RSA is an algorithm for asymmetric or public key encryption, as opposed to one time that is symmetric and private key. So this picture illustrates what generally happens in asymmetric encryption. So instead of Alice and Bob sharing a common secret key, just Alice publishes a public key. And saves for herself a private key and doesn't tell that to anyone. And if Bob after that wants to send a message to Alice like Hello Alice. He can encrypt this message using Alice's public key, which is known to Bob, to Eve, to Alice, to anyone. And then Alice receives this message and she can use her private key to decrypt this message. But no one else knows this private key. And without it, while it is formally possible to still decrypt. But it is so hard that it cannot be done using the publicly known algorithms in any reasonable time, like in thousand or millions of years. So this is the general schema. And now we're going to describe the protocol. So in this protocol we assume that Bob wants to receive messages. The protocol is asymmetric, so the one who wants to receive messages needs to generate two keys, the public key E and the private key D. And if Alice and Bob want to send messages in the direction from Alice to then Bob generates these random keys. If Bob wants to send some messages to Alice, then Alice would have to generate two keys. So in this case, we assume that Alice is going to send messages to Bob. And Bob generates two random keys, public key E and private key D. Then Bob publishes the public key for anyone to access. And then anyone can encrypt message for Bob using E. Not just Alice but also Eve or anyone else can encrypt and send a message for Bob using this public key. But only Bob can decrypt the encrypted message using the private key D. And this operation is going to be fast and easy to implement. However, anyone can also decrypt by trying all possible keys. Because the encryption algorithm itself, the RSA is public. I know it, you will know it soon, and so anyone can try to decrypt the ciphered text by trying all possible keys. But with the algorithms that are currently published, it will take many, many years. So this is what we base our algorithm on. We base it actually not on our strong side. But on our weakness that we don't know how to do something very fast or at least with some reasonable speed. And this is what allows us to send messages securely. So as soon as someone comes up with a very clever algorithm to do the things we currently think are not possible faster, well, every scheme based on this algorithm will become unsecure immediately. So, what about the keys themselves? In this particular algorithm Bob generates first two big random primes, p and q. And this is where we start to actually use some number theory. So D generates the random primes p and q, and big means thousands of bits. Not just numbers like a few hundreds, but thousands of bits, or hundreds of thousands of decimal digits. That's pretty large numbers. And he computes their product. This is, although it is hard to do by hand. This is considered pretty a easy operation on a computer. So, after that, Bob generates, also, a random number e, which must be coprime with this product (p- 1)(q- 1). You might remember this number of (p- 1)(q-1), as the order function of n, which is equal to pq, but we'll talk about that later. For now this is just this product (p- 1)(q- 1). After that, public key E is just the pair of number n and the small e. And this is going to become known to everyone. And private key D is the pair of primes p and q. So people know the product of p and q, but no one knows p and q, themselves. So we rely basically on the fact that given just their product, we cannot say what were the initial numbers. So the encryption and decryption goes as following. We take message m, which can be encoded as a sequence of bits as usual, and that is converted to an integer. And to be able to use this message, it needs to be between 0 and n- 1, where n is the product of p and q which is the public key. So we need to choose p and q big enough, so that this length in bits is sufficient for the messages we want to send. Then anyone who wants to send message m can create a cypher text c, which is equal to the modular exponent m to the bar of e, modular n. And to do this encryption first you use the first modular exponentiation algorithm from the precious module. To do the decryption it turns out that both can compute and actually pre-compute such number d that c to the power of d module n will be equal to the initial message m. So this d is not publicly known, and it's only Bob who can really compute such d fast. But in general, such d exists, and this is what allows us to decrypt quickly. Because to compute c to the power of d module n, we, again, can use fast modular exponentiation algorithm. So how do we compute such d and where to take it? So let's assume we already have such d, what needs to happen then? c is our ciphertext, we make the exponent c to the power of d. This is equivalent to taking the initical c, which is equal to m to the power of e modulo m. And then taking the d exponent of that. So m to the e and all that to the d. And this is equal to m to the ed module n. So what we need is that m to the power of ed modulo m is equal to message m itself. Now we know that n, the module, is equal to the product of two primes, p and q. And of course those are two different primes, so they're coprime. So we can use the Chinese remainder theorem here. And by Chinese remainder theorem, this module equality m to the power of ed is equal to m mod n is equivalent to two new equalities. m to the power of ed is equal to m mod p. And m is equal to m to the power of ed equal to m mod q. because p and q are coprime, we use the Chinese remainder theorem. Now, how to achieve that? We need these to equalities, they're pretty similar, just modular different prime numbers. As these are prime numbers, we can use the Fermat's little theorem. And we know that with Fermat's little theorem, we can optimize the computation of modular exponent. And, in particular, for any m and any k, m to the power of k is equal to m to the power of k mod(p- 1), mod p. So the above will hold if ed is equal to 1 mod (p- 1), and ed is equal to 1 mod (q- 1). because we just instead of taking m to some exponent equal to m to the power of 1. We just take the exponent ed and say that it is going to be equal to the exponent 1 on the right hand side of the multiplication, mod (p- 1)and the same mod (q- 1). So this is what we need again. This is not what is already satisfied, this is what we need. And again, to satisfy this we can make an even stronger request. We can request that ed is equal to 1 mod (p- 1)(q- 1). If this is true then of course the above is also true. Because (p- 1) and (q- 1) are divisors of (p- 1) and (q- 1). And we know that the remainder, modular number defines the remainder module any its divisor. So remember that when Bob generated e, we required that it should be coprime with (p- 1)(q- 1). This is what is going to help us now. So, Bob can use the extended Euclid's algorithm to compute such d that ed is equal to 1 mod (p- 1) (q- 1) why is that? Because e is coprime with the module, so multiplication by e is invertible. And so there is an inverse element, d, which gives 1 after multiplying by mod (p- 1)(q- 1). So not only it exists, but we know the algorithm to do that and this is called the extended Euclid's algorithm. So, the decryption algorithm is compute (p- 1)(q- 1). Bob knows both p and q, so this is easy to do. Then compute d, which gives ed is equal to 1 mod (p- 1)(q- 1) using the extended Euclid's algorithm. And actually, Bob can compute and store this d right after generating numbers p, q, and e. And then use it, and not just compute it every time someone sends a message to him. So to decrypt c iphertext c, he will just computer the modular exponent of c to the power of d mod n using fast modular exponentiation algorithm. And the communication protocol is as following. Alice represents message m as a number 0 and n- 1. She computes and sends the ciphertext m to the power of e mod n. Bob receives this c and computes m back by c to the power of d mod n. So what do we use here? We know that n is publicly known, but its factorization, prime factorization is secret. And rely on the difficulty of factorization. If some day an efficient integer factorization algorithm appears, RSA will immediately become insecure. That is for now, we don't any efficient algorithm for factoring big integers. Actually, we also rely on the fact that it is difficult to compute the number (p- 1)(q- 1), without knowing p and q. Otherwise, even if she couldn't factorize m, she could still compute d, because it only uses the (p- 1)(q- 1) number. However, computing this number which is actually Euler's totient part function of m is equivalent to factorizing n. We can show that if you can compute this number, then you also have an efficient algorithm for factorizing n in the first place. And encryption and decryption can be equivalently explained using Euler's phi function and Euler's theorem instead of Fermat's little theorem to know that for any m and k, m to the power of k is equal to m to the power of k mod phi of n, mod n. And you can do this as an exercise following the slides where I explain the same using Fermat's little theorem. In general, to decrypt we don't even need the (p- 1)(q- 1). All we actually need is to solve the modular root problem. So we are given the ciphertext, c, and you are given the public exponent e. And you need to find such m that m to the power of e = c mod n. Or which is the same as finding the modular root of c, but not just regular root of a number. But just the e-th modular root of c. Again there are no known efficient algorithms to solve this problem. However, there are known algorithms which solve modular root problems and avoid factorization. So in theory, it could be possible to solve this problem efficiently, but still not solve the factorization problem efficiently. So these two problems are are not necessarily equivalent. However all the known algorithms that do that are still inefficient and the scheme is still secure. So the cryptoanalysts have been working on different attacks against RSA for decades. Because it is always a science of coming up with a cipher and then trying to break it from different points of view. There are, of course, accurate implementations, which aren't broken, or at least aren't very much broken. Still regularly receives some attacks which breaks some ciphers. However, there are a lot of subtle details. And missing them leads to a breakable cipher. And you'll learn in the next lecture several attacks and break several ciphers as an exercise. So see you in the next video. [SOUND]