So, welcome to our lecture on Pubic Key Encryption where we're going to go back to confidentiality. And so, so here we go. If, if you recall we've been having these two topics that have been our theme throughout. just some[INAUDIBLE] . Oh sorry, I'm starting to, talk in route 13. Let's translate this back to non route 13. the terminology, the two kind of themes we've been following over the last couple of lectures and this lecture, are confidentiality and integrity. And confidentiality is hiding, right, shielding information. Not leaking information to people that you don't want to show it to. And integrity is making sure that you know who you're dealing with. And then the previous lecture, we really talked about kind of real light approachable ways of ensuring confidentiality with things like Caesar cipher. And then integrity using a simple message digest, that, based on a shared secret. So, the problem with all of those things that we just saw, is that they require a shared secret. And the problem in the world of the internet is, it's just really difficult. For every one of us before we establish well, before we can make any purchases or whatever at Amazon, that we somehow have to drive to Amazon headquarters. And, and get a shared secret from Amazon. Open up a book and say okay, hi Chuck, I see who you are and here's our shared secret, and you walk away. And as long as you carry that shared secret while you go. And if the shared secret is lost, it's difficult to review, revoke. So, as the internet and, and frankly in general, as security needed to be able to work at arms length. Meaning that you couldn't always bring everybody together and hand out shared secrets. And then have them go to the far reaches of the world and communicate. public key encryption was identified, as an extremely elegant solution to this problem. And so it was proposed by Diffie and Hellman in 1976. And it relies on two keys. It's asymmetric, meaning we're not using the same key to encrypt as decrypt, the way we were in the previous lectures. These are asymmetric. There is a public key, which is actually, does not need any protection whatsoever, and a private key. And the idea is they're generated inside of a computer. You generate the public key and the private key. You send out the public key, the public is used to do the encryption. And then private key is used to do the decryption. And they're related mathematically, in a way that's well understood, but difficult to compute for a key length that's large enough. So, there's a public key and a private key. So, I'd like you to take a look at this little video up on YouTube of Diffie, Hellman, and Merkle, the, the inventors of this. and I think it's a great video. I would love it if this were my video, but I didn't produce this video. So so take a quick look. So, one of the things about this public private key encryption is now that we know about it, it's like wow, it's pretty obvious. And frankly Caesar and the Germans and everybody could have used this idea. They just hadn't thought of it yet. And the other thing that's kind of interesting if you look into the story of this, is that the first reaction people got when they started thinking about this is like it can't be this easy. Now, it's sort of both easy and hard but, but the concept is real elegant and really beautiful, and that is that we have this public key. So, the public key is part of a public private pair, and it's used to do the encryption. The beautiful, beauty is it's computation and difficult to recover that private key from the public key and the encrypted text. A key thing is, is it's not impossible. And that's kind of one of the interesting philosophies of security that, that we started at the very beginning in talking about security. The perfect security is kind of impossible to achieve, unless you simply don't send anything. And so, public private key, asymmetric keys is, is well understood as to how you would break it. Everyone knows how to break it. The problem is, is that computers aren't fast enough to break it, and when computers get faster we'll just make the keys bigger. So, the mathematics of this makes it impractical to break. I mean literally impractical to break. Now I think we can safely assume that, governments probably have enough computation to crack these once in a great while. I mean, they're not cracking every transaction between you and target when you want to buy something. But if they really have to, they can record the encrypted transmissions. And if they really had to and took a long time, I have no idea how long it would be, they can break it. So, that's actually kind of a neat way to think about this. By revealing it all, frankly, any computer scientist could make a name for them their whole life if they proved that there was something wrong with this. By revealing the algorithm, revealing the cracking technique, if someone can come up with a better cracking technique, it is like, fame and glory forever. Which means that, we're pretty sure that there's no good way to crack this other than the brute force mechanism, that requires a large amount of computation. So, if you're going to use public private key encryption, you have to generate a pair. And it starts by charging, choosing two really large random numbers, with hundreds if not thousands of digits that are prime. See you kind of choose a, choose a random number really big. And then you kind of look around for a nearby prime number and you choose two of those. And then you multiply them, okay? Getting an even larger number. And then, through some steps, through some calculations, you compute the public and the private keys from that large number. The essence of this, are those two prime numbers. Prime numbers of course are numbers that you only divide by themselves and one which means they have no factors. Which means they're kind of like looking for a needle in a haystack. And so the public and private key is really based on these two prime numbers. If you could figure out what the prime numbers were, you'd be okay. But the computational difficulty is finding the prime numbers that are extremely large, and finding the right prime numbers that are extremely large. So, it's easy to do some calculations in one direction, but not in other. So, for example, what are the factors of 55,124,159? Quick. But if I simply ask you what do you multiply 7,919 to get that 55 million number. That's easy. You do a division. And it turns out that you can find out 6961 really easy, right? So, if I just say what are these two numbers? That's hard. If I say given this number, what's the other number? That's trivial. So, you can think of this as, the decryption is where the receiver of the message knows kind of half of the calculation. Where as the world doesn't know either half of it. Doesn't know the calculation, so has to figure out both halves. Whereas the receiver only has to figure out one half. And so that's how asking the question of what are the factors, versus given one, what's the other. So it, it takes a problem that's easy, makes it computationally nearly impossible. But again, not impossible, just nearly impossible. Okay, so here's the notion. So, you're about to type your visa card into a credit card into like Amazon's web page. And so what happens is, is that Amazon will has a public key and a private key. That they retain. And they will send you the public key across a medium, the internet. They're going to send this to you somehow. But the bad guys, Eve, or Charlie or whoever they are. The bad guys. This is Alice and Bob. Eve and Charlie are always looking. So, Eve and Charlie could intercept it. And you assume that they can. This is the key. Don't, don't try to pretend they can't. Even though it's very difficult for them to do it. But you assume they can. So, the public key comes across. It is simply sent to you as part of the beginning of establishing a sur, secure connection. And the bad guys see it too, or girls. They see it too. So, the public key comes to you. And then what you do, is you encrypt, using that public key. And create some encrypted text, cipher text. Which you then send back across the danger, where Eve and Charlie are watching. And it comes across, they intercept the encrypted text. They've intercepted the public key. And they, they can try as hard as they like with supercomputers to derive this. And frankly, like I said, if they had months and months and months and really fast computers, they could. Okay, but because Amazon is in sole possession of private key and it never left Amazon servers. It is a very simple matter for Amazon to decrypt and get your plain text. It happens, very quickly. Just like if you kind of know half of the prime number calculation. Figuring out the other prime number is really, really easy. Okay, so, so again, these people see all of this information, and yet it's computationally virtually impossible, for all practical purposes, to do it. And so it's beautiful, because there was no need to protect the public key. We never had to get in the same room, and away it goes. So you just, Amazon just blasts out it's public key and we encrypt using Amazon's public key. We can't decrypt it but we don't need to decrypt it. All we need to do is send it to Amazon and voila, it works. So, the beautiful thing is the public keys can be distributed, they can be intercepted and it does not matter. So, with this notion of public private key encryption in general, we made a change to HTTP. A layer, a mini layer is in the data model. If you remember way back, perhaps you've even forgotten about the layered model. Remember that layered model? Application, transport, Internet. Remember this is sort of one computer and this is the other computer. These are their routers, routers. These are the hops. There's like 15 of these. Remember? Remember all this? So, it comes back now to haunt us. Okay. So, if you recall, just sort of to, to briefly remember that, the transport layer is responsible for the retransmission. It gives us the appearance of a reliable, ordered connection between the, our application and the far application. HTTP is one of the application protocols. and so there is a little mini layer that, that is layered in, sort of seamlessly on top of the transport layer. That basically takes plain text and encrypts it and turns it into ciphertext. And then ciphertext on the way out turns it back into plain text. Okay. And so what happens is, is these applications just send plain text. And out comes plain text. And there's a little bit of extra glue in the middle here, that's sort of a secure transport, secure sockets layer. This is, this thing here is often called a socket, oop, SOCK. It'd be good if I could spell socket. So, this is a socket, and then the red part is a secure socket. So, the applications kind of don't encrypt the data at all. There's a library that encrypts it. And the other thing is, is that all the rest of the internet, the internet, the link layer, the routers nothing, the Ethernet, the fiber. They don't even know the difference between encrypted text or non encrypted text, because the encrypted text wanders around fully encrypted. Addresses are not encrypted. And so it stays encrypted all the way through the entire network. It actually, if you then think about the fact that, this is the moment that it leaves your computer. The only thing, so the plain text comes in here, gets encrypted here, encrypted comes down. The only thing that leaves your computer is encrypted text. And it makes it all the way across. The encrypted text goes into Amazon, so this is Amazon. This is you. The encrypted text finds its way through all these things. And it come in encrypted. And it actually doesn't get encrypted, until it's sort of right at the point where Amazon's web server that's going to actually charge your credit card. So, this is actually beautifully elegant. In that, the rest of the network is blissfully unaware, that any encryption is happening. It's just moving the data. So, this did not require any change. Again the beauty of the layer of architecture. Did not require any change, sort of below the transport layer. And as a matter of fact, all of the sequencing and re-transmission that happens in the TCP layer. That happens with the encrypted stuff too because it's just encrypted. It's just text. It's gibberish text, it's not the original visa card number that you're sending. You're sending 123 and out comes, you know, wxy, the wxy just goes. It's re-transmitted. All this crap just works, it's like, beautiful. It's a beautiful thing. It's absolutely a beautiful thing. Then it's just like this mini layer kind of between, it's like the top slice of the transport layer. That's how I'm drawing it right here. It's like this little kind of top extra little thing, that says you know what, we're going to transport, actually help me out and give me some encryption while we're at it. And there's all kinds of cool stuff that goes back and forth. The public and private keys get exchanged. That's all kind of stuff we don't worry about. We just send data and get data back. Pretty cool, huh? So, this really solves the problem of the fact that we basically should assume, that everything between our computer and the destination computer. This is you. This is Amazon, right? Everything here, this is all dangerous. There's some like, terrifyingly scary individual, that's watching everything, doing packet sniffing. This might be Eve the eavesdropper. This looks like a little he, he looks pretty tough. Right. And, and so that even the wireless, right? This is the wiFi connection. The WiFi is dangerous. Now, the reality is, is these things aren't all that dangerous. The WiFi's probably the weakest link of these whole thing. But we have to assume that it's dangerous, right? We, we want to assume that the only thing that's safe, and unfortunately if you put viruses in your computer then they can get at the plain text. If Amazon loses its data somehow, then they get the plain text, right? But, but basically, you know, we want to distrust all of this, okay? So, this concept is called Transport Layer Security. Also called SSL. Also known as HTTPS. HTTPS for secure, and it's kind of like between the TCP layer and the application layer. Or the top half of the TCP layer is the way I like to think about it. It's because it's based on public private key encryption it's, difficult but not impossible. Normal people don't have the kind of equipment to break it, even governments, if they can break it, I don't know I'm not an expert. I don't hang out with the government, so I don't really know but, assume that if they really put their mind to it, in a very narrow situation. If you become really interesting, they will find your credit cards. Probably there's easier ways to get your credit cards, than by decrypting your text. So, it's hard to decrypt. And as I mentioned, because of the layered architecture, the TCP layer, IP and Link layers are completely unaware. So, you'll, you, you see this in the form of URLs that start with HTTPS. [SOUND]. Right, they start with HTTPS. Dub dub dub facebook.com versus HTTP. And there was a time a few years back where you know, they were, it used to be a little more expensive inside of the servers to do HTTPS. It still is. And so some sites would try to do some of their activity without using secure protocols. And others would use use, use non secure and secure and then flip you back and forth. Like if you're typing your password. The problem was is that there is actually still sensitive data being sent. Even across the insecure. And there was a, quite a famous, thing where people could install a Firefox plug in. And watch Facebook, non secure Facebook things go back and forth across, like, a Starbucks. And it would just show you all the people's Facebook accounts. And you could log in as them and post as them. And so you've seen a situation where companies are just starting to use HTTPS for everything. You, as a user, have to be aware to see if you're typing anything sensitive, never type it into a URL that doesn't say HTTPS. 'Kay? Never do that. You're typing in a password, a credit card number, any kind of personal information. Make sure you're doing HTTPS, and make sure that, that you know what it is. We'll talk about that in a bit, we'll talk a little bit more about that in a bit. So, so if we take a look and we think about where the bad guys are it, the bad guys are kind of everything. And this secure TCP runs, is the one part of the layer of our architecture that runs from within your laptop to within the server. And so, then, if we kind of assume the worst. The, the, the backbone is pretty safe. The wiFi is probably the most dangerous. Alright, but when we do secure system TCP, secure system to system TCP. We are doing the encryption, right here inside your computer before it leaves. And we're only doing the encryption right when it comes back in the computers. So, the decryption and encryption are happening inside of Amazon's computer and inside your computer. And nothing else. So, Secure Sockets is pretty good. Now the place where you're still in danger, is there might be a virus that's watching your keystrokes, right. This is why virus checking is so important. Because at some point, you're typing it into your computer and the, and the greatest danger you have to losing your data is really two things. One, that you've got a virus. Or B, somebody has redirected you not to talk to evil Amazon instead of Amazon. And that's what we'll talk about in the next lecture. How to know, how do these browsers really know they're talking to the real Amazon? And that is not confidentiality. Confidentiality is stopping the bad guy from seeing what you're sending. As they're eavesdropping. Eve is eavesdropping. Okay? Now, the next thing is the question of is this the real Amazon, or is this a fake Amazon? So, we'll talk about that next.