Hi, everybody. Ed Amoroso here, and in this video,

I want to tell you a little bit about a very special function that we

use in computer science and certainly in cyber security called the Hash Function.

Now, if you're a student of computer science or mathematics,

then you know that a function is central to just about everything we do.

Functions associate elements of a set we call

the domain with elements of a set we call the co-domain.

And mathematicians come up with all these different properties of functions and

different characteristics of how the domain maps to the co-domain.

None of that's all that important for

the concept I want to cover in this video which is known as a hash.

Now what a hash does,

the defining characteristic from the perspective of the cyber security analyst is,

it takes basically variable-length input.

Think about your email or something you write or a paper or a database,

the size of that is going to vary depending on what type of data

you're putting into the database or what type of

sentences you're typing or what you've included in your email.

Clearly variable, unpredictable.

But, the hash produces a fixed-length output and what you're

trying to do is make sure that the co-domain

is large enough that you're not going to have collisions.

In practice, it shouldn't be the case that two different inputs produce the same hash.

Obviously, mathematically, that's possible, since there are more possible inputs than outputs, but for the most part,

it's not practical to consider that as a possibility.

So you're going from variable-length input to fixed-length output. Really useful thing.
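
To make the variable-length-in, fixed-length-out idea concrete, here is a minimal Python sketch. SHA-256 is chosen only for illustration (the video does not name a specific hash function), and `fingerprint` is a made-up helper name.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

short_input = b"Hi"
long_input = b"A much longer message, like an email or a paper. " * 1000

# Inputs of very different sizes produce digests of the same size:
# a SHA-256 digest is always 256 bits, i.e. 64 hex characters.
print(len(short_input), "bytes in ->", len(fingerprint(short_input)), "hex chars out")
print(len(long_input), "bytes in ->", len(fingerprint(long_input)), "hex chars out")
```

With a 256-bit co-domain, collisions still exist in principle, but nobody is expected to stumble on one by accident.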

If you've ever used,

for example, like a checksum,

that's kind of a trivial example of a hash.

Albeit with a very small co-domain, where you're computing the parity of something,

adding up the variable-length input and saying whether it's odd or even.

In some sense, it's a hash but not a very secure one.
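
A rough Python sketch of that parity checksum shows why it is not secure. The function name `parity` and the sample messages are invented for illustration.

```python
def parity(data: bytes) -> int:
    """Add up the bytes and report 0 if the sum is even, 1 if it is odd."""
    return sum(data) % 2

original = b"pay Bob 10"
altered = b"pay Bob 12"   # differs from the original in one character

# The co-domain has only two values (0 and 1), so collisions are trivial:
# these two different messages have the same parity.
print(parity(original) == parity(altered))   # True
```

Any change that shifts the byte sum by an even amount goes unnoticed, which is exactly the kind of collision a large co-domain is meant to make impractical.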

So with hashing, we try and use that as a means for

doing something we refer to as a digital signature.

Now the beauty of a digital signature is that it allows us to electronically sign,

or authenticate, a document or content or a message, whatever it is you want to do.

So let's think it through. There are a number of different steps

that we often see in the context of digital signatures.

Let's imagine, as usual, Alice and Bob, right?

We always like to start with Alice and Bob,

and let's assume that they've gone through the process of generating

a public and secret key pair. Fair enough?

And Alice wants to generate some sort of message,

send it to Bob, and sign it.

Maybe she's signing a contract,

buying something from Bob.

It can be a million different reasons why

she might want to sign it and it doesn't even matter whether it's a secret message.

We could do that separately.

We would encrypt with the public key of the recipient if we wanted to make it secret.

But for digital signature,

we're going to do something different.

Now what we want to do, in some sense,

is come up with sort of a fingerprint of the message and that's where the hash comes in.

It allows us to take the message that Alice has created,

run a hash function on it,

and produce an output. By convention,

we usually refer to the output of the hash abstractly as Lambda.

So I run the hash function on a message m, and I produce Lambda.

Now, I assume that both Alice and Bob have

access to the same hash function or hash algorithm.

If they were using keys, cryptographic keys,

with the hash, we'd refer to that as a message authentication code.

So hash plus key, message authentication code.
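
Python's standard library provides a keyed hash as HMAC, a widely used message authentication code. A small sketch, assuming both parties already share a key; the key and message values here are made up.

```python
import hashlib
import hmac

shared_key = b"hypothetical-shared-key"   # assumption: both parties already hold this
message = b"I agree to buy the bike for $100."

# Keyed hash: the tag depends on both the key and the message.
tag = hmac.new(shared_key, message, hashlib.sha256).hexdigest()

# The receiver recomputes the tag with the same key and compares in constant time.
expected = hmac.new(shared_key, message, hashlib.sha256).hexdigest()
print(hmac.compare_digest(tag, expected))   # True

# Without the right key, the same message yields a different tag.
other = hmac.new(b"some-other-key", message, hashlib.sha256).hexdigest()
print(hmac.compare_digest(tag, other))      # False
```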

But for now, we just assume we're using a hash.

So Alice writes her message m, whatever it is.

She runs a hash function on it to produce Lambda.

And then what she will do, typically,

is encrypt Lambda using her secret key,

which only Alice has.

So yes, you can maybe create an envelope there.

Maybe she has her name,

the time, the Lambda,

a bunch of stuff, or just trivially,

usually we'll just say, her name and the Lambda.

And we encrypt that with the secret key of Alice,

send it off to Bob, right?

So in some sense, like a signed message now.

So when it gets over to Bob, what happens is,

we realize that Bob is going to want to somehow figure out,

where did this come from?

What's the authenticity?

So how the heck is he going to decrypt something that's signed by Alice?

Well, it's easy, because

the presumption is that everybody has everyone else's public key, right?

So if I send something signed with my secret key,

I send it to Bob, Bob just takes my public key,

decrypts it, and what happens then is he gets Lambda,

and he gets whatever else is in there,

the name, the message.

Definitely, the message would be sent.

It can either be encrypted along with Lambda or sent separately.

In our diagram here,

we show it as being encrypted inside the digital signature, but it doesn't have to be.

But what Bob will then do is take the message and apply

a local hash to it to produce a local Lambda, say Lambda prime.

He'll take the Lambda that Alice sent and

the Lambda prime generated locally, and compare them.

If they match, "Hello, Alice, it's you."

There's, in some sense,

proof that it was Alice who signed

the thing because Alice used her secret key to encrypt.

Bob used her public key to decrypt,

compared the two hashes,

boom, we're in pretty good shape. Does that make sense?
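
The steps above can be sketched in a few lines of Python. This is a toy under loud assumptions: it uses textbook RSA with tiny example primes (61 and 53) and reduces the hash to fit that small modulus, which no real system would do. The names `lam`, `sign`, and `verify` and the sample contract text are invented here, not taken from the video. Real signatures rely on vetted libraries, large keys, and padding schemes.

```python
import hashlib

# Toy RSA key pair for Alice (insecure, for illustration only).
p, q = 61, 53
n = p * q      # modulus, shared by both keys
e = 17         # Alice's public exponent
d = 2753       # Alice's secret exponent: (e * d) % ((p - 1) * (q - 1)) == 1

def lam(message: bytes) -> int:
    """Hash the message, then reduce it to fit the toy modulus."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") % n

def sign(message: bytes) -> int:
    """Alice: compute Lambda, then transform it with her secret key."""
    return pow(lam(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    """Bob: recover Lambda with Alice's public key and compare it
    with the Lambda prime he computes locally from the message."""
    recovered = pow(signature, e, n)
    return recovered == lam(message)

contract = b"Alice agrees to buy Bob's bike."
sig = sign(contract)
print(verify(contract, sig))   # True: Lambda matches Lambda prime
```

Because the toy modulus allows only 3,233 possible values, an altered message could match by chance; with a full-size hash and key that chance becomes negligible.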

It seems to me to be a very clever way to produce digital signatures,

because when I first started learning about this as a student,

I thought, what do you put in a digital signature?

Do you put your name? Do you sign it?

And then encrypt it? It seems dumb.

Like, what would I put in there? What kind of content?

"Hi, I'm Ed Amoroso."

Send it off encrypted.

It's kind of hokey content for a digital signature.

So the idea that you would use a hash function

provides a good, reasonable piece of content to embed in the signature,

but it gives you one more thing.

If the message m should happen to be

garbled on transmission from Alice to Bob across the network,

the hash test won't work, right?

So it gives you a little bit of integrity protection on top.

So it's kind of nice.
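
Here is a small Python sketch of that integrity check, again assuming SHA-256 as the hash; the messages are made up.

```python
import hashlib

sent = b"Transfer $100 to Bob"
received = b"Transfer $900 to Bob"   # one character garbled in transit

lam_sent = hashlib.sha256(sent).hexdigest()          # Lambda, computed by Alice
lam_received = hashlib.sha256(received).hexdigest()  # Lambda prime, computed by Bob

# The hash test fails, so Bob knows the message changed along the way.
print(lam_sent == lam_received)   # False
```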

Hashing enables different types of applications,

the two most obvious being digital signatures,

which provide authenticity, and

also some integrity protection, because

the hashes wouldn't match if the message got garbled.

So, as we often do,

here's a little quiz to test your understanding of hashing and digital signatures.

And the answer is B, spoofing.

I mean, that's the whole point.

So we're trying to make sure that we have some authenticity,

that we can authenticate,

that we can sign something,

send it to a recipient,

and that recipient would have confidence that in fact,

it came from the originator,

Alice, and not from someone merely claiming to be Alice.

So it really does reduce that spoofing risk.

I hope this has been useful and you'll see in

subsequent videos that hashing is going to be

the basis, or one of the base components, for something that we refer to as blockchain.

So I'll see you in one of the upcoming videos.