0:00

The Triplet Loss is one good way to learn

the parameters of a ConvNet for face recognition.

There's another way to learn these parameters.

Let me show you how face recognition can also be posed

as a straight binary classification problem.

Another way to train a neural network

is to take this pair of neural networks, to take

this Siamese Network, and have them both compute these embeddings,

maybe 128-dimensional embeddings,

maybe even higher dimensional,

and then have these be input to

a logistic regression unit to then just make a prediction,

where the target output will be one if both of these are of the same person,

and zero if both of these are of different persons.

So, this is a way to treat face recognition just as a binary classification problem,

and this is an alternative to the triplet loss for training a system like this.

Now, what does this final logistic regression unit actually do?

The output y hat will be a sigmoid function

applied to some set of features, but rather than just feeding in

these encodings, what you can do is take the differences between the encodings.

So, let me show you what I mean.

Let's say I write a sum over k equals 1 to 128 of the absolute value

of f(x_i)_k minus f(x_j)_k, taken element-wise between the two different encodings.

Let me just finish writing this out and then we'll see what this means.

In this notation, f(x_i) is the encoding of the image x_i,

and the subscript k means to just select out the k-th component of this vector.

This is taking the element-wise difference in absolute values between these two encodings.

And what you might do is think of these 128 numbers

as features that you then feed into logistic regression.

And you'll find that logistic regression can have additional parameters w_k

and b, similar to a normal logistic regression unit.

And you would train appropriate weightings on these 128 features in

order to predict whether or not

these two images are of the same person or of different persons.

So, this will be one pretty useful way to

learn to predict zero or one, whether these are the same person or different persons.
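As a concrete sketch (not from the video; NumPy, with f_i and f_j standing in for the two 128-dimensional encodings from the Siamese network, and w, b for the learned logistic regression parameters), the unit described above could look like:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def same_person_prob(f_i, f_j, w, b):
    """y_hat = sigmoid(sum over k of w_k * |f(x_i)_k - f(x_j)_k| + b).

    f_i, f_j: 128-dimensional encodings of the two face images.
    Returns a value in (0, 1); near 1 means "same person".
    """
    features = np.abs(f_i - f_j)          # element-wise absolute differences
    return sigmoid(np.dot(w, features) + b)
```

Note that for identical encodings all 128 features are zero, so the prediction reduces to sigmoid(b).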

And there are a few other variations on how you can

compute this formula that I had underlined in green.

For example, another formula could be f(x_i)_k minus f(x_j)_k,

squared, divided by f(x_i)_k

plus f(x_j)_k. This is sometimes called the chi-square form.

Chi is this Greek letter, χ.

But this is sometimes called a chi-square similarity.

And this and other variations are explored in the DeepFace paper,

which I referenced earlier as well.
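A minimal sketch of that chi-square feature in NumPy; the small eps term is my addition to avoid division by zero, and the formula implicitly assumes non-negative encodings (e.g. outputs of a ReLU layer):

```python
import numpy as np

def chi_square_features(f_i, f_j, eps=1e-8):
    """Element-wise chi-square features:
    (f(x_i)_k - f(x_j)_k)^2 / (f(x_i)_k + f(x_j)_k + eps).

    These 128 numbers would replace the absolute differences as
    inputs to the logistic regression unit.
    """
    return (f_i - f_j) ** 2 / (f_i + f_j + eps)
```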

So in this learning formulation,

the input is a pair of images,

so this is really your training input x, and the output y

is either zero or one, depending on whether you're inputting

a pair of similar or dissimilar images.

And same as before,

you're training a Siamese Network, so that means that

this neural network up here has parameters that are

really tied to the parameters in this lower neural network.

And this system can work pretty well as well.

Lastly, just to mention

one computational trick that can help a deployment significantly, which is that,

if this is the new image,

so this is an employee walking in, hoping that the turnstile,

the doorway, will open for them, and this is an image from your database.

Then instead of having to compute

this embedding every single time,

what you can do is actually pre-compute that.

So, when the new employee walks in,

what you can do is use this upper component to compute that encoding, and use it,

then compare it to

your pre-computed encoding, and then use that to make a prediction y hat.

Because you don't need to store the raw images, and

also because, if you have a very large database of employees,

you don't need to compute these encodings every single time for every employee in the database.

This idea of pre-computing

some of these encodings can save a significant computation.
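The pre-computation trick could be sketched like this (hypothetical names; model stands in for the trained Siamese ConvNet, and w, b for the logistic regression parameters from before):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def build_database(enrolled_images, model):
    """Run the ConvNet ONCE per enrolled employee, offline.

    Only the 128-d encodings are stored, not the raw images.
    """
    return {name: model(img) for name, img in enrolled_images.items()}

def verify_at_turnstile(new_image, claimed_name, database, model, w, b):
    """At query time there is only ONE forward pass (for the live image);
    the database encoding is simply looked up, not recomputed."""
    f_new = model(new_image)          # encode the live image
    f_db = database[claimed_name]     # pre-computed encoding
    features = np.abs(f_new - f_db)
    return sigmoid(np.dot(w, features) + b)
```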

And this type of pre-computation works both for this type of

Siamese network architecture, where you

treat face recognition as a binary classification problem,

as well as when you were learning encodings, maybe using

the Triplet Loss function, as described in the last couple of videos.

And so, just to wrap up,

to treat face verification as supervised learning,

you create a training set of just pairs of images now, instead

of triplets of images, where the target label is one

when these are a pair of pictures of the same person, and where the target label is zero

when these are pictures of different persons, and you use

different pairs to train

the neural network, to train the Siamese network, using back propagation.
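Training on those labeled pairs would minimize the usual logistic loss. A minimal sketch of the per-pair binary cross-entropy, assuming y_hat comes from the sigmoid unit described earlier:

```python
import numpy as np

def bce_loss(y_hat, y):
    """Binary cross-entropy for one labeled pair:
    y = 1 for a same-person pair, y = 0 for a different-person pair."""
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
```

Back propagation through this loss updates both the logistic regression parameters and the (tied) ConvNet weights.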

So, this version that you just saw, of treating face verification,

and by extension face recognition, as a binary classification problem,

this works quite well as well.

So, that's it. I hope that you now know

what it would take to train

your own face verification or your own face recognition system, one that can do one-shot learning.
Â