
In terms of designing ConvNet architectures, one of the ideas that really helps is using a one-by-one convolution. Now, you might be wondering, what does a one-by-one convolution do? Isn't that just multiplying by a number? That seems like a funny thing to do. It turns out it's not quite like that. Let's take a look.

So here you see a one-by-one filter, and we'll put the number two in there. If you take the six-by-six image, six by six by one, and convolve it with this one-by-one-by-one filter, you end up just taking the image and multiplying it by two. So one, two, three ends up being two, four, six, and so on. And so a convolution by a one-by-one filter doesn't seem particularly useful; you just multiply it by some number.
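To make that concrete, here's a minimal NumPy sketch (the array values are purely illustrative) showing that a one-by-one convolution over a single-channel image reduces to scalar multiplication:

```python
import numpy as np

# A 6x6, single-channel image; the values 1..36 are just for illustration.
image = np.arange(1, 37, dtype=float).reshape(6, 6)

# A 1x1x1 filter that holds the number two.
filt = 2.0

# Convolving with a 1x1 filter over one channel just scales every pixel.
output = image * filt

print(output[0, :3])  # [2. 4. 6.]
```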

But that's the case of a six-by-six-by-one-channel image. If you have a 6 by 6 by 32 volume instead of 6 by 6 by 1, then a convolution with a 1 by 1 filter can do something that makes much more sense.

In particular, what a one-by-one convolution will do is look at each of the 36 different positions here, take the element-wise product between the 32 numbers on the left and the 32 numbers in the filter, and then apply a ReLU non-linearity to it. So, to look at one of the 36 positions, maybe one slice through this volume, you take these 32 numbers, multiply them by one slice through the volume like that, and you end up with a single real number, which then gets plotted in one of the outputs like that.

In fact, one way to think about the 32 numbers you have in this 1 by 1 by 32 filter is that it's as if you have a neuron that takes as input 32 numbers, the numbers in one slice at the same height and width position across these 32 different channels, multiplies them by 32 weights, applies a ReLU non-linearity, and then outputs the corresponding value over there.

More generally, if you have not just one filter but multiple filters, then it's as if you have not just one unit but multiple units, each taking as input all the numbers in one slice, and building them up into an output of six by six by the number of filters. So one way to think about the one-by-one convolution is that it is basically a fully connected neural network that applies to each of the 36 different positions. And what that fully connected neural network does is take 32 numbers as input and produce "number of filters" outputs.
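As a rough sketch of this "fully connected network at each position" view, one could implement a one-by-one convolution as a single matrix product over the channel axis (the choice of 16 filters and the random data here are illustrative assumptions, not values from the lecture):

```python
import numpy as np

def conv_1x1(volume, filters):
    # volume: (h, w, n_c); filters: (n_c, n_filters), one column per 1x1 filter.
    # At each of the h*w positions, take a dot product across the n_c channels
    # and apply a ReLU: a tiny fully connected layer applied at every position.
    h, w, n_c = volume.shape
    out = volume.reshape(h * w, n_c) @ filters   # (h*w, n_filters)
    return np.maximum(out, 0.0).reshape(h, w, -1)

rng = np.random.default_rng(0)
volume = rng.standard_normal((6, 6, 32))   # the 6x6x32 input volume
filters = rng.standard_normal((32, 16))    # 16 filters, each 1x1x32

print(conv_1x1(volume, filters).shape)  # (6, 6, 16)
```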

So I guess, as a point on notation, the number of outputs is really n_c^[l+1], if that's the next layer. By doing this at each of the 36 positions, each of the six-by-six positions, you end up with an output that is six by six by the number of filters. And this can carry out a pretty non-trivial computation on your input volume.

This idea is often called a one-by-one convolution, but it's sometimes also called Network in Network, and it is described in this paper by Min Lin, Qiang Chen, and Shuicheng Yan. Even though the details of the architecture in this paper aren't used widely, this idea of a one-by-one convolution, or this sometimes so-called Network in Network idea, has been very influential and has influenced many other neural network architectures, including the Inception network, which we'll see in the next video.

But to give you an example of where a one-by-one convolution is useful, here's something you could do with it. Let's say you have a 28 by 28 by 192 volume. If you want to shrink the height and width, you can use a pooling layer, and we know how to do that. But what if the number of channels has gotten too big and we want to shrink that? How do you shrink it to a 28 by 28 by 32 dimensional volume?

Well, what you can do is use 32 filters that are one by one. Technically, each filter would be of dimension 1 by 1 by 192, because the number of channels in your filter has to match the number of channels in your input volume. But you use 32 filters, and the output of this process will be a 28 by 28 by 32 volume. So this gives you a way to shrink n_c as well, whereas pooling layers are used just to shrink n_H and n_W, the height and width of these volumes.
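As a quick illustrative sketch (random data; only the shapes come from the lecture's example), shrinking 28 by 28 by 192 down to 28 by 28 by 32 with 32 one-by-one filters can be written as a matrix product over the channel axis:

```python
import numpy as np

rng = np.random.default_rng(0)
volume = rng.standard_normal((28, 28, 192))  # 28x28x192 input volume

# 32 filters, each 1x1x192: the filter channel count must match the input's.
filters = rng.standard_normal((192, 32))

# A 1x1 convolution is a dot product across the 192 channels at each of the
# 28x28 positions, here followed by a ReLU.
h, w, n_c = volume.shape
shrunk = np.maximum(volume.reshape(h * w, n_c) @ filters, 0.0).reshape(h, w, 32)

print(shrunk.shape)  # (28, 28, 32): height and width unchanged, channels shrunk
```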

And we'll see later how this idea of one-by-one convolutions allows you to shrink the number of channels and therefore save on computation in some networks. But of course, if you want to keep the number of channels at 192, that's fine too. In that case, the effect of the one-by-one convolution is that it just adds non-linearity: it allows your network to learn a more complex function by adding another layer that inputs 28 by 28 by 192 and outputs 28 by 28 by 192.

So, that's how a one-by-one convolutional layer is actually doing something pretty non-trivial: it adds non-linearity to your neural network and allows you to decrease, keep the same, or, if you want, increase the number of channels in your volumes. Next, you'll see that this is actually very useful for building the Inception network. Let's go on to that in the next video.
