Let's talk about the hyper-parameters of convolutional layers. One hyper-parameter is the filter size. With a three-by-three filter, the filter overlaps a patch of the image, we calculate the convolution, and we assign the result to the middle pixel. A five-by-five filter is a little bigger: we calculate the convolution according to the 2D convolution rule and again assign the output value to the middle pixel. As you scan the filter through the image, the feature map covers only the shaded region, so you lose the outermost ring of pixels. With a larger filter you lose more of that outer perimeter: for a five-by-five filter you lose two lines on each side, because the filter can only write values to middle pixels that sit at least two pixels from the border. The final shaded region is your feature map.

There's something called padding. As we just saw, a three-by-three filter makes you lose one line, the outer perimeter, so your feature map shrinks by one pixel on each side. If you don't want that, you can add one extra layer of pixels around each side and run the convolution on the padded image; then your feature map ends up the same size as the original input image. That's padding, and it's a design consideration along with the filter size.

Another design consideration is the stride. So far our convolution filter shifted only one pixel at a time to the right: we calculate a value, slide one pixel, calculate the next, slide again, and so on. That was stride equals 1, but you don't have to stick with that. If you want to make your feature map aggressively smaller, you can use a stride of two or more.
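The shrinking and padding behavior described above follows the standard output-size formula for convolutions. Here is a minimal sketch (the function name `conv_output_size` is just for illustration):

```python
def conv_output_size(n, f, p=0, s=1):
    """Spatial size of a conv output: floor((n - f + 2p) / s) + 1,
    for an n-by-n input, f-by-f filter, padding p, stride s."""
    return (n - f + 2 * p) // s + 1

# 3x3 filter on a 7x7 image, no padding: lose one pixel per side -> 5x5
print(conv_output_size(7, 3))       # 5
# 5x5 filter: lose two pixels per side -> 3x3
print(conv_output_size(7, 5))       # 3
# Padding by one layer restores the original size for a 3x3 filter
print(conv_output_size(7, 3, p=1))  # 7
```

Setting p = (f - 1) / 2 for an odd filter size f is exactly the "same" padding described above.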
In this case the stride is two: the filter calculates a value here, then jumps two pixels instead of one, calculates the next value, jumps two pixels again, and so on. At the end of the day, when you collect all the calculated values, your feature map is much smaller. Roughly, an n-by-n image gives you about an n/2-by-n/2 feature map, though the exact size also depends on your filter size. I'm assuming a three-by-three filter here, but it could differ with a different filter, and whether you have an even or odd number of pixels also changes the final dimension slightly. But roughly, the dimension halves.

Another important hyper-parameter is the number of filters. If you have that many filters, the depth of your feature map is that many. These weight matrices and filters are all synonyms: weights, filters, and, in some literature, kernels. They all mean the same thing in deep learning and convolutional neural network terminology.

Anyway, filter size, number of filters, padding, and stride are hyper-parameters. That means they are your design choices. However, the values inside each filter are not hyper-parameters. They are determined by optimization, after training your neural network model: training updates each element inside the filter. That's why a filter is also called the weights, because each element inside it is an individual weight.