0:02

You've seen how the convolution operation allows you to implement a vertical edge detector. In this video, you'll learn the difference between positive and negative edges, that is, the difference between light-to-dark versus dark-to-light edge transitions. You'll also see other types of edge detectors, as well as how to have an algorithm learn an edge detector, rather than hand-coding one as we've been doing so far. So let's get started.

0:47

What happens in an image where the colors are flipped, where it is darker on the left and brighter on the right? So the 10s are now on the right half of the image and the 0s on the left. If you convolve it with the same edge-detection filter, you end up with -30s instead of 30s down the middle, and you can plot that as a picture that maybe looks like that. Because the shading of the transition is reversed, the 30s get reversed as well, and the -30s show that this is a dark-to-light rather than a light-to-dark transition. If you don't care which of these two cases it is, you could take absolute values of this output matrix. But this particular filter does distinguish between light-to-dark and dark-to-light edges.
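As a sketch of what's described here (my own example, not code from the video), here is the flipped image convolved with the same vertical edge filter, assuming the "valid" convolution without kernel flipping that these videos use:

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' cross-correlation (no kernel flip), as convolution
    is defined in these videos."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Flipped image: dark (0s) on the left, bright (10s) on the right.
image = np.zeros((6, 6))
image[:, 3:] = 10

vertical = np.array([[1, 0, -1],
                     [1, 0, -1],
                     [1, 0, -1]])

out = conv2d(image, vertical)
# The two middle columns of `out` are now -30 instead of 30: a negative
# (dark-to-light) edge. Taking np.abs(out) discards the edge polarity.
```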

1:42

Let's see some more examples of edge detection. This three-by-three filter we've seen allows you to detect vertical edges, so maybe it should not surprise you too much that this three-by-three filter will allow you to detect horizontal edges. As a reminder, a vertical edge, according to this filter, is a three-by-three region where the pixels are relatively bright on the left and relatively dark on the right. Similarly, a horizontal edge is a three-by-three region where the pixels are relatively bright in the top row and relatively dark in the bottom row. So here's one example, a more complex one, where you have 10s in the upper-left and lower-right corners. If you draw this as an image, it would be darker where there are 0s, so I'm going to shade in the darker regions, and lighter in the upper-left and lower-right corners. And if you convolve this with a horizontal edge detector, you end up with this.

2:48

Just to take a couple of examples, this 30 here corresponds to this three-by-three region, where indeed there are bright pixels on top and darker pixels on the bottom; it's kind of over here, and so the filter finds a strong positive edge there. And this -30 here corresponds to this region, which is actually brighter on the bottom and darker on top, so that is a negative edge in this example. Again, this is kind of an artifact of the fact that we're working with relatively small images; this is just a six-by-six image. The intermediate values, like this -10, for example, just reflect the fact that the filter there captures part of the positive edge on the left and part of the negative edge on the right, and so blending those together gives you some intermediate value. But if this were a very large image, say a thousand by a thousand, with this type of checkerboard pattern, then you wouldn't see these transition regions of 10s; the intermediate values would be quite small relative to the size of the image.
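The example can be reproduced in a few lines (my own sketch, not the lecture's code): the six-by-six image with 10s in the upper-left and lower-right corners, convolved with the horizontal edge filter.

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' cross-correlation, as convolution is used in these videos."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    return np.array([[np.sum(image[i:i + kh, j:j + kw] * kernel)
                      for j in range(ow)] for i in range(oh)])

# 10s in the upper-left and lower-right 3x3 corners, 0s elsewhere.
image = np.zeros((6, 6))
image[:3, :3] = 10
image[3:, 3:] = 10

horizontal = np.array([[ 1,  1,  1],
                       [ 0,  0,  0],
                       [-1, -1, -1]])

out = conv2d(image, horizontal)
print(out)
# [[  0.   0.   0.   0.]
#  [ 30.  10. -10. -30.]
#  [ 30.  10. -10. -30.]
#  [  0.   0.   0.   0.]]
```

The 30s are strong positive (light-to-dark) edges, the -30s negative edges, and the intermediate ±10s come from windows that straddle both.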

So in summary, different filters allow you to find vertical and horizontal edges. It turns out that the three-by-three vertical edge-detection filter we've used is just one possible choice, and historically, in the computer vision literature, there was a fair amount of debate about the best set of numbers to use. So here's something else you could use: 1, 2, 1 down the first column, 0, 0, 0 down the second, and -1, -2, -1 down the third. This is called a Sobel filter, and its advantage is that it puts a little bit more weight on the central row, the central pixel, which makes it maybe a little bit more robust. But computer vision researchers will use other sets of numbers as well: maybe instead of 1, 2, 1 it should be 3, 10, 3, and then -3, -10, -3. This is called a Scharr filter, and it has yet other, slightly different properties. And this is just for vertical edge detection; if you rotate it 90 degrees, you get a horizontal edge detector.
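Written out as matrices (the lecture reads the numbers off column by column), the vertical-edge Sobel and Scharr filters might be defined like this:

```python
import numpy as np

# Sobel filter for vertical edges: like the basic 1/0/-1 filter,
# but with extra weight on the central row.
sobel_v = np.array([[1, 0, -1],
                    [2, 0, -2],
                    [1, 0, -1]])

# Scharr filter for vertical edges: same pattern, with weights 3 and 10.
scharr_v = np.array([[ 3, 0,  -3],
                     [10, 0, -10],
                     [ 3, 0,  -3]])

# Rotating a vertical-edge filter 90 degrees gives a horizontal-edge filter.
sobel_h = np.rot90(sobel_v)
```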

And with the rise of deep learning, one of the things we learned is that when you really want to detect edges in some complicated image, maybe you don't need computer vision researchers to handpick these nine numbers. Maybe you can just learn them: treat the nine numbers of this matrix as parameters, which you can then learn using backpropagation. The goal is to learn nine parameters so that when you take the six-by-six image and convolve it with your three-by-three filter, this gives you a good edge detector.

5:50

What you'll see in later videos is that by just treating these nine numbers as parameters, backprop can choose to learn 1, 1, 1, 0, 0, 0, -1, -1, -1 if it wants, or learn the Sobel filter or the Scharr filter, or, more likely, learn something else that's even better at capturing the statistics of your data than any of these hand-coded filters.
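As a toy illustration of this idea (my own sketch, not code from the course), plain gradient descent on the nine numbers can recover a hand-designed filter when the training targets are produced by that filter; inside a network, backprop learns the filter in exactly the same spirit:

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' cross-correlation, as in the lecture."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    return np.array([[np.sum(image[i:i + kh, j:j + kw] * kernel)
                      for j in range(ow)] for i in range(oh)])

rng = np.random.default_rng(0)
target_filter = np.array([[1., 0., -1.],
                          [1., 0., -1.],
                          [1., 0., -1.]])
kernel = rng.normal(size=(3, 3))  # the nine learnable parameters

lr = 0.005
for step in range(1000):
    img = rng.normal(size=(6, 6))
    err = conv2d(img, kernel) - conv2d(img, target_filter)
    # Gradient of 0.5 * ||err||^2 w.r.t. the kernel: each output error
    # is distributed back onto the 3x3 patch that produced it.
    grad = np.zeros_like(kernel)
    for i in range(err.shape[0]):
        for j in range(err.shape[1]):
            grad += err[i, j] * img[i:i + 3, j:j + 3]
    kernel -= lr * grad

# `kernel` ends up very close to the hand-designed vertical edge filter.
```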

And rather than just vertical and horizontal edges, maybe it can learn to detect edges at 45 degrees or 70 degrees or 73 degrees or at whatever orientation it chooses. So by just letting all of these numbers be parameters and learning them automatically from data, we find that neural networks can actually learn low-level features, features such as edges, even more robustly than computer vision researchers are generally able to code up by hand. But underlying all these computations is still this convolution operation, which allows backpropagation to learn whatever three-by-three filter it wants and then apply it throughout the entire image, at this position, at this position, at this position, in order to output whatever feature it's trying to detect, be it vertical edges, horizontal edges, edges at some other angle, or even some other filter that we might not even have a name for in English.

7:19

So the idea that you can treat these nine numbers as parameters to be learned has been one of the most powerful ideas in computer vision. Later in this course, later this week, we'll talk about the details of how you actually go about using backpropagation to learn these nine numbers. But first, let's talk about some other details, some other variations, on the basic convolution operation. In the next two videos, I want to discuss how to use padding as well as different strides for convolutions. These two will become important pieces of the convolutional building block of convolutional neural networks. So let's go on to the next video.
