You have learned a lot about ConvNets, everything ranging from

the architecture of the ConvNet to how to use it for image recognition,

to object detection, to face recognition and neural style transfer.

And even though most of the discussion has focused on images,

on sort of 2D data, because images are so pervasive,

it turns out that many of the ideas you've learned about also apply,

not just to 2D images but also to 1D data as well as to 3D data.

Let's take a look.

In the first week of this course, you learned about the 2D convolution,

where you might input a 14 x 14 image and convolve that with a 5 x 5 filter.

And you saw how 14 x 14 convolved with 5 x 5

gives you a 10 x 10 output.

And if you have multiple channels, maybe the input is 14 x 14 x 3,

then the filter would be 5 x 5 x 3, so that it matches the same 3.

And then if you have multiple filters, say 16 filters, you end up with 10 x 10 x 16.
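
To make those shapes concrete, here is a minimal NumPy sketch, not code from the course. It assumes no padding and a stride of 1, and the names conv2d_valid, image, and filters are made up for illustration. Like most deep learning code, it slides the filter without flipping it.

```python
import numpy as np

def conv2d_valid(image, filters):
    """No-padding, stride-1 2D convolution (cross-correlation, no filter flip).

    image:   (H, W, C)        input with C channels
    filters: (f, f, C, K)     K filters, each with C channels
    returns: (H-f+1, W-f+1, K)
    """
    H, W, C = image.shape
    f_h, f_w, f_c, K = filters.shape
    assert C == f_c, "filter channels must match input channels"
    out = np.zeros((H - f_h + 1, W - f_w + 1, K))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + f_h, j:j + f_w, :]  # (f, f, C) window
            # Sum over height, width, and channels for all K filters at once.
            out[i, j, :] = np.tensordot(patch, filters, axes=([0, 1, 2], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((14, 14, 3))      # random stand-in for a 14 x 14 x 3 image
filters = rng.standard_normal((5, 5, 3, 16))  # 16 random filters of size 5 x 5 x 3
print(conv2d_valid(image, filters).shape)     # (10, 10, 16)
```

Each spatial dimension shrinks from 14 to 14 - 5 + 1 = 10, and the last dimension of the output is the number of filters.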

It turns out that a similar idea can be applied to 1D data as well.

For example, on the left is an EKG signal, also called an electrocardiogram.

Basically, if you place an electrode over your chest, this measures

the little voltages that vary across your chest as your heart beats,

because the little electric waves generated by your heart's beating can be

measured with a pair of electrodes.

And so this is an EKG of someone's heart beating.

And so each of these peaks corresponds to one heartbeat.

So if you want to use EKG signals to make medical diagnoses, for

example, then you would have 1D data, because what EKG data is

is a time series showing the voltage at each instant in time.

So rather than a 14 x 14 dimensional input,

maybe you just have a 14 dimensional input.

And in that case, you might want to convolve this with a 1 dimensional filter.

So rather than the 5 x 5, you just have a 5 dimensional filter.

So with 2D data, what a convolution allowed you to do was to take the same 5 x 5

feature detector and apply it at different positions throughout the image.

And that's how you wound up with your 10 x 10 output.

What a 1D filter allows you to do is take your 5 dimensional filter and

similarly apply that in lots of different positions throughout this 1D signal.

And so if you apply this convolution,

what you find is that a 14 dimensional thing convolved with

this 5 dimensional thing would give you a 10 dimensional output.
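
As a rough illustration of that 1D case, here is a small NumPy sketch. It is only an assumption about how you might write it: the helper name conv1d_valid is made up, the signal is a stand-in rather than real EKG data, and the filter is a simple moving average chosen so the result is easy to check by hand.

```python
import numpy as np

def conv1d_valid(signal, kernel):
    """Slide a length-f filter over a length-n signal (no padding, stride 1)."""
    n, f = len(signal), len(kernel)
    # One dot product per position; there are n - f + 1 positions.
    return np.array([np.dot(signal[i:i + f], kernel) for i in range(n - f + 1)])

signal = np.arange(14, dtype=float)  # stand-in for 14 voltage samples
kernel = np.ones(5) / 5.0            # hypothetical filter: 5-sample moving average
out = conv1d_valid(signal, kernel)
print(out.shape)  # (10,)

# NumPy's built-in sliding dot product gives the same result.
assert np.allclose(out, np.correlate(signal, kernel, mode="valid"))
```

The 14 samples and the 5-wide filter leave 14 - 5 + 1 = 10 positions, which is where the 10 dimensional output comes from.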

And again, you can have multiple channels. In this case you

might use just 1 channel, if you have 1 lead or 1 electrode for EKG, so the input is 14 x 1 and the filter is 5 x 1.

And if you have 16 filters, you end up with 10 x 16 over there,

and this could be one layer of your ConvNet.

And then for the next layer of your ConvNet, you input this 10 x 16

dimensional input, and you might convolve that with a 5 dimensional filter again.

This input has 16 channels, so the filter's channels have to match.

And if you have 32 filters, then the output of this layer

would be 6 x 32, right?
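
Here is one way those two 1D layers could be sketched in NumPy. This is an illustrative assumption rather than the course's code: conv1d_layer is a made-up helper, the weights are random, and biases and nonlinearities are left out so that only the shapes are in focus.

```python
import numpy as np

def conv1d_layer(x, w):
    """x: (T, C_in) signal, w: (f, C_in, C_out) filters -> (T-f+1, C_out)."""
    T, c_in = x.shape
    f, w_in, c_out = w.shape
    assert c_in == w_in, "filter channels must match input channels"
    out = np.zeros((T - f + 1, c_out))
    for t in range(T - f + 1):
        # Sum over the time window and the input channels for every filter.
        out[t] = np.tensordot(x[t:t + f], w, axes=([0, 1], [0, 1]))
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((14, 1))        # 14 time steps, 1 channel (one EKG lead)
w1 = rng.standard_normal((5, 1, 16))    # 16 filters of size 5 x 1
w2 = rng.standard_normal((5, 16, 32))   # 32 filters of size 5 x 16

h1 = conv1d_layer(x, w1)
h2 = conv1d_layer(h1, w2)
print(h1.shape, h2.shape)  # (10, 16) (6, 32)
```

Note how the channel count of each filter bank (1, then 16) matches the channel count of the input it is applied to.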

And the analogy to the 2D data is that

this is similar to taking the 10 x 10 x 16 data and

convolving it with a 5 x 5 x 16 filter, where the 16 has to match.

That will give you a 6 x 6 output,

and you have 32 filters, so that's where the 32 comes from.

So all of these ideas apply also to 1D data, where you can have the same

feature detector, such as this, applied to a variety of positions,

for example, to detect the different heartbeats in an EKG signal,

using the same set of features to detect the heartbeats even at different

positions along the time series. And so ConvNets can be used even on 1D data.
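
A toy example may help show what it means for the same feature detector to fire at different positions. Everything here is made up for illustration: the "heartbeat" is a small triangular bump, the signal is synthetic, and the filter is simply set equal to the bump instead of being learned.

```python
import numpy as np

# Hypothetical "heartbeat" shape: a small triangular bump.
template = np.array([0.0, 1.0, 2.0, 1.0, 0.0])

# Synthetic signal: flat baseline with the same bump at two positions.
signal = np.zeros(30)
signal[4:9] += template
signal[20:25] += template

# Slide the one filter over the whole signal (no padding, stride 1).
response = np.correlate(signal, template, mode="valid")

# The response is largest exactly where the bump starts, at both locations.
positions = np.flatnonzero(response == response.max()).tolist()
print(positions)  # [4, 20]
```

One set of 5 weights finds both bumps, which is the parameter sharing idea carried over from images to a time series.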

For a lot of 1D data applications, you would actually use a recurrent neural network,

which you will learn about in the next course.

But some people also try using ConvNets on these problems.

And in the next course on sequence models, in which we will talk about

recurrent neural networks and LSTMs and other models like that,

we'll talk about the pros and cons of using 1D ConvNets versus some of those

other models that are explicitly designed for sequence data.

So that's the generalization from 2D to 1D.

How about 3D data?

Well, what is three dimensional data?

It is that, instead of having a 1D list of numbers or a 2D matrix of numbers,

you now have a 3D block, a three dimensional input volume of numbers.

So here's an example of that: if you take a CT scan,

this is a type of X-ray scan that gives a three dimensional model of your body.

What a CT scan does is take different slices through your body.

So as you scan through a CT scan, which I'm doing here,

you can look at different slices of the human torso to see how they look, and

so this data is fundamentally three dimensional.

And one way to think of this data is that your data now has some height,

some width, and then also some depth,

where the different slices through this volume

are the different slices through the torso.

So if you want to apply a ConvNet to detect features in this

three dimensional CAT scan or CT scan, then you can generalize the ideas from

the first slide to three dimensional convolutions as well.

So say you have a 3D volume, and for

the sake of simplicity let's say it is 14 x 14 x 14, and

so this is the height, width, and depth of the input CT scan.

And again, just like images don't all have to be square,

a 3D volume doesn't have to be a perfect cube either.

So the height and width of an image can be different, and

in the same way the height and width and the depth of a CT scan can be different.

But I'm just using 14 x 14 x 14 here to simplify the discussion.

And if you convolve this with a 5 x 5 x 5 filter,

so your filters now are also three dimensional,

then this would give you a 10 x 10 x 10 volume.
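
The same sliding-window idea in three dimensions can be sketched as follows. This is a deliberately naive NumPy version under the assumptions of no padding and stride 1. The name conv3d_valid is made up, and the all-ones volume and filter are stand-ins so the result is easy to verify.

```python
import numpy as np

def conv3d_valid(volume, kernel):
    """Slide a 3D filter over a 3D volume (no padding, stride 1, no flip)."""
    H, W, D = volume.shape
    f_h, f_w, f_d = kernel.shape
    out = np.zeros((H - f_h + 1, W - f_w + 1, D - f_d + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                block = volume[i:i + f_h, j:j + f_w, k:k + f_d]
                out[i, j, k] = np.sum(block * kernel)  # elementwise product, then sum
    return out

volume = np.ones((14, 14, 14))  # stand-in for a 14 x 14 x 14 CT volume
kernel = np.ones((5, 5, 5))     # stand-in for a 5 x 5 x 5 filter
out = conv3d_valid(volume, kernel)
print(out.shape)     # (10, 10, 10)
print(out[0, 0, 0])  # 125.0, the sum over a 5 x 5 x 5 block of ones

# The volume does not have to be a cube.
print(conv3d_valid(np.ones((14, 12, 8)), kernel).shape)  # (10, 8, 4)
```

Three nested Python loops are slow, so this is only meant to show the arithmetic, not how a real framework would implement it.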

And technically, you could also write this as 14 x 14 x 14 x 1, where the 1 is the number of channels.

So this is just a 3D volume, but your data can also have different

numbers of channels, and then the filter would be 5 x 5 x 5 x 1 as well,

because the number of channels here and the number of channels here have to match.

And then if you have 16 filters that are 5 x 5 x 5 x 1, then the output

will be 10 x 10 x 10 x 16.

So this could be one layer of your ConvNet over 3D data, and the next

layer of the ConvNet might convolve this again with a 5 x 5 x 5 x 16 dimensional filter.

So this number of channels has to match, as usual, and

if you have 32 filters, then similar to what you saw with ConvNets on images,

you'll end up with a 6 x 6 x 6 volume across 32 channels.
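
The shape bookkeeping is the same in 1D, 2D, and 3D, so it can be captured in one small helper. This is a plain Python sketch with a made-up function name, conv_layer_shapes, and it assumes no padding and a stride of 1 throughout.

```python
def conv_layer_shapes(input_shape, f, num_filters):
    """Return (filter_shape, output_shape) for one conv layer.

    input_shape: spatial dimensions followed by the channel count,
                 e.g. (14, 1), (14, 14, 3), or (14, 14, 14, 1).
    """
    *spatial, channels = input_shape
    # Each filter spans f along every spatial axis and all input channels.
    filter_shape = (f,) * len(spatial) + (channels,)
    # Every spatial axis shrinks to n - f + 1; channels become num_filters.
    output_shape = tuple(n - f + 1 for n in spatial) + (num_filters,)
    return filter_shape, output_shape

shape = (14, 14, 14, 1)
for num_filters in (16, 32):
    filter_shape, shape = conv_layer_shapes(shape, 5, num_filters)
    print(filter_shape, "->", shape)
# (5, 5, 5, 1) -> (10, 10, 10, 16)
# (5, 5, 5, 16) -> (6, 6, 6, 32)
```

Calling the same helper with (14, 1) or (14, 14, 3) as the input shape gives the 1D and 2D shapes discussed earlier.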

So 3D data can also be learned on

directly, using a three dimensional ConvNet.

And what these filters do is really detect features across your 3D data,

Â