0:00

Hello and welcome back, I want to talk now about a half transform.

The half transform is a really nice technique to detect objects, to segment

out objects, for which you know their shape.

Maybe a straight line, maybe a circle, as we have seen in the previous example.

For example we want to detect the ball. We know it's circular.

So can we include that information in our detection, in our segmentation technique,

to detect that it's actually circular? Can we use the information that we're

expecting here as a straight line, and basically get a straight line.

So the half transform allows to do that, and in a very simple and really, really

nice fashion. Let's illustrate that.

The basic idea is quite simple, and we're going to start with ideas on how to

detect straight lines, and then we're going to move to circles.

But starting with straight lines makes it much, much easier to understand.

In school I mean, probably in elementary or middle school you have learned how to

represent a straight line that is on a plane.

And I'm going to write down the equations and explain them to you again.

So here we have the plane. This is a image plane, the x y.

This is the image plane and we have points on a straight line on the plane.

For example this straight line might be this one or this one.

Now a straight line on a plane is represented, can be written down in the

following form. We have rho, and I'm going to explain

each one of these symbols in a second, is X cosine theta plus Y sine theta.

So rho is the distance from the coordinate system's center to the

straight line, and this should be perpendicular, of course.

It's a distance. Might be a bit tilted in the video, but

it should be, basically, a 90 degrees angle.

So rho is the distance to the straight line, and theta is this angle here that

basically is the angle with one of the axes. Let's pick this one in this case

with the line connecting test center with the straight line that we are

considering. So basically, every point that is on the

line holds this equation. Every single point, this one and this one

and every point in the line will hold this equation.

Using that, here is the idea of the half transform.

We take a point on the image and that comes from the edge detection.

So basically every point that the edge detector said, there is an edge going

through. For every point, we basically have an

infinite number of lines going through it.

This is one of them, with a particular rho and a particular theta.

Now, let's draw a different one. This line has its own rho and its own

theta. This will be rho, this will be theta.

We can go and draw a different line, all the lines going through the point.

See, here is another one. It has its own rho, here, and its own

theta. And we can keep doing that, we can pick

yet another one. For example, let us pick this one.

And, he had rope. And it has theta.

So, every point has an infinite number of lines.

And basically, every point will go and say, any of those lines is okay with me.

The point will go and vote for. Ro.

And theta for a series of them. It should be voting for an infinite

number but we're going to discrotize we're going to do this on the computer so

we won't be able to vote for an infinite number.

We're going to vote for discreet space of row and feta.

Now that's only one point and basically every possible line that goes through

that point will get one vote. Okay.

Now what happens with this other point? This other point has it's own lines that

go through it. So it's going to go and vote for it's own

lines. It's going to vote for this one of

course. But it's also going to vote, for example,

for this one. That has its own rho and its own theta.

And it's going to vote, and I don't want to draw too many, because it will make

the picture very unclear. But it's going to vote for another line

here and another line here and so on. So it's basically going to vote also for

a series of rho and theta. And all the votes will get one vote.

But look what happened. This line is common to both of them.

And we know that in Euclidean space through two points, there is a single

line. So there is only one line.

The one that we are looking for, that's going to get two votes.

Basically, and that will help us to understand what's the row and the theta

for this line. But of course not only two vaults,

because there's going to be another point here, that's also going to vote for a

series of rho and theta, but is going to vote for this one.

So this one is now getting three votes, one for each point that the edge detector

found, that is on that line.

now of course the edge detector might find points that are outside of that

line, and those will also make their own votes, but hopefully the only line that

gets a large number of votes is the line of interest.

So that's the basic idea of the Hough transform, and we're going to provide

more details in a second but every point that the edge detector find votes for the

possibilities, in this case, lines. If we have multiple points on the same

line, then we're going to have multiple votes.

So how do we do that in reality. So, in reality, we are going to go from

the XY coordinate system, which means the image coordinate system, to a new

coordinate system, which is theta and rho.

That coordinate the system, the same way that our image is discretized,

is going to be discretized. So theta can be anything around the

circle, and we basically discrotize as fine as we want or as fine as our memory

or our, our computational time allow us. Let's say we can discretize every one

degree. We do the same for rho, we discrotize it.

We know that row can not go further than the size of the image, so we know it's

bounded, and then we basically discretize it.

And then what we're going to do, let me write the equation again, with half the

equation, rho x, cosine theta plus y sine theta.

So what we are going to do is the following.

We go through every point, let's say this one.

So every point gives us x, y, the coordinates of that point.

Then we run through theta, and we say, let's say, one degree.

So we have the coordinate x and y. We put cosine of one, sine of one.

That gives us a rho. We go and vote for that one.

We, let's say, add one vote. This is sometimes called the accumulator.

And then we go ascertain, theta now equal two.

We have x, y. We define theta 2.

We have cosine of two plus sine of two. We get a new rho.

We go and vote for that one as well. And we keep voting for every single

point. Now it's not very hard to see that once

you fix x and y, if you vary theta as here, or as here, rho changes in a

sinusoidal fashion. So, that's very easy and it counts on the

cosine and sine. So your votes will look like a sinusoidal

function for every single point. Remember now when I go to a second point,

I will got, I will get more votes, a different sinusoidal.

So I will get different votes, and there will be a place, the place for responding

to this straight line, that is starting to accumulate votes.

Everything, for every point, we are going to have

that sinusoidal. We vote basically, on that sinusoidal,

and we start to accumulate. And now, the next step is to go into the

accumulator, and find the maxima. It may be multiple, if there are multiple

straight lines, but we go and find, and say, oh,

this is the rho and theta that many points like,

and that's how we detect these lines. So, let me just run you through the

pipeline again. You start from the image, and you do an

edge detection. Every point.

Some of them in the lines that you care about.

Some of them outside. Every point defines an equation like

this. Every point gives you the x, y

coordinates. So you go into the accumulator space, and

you vote sinusoidal votes. Or you just run through the thetas in the

discretized space and you vote for every rho that you get from this equation.

Every point a vote, then you finish, you basically say, lets go and look which

areas, which basically entries in my accumulator have enough bolts that means

that there is a straight line going through them.

Now this concept of voting goes very far, and you can actually do many, many things

beyond detecting straight lines. Let us show one example.

10:31

The basic, so here, we have four, five points, and then they accumulators so its

just going and you will see there is a straight line so there will be three

votes for that, and there is another straight line that will should be three

votes. Two votes, two votes, and so on.

For this very, very particular example, I just want to use it to show how we

basically get this sinusoidal type of votes for every point.

Now, let's see that on a real image. So what we see here is an aerial image

here. we clearly see that there is some

straight lines. This is the result of edge detection.

In particular, it's a canny edge detector.

Now, it's a bit confusing. From it, we want to get the parametrized

straight line. We want to go the rho, we want to get the

rho and theta for this one. This is the picture of this accumulator

that went through all these points. Now, you don't need to use all of them.

You can just sample some of them. There's plenty of points on this straight

line so you will get very high accumulator just by using a subset of

them. But, this is the result of the

accumulator. And then you go here and you look for the

maximum values. And from then you get the rho and the

theta that correspond to those maximum values.

Remember this is your theta, rho accumulator.

So wherever is the maximum, it gives you a value for theta and a value for rho,

and those corresponds to what we see here, and then you can put it back on the

picture and say, this is my most basically extreme line, my most prominent

line. Then there might be other maximal.

There would certainly be for this line, and this line, and this, and you can go

and pick all of them. In this case, we are only marking the

most prominent one, a very simple and extremely powerful technique.

I want to show you also how we do this for circuits but before that I want to

run this example in mat lab. Let's see that.

We're now as we said inside the mat lab environment.

And let's see the type of operations that we are going to do to illustrate the

Hough transform to the text trade lines. So there is a bunch of commands here but

I'm going to describe them to you. Most of them are just to make nice plot.

I'm going to describe to you those that are important, to understand what

basically the Hough transform is doing here.

So first, as we have seen many times in the past, we read the image.

This is the first operation. Then we're going to rotate that image.

This is just for the fun of it. I don't want you to believe that we can

only detect horizontal or vertical lines, so we're going to rotate that image.

Here, we do edges. Remember, the first thing we need to do

is detect the edges in the image. That's the input to the Hough transform.

And then here it is the Hough transform in MATLAB.

That would basically create this accumulator in the theta rho space.

And these are a bunch of operations, just to illustrate, to visualize.

We're going to run it in a second, I'm just describing the operations for now.

Then, the next, we need to detect the peaks.

We need to detect the maximal in the accumulator.

And that's this operation, and we're going to plot it.

Now we have the maximal, which means that we have the rho, and the theta and, we

need to detect the lines. We actually need to draw those lines.

And that's what this operation does, is basically finds the lines and then the

rest is going to draw them, and actually it's going to draw them as segments and,

with the images, I'm going to illustrate you how we do that.

So let's just run all these operations. and then just see the results.

So, here are the two images. As you see it's pretty fast, and we see

the original image. This is the accumulator, and here, the

squares, the wide squares are basically the peaks the, basically, the most

important maximal in the accumulator. And those are these green lines that we

see here. Once again in the accumulator, we have

rho and theta. Once we have that we can go back into the

image plain and draw the lines. Now you may wonder, but wait a second,

rho and theta would give us straight lines, but all across the image how do we

get to segments if we need to get to segments.

And the basic idea is that there are many ways of doing that.

But one way of doing that is you go and put the line back in the image as we have

here. And then you can go around that line and

see if there actual points in the image which were detected by the age detector

that voted for that line. And you basically can get what actually

was the real segment and not the whole line in the image.

That's just one way of doing that. And as we see here, we got this straight

lines. Very fast, and very simple operation to

get straight lines. There are others.

Just, their votes are weaker, and we need to decide to put a threshold in the

cumulator. We might put a threshold that says don't

give me any line that has less than certain amount of votes,

or we can say give me the ten top lines, give me the, the, the five top lines or

anything that is application dependent. So with this image and with this example

we see the whole pipeline. From the image, detect the edges, create

the accumulator, find the maximal,

draw up those lines back to the image. And if you need,

basically find the segments by looking in the image, where do you have actual

points that voted for that, if you need just the segment.

So now we're going to go back and I'm going to describe what we do for circles.

What about circles then? So here we have an image of circles and

we are going to know if we can basically find the circle using the half transform.

The idea is exactly the same. Now a circle can be written down in the

following form. X minus X zero, that's a X coordinate of

the circle. Square plus Y minus Y zero squared equal

to the radius squared so, every point in the circle holds this equation where the

center of the circle. Is given by this and this.

These are the coordinates of the center, what we need to find.

And the radius of the center, which is also an unknown.

It's given here. If you don't remember this formula, just

pick any of the elementary geometry books, or just believe me that this

formula is the right formula for a circle on the plane.

Now in the same way that we need rho and theta for straight lines, we need to find

the center and the radius for a circle. So we need one.

To three parameters. So now accumu, our accumulator won't be

on the plane. It will be a 3-dimensional accumulator,

but we're going to do exactly the same. We're going to have an edge detector.

Every point is not a vote for a circus every point is going to give us.

Its coordinates. And then we are going to run through the

accumulator on the X coordinate of the center, the Y coordinate of the center,

and get the radius. So we are going to run through that,

through the Xs, run through the Ys. And then, get the radius.

We accumulate, we pick the maximum one, and we basically are done.

We got our center and radius. So, let us see this straight one example.

Here we have the edges and I want to repeat that once again.

Every point here will vote for multiple centers, multiple radius.

We can of course limit the centers, because we know the dimensions.

We can also limit the votes for the radius, because we also know how large

the objects can be in the image. We are going to vote for that.

There are going to be multiple votes for the real circles in the images and then

we basically detect them and here they are.

We basically went through that cummulator, and find the top votes.

And we get out results. Basically the Hough transform can be used

for any parametric curve. Anything that we need to vote for some

parameters of that curve. As rho and theta, the centers we can find

parabolas, we can find any, any curve that we can write with parameters.

The more parameters you have, the more expensive the Hough transform is.

So normally, the Hough transform is used for finding straights line, circles,

ellipses, relatively simple objects that have a

controllable number of parameters. Otherwise, your accumulator space is

huge, takes a lot of memory, and takes a lot of computational time, both to

compute it and to find the maximal. But for those simple shapes, there's,

one of the best techniques out there is the Hough transform.

So I hope you enjoy this video and I'll see you in the next video.

Thank you very much.