And so we can see here now that the number of employees per hospital varies.

So, for each hospital there's a count of the number of employees.

I've now called that capital B, capital B for the number of elements in the cluster:

B sub alpha, because it varies from alpha equal to 1 to alpha equal to 12.

So the hospital number is the value of alpha that changes.

The B sub alpha is the size.

We can see that these sizes vary from about 60,

two of them have about 60 employees,

to one of the hospitals having many times that number, 1,860.

So we've got a big variation here in the size, about 30 to 1.

And that gives us some pause.

Because suppose what we decide to do is draw a sample.

We've figured out that we can afford to do about 100 employees.

And we're just going to do two hospitals just to keep it simple here.

Now ordinarily in practice we'd want more hospitals than this, but

we're going to do this sample by selecting 100 employees from two hospitals.

So, we're going to group our data, so

that we can do more interviews in each hospital.

I added up the sizes: there are 6,000 employees across these 12 hospitals.

So, an average of 500, but they vary from 60 to 1,860.

Okay, so I am going to do 100 from 6,000. That is my sampling fraction, though

we haven't talked in these terms before, but that's really what I've specified.

By specifying that sample size, and knowing what the population size is,

I'm taking one out of every 60 employees there.

That's my sampling rate.
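
As a quick check on that arithmetic, here is a minimal sketch using only the totals quoted above (12 hospitals, 6,000 employees, a sample of 100). The variable names are illustrative assumptions, not notation from the lecture.

```python
from fractions import Fraction

# Totals quoted in the lecture; the names are illustrative.
total_employees = 6000   # sum of the 12 hospital sizes
n_hospitals = 12
sample_size = 100        # employees we can afford to interview

# Average hospital size.
average_size = Fraction(total_employees, n_hospitals)

# Overall sampling fraction f = n / N.
f = Fraction(sample_size, total_employees)

print(average_size)  # 500
print(f)             # 1/60
```

Exact fractions are used so the rate shows up as 1/60 rather than a rounded decimal.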

And we're going to do this by first selecting two of the hospitals.

And let's say we are going to do a simple random sample, as we did before.

We're going to generate two random numbers from 1 to 12.

And take those two hospitals without replacement.
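
One way this first-stage draw might be sketched is with Python's standard `random.sample`, which selects without replacement. The seed and the function name `select_hospitals` are assumptions added for illustration; the lecture does not specify how the random numbers are generated.

```python
import random

N_HOSPITALS = 12   # hospitals are numbered 1 to 12
N_SELECTED = 2     # first-stage sample size

def select_hospitals(rng):
    """Simple random sample of hospital numbers, without replacement."""
    return sorted(rng.sample(range(1, N_HOSPITALS + 1), k=N_SELECTED))

# The seed is arbitrary; it only makes the draw repeatable.
rng = random.Random(2024)
selected = select_hospitals(rng)
print(selected)
```

Because `random.sample` never returns the same element twice, the two hospital numbers are always distinct.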

That's a first-stage rate, then, of 1 in 6:

2/12, which is 1/6.

So that means that if I've now said that I'm going to do 100 employees from

the 6,000 employees, that's the overall rate.

That's the overall f.

And then, I'm taking 1/6th of the hospitals to get there.

I've also forced something.

Maybe I didn't realize it at first, but I forced something.

If I'm taking one-sixth of the hospitals to get one-sixtieth

of the employees overall, I can only take one-tenth of

the employees in each hospital to pull this off.

Those two numbers, one-sixth times one-tenth, have to equal one-sixtieth.

And what I did was I sort of naturally thought about what's the overall rate?

1 in 60.

Now let's start with the hospitals: take 2 from 12, 1/6th of them.

That forces me to have 1/10th from each of the hospitals.
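
The forced within-hospital rate can be checked with a short sketch: the overall rate is the product of the first-stage and second-stage rates, so the second-stage rate is the overall rate divided by the first-stage rate. The variable names are illustrative assumptions; the two sizes used at the end are the smallest and largest hospital sizes quoted earlier.

```python
from fractions import Fraction

f_overall = Fraction(100, 6000)  # 100 employees out of 6,000
f_first = Fraction(2, 12)        # 2 hospitals out of 12

# The overall rate is the product of the stage rates,
# so the within-hospital rate is the quotient.
f_second = f_overall / f_first
print(f_second)  # 1/10

# What one-tenth means for the smallest and largest hospital sizes quoted.
for size in (60, 1860):
    print(size, size * f_second)
```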

Okay, all right, that sounds okay.

I can do some kind of sampling there by figuring out what one-tenth

would be in each case and draw that simple random sample,