interviewers' ability to recruit participants to take part in interviews.

So this would really be a kind of non-response effect of interviewers, or a non-response origin of interviewer effects.

So what appears to be an effect due to measurement may actually reflect differences in recruiting by different interviewers.

That is, different interviewers might recruit different types of respondents, for whom the true values differ from those of respondents recruited by other interviewers.

And this can happen despite an interpenetrated design

where cases are randomly assigned to interviewers.

If different interviewers are recruiting different respondents from those that have been assigned to them, we can see something that is ordinarily indistinguishable from interviewer effects due to something about the way the questions are asked.

So West and Olson demonstrated this phenomenon, or confirmed that it is a real possibility, using administrative records, which served as their true values, so they could see where interviewer behavior was leading to departures from the true values.

They also looked at whether there were interviewer effects that were due to non-response, that is, differences in the true values of the respondents who were recruited.

And what they found was that for two questions, rho int was reliably different from zero.

So they focused on those two to see what the records

could tell them about the origins of those effects.

And the questions were age when the respondent was married and age when the respondent was divorced.

Based on the records, what they found was that the effect for

the question regarding age at marriage was due to measurement error,

kind of the classic interviewer effect.

The errors in the answers being elicited differed for some interviewers compared to other interviewers.

But for the other question,

age at divorce, the effect was due to a significant non-response error.

So they were able to confirm that the true values of those who were recruited

by different interviewers actually differed.

So in a way, it kind of undermines the interpenetration,

the random assignment of cases to interviewers.

So essentially, the reason there's an interviewer effect for

the question about age at divorce is because some interviewers were recruiting

younger respondents than other interviewers.

This, in effect, undermines the interpenetration.

Even though cases were randomly assigned to interviewers, differential

recruiting led to a different mix of true values for different interviewers.

So as I said earlier, one reason to calculate rho int, in addition to just quantifying clustering, is to quantify the impact of interviewers on the overall variance in the study.

So there's a measure that does this.

It's called the design effect due to interviewers.

So the design effect due to interviewers is a measure of the extent to which

interviewers increase the total error in the survey.

So you can see that there really are two parts to the equation, deff_int = 1 + rho int (m - 1).

The first is the number one, which just represents the variance in a design where there are no interviewers, or no clustering due to interviewers, and we're just going to call that one.

And we want to know what additional variance can be attributed to clustering due to interviewers, which has two parts: rho int, which we've been discussing, and m - 1, which we can just think of as m, where m is the average interviewer workload, or the number of interviews per interviewer.
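As a small sketch of how these two parts combine, here is the design effect calculation with hypothetical values for rho int and the workload m (illustration values only, not figures from any study discussed here):

```python
# Design effect due to interviewers: deff_int = 1 + rho_int * (m - 1),
# where rho_int is the intra-interviewer correlation and m is the
# average interviewer workload. Values below are hypothetical.

def deff_int(rho_int, m):
    """Multiplicative increase in total variance due to interviewer clustering."""
    return 1 + rho_int * (m - 1)

# Even a small rho_int inflates variance noticeably when workloads are large:
# 1 + 0.01 * (50 - 1) = 1.49, i.e. a 49% increase in variance.
print(deff_int(0.01, 50))

# With no interviewer clustering (rho_int = 0), the design effect is just 1.
print(deff_int(0.0, 50))
```

The multiplier tells you how much larger the variance is than it would be in a design with no interviewer clustering.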

And the idea is that the more interviews any one interviewer conducts, the greater the impact their idiosyncrasies, or the particular way they administer the questionnaire, might have compared to their interviewer colleagues.

So the bigger the m, the greater the impact of any one interviewer. Another way of thinking about this: by reducing the workload, the number of interviews that any one interviewer conducts, we lower the impact of any one interviewer.

So for a fixed budget, it would be better, from this perspective, to have a larger number of interviewers each conduct a smaller number of interviews.
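To illustrate the workload point, here is a small comparison under assumed numbers: a fixed total of 1,000 interviews and a hypothetical rho int of 0.02 (both chosen only for illustration):

```python
# Fixed total of 1000 interviews, split across different numbers of
# interviewers. rho_int = 0.02 is a hypothetical value for illustration.
rho_int = 0.02
total_interviews = 1000

for n_interviewers in (20, 50, 100):
    m = total_interviews / n_interviewers  # average workload per interviewer
    deff = 1 + rho_int * (m - 1)           # design effect due to interviewers
    print(f"{n_interviewers} interviewers, m = {m:.0f}: deff_int = {deff:.2f}")
```

Spreading the same number of interviews over more interviewers shrinks m, and with it the variance inflation, which is the trade-off described above.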

There's a variant of the design effect, which is called the design factor, often abbreviated deft, and it's really just the square root of the design effect, so instead of talking about variances, we're talking about standard errors.

As I understand it, they're really used interchangeably.
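The relationship between the two can be sketched in one line (the deff value here is hypothetical):

```python
import math

# deft is the square root of deff: it scales standard errors the same
# way deff scales variances. The deff value below is hypothetical.
deff = 1.49
deft = math.sqrt(deff)
print(round(deft, 3))  # about 1.221
```

So a design effect of 1.49 on the variance scale corresponds to standard errors inflated by a factor of roughly 1.22.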