Learn fundamental concepts in data analysis and statistical inference, focusing on one and two independent samples.

A course from Johns Hopkins University

Mathematical Biostatistics Boot Camp 2

From this lesson

Techniques

This module is a bit of a hodge podge of important techniques. It includes methods for discrete matched pairs data as well as some classical non-parametric methods.

- Brian Caffo, PhD, Professor, Biostatistics

Bloomberg School of Public Health

Hi, my name is Brian Caffo and this is Mathematical Biostatistics Boot Camp

2, and today we're going to be talking about matched two by two tables.

Okay, so today we're going to be talking about matched pairs data, which

is basically going to be, like, the paired t-test,

but only now, we're going to have binary data.

So to discuss that, we'll have to talk about the subject of dependence,

then we'll go over some specific

tests associated with matched-pairs contingency tables,

and then talk about some other

details, like relationships with the Cochran–Mantel–Haenszel test.

So here I have some example data

from Agresti's Categorical Data Analysis of matched-pairs binary data.

So the data in this case came from a survey where people were asked

whether they approved or disapproved of a politician.

I believe it was a British prime minister, and they were asked on two occasions.

So they had a first survey, approve or disapprove,

and then they had a second survey, approve or disapprove, okay?

So this two by two table is a little different from some of

the other ones we've studied. In this case, 794 approved on both occasions.

So every one

of these 794 counts represents two measurements on one person: 794

people said approve once and then approve again, and 150 said approve

the first time and then disapprove the second time, okay?
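
To make the layout of such a table concrete, here is a minimal sketch in Python that cross-tabulates paired responses. The six respondents are made up for illustration and are not Agresti's data, and the function name `matched_table` is hypothetical.

```python
from collections import Counter

def matched_table(pairs, levels=("approve", "disapprove")):
    """Cross-tabulate (first, second) answers: rows = first survey, columns = second."""
    counts = Counter(pairs)
    return [[counts[(row, col)] for col in levels] for row in levels]

# Hypothetical respondents: each tuple is ONE person's (first, second) answer.
responses = [
    ("approve", "approve"),
    ("approve", "approve"),
    ("approve", "disapprove"),
    ("disapprove", "approve"),
    ("disapprove", "disapprove"),
    ("disapprove", "disapprove"),
]

table = matched_table(responses)
for label, row in zip(("approve", "disapprove"), table):
    print(f"first survey = {label:<10} {row}")
```

Each cell counts people rather than individual answers, so the cells add up to the number of respondents, not to twice that number.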

So

a related, very common example of matched pairs data is case-control data.

So here we have cases and controls, and we have an exposed group and an

unexposed group. In this case we have 27 pairs

in which both the case and the control were exposed, 29

in which the control was exposed and the case was unexposed,

and so on.

So how did this data wind up

being matched, since a person, for most exposures,

can't be both exposed and unexposed? For

example, you can't both smoke and not smoke.

The way this usually works is, let's say this is

a retrospective study: they would take the

collection of cases from charts and ascertain

whether they were exposed or unexposed. And then they would go find,

with respect to lots of demographic variables like age and other things,

very closely matched controls, and

they'd ascertain whether those controls were exposed or unexposed.

So this process of matching pairs up

the individual observations, because they were matched

on all these other variables, like age and perhaps other demographics.

So it's different from an instance where you have a bunch of cases,

and you just select a bunch of controls, and hope that they're comparable.

In this case, you've made the subjects directly comparable, at least

on everything important that you can think of to match on.
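
As a rough illustration of the matching step described above, here is a small Python sketch. It assumes a simple greedy rule (same sex, closest age within a tolerance), which is only one of many possible matching schemes and is not claimed to be the procedure used in the study shown. The records, field names, and the `match_controls` function are all hypothetical.

```python
def match_controls(cases, controls, max_age_gap=2):
    """Greedy 1:1 matching: give each case the closest-age unused control of the same sex."""
    available = list(controls)
    pairs = []
    for case in cases:
        candidates = [
            c for c in available
            if c["sex"] == case["sex"] and abs(c["age"] - case["age"]) <= max_age_gap
        ]
        if not candidates:
            continue  # no acceptable control, so this case stays unmatched
        best = min(candidates, key=lambda c: abs(c["age"] - case["age"]))
        available.remove(best)  # each control is used at most once
        pairs.append((case, best))
    return pairs

# Hypothetical records, made up for illustration.
cases = [
    {"id": "case1", "age": 54, "sex": "F", "exposed": True},
    {"id": "case2", "age": 61, "sex": "M", "exposed": False},
]
controls = [
    {"id": "ctrl1", "age": 60, "sex": "M", "exposed": True},
    {"id": "ctrl2", "age": 55, "sex": "F", "exposed": True},
    {"id": "ctrl3", "age": 30, "sex": "F", "exposed": False},
]

pairs = match_controls(cases, controls)
for case, control in pairs:
    print(case["id"], "matched to", control["id"],
          "| exposures (case, control):", (case["exposed"], control["exposed"]))
```

Each matched pair then contributes one count to the two by two table, indexed by the case's exposure and the control's exposure.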

So here are two very common instances of matched

binary data.