So here's a basic plot. I made this using the qplot function.

You can see the code down there at the bottom.

And the basic idea is that I've taken the scientific question from the previous slide and translated it into a plot.

So I plotted nocturnal symptoms here, which is the number of days in the past two weeks where there were symptoms at night, so symptoms while sleeping, for example.

And so I plotted the relationship between nocturnal symptoms and the log of the indoor PM2.5 value.

And you can see that, roughly speaking, on the left side we have all the normal weight children, and there doesn't seem to be a very strong relationship between log PM2.5 and nocturnal symptoms.

Now on the right side we have the overweight or obese children, and there seems to be an increasing relationship between PM2.5 and symptoms.

And so you can see that there's perhaps some evidence of an interaction between BMI and PM2.5, because the relationship between PM2.5 and symptoms appears to differ by BMI status.

Of course, in order to formalize this, you'd probably want to fit some sort of model, do a more thorough analysis, and adjust for various things.

But we're going to stick with this plot for the moment.

So you can make this plot very easily with the qplot function, by adding things like facets and geoms to the plot.
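As a rough sketch, a qplot call along those lines might look like the following. The code on the slide is not reproduced in this transcript, so the data frame name `maacs` and the column names `logpm25`, `bmicat`, and `NocturnalSympt` are hypothetical, and the values are randomly generated placeholders rather than the study data.

```r
library(ggplot2)

# Hypothetical stand-in data: names and values are assumptions for illustration only
set.seed(1)
n <- 200
maacs <- data.frame(
  logpm25 = rnorm(n, mean = 1, sd = 0.6),
  bmicat = factor(sample(c("normal weight", "overweight"), n, replace = TRUE)),
  NocturnalSympt = sample(0:14, n, replace = TRUE)
)

# One panel per BMI category (facets), drawn with points plus a smoother (geoms)
p <- qplot(logpm25, NocturnalSympt, data = maacs,
           facets = . ~ bmicat,
           geom = c("point", "smooth"))
print(p)
```

Because the placeholder values are random, the panels will not show the pattern described above; the sketch only illustrates how facets and geoms are passed to qplot. Recent ggplot2 releases mark qplot as deprecated, though it should still run with a warning.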

But I'm going to recreate this plot using the lower-level ggplot framework. So how do we build this up layer by layer?

First, we start with the data, right?

So here I've got a little excerpt of the data frame, and you can see that there are three variables here. There's log PM2.5. There's BMI category, which has two levels, normal weight and overweight. And then there's the number of days with nocturnal symptoms, which is going to be a number between 0 and 14, because it's over the last two weeks.
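To make the shape of that data frame concrete, here is a minimal sketch with the same three kinds of columns. The data frame name, the column names, and the randomly generated values are all assumptions for illustration, not the actual study data.

```r
# Hypothetical stand-in for the data frame excerpt; names and values are made up
set.seed(2)
n <- 50
maacs <- data.frame(
  logpm25 = rnorm(n, mean = 1, sd = 0.6),            # log of indoor PM2.5
  bmicat = factor(sample(c("normal weight", "overweight"), n, replace = TRUE),
                  levels = c("normal weight", "overweight")),
  NocturnalSympt = sample(0:14, n, replace = TRUE)   # days with symptoms, 0 to 14
)

str(maacs)    # one numeric column, one two-level factor, one integer column
head(maacs)   # first few rows
```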

So the data are very important in any ggplot graphic, because almost everything that you put on the plot can be thought of as some sort of data.