After running the descriptive statistics, it can be

important to generalize results from your survey in order to inform what

could happen in the overall market.

By the end of this lesson you should be able to determine the level of

confidence you should have in the results based on the sample size.

You should also be able to compare and contrast parameters and statistics.

Surveys are administered on a sample of the overall population of interest.

What you want to do with a survey is to infer something

about the overall market or population, using data from the survey.

So, it's really important when you design surveys to ensure that

the sample is somewhat representative of the overall population.

If it's not, you simply are not going to be able to make such generalizations.

So, the question we want to ask ourselves is, what can we say about

the population's true parameter based on the sample statistics?

And so here are two words that are important, parameter and statistic.

A statistic is whatever quantity you compute from the information gathered with

your sample, whereas the parameter is the quantity or

variable of interest that you want to learn about in the overall market.

Here, I'm going to use a particular example, but before we do that,

there are two types of generalizations that one might want to make,

either on proportions or on means.

Let's focus on proportion first.

So let's say you want to conduct a statistical inference.

What does that mean?

It means taking the information from the sample, analyzing the data in

the sample, and using it to infer something that might be true at the market level.

So imagine that you have two candidates for a local election, Candidate A and

Candidate B.

Let's assume that you're conducting the survey with 1,000 local residents.

This information is important, because that's what we call the sample size,

which we will use to make the inference.

Respondents tell you in the survey that 70% of them will vote for

Candidate A, and 30% of them will vote for Candidate B.

So your p is 70%, and 1 - p is 30%.

The marketing researcher might ask,

what can we say about how the overall population is going to vote?

Based on my sample, and assuming that my sample is representative,

can I predict who is going to win the election?

We need to construct something called the confidence interval,

which captures the degree of accuracy of the estimate.

If the confidence interval is very narrow,

that means that the level of accuracy is very high.

Conversely, if the confidence interval is very wide,

that means that the degree of accuracy is very low.

The construction of the confidence interval is based on the confidence level

desired by the researcher.

You can find information about confidence levels and confidence intervals online.

The most commonly used confidence level is 95%, and

we'll see why this number is important later.

To compute the confidence interval for a proportion measure,

we need to compute the standard error for the proportion measure.

The formula for the standard error is going to be s,

which denotes the standard error, and is equal to the square root of p(1 - p) / n.
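
Written out, the spoken formula for the standard error of a proportion looks like this. The symbols are the ones used in this lesson: p is the sample proportion, n is the sample size, and s is the standard error.

```latex
s = \sqrt{\frac{p\,(1 - p)}{n}}
```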

Based on the data I gave you and the number of respondents,

that is, the sample size, p in our case is going to be equal to 70% and

1 - p is going to be equal to 30%.

So the confidence interval in this case is going to start at 0.7 - 1.96 times s,

and I will explain the times s part in a minute, and

that's going to be the lower bound for your confidence interval.

The upper bound is going to be 0.7 + 1.96 times s.

And again s is going to be equal here

to the square root of 0.7 times (1 - 0.7) / 1,000.
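
To make the arithmetic concrete, here is a minimal Python sketch of the election example, using p = 0.70, n = 1,000, and the 1.96 multiplier. The function name `proportion_confidence_interval` is just an illustrative choice, not something from the lesson.

```python
import math


def proportion_confidence_interval(p, n, z=1.96):
    """Return (lower, upper, standard_error) for a sample proportion.

    p: sample proportion, n: sample size, z: multiplier for the
    confidence level (1.96 for 95%).
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    margin = z * standard_error
    return p - margin, p + margin, standard_error


# Election example: 70% of 1,000 respondents favor Candidate A.
lower, upper, s = proportion_confidence_interval(p=0.70, n=1000)
print(f"standard error: {s:.4f}")             # standard error: 0.0145
print(f"95% CI: ({lower:.4f}, {upper:.4f})")  # 95% CI: (0.6716, 0.7284)
```

Assuming the sample is representative, the whole interval sits well above 50%, which is what would let you say Candidate A is ahead.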

The number 1.96 comes from the 95% confidence level, and

this information can be found online.
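
If you would rather compute the multiplier than look it up, here is a small sketch using the standard normal distribution from Python's standard library. It assumes the usual two-sided interval, which splits the leftover probability evenly between the two tails; the helper name `z_for_confidence` is illustrative.

```python
from statistics import NormalDist


def z_for_confidence(confidence_level):
    """Two-sided multiplier from the standard normal distribution."""
    tail = (1 - confidence_level) / 2  # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(level, round(z_for_confidence(level), 3))
# 0.9 1.645
# 0.95 1.96
# 0.99 2.576
```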

As you can see, you get a range of values.

So, you can interpret it as 0.7 plus or minus something, and

this plus or minus something is important because that gives you

the level of accuracy of your estimate, or margin of error.

This margin of error is driven not only by the confidence level that you want

to work with, but also by the standard error from the formula, and as you can

see again from the formula, this standard error is a function of n, the sample size.

If n is very, very large, the standard error is going to be very, very small, and

assuming your sample is representative of the population, it means that the larger

the sample size is, the more accurate the results you are going to have.

Conversely, if you have a sample size that is too small,

it means that the standard error is going to be very, very large.

This also means that the accuracy of the inferential analysis you can do

is going to be compromised, and so

there is always a tension between what you can afford to do and what would be useful

in terms of guaranteeing some level of accuracy in your analysis.
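
To see that trade-off in numbers, here is a short sketch that keeps p at 70% and the 95% level, and varies the sample size. Only n = 1,000 comes from the example; the other sample sizes are hypothetical values chosen for illustration.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Margin of error for a proportion: z times the standard error."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical sample sizes, except n = 1000 from the election example.
for n in (100, 400, 1000, 4000):
    print(n, round(margin_of_error(0.70, n), 4))
# 100 0.0898
# 400 0.0449
# 1000 0.0284
# 4000 0.0142
```

Because n sits under a square root, you need four times as many respondents to cut the margin of error in half, and that is where the cost side of the tension comes from.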

Here, I'm going to give you another example that has to do with measuring

the willingness to pay for a product.

It is not uncommon for marketers to ask themselves,

how much are people willing to pay for my brand? And so here the difference is that

this is not a proportion measure, this is a mean measure.

The way to make inferences about it is similar.

You want to construct the confidence interval.

Let's assume that the mean willingness to pay in your data was $105, the observed

standard deviation is about $16, and that your confidence level is 95%.

You can construct the confidence interval for your willingness to pay,

which will be equal to 105 +/- 1.96,

that comes from the 95% confidence level,

times the standard deviation of $16.40.
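
Here is a minimal sketch of the same calculation for a mean. One caution: in the usual formula, 1.96 multiplies the standard error of the mean, which is the standard deviation divided by the square root of the sample size. The sample size for this example is not stated here, so `HYPOTHETICAL_N` below is purely an assumed value for illustration; if $16.40 is already a standard error, you would use it directly instead.

```python
import math


def mean_confidence_interval(mean, std_dev, n, z=1.96):
    """Return (lower, upper) for a sample mean.

    The standard error of the mean is std_dev / sqrt(n).
    """
    standard_error = std_dev / math.sqrt(n)
    margin = z * standard_error
    return mean - margin, mean + margin


HYPOTHETICAL_N = 400  # assumed sample size; not given in the lesson

lower, upper = mean_confidence_interval(105.0, 16.40, HYPOTHETICAL_N)
print(f"95% CI: (${lower:.2f}, ${upper:.2f})")  # 95% CI: ($103.39, $106.61)
```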