In this unit, we build upon the framework that we learned in the previous unit for statistical inference, with a focus on numerical variables. Let's start with an example of a study that uses techniques that we're going to be learning in this unit. The study is titled "Acceptability of Workplace Bullying," and it explores the relationship between culture and the acceptability of workplace bullying across the globe. As part of this study, the researchers collected survey data from 1,484 alumni and current MBA students from 14 countries on 6 continents. They asked questions on the acceptability of work-related bullying, where work-related bullying is defined as giving tasks with unreasonable deadlines, exposing workers to an unreasonable workload, and so on. In this plot we can see the geographic distribution of the countries included in the study. The sizes of the circles on the plot show how large the sample sizes are from each country. As we can see, the sample sizes are somewhat consistent across the globe, and it seems like we have a pretty even geographic distribution as well. The study further categorizes these 14 countries into 6 continents, and those are going to be the 6 groups that we're considering. It also calculates the mean acceptability of work-related bullying score for each group. A low score means bullying is unacceptable in the workplace, and a high score means that it's actually acceptable. We can see that the average acceptability is highest in Asia and lowest in Anglo countries. But just looking at these sample statistics, it's not possible to determine whether the differences we're observing are statistically significant. So in this unit, we're going to discuss methods for comparing means to each other. In this case, we're comparing many means to each other, but we're also going to learn what methods we use when we have only two groups to compare, as well as more than two groups to compare.
Now, let's shift our focus to another data set, and let's look at the distribution of inflation-adjusted total family income in the US. These data come from a random sample of Americans; they were collected as part of the General Social Survey in 2012. We can see that the distribution, as expected, is pretty right-skewed. Suppose we would like to estimate the typical total family income in the US. In the previous unit, we were introduced to the central limit theorem, which provided the basis for constructing a confidence interval for the mean. But what if we're not interested in the mean, but the median? There's actually no central limit theorem for the median. So in this unit, we're going to introduce a new technique for creating confidence intervals, namely bootstrapping, which takes its name from the phrase "pulling oneself up by one's bootstraps," meaning accomplishing an impossible task. This is a simulation-based method that doesn't have conditions as rigid as the central limit theorem's, and therefore it also works for many estimates beyond the mean. So in this unit, we're going to start by extending the methods we learned in the previous unit to comparing two means. No longer will the focus just be on one single population mean; instead, how do we compare means from two populations? And we're going to specifically discuss what we do if these populations, and hence our means, are dependent, versus if they're independent. We're also, as we mentioned earlier, going to discuss bootstrapping: we're going to define what it means, how to bootstrap, as well as when to bootstrap and when not to bootstrap. We're also going to learn to work with small samples. What if we don't have a sample size greater than 30? What do we do then? Namely, the t distribution is going to come into play. And finally, we're going to wrap up our discussion with methods for comparing many means to each other.
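To make the bootstrapping idea concrete, here is a minimal sketch of a percentile bootstrap confidence interval for a median. The actual survey data isn't reproduced here, so the right-skewed "income" sample below is simulated, and the function name and parameters are illustrative, not from the course materials. The key move is resampling from the observed data with replacement, many times, and recording the median of each resample.

```python
import random
import statistics

random.seed(1)

# Hypothetical right-skewed "income" sample (simulated, not the actual GSS data)
sample = [random.lognormvariate(10.8, 0.8) for _ in range(200)]

def bootstrap_median_ci(data, n_boot=10_000, level=0.95):
    """Percentile bootstrap confidence interval for the median.

    Resamples the data with replacement n_boot times, computes the
    median of each resample, and returns the middle `level` fraction
    of the sorted bootstrap medians as the interval.
    """
    n = len(data)
    medians = []
    for _ in range(n_boot):
        resample = random.choices(data, k=n)  # sample with replacement
        medians.append(statistics.median(resample))
    medians.sort()
    lower = medians[int(((1 - level) / 2) * n_boot)]
    upper = medians[int((1 - (1 - level) / 2) * n_boot) - 1]
    return lower, upper

lower, upper = bootstrap_median_ci(sample)
print(f"95% bootstrap CI for the median: ({lower:,.0f}, {upper:,.0f})")
```

Note that nothing here relies on the shape of the distribution or on a formula for the standard error of the median, which is exactly why this simulation-based approach works for estimates the central limit theorem doesn't cover.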
So we'll be extending what we've learned from comparing two means to comparing many means to each other.