So, your data has been collected and you're ready to start your analysis. But wait: typically, the data you receive may not be as ready to analyze as you might hope. Before we can effectively analyze the marketing data we've collected from surveys, we need to look at how the data was collected and what potential effects the data collection process could have on our analysis. By the end of this lesson you should be able to identify potential problems with data sets that have been collected, and identify what impact those problems can have on your analysis. You will also be able to define and discuss two types of data errors: sampling errors and non-sampling errors.

Let's move on to discuss what we can do with quantitative data once it has been completely gathered. The survey has been designed, the data has been collected, and you want to ensure that the research questions defined before the survey was designed have been answered. Before conducting a thorough analysis, you should look at the data to understand whether any errors exist in it.

Let's talk about errors for a moment. Errors come in two forms: they can stem from inadequate sampling methods (sampling errors), or they can result from some other part of the research process (non-sampling errors, such as poorly worded questions or respondent fatigue).

Let's look at an example of a sampling error first. One issue that can occur is that the sample used in a survey is not representative, or is biased in one way or another. An example of such a situation comes from the United States around the 1950s and 60s, where a political pollster decided to conduct a survey on the likelihood of one candidate winning over another. To conduct the survey, they took telephone books and called people in them at random to get a sense of whether they would vote for Candidate A or Candidate B. The results showed that Candidate A would win by a large margin over Candidate B. But when it came to the election, Candidate B won and Candidate A lost. If you think back to that era, people only had phones at home. These landlines, as we now call them, unfortunately represented a segment of people who were likely wealthier than the average of the population at the time. Since there was a strong correlation between wealth and the likelihood of voting for Candidate A, the forecast picked the wrong candidate. So even though the analysis may have been correct on the statistical front, the sample the data was drawn from wasn't appropriate, and therefore all the results were inadequate.

Other problems come from the survey instrument itself; these are non-sampling errors. One is the wording of the questions in the questionnaire. For example, a question could be leading in one way or another, so that the measurement you get for a particular variable is not truly representative of what the marketplace or the population thinks. Another example of poor survey design is a questionnaire that is too long. Toward the end of a lengthy survey, respondents might decide to give up on answering the questions, and this fatigue will affect the quality of the data you have and the number of variables you can use to analyze and infer results.
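One way to spot this kind of fatigue in the data itself is to compute the response rate for each question in the order it was asked; a steady decline toward the last questions suggests drop-off. Here is a minimal sketch in Python with pandas, assuming a hypothetical export where each row is a respondent, the columns follow questionnaire order, and skipped answers are left blank (the file name, column layout, and 80% threshold are all assumptions for illustration):

```python
import pandas as pd

# Hypothetical survey export: one row per respondent, one column per
# question, columns in the order the questions were asked, skipped
# answers left blank (read in as NaN).
responses = pd.read_csv("survey_responses.csv")

# Share of respondents who answered each question, in questionnaire order.
response_rate = responses.notna().mean()
for question, rate in response_rate.items():
    print(f"{question}: {rate:.0%} answered")

# A steady decline toward the last columns suggests fatigue-driven dropout;
# an isolated dip on one question suggests the question itself is the
# problem (off-putting wording, a sensitive topic, and so on).
threshold = 0.80  # arbitrary cutoff chosen for illustration
low = response_rate[response_rate < threshold]
print(f"\nQuestions answered by fewer than {threshold:.0%} of respondents:")
print(low)
```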
If the questionnaire is too long and respondents decide to stop answering toward the end, you are going to miss important information that would otherwise be useful. For example, if the questions about respondent demographics are asked toward the end of the questionnaire, you may not have access to that information, because respondents might decide not to answer those questions.

Another thing you might want to look at is consistently missing answers. Maybe some questions were off-putting to respondents in one way or another, or made them feel uncomfortable. If this systematic non-response occurs on a question that was important to you, that's also a problem, and your data may be rife with incomplete answers.

So when the data has been gathered, it's important to ask yourself some questions. Is the sample representative of the population? What sampling design was used? Was the questionnaire properly designed? Were any of the questions leading? Were any questions consistently left unanswered? And so on. Keep these things in mind when you begin to analyze your data.
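To make the first of those checks concrete, you can compare the demographic makeup of your sample against known population figures, for example with a chi-square goodness-of-fit test. The sketch below is illustrative only: the file name, the age_group column, and the benchmark proportions are hypothetical placeholders, not real census numbers.

```python
import pandas as pd
from scipy.stats import chisquare

# Hypothetical survey export: one row per respondent.
responses = pd.read_csv("survey_responses.csv")

# Placeholder population proportions; substitute real benchmarks
# (e.g., census figures) for your target market.
benchmark = pd.Series({"18-34": 0.30, "35-54": 0.35, "55+": 0.35})

# Observed counts for one demographic variable (hypothetical column name),
# aligned to the benchmark categories.
observed = responses["age_group"].value_counts()
observed = observed.reindex(benchmark.index, fill_value=0)
expected = benchmark * observed.sum()

# Chi-square goodness-of-fit: a small p-value means the sample's age mix
# differs from the population mix by more than chance alone would explain,
# a warning sign that the sample may not be representative.
stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```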