Conducting a preference study is quite different from running an A/B test. So let's talk about how it differs and how you actually conduct a preference study. Preference testing is putting two or more treatments of some kind of asset in front of people to see which one they prefer. Typically, the way you see this done is that you create a survey and embed that content within it, so that you can ask people direct questions to see what they prefer.

A couple of things are different here. The first is that you don't actually have to have a statistically significant sample. Much of the time, because this is more of a qualitative type of test, you'll see much smaller groups of people tested. However, there are certainly cases where you might want a statistically significant sample, and there again is a great reason to use remote testing to increase your sample size. The other difference is that this is not an experiment. You're not putting statistically significant numbers of people in front of something and just observing the differences in their behavior; you really are asking them directly. So instead of just looking at behavioral numbers, you're looking at numbers built from asking people their preferences, their thoughts, and their ideas about what a particular variant does for them.

There are a few considerations to think about when you're structuring a preference test. The first is what size panel you have to work with: how much time you have, how many people you can potentially get, and how expensive it is to get those people. That's really going to help you determine what kind of test to run, because you're also going to have to balance the number of variants you want to test against the panel size, as well as how you want to handle the between-subjects versus within-subjects breakdown.

This is where it becomes really important to think about whether you want to run a test between or within subjects. A within-subjects study essentially means that each person will see either all variations, or at least more than one. A between-subjects study essentially means that each person will only see one variation. We're going to walk through an example so you can see the practical difference in the structure of the test itself.

In this particular example, we're looking at three different variants of a prototype, where the layout, controls, and messaging are different. Let's imagine we're doing a within-subjects test. Each person is going to see two variations, and really only two, because beyond about two or three it starts to get very difficult for people to make comparisons. The way this typically works is that you start with a pre-test questionnaire, where you might ask questions like: have you used this type of experience before? What are your preconceptions about this particular product or brand? Then you show them a variant and ask a series of follow-up questions. Then you show them the next variant and ask a series of follow-up questions, and those questions should really be the same for both variants. This enables you to compare people's answer choices. Then you ask them to explicitly compare the two variants; the full flow looks something like the sketch below.
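To make that flow concrete, here's a minimal sketch of a two-variant within-subjects session in Python. The question wording and variant names are hypothetical placeholders, not from any specific study or survey platform:

```python
# A sketch of a two-variant within-subjects session flow. Question
# wording and variant names are hypothetical placeholders.

FOLLOW_UP_QUESTIONS = [
    "On a scale of 1 to 7, how clear was the information on this page?",
    "On a scale of 1 to 7, how easy was this to navigate?",
    "What, if anything, confused you?",
]

def within_subjects_session(variant_order):
    """Yield the ordered steps one participant goes through.

    variant_order is e.g. ("A", "B") or ("B", "A"); flipping it
    across participants helps control for order bias.
    """
    yield ("pre-test", ["Have you used this type of experience before?",
                        "What are your preconceptions about this brand?"])
    for variant in variant_order:
        # Ask the same follow-up questions after every variant so the
        # answers line up and can be compared directly.
        yield (f"show variant {variant}", FOLLOW_UP_QUESTIONS)
    yield ("explicit comparison",
           ["Which of the two did you prefer, and why?"])
    yield ("post-test", ["Any overall feedback on what you saw today?"])

for step, questions in within_subjects_session(("A", "B")):
    print(step, questions)
```

Half your participants would get `("A", "B")` and the other half `("B", "A")`, which is exactly the order control discussed next.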
That explicit comparison is also where you might potentially see some bias. Because of recency, people remember the most recent thing they saw. They might also have an expectation that the first thing they saw was better simply because it was first. So you have to think about this: you might want to flip the order of the variants people see so that you control for that bias, for example. At the end of the test, you would have them answer a post-test questionnaire, which provides an opportunity for them to give you feedback on higher-level questions that they may not have thought about, or answered, while they were answering questions about a specific variant.

As you can see, the between-subjects test is already significantly different. There you would essentially have someone complete a pre-test questionnaire, with any demographic or other warm-up questions you might want, then show them one variant with follow-up questions, then have them complete a post-test questionnaire. So you can already see that there are significant differences between the within- and between-subjects tests. In the within-subjects test, you have people looking at multiple variants and explicitly comparing them. In the between-subjects test, you are essentially relying on the reactions of a larger group of people to a particular variant to tell you whether or not it performs better or worse than the other variants.

Fundamentally, when you're doing a between-subjects test, it's relatively simple to decide how many people you want looking at each option. If you have 120 people, you can evenly distribute them across the options. Each person is only going to see one, and you get a fairly large number, 40 in this particular example, looking at each option.

I'll say this: most recently we did a preference test with essentially ten people looking at each of seven different variants of a particular prototype. One interesting thing we found was that above about seven individuals per variant, there wasn't a whole lot of variance in the findings in terms of preference. Essentially, that means we could have stopped reviewing videos of sessions at seven users per variant. But we found that having a total of 70 participants made the client feel a lot more confident in the quality of the results they were getting, because they had a larger sample size. So, bottom line, you don't have to have a statistically significant sample, because a lot of the time the findings you get from preference testing are very similar to what you would get out of a usability test. But if you feel it's important to get that larger sample, it's entirely possible to do so. The best way is to use a remote, unmoderated testing platform to expand the number of people you can have looking at each option.

As you can see, within-subjects testing gets a lot more complex. To control for bias, and to expose a relatively large number of people to each option and combination, you have to balance both the number of people looking at each option and the order in which they see them, as in the sketch below.
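To make the counterbalancing arithmetic concrete, here's a minimal sketch in Python. The panel size of 60 is an assumption chosen so that each of the six orderings gets ten participants and each variant is still seen by 40 people; the variant names are placeholders:

```python
# A sketch of counterbalancing a within-subjects test. Assumes three
# variants and two shown per participant; the panel of 60 people is
# an illustrative assumption, not a figure from the study above.
from itertools import permutations

VARIANTS = ["A", "B", "C"]

# The six ordered pairs of distinct variants. Each pair is one of the
# six "tests" a participant can be routed to.
orderings = list(permutations(VARIANTS, 2))

participants = [f"p{i:02d}" for i in range(60)]

# Round-robin assignment: 10 participants per ordering. Each variant
# appears in 4 of the 6 orderings, so it is seen by 4 * 10 = 40 people.
assignments = {p: orderings[i % len(orderings)]
               for i, p in enumerate(participants)}

for ordering in orderings:
    count = sum(1 for o in assignments.values() if o == ordering)
    print(ordering, "->", count, "participants")
```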
That gives you an equal number of people looking at each option, in orders that flip so that bias is reduced. So, in this particular example, this is how you would divide up the number of people looking at each option: still 40 people per option, but you would have to create six different tests to make sure everyone was exposed to the options in a way that reduces bias as much as possible.

When you're considering how to ask questions, what you want to think about is following up each variant with several questions that help you gauge what people think or feel about it. You're not limited to just overall usability. What you really need to be thinking about is: do you like this, and for what reason? Is it because it seems more useful, or because it's clearer? You might ask people about their perceptions of the ease of navigation, how much they enjoyed a particular approach, or how clear they thought the content was. The best thing to do in this situation is to ask a consistent set of questions, typically between five and ten, after each variant, and make sure people answer the same questions in a consistent way for every variant. By not switching the questions or the answer choices, you can line up the data and compare apples to apples the preference between each variant.

You may have seen the Likert scale before. Essentially, it's an approach where you have people read a question and then select an answer from a number of choices, which enables you to measure a concept along a continuum. Subjects read a question and choose from a set of responses, typically an odd number so that the middle point is neutral, and all of those response choices sit along a scale measuring the same concept. In this example, on a 1 to 7 scale, you're asking: how clear was the information on this page to you? People answer along a scale from "not at all clear" to "very clear," where 4 is neutral.

There are a couple of reasons why you might want to conduct a preference test. The first is that preference tests are really good at helping you uncover why people prefer one variant over another. That's because you're actually asking them why they prefer one over the other, and using those scaled questions to get at the different attributes of the experience you want to understand more about. The other reason is that you may not have a particular experience out in the wild yet, so you can't actually run an experiment on it. That's why you'll see people running preference tests very frequently on prototypes.
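As one last illustration, here's a minimal sketch of lining up those scaled answers so you can compare variants apples to apples. It assumes the same 1-to-7 clarity question was asked after each variant, and the ratings are made-up illustrative data, not results from a real study:

```python
# A minimal sketch of comparing Likert answers across variants,
# assuming the same 1-to-7 clarity question was asked after each one.
# The ratings below are made-up illustrative data.
from statistics import mean

clarity_ratings = {
    "A": [6, 5, 7, 4, 6],
    "B": [3, 4, 5, 4, 3],
}

for variant, ratings in clarity_ratings.items():
    # On a 1-to-7 scale, 4 is the neutral midpoint.
    print(f"Variant {variant}: mean clarity {mean(ratings):.1f} "
          f"(n={len(ratings)})")
```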