So how do people actually go about addressing these issues? There are a variety of different things you can do. Kind of the simplest is just to take your best and worst performers in the job and compare them on a variety of characteristics, to see which characteristics best predict performance. As you do that, one of the things I do think you want to do is what we call testing for statistical significance. Again, please think about taking an introductory course in statistics. These tests for statistical significance are basically trying to see: is this a big difference, or could it just be explained by the fact that there's random variation in any data? So that's the very basic approach. Better than that, I think, is to try and be a bit more systematic about who you compare. So, for example, like I said, we expect the time somebody has been in the job to have a big effect on their performance. So you want to compare people who've had similar amounts of time in the job, and look at which characteristics shape who's a better or worse performer within that cohort. Also try and look at people who are in the same job together. Even better than that, and I think what most organizations would look to do, is what we call multivariate regression. I'm not sure how many of you know much about this, so I'm just going to give a very brief conceptual introduction. It's something you can actually do in Excel, and if you Google it there are a bunch of other statistical packages as well; it's very useful for these kinds of questions. The basic idea of what it's doing is this: plot out your data set, so here y would be performance, and x could be, for example, GPA. Each dot is a person: on the one hand you've got their GPA, on the other hand their performance, and you get a scatter plot like this. And you want to understand: is there a relationship? How much does a higher GPA actually predict performance?
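As a side note, that simple best-versus-worst comparison with a significance test might look something like this in Python. This is just a minimal sketch with made-up GPA numbers, using Welch's t-test as one common significance test:

```python
# Sketch: compare one characteristic (here, GPA) between the best and
# worst performers, and test whether the gap is statistically significant.
# All numbers below are made up for illustration.
import math

def welch_t(a, b):
    """Welch's t-statistic for two independent samples."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    var_a = sum((x - mean_a) ** 2 for x in a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (nb - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / na + var_b / nb)

top = [3.8, 3.6, 3.9, 3.5, 3.7]     # GPAs of best performers (made up)
bottom = [3.1, 3.3, 2.9, 3.2, 3.0]  # GPAs of worst performers (made up)
t = welch_t(top, bottom)
```

A |t| well above roughly 2 suggests the gap is unlikely to be pure random variation; a |t| near zero suggests it could easily be noise.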
Conceptually, what multivariate regression is doing is just finding the best-fit line through this data: the line you would draw that really summarizes this relationship. And it has three very nice characteristics for doing this work. Okay, the first is that it usually doesn't do this graphically; it spits out an equation that looks something like this. So here, maybe performance is 3.2 on average, plus 0.2 times whatever the GPA is. So it gives you a way to quantify the effect of this variable. The second thing that's nice is that it also tells you the level of statistical significance. If we throw a bunch of dots at a piece of paper, we're going to be able to draw some line through them; is that all that's going on here, or does it really look like there's a relationship? The third reason multivariate regression is useful, particularly in this context, is that you can have multiple different x-variables. So, I'm not going to try and draw what this would look like in multidimensional space; I really don't have that artistic talent. But you could imagine you don't just have GPA. Maybe you have their level of related work experience, and maybe you have how well they did on a job knowledge test, and various other things. And you want to try and disentangle how each of these variables shapes performance. That's what the regression will do: it will draw this best-fit line through multidimensional space, and you'll end up with something that tells you, okay, GPA affects performance, and in addition so does experience and so does the job knowledge test, and this is how much. And so it's that ability to untangle the multiple different things we know about people, and figure out which of them have the strongest effect on performance and which don't really matter very much, where regression is really very strong.
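To make that concrete, here's a minimal sketch of a multivariate regression in Python, using ordinary least squares via numpy on simulated data. The "true" effects of GPA, experience, and the job knowledge test are assumed purely for the demo:

```python
# Sketch: regress performance on GPA, related experience, and a
# job-knowledge test score. Data is simulated; the "true" coefficients
# are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n = 200
gpa = rng.uniform(2.0, 4.0, n)
experience = rng.uniform(0.0, 10.0, n)
knowledge = rng.uniform(0.0, 100.0, n)
# Assumed true relationship, plus random noise
perf = (1.0 + 0.5 * gpa + 0.2 * experience + 0.01 * knowledge
        + rng.normal(0.0, 0.3, n))

# Ordinary least squares: perf ~ intercept + GPA + experience + knowledge
X = np.column_stack([np.ones(n), gpa, experience, knowledge])
beta, *_ = np.linalg.lstsq(X, perf, rcond=None)

# Standard errors and t-statistics, for statistical significance
resid = perf - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
t_stats = beta / se
# beta holds [intercept, GPA effect, experience effect, test effect]
```

The estimated coefficients land close to the assumed true values, and the t-statistics tell you which variables look significant, which is exactly the "quantify the effect, check significance, untangle multiple variables" story above.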
And so I think that's probably the way most people, if they had the time, would go about doing this. Best practice would probably also be to think a little about the selection process: we're not looking at what this relationship looks like across all of the people who applied, but only the ones we actually hired. This is quite challenging, because obviously you don't have performance data on the people you didn't hire. There are various statistical techniques for correcting the sample, so you can correct for that. You can also potentially correct for the fact that not everybody stays, so the sample of people for whom you have this data is also selected on the other side, by who's left. There are things like Heckman selection corrections that do this for you. They're quite sophisticated; you'd want to take an advanced statistics course for that, and they also have a number of problems, in that they're quite sensitive to some of their statistical assumptions. So in most cases I think people will stick with multivariate regression. A question a lot of people are interested in as they do this is: okay, so we're thinking we're going to use an algorithm to give us a sense of who to hire. Really, we're going to let a computer tell us? Surely we should have a better sense of this. How good are these algorithms? So, there's some bad news here. The bad news is that these algorithms are very far from perfect. The combination of various selection methods, work samples, cognitive ability tests, all these sorts of things, in most cases probably doesn't predict more than about 30 or 40% of the variation in performance. So we're not that good at really identifying who's going to be your best performer. That's the bad news. The worse news is that they're probably still better than people are.
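Going back to the selection issue for a moment, here's a small simulated illustration of why it matters: if we only observe performance for the people who stayed, and staying is itself related to performance, a naive regression on the observed sample understates the true relationship. All of the numbers and the "staying" rule are made up:

```python
# Sketch: selection bias from only observing people who stayed.
# Simulated applicant pool with an assumed true GPA effect of 0.5.
import numpy as np

rng = np.random.default_rng(1)
n = 5000
gpa = rng.uniform(2.0, 4.0, n)               # applicants' GPAs (made up)
perf = 0.5 * gpa + rng.normal(0.0, 0.5, n)   # assumed true effect: 0.5

def slope(x, y):
    """OLS slope of y on x, with an intercept."""
    X = np.column_stack([np.ones(len(x)), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b[1]

full_slope = slope(gpa, perf)   # close to the true 0.5

# Suppose we only observe people who stayed, and staying depends on
# performance itself: this truncates the sample on the outcome
stayed = perf > 1.5
observed_slope = slope(gpa[stayed], perf[stayed])
# observed_slope comes out noticeably below full_slope
```

That attenuation is the kind of distortion the Heckman-style corrections mentioned above are trying to undo.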
There was a very nice study done by Mitch Hoffman at the University of Toronto and some colleagues, where he looked at data on hiring into call centers. I mentioned earlier that attrition is often one of the things you're worried about. You're really worried about it in a call center, because most of these call centers will take one, two, three, four weeks to train people before they ever let them answer the phone. And then turnover is very high: a lot of people leave after three or six months, something like that, and so it's hugely costly. Being able to drive down turnover is a big influence on cost, so it's one of the things they really screen for heavily. There are a number of vendors that help them with this, and so Mitch had data that looked at how these screens, which are algorithmic (we give you a bunch of tests and then we spit out a score), did at predicting turnover. In particular, what these screens do is advise the manager: this person is a good risk, this person is a poor risk. He found a couple of things. Firstly, when companies adopted these screening techniques, their turnover dropped. So when managers had more information on who was likely to turn over, they did better, in terms of hiring people with lower turnover. The thing that was more frightening was that managers also had discretion: they got this advice, but they didn't have to follow it. And what Mitch found was that when they used their discretion more, turnover was higher. So I think the most obvious interpretation of this is that algorithms are not great, but they're still better than we are. You know, we get back to Daniel Kahneman carrying his log around, right? The idea is that algorithms are not perfect, but we're worse. And so there's huge potential in this space for data analytics and people analytics to improve how we go about hiring.