Learn how probability, math, and statistics can be used to help baseball, football, and basketball teams improve player and lineup selection as well as in-game strategy.

A course from University of Houston System

Math behind Moneyball

From this lesson

Module 6

You will learn how two-person zero sum game theory sheds light on football play selection and soccer penalty kick strategies. Our discussion of basketball begins with an analysis of NBA shooting, box score based player metrics, and the Four Factor concept which explains what makes basketball teams win.

- Professor Wayne Winston, Visiting Professor

Bauer College of Business

Okay, let's talk about Dean Oliver's great four factors law.

So, Dean worked a long time for the Denver Nuggets.

I believe he played college basketball for Caltech.

Worked for the Denver Nuggets.

Became the director of ESPN stats and recently he went back

to the NBA to be the analytics director, I believe, for the Sacramento Kings.

So you remember we talked about the Giants in baseball, what made them win.

We talked about there's hitting, there's pitching, there's fielding.

And base running a little bit, but that's usually not [INAUDIBLE].

Okay, and in the NFL you could

break it down to pass offense,

pass defense, run offense,

run defense, and special teams.

Okay, so in basketball how can you break down what makes a team

win if you want to see where your team ranks on various things.

So I have data from the 2007/2008 NBA season here.

And the four factors, what are the four things that sort of

you need to be good at to be good at basketball?

Well, you need to be able to shoot well and

stop the other team from shooting well.

And so that's one factor, which is your effective field goal percentage minus

the opponent's, and that's the difference.

So here Miami's effective field goal percentage was 52.44% and their opponents' was 47.51;

the difference is what matters.

Well you need to go to the foul line, so

you take free throw attempts divided by field goal attempts.

So the Heat would get 36 free throw attempts per 100 field goal attempts.

Their opponents 29.9.

Really it probably should be something like free throws made would be better, but

they usually use free throw attempts.

Okay. Now then you want to not turn the ball

over.

Now here, the difference on free throw attempts from offense to defense,

your team minus the other team: a positive difference is good.

But on turnovers a positive difference is bad; it means you turned the ball over more than your opponent.

So it's turnovers per 100 possessions for the Heat was 13.52 and opponent was 12.6.

And then rebound percentage.

What percentage of your missed shots did you rebound on offense?

What percentage of the opponent's missed shots does the opponent rebound?

Take the difference.

So those four differences are your four factors, and basically the question is what

each of those is worth, and then how do you rank them?
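As a rough sketch of the bookkeeping, here is how the differences could be computed in Python from the Heat numbers quoted above. The dictionary layout and the function name are assumptions made for illustration, not the spreadsheet's layout, and the rebound figures are left out because the video does not quote them.

```python
# Values quoted in the lecture for the Heat row (2007/2008 data).
# Key names are illustrative assumptions, not column names from the spreadsheet.
team = {"efg_pct": 52.44, "ft_rate": 36.0, "tov_per_100": 13.52}
opponent = {"efg_pct": 47.51, "ft_rate": 29.9, "tov_per_100": 12.6}


def factor_differences(team, opponent):
    """Return the team-minus-opponent difference for each factor."""
    return {name: round(team[name] - opponent[name], 2) for name in team}


diffs = factor_differences(team, opponent)

# Positive is good for shooting and free throw rate,
# but a positive turnover difference is bad.
for name, value in diffs.items():
    print(f"{name}: {value:+.2f}")
```

Each team in the league would get one such row of differences, and those rows are what the regression uses.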

So in the regression data, we've got the difference columns put adjacent.

Okay, and what you're trying to predict is the number of wins.

Okay.

So we're going to run regression in the season, and

then you have the EFG difference as one independent,

you've got the free throws divided by field goal attempts as another variable.

Turnovers per 100 possessions.

Difference.

And your offensive rebound percentage minus

the opponent offense rebound percentage.

Okay, so you can run a regression.

We know how to do that, data analysis, regression.

So the y range is going to be the wins and

the x range is going to be the four differences.

So it's going to be these four columns.

Okay, so we've got row 56 through 86 there.

That should work, we're going to have labels.

I can put this on the same worksheet.

Okay, so our R squared is 93.1; I've got the results somewhere else,

I think that's the right answer.
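For readers working outside Excel, the same kind of regression can be sketched in plain Python. This is a minimal sketch, not the spreadsheet's computation: the 30 rows of data below are randomly generated, and the coefficients in `made_up_beta` are invented purely so the code has something to fit. The helper names `solve` and `ols` are also assumptions.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Least squares with an intercept: returns (coefficients, r_squared, std_error)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - sse / sst
    std_error = (sse / (len(y) - k)) ** 0.5  # residual standard error
    return beta, r_squared, std_error


# Synthetic stand-in data: 30 "teams", four factor differences each.
random.seed(0)
made_up_beta = [41.0, 3.0, 1.0, -3.5, 1.0]  # invented, NOT the lecture's results
rows = [[random.gauss(0, 2.5) for _ in range(4)] for _ in range(30)]
wins = [
    made_up_beta[0]
    + sum(b * v for b, v in zip(made_up_beta[1:], r))
    + random.gauss(0, 3)
    for r in rows
]

beta, r_squared, std_error = ols(rows, wins)
print("coefficients:", [round(b, 2) for b in beta])
print("R squared:", round(r_squared, 3), "standard error:", round(std_error, 2))
```

With the real 30-team table in place of the synthetic rows, `r_squared` and `std_error` correspond to the R squared and standard error that Excel's regression tool reports.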

Okay so this is the four factor regression results.

Okay.

So let's see how good this model will do for

this data in predicting how many wins the team has.

So the r squared is 93 percent.

93% of variation in wins is explained.

The standard error, we know is important, 3.72.

So, you double that.

95% of the time we can use four factors

to predict wins within two standard errors.

So, two times 3.72, about seven wins.
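The two-standard-error rule can be written out as a short calculation. The 3.72 is the standard error quoted in the lecture; the 50-win prediction is a hypothetical input chosen only to show the arithmetic, and `prediction_band` is an illustrative name.

```python
from statistics import NormalDist

STD_ERROR = 3.72  # standard error of the regression, as quoted in the lecture


def prediction_band(predicted_wins, std_error=STD_ERROR):
    """Return the (low, high) range of plus or minus two standard errors."""
    half_width = 2 * std_error
    return predicted_wins - half_width, predicted_wins + half_width


# Hypothetical team that the model predicts at 50 wins.
low, high = prediction_band(50)
print(f"{low:.2f} to {high:.2f} wins")

# Share of a normal distribution within two standard deviations of the mean,
# which is where the "about 95% of the time" rule of thumb comes from.
coverage = NormalDist().cdf(2) - NormalDist().cdf(-2)
print(f"{coverage:.1%}")
```

The rule of thumb assumes the regression errors are roughly normal, which is the usual working assumption rather than something the video checks.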

And we can see all these variables are significant.

The P values are very low.

Rebounds don't appear to be quite as important.

They make sense.

Okay, you have one more turnover for a hundred possessions.

You're going to win about four fewer, about 3.7 fewer games, okay.

And if you think about that that makes sense.

One more turnover per 100 possessions.

Let's say on offense well that's going to be,

you don't have 100 possessions per game.

Okay so one more turnover.

Okay, it's going to cost you possession, which is worth about a point, so that

would cost you about three wins, but then you're giving them a better possession

when you turn the ball over, they'll average more points for possession.

So that coefficient makes sense and

you can interpret the other coefficients similarly.

Okay, free throw differences, like if you had one more free throw

per hundred possessions, per hundred field goal attempts,

that's a little less than one free throw extra per game,

which might be about 0.7 points, okay?

And, well, you might have scored on that possession anyway, so it's not clear

how to evaluate what that coefficient should be.

But all the P values are low and that's good.

So what percentage of basketball is based on shooting offense minus defense?

Free throw offense minus defense?

Notice the four factors.

What percentage of basketball is basically based on those four?

And so I did a bit of an analysis here and this you could use in any regression.

It's a crude way to figure out how important each variable is.

So, I've copied the coefficients from our regression over here.

And so then we ask ourselves if we can improve ourselves by

one standard deviation.

On each factor, how many more wins from average

to one standard deviation above average?

How many wins would we get?

And then we can see for each factor what percentage of the total is each factor.

Now I can tell you Dean Oliver said,

although Basketball Reference disagrees with this, that shooting would be 40%.

Turnovers would be 25%.

Rebounding 20%, so that leaves what?

15% for free throws.

Okay.

Now let's see what we get here?

So we need the standard deviation for each of these differences, so

I've got that with the stdev function.

For instance, the standard deviation on EFG offense minus defense is 2.82,

free throw attempts divided by field goal attempts is 3.75, etc.

And so, let's take the difference in effective field goal percentage.

So you want to make yourself one standard deviation better than average which means

move from the 50th percentile to the 84th percentile,

you would have to go up by 2.82 on effective field-goal percentage, and

multiply that by the absolute value of the coefficient here.

That would be worth ten wins.

And on free throw difference, it'd be worth three wins, and

on turnover difference it's worth about five wins.

I took the absolute value of the coefficients, and on run defense, sorry,

rebound difference, it's about 1.3 wins, and you see this adds up to about 19.

So ten wins out of 19 is about 53%.
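The share calculation can be sketched in a few lines. The wins-per-standard-deviation figures are the rounded values quoted in the lecture, and the Dean Oliver weights are the ones quoted earlier in the video; the dictionary keys and the function name are assumptions for illustration. Because the inputs are rounded, the shooting share comes out a little over half, close to the roughly 53% mentioned above.

```python
from statistics import NormalDist

# Wins gained by improving one standard deviation on each factor,
# rounded as quoted in the lecture (|coefficient| times standard deviation).
wins_per_sd = {
    "shooting": 10.0,
    "free throws": 3.0,
    "turnovers": 5.0,
    "rebounding": 1.3,
}

# Dean Oliver's weights as quoted in the lecture.
oliver = {
    "shooting": 0.40,
    "free throws": 0.15,
    "turnovers": 0.25,
    "rebounding": 0.20,
}


def importance_shares(wins_by_factor):
    """Express each factor's wins as a fraction of the total across factors."""
    total = sum(wins_by_factor.values())
    return {name: wins / total for name, wins in wins_by_factor.items()}


shares = importance_shares(wins_per_sd)
for name, share in shares.items():
    print(f"{name:12s} regression {share:6.1%}   Oliver {oliver[name]:6.1%}")

# Under a normal distribution, one standard deviation above average
# sits at roughly the 84th percentile.
percentile_one_sd = NormalDist().cdf(1.0)
print(round(percentile_one_sd, 4))
```

This is, as the lecture says, a crude measure: it treats a one-standard-deviation improvement as equally achievable on every factor.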

So you can see, right here I get almost exactly what Dean Oliver says,

although basketball reference seems to disagree with these percentages.

But I think that shooting is

more important than Dean Oliver got and I sort of agree with that and

I get rebounding is less important than Dean Oliver got.

There isn't much differential on rebound percentages.

Well that's the end of our video on the four factors.

But for any GM in basketball, one of the first things he or

she should do is look at basically where they stand on the four factors and

see basically who they can get in the free agent market or

via the draft who might impact those four factors.

Okay, well, we'll see you in the next video, where we'll probably talk

about how we can measure how good a basketball player is.

You might think that's simple.

Okay, but it really isn't.