Learn how probability, math, and statistics can be used to help baseball, football, and basketball teams improve player and lineup selection as well as in-game strategy.

A course from University of Houston System

Math behind Moneyball

36 ratings

From this lesson

Module 6

You will learn how two-person zero sum game theory sheds light on football play selection and soccer penalty kick strategies. Our discussion of basketball begins with an analysis of NBA shooting, box score based player metrics, and the Four Factor concept which explains what makes basketball teams win.

- Professor Wayne Winston, Visiting Professor

Bauer College of Business

Okay. So how do we determine who a really good

basketball player is?

There are two schools of thought. One is to use the box score;

metrics that use the box score are sort of like linear weights in baseball.

They've been around for a long time, although the box score's changed.

They didn't used to keep track of blocks and steals.

I'm not sure exactly when they started, but certainly in the 1950s and

'60s they didn't keep track of a lot of those things. The other school

asks, basically, how does a player move the score

when he's in?

And we'll get to that later.

I believe Jeff Sagarin and

myself invented this idea working with Mark Cuban in the year 2000.

And we'll get to adjusted plus-minus, and then ESPN's version, real plus-minus,

which is publicly available and you can look at.

We'll discuss those things, and

ordinary plus-minus, which originated in hockey, a couple of videos from now.

But let's talk about three linear weight formulas that have been developed.

The first one I believe was NBA Efficiency, and

it simply says, and these are all on a per-minute basis.

Okay, you add up, and it's linear weights,

because you take each statistic from the box score and multiply it by a constant,

be it positive or negative, and add it together.

Okay, so NBA efficiency says add up the good things, points, rebounds,

assists, steals, blocks.

Take away the bad things, turnovers, missed field goals, and

missed free throws, and do that per minute, okay.
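The NBA Efficiency recipe just described can be sketched in a few lines of Python. This is a minimal illustration, not an official implementation; the function name, argument names, and the sample stat line are all assumptions made up for the example.

```python
def nba_efficiency_per_minute(points, rebounds, assists, steals, blocks,
                              turnovers, missed_fg, missed_ft, minutes):
    """NBA Efficiency as described in the lecture: every weight is +1 or -1."""
    good_things = points + rebounds + assists + steals + blocks
    bad_things = turnovers + missed_fg + missed_ft
    return (good_things - bad_things) / minutes


# Hypothetical stat line, invented purely for illustration.
example = nba_efficiency_per_minute(
    points=20, rebounds=8, assists=5, steals=2, blocks=1,
    turnovers=3, missed_fg=9, missed_ft=2, minutes=36,
)
print(round(example, 3))  # (36 good - 14 bad) / 36 minutes
```

Notice that nothing in the argument list measures defense beyond steals and blocks.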

Now I can tell you right away even though this is simple, the weights are all plus

one and minus one, and I really do believe in the principle of parsimony.

I like simple metrics when they capture what's going on, but

there's two problems with this okay.

Where is defense?

You know defense is probably half of basketball, you gotta score points and

you've gotta stop the opponent.

I think defense might even be more than half of basketball,

because you have a weak link on defense they'll find it.

If you have a strength on offense they'll try and take it away, but

a weak link on defense, I mean, it's hard to hide that.

And so, the other problem here, with NBA efficiency,

is the weights given to points and missed field goals.

Suppose you shoot one for three on two-pointers.

Okay now, what you'd get then is, for the points,

you'd get a plus two; for the missed field goals, you'd get a minus two.

So the formula is neutral if you shoot one for three.

Well if you shoot one for three on two pointers.

Okay, you're like the worst shooter in the NBA.

Okay, and that means if you shot 34%.

Like if you made 34 of 100 shots, you get plus 68, minus 66, and

if you shot 34%, the more shots you take the better your NBA efficiency.

That makes me conclude that this is a really terrible metric.

A volume shooter will be rewarded for being a poor shooter.
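To see the volume-shooter problem in numbers, here is a small sketch that isolates the shooting part of NBA Efficiency for two-point shots only. The helper name is hypothetical; the +2 per make and -1 per miss come straight from the formula above.

```python
def efficiency_from_two_point_shots(made, attempts):
    """Shooting contribution to NBA Efficiency: +2 per make, -1 per miss."""
    missed = attempts - made
    return 2 * made - missed


print(efficiency_from_two_point_shots(1, 3))     # one for three: neutral
print(efficiency_from_two_point_shots(34, 100))  # 34%: +68 - 66
print(efficiency_from_two_point_shots(68, 200))  # same 34%, twice the shots
```

Anything above one make in three is a net plus, and the plus grows with shot volume even though the percentage stays the same.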

Okay, and Russell Westbrook by the way had a great season last year, 2014-2015,

triple-doubles coming in almost every game he played

at the end of the season with Durant out.

But you look at his shooting chart,

he was a below average shooter from just about every point on the floor.

Although he did get fouled a lot, but I mean,

an average shooter would probably have shot a better percentage than Westbrook.

I mean you could take a look at that actually.

Okay, so then another attempt was by David Berri, and

I like his work a lot. He wrote the books The Wages of Wins and, I think, Stumbling on

Wins with his co-authors, and he has Wins Produced.

That's a bit complex, and then there's something called Win Shares on

Basketball-Reference, and that's really complex.

I mean, I want to deal here with simple metrics, but

the original Win Score metric wasn't too bad.

It's this, basically, and again, you do this per minute.

You take points plus rebounds, plus steals, plus half of assists,

which is totally arbitrary to me.

Why is an assist half as important as a rebound, I have no idea.

Half of blocked shots, minus field goal attempts, minus turnovers,

minus half of free throw attempts, minus a half of personal fouls, and

I like the fact that personal fouls are in there.

And so, basically, the nice thing about this formula is the break-even point:

what shooting percentage makes the formula neutral?

If you shoot one for two on two-pointers, then the formula is neutral.

See, if I take two shots and I make one that's a two-pointer, I'd get a 2 here,

and I have two field goal attempts, so I get a minus 2, and actually the average

effective field goal percentage in the NBA is pretty close to 50%.

So this does a much better job of capturing

whether you're a good shooter or not.

If you shoot over 50%, the more shots you take, the higher your win score.

Shoot less than 50%, the more shots you take, the lower your Win Score.
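The original Win Score formula and its 50% break-even point can be checked with a short sketch. The function and argument names are assumptions for illustration; the weights are the ones listed above. Divide the result by minutes played to put it on a per-minute basis.

```python
def win_score(points, rebounds, steals, assists, blocks,
              fga, turnovers, fta, fouls):
    """Original Win Score, using the weights given in the lecture."""
    return (points + rebounds + steals
            + 0.5 * assists + 0.5 * blocks
            - fga - turnovers
            - 0.5 * fta - 0.5 * fouls)


def win_score_from_two_point_shots(made, attempts):
    """Shooting-only part: +2 per made two-pointer, -1 per attempt."""
    return win_score(points=2 * made, rebounds=0, steals=0, assists=0,
                     blocks=0, fga=attempts, turnovers=0, fta=0, fouls=0)


print(win_score_from_two_point_shots(1, 2))   # 50%: neutral
print(win_score_from_two_point_shots(6, 10))  # above 50%: positive
print(win_score_from_two_point_shots(4, 10))  # below 50%: negative
```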

Now why don't I like this formula,

although basically it was modified in 2012?

Because I have a plus one for rebounds and minus one for turnovers.

That says a rebound is as good as a turnover is bad.

So I mean, let's suppose I have a game I got,

ten rebounds and nine turnovers.

So I'm thinking of James Harden, 14 turnovers in Game 5,

the Western Conference Finals, I can't get that out of my mind, and

I'm sure he can't either.

Okay, but if I had ten rebounds and nine turnovers, I get a plus 10 and a minus 9,

and so if you could scale that up, that's a plus.

If I could scale up ten rebounds and nine turnovers, I'll raise my Win Score.

That makes no sense whatsoever.

I mean, I don't think any player would be happy with a ten rebound,

nine turnover game.

Okay, so I mean that to me means that rebounds are overvalued.

I think turnovers are valued fairly properly.

But in 2012 (and we'll get to Hollinger in a minute)

the Win Score formula changed

in the rebound aspect.

So you give a weight of +1 to an offensive rebound,

and only give a weight of 0.5 to a defensive rebound.

Now, I think that makes a lot more sense,

although I really don't know how to validate this, but

it makes sense because a defensive rebound, probably somebody else would've got it.

Offensive rebound is sort of like an extra possession you got your team,

it's almost like a steal.

It's almost like stealing the ball, in a sense.

So giving twice the weight to an offensive rebound as a defensive rebound,

I think does a pretty nice job of fixing this formula.
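A quick sketch shows how the rebound reweighting changes the ten-rebound, nine-turnover game. The function names are hypothetical, and the split of the ten rebounds into two offensive and eight defensive is an assumption made up for the example, not something from the lecture.

```python
def rebound_turnover_value_original(rebounds, turnovers):
    """Original Win Score: every rebound is +1, every turnover is -1."""
    return rebounds - turnovers


def rebound_turnover_value_modified(off_rebounds, def_rebounds, turnovers):
    """Modified weights: offensive rebound +1, defensive rebound +0.5."""
    return off_rebounds + 0.5 * def_rebounds - turnovers


# Assumed split: 2 offensive + 8 defensive rebounds, 9 turnovers.
print(rebound_turnover_value_original(10, 9))     # 1: rewarded under the old weights
print(rebound_turnover_value_modified(2, 8, 9))   # -3.0: penalized under the new weights
```

Under the modified weights, only a game in which most of the rebounds are offensive would still come out positive.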

So I really don't have any strong objections to comparing

players on a per minute basis by this formula.

But again, where is defense?

Defense, defense, as the Knick fans used to chant when I was rooting for the Bradley,

DeBusschere, Frazier, Willis Reed, Dick Barnett teams, okay, which I worship.

Red Holzman, his famous phrase was "see the ball."

I mean I grew up on those teams.

I mean I never, I don't think I've seen better ball movement ever

than those teams, and the great Garden crowd in the late '60s and early '70s

started that chant defense, defense, which now we hear in every sport everywhere.

Okay. So I like Win Score, but

I think it's missing defense.

Now, player efficiency rating, you can go on Basketball-Reference and

read five pages on how it's computed, and it's really complicated, but

let me just show you.

I mean, almost everybody who talks about how good a player was in a year,

refers to player efficiency.

Well, I think player efficiency rating is not that efficient,

as we'll see in a couple minutes.

We'll deconstruct it using regression, but if I go to ESPN.com,

hopefully that'll pop up here, and I go to NBA stats,

Hollinger player ratings (we'll get to real plus-minus later).

Okay, so the best PER rating; a PER rating of 30 is historically great.

The average NBA player has a PER rating of 15, and so

Russell Westbrook comes ahead of Steph Curry here, and

ahead of James Harden, and I just don't buy that.

I think that's, you'll see PER I think rewards inefficient shooters, and

Russell Westbrook was really an inefficient shooter,

but you look at his true shooting percentage here, he's 53.6.

And let's see, how far down do I have to get, that includes free throws,

which he's very good at, but I gotta go down to LaMarcus Aldridge who's actually

the king of the long two pointer, to find somebody who's basically a worse shooter

with true shooting percentage in the top 11.

You can see Steph Curry is unbelievable.

He's dominant there 60, well him and Kevin Durant are basically dominant.

Surprising where it is for Whiteside, because I'm thinking a dunk is a pretty high

percentage shot.

But LeBron, again, is not that great a percentage shooter,

but he is again much better than Russell Westbrook there.

Okay and so, per rating, again it's based on the statistics in the box score,

and a lot of defense is not in the box score.

Okay, if you take a charge that's not in the box score.

Okay, if you box your man out and

somebody else gets the rebound it's not in the box score.

Save the ball from going out of bounds, okay, and your team gets possession.

Helps the defense, helps the offense, it's not in the box score.

Set a good screen, or you go through the screen properly,

it's not in the box score.

Okay, the hockey assist is not in the box score, though the NBA is sort of keeping

track of those with the SportVU data that we'll talk about later.

Okay, so how can we deconstruct PER?

Okay, so what I did is I know it's based on box scores statistics so

I did something which seems to be pretty obvious.

Okay? I had my class do this for homework and

I had a student Paul.

I'm not sure I'm saying your name right Paul,

because you're probably going to watch this.

One of my A students at the Kelley School of Business at Indiana University,

who just loved this stuff, and he kept asking for more work.

He did this in one afternoon and said give me something else to do.

Okay so, what we're going to look at here:

the dependent variable is player efficiency rating, and

then the independent variables were the per-minute box score stats.

This is 2015, so I got this off Basketball-Reference here.

Okay.

Okay, well now actually I think this is the old data, this is from a couple years ago,

but this will make our point to you, okay.

And so, basically,

you run a regression, you predict the PER rating, take points per minute,

rebounds per minute, assists per minute, steals per minute, turnovers per minute,

missed shots per minute, missed free throws per minute,

blocks per minute, personal fouls per minute, and this is old.

Again, this is data from a couple years ago that Paul worked with.

So you run a regression, predict the PER rating from this stuff, and

we'll give you a homework problem where you can work on this yourself okay.

And so we can explain 98% of the variation in

PER rating, and 95% of the time

we get the PER rating right within double this, the standard error, and our p-values are very low.

Look at these, okay, they're all less than two in a thousand here, our p-values.

So you can see the more points you score, the higher your PER score value.

Rebounds per minute helps,

assists per minute helps, steals help, turnovers hurt you, missed shot hurts you,

missed free throws hurt you, personal fouls hurt you, and blocks help you.

That makes perfect sense.
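The regression just described can be sketched with ordinary least squares in NumPy. The data below is SYNTHETIC, generated only so the example runs; it is not the Basketball-Reference data used in class. The weights used to generate it are placeholders, with only the points (about 44) and missed-shot (about -40) values echoing numbers quoted in this lecture, and the signs chosen to match the directions listed above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-minute predictors, in the order named in the lecture.
features = ["points", "rebounds", "assists", "steals", "turnovers",
            "missed_shots", "missed_free_throws", "blocks", "personal_fouls"]

# SYNTHETIC per-minute rates for 200 made-up players (not real NBA data).
n_players = 200
X = rng.uniform(0.0, 1.0, size=(n_players, len(features)))

# Placeholder weights used only to generate fake PER values.
assumed_weights = np.array([44.0, 10.0, 15.0, 30.0, -40.0,
                            -40.0, -20.0, 15.0, -10.0])
per = 2.0 + X @ assumed_weights + rng.normal(0.0, 0.5, size=n_players)

# Ordinary least squares with an intercept column.
design = np.column_stack([np.ones(n_players), X])
coef, *_ = np.linalg.lstsq(design, per, rcond=None)

# Share of the variation in PER explained by the box score rates.
fitted = design @ coef
r_squared = 1.0 - np.sum((per - fitted) ** 2) / np.sum((per - per.mean()) ** 2)

for name, weight in zip(features, coef[1:]):
    print(f"{name:>20s}: {weight:8.2f}")
print(f"R^2 = {r_squared:.4f}")
```

With real data you would replace `X` and `per` with each player's per-minute box score rates and published PER; a statistics package would also report the p-values mentioned above.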

Okay, well the problem is, this rewards the volume shooter.

Let's take a look at that.

So let's suppose you shoot one for three, okay.

So if you shoot one for three on the points you get 44 times two, and

on the missed shots you get roughly minus 40.

Since you shot 1 for 3, minus 40 times 2, and that gives you a plus 8.

So if I'm the worst shooter in the NBA at 33%,

okay the more shots I take basically, the higher my PER rating.
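The same arithmetic can be written out as a sketch, using the rounded regression weights quoted above (about +44 per point and about -40 per missed shot). The function names are hypothetical, and the result is in the regression's own units, so only its sign and relative size matter here.

```python
POINTS_WEIGHT = 44.0        # rounded regression weight quoted in the lecture
MISSED_SHOT_WEIGHT = -40.0  # rounded regression weight quoted in the lecture


def per_effect_of_two_point_shots(made, attempts):
    """Net regression effect of taking two-point shots."""
    missed = attempts - made
    return POINTS_WEIGHT * 2 * made + MISSED_SHOT_WEIGHT * missed


def break_even_two_point_percentage():
    """Make rate p where 2 * 44 * p equals 40 * (1 - p)."""
    return -MISSED_SHOT_WEIGHT / (2 * POINTS_WEIGHT - MISSED_SHOT_WEIGHT)


print(per_effect_of_two_point_shots(1, 3))   # 88 - 80 = +8
print(break_even_two_point_percentage())     # 0.3125
```

Under these rounded weights, any two-point shooter above roughly 31% raises the predicted PER by shooting more.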

That makes no sense but it sure explains why Russell Westbrook

is basically really high in PER, higher than I think he should be,

okay, because PER rewards inefficient shooters, I believe.

Okay, the

high-volume inefficient shooters can get rewarded by PER, as this regression shows.

Okay, so that leaves us probably wanting more.

Is there a way to improve on box score metrics?

Or is there a box score metric that would incorporate defense better?

And I think that's really hard,

because the box score doesn't keep everything we need for defense.

Maybe with the SportVU cameras,

somebody will come up with something that's a little bit better.

But we'll start talking about evaluating players based on how they move

the score of the game when they're in.

But first I think we have to talk about how to

compute the raw plus-minus, and how to download data that will help us with that.