0:13

Welcome to class.

In this lecture we're going to talk about the basics of probability.

Now you've probably already seen some examples of probability in action.

For example, what's the chance that I'll lose to Scott in four conflicts?

What's the chance that Scott's cheating?

What's the chance that I'm going to grade all this week's assignments?

0:46

On a serious note, we're going to teach you the basics of probability here, and

kind of equip you then to use those in Python to do interesting things.

So let's get to work.

[BLANK_AUDIO]

All right, to get started, I've built you a class page

that contains an overview of the

terminology that we'll use for probability.

1:54

Let's say we have a trial involving a single six-sided die.

So what are the outcomes of that?

Well, it's the numbers one to six.

Okay.

If the die is fair, meaning we have the same chance of rolling any of

the outcomes, then the probability for each

of those individual outcomes is exactly 1/6.

You may occasionally hear the phrase uniformly distributed.

That just means that all the outcomes have the same chance.

Okay, one more piece of terminology that you'll often hear about is events.

So an event is a set of outcomes.

For example, what is the event corresponding to an even die roll?

2:34

It's the set 2, 4, and 6.

It's whenever the die comes up with an even number.

We can talk about the probability associated with an event.

In that particular case, we would add up

the probabilities attached to each of the individual outcomes.

So it would be 1/6 plus 1/6 plus 1/6

to give a probability of one half for an even die roll.
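
As a rough sketch of the calculation just described, assuming a fair six-sided die, the probability of an event can be found by adding up the probabilities of its outcomes. The names `outcomes`, `probability`, and `event_probability` are made up for this illustration and are not from the class page.

```python
from fractions import Fraction

# Outcomes of a single roll of a six-sided die (assumed to be fair).
outcomes = [1, 2, 3, 4, 5, 6]

# Uniformly distributed: every outcome gets the same probability, 1/6.
probability = {outcome: Fraction(1, len(outcomes)) for outcome in outcomes}

def event_probability(event):
    """Add up the probabilities of the outcomes that make up the event."""
    return sum(probability[outcome] for outcome in event)

even_roll = {2, 4, 6}
print(event_probability(even_roll))  # prints 1/2
```

Using `Fraction` keeps the arithmetic exact, so the result is exactly one half rather than a rounded float.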

3:07

To do that we just simply have to understand how to generate random numbers.

You've probably already seen that in your previous Python programming experience, so

I'm just going to show you a very simple example to get started.

So here I've written a little Python code.

And this does the following, this function roll_die just takes a number of sides for

my die and uses random.randrange to actually

generate a die roll and return that result.

The only thing you might notice here is that

I've added one to it because typically our dice

are numbered from one to the number of sides,

not zero to the number of sides minus one.

And then I built a little function here: I give it a number of sides and a number

of rolls, and it just rolls the die that

number of times and prints out the outcomes.

So in this particular case let's roll a

six-sided die, and I'm going to do ten trials.
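
The code on screen is not part of this transcript, so here is a minimal sketch of what it might look like based on the description. `roll_die` and `random.randrange` are named in the lecture; the name `run_trials` and the exact structure are assumptions.

```python
import random

def roll_die(num_sides):
    """Return a random roll of a die with sides numbered 1 to num_sides."""
    # random.randrange(num_sides) gives 0 to num_sides - 1, so add one.
    return random.randrange(num_sides) + 1

def run_trials(num_sides, num_rolls):
    """Roll the die num_rolls times, print each outcome, and return them."""
    rolls = [roll_die(num_sides) for _ in range(num_rolls)]
    for roll in rolls:
        print(roll)
    return rolls

rolls = run_trials(6, 10)  # a six-sided die, ten trials
```

The printed numbers will differ from run to run, since each roll is random.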

4:10

So now let's go on and I'll show you a

little more interesting application of computing to probability and randomness.

For a more interesting application, let's go back and

think about how Scott is whenever we play games.

So you've probably seen instances where maybe he

fudges a little bit, maybe even just cheats.

For example we're rolling a die, he might have

a die that's loaded, so it prefers ones over sixes.

Or, kind of more worrisome, maybe we're playing a game in CodeSkulptor

and somehow randrange always returns the

right random number for his particular action.

How can we catch him?

4:58

All right, let's do that.

So here I have a function in Python.

It's called plot fairness.

It takes the particular side of the die that we're looking for.

It takes the number of sides of the die.

And it takes a maximum number of trials

that we're going to actually do our test on.

So if you look inside the code, what it

does is up to max trials it actually runs a

sequence of trials where we roll the die and

keep track if it came up with the desired side.

And then we compute the ratio of the number of times it comes up

the way we want to the total number of trials, and compare that to the mathematical value.

5:44

So what we should see here is that, as we increase the maximum number of trials,

the difference between the mathematical and

the computed value should get smaller and smaller.
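
The plotting code itself is not in this transcript, so the following is only a sketch of the idea, under the assumption that each trial count gets its own fresh batch of rolls. The name `fairness_differences` is hypothetical, and it returns the data points instead of plotting them.

```python
import random

def roll_die(num_sides):
    """Return a random roll of a die with sides numbered 1 to num_sides."""
    return random.randrange(num_sides) + 1

def fairness_differences(side, num_sides, max_trials):
    """For each trial count from 1 to max_trials, roll the die that many
    times and record (trial count, computed value - mathematical value)."""
    expected = 1.0 / num_sides  # mathematical probability for a fair die
    differences = []
    for num_trials in range(1, max_trials + 1):
        hits = 0
        for _ in range(num_trials):
            if roll_die(num_sides) == side:
                hits += 1
        computed = hits / num_trials  # fraction of rolls showing the side
        differences.append((num_trials, computed - expected))
    return differences

diffs = fairness_differences(1, 6, 300)
print(diffs[0])   # difference after a single roll
print(diffs[-1])  # difference after 300 rolls
```

For a fair die, the differences near the end of the list should usually be much closer to zero than the ones at the start.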

So let's try that real quick and see what happens.

So here's a plot with 0 trials.

The difference between the mathematical and

the computed value is actually pretty big.

But as we increase the number of trials, notice here that

the difference is kind of getting closer and closer to 0.

This is what we should expect if the number of trials was really, really large.

So when we get up to 300 trials here, you can see that the

values are maybe kind of bouncing around by about maybe 3 to 5% around 0.

6:26

So we'd like to try maybe even more trials.

So here's a trick we can do.

We don't have to basically try, like, 200 trials

and then 201 trials and 202 trials and so forth.

We can jump up by strides of maybe ten

trials and just do 200, 210, 220 and so forth.

So I have a stride that I can vary here.

And so if I increase that stride, I can do more trials, so let's do that.

So I'm going to go up; the max number of trials we're going to try is a thousand.
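
One way the stride trick could look in Python is to pass a step to `range`, so that only every tenth trial count is tested. This is a sketch, not the code from the lecture; the name `strided_differences` and its parameters are assumptions.

```python
import random

def strided_differences(side, num_sides, max_trials, stride):
    """Test only the trial counts stride, 2 * stride, ... up to max_trials."""
    expected = 1.0 / num_sides  # mathematical probability for a fair die
    results = []
    # The third argument to range is the step, which acts as the stride.
    for num_trials in range(stride, max_trials + 1, stride):
        hits = sum(1 for _ in range(num_trials)
                   if random.randrange(num_sides) + 1 == side)
        results.append((num_trials, hits / num_trials - expected))
    return results

points = strided_differences(1, 6, 1000, 10)
print(len(points))  # prints 100
print(points[-1])   # the trial count 1000 and its difference
```

With a stride of ten, reaching 1,000 trials takes 100 data points instead of 1,000, which is why a larger stride makes bigger experiments affordable.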

6:52

So if we look at that, what you can see here is that as

we get out past 400 or 500 or 600, up to 1,000

trials, the spread has now gone down; it's actually much smaller, and

we're only maybe, I don't know, like 2 or 3% away from zero.

7:05

So you can actually keep doing this, increasing

the stride to, say, 100, and increasing the number of trials.

You can kind of see if that difference is getting smaller and smaller.

That's a sign that your die is actually fair.

This is kind of a first little trick where we can use

computation to check things like whether a random process is fair or not.

I'll finish up by showing you one more kind of very interesting example.

[BLANK_AUDIO]

All right, let's finish off with a lighthearted example

of a trial that has lots of potential outcomes.

So here I've got a program.

This was written by a student in our intro Python class.

He wrote it after only knowing Python for two weeks.

It's actually quite a remarkable program.

It's a love song generator.

It was by Antonio, and he has a little function here called print verse,

print first verse, that actually chooses between

four possible lines to start his love song.

And he has some more helper functions.

And at the end down here, he has kind of two main functions.

One that prints a stanza by kind of

putting together four lines in a single stanza randomly,

and then he makes a love song that consists

of three stanzas, and an optional riff by Pitbull.
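
The student's program is not reproduced in this transcript, so here is only a rough sketch of the structure described, using `random.choice` to pick each line. All of the names are hypothetical, the line text is placeholder text and not the actual lyrics, and the 50/50 chance for the riff is an assumption.

```python
import random

# Placeholder text standing in for the real lyrics: four options for
# each of the four lines in a stanza.
LINE_CHOICES = [
    ["line %d, option %s" % (position, option) for option in "ABCD"]
    for position in range(1, 5)
]

def make_stanza():
    """Build one stanza by picking one option at random for each line."""
    return [random.choice(options) for options in LINE_CHOICES]

def make_love_song(num_stanzas=3, riff_chance=0.5):
    """Build a song from random stanzas plus an optional riff at the end."""
    song = []
    for _ in range(num_stanzas):
        song.extend(make_stanza())
    if random.random() < riff_chance:  # assumed 50/50 chance of a riff
        song.append("(optional riff)")
    return song

print("\n".join(make_love_song()))
```

Returning the lines as a list, instead of printing them directly, makes it easy to inspect a song or count its lines.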

[LAUGH] Let's just run it.

So here it is, all right.

What did the random number generator give us this time?

In this dark and cold winter, I walk alone, trying to forget you.

Even if you don't share my passion for Lord Fenner.

I don't know who Lord Fenner is.

One day we'll run away together because I love you babe, because I love you, babe.

Oh, and we didn't get the riff by Pitbull.

I'm so disappointed.

Let's run it again here.

Oh come on, I want the riff by Pitbull.

Come on random number generator, don't be this way.

There we go.

We finally got tunes, tunes, tunes, tunes.

Excellent.

8:47

So this is an example of a trial that can have lots and lots of outcomes.

They don't always have to be, maybe one to six, or deal a card.

You can have programs that generate millions of possible outcomes.

In fact, let's just do a quick bit of counting here to finish off the lecture.

Let's count how many possible love songs this generator can create.

9:08

So we need to do a little counting.

So if we look at how we built up the thing, we had three stanzas

and an optional riff by Pitbull. So how many possible stanzas can we have?

Well, let's see, a stanza consists of a

first line, there were four possibilities for that,

four possibilities for the second line, four possibilities

for the third, four possibilities for the fourth.

So there's four to the fourth possible stanzas we can create.

Four to the fourth is 256.

So the number of love songs is going to be 256 times 256 times 256.

And then we have, I guess, a 50/50 chance whether Pitbull actually provides a riff.

So let's go over and compute that real quick.

So we're going to do print, let's see, two, that's whether or not we have a

riff by Pitbull, times 256 to the third.
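
The count can be sketched in Python as follows. The variable names are made up for illustration; the numbers are the ones from the lecture: four choices for each of four lines, three stanzas, and a riff that is either present or absent.

```python
choices_per_line = 4
lines_per_stanza = 4
stanzas_per_song = 3
riff_options = 2  # the riff is either there or it isn't

# Each line is chosen independently, so the choices multiply.
possible_stanzas = choices_per_line ** lines_per_stanza
possible_songs = riff_options * possible_stanzas ** stanzas_per_song

print(possible_stanzas)  # prints 256
print(possible_songs)    # prints 33554432
```

So this small generator can produce over 33 million distinct love songs.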