Something to keep in mind is that when you're splitting your data up into training, testing, and validation sets, each subset can get a little bit small, and you need to avoid small sample sizes, particularly in the test set.

And the reason why is this: suppose you were predicting a binary outcome. In my case, a very common thing to try to do is to predict diseased versus healthy; in general, it might be something like whether people will click on an ad or whether they won't.

Then one classifier is just flipping a coin. You could always just flip a coin and say they'll be diseased if the coin comes up heads, and not diseased if it comes up tails.
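As a minimal sketch of that baseline in Python (the function name and the label strings here are made up for illustration, not anything from a real library):

```python
import random

def coin_flip_classifier(n_samples, seed=None):
    """Predict a label for each sample by flipping a fair coin.

    This deliberately ignores the data entirely -- it is the
    silly baseline classifier described above, not a real model.
    """
    rng = random.Random(seed)
    return ["diseased" if rng.random() < 0.5 else "healthy"
            for _ in range(n_samples)]

# Example: "predictions" for a ten-sample test set
print(coin_flip_classifier(10, seed=42))
```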

And so the probability of a perfect classification using this really silly algorithm is one half raised to the sample size. In other words, on a single sample you'll be right half the time just by chance by flipping the coin. And, supposing each prediction is independent, each additional coin flip multiplies that probability by another one half, so the chance of getting every sample right drops to (1/2)^n for a test set of size n.

So if your test set has only one sample in it, you have about a 50/50 chance of getting that sample right. Even if you got a prediction accuracy of 100% on the test set, there was a 50% chance of that happening with nothing but a coin flip.

With n equals 2, you still have a 25% chance of 100% accuracy. And with n equals 10 in your test set, there's only about a 0.1% chance of getting 100% accuracy by coin flipping alone.
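Here is a quick sketch of that arithmetic, just evaluating the (1/2)^n formula from above for those three test set sizes:

```python
# Probability that a fair coin flip classifies every test sample
# correctly, for a few test set sizes n.
for n in (1, 2, 10):
    p = 0.5 ** n
    print(f"n = {n:2d}: chance of 100% accuracy by coin flip = {p:.4%}")

# Output:
# n =  1: chance of 100% accuracy by coin flip = 50.0000%
# n =  2: chance of 100% accuracy by coin flip = 25.0000%
# n = 10: chance of 100% accuracy by coin flip = 0.0977%
```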

So if you see 100% accuracy on a test set of that size, you can feel a little bit more confident that it's actually real and not just something random.

So this suggests that we should make sure that our test sets, especially, are of relatively large size, so we can be sure that we're not just getting good prediction accuracy by chance.