0:03

Hello, I'm Professor Brian Bushee, welcome back.

In this video, we're going to let the data speak, literally let the data speak.

We're gonna throw out any notion of trying to model discretionary earnings or

unusual trends and ratios and instead just look at the distribution of the numbers.

There's something called Benford's Law, which reports how frequently certain

numbers appear in naturally generated data sets.

But that means that in artificially generated data sets, for example,

financial statements that have been manipulated through earnings management,

the numbers won't appear in the same frequency as Benford's Law.

And so we can look for

deviations from Benford's Law to find financial statements that look fishy.

It's pretty off-the-wall stuff but it's gaining prominence

as a fraud detection tool, so let's take a look at how it works.

0:57

Okay, for our last model we're gonna throw away all notions of trying to model any

discretionary earnings or fraud prediction ratios.

And instead we're just gonna let the data speak, to tell us whether it's true or

whether it's been manipulated.

And we're gonna do this with something called Benford's Law.

1:14

So, back in 1881, Simon Newcomb determined that the probability that

a number has a first digit, d, is given by this equation here.

Now, apparently he discovered this because back then you had to use

books of logarithms to do complex calculations.

And he noticed that the pages in the ones were much more worn than the pages in

the nines.

So, he got the idea that ones must appear more often than nines as leading digits,

and what he found is that they seem to follow this distribution.
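
The equation referred to here is shown on screen rather than read aloud. The standard statement of Benford's Law, which is presumably what the slide shows, gives the probability of each leading digit d as:

```latex
P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d \in \{1, 2, \ldots, 9\}
```

For example, P(1) = log10(2), which is roughly 0.301, and P(9) = log10(10/9), which is roughly 0.046.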

1:45

In 1938, Frank Benford found a large number of

naturally occurring data sets follow this pattern.

So, you find this in the surface area of rivers, molecular weights, death rates,

street addresses and the numbers contained in an issue of Reader's Digest.

2:01

So, as so often happens, the person that discovers something

doesn't get the credit, it's the person that popularizes it.

So this became known as Benford's Law instead of Newcomb's Law.

And since then it's been used to find irregularities in published scientific

studies, fraudulent election data in Iran, suspicious macroeconomic data from Greece,

and tax return misreporting.

So in each of these cases the distribution of the numbers in what you're looking at

doesn't meet this equation first discovered by Simon Newcomb and

popularized by Frank Benford.

And we're gonna use this distribution to look for

irregularities in financial statements.

2:54

>> So, Benford's Law is what's called a phenomenological law, which means that

it describes a phenomenon that occurs in nature, but we have no idea why it occurs.

In fact, I found a quote where a leading scientist had said

it continues to defy attempts at an easy derivation.

So, if the scientists don't know a good explanation,

then I certainly can't come up with one.

But it seems to work in a lot of unconstrained,

naturally occurring data sets.

And just because we don't understand why it happens

doesn't mean that we can't use it to our benefit.

So, let's go on.

3:31

Here's what the expected distribution of leading digits is under Benford's Law.

So you'd expect to see a 1 as the first digit about 30% of the time.

Whereas a 9 would be a first digit only about 5% of the time.

Notice that 0 doesn't appear here, so if you're working with a decimal number like

0.05, we would consider 5 to be the leading digit.
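
A minimal sketch of how the expected distribution could be tabulated, assuming the standard log10(1 + 1/d) form of Benford's Law. The function name `benford_expected` is made up for illustration and does not come from the lecture.

```python
import math

def benford_expected(d: int) -> float:
    """Expected share of numbers whose first non-zero digit is d (1 through 9)."""
    return math.log10(1 + 1 / d)

# Expected frequency for each possible leading digit; zero is never a leading digit.
expected = {d: benford_expected(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"{d}: {p:.1%}")
```

The nine frequencies sum to 1, with digit 1 at roughly 30% and digit 9 at just under 5%.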

There's another example to show how this can detect fraud.

Here's the distribution of leading digits from 215 months of returns for

the Fairfield Sentry Fund which was a fund that invested only with Bernie Madoff.

And if you haven't heard of Bernie Madoff maybe you should Google him.

It's sort of an interesting read.

But he ran this thing called a Ponzi scheme,

which was a fraudulent investment fund where he basically would pay off investors

from new money that he would attract.

So, as long as he would attract new money, he could pay back to investors.

But if the new money ever stopped, then the Ponzi scheme would collapse.

4:29

Anyway, if you look at the distribution of the first digit of the returns,

you find that there were way too many 1's in the first digits, not enough 9's.

And so the more you see a discrepancy from Benford's Law,

the more likely it is you're looking at numbers that have been made up,

as opposed to numbers that represent real returns earned by a real fund.

5:07

>> That raises an excellent point, that as Benford's Law becomes a more popular

fraud detection technique, eventually the fraudsters will catch on.

And they will make sure to commit their frauds in a way where the financial

statement numbers conform to the Benford's Law distribution.

This happens with any kind of fraud detection tool

where the more that the fraudsters know about it, the more they can try

to make their numbers meet the expected level of the model.

But nobody's found a way to manipulate all of the tools that we've looked at in this

course, which means that even though Benford's Law may work in a lot of cases,

you don't wanna rely on it exclusively.

Because the manager can't manipulate their numbers to make everything look good,

they're gonna trip up somewhere, and

that's why you gotta look at multiple tools.

5:56

The approach that we're gonna use to apply Benford's Law to financial statements was

laid out in a paper by Dan Amiram, Zahn Bozanic, and Ethan Rouen in 2014.

What they did was they aggregated all the financial statement data they could

get by industries, by years, they looked from 2000 to 2011.

And they found that the leading digits in financial statements

tend to follow Benford's distribution.

In fact, 83% of firms' financial statements conform with this distribution.

6:24

So when it doesn't conform, when there are larger discrepancies between

the distribution of numbers in the financial statement and

the Benford distribution, they also found larger Modified Jones Model discretionary

accruals and higher Beneish M-Scores.

So the discrepancies did seem to line up with other tools that we've looked at to

detect earnings management.

6:45

So here's how we're gonna calculate the discrepancy from Benford's Law.

First, we need a large number of financial statement items.

There's no theory on how many are needed but more is better.

So one good way to do this is to basically take all the numbers in the balance sheet,

income statement, cash flow statement and run this test on them.

It doesn't matter what the numbers are, it just matters that we get these numbers.

7:10

Then we need to count the number of each leading digit.

So, we can use an Excel formula where we take the leftmost digit.

Now, you notice I've got an absolute value here.

That's because with negative numbers it would grab the minus sign so

we need to make it positive.

You also need to multiply it by 1,000 or some large

power of 10 before taking the digits, so that will help you with the decimals.

Cuz we don't want zero as the leading digit, we want the first non-zero number.
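
The Excel formula itself is on the slide and not in the transcript, so here is a rough Python equivalent of the same idea: drop the sign, then take the first non-zero digit. The helper name `leading_digit` is hypothetical, and scanning the text form of the number is just one way to skip leading zeros without scaling the value.

```python
def leading_digit(value):
    """Return the first non-zero digit of a number, or None if the number is zero."""
    # abs() drops the minus sign, mirroring the absolute value in the spreadsheet formula.
    for ch in str(abs(value)):
        # Skip zeros and the decimal point until the first digit from 1 to 9 appears.
        if ch in "123456789":
            return int(ch)
    return None

print(leading_digit(129.30))  # leading digit of a typical line item
print(leading_digit(-0.05))   # negative decimal: sign and leading zeros are ignored
```

Zero items return None so they can be left out of the count rather than treated as a leading digit.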

7:54

And we'll test whether it seems to be different,

using something called the Kolmogorov-Smirnov statistic,

not sure how to pronounce that, which is this sort of ugly formula here.

In this formula, AD is the actual frequency of the leading digit, so AD1 is

the actual frequency when 1 is the first number, AD2 is when 2 is the first number.

ED is the theoretical frequency or what's expected under the Benford distribution.

And basically, what this statistic is doing is it's looking for

the biggest point or the biggest cumulative difference in the distribution.

8:31

Then to see if that's statistically significant, if it's consistent with

earnings management, you wanna see if this statistic is greater than 1.36 divided

by the square root of P, where P is the total number of leading digits used.

So, this is something where the cutoff we're gonna use

is a function of the number of leading digits.
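
As a sketch of the statistic just described, assuming it is the largest absolute gap between the cumulative actual and cumulative expected frequencies across digits 1 through 9, the calculation might look like this in Python. The function name `benford_ks` is an illustrative choice, not something from the lecture or the paper.

```python
import math
from collections import Counter

def benford_ks(digits):
    """Return (KS statistic, cutoff) for a list of leading digits in 1..9."""
    p = len(digits)              # P: total number of leading digits used
    counts = Counter(digits)
    cum_actual = 0.0             # running sum of AD1, AD2, ...
    cum_expected = 0.0           # running sum of ED1, ED2, ...
    ks = 0.0
    for d in range(1, 10):
        cum_actual += counts[d] / p
        cum_expected += math.log10(1 + 1 / d)
        # Keep the biggest cumulative difference seen so far.
        ks = max(ks, abs(cum_actual - cum_expected))
    return ks, 1.36 / math.sqrt(p)

# Extreme toy input: every number starts with 1, which is far from Benford.
ks, cutoff = benford_ks([1] * 100)
print(f"KS = {ks:.3f}, cutoff = {cutoff:.3f}, suspicious = {ks > cutoff}")
```

Because the cutoff shrinks as P grows, the same deviation is more likely to be flagged when more numbers are available.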

9:29

Then what I need to do is use the formula to find the leading digit for each number.

So, for 129.30, the leading digit's 1, for

315.46 the leading digit is 3.

One that I want to show you is down here.

There's a number that's 0.51, we can't have 0 as our leading digit, so

we wanna pull the 5 as the leading digit.

So we got all the leading digits from all these numbers and

then we want to count everything up.

So, I've got a formula where I count up the number of times 1 is the leading

digit, the number of times 2 is the leading digit, then we add that up.

So, there were 122 leading digits.

We can divide each count by the total to see that in this example,

1's were the leading digit 31% of the time, 2's 13.9% of the time and so on.

Then I've got the expected distribution based on Benford's Law.

And what I calculate is what's called the cumulative difference.

So, for number one it's simple, it's just the difference between the actual and

expected distribution for one.

For two, the cumulative differences, I have to add up the distribution for

one and two, and compare that to the expected distribution for one and two, and

I find the difference is 2.6%.

For three, I add up 31%, 13.9%, 20.5%, and compare that to 30.1, 17.6, 12.5 and

they're off by about 5%, and so forth and so on.

So, the KS statistic is the maximum difference, which would be 5.4%,

the cutoff is this formula of 1.36 divided by the square root of 122,

which is the number of items.

We can see that the KS statistic is way below the cutoff.

So, we have no concerns in this case that there's

manipulation based on a deviation from Benford's law.

It conforms to Benford's law pretty closely.
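
As a quick check on the arithmetic in this example, using only the figures quoted above (122 leading digits and a KS statistic of 5.4%), the comparison works out roughly as:

```latex
\text{cutoff} = \frac{1.36}{\sqrt{P}} = \frac{1.36}{\sqrt{122}} \approx \frac{1.36}{11.05} \approx 0.123
\qquad\text{and}\qquad
KS = 0.054 < 0.123
```

So the statistic is less than half of the cutoff, which is why there's no concern here.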

11:31

>> Excellent point.

This is definitely a technique that works much

better the more numbers that you have.

It doesn't work as well at detecting fraud

if you only have a very small number of numbers to take a look at.

So, that's why using three years of numbers,

which is sort of the maximum amount of numbers you can get in a financial

statement, is gonna be more powerful.

Also, remember we've seen examples where if you manipulated one year,

it also affects the other year?

So, it's easier to pick up frauds if you find these deviations across

both years as opposed to one unusual year.

And definitely if you dig through this example,

it's a case where you wouldn't clearly find fraud looking at individual years.

But looking at all three years together, it does seem like these were

artificially generated numbers consistent with earnings management.

Now, we can do the same thing for Beagle Bagel.

So, we put in all the financial statement numbers we can find for

the last three years, calculate all the leading digits,

then count up the number of each of the leading digits to get

the actual distribution compared to the expected distribution.

And we end up with a KS statistic of 16.2%,

that's based on the cumulative difference between the number of ones, twos, and

threes, actual versus what Benford's Law would say.

That 16.2% is greater than our cutoff of 12.3%.

So, in this case we would be suspicious that there was manipulation going on

with Beagle Bagel because their financial

statements don't meet this Benford distribution.

And we looked at one of their close competitors where their financial

statement does meet it.

So we can't attribute this to some kind of industry effect; instead, it

looks like they may have done some things to manipulate their financial statements.

13:36

>> You're welcome, I've been happy to do this.

And I will ignore the sarcasm in saying 359 tools.

So which ones should you use?

Well, you should use all of them.

[LAUGH] So one thing that I hope has definitely come across in these videos,

is that there is no one tool that will pick up all forms of earnings management.

If there was one tool that always worked, I certainly wouldn't tell you about it.

I would keep it to myself and get rich so

I wouldn't have to sit here making these videos.

14:04

Also, as we talked about earlier, if there was one tool that

always seemed to work, the fraudsters would find out about it and

do their manipulation in a way that the tool wouldn't pick it up.

So that's why you need to look at a big range of tools to find these kinds of

problems.

The big data approaches of this week are good starting points.

So if you've got a large number of companies to look at,

you run the Benford's test or the fraud prediction test,

figure out which five or six look suspicious, and

then dive in deeper to look at the other tools to see if there's a problem.

Because one thing I'm pretty sure about is that the more tools that suggest

there's manipulation,

the more likely it is that you've found a company that's manipulated their earnings.
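
The screening step described here could be sketched as follows: run the Benford test on every company and keep only the ones whose KS statistic exceeds its cutoff. Everything in this example is an assumption for illustration. The firm names and numbers are synthetic, not real filings, and the helpers `leading_digit` and `ks_excess` are hypothetical names.

```python
import math
from collections import Counter

def leading_digit(value):
    """First non-zero digit of a number, or None if the number is zero."""
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    return None

def ks_excess(values):
    """KS statistic minus its 1.36/sqrt(P) cutoff; positive means 'look closer'."""
    digits = [d for d in map(leading_digit, values) if d is not None]
    counts = Counter(digits)
    cum_actual = cum_expected = ks = 0.0
    for d in range(1, 10):
        cum_actual += counts[d] / len(digits)
        cum_expected += math.log10(1 + 1 / d)
        ks = max(ks, abs(cum_actual - cum_expected))
    return ks - 1.36 / math.sqrt(len(digits))

# Hypothetical firms with synthetic numbers, used only to exercise the screen.
firms = {
    "Powers Co": [2 ** k for k in range(1, 101)],  # powers of 2 track Benford closely
    "Uniform Co": list(range(100, 1000, 9)),       # first digits spread roughly evenly
}

# Keep the firms that fail the test; these are the ones to dig into with other tools.
flagged = sorted(name for name, vals in firms.items() if ks_excess(vals) > 0)
print(flagged)
```

A flag from this screen is only a starting point; the firms it picks out would still need the closer look with the other tools.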

14:48

That wraps up our look at Benford's Law and it also wraps up our week

on big data approaches to try to detect earnings management or fraud.

I hope some of these tools come in handy for you in the future and

help you identify some firms that may have fishy financial statements,

so that you can stay as far away from them as possible.

And I really hope that you enjoyed all this material and

I want to thank you for watching.

>> See you next video.