Corporate fraud / Benford's law
October 13, 2011 9:07 PM
One way to measure corporate fraud is to look at reported numbers and see if they follow Benford's law - number sets that are manipulated usually deviate from Benford's law. A recent analysis of all public companies over the past 50 years has shown a steady upward deviation, strongly suggesting there is more corporate fraud now than ever before (peaked in 2008).
Bernie Madoff seems to have been smart/lucky enough to avoid getting trapped by Benford's law.
Exploring Benford's law. (YT)
Previously.
wow the internet is getting very small.... i just finished reading this article which was linked from zerohedge...
posted by dawdle at 9:21 PM on October 13, 2011
I wonder if there are a few publicly traded companies whose CFOs' minions will be working all weekend analyzing their disclosed numbers before Monday. Author hinted that she was going to expose a couple of specific companies. Would hate to be one of them.
Could her disclosure result in market movement?
posted by yesster at 9:23 PM on October 13, 2011 [1 favorite]
Wow. I can't believe that worked.
Replicating Benford's law has always seemed like fraud 101, as unreasonable deviations are such a huge red flag. Not only are the accountants lying, they're doing it badly.
posted by Orange Pamplemousse at 9:25 PM on October 13, 2011 [2 favorites]
Orange Pamplemousse: "Replicating Benford's law has always seemed like fraud 101, as unreasonable deviations are such a huge red flag. Not only are the accountants lying, they're doing it badly."
In their defense, fraud 101 wasn't mandatory at university.
posted by pwnguin at 9:33 PM on October 13, 2011 [3 favorites]
In their defense, fraud 101 wasn't mandatory at university.
MBA 601.
posted by ryoshu at 9:38 PM on October 13, 2011 [15 favorites]
Benford's law applies to numbers generated by exponential growth. The reason small numbers are more frequent as a first digit is that growth is faster when the principal is larger. Right after breaking a "digit barrier" growth is slower than right before breaking one, so you don't hang out at the 9xxx area very long at all.
Corporate profits and expenses don't follow exponential growth. Yes, there are exponential factors (economy of scale is the major one) but for the most part the higher the numbers grow the HARDER it is for them to grow further (saturated markets) so it's the exact opposite of the kind of processes that track with Benford's law.
So . . . I dispute the validity of this method, and have FINALLY gotten some actual use out of that graduate degree in mathematical studies.
posted by oblio_one at 9:45 PM on October 13, 2011 [7 favorites]
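The exponential-growth intuition in the comment above can be made precise. As a sketch, assume a quantity grows as x(t) = x_0 e^{rt} (the symbols x_0 and r are introduced here for illustration and are not from the thread). The time needed to grow from a to b is ln(b/a)/r, so within any one decade the share of time spent with leading digit d is the time to get from d to d+1 divided by the time to get from 1 to 10:

```latex
\[
\frac{\ln\!\big((d+1)/d\big)/r}{\ln(10)/r}
  \;=\; \log_{10}\!\left(1 + \frac{1}{d}\right),
  \qquad d = 1, \dots, 9
\]
```

The growth rate r cancels, which gives roughly 30.1% for d = 1 and 4.6% for d = 9. This covers only the steady exponential case; it says nothing by itself about whether corporate figures behave that way.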
IIRC, this is how Nate Silver caught the fraudulent pollster that was scamming Daily Kos with fabricated commissioned poll results. It's a fascinating principle, and very difficult to fake convincingly.
posted by Rhaomi at 9:55 PM on October 13, 2011
"In their defense, fraud 101 wasn't mandatory at university."
Any sufficiently advanced business strategy is indistinguishable from fraud
posted by Blasdelb at 10:06 PM on October 13, 2011 [38 favorites]
Can someone explain, in plainspeak, the first two graphs from this link?
posted by vidur at 10:10 PM on October 13, 2011
Benford's Law is quite interesting. I just checked the Metafilter user numbers of everyone who commented in this thread before I did. Of the 10 comments before this one, exactly 3 start with the digit 1... so there's your 30%.
I interpret this to mean that all of the comments above were made by real people, which is comforting.
posted by twoleftfeet at 10:16 PM on October 13, 2011 [2 favorites]
vidur: Can someone explain, in plainspeak, the first two graphs from this link?
For the first graph, I think he's taking the set of all reported total assets for all companies and then for each digit 1 through 9, plotting the odds that a given company has assets starting with that digit. For example, in the first graph the first bar on the left is at about .3, which means that about 30% of all companies report asset amounts starting with a 1. The second bar is at .18 or so which means that about 18% start with 2, and so on for all digits.
The second graph is the same, only for revenue.
posted by Wemmick at 10:20 PM on October 13, 2011
vidur - the author is just illustrating how data he or she pulled from publicly available financial data from over 20,000 corporations follow Benford's Law. (The data would be public because publicly listed companies are obliged to report this stuff, at the very least in their annual reports, and probably more often than that)
The graph marked REVTQ is for total revenue data (eg for the financial year); ATQ is asset data (eg total assets at end of FY).
I'd guess that the author wanted to examine revenue & assets separately, as they're different kinds of things & might behave differently.
posted by UbuRoivas at 10:20 PM on October 13, 2011
Can someone explain, in plainspeak, the first two graphs from this link?
Two answers already, but I'll post another, in case you want it broken down even further.
She was testing how well Benford's Law fit the data. The values predicted by Benford's Law are given by the red dots, the data values are given by the height of the bars (one for each digit 1,...,9). The fact that the dots lie almost exactly at the tops of the bars shows that Benford's Law predicts these values extremely well. (Compare these graphs to the graph in the Example section of the Wikipedia article, where the match is less good, but still ok.)
posted by -jf- at 10:26 PM on October 13, 2011
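The bars-versus-dots comparison described above can be sketched in a few lines of Python. This is not the author's code or data: the function names are made up for illustration, and the sample is a stand-in (the first 1,000 powers of 2, a sequence well known to track Benford's law closely) rather than any corporate filing.

```python
import math
from collections import Counter

def leading_digit(x):
    """First significant digit of a nonzero number, read off scientific notation."""
    return int(f"{abs(x):.15e}"[0])

def benford_expected(d):
    """Share of values Benford's law predicts for leading digit d (the red dots)."""
    return math.log10(1 + 1 / d)

def first_digit_shares(values):
    """Observed share of values for each leading digit 1..9 (the bars)."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

# Stand-in sample, NOT corporate data: powers of 2.
sample = [2 ** n for n in range(1, 1001)]
observed = first_digit_shares(sample)

for d in range(1, 10):
    print(d, round(observed[d], 3), round(benford_expected(d), 3))
```

Each printed row is one bar next to its dot: the observed share for that digit, then the predicted share.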
Corporate profits and expenses don't follow exponential growth ... So ... I dispute the validity of this method
Have you seen the link under "recent"? Look at the first two graphs. They're pretty convincing.
Benford's law applies to numbers generated by exponential growth.
Have you seen the link under "Benford's Law"? Yes, Benford's Law applies when there is exponential growth, but not only when there is exponential growth. For instance, it also applies when there is scale invariance. See the Scale Invariance subsection in that link.
posted by -jf- at 10:36 PM on October 13, 2011 [3 favorites]
Okay, so the Revenue and Asset numbers fit the predictions of Benford's Law. So, what's the chart with the deviation about?
The article says, "today the empirical distribution of each digit is about 3 percentage points off from what Benford's law would predict". That really doesn't sound like much. Am I missing something?
Sorry if this sounds rather dumb. I normally get this stuff. Maybe I should go get some coffee.
posted by vidur at 10:36 PM on October 13, 2011
If you're cooking the books on a large corporate level, it'd probably be pretty trivial to use a script to generate numbers that fit this law.
Absolutely, but why bother with even this when you know the SEC isn't going to do anything - they haven't done anything in a decade except jail Martha Stewart.
This game is rigged. And unless you cheat, like Stewart or Madoff, you cannot win. You're better off in a casino where at least you know the house has a numerical advantage.
posted by three blind mice at 10:52 PM on October 13, 2011 [2 favorites]
Okay, so the Revenue and Asset numbers fit the predictions of Benford's Law. So, what's the chart with the deviation about?
The elephant in the room for me is that only revenue & asset numbers were shown to fit the law. And yet the author "used a standard set of 43 variables that comprise the basic components of corporate balance sheets and income statements (revenues, expenses, assets, liabilities, etc.)"
What about the other 41 variables? Did they inconveniently disobey the law?
posted by UbuRoivas at 11:00 PM on October 13, 2011
There's no way for an individual company to generate numbers to fit Benford's law in this case, because the analysis is looking across the data for over 20,000 firms. The analysis only suggests that fraud occurred, but can't point to any single company.
posted by Wemmick at 11:03 PM on October 13, 2011
Benford's law applies to numbers generated by exponential growth....Corporate profits and expenses don't follow exponential growth.
Let's use a simple model. Let's say I have $100000 and get a 2% return on investment per quarter. Using the awesome power of the left command and compounding quarterly I get something like 36 x 1's, 20 x 2's, 15 x 3's...6x9's for 30 years of 2% ROI. The ones seem a little low, though.
posted by Kid Charlemagne at 11:08 PM on October 13, 2011
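The simple model above is easy to rerun. The sketch below assumes the series starts at the initial $100,000 balance and records one balance per quarter for 120 quarters; those choices, and the variable names, are assumptions, since the comment does not say exactly which rows were counted.

```python
from collections import Counter

def leading_digit(x):
    """First significant digit of a nonzero number, read off scientific notation."""
    return int(f"{abs(x):.15e}"[0])

principal = 100_000        # starting balance from the comment
quarterly_return = 0.02    # 2% per quarter, compounded
quarters = 30 * 4          # 30 years of quarterly balances (assumed to start at quarter 0)

balances = [principal * (1 + quarterly_return) ** q for q in range(quarters)]
counts = Counter(leading_digit(b) for b in balances)

for d in range(1, 10):
    print(d, counts[d])
```

Run as written, the tallies for 2, 3 and 9 match the 20, 15 and 6 quoted above, and the tally for 1 picks up a few extra quarters at the end, after the balance passes $1,000,000.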
If you guys like, I can go knock on her office door and explain that I'm from the internet and we'd like to talk to her. She's only like a mile away and I'm sure there's nothing like a big hairy guy from the internet to make a positive impression.
posted by Kid Charlemagne at 11:11 PM on October 13, 2011 [3 favorites]
Any sufficiently advanced business strategy is indistinguishable from fraud
Any fraud distinguishable from advanced business strategy is insufficiently advanced.
posted by Clave at 11:43 PM on October 13, 2011 [5 favorites]
Benford's law is very easy to understand without explaining logarithms or using somewhat complex mathematics.
Let's say you start a new business worth $1,000. You double the size of your business every year for four years at a very predictable rate. You spend the first year between $1,000 and $1,999. The second year's valuation is between $2,000 and $3,999. The third year's valuation is between $4,000 and $7,999. The fourth year's valuation is between $8,000 and $16,000.
A substantial portion of this example has the front digit of 1 (all of the first year and a large portion of the 4th year). Why? If doubling up from 1 to 2 is just as difficult as doubling up from 4 to 8, going from 1 to 2 will have 1 dominate the front digit in terms of time, whereas going from 4 to 8 will have the front digit 4 be substantially more brief (at a steady doubling rate it will go from 4 to 5 as the front digit after only about a third of that year has elapsed).
To make Benford's Law work, you have to be very careful about systemic or environmental biases, though. For instance, it doesn't work on height, but it'd probably work on counting populations, wealth, or the number of stars in a galaxy. Also, I'd be surprised if the IRS wasn't applying Benford's Law on tax returns already.
posted by amuseDetachment at 12:26 AM on October 14, 2011 [2 favorites]
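To put numbers on the doubling example, here is a minimal sketch that assumes the business grows smoothly, as value(t) = 1000 * 2**t for t between 0 and 4 years. The smooth-growth assumption and the function name are illustrative and not from the comment. Under that assumption, the time spent between two values lo and hi is log2(hi / lo) years.

```python
import math

START, END = 1_000, 16_000   # four doublings, as in the example

def years_with_leading_digit(d):
    """Years (out of 4) that the valuation starts with digit d."""
    total = 0.0
    for k in (3, 4):                      # thousands, then ten-thousands
        lo = max(d * 10 ** k, START)
        hi = min((d + 1) * 10 ** k, END)
        if hi > lo:
            total += math.log2(hi / lo)   # time to grow from lo to hi
    return total

for d in range(1, 10):
    print(d, round(years_with_leading_digit(d), 2))
```

Digit 1 covers the whole first year plus the stretch from $10,000 to $16,000, while digit 4 lasts only about a third of a year.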
Tax returns... Huh. My tax guy and I were estimating some of my deductions last year; things like miles driven to work and stuff where I just don't have concrete records. My estimates would be nice round numbers but he would take care to 'add noise' as it were, purportedly to not raise suspicions that we had completely made these numbers up. I wonder if he knows about this?
posted by garethspor at 1:13 AM on October 14, 2011
Isn't there some circularity problem with that scale invariance argument, jf? Isn't the real point more that lognormal distributions with shape parameter greater than one satisfy Benford's Law?
In particular, there is some tendency for 'fancy' financial fraud to correctly emulate Benford's Law without even trying, simply because lognormal distributions are ubiquitous in finance.
posted by jeffburdges at 1:23 AM on October 14, 2011
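The lognormal point can be checked empirically. The sketch below draws samples with Python's standard `random.lognormvariate` and reports the largest gap between observed and Benford first-digit shares for a few shape parameters. The sample size, the seed and the helper names are arbitrary choices made here, and a simulation like this only suggests the trend rather than proving it.

```python
import math
import random
from collections import Counter

def leading_digit(x):
    """First significant digit of a nonzero number, read off scientific notation."""
    return int(f"{abs(x):.15e}"[0])

def max_benford_gap(sigma, n=50_000, seed=0):
    """Largest |observed share - Benford share| over digits 1..9 for lognormal draws."""
    rng = random.Random(seed)
    counts = Counter(leading_digit(rng.lognormvariate(0, sigma)) for _ in range(n))
    return max(abs(counts[d] / n - math.log10(1 + 1 / d)) for d in range(1, 10))

for sigma in (0.1, 0.5, 1.0, 2.0):
    print(sigma, round(max_benford_gap(sigma), 3))
```

A narrow distribution (small sigma) piles up on one or two leading digits and misses Benford badly; a wide one spans several orders of magnitude and lands much closer.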
Any sufficiently advanced business strategy is indistinguishable from fraud
Ladies and gentlemen, Blasdelb's Variant of Clarke's Third Law.
posted by BitterOldPunk at 1:47 AM on October 14, 2011 [3 favorites]
The first rule of the free market is that you believe in the free market and its invisible benevolent spanking hand!
posted by elpapacito at 4:06 AM on October 14, 2011
There's an even less sophisticated, more intuitive explanation of Benford's Law:
Pick a number, any number. Then look at the numbers nearby. As the range gets wider, on the high side the first digit starts to go up. (If your number was 245 you get to the 300s then 400s then 500s; if your number was 6,431,010 you get to the 7 millions, the 8 millions, etc.).
Once you start getting 8's and 9's as a first digit, you only have to get a little bit larger before the first digit rolls over and you have a whole lot of numbers starting with 1. Then you have a long time before you hit any numbers starting with 2. (For instance, once you get past 700, 800, and 900 you hit 1000 and all the numbers start with 1 and you have to get past 1700, 1800, and 1900 before you hit 2000). And you have to go past a similar distance to get to numbers that start with 3. And if you get all the way up to numbers starting with 8 and 9, you're close to rolling over again with an even huger set of numbers that all start with 1.
So for any given number, as the range of nearby numbers gets wider, the numbers that are larger will be more likely to start with lower digits like 1, 2, and 3.
This is basically just a quirk of the base-10 numbering system and has nothing to do with the actual values being connected to each other in any way.
posted by straight at 9:32 AM on October 14, 2011
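The counting argument above can be tried directly: take every integer from 1 up to some upper limit and see what share starts with 1. The limits below are picked for illustration, and this is only the counting intuition from the comment, not a derivation of the full Benford distribution.

```python
def share_starting_with_one(n):
    """Share of the integers 1..n whose first digit is 1."""
    ones = sum(1 for i in range(1, n + 1) if str(i)[0] == "1")
    return ones / n

for n in (99, 199, 999, 1999, 9999, 19999):
    print(n, round(share_starting_with_one(n), 3))
```

The share swings between roughly one ninth (just before a rollover) and more than half (just after a run of 1s), so it never falls below an even split across the nine digits and is often well above it.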
For instance, it doesn't work on height, but it'd probably work on counting populations, wealth, or the amount of stars in a galaxy.
It doesn't work on height because there isn't a wide enough variation in heights. Basically it only works on sets of numbers where there is at least one order of magnitude difference in the possible values.
posted by straight at 9:36 AM on October 14, 2011 [2 favorites]
oblio_one: "Benford's law applies to numbers generated by exponential growth"
... as well as many other distributions.
Your suspicions are based on a poor understanding of Benford's Law. The article points out a truly significant finding.
posted by IAmBroom at 9:38 AM on October 14, 2011 [1 favorite]
Tax returns... Huh. My tax guy and I were estimating some of my deductions last year; things like miles driven to work and stuff where I just don't have concrete records. My estimates would be nice round numbers but he would take care to 'add noise' as it were, purportedly to not raise suspicions that we had completely made these numbers up. I wonder if he knows about this?
From my friends who cheat on their taxes, their accountants are familiar with the amounts that raise red flags and stay right below them. I really doubt the IRS is using this method, but they probably should.
posted by mrgrimm at 10:09 AM on October 14, 2011
MrGrimm, this law goes around every few years. I couldn't cite the source, but I remember reading many years ago, perhaps over 10, that the IRS is well aware of this law and does indeed use it... They'd be foolish not to. As far as proof goes, from what I can gather, the IRS doesn't like to talk about the methods they use to catch cheaters, for obvious reasons.
posted by PigAlien at 1:02 PM on October 14, 2011
This is basically just a quirk of the base-10 numbering system and has nothing to do with the actual values being connected to each other in any way.
This is misleading, as it suggests that the property described by Benford's Law is not an attribute of the data but instead of the numbering system itself. It's absolutely nothing to do with something special about base 10, there are equivalent distributions in any base.
posted by axiom at 2:12 PM on October 14, 2011
That's right, axiom. What I meant was that it's a property of using an Arabic numerals type system where you have repeating symbols in the "ones" place, the "tens" place, the "hundreds" place, etc. You'd have the same phenomenon in Base 12.
My point was that it's a quirk of how numbers are written down; it has nothing to do with what the numbers represent--compounding interest, atomic weights, box office sales, lengths of rivers or whatever (except that you only see the effect when the numbers vary by at least an order of magnitude).
posted by straight at 2:46 PM on October 14, 2011 [1 favorite]
straight: "You'd have the same phenomenon in Base 12."
I just converted a dataset to base 2, and they all start with a 1!!
posted by pwnguin at 3:13 PM on October 14, 2011 [1 favorite]
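The base discussion above fits in one formula: in base b, the usual generalisation of Benford's law gives leading digit d a share of log_b(1 + 1/d). A short sketch follows; the function name and the bases tried are chosen here for illustration.

```python
import math

def benford_probability(d, base=10):
    """Benford share for leading digit d when numbers are written in the given base."""
    if not 1 <= d < base:
        raise ValueError("leading digit must satisfy 1 <= d < base")
    return math.log(1 + 1 / d, base)

for base in (2, 10, 12):
    shares = [round(benford_probability(d, base), 3) for d in range(1, base)]
    print(base, shares)
```

In base 2 the only possible leading digit is 1, so its share is log_2(2) = 1, which agrees with the converted dataset in the joke.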
Isn't there some circularity problem with that scale invariance argument, jf?
There might be. Rereading that subsection, it does seem a little off, and at least a little opaque. I see that your link cites a couple criticisms of the scale invariance argument.
Either way, though, my larger point stands. You don't need exponential growth for Benford's Law.
posted by -jf- at 4:11 PM on October 14, 2011
Wait. Wouldn't the sum of squared deviations from Benford's Law predictions simply increase with the number of data points being used? I.e. this measure would increase as the number of firms increases?
posted by parudox at 6:03 PM on October 14, 2011
A follow up post from the author:
After digging into the Benford's law results from my previous post a bit more, I discovered that a different effect is driving the time-series pattern than I first thought. The reason is that Benford's law only applies to nonzero digits, while accounting data contain many zero values. In my original results, I also failed to account for negative numbers. Zeros and negatives added to the total number of observations but not to the counts for any digit from 1-9, leading to a mechanical "deviation" from Benford's law.
Quite a bit more inside to chew on.
posted by pwnguin at 6:59 PM on October 24, 2011 [2 favorites]
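The mechanical effect described in the follow-up can be reproduced with a toy example. Everything here is hypothetical: the figures are invented integers, the function names are made up, and treating a negative number by its absolute value is an assumption about a reasonable fix, not a description of the author's revised method.

```python
from collections import Counter

def shares_naive(values):
    """Digit shares with zeros and negatives left in the denominator (the mistake)."""
    counts = Counter(str(v)[0] for v in values)       # "0" and "-" are tallied but never reported
    return {d: counts[str(d)] / len(values) for d in range(1, 10)}

def shares_fixed(values):
    """Digit shares after dropping zeros and using absolute values (assumed fix)."""
    nonzero = [abs(v) for v in values if v != 0]
    counts = Counter(str(v)[0] for v in nonzero)
    return {d: counts[str(d)] / len(nonzero) for d in range(1, 10)}

# Invented integer figures, not real accounting data.
figures = [1200, 0, -340, 1875, 0, 96, -15, 2210, 130, 0]

print(sum(shares_naive(figures).values()))   # falls short of 1
print(sum(shares_fixed(figures).values()))   # sums to 1 (up to rounding)
```

When the naive shares do not sum to 1, every digit looks under-represented relative to Benford's law even if nothing is wrong with the underlying numbers.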
I seem to have totally screwed up that link. Benford's law: a revised analysis. Of course, it's easy to argue the chart doesn't show a decline in fraud, but rather an increase in fraud sophistication!
posted by pwnguin at 7:14 AM on October 26, 2011
This thread has been archived and is closed to new comments
Granted, this assumes investors and courts care enough about this type of evidence for deviation to have consequences. I get the impression this just causes minor suspicion at most.
posted by mccarty.tim at 9:18 PM on October 13, 2011