Wednesday 19 January 2011

Forensic Finance, Benford's Way

Misnamed and Mad

One of the more curious statistical anomalies of the universe has turned out to have a range of practical applications in the world of finance. It also turns out to be one of the stranger laws underpinning the stockmarket, although figuring out why is a painful process.

Above all, Benford’s Law can be used to spot financial fraudsters because of a psychological quirk that means we humans are dead useless at pretending to be random. All of this comes from a law that was discovered by one man, explained by another and named after a third. Now that’s random.

Naturally Bizarre

Take any naturally occurring set of numbers and keep a count of the first digit of each number. A newspaper is an ideal test bed. Most people reckon that each digit from 1 to 9 will occur with equal frequency: after all, why would they not? Yet they don’t: the digit 1 appears almost a third of the time and each subsequent digit is less frequent than the one before. And this, frankly, is bizarre at first blush.
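For anyone who wants to see the numbers, the usual statement of Benford’s Law is that a first digit d turns up with probability log10(1 + 1/d). The post doesn’t spell out the formula, so treat the following as a minimal sketch of the standard textbook form rather than anything taken from Newcomb or Benford directly; the function name is just an illustrative choice.

```python
import math


def benford_first_digit_probability(d: int) -> float:
    """Expected share of numbers whose first significant digit is d (1 to 9)."""
    if not 1 <= d <= 9:
        raise ValueError("first significant digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Expected share for each possible first digit.
expected = {d: benford_first_digit_probability(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"first digit {d}: {p:.1%}")
```

The shares fall steadily from roughly 30% for a leading 1 to under 5% for a leading 9, and they add up to one.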

This odd feature of the world was first noted by the polymath Simon Newcomb back in the nineteenth century. ‘Polymath’, of course, is an old-fashioned way of saying ‘smart ass’. Anyway, Newcomb noticed that books of logarithm tables were more worn at the front – where the numbers started with a 1 – than at the back. For those youngsters at the back there, log tables were what we used before someone invented the calculator. Logarithms were devised by John Napier back in the seventeenth century and served as the bane of schoolchildren for the best part of four centuries.

Newcomb figured out the curious nature of Benford’s Law, published his observations and then was promptly forgotten about. Frank Benford independently discovered the law in the 1930s and compiled copious examples to make his point. His original data table shows that this distribution of leading digits applies for a vast range of different quantities.

Benford Stockmarkets

In fact, it turns out that daily stockmarket returns follow Benford’s Law. When Eduardo Ley investigated this in On the Peculiar Distribution of the U.S. Stock Indexes’ Digits, he observed that:
“The analysis presented here suggests that small changes are more likely than big ones; at the same time, the closer the daily changes are (in absolute value) to 0.1%, the more probable they are too”.
Which accords with common sense: daily movements in stockmarkets are more likely to be small than large. This is characteristic of situations where Benford’s Law works – the numbers need to describe the sizes of similar phenomena and need to range across several orders of magnitude. So human height, which varies over a narrow range, doesn’t follow the law. Nor do numbers that are assigned, like social security numbers. However, where the numbers cover several orders of magnitude, like people’s incomes, then it’s likely to hold.
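The contrast between wide-ranging and narrow-ranging numbers can be sketched with a quick simulation. Everything here is an assumption made for illustration: the "incomes" are drawn from a lognormal distribution with a deliberately wide spread, the "heights" from a normal distribution centred on 170 cm, and neither is fitted to real data.

```python
import math
import random
from collections import Counter


def leading_digit(x: float) -> int:
    """First significant digit of a positive number, read off scientific notation."""
    return int(f"{x:.15e}"[0])


def first_digit_shares(values):
    """Share of values starting with each digit 1 to 9."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


rng = random.Random(2011)  # fixed seed so the sketch is repeatable

# Hypothetical incomes: spread across several orders of magnitude.
incomes = [rng.lognormvariate(10.0, 3.0) for _ in range(20_000)]

# Hypothetical adult heights in centimetres: tightly bunched around 170.
heights = [rng.gauss(170.0, 10.0) for _ in range(20_000)]

income_shares = first_digit_shares(incomes)
height_shares = first_digit_shares(heights)

for d in range(1, 10):
    benford = math.log10(1 + 1 / d)
    print(f"{d}: Benford {benford:.3f}  incomes {income_shares[d]:.3f}  heights {height_shares[d]:.3f}")
```

The simulated incomes should track the Benford shares closely, while nearly every simulated height starts with a 1 for the dull reason that almost everybody is between 100 and 199 cm tall.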

A familiar example would be the growth of stock market capitalisation. If a company has a market cap of $100 million and is growing at 10% per year it’ll take it just over 7 years to grow to $200 million, a further 4 and a bit to get to $300 million, about 3 more to get to $400 million and so on. The company spends more time with a market cap starting with 1 than with any other digit, so a first digit of 1 naturally occurs more often in such situations.
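The arithmetic behind the market cap example is compound growth: the time to get from one value to another at a steady rate is log(end / start) / log(1 + rate). Here is a small sketch using the 10% growth rate and $100 million starting point from the example; the function and variable names are illustrative only.

```python
import math

GROWTH_RATE = 0.10  # 10% per year, as in the market cap example


def years_to_grow(start: float, end: float, rate: float = GROWTH_RATE) -> float:
    """Years of steady compound growth needed to get from start to end."""
    return math.log(end / start) / math.log(1 + rate)


# Total time for the market cap to add a whole extra digit ($100m to $1,000m).
total_years = years_to_grow(100, 1000)

# Time spent with each leading digit on the way there.
years_per_digit = {d: years_to_grow(100 * d, 100 * (d + 1)) for d in range(1, 10)}

for d, years in years_per_digit.items():
    share = years / total_years
    print(f"leading digit {d}: {years:.1f} years ({share:.1%} of the journey)")
```

The share of time spent on each leading digit works out to log10(1 + 1/d) whatever the growth rate, which is exactly the Benford distribution.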

Tricky Math

Although it’s not too difficult to see why this peculiar distribution occurs, it turns out to be exceptionally difficult to unpick the mathematics. In fact mathematicians are still arguing about it, even though Ted Hill came up with an explanation over a decade ago that suggests, more or less, that it’s what you get if you sample naturally occurring distributions randomly. Hill explains this in this American Scientist article:
“Suppose you are collecting data from a newspaper, and the first article concerns lottery numbers (which are generally uniformly distributed), the second article concerns a particular population with a standard bell-curve distribution and the third is an update of the latest calculations of atomic weights. None of these distributions has significant-digit frequencies close to Benford’s Law, but their average does, and sampling randomly from all three will yield digital frequencies close to Benford’s Law”.
Although that’s a darned peculiar newspaper.

Psychology's Not Random

Now Benford’s Law isn’t something you’d actually expect to find, so if you’re not aware of it and you want to fake some numbers for a naturally occurring distribution you’d probably spread the leading digits evenly. Of course, if you’re looking at a distribution that naturally follows Benford’s Law you’d end up producing something completely wrong. Human psychology being what it is, we are very, very bad at pretending to be random anyway. Ted Hill, again, explains this in The Difficulty of Faking Data, where he gets his students to:
“Flip a coin 200 times and record the results, or merely pretend to flip a coin and fake the results. The next day I analyse them by glancing at each student’s list and correctly separating nearly all the true from the faked data. The fact in this case is that in a truly random sequence of 200 tosses it is extremely likely that a run of six heads or six tails will occur … but the average person trying to fake such a sequence will rarely include runs of that length”.
This idea lends itself to analysis of financial figures and has been used to, amongst other things, uncover suspected fraud in an Iranian election, show that Bill Clinton rounded up the numbers in his tax return but didn’t falsify them and suggest that Enron was fiddling its numbers before its collapse in 2001. The main proponent of this approach, especially as applied to financial analysis, has been Mark Nigrini, as described in I’ve Got Your Number.
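Hill’s coin-flipping claim is easy to check without flipping anything. The sketch below works out the exact chance of seeing a run of at least six heads or six tails somewhere in 200 fair tosses by tracking the length of the current streak. It assumes a fair coin and independent tosses, and the function name is just an illustrative choice.

```python
def prob_run_at_least(n_tosses: int, run_length: int) -> float:
    """Exact chance that n fair tosses contain a run of at least run_length
    identical outcomes (all heads or all tails)."""
    if n_tosses < 1 or n_tosses < run_length:
        return 0.0
    if run_length <= 1:
        return 1.0

    # no_run[r] = chance that no long run has happened yet and the
    # current streak is exactly r tosses long (r = 1 .. run_length - 1).
    no_run = [0.0] * run_length
    no_run[1] = 1.0  # after the first toss the streak has length 1

    for _ in range(n_tosses - 1):
        nxt = [0.0] * run_length
        # The next toss differs from the last one: the streak resets to 1.
        nxt[1] = 0.5 * sum(no_run)
        # The next toss matches: the streak grows by one.
        for r in range(1, run_length - 1):
            nxt[r + 1] = 0.5 * no_run[r]
        # A streak of run_length - 1 that grows again completes a long run,
        # so that probability simply drops out of the table.
        no_run = nxt

    return 1.0 - sum(no_run)


p = prob_run_at_least(200, 6)
print(f"Chance of a run of six or more in 200 tosses: {p:.1%}")
```

A student who avoids long streaks because they "don’t look random" is therefore leaving out something a real coin would almost certainly produce.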

Shady Corporations

When Saville applied this approach to the Johannesburg Stockmarket he found that the practice met the theory: applying Benford’s Law to the income statements of various companies, including many known to have produced fraudulent numbers, he showed that:
“The test of Benford’s Law correctly identified 88.24% of the cases (30 out of 34 companies), and correctly identified 100% of ‘errant’ cases. The reason for this appears to be elegantly simple: like supernovae, fraudulent companies give themselves away by shining more brightly than their peers as they zealously thrash away in their final moments”.
Basically, when people start fabricating numbers they simply can’t help but make them up to fit their expectations. This was never more true than with the entrepreneurial yet mathematically illiterate Kevin Lawrence who, as reported by Leonard Mlodinow in his superbly enlightening The Drunkard’s Walk, having raised $91 million to set up a chain of health clubs, proceeded to buy:
“Several homes, twenty personal watercraft, forty-seven cars (including five Hummers, four Ferraris, three Dodge Vipers, two DeTomaso Panteras, and a Lamborghini Diablo), two Rolex watches, a twenty-one carat diamond bracelet, a $200,000 samurai sword, and a commercial candy machine”.
He was caught by a determined accountant applying Benford’s Law to his company’s financial dealings. As in so many cases of fraud, you end up wondering how exactly he thought he was going to get away with it. Still, it’s encouraging to think that a relatively simple analysis of company income statements can reveal recidivist corporations. However, we need to remember that Benford’s Law is by no means perfect. It will throw up false positives, and even Dilbert knows that.
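To give a flavour of what such an analysis might look like, here is a minimal sketch of one common way to compare a set of figures with Benford’s Law: a chi-square test on the first digits. This is an assumption about the general approach, not a reconstruction of the method used by Saville, Nigrini or Lawrence’s accountant, and the sample data are invented for illustration. The cut-off of 15.507 is the standard 5% critical value for 8 degrees of freedom, which by construction flags about one honest data set in twenty: the false positives mentioned above.

```python
import math
from collections import Counter

# Standard 5% critical value of the chi-square distribution, 8 degrees of freedom.
CHI2_CRITICAL_5PCT_8DF = 15.507


def leading_digit(x) -> int:
    """First significant digit of a non-zero number."""
    return int(f"{abs(x):.15e}"[0])


def benford_chi_square(values) -> float:
    """Chi-square distance between observed first digits and Benford's Law."""
    digits = [leading_digit(v) for v in values if v != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one non-zero value")
    counts = Counter(digits)
    chi2 = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        chi2 += (counts[d] - expected) ** 2 / expected
    return chi2


def looks_suspicious(values) -> bool:
    """True if the first digits stray further from Benford than the 5% cut-off."""
    return benford_chi_square(values) > CHI2_CRITICAL_5PCT_8DF


# Invented examples: powers of 2 follow the law closely, whereas the
# three-digit numbers 100..999 have perfectly even first digits.
benford_like = [2 ** n for n in range(1000)]
evenly_spread = list(range(100, 1000))

print("powers of 2 suspicious?  ", looks_suspicious(benford_like))
print("even digits suspicious?  ", looks_suspicious(evenly_spread))
```

A flag from a test like this is a reason to look more closely, not proof of fraud, and it only makes sense for figures that ought to follow the law in the first place.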

Related articles: Recency: Hot Hands and the Gambler's Fallacy, Physics Risk Isn't Market Uncertainty, In Markets Bad Stuff Happens - Frequently


  1. Interesting post, thanks.


    "randomly sample naturally occurring distributions randomly"

    Now that’s random.

  2. Typo?:

    "randomly sample naturally occurring distributions randomly"

    Yep, too many "randomly"s: some kind of post-modernist joke - i.e. not funny.

    Fixed, thanks.

  3. We see it when we believe it. Great post.

  4. Were you inspired by IOT or is that a random coincidence?

  5. A fascinating phenomenon, thanks. On a simpler level, just asking people to produce random two digit numbers often makes them reluctant to choose repeated digits like 11, 22 etc.

  6. Hi Patrick

    Interesting program. But no, I wrote this about 3 months ago as a follow up to Cardano's Gambit. If you're interested in this stuff Mlodinow's book is excellent: I found that because I was looking for an example of fraud on a corporate scale for the post: serendipity is everywhere, if you look hard enough :)

  7. Random House published a very good book with random numbers in the fifties. As was pointed out by several of the reviewers, the only problem was that it lacked an index or other tool to help if you wanted to locate your favorite random number.