Puzzler: Benford’s law

Posted by Doug Hull, April 8, 2010

I had a question come in recently from Denmark about wanting to implement Benford’s law.

I first heard about this law on the Radiolab episode about Numbers. Radiolab is one of my favorite podcasts, highly recommended!

This law basically says that

“in lists of numbers from many (but not all) real-life sources of data, the leading digit is distributed in a specific, non-uniform way. According to this law, the first digit is 1 almost one third of the time, and larger digits occur as the leading digit with lower and lower frequency, to the point where 9 as a first digit occurs less than one time in twenty. This distribution of first digits arises whenever a set of values has logarithms that are distributed uniformly, as is approximately the case with many measurements of real-world values.”

(wikipedia)
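
For reference, the distribution the quote describes is usually written as a one-line formula. It is not given in the quoted passage; it is the standard statement of Benford's law for base-10 leading digits:

```latex
P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d = 1, 2, \dots, 9
```

This gives P(1) ≈ 0.301 (the "almost one third" above) and P(9) ≈ 0.046 (the "less than one time in twenty").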

Apparently this law can be applied to measurements of natural phenomena like coastal wind farm energy output, numbers of migrating geese, and various areas of water flow, provided they follow the distribution described above.

Where this law becomes more than a novelty is in looking at accounting expense reports. Don’t try to sneak those tickets to see the OB play or the Mighty Boosh in among your travel expenses (even if you are about to leave the company!). When people make up fake expenses, they tend to choose numbers from a normal distribution. However, they should be choosing them with more ones and fewer nines as the leading digit!

Anyways, I found it surprisingly challenging to extract the leading non-zero digit from a number. Remember, the number can be a decimal, negative or both. I am always amazed at the elegant solutions people come up with for simple problems like this.

Safe to say that if you post your function in the comments, the most elegant (whatever that means at the time I am looking them over…) gets some MATLAB swag.

0.001345 –> 1

3452.3 –> 3

-582.3 –> 5

etc…
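
To make the target behaviour concrete, here is one possible sketch. It is written in Python rather than MATLAB, and it is not a submitted or winning entry, just an illustration. The function name `leading_digit` is made up for this example, and the choice to raise an error for zero, infinity, and NaN is an assumption, since the puzzle does not say what to do with them.

```python
import math


def leading_digit(x):
    """Return the first non-zero digit of x, ignoring sign and decimal point.

    Hypothetical helper for illustration; the error handling is an assumption.
    """
    if x == 0 or not math.isfinite(x):
        raise ValueError("zero, inf, and nan have no leading non-zero digit")
    # repr() of a float is the shortest decimal string that round-trips,
    # e.g. '0.001345', '582.3', or '1e-05'. The first character in 1..9 is
    # the leading digit; it always appears before any exponent part.
    for ch in repr(abs(float(x))):
        if ch in "123456789":
            return int(ch)


# The three examples from the post
for value in (0.001345, 3452.3, -582.3):
    print(value, "->", leading_digit(value))
```

Scanning the printed form sidesteps a trap in the more obvious arithmetic route (dividing by a power of ten found with `log10` and `floor`), where floating-point rounding can occasionally put a value on the wrong side of a power of ten.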