Benford’s Law and the Hidden Geometry of All Numbers
When I came across Benford’s Law, also known as the Significant Digit Phenomenon, I was pissed off. I recently turned 26, and that means I’ve spent a quarter century of my life completely unaware of what would appear to be one of the most important mathematical laws in existence. Worse still, this particular mathematical law was first noticed in 1881 and rediscovered by Frank Benford in 1938, so I am several lifetimes behind the curve.
Surprise is good, and existential shock is even better. As we’ll see in the course of this article, being whumped upside the head by the utterly absurd and totally unexpected is the best possible way for a human being to learn. There are whole disciplines of mathematics that define Information as “the difference that makes a difference.”
And man...this is a difference that messed my head up good.
Dr. Theodore P. Hill asks his mathematics students at the Georgia Institute of Technology to go home and either flip a coin 200 times and record the results, or merely pretend to flip a coin and fake 200 results. The following day he runs his eye over the homework data, and to the students’ amazement, he easily fingers nearly all those who faked their tosses.
“The truth is,” he said in an interview, “most people don’t know the real odds of such an exercise, so they can’t fake data convincingly.”
Dr. Hill is one of a growing number of statisticians, accountants and mathematicians who are convinced that an astonishing mathematical theorem known as Benford’s Law is a powerful and relatively simple tool for pointing suspicion at frauds, embezzlers, tax evaders, sloppy accountants and even computer bugs.
The income tax agencies of several nations and several states, including California, are using detection software based on Benford’s Law, as are a score of large companies and accounting businesses.
So what’s going on? Well, it turns out that random is not random after all. No matter where you assemble your numbers from (and dig the examples in the chart above: birth rates, census data, nearly anything) you will find that the distribution of leading digits will obey the curve depicted above. The digit 1 will lead just over 30% of the time, and there’s a smooth curve down to the digit 9, which leads less than 5% of the time.
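You don’t have to take the chart’s word for it. Here is a minimal sketch that counts leading digits in a sequence spanning many orders of magnitude and compares the observed share of each digit with the logarithmic curve log10(1 + 1/d). The powers of 2 are my own stand-in data set, not one of the examples in the chart, and the function names are made up for illustration.

```python
import math
from collections import Counter


def leading_digit(n: int) -> int:
    """Return the first (most significant) decimal digit of a positive integer."""
    return int(str(n)[0])


def leading_digit_shares(values):
    """Map each digit 1-9 to the fraction of values that start with it."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Assumption: powers of 2 stand in for "real world" data here.
powers_of_two = [2 ** k for k in range(1, 1001)]
observed = leading_digit_shares(powers_of_two)

for d in range(1, 10):
    expected = math.log10(1 + 1 / d)
    print(f"{d}: observed {observed[d]:.3f}, Benford {expected:.3f}")
```

Swap in any list of positive integers you like for powers_of_two. Data that covers only a narrow range of magnitudes will generally not track the curve as closely.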
Is all this explainable by a simple tautology? It’s also interesting to note that Benford’s Law is true for nearly anything, but not everything: observe how Lottery numbers do not conform to the curve. As we will see later, this is red flag evidence that the Lottery, much like presidential elections or Las Vegas card games, is in fact rigged.
News Is Not Information But Information Is News
In a recent Skilluminati Research ramble about Rethinking Information and the Attention Economy, I quoted a passage from Mark Taylor’s The Moment of Complexity: Emerging Network Culture which bears repeating (and expanding) here:
Information, we have discovered, is inversely related to probability...the more improbable a phenomenon or event, the less it is anticipated, and thus the more information it communicates when it occurs.
Another source of confusion is that we generally do not think of information as a liability. We pay to have newspapers delivered, not taken away. Intuitively, the record of past actions appears to be a valuable (or at worst, useless) commodity. Perhaps the increasing awareness of environmental pollution and the information explosion brought on by computers have made the idea that information can have a negative value seem more natural now than it would have seemed earlier in the century.
This is why so much of the progress in psychology this past century has come from the study of abnormal cases, and this is also why genetics and evolution theory are both deeply indebted to an obscure branch of science known as Teratology: the study of monsters. Monsters in this case meaning the unworkable and spectacular mutations and birth defects that fill up Philadelphia’s uniquely memorable Mütter Museum. When I was a little kid, I enjoyed reading my mom’s Oliver Sacks books, such as The Man Who Mistook His Wife for a Hat, because Sacks was an eminent neurologist who made startling discoveries and advances by studying the one-in-100-million rare disorders of the brain.
Explanations vs. Solutions
As I might have observed once or twice before, humans are exceedingly talented at solving problems. Explaining how our solutions work, though....yeah, not so much. Although we have abundantly proven that Benford’s Law exists and holds true for a dizzying array of phenomena at all levels of scale, nobody has provided an adequate explanation for why it happens.
Though empirically firmly established by Benford’s experiments (and by later findings), there has been no convincing physical or mathematical explanation for the Significant Digit Phenomenon. Several explanations have been proposed, including arguments based on scale- and base-invariance, extensions of the notion of natural density, and picking a probability distribution at random. However, none of these arguments have been fully convincing.
--courtesy of this rather sparse website
I cannot resist passing the microphone to the ghost of R. Buckminster Fuller for a second:
While it takes but meager search to discover that many well-known concepts are false, it takes considerable search and even more careful examination of one’s own personal experiences and inadvertently spontaneous reflexing to discover that there are many popularly and even professionally unknown, yet nonetheless fundamental, concepts to hold true in all cases and that already have been discovered by other as yet obscure individuals. That is to say that many scientific generalizations have been discovered but have not come to the attention of what we call the educated world at large, thereafter to be incorporated tardily within the formal education processes, and even more tardily, in the ongoing political-economic affairs of everyday life. Knowledge of the existence and comprehensive significance of these as yet popularly unrecognized natural laws often is requisite to the solution of many of the as yet unsolved problems now confronting society. Lack of knowledge of the solution’s existence often leaves humanity confounded when it need not be.
Why is this Dope, Hip, Fresh and/or Relevant?
I fully realize it’s a sign of moral weakness, or at least considerable marijuana abuse, to quote the classic Darren Aronofsky movie Pi, but I will do so now:
1. Mathematics is the language of nature.
2. Everything around us can be represented and understood through numbers.
3. If you graph these numbers, patterns emerge.
Therefore: There are patterns everywhere in nature.
After all, the best part about restarting Brainsturbator is knowing exactly who I’m writing for. I have no need whatsoever to convince you that it’s a Very Good Idea to pay attention to reality, look for patterns, make connections, and do independent research. That’s why you’re here, and that’s a beautiful thing.
Speaking as a proud criminal, Benford’s Law has obvious and immediate applications in the realm of money laundering and data theft. Remember the NYT excerpt that began this article: Benford’s Law is a “powerful and relatively simple tool for pointing suspicion at frauds, embezzlers, tax evaders, sloppy accountants and even computer bugs”. As Professor Hill taught us (without even knowing he did it) if you’re going to fake the funk, fake the funk right. A recent Physorg article provides this invaluable hint:
For example, because a year’s accounting data of a company should fulfill the law, economists can detect falsified data, which is very hard to manipulate to follow the law. (Interestingly, scientists found that numbers 5 and 6, rather than 1, are the most prevalent, suggesting that forgers try to “hide” data in the middle.)
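To make that concrete, here is a rough sketch of what a first-digit check might look like. It is not the method used by any tax agency or by the Physorg authors; it just tallies first digits and computes a Pearson chi-square statistic against Benford’s curve. The function names and both sample ledgers are invented for illustration.

```python
import math
from collections import Counter

# Benford's expected share for each leading digit 1-9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x: float) -> int:
    """First significant digit of a nonzero number, ignoring sign and scale."""
    mantissa = f"{abs(x):.15e}"  # e.g. 4200 -> '4.200000000000000e+03'
    return int(mantissa[0])


def benford_chi_square(amounts):
    """Pearson chi-square statistic of observed first digits vs. Benford's curve."""
    digits = [first_digit(a) for a in amounts if a != 0]
    counts = Counter(digits)
    n = len(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in BENFORD.items()
    )


# Hypothetical ledgers: one grows by 7% per step, one hides in the 5000s and 6000s.
grown = [100 * 1.07 ** k for k in range(500)]
faked = [5000 + 7 * k for k in range(250)]

print(f"grown: {benford_chi_square(grown):.1f}")
print(f"faked: {benford_chi_square(faked):.1f}")
```

With nine digits there are 8 degrees of freedom, so a statistic above roughly 15.5 is significant at the 5% level. A high score is only grounds for a closer look, not proof of fraud.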
“Election Forensics”
Here’s an immediately useful application: determining whether or not the results of a democratic election are legitimate. Considering I live in a country where the past two (TWO) Presidential elections have been stolen in plain sight and I still have people telling me I should be excited about Barack Obama, this is pretty hip, dope, fresh and relevant to me. We have covered electronic voting machines and the delicate art of stealing elections once before, so here’s a hopeful update:
Fraudulent elections and disputes about election outcomes are nothing new. Gumbel (2005) reviews the sorry history of deceit and electoral manipulation in America, going back to the dawn of the republic. Throughout the world, in old and new democracies alike, allegations of vote fraud frequently occur (Lehoucq 2003). One new element is voting technologies that make some familiar methods for physically verifying the accuracy of vote totals impossible to use. The advent of electronic voting machines means that often now there are no paper ballots to be recounted. To steal an election it is no longer necessary to toss boxes of ballots in the river, stuff the boxes with thousands of phony ballots, or hire vagrants to cast repeated illicit votes. All that may be needed nowadays is access to an input port and a few lines of computer code. To detect such manipulations is a difficult and urgent problem. In terms of legitimacy it is not clear whether the worse problem is that erroneous election outcomes may occur or that many may not believe that correct outcomes are valid.
Long story short, according to the numbers, Mexico’s 2006 election was rigged, Venezuela’s 2004 election was crooked, and the US 2004 election was completely fraudulent in the state of Florida. You can read the rest here.
Further Reading for Curious Primates
In 1986, Theodore Hill presented a “rigorous proof” of Benford’s Law which is surprisingly readable for the most part:
The white paper that introduced me to this concept—very readable:
Walter Mebane paper on using Benford’s law to analyze election results:
3 responses to "Benford’s Law and the Hidden Geometry of All Numbers"
May 15, 2007 at 4:18 PM
Weasel85 says...
Ok, Benford’s Law is what it is, but it’s not all it’s cracked up to be. As far as my research shows, Benford’s Law applies in two types of situations:
A) When there is no real, measurable limit to the data being analyzed (eg: address numbers - someone could start them at any given number and end them at any given number)
B) When there is a measurable limit to the data, and it falls into a range that favors 1 as the leading digit. (eg: rolling a 20 sided die, more than half the results should have a leading 1)
This is relatively simple to explain (I hope). Look at two things in particular, and it becomes obvious: the odds of the leading digit being a 1 vs. the odds of it being a 9. If your data is numbers from 1 to 9, the likelihood of the leading digit being 1 is 11.11%, as are the odds of it being 9, or any other number.
Now, look at the odds when the data can range from 1 through 19. More than half of those numbers start with 1. The likelihood of the leading digit being 1 is 57.89%, whereas the odds of it being 9 (or any other number) are 5.26%. If your data ranges from 1 to 29, the odds of the leading digit being 1 or 2 are 37.93% each, and every other number 3 through 9 is at 3.45%. When you get to a range of 1 to 89, every number has odds of 12.36%, except 9, which has 1.12% odds. When the range is 1 to 99, the odds are once again even. Everything is back to 11.11%.
Now here’s the important part: through all of those possible input data ranges, the odds of the leading digit being 1 fluctuated between a low of 11.11% and a high of 57.89%. The odds of the leading digit being 9 fluctuated between a low of 1.12% and a high of 11.11%.
So this is why Benford’s Law works in some cases, but not in others. The odds of different data ranges average out to a curve close to the one described by Benford’s Law.
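Those percentages are easy to reproduce. A minimal sketch, assuming numbers are drawn uniformly from 1 up to some upper limit (the function name leading_digit_odds is made up for illustration):

```python
from collections import Counter


def leading_digit_odds(upper: int):
    """Percent chance of each leading digit when drawing uniformly from 1..upper."""
    counts = Counter(int(str(n)[0]) for n in range(1, upper + 1))
    return {d: 100 * counts.get(d, 0) / upper for d in range(1, 10)}


for upper in (9, 19, 29, 89, 99):
    odds = leading_digit_odds(upper)
    print(f"1..{upper}: P(1) = {odds[1]:.2f}%, P(9) = {odds[9]:.2f}%")
```

Try other upper limits to watch the odds for 1 swing up and down while the odds for 9 stay at or below 11.11%.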
May 15, 2007 at 11:28 PM
thirtyseven says...
^^I love the clarity of your thought, man, much respect due on cracking that open. Once I can understand it better and synthesize it, I’ll do a follow-up and probably get back to you for help on visuals. Thanks for taking the time to post this, man—much appreciated.
I’m also very interested in that other project you mentioned with the primes. Keep me posted.
May 19, 2007 at 10:54 PM
mr_fnortner says...
As a former auditor who has used Benford’s law, I have some practical understanding of how it can be applied; and why it works. Numbers that fit a “normal” distribution (the standard bell curve) are poor candidates for Benford, as are uniform distributions like random drawings, lotteries, or the roll of one die. These distributions have their own behaviors and other laws describe them much better.
Distributions that fit Benford’s Law are those that come from growth, effort, or diminishing likelihood. That is, the harder it is for an ongoing phenomenon to produce a quantity of greater value, the more suitable to Benford’s Law is the series of quantities generated. Two side truths are also present in the law: it doesn’t matter what rate of growth or degree of effort is involved, and the unit of measure is unimportant.
Here are some examples: cash register receipts, bank deposits, payroll checks, charitable deductions, size of insurance claims, wait staff tip sizes, number of heads before a tail (or vice versa), and the like.
Here’s how it works: Benford knew that the base 10 log of 2 = 0.301029995663981, and the base 10 log of 3 = 0.477121254719662. He reasoned that quantities beginning with the digit 1 would take the first 30.1% of the logarithms. Likewise, quantities beginning with the digit 2 would take the next 17.6% of the logarithms (0.477 minus 0.301), and so on. So he reasoned that by taking the logarithms of the first digit of certain types of quantities, he could estimate the probability of the occurrence of that digit. The formula, in crude form, is probability = log10(digit + 1) - log10(digit).
Simplification of terms yields the better version of Benford’s Law that says that the probability of any initial digit D occurring is log10(1 + 1/D). Or, the base 10 logarithm of the quantity 1 plus the fraction 1 over D.
This means that numbers beginning with 1 will occur 30.1% of the time. Numbers beginning with 2, 17.6% of the time, and so on. Or put differently, quantities will spend 30.1% of their time being numbers that start with 1, and 17.6% of their time being numbers that start with 2.
Basically, it’s not easy being a large number. It’s hard to grow up. More quantities spend time being small numbers and fewer quantities spend time being large numbers. (This is not very mathematical, true, but it is very real-world.)
Here is the complete table:
1 30.1%
2 17.6%
3 12.5%
4 9.7%
5 7.9%
6 6.7%
7 5.8%
8 5.1%
9 4.6%
I can add more if anybody wants.
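For anyone who wants to check the table, the formula fits in a few lines. This is only a sketch of the arithmetic described in the comment above; the function name benford_probability is made up for illustration.

```python
import math


def benford_probability(digit: int) -> float:
    """Probability that a quantity's leading digit equals `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


for d in range(1, 10):
    # Crude form from the comment: log10(d + 1) - log10(d). Same value.
    crude = math.log10(d + 1) - math.log10(d)
    assert math.isclose(crude, benford_probability(d))
    print(f"{d}  {100 * benford_probability(d):.1f}%")
```

The nine probabilities add up to 1, because the differences telescope to log10(10) - log10(1).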