Thursday, September 10, 2009

The Drunkard's Walk

Four Stars
The Drunkard's Walk: How Randomness Rules Our Lives (Vintage)

A somewhat confusing and muddled beginning, but the book shines thereafter. It is a very readable, non-technical introduction to probability, randomness, and statistics, and even more so to the people at the heart of the development of this science. However, the book does suffer somewhat from this deliberate dumbing-down. For example, there are no formulae for calculating variance, standard deviation, or conditional probabilities. Even if the average reader never uses or even reads these formulae, some readers at least would benefit from reading them and trying them out. Consider "Innumeracy: Mathematical Illiteracy and Its Consequences" as an excellent companion to this book; it is again a non-technical book, but with slightly more advanced examples. Read Nudge: Improving Decisions About Health, Wealth, and Happiness (my review, and on Amazon.com) and Stumbling on Happiness (my review, and on Amazon.com) for a better understanding of why and how people make mistakes when making decisions under uncertainty, and Judgment under Uncertainty: Heuristics and Biases for Kahneman and Tversky's research in behavioral economics.



Books on probability, randomness, statistics, and similar mathematical topics run the risk of becoming too number-focused, which, while adding tremendous value to the book, alienates 99% of the intended audience. On the other hand, deliberately dumbed-down books do little to inform the reader, much less educate him. This book falls somewhere in the middle, skewed more towards the dumbed-down end. What redeems the book, however, is its attempt to bring in a historical perspective: each chapter is tied to an important discovery or development in the science of probability, randomness, and statistics, and the author weaves in a very readable and well-written story of the people behind these developments.

I found the beginning a bit muddled because it was not clear whether the author was going to delve into the errors in judgment that people make, getting into territory already covered by such books as "Nudge", "Predictably Irrational", "Sway", and others, or whether he was going to go down the mathematical road. It takes a chapter or two for that to become clear, and till then you have to persevere with the book. The effort is rewarded thereafter. Also, since some of the book talks about randomness, how people are misled by it, and how people often err when making decisions, some of its examples are shared with other books like Nudge and Stumbling on Happiness, which also do a better job of describing them. The bombing clusters in London during World War II and cancer clusters are two such examples.
The author has a restrained sense of humour that leaps out unexpectedly, as in "... your mother-in-law yells, "Look out for that moose!" and you swerve into a warning sign that says essentially the same thing." [page 147]
or
"People like to exercise control over their environment, which is why many of the same people who drive a car after consuming half a bottle of scotch will freak out if the airplane they are on experiences minor turbulence." [page 185]

The journey starts in Chapter 3 with a relatively unknown Cardano, whose "... insight into how chance works came embodied in a principle we shall call the law of the sample space" (page 42). Pascal follows in Chapter 4, Bernoulli in Chapter 5, Bayes and conditional probability in Chapter 6, Laplace and Gauss in Chapter 7, Galton in Chapter 8, and so on. The last chapter, aptly titled "The Drunkard's Walk", briefly describes chaos theory and the butterfly effect. To that end, two books mentioned are Normal Accidents: Living with High-Risk Technologies and Chaos: Making a New Science.

There is of course a steady parade of terms you would encounter in statistics and probability, like "regression towards the mean" (page 8; see the Wikipedia entry and the paper by Francis Galton), "isomorphism", "frequency interpretation of randomness" (page 85), "standard deviation" (page 134), "error law" (page 136), "margin of error" (page 141), "central limit theorem" (page 143), "coefficient of correlation" (page 163), "chi-squared test" (page 164), "significance testing" (page 171), and so on. But there are no formulae in the book, so have hope.
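For readers who do want to try out the missing formulae, here is a minimal Python sketch of three of them: variance, standard deviation, and conditional probability. None of this comes from the book. The `scores` data is made up, the helper names are hypothetical, and the two-child example is just a classic illustration that assumes boys and girls are equally likely.

```python
import math
from fractions import Fraction
from itertools import product


def variance(values):
    """Population variance: the mean of the squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def standard_deviation(values):
    """Population standard deviation: the square root of the variance."""
    return math.sqrt(variance(values))


def conditional_probability(outcomes, event, given):
    """P(event | given), assuming every outcome in the list is equally likely."""
    given_outcomes = [o for o in outcomes if given(o)]
    both = [o for o in given_outcomes if event(o)]
    return Fraction(len(both), len(given_outcomes))


# Made-up data, purely for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(scores), standard_deviation(scores))

# Classic example: a family has two children and at least one is a girl.
# What is the chance that both are girls? (Assumes B and G equally likely.)
families = list(product("BG", repeat=2))  # BB, BG, GB, GG
p_both_girls = conditional_probability(
    families,
    event=lambda f: f == ("G", "G"),
    given=lambda f: "G" in f,
)
print(p_both_girls)
```

The conditional probability is computed by simply counting outcomes in the reduced sample space, which is the kind of reasoning the book describes in words.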

Some excerpts from the book:
"... for the Hindus had taken the first large steps toward employing arithmetic as a powerful tool. It was in that milieu that positional notation in base ten developed, and became standard around AD 700. The Hindus also made great progress in the arithmetic of fractions - something crucial to the analysis of probabilities, since the chances of something occurring are always less than one. This Hindu knowledge was picked up by the Arabs and eventually brought to Europe." [page 49]

"Pascal's great innovation was his method of balancing those pros and cons, a concept that is today called mathematical expectation. ... Pascal's wager is often considered the founding of the mathematical discipline of game theory..." [pages 76, 77] (See the Wikipedia pages on mathematical expectation and on Pascal's wager.)

"In 1896, the American philosopher Charles Sanders Peirce wrote .... (what) is called the frequency interpretation of randomness. The main alternative to it is called the subjective interpretation. Whereas in the frequency interpretation you judge a sample by the way it turned out, in the subjective interpretation you judge a sample by the way it is produced. .... For example, in a perfect world a throw of a die would be random by the first definition but not the second. .... In the imperfect world, however, a throw of a die is random according to the second definition but not the first." [pages 84, 85]

On the gambler's fallacy: "... mistaken notion connected with the law of large numbers is the idea that an event is more or less likely to occur because it has or has not happened recently." [page 101]
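The fallacy is easy to check with a simulation. The sketch below is my own illustration, not something from the book: it flips a simulated fair coin many times and looks only at the flips that come right after a run of three tails. The function name, the run length of three, and the seed are all arbitrary assumptions.

```python
import random


def heads_rate_after_tails_run(n_flips, run_length, seed=0):
    """Fraction of heads among flips that immediately follow `run_length` tails."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True means heads
    following = [
        flips[i]
        for i in range(run_length, n_flips)
        if not any(flips[i - run_length:i])  # the previous flips were all tails
    ]
    return sum(following) / len(following)


rate = heads_rate_after_tails_run(200_000, run_length=3)
print(f"Heads after three tails in a row: {rate:.3f}")
```

If heads were somehow "due" after a streak of tails, the printed rate would sit noticeably above one half. For a fair coin it should stay close to 0.5, because each flip is independent of the ones before it.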

"... the fundamental difference between probability and statistics: the former concerns predictions based on fixed probabilities; the latter concerns the inference of those probabilities based on observed data." [page 122]
Recently, investigations were launched into companies strongly suspected of backdating the stock options granted to their CEOs and other executives so as to maximize the gains to these people. If so, that would be illegal. It should come as no surprise, therefore, that centuries ago:
"Laplace argued that this new mathematics could be employed to assess legal testimony, predict marriage rates, calculate insurance premiums. ... " These ideas were developed by Adolphe Quetelet. "Quetelet had stumbled on a useful discovery: the patterns of randomness are so reliable that in certain social data their violation would be taken as evidence of wrongdoing." [pages 154, 156]
[Screenshot from the Wall Street Journal on backdating; the original post also linked to three WSJ articles.]
"That smooth bell curve is more than just a visualization of the numbers in Pascal's triangle; it is a means of obtaining an accurate and easy-to-use estimate of the numbers that appear in the triangle's lower lines. This was DeMoivre's discovery.
Today the bell curve is usually called the normal distribution and sometimes the Gaussian distribution. ... The normal distribution is actually not a fixed curve but a family of curves, in which each depends on two parameters to set its specific position and shape. The first parameter determines where its peak is located, .... The second parameter determines the amount of spread in the curve. ... This measure is called the standard deviation." [page 138]
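To see how good this estimate is, here is a small Python sketch of my own (not from the book) that compares exact entries in row 100 of Pascal's triangle with the bell-curve estimate. It assumes the standard form of the approximation, a normal curve with peak at n/2 and standard deviation sqrt(n)/2, scaled by the row total 2^n. The function names and the choice of row are hypothetical.

```python
import math


def pascal_entry(n, k):
    """Exact entry k in row n of Pascal's triangle (the binomial coefficient)."""
    return math.comb(n, k)


def bell_curve_estimate(n, k):
    """Normal-curve estimate of the same entry.

    First parameter (location of the peak): n / 2.
    Second parameter (spread, the standard deviation): sqrt(n) / 2.
    The curve is scaled by 2**n, the sum of all entries in row n.
    """
    mean = n / 2
    std = math.sqrt(n) / 2
    density = math.exp(-((k - mean) ** 2) / (2 * std ** 2)) / (
        std * math.sqrt(2 * math.pi)
    )
    return 2 ** n * density


def relative_error(n, k):
    """How far the estimate is from the exact entry, as a fraction of it."""
    exact = pascal_entry(n, k)
    return abs(bell_curve_estimate(n, k) - exact) / exact


for k in (50, 55, 60):
    print(k, pascal_entry(100, k), f"{relative_error(100, k):.2%}")
```

Near the middle of the row the estimate lands within about one percent of the exact value, which is what makes it an "easy-to-use" substitute for computing the huge numbers directly. The fit gets worse far out in the tails of the row.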

"(Galton) He dubbed the phenomenon - that in linked measurements, if one measured quantity is far from its mean, the other will be closer to its mean - regression toward the mean.
... Galton's other major contribution to statistics: defining a mathematical index describing the consistency of such relationships. He called it the coefficient of correlation." [pages 162, 163]

"... mathematician George Spencer-Brown, who wrote that in a random series of 10^1000007 zeros and ones, you should expect at least 10 nonoverlapping subsequences of 1 million consecutive zeros. ... Spencer-Brown's point was that there is a difference between a process being random and the product of that process appearing to be random." [pages 174, 175]


Leonard Mlodinow's talk at Google as part of the Authors@Google program (video embedded in the original post).


    © 2009, Abhinav Agarwal. All rights reserved.