Wednesday, July 15, 2009

Drunkard's Walk, update

I've been blundering through the rest of Drunkard's Walk, a book about randomness. If you flat-out ignore the crappy first chapter and get over his tendency to find himself more amusing than he really is, it's a charming book that will help wrap your mind around the central tenets of probability and statistics. Considering the horror with which most people refer to probability and statistics, you may be surprised to have me refer to it as charming. Certainly it lessens the angst to know it's a field primarily brought into being by gambling, astronomy and counting the dead.

Usually I don't much like reading about the accomplishments of dead white men, but this is entertaining. And relevant. It's kind of overwhelming to think about how much of our lives is dependent upon randomness, but since I more or less got my degree in that, I've had time to get used to it. And I still haven't quite sorted it all out to my satisfaction.

For one thing, I've had classes in probability and statistics and innumerable classes and projects at work which require their regular use. Yet I'd always considered them as one idea with a two-idea name. Now I can sort them out more cleanly. Probability is the prediction of outcomes based on known inputs. Statistics is the analysis of known outcomes to predict inputs. Clean, no? This book is helping despite the shaky start.
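To make that split concrete, here's a minimal sketch in Python. Nothing in it comes from the book: the biased coin, the 0.6 chance of heads, and the function names are all made up for illustration. The first function runs in the probability direction (known input, generate outcomes) and the second runs in the statistics direction (known outcomes, estimate the input with a rough 95% margin).

```python
import random

def simulate_flips(p_heads, n, seed=0):
    """Probability direction: a known input (p_heads) produces outcomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [rng.random() < p_heads for _ in range(n)]

def estimate_p(flips):
    """Statistics direction: observed outcomes give an estimate of the input.

    Returns the estimate and a rough 95% half-width using the normal
    approximation, which is an assumption that only holds for large n.
    """
    n = len(flips)
    p_hat = sum(flips) / n
    half_width = 1.96 * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat, half_width

# Hypothetical coin with a 60% chance of heads
flips = simulate_flips(0.6, 10_000)
p_hat, half_width = estimate_p(flips)
print(f"estimated p = {p_hat:.3f} +/- {half_width:.3f}")
```

The estimate should land close to the 0.6 that went in, but never exactly on it, which is pretty much the whole story of measurement.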

One of the primary things I do as a semiconductor process engineer is measure things and make decisions based on those measurements. I can tell you, though, that most measurements have a lot more error involved than you'd think, to the point where some days I wonder why I bother to measure at all. Turns out "bad" data can be more useful than no data. I can talk your ear off about repeatability and reproducibility of measurement devices, called gauges by normal people and gages by the engineers who wrote the GR&R specs. I can explain to you why two sets of measurements that differ by 8 Angstroms are significantly different and not nonsense, but two other sets of measurements that differ by 50 microns (500,000A) cannot be called different. I can also explain how to get flatness readings of 3, 5, 8, and 15A from not only the same sample, but the same measurement of that sample. (How flat can you make it? How flat must I?) The more I measure things, the more I realize that measurements are somewhat random in and of themselves, and for the most part, are like money - people agree that they mean what we think they do, so they mean that. And this book is good at describing that, reassuring us that it's ok, and showing how randomness is actually a hard concept.
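Here's a rough sketch of how an 8 Angstrom difference can count while a 50 micron one doesn't. The numbers are invented for illustration, not real fab data, and Welch's t statistic is just one common way to compare two sets; I'm not claiming it's what any particular spec requires. The idea is that what matters is the difference between the averages relative to the scatter in the readings.

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic: difference in means divided by its standard error."""
    std_err = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(b) - mean(a)) / std_err

# Invented film thickness readings in Angstroms:
# the averages differ by 8 A and the scatter is about 1 A.
thin_a = [1000, 1001, 999, 1000, 1001, 999]
thin_b = [1008, 1009, 1007, 1008, 1009, 1007]

# Invented readings in microns:
# the averages differ by 50 um but the scatter is well over 100 um.
thick_a = [300, 520, 410, 650, 280, 540]
thick_b = [350, 570, 460, 700, 330, 590]

print(f"8 A difference:   t = {welch_t(thin_a, thin_b):.1f}")
print(f"50 um difference: t = {welch_t(thick_a, thick_b):.1f}")
```

As a rule of thumb for sets this small, a t somewhere above 2 starts to look like a real difference. The first pair comes out far above that and the second comes out below 1, even though its raw difference is tens of thousands of times bigger.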

We use measurements all the time - how tall or heavy are you and how fast are you going? How many words per minute can you blog? How many unique readers are seeing my blog (and what's the chance they'll buy me a beer)? But measurements are actually pretty sketchy things. By using statistics you gain confidence that the mess of numbers your tool just threw at you actually means something that you can use. It turns out that the path to this understanding was actually long and convoluted and didn't really even settle down in many ways until Einstein tossed in his lot with it back in 1905. So much of math and physical understanding we take for granted today - averages, standard deviations, bell curve distributions - was not always available, accepted, or useful. (Massive amounts of computer power don't hurt either, but without probability and statistics we probably wouldn't have massive amounts of computer power.) It's been quite the education to take and analyze thousands of measurements, know just how fragile they are, know how little we as a culture really understand measurement, and still be able to get out of bed in the morning. I'm up to the really quite interesting chapter that overlaps a lot with my other recent readings on human behavior, thought, and decision making. It discusses where we see patterns in randomness even when patterns aren't there. It's a lot easier to see the lack of pattern when you have the tools and language to sort it all out.
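A quick way to feel the "patterns that aren't there" problem is to look at streaks in fair coin flips. This is my own little sketch, not an example taken from the book; the seed, the 100 flips, and the 1000 repeats are arbitrary choices. It finds the longest run of identical results in each batch of flips and averages that over many batches.

```python
import random

def longest_run(flips):
    """Length of the longest streak of identical results (assumes a non-empty list)."""
    best = current = 1
    for prev, nxt in zip(flips, flips[1:]):
        current = current + 1 if prev == nxt else 1
        best = max(best, current)
    return best

rng = random.Random(2009)  # arbitrary fixed seed for repeatability
runs = [
    longest_run([rng.random() < 0.5 for _ in range(100)])
    for _ in range(1000)
]
print(f"average longest streak in 100 fair flips: {sum(runs) / len(runs):.1f}")
```

Most people asked to fake 100 coin flips won't write down a streak anywhere near as long as the ones a fair coin produces routinely, which is why real randomness tends to look suspicious.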

On the whole, humans are biased to see patterns and faces from really skimpy evidence, and to generate biased data unknowingly. To tie into my last post, old white men keep pestering Sotomayor about bias. Of course she's biased; she's human. All people are biased and generate and perceive bias. But more than most people she seems to have a really good handle on it. On the other hand, those that are trying to call her out on it are doing so by insisting that acknowledging bias is a failure, rather than discussing it in a way that isn't accusatory to figure out whether her bias would make the court healthier. Instead, by calling her a racist in polite terms, what they mean is "we're concerned you'll no longer be biased in favor of us." To me this shows a decided lack of understanding that bias even exists. They sound more like the astronomers of the 1700s who saw measurement variability as a moral failing, so instead of computing averages to get a sound number, they would choose their "best" measurement and go with their gut. The lack of consciousness of bias shows the inquisitors to have absolutely no understanding of the amount of bias they've had on their side, nor the value in having different biases represented on a court that speaks for the law of the land of a melting-pot culture that we're rather proud of.

Whether we interpret the law, preferentially choose lots for experiments that have sequential or repeated digits in the lot number (that would be my quirk), think that a sugar pill is helping with our arthritis, consult with a Ouija board, or think we can beat the stock market, it's helpful to know when you're conning yourself by finding patterns that don't exist or ascribing meaning to patterns that have no meaning, and to know your existing biases and how to work with and around them. The author isn't quite as entertaining as he thinks he is, but he is much more entertaining than most while making me think I'm smarter after reading his stuff. So I do recommend the book, for any casual or advanced reader who wonders how we know what we do and how we're so sure we know what we don't. And my beer is wearing off so I'll stop and hit publish before I think better of it.
