Archive for August 2010
I love reading. I love reading about statistics. I love reading about psychology. But most of all, I love reading quite a few books about statistics and psychology. I might just be a little weird, but statistics and psychology do make up a large part of my job.
One study that fascinates me is Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment by Amos Tversky and Daniel Kahneman. I am not the only one: this study appears in many books by different authors, and Kahneman was awarded the Nobel Memorial Prize in Economics in 2002. In this study they asked participants (among other things) the following question.
Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.
Which is more probable?
- Linda is a bank teller.
- Linda is a bank teller and is active in the feminist movement.
If you are like the vast majority (~85%) of the participants, you feel the second option is more probable. You will also, like that same vast majority, be wrong. Wikipedia does a pretty good job explaining why this is a conjunction fallacy. How can A and B together be more probable than just A, which covers the cases with B as well as the cases without B? Every feminist bank teller is, after all, still a bank teller. Kahneman and Tversky found that this effect does not occur when the questions seem unrelated to what the participants already knew about Linda.
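To make the arithmetic concrete, here is a minimal Python sketch. The probabilities are made up purely for illustration and do not come from the study, and `conjunction` is a hypothetical helper name. It only shows that multiplying P(A) by any conditional probability P(B | A) between 0 and 1 can never give a result larger than P(A).

```python
from fractions import Fraction

# Illustrative assumptions, NOT figures from the study:
# P(A): Linda is a bank teller.
p_teller = Fraction(1, 20)
# P(B | A): Linda is active in the feminist movement, given she is a bank teller.
# Deliberately set very high to favour the "good story".
p_feminist_given_teller = Fraction(9, 10)


def conjunction(p_a, p_b_given_a):
    """Return P(A and B) = P(A) * P(B | A)."""
    return p_a * p_b_given_a


p_both = conjunction(p_teller, p_feminist_given_teller)

# Even with a 90% conditional probability, the conjunction is less probable.
assert p_both <= p_teller
print(p_teller, p_both)  # prints: 1/20 9/200
```

However high you set the conditional probability, the second option only ties the first when it reaches exactly 1.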
The perceived probability, it seems, depends on how well the answer fits our internalized story of Linda, not on logic or statistics. A good story is viewed as more likely than a poorly framed fact. This is important if your job is helping people make decisions based on numbers and statistics.
What do you think is more likely: that you understand the implications of this study or that you understand the implications of this study because I’ve provided you with a vivid (and self-referential) example of its application?
[do you see what I did there?]
A representative of a large company (which shall remain unnamed) recently called on me for some advice. They had accidentally loaded polluted data into their data warehouse and wanted to know if there was anything they could do to get rid of it. I told them that restoring their most recent backup and reloading any more recent, unpolluted data was probably the simplest solution. They concurred, but regretted to inform me that they had not made a single backup of their production system since I helped them set it up, many, many months ago.
Let me repeat that for you. Large company. Production system. No backups. Ever.
Now, you’re probably thinking that it is pretty silly for a large company to not have backups of a production system. Well, wait until you hear what happened next.
A few days later, a different representative from the same company demanded to know where exactly in the documentation we had explained that they were expected to create backups of their production system. How were they supposed to know that anything could possibly go wrong? It wasn’t their fault that they had not anticipated this disaster, right?
I’m glad to report that, in the end, everything worked out and everyone lived happily ever after. But for future reference, here’s a free (as in beer, not as in speech) tip.
Backup. Your. Production. Database.
[Also, do not attempt to dry your poodle in a microwave oven]