Archive for September 2012
Snake Oil and Tiger Repellant
The Wall Street Journal has an interesting article explaining how companies are starting to use (big) data to support their recruiting efforts. It provides a good example of the more general trend in business towards evidence-based decision-making and data science, but it also shows how some crucial aspects of these techniques are easily overlooked or oversimplified.
My big-data-science-bogus-alarm started ringing when I read the last sentence of this short paragraph.
Applicants for the job take a 30-minute test that screens them for personality traits and puts them through scenarios they might encounter on the job. Then the program spits out a score: red for low potential, yellow for medium potential or green for high potential. Xerox accepts some yellows if it thinks it can train them, but mostly hires greens.
Sounds smart, right? Well, maybe.
If Xerox never hires any “reds” and only very few “yellows”, how will they know the program is actually working? How will they know that all that complicated math is doing something more than simply returning random colour values? An evidence-based approach should always include some form of scientific control. If it doesn’t, it might as well be snake oil.
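To make the point about controls concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the normally distributed "performance", the score threshold for "green", the 10% control share and the function name `control_lift` are hypothetical, and nothing here reflects how Xerox's program actually works. It compares an informative test with a pure-noise test, and measures each against a small group of applicants hired regardless of their colour.

```python
import random
from statistics import mean


def control_lift(test_is_informative, n_applicants=20000, control_share=0.1, seed=42):
    """Mean performance of 'green' hires minus that of a random control group.

    All numbers are illustrative assumptions, not real recruiting data.
    """
    rng = random.Random(seed)
    greens, control = [], []
    for _ in range(n_applicants):
        performance = rng.gauss(0, 1)  # true on-the-job performance, unknown before hiring
        noise = rng.gauss(0, 1)
        # An informative test tracks performance; a snake-oil test is pure noise.
        score = performance + noise if test_is_informative else noise
        if rng.random() < control_share:
            control.append(performance)  # hired regardless of colour
        elif score > 1.0:
            greens.append(performance)  # hired because the test said "green"
    return mean(greens) - mean(control)


if __name__ == "__main__":
    print("informative test lift:", round(control_lift(True), 2))
    print("snake-oil test lift:  ", round(control_lift(False), 2))
```

If only greens are ever hired, both versions of the test deliver a workforce and nothing to compare it with. It is the control group that separates a test that works (a clearly positive lift) from one that returns random colours (a lift close to zero).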
Of course, this is probably just a simple journalistic crime of omission of a trivial implementation detail, but it reminded me of that old chestnut, “the tiger repellant”. For your convenience, this blog post has been equipped with some very strong Tiger Repellant tonic. If you do not see any tigers around you right now, you will know it is working.
See? No tigers?
Proven to work like a charm. Order yours today! Great prices! Limited availability! Now taking applications in the comments.
[ Disclaimer: Tiger Repellant is not certified for use in South-East Asia or zoological parks. Tiger Repellant Inc. and its employees and subsidiaries cannot be held liable for any damage caused to your person in the event of being eaten by a tiger. ]
Bin Packing Too Many Features
My girlfriend has been struggling with an interesting little problem lately. She was asked to determine the optimal distribution of medicine boxes and bottles over a set of adaptable cabinets, under volume as well as weight constraints. Not an easy task for a computer scientist, let alone for a hospital pharmacist in training.
After she described the problem to me last night, I (unhelpfully) mumbled that “this sounds like a variable sized bin packing problem to me, you can’t solve this kind of thing in Excel, you probably need an LP solver”.
Apparently I was wrong. It already seemed obvious to me that Excel suffers from a severe case of feature bloat, but this is just absurd.
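For the curious, the shape of the problem can be sketched outside a spreadsheet as well. The snippet below is a first-fit-decreasing heuristic, which is a greedy shortcut and not the LP-solver approach I mumbled about, nor whatever Excel does internally. It does not guarantee an optimal packing. The `Item` and `Cabinet` classes and their fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    volume: float
    weight: float


@dataclass
class Cabinet:
    name: str
    max_volume: float
    max_weight: float
    items: list = field(default_factory=list)

    def fits(self, item):
        """True if adding the item respects both the volume and the weight limit."""
        used_volume = sum(i.volume for i in self.items)
        used_weight = sum(i.weight for i in self.items)
        return (used_volume + item.volume <= self.max_volume
                and used_weight + item.weight <= self.max_weight)


def first_fit_decreasing(items, cabinets):
    """Place items, largest volume first, in the first cabinet with room.

    Returns the items that did not fit anywhere. Greedy, so not necessarily optimal.
    """
    unplaced = []
    for item in sorted(items, key=lambda i: i.volume, reverse=True):
        for cabinet in cabinets:
            if cabinet.fits(item):
                cabinet.items.append(item)
                break
        else:
            unplaced.append(item)  # no cabinet had enough volume and weight capacity
    return unplaced
```

Called with a list of items and a list of cabinets of different sizes, it fills the cabinets in place and hands back whatever is left over. Sorting by volume is an arbitrary choice; sorting by weight, or by some combination of the two, may pack better for a given pharmacy.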
Future Felony
Written by Arthur C. Clarke in 1976, Imperial Earth is set in faraway 2276.
As the beautiful old car cruised in almost perfect silence under the guidance of its automatic controls, Duncan tried to see something of the terrain through which he was passing. The spaceport was fifty kilometers from the city—no one had yet invented a noiseless rocket—and the four-lane highway bore a surprising amount of traffic. Duncan could count at least twenty vehicles of various types, and even though they were all moving in the same direction, the spectacle was somewhat alarming.
“I hope all those other cars are on automatic,” he said anxiously.
Washington looked a little shocked. “Of course,” he said. “It’s been a criminal offence for—oh, at least a hundred years—to drive manually on a public highway. Though we still have occasional psychopaths who kill themselves and other people.”
The future sounds fascinating, but I want my Google Driverless Car now.