Forever Learning

Forever learning and helping machines do the same.

Predictive Analytics World London


In October I’ll speak at Predictive Analytics World in London. Once again, I’ll be talking about Data Science.

Special Featured Sessions at Predictive Analytics World London

You can register for the event on the site. Slides are already available online.

Written by Lukas Vermeer

September 9, 2013 at 17:24

Posted in Data Science, Meta


Simulating Repeated Significance Testing


My colleague Mats has an excellent piece on the topic of repeated significance testing on his blog.

To demonstrate how much [repeated significance testing] matters, I’ve ran a simulation of how much impact you should expect repeat testing errors to have on your success rate.

The simulation simply runs a series of A/A conversion experiments (e.g. there is no difference in conversion between two variants being compared) and shows how many experiments ended with a significant difference, as well as how many were ever significant somewhere along the course of the experiment. To correct for wild swings at the start of the experiment (when only a few visitors have been simulated) a cutoff point (minimum sample size) is defined before which no significance testing is performed.

Although the post includes a link to the Perl code used for the simulation, I figured that for many people downloading and tweaking a script would be too much of a hassle, so I’ve ported the simulation to a simple web-based implementation.

Repeated Significance Testing Simulation Screenshot

You can tweak the variables and run your own simulation in your browser here, or fork the code yourself on Github.
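For readers who would rather see the idea inline, here is a minimal Python sketch of the kind of A/A simulation described above. It is neither the original Perl script nor the web port. The two-proportion z-test, the 1.96 threshold and every parameter name and default (`rate`, `visitors`, `min_sample` and so on) are assumptions chosen for illustration.

```python
import math
import random


def z_score(conv_a, n_a, conv_b, n_b):
    """Two-proportion z statistic using a pooled standard error."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0  # no conversions (or only conversions) so far
    return (conv_a / n_a - conv_b / n_b) / se


def run_aa_experiment(rng, rate, visitors, min_sample, z_crit=1.96):
    """Simulate one A/A test; return (significant_at_end, ever_significant)."""
    conv_a = conv_b = 0
    significant = ever = False
    for n in range(1, visitors + 1):
        # Both variants share the same true conversion rate.
        conv_a += rng.random() < rate
        conv_b += rng.random() < rate
        if n < min_sample:
            continue  # cutoff: no testing before the minimum sample size
        significant = abs(z_score(conv_a, n, conv_b, n)) > z_crit
        ever = ever or significant
    return significant, ever


def simulate(experiments=200, rate=0.1, visitors=2000, min_sample=200, seed=1):
    """Count experiments significant at the end, and significant at any point."""
    rng = random.Random(seed)
    results = [
        run_aa_experiment(rng, rate, visitors, min_sample)
        for _ in range(experiments)
    ]
    return sum(r[0] for r in results), sum(r[1] for r in results)


if __name__ == "__main__":
    at_end, ever = simulate()
    print(f"significant at end: {at_end}, ever significant: {ever}")
```

Because an experiment that ends significant was also significant at least once, the second count can never be smaller than the first. The gap between the two is the cost of peeking.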

Written by Lukas Vermeer

August 23, 2013 at 15:47

Data Science: for Fun and for Profit


In the next few weeks I’ll be giving two talks on the topic of Data Science at Xebicon and another event affiliated with Xebia. There is an abstract of my spiel available on the Xebicon site.

Data Science is one of the most exciting developing fields in technology today. Ever expanding data sets and increasing computing power allow statisticians and computing scientists to explore new business opportunities that were simply not possible merely a few years ago. Although their applications are new, the ideas and techniques that form the underpinnings for this evidence-oriented discipline have a solid foundation in hundreds of years of scientific development. In order then to understand the new science of data, one must first understand the science of science.

The Scientific Method, the unintended effects of repeated significance testing and Simpson’s paradox: this talk will focus on the practical applications of the theoretical constructs that lie at the heart of Data Science; and expand on some potential pitfalls of statistical analysis that you are likely to encounter when venturing into the field.

If you’re interested, feel free to sign up for either event. I’ll also post slides and additional thoughts here afterwards.

Written by Lukas Vermeer

May 17, 2013 at 10:37

Posted in Data Science, Meta


A New Kind Of Science?


I am no longer a Corporate Ninja. As of a few weeks ago I can now call myself “Data Scientist at Booking.com”.

Although I am really excited about the new challenges and opportunities that await me in the sexiest job of the 21st century, I must say this new title bothers me ever so slightly. It somehow seems so redundant.

If you’re not using data, is it really science?

Written by Lukas Vermeer

May 2, 2013 at 17:20

Posted in Meta


Restaurant Reviews and the Availability Heuristic


You could say fine dining is a bit of a hobby of mine; and as I’ve mentioned before, I’ve composed quite a few restaurant reviews over the years. I enjoy writing about food almost as much as I love eating it.

Whilst fantasising about fancy food with a colleague the other day, we wondered whether there is any relation between the lengthiness of my reviews and the associated score. In some strange way it made intuitive sense to me that I would devote more words to describe why a particular restaurant did not live up to my expectations.

Thinking about this, the first negative review that came to my mind was one I wrote for “The Good View” in Chiang Mai, Thailand.

If service were any slower than it already is, cobwebs would certainly overrun the place. When food and drinks eventually do arrive they’re hardly worth the wait.

Fruit juice contained more sugar than a Banglamphu brothel and cocktails had less alcohol in them than a Buddhist monk. The mixed Northern specialties appetizer revealed itself to be three kinds of sausage and some raw chillies; very special indeed.

The spicy papaya salad probably tasted alright, but I was unable to tell because my taste buds were destroyed on the first bite. (Yes, I see the irony in complaining a spicy papaya salad was too spicy, but in my mind there’s a difference between spicy food and napalm.)

Also, the view is terribly overrated.

Conversely, the first positive review that popped into my brain was this rather terse piece for “Opium” in Utrecht, the Netherlands.

Om nom nom.

Judging by this tiny sample there might indeed be something to the hypothesis that review length and review score are negatively correlated. To confirm my hunch, I decided to load my reviews into R for a proper statistical analysis.

> cor.test(nn_reviews$char_count, nn_reviews$score)

Pearson's product-moment correlation
data: nn_reviews$char_count and nn_reviews$score
t = 0.2246, df = 121, p-value = 0.8227
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.1571892 0.1967366

sample estimates:
   cor
0.02041319

To my surprise, the analysis shows there is practically no relation between length and score. Contrary to what the two reviews above seem to suggest, I do not need more letters to describe an unpleasant dining experience than a pleasant one.
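The numbers in the R output above hang together: with 121 degrees of freedom there are 123 reviews, and the t statistic follows directly from the correlation. Here is a minimal Python sketch of that arithmetic. The function names are my own for illustration, and the review data itself is not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two samples of equal length, at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic and degrees of freedom for testing that the correlation is zero."""
    df = n - 2
    return r * math.sqrt(df / (1 - r * r)), df


if __name__ == "__main__":
    # Correlation taken from the R output above; n = df + 2 = 123.
    t, df = t_statistic(0.02041319, 123)
    print(f"t = {t:.4f}, df = {df}")
```

A t value this close to zero on 121 degrees of freedom is why the p-value is so large.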

A simple plot of the two variables gives some insight into a possible cause for my misconception.

Review scores vs review length

The outlier in the bottom right happens to represent my review for the Good View. All my other reviews are much shorter in length and seem to be quite evenly distributed over the different scores.

My misjudgement is an excellent example of the availability heuristic. The pair of examples that presented themselves to me upon initial reflection were not representative of the complete set, but that did not stop me from drawing overarching, and incorrect, conclusions based on a sample of two.

This is why I use statistics: because I am a fallible human being, just like everyone else.

Written by Lukas Vermeer

March 22, 2013 at 18:11

Evidence-Based Everything


I’m not really interested in an exposition of your facts. I don’t very much care to learn about your reasons.

First, show me your evidence.

Once we’ve established what you think you’ve seen, we can talk about what you think it means. Supported by sufficient proof, your theories and derived truths should be much easier to express; sometimes they may even be self-evident.

Then we can both decide what to believe.

Written by Lukas Vermeer

December 4, 2012 at 13:14

Posted in Meta, Psychology


A/B Testing XXL


[I've tweeted about this before.]

If fashion stores believed in A/B testing, they would probably only sell white XXL shirts. Most customers would fit tent-sized garments; most colours go well with white. Giant colourless shirts would presumably have the better sales conversion rate by far.

But of course this would be far from optimal.

Customers come in different shapes and sizes. If you really want to maximise conversion, you will have to tailor to their specific needs and personal preferences. A/B testing might be the latest fashion, but the truth is that some customers will have a taste for B even though the majority might fancy A. This is why these 20 lines of code will beat A/B testing every time.

The trick is not to figure out whether A is better than B, but when A is better than B; and for whom.
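One way to picture "when and for whom" is to keep separate conversion counts per customer segment and pick the best variant within each segment. The following is a rough Python sketch of that idea using an epsilon-greedy rule. It is not the code referred to above; the class name, the `segment` keys and the 10% exploration rate are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class SegmentedEpsilonGreedy:
    """Epsilon-greedy variant selection with separate counts per segment."""

    def __init__(self, variants, epsilon=0.1, seed=None):
        self.variants = list(variants)
        self.epsilon = epsilon  # fraction of traffic used for exploration
        self.rng = random.Random(seed)
        self.shows = defaultdict(int)  # (segment, variant) -> impressions
        self.wins = defaultdict(int)   # (segment, variant) -> conversions

    def rate(self, segment, variant):
        """Observed conversion rate, or 0.0 if the pair was never shown."""
        shows = self.shows[(segment, variant)]
        return self.wins[(segment, variant)] / shows if shows else 0.0

    def choose(self, segment):
        """Explore at random with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.variants)
        return max(self.variants, key=lambda v: self.rate(segment, v))

    def record(self, segment, variant, converted):
        """Store the outcome of showing a variant to a visitor in a segment."""
        self.shows[(segment, variant)] += 1
        self.wins[(segment, variant)] += int(converted)


if __name__ == "__main__":
    bandit = SegmentedEpsilonGreedy(["A", "B"], epsilon=0.1, seed=42)
    bandit.record("tall", "B", True)
    bandit.record("short", "A", True)
    print(bandit.choose("tall"), bandit.choose("short"))
```

With counts kept per segment, the majority's preference for A no longer drowns out the customers who would rather have B.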

Marketing should not be one-size-fits-all.

Written by Lukas Vermeer

November 30, 2012 at 18:05
