Forever Learning

Forever learning and helping machines do the same.

Archive for the ‘Javascript’ Category

Simulating Repeated Significance Testing

with 2 comments

My colleague Mats has an excellent piece on the topic of repeated significance testing on his blog.

To demonstrate how much [repeated significance testing] matters, I’ve ran a simulation of how much impact you should expect repeat testing errors to have on your success rate.

The simulation simply runs a series of A/A conversion experiments (e.g. there is no difference in conversion between two variants being compared) and shows how many experiments ended with a significant difference, as well as how many were ever significant somewhere along the course of the experiment. To correct for wild swings at the start of the experiment (when only a few visitors have been simulated) a cutoff point (minimum sample size) is defined before which no significance testing is performed.

Although the post includes a link to the Perl code used for the simulation, I figured that for many people downloading and tweaking a script would be too much of a hassle, so I’ve ported the simulation to a simple web-based implementation.

Repeated Significance Testing Simulation Screenshot

You can tweak the variables and run your own simulation in your browser here, or fork the code yourself on Github.

Written by Lukas Vermeer

August 23, 2013 at 15:47

Cornamified Intersite

leave a comment »

Written by Lukas Vermeer

November 16, 2011 at 08:13

Posted in Javascript, Meta

Connect Four and Minimax

with 4 comments

Computers are incredibly fast calculators. That makes them good at maths, but it does not make them smart. People that program computers have to invent clever ways to exploit that arithmetical ability to achieve intelligent behavior.

Using the force.

Often the simplest way to allow a computer to make the ‘right’ decision is to calculate all possible outcomes and select a path that leads to the best end result. When planning a route from A to B, we consider every possible sequence of turns for every intersection between A and B. This so-called brute-force approach does not require a lot of thinking, it requires a lot of calculating; and computers are pretty good at the latter, less so at the former.

Sometimes it is possible to trim certain paths of exploration because we are certain they will lead to sub-optimal results. For example, when a sequence of turns leads us back to an intersection we have visited before along this route. But even then, this method quickly results in long computation times as the number of possible outcomes grows exponentially with the number of possible decisions the computer has to consider.

This is one of the reasons why we need a super-computer to beat a chess grandmaster. There are a lot of possible moves to consider in a game of chess.


Another problem when programming computers to play a two-player zero-sum game like chess, is that half of the decisions in the game are made by the opponent. The computer can decide which moves to make, but it has no control over what the opposing player does in response.

Even worse, assuming they both want to win, the two players have completely diametric goals. When plotting a potential path through solution space, we have to consider that our opponent probably wants to move towards a different solution altogether. The other player is actively trying to stop us from getting to where we’re going.

Mini and Maxi.

One easy way to deal with the uncertainty of the opponent’s decisions when looking ahead is to assume the other player will always make the move that leads to the worst possible result for the computer. The opponent is not trying to win, he is trying to stop the computer from winning. When plotting a path (succession of possible moves) we alternate between making the best (maximum) move when it’s our turn and the worst (minimum) when the opponent is up to make a move.

A game of connect four

Connect Four

This approach is called minimax. It is not particularly clever, but it does the trick and is relatively easy to implement.

Connect Four.

Last year, a Microsoft SharePoint consultant I’d met at a customer site asked me to explain how to implement a computer opponent for a Connect Four game he was creating for Microsoft Windows Phone Mobile XP 7 Vista Home Edition (or whatever they call it nowadays). Instead of explaining how, I decided to build a simple example myself in JavaScript using minimax. I don’t think he ever finished his version of the game.

You can play my version here. Somewhat ironically, it is excruciatingly slow when using Microsoft Internet Explorer; so please consider using any other browser.

Update (June 29th 2012): The code for this example is available on Github.

Update (July 5th 2012): This is now available as a web service. Read this post.

Written by Lukas Vermeer

September 24, 2011 at 15:10

Snakes: Evolution

leave a comment »

After one of my previous posts about the snake project Casper had some questions.

Are they retaining their knowledge beyond death? Sometimes it doesn’t seem so. Also, you say they learn to keep themselves alive and how to get to the pellet, is the need to feed programmed (say, as an instinct) or are they learning that feeding = more length? 🙂

These are good questions, and I understand the confusion. But it seems to me that Casper is looking at this the wrong way around. The snakes as a group can be said to learn to play the game, but individual snakes do not change their behavior during their lifetime. Individuals do not learn. Snakes die and new ones appear, but these are really new snakes; mutations and clones of the previous generation, not the same specimens.

The questions are understandable, but in light of the technique used they do not make any sense. Allow me to try and explain.

Evolution, baby!

Bobo the poodle did not learn how to look as silly as he (or she) does. His appearance is the result of generations of selective breeding. Likewise, the snakes do not learn to play the game; the program selects and breeds those that have more potential.

The type of algorithm I’ve used for this project is called a Genetic Algorithm. This technique is based on the Darwinian idea of evolution. A set of digital genes determine the behavior of each snake, its length at death defines its fitness and the next generation is based on the previous gene pool. Snakes that were long at death get more descendants. There is no knowledge to retain (e.g. everything depends on genes that are set from birth) and death is permanent (e.g. each snake lives only once).

Because length defines fitness the population will evolve towards snakes that eat more. There is no ‘need to feed’ or instinct pre-programmed (just as, presumably, poodles do not ‘want’ to look as daft as they do). The direction in which the food is located is simply one of the inputs to the decision each snake has to make. I’m not telling them how important that input is. Evolution takes care of that.

It’s all in the genes.

Each snake has a digital set of genes each represented by a number between minus and plus one hundred.

basic_need	= genes[1]; // starting score for each direction.
wall_need	= genes[2]; // added when the snake is adjacent to a wall.
snake_need	= genes[3]; // added when the snake is adjacent to another snake.
bitex_need	= genes[4]; // added when the pellet is up.
bitex2_need	= genes[5]; // added when the pellet is down.
bitey_need	= genes[6]; // added when the pellet is left.
bitey2_need	= genes[7]; // added when the pellet is right.
last_need	= genes[8]; // added when the snake moved this direction in previous step.

These values are combined with limited boolean knowledge of the current environment to assign a score to each direction (up, down, left and right). Each step, the snake will move in the direction with the highest score (above zero that is, the snakes will commit sepuku if all scores are negative).

Initially these values are assigned at random. This is why initially most snakes seem to wobble around without purpose until they starve from hunger, while others bang head-first into walls and some even disappear right away (this last group of ‘pessimists’ conclude life is pointless from the get-go and give up at their first move).

But eventually (by accident or chance or fate; or whatever you want to call it) a snake will get the pellet. This first feeder might not make it very far after this initial triumph, but the seeds of succes are sown. Now all the algorithm (or evolution) needs to do is nurture it to its full potential.

The dating game.

To ensure that the population is somewhat stable (mostly for esthetic reasons), the algorithm will introduce four new snakes as soon as four old ones have ended their game. To keep things simple I used a deterministic version of tournament selection and implemented only mutation without crossover.

function compare_and_breed(p) { // This code has been slightly altered for clarity.
	score = new Array;
	genes = new Array;
	var best = -1;
	var bestman = null;

	for (z=0;p[z]!=null;z++) {
		if (score[z]>=best) {
			best = score[z];
			bestman = z;


Of the four old snakes, the one who is deemed the most fit (the longest) will be selected for procreation. The next generation will consist of one clone of the champion and three mutations. In the mutated snakes, each gene value has a 1 in 8 chance of being raised or lowered slightly and the same odds of being completely randomized.

function mutateGenes(p) { // This code has been slightly altered for clarity.
	oldgenes = historyL[p]
	newgenes = new Array();

	for (i=1; i<=8; i++) {
		if (mutationInhibitor()) {newgenes[i] = mutateGene(oldgenes[i]);}
		else if (unstableMutationInhibitor()) {newgenes[i] = randomGene();}
		else {newgenes[i] = oldgenes[i];}

	return newgenes;

function mutationInhibitor()			{ return (Math.round(Math.random()*8) == 1); }
function unstableMutationInhibitor()	{ return (Math.round(Math.random()*8) == 1); }
function mutateGene(x)					{ return x + ((Math.round(Math.random()*10)/2) - 5); }
function randomGene()					{ return (Math.round(Math.random()*200)/2-50); }

And that is really all there is too it. The rest is just repetition of tweaking and testing; recombining and re-evaluating; mutation and measuring fitness.

Pay attention to that man behind the curtain.

In my previous post about this project I had the audacity of calling these snakes intelligent. Judging by the questions Casper and others raised they are certainly perceived that way. Now that you (hopefully) understand how this magic works, I wonder if you would still consider what you see to be intelligence. Perhaps lifting the curtain of uncertainty has broken the spell; revealing it to be a cheap trick, nothing but smoke and mirrors.

And if you think that the latter is the case, what does this imply for your own intellect?

You might not want to think about that for too long, because you might not like the answer.

[I am deeply sorry if I’ve offended any poodle fanatics with the examples used in this article. You have to admit, they do look rather silly. If evolution were ever to blame for causing something awful it would be those poor poodles.]

Update (april 2nd 2011): Code for this project is now available on Github.

Written by Lukas Vermeer

February 12, 2011 at 15:21

Snakes on a Two-dimensional Plane

with 2 comments

[This post is based around a very old pet project of mine. I thought it worthwhile to describe this golden oldie some more detail and relate it to my current work.]

Do you remember those old Nokia 3210 phones? Remember what an awesome game Snake used to be?

Ah, good times!

Back then, I was a freshman at Utrecht University, where I studied Computing Science with a minor in Artificial Intelligence. I must admit that I arguably wasn’t a very good student in those days. I was far to young, naively enthusiastic, impulsively inquisitive and stubborn to take any wisdom being presented by teachers at face value. I would call anything, everything and anyone into question. I must have missed many opportunities to stop, listen and learn from those around me.

I’ve grown older, bolder and balder, and perhaps even a little wiser. And some of the traits that made me a difficult student are the ones I value most in my work as consultant. Listening still requires some effort, but one is never to old to learn.

One recurring thesis posed in those first semesters seemed to be that Artificial Intelligence is immensely difficult to achieve. More than once, it was stressed in class that a lot of expert knowledge and hard work (and also hardware) was required to make a piece of software seem even remotely intelligent.

I didn’t agree.

I thought, and still think, that ‘intelligence’ is something that is always defined and evaluated within a domain of possibilities. While it may be extremely difficult to create a piece of software that makes decisions perceived to be intelligent in a large domain, this can be almost trivial if the domain is sufficiently small.

Detail of a screenshot of the Snake Project

Three Snakes Converging on the Target

Small, like the realm of a game of Snake.

In Snake, the world consists of a two-dimensional plane divided into a limited number of squares (or pixels). Some of these are occupied by the snake itself and only one is occupied by the target (or food pellet). At any given point in the game, a player has exactly four possible actions. Should he go op, down, left or right? When the snake steps on the target he grows in length and the target moves to a new location on the plane. When a snake steps off the plane (hits a wall) or on itself the game is over. Simple.

So, to prove my point (mostly to myself), I built a small website that ‘learns’ to play snake. Starting with a random set of priorities, the group of twenty snakes evolves (using a Genetic Algorithm) to the point where they can be seen to actively compete over the target. Their success seems only limited by the fact that there is only one such target and the snakes will starve when they do not capture it often enough. Also, as snakes become longer, pixels will be at a premium and the risk of collisions will increase.

Because the snakes learn from a random set of priorities, I do not know what the optimal priorities are (if such an optimum even exist). I also have no idea what decisions each snake is making every step during the game.

And I couldn’t care less. I care about the end result.

Of all the people I have shown this to over the years, not many have voiced any doubts whether these snakes are ‘intelligent’ or not. Most people seem convinced that learning is happening and that the snakes play with sufficient skill; they conclude that some form of intelligence must be involved.

Lesson learned.

In my work as a consultant for Oracle Real-Time Decisions (RTD) my experience with Snake proves to be a very valuable lesson. One of the key features of RTD is its ability to learn to predict consumer behavior. The use of relatively simple (when compared to other data mining techniques) Predictive Analytics methods in this confined domain enables intelligent personalization that is both technically and functionally scalable and relatively easy to implement. The process is in some ways quite similar to the way the snakes learn to play. It is the result that counts, not the complexity of the solution.

Implementing intelligence like this is not difficult, but does require that you relinquish some control and focus on what is really important.

Intelligence requires autonomy.

Update (april 2nd 2011): Code for this project is now available on Github.

Written by Lukas Vermeer

December 17, 2010 at 19:16

More Pi

with 2 comments

Hmmm. Pie.

Hmmm. Pie.

In a previous post I’ve explained how we can estimate the number Pi using Monte Carlo Methods (basically throwing darts at a circle). I personally think this method is quite elegant, but it is not particularly efficient. It might take a very, very, very long time for it to converge to anything close to the actual number. Still, pretty awesome.

This week I’ve compiled a simple, interactive and more visual example using the jQuery library and SVG graphics. You can find it here. Let me know what you think.

Update (april 2nd 2011): Code for this project is now available on Github.

Written by Lukas Vermeer

November 6, 2010 at 11:58

Benford’s Law Revised

leave a comment »

After posting my previous article I started wondering why my transactional data contained a disproportional amount of twos. Something must be distorting the natural spread of first digits that Benford predicts. Later, while discussing this aberration with a friend, I finally figured it out.

It’s me. I distort the data. In fact, I distort the data every time I draw cash from an ATM.

Like most people I have habits. One of them is that when I use an ATM I almost always withdraw twenty euros (I find cash a nuisance and prefer plastic, so I withdraw as little as I can manage). This, almost deterministic behavior, is the reason why there are a disproportionate amount of twos in the data. It seems my previous conclusion applies to the data in more ways than one.

Real-life data is obviously not random data, and when you think about it there are perfectly logical explanations for this result.

If I remove ATM transactions from the original dataset the graph looks different. Still not exactly matching Benford’s prediction, but certainly a lot closer.

Tally of the first digits of bank transaction amounts excluding ATM transactions

Tally of the first digits of bank transaction amounts excluding ATM transactions

Update (april 2nd 2011): Code for this project is now available on Github.

Written by Lukas Vermeer

June 16, 2010 at 21:23

Benford’s Law

with 2 comments

In a (not so) recent episode of Radio Lab (one of my favorite podcasts) I was introduced to Benford’s Law. I’d never heard of this phenomenon, which in hindsight is rather strange, because it’s effects are so profound. Let me explain.

Take a some numerical data from a real-life source, bank transaction amounts for instance, and tally the number of transactions whose amount starts with a one (i.e. whose first digit is a ‘1’ and not ‘2’ or ‘9’). What percentage of the transactions would you expect to match this criterion? What about the number of transactions whose amount starts with a three?

I’d never really though about this before. Implicitly I had always assumed that in large sets of data the distribution of first digits was equal amongst the numbers one through nine. This would result in the answer to the two questions above being “eleven percent” in both cases (zero is not considered, so there are nine possible first digits).

Benford’s Law predicts that I was wrong, and it turns out he is right.

The answers according to Benford’s Law are “probably about thirty percent” and “probably about twelve percent”. And using my own bank transaction amounts and some javascript gimmickry we can see that he is closer to the actual numbers than I was.

Benford's Law

Tally of the first digits of bank transaction amounts

Real-life data is obviously not random data, and when you think about it there are perfectly logical explanations for this result. Still, for me, this is something I had to see in order to believe.

I encourage you to try it out yourself on your own data. I’m curious to know what you find.

Update (april 2nd 2011): Code for this project is now available on Github.


Written by Lukas Vermeer

May 24, 2010 at 14:56

%d bloggers like this: