Forever Learning

Forever learning and helping machines do the same.

Archive for February 2011

Snakes: Evolution

leave a comment »

After one of my previous posts about the snake project Casper had some questions.

Are they retaining their knowledge beyond death? Sometimes it doesn’t seem so. Also, you say they learn to keep themselves alive and how to get to the pellet, is the need to feed programmed (say, as an instinct) or are they learning that feeding = more length? 🙂

These are good questions, and I understand the confusion. But it seems to me that Casper is looking at this the wrong way around. The snakes as a group can be said to learn to play the game, but individual snakes do not change their behavior during their lifetime. Individuals do not learn. Snakes die and new ones appear, but these are really new snakes; mutations and clones of the previous generation, not the same specimens.

The questions are understandable, but in light of the technique used they do not make any sense. Allow me to try and explain.

Evolution, baby!

Bobo the poodle did not learn how to look as silly as he (or she) does. His appearance is the result of generations of selective breeding. Likewise, the snakes do not learn to play the game; the program selects and breeds those that have more potential.

The type of algorithm I’ve used for this project is called a Genetic Algorithm. This technique is based on the Darwinian idea of evolution. A set of digital genes determine the behavior of each snake, its length at death defines its fitness and the next generation is based on the previous gene pool. Snakes that were long at death get more descendants. There is no knowledge to retain (e.g. everything depends on genes that are set from birth) and death is permanent (e.g. each snake lives only once).

Because length defines fitness the population will evolve towards snakes that eat more. There is no ‘need to feed’ or instinct pre-programmed (just as, presumably, poodles do not ‘want’ to look as daft as they do). The direction in which the food is located is simply one of the inputs to the decision each snake has to make. I’m not telling them how important that input is. Evolution takes care of that.

It’s all in the genes.

Each snake has a digital set of genes each represented by a number between minus and plus one hundred.

basic_need	= genes[1]; // starting score for each direction.
wall_need	= genes[2]; // added when the snake is adjacent to a wall.
snake_need	= genes[3]; // added when the snake is adjacent to another snake.
bitex_need	= genes[4]; // added when the pellet is up.
bitex2_need	= genes[5]; // added when the pellet is down.
bitey_need	= genes[6]; // added when the pellet is left.
bitey2_need	= genes[7]; // added when the pellet is right.
last_need	= genes[8]; // added when the snake moved this direction in previous step.

These values are combined with limited boolean knowledge of the current environment to assign a score to each direction (up, down, left and right). Each step, the snake will move in the direction with the highest score (above zero that is, the snakes will commit sepuku if all scores are negative).

Initially these values are assigned at random. This is why initially most snakes seem to wobble around without purpose until they starve from hunger, while others bang head-first into walls and some even disappear right away (this last group of ‘pessimists’ conclude life is pointless from the get-go and give up at their first move).

But eventually (by accident or chance or fate; or whatever you want to call it) a snake will get the pellet. This first feeder might not make it very far after this initial triumph, but the seeds of succes are sown. Now all the algorithm (or evolution) needs to do is nurture it to its full potential.

The dating game.

To ensure that the population is somewhat stable (mostly for esthetic reasons), the algorithm will introduce four new snakes as soon as four old ones have ended their game. To keep things simple I used a deterministic version of tournament selection and implemented only mutation without crossover.

function compare_and_breed(p) { // This code has been slightly altered for clarity.
	score = new Array;
	genes = new Array;
	var best = -1;
	var bestman = null;

	for (z=0;p[z]!=null;z++) {
		score[z]=getScore(p[z]);
		genes[z]=cloneGenes(p[z]);
		if (score[z]>=best) {
			best = score[z];
			bestman = z;
		}
	}

	AddSnake(genes[bestman]);
	AddSnake(mutateGenes(p[bestman]));
	AddSnake(mutateGenes(p[bestman]));
	AddSnake(mutateGenes(p[bestman]));
}

Of the four old snakes, the one who is deemed the most fit (the longest) will be selected for procreation. The next generation will consist of one clone of the champion and three mutations. In the mutated snakes, each gene value has a 1 in 8 chance of being raised or lowered slightly and the same odds of being completely randomized.

function mutateGenes(p) { // This code has been slightly altered for clarity.
	oldgenes = historyL[p]
	newgenes = new Array();

	for (i=1; i<=8; i++) {
		if (mutationInhibitor()) {newgenes[i] = mutateGene(oldgenes[i]);}
		else if (unstableMutationInhibitor()) {newgenes[i] = randomGene();}
		else {newgenes[i] = oldgenes[i];}
	}

	return newgenes;
}

function mutationInhibitor()			{ return (Math.round(Math.random()*8) == 1); }
function unstableMutationInhibitor()	{ return (Math.round(Math.random()*8) == 1); }
function mutateGene(x)					{ return x + ((Math.round(Math.random()*10)/2) - 5); }
function randomGene()					{ return (Math.round(Math.random()*200)/2-50); }

And that is really all there is too it. The rest is just repetition of tweaking and testing; recombining and re-evaluating; mutation and measuring fitness.

Pay attention to that man behind the curtain.

In my previous post about this project I had the audacity of calling these snakes intelligent. Judging by the questions Casper and others raised they are certainly perceived that way. Now that you (hopefully) understand how this magic works, I wonder if you would still consider what you see to be intelligence. Perhaps lifting the curtain of uncertainty has broken the spell; revealing it to be a cheap trick, nothing but smoke and mirrors.

And if you think that the latter is the case, what does this imply for your own intellect?

You might not want to think about that for too long, because you might not like the answer.

[I am deeply sorry if I’ve offended any poodle fanatics with the examples used in this article. You have to admit, they do look rather silly. If evolution were ever to blame for causing something awful it would be those poor poodles.]


Update (april 2nd 2011): Code for this project is now available on Github.

Advertisements

Written by Lukas Vermeer

February 12, 2011 at 15:21

Solving Some Sudoku Puzzles Faster (I Stand Corrected)

with 10 comments

Okay. So maybe I was jumping to conclusions a little early in the game.

After I posted my own modified version of Peter Norvig‘s Sudoku Solver, Peter and I had some brief discussion about my results and conclusions. I had only compared the performance of the two algorithms based on the ninety-five difficult puzzles provided by Peter, but he was not convinced. So I set out to make a more rigorous analysis of the relative strengths of our approaches using the method for generating random puzzles included in the original code. The results were a little unexpected, but not very surprising.

Fair game.

To make this a fair competition, I made some slight modifications.

Firstly, I had to ensure that both algorithms were solving the exact same random puzzles. Otherwise one algorithm might, simply by virtue of some serious bad luck, end up with more difficult puzzles. This is especially important since Peter’s experiments showed that long execution times (difficult puzzles) are rare but have an extreme impact; a few puzzles taking more than a minute to solve.

Secondly, I modified the code so that it uses the timeit module for measuring execution times. This allowed me to measure execution times more precisely.

More pudding.

Overnight, one million random puzzles were attempted by the two competing algorithms (999.925 were actually solved). Peter’s original algorithm required an average of 0.007 seconds per puzzle (142 puzzles per second) with a maximum execution time of 91.06 seconds. For the same puzzles, my modified version required an average of 0.008 seconds per puzzle (124 puzzles per second) with a maximum execution time of merely 5.75 seconds.

Both these averages are far lower than those for the ninety-five difficult puzzles attempted earlier, but the maxima are orders of magnitude higher. This seems to indicate that most of these random puzzles are easier to solve; save for a few extreme puzzles. This is in line with Peter’s earlier findings.

Sudoku algorithm mean execution times and standard deviations

Sudoku algorith execution times

The original algorithm solves these random puzzles about fourteen percent quicker on average, but the distribution of execution times is more spread out; as is evident from the much larger standard deviation and higher maximum.

Mea culpa. But not for maxima.

Based on these findings I have to admit I was wrong. My modified algorithm does not solve every Sudoku puzzle faster, as the title of my previous post implied. When solving many, randomly generated puzzles the original is slightly faster on average.

But when the task at hand involves more difficult Sudoku puzzles, or when you are operating under strict time constraints (i.e. you want more certainty about the maximum time required to solve a single puzzle), my money is still on the modified version. The exact puzzle that required the original algorithm 91.06 seconds to solve took the modified version a mere 0.0079870224 seconds. I’d say that is an improvement.

Longest execution times original algorithm

Longest execution times modified algorithm

There’s a tool for every job.

When we were discussing how Peter had generated the graphs he used in his article, I remarked (tongue-in-cheek) that copying and pasting execution times into a spreadsheet sounded a lot like brute-force manual labour to me. I think his response summarizes this whole exercise for me quite well.

Yes, but sometimes a little brute force is easier than figuring out how to do an elegant solution.

It seems to me now that sometimes a little brute force is not only easier, but that it can be faster than a more elegant solution as well. Many thanks to Peter for posing questions and pushing me to dig a little deeper.

Sir, I stand corrected!

Written by Lukas Vermeer

February 5, 2011 at 14:12

Posted in Mathematics

%d bloggers like this: