A recent article called Genetic Algorithm for Hello World describes the basic concepts involved in programming genetic algorithms. The original implementation is in Javascript, and has a nice online simulator. I thought it would be fun to contribute an implementation in Factor.

Genetic algorithms are generally comprised of the following concepts:

- A
**target**chromosome which expresses a possible solution to the problem - A
**fitness**function which takes a chromosome as input and returns a higher value for better solutions - A
**population**which is just a set of many chromosomes - A
**selection**method which determines how parents are selected for breeding from the population - A
**crossover**operation which determines how parents combine to produce offspring - A
**mutation**operation which determines how random deviations manifest themselves

First, we need to define the vocabularies that we will use and a namespace:

USING: fry kernel make math math.order math.ranges random sequences ; IN: hello-ga

## Target

Our goal is to generate the target string "`Hello World!`

".

CONSTANT: TARGET "Hello World!"

## Fitness

The fitness of a chromosome is the "sum of the character-wise differences between the chromosome and the target string".

: fitness ( chromosome -- n ) TARGET 0 [ - abs - ] 2reduce ;

## Population

Our starting population will be made up by "creating 400 totally random 12 character strings".

CONSTANT: POPULATION 400 : random-chromosome ( -- chromosome ) TARGET length [ 256 random ] "" replicate-as ; : random-population ( -- seq ) POPULATION [ random-chromosome ] replicate ;

## Selection

We select two parents to survive or breed into the next generation by "taking two members of the population at complete random and keep the fittest as the first parent, then do the same with another two members and keep the fittest as the other parent".

: fittest ( parent1 parent2 -- parent1' parent2' ) 2dup [ fitness ] bi@ > [ swap ] when ; : tournament ( seq -- parent ) dup [ random ] bi@ fittest nip ; : parents ( seq -- parent1 parent2 ) dup [ tournament ] bi@ ;

## Crossover

After choosing two parents, we want to "ensure their genes survive in the next generation". Sometimes (10% chance) the parents survive into the next generation, but most frequently, we mix the parents by picking a split point and then making one child from the head of parent1 plus tail of parent2 and another from the head of parent2 plus tail of parent1.

CONSTANT: CHILDREN-PROBABILITY 0.9 : children? ( -- ? ) 0.0 1.0 uniform-random-float CHILDREN-PROBABILITY < ; : head/tail ( seq1 seq2 n -- head1 tail2 ) [ head ] [ tail ] bi-curry bi* ; : tail/head ( seq1 seq2 n -- tail1 head2 ) [ tail ] [ head ] bi-curry bi* ; : children ( parent1 parent2 -- child1 child2 ) TARGET length 1 - [1,b) random [ head/tail append ] [ tail/head prepend ] 3bi ;

## Mutation

When a mutation occurs (20% chance), we "choose a random position and alter the character that's there by up to 5 places".

CONSTANT: MUTATION-PROBABILITY 0.2 : mutation? ( -- ? ) 0.0 1.0 uniform-random-float MUTATION-PROBABILITY < ; : mutate ( chromosome -- chromosome' ) dup length random over [ -5 5 [a,b] random + ] change-nth ;

## Simulation

Perform a single generation by selecting the parents (or children) and allowing mutations to occur. We do this in such a way that the number of chromosomes in each generation remains constant.

: (1generation) ( seq -- child1 child2 ) parents children? [ children ] when mutation? [ [ mutate ] bi@ ] when ; : 1generation ( seq -- seq' ) [ length 2 / ] keep '[ _ [ _ (1generation) , , ] times ] { } make ;

Computing all generations required to achieve the target string is fairly easy:

: finished? ( seq -- ? ) TARGET swap member? ; : all-generations ( seq -- seqs ) [ [ 1generation dup , dup finished? not ] loop drop ] { } make ;

## Try It

Compute all generations for a random population.

( scratchpad ) random-population all-generations

See how many generations it took.

( scratchpad ) dup length . 56

See how the fitness of the best chromosome changes over the generations.

( scratchpad ) dup [ [ fitness ] [ max ] map-reduce ] map . { -414 -382 -329 -288 -271 -238 -217 -191 -169 -167 -160 -143 -134 -113 -119 -94 -83 -79 -63 -59 -61 -54 -50 -43 -39 -38 -36 -31 -32 -32 -28 -27 -22 -18 -15 -15 -14 -13 -13 -11 -11 -10 -8 -7 -7 -7 -6 -6 -4 -4 -3 -3 -2 -2 -2 0 }

The code is available on my Github.

## 3 comments:

It's funny, I played with genetic algorithm in factor a while back and did pretty much the same thing. I used the combinators.random vocabulary to simplify all the probabilities :

http://paste.factorcode.org/paste?id=2097

Great suggestion! I forgot about combinators.random. That has some really useful things in it!

Post a Comment