A recent article called Genetic Algorithm for Hello World describes the basic concepts involved in programming genetic algorithms. The original implementation is in Javascript, and has a nice online simulator. I thought it would be fun to contribute an implementation in Factor.
Genetic algorithms are generally comprised of the following concepts:
- A target chromosome which expresses a possible solution to the problem
- A fitness function which takes a chromosome as input and returns a higher value for better solutions
- A population which is just a set of many chromosomes
- A selection method which determines how parents are selected for breeding from the population
- A crossover operation which determines how parents combine to produce offspring
- A mutation operation which determines how random deviations manifest themselves
First, we need to define the vocabularies that we will use and a namespace:
USING: fry kernel make math math.order math.ranges random sequences ; IN: hello-ga
Target
Our goal is to generate the target string "Hello World!
".
CONSTANT: TARGET "Hello World!"
Fitness
The fitness of a chromosome is the "sum of the character-wise differences between the chromosome and the target string".
: fitness ( chromosome -- n ) TARGET 0 [ - abs - ] 2reduce ;
Population
Our starting population will be made up by "creating 400 totally random 12 character strings".
CONSTANT: POPULATION 400 : random-chromosome ( -- chromosome ) TARGET length [ 256 random ] "" replicate-as ; : random-population ( -- seq ) POPULATION [ random-chromosome ] replicate ;
Selection
We select two parents to survive or breed into the next generation by "taking two members of the population at complete random and keep the fittest as the first parent, then do the same with another two members and keep the fittest as the other parent".
: fittest ( parent1 parent2 -- parent1' parent2' ) 2dup [ fitness ] bi@ > [ swap ] when ; : tournament ( seq -- parent ) dup [ random ] bi@ fittest nip ; : parents ( seq -- parent1 parent2 ) dup [ tournament ] bi@ ;
Crossover
After choosing two parents, we want to "ensure their genes survive in the next generation". Sometimes (10% chance) the parents survive into the next generation, but most frequently, we mix the parents by picking a split point and then making one child from the head of parent1 plus tail of parent2 and another from the head of parent2 plus tail of parent1.
CONSTANT: CHILDREN-PROBABILITY 0.9 : children? ( -- ? ) 0.0 1.0 uniform-random-float CHILDREN-PROBABILITY < ; : head/tail ( seq1 seq2 n -- head1 tail2 ) [ head ] [ tail ] bi-curry bi* ; : tail/head ( seq1 seq2 n -- tail1 head2 ) [ tail ] [ head ] bi-curry bi* ; : children ( parent1 parent2 -- child1 child2 ) TARGET length 1 - [1,b) random [ head/tail append ] [ tail/head prepend ] 3bi ;
Mutation
When a mutation occurs (20% chance), we "choose a random position and alter the character that's there by up to 5 places".
CONSTANT: MUTATION-PROBABILITY 0.2 : mutation? ( -- ? ) 0.0 1.0 uniform-random-float MUTATION-PROBABILITY < ; : mutate ( chromosome -- chromosome' ) dup length random over [ -5 5 [a,b] random + ] change-nth ;
Simulation
Perform a single generation by selecting the parents (or children) and allowing mutations to occur. We do this in such a way that the number of chromosomes in each generation remains constant.
: (1generation) ( seq -- child1 child2 ) parents children? [ children ] when mutation? [ [ mutate ] bi@ ] when ; : 1generation ( seq -- seq' ) [ length 2 / ] keep '[ _ [ _ (1generation) , , ] times ] { } make ;
Computing all generations required to achieve the target string is fairly easy:
: finished? ( seq -- ? ) TARGET swap member? ; : all-generations ( seq -- seqs ) [ [ 1generation dup , dup finished? not ] loop drop ] { } make ;
Try It
Compute all generations for a random population.
( scratchpad ) random-population all-generations
See how many generations it took.
( scratchpad ) dup length . 56
See how the fitness of the best chromosome changes over the generations.
( scratchpad ) dup [ [ fitness ] [ max ] map-reduce ] map . { -414 -382 -329 -288 -271 -238 -217 -191 -169 -167 -160 -143 -134 -113 -119 -94 -83 -79 -63 -59 -61 -54 -50 -43 -39 -38 -36 -31 -32 -32 -28 -27 -22 -18 -15 -15 -14 -13 -13 -11 -11 -10 -8 -7 -7 -7 -6 -6 -4 -4 -3 -3 -2 -2 -2 0 }
The code is available on my Github.
3 comments:
It's funny, I played with genetic algorithm in factor a while back and did pretty much the same thing. I used the combinators.random vocabulary to simplify all the probabilities :
http://paste.factorcode.org/paste?id=2097
Great suggestion! I forgot about combinators.random. That has some really useful things in it!
Post a Comment