X. Encoding



Introduction

Encoding of chromosomes is the first question to ask when starting to solve a problem with GA. Encoding depends on the problem heavily.

In this chapter some encodings will be introduced that have been already used with some success.





Binary Encoding

Binary encoding is the most common one, mainly because the first research of GA used this type of encoding and because of its relative simplicity.

In binary encoding, every chromosome is a string of bits - 0 or 1.

Chromosome A 101100101100101011100101
Chromosome B 111111100000110000011111

Example of chromosomes with binary encoding

Binary encoding gives many possible chromosomes even with a small number of alleles. On the other hand, this encoding is often not natural for many problems and sometimes corrections must be made after crossover and/or mutation.


Example of Problem: Knapsack problem
The problem: There are things with given value and size. The knapsack has given capacity. Select things to maximize the value of things in knapsack, but do not extend knapsack capacity.
Encoding: Each bit says, whether the corresponding thing is in knapsack.





Permutation Encoding

Permutation encoding can be used in ordering problems, such as travelling salesman problem or task ordering problem.

In permutation encoding, every chromosome is a string of numbers that represent a position in a sequence.

Chromosome A 1  5  3  2  6  4  7  9  8
Chromosome B 8  5  6  7  2  3  1  4  9

Example of chromosomes with permutation encoding

Permutation encoding is useful for ordering problems. For some types of crossover and mutation corrections must be made to leave the chromosome consistent (i.e. have real sequence in it) for some problems.


Example of Problem: Travelling salesman problem (TSP)
The problem: There are cities and given distances between them. Travelling salesman has to visit all of them, but he does not want to travel more than necessary. Find a sequence of cities with a minimal travelled distance.
Encoding: Chromosome describes the order of cities, in which the salesman will visit them.




Value Encoding

Direct value encoding can be used in problems where some more complicated values such as real numbers are used. Use of binary encoding for this type of problems would be difficult.

In the value encoding, every chromosome is a sequence of some values. Values can be anything connected to the problem, such as (real) numbers, chars or any objects.

Chromosome A

1.2324  5.3243  0.4556  2.3293  2.4545
Chromosome B

ABDJEIFJDHDIERJFDLDFLFEGT

Chromosome C

(back), (back), (right), (forward), (left)

Example of chromosomes with value encoding

Value encoding is a good choice for some special problems. However, for this encoding it is often necessary to develop some new crossover and mutation specific for the problem.


Example of Problem: Finding weights for a neural network
The problem: A neural network is given with defined architecture. Find weights between neurons in the neural network to get the desired output from the network.
Encoding: Real values in chromosomes represent weights in the neural network.




Tree Encoding

Tree encoding is used mainly for evolving programs or expressions, i.e. for genetic programming.

In the tree encoding every chromosome is a tree of some objects, such as functions or commands in programming language.

Chromosome A

Chromosome B

( +  x  ( /  5  y ) )

( do_until  step  wall )

Example of chromosomes with tree encoding

Tree encoding is useful for evolving programs or any other structures that can be encoded in trees. Programing language LISP is often used for this purpose, since programs in LISP are represented directly in the form of tree and can be easily parsed as a tree, so the crossover and mutation can be done relatively easily.


Example of Problem: Finding a function that would approximate given pairs of values
The problem: Input and output values are given. The task is to find a function that will give the best outputs (i.e. the closest to the wanted ones) for all inputs.
Encoding: Chromosome are functions represented in a tree.


           
(c) Marek Obitko, 1998