AN INTERNATIONAL CONFERENCE ON

GENETIC ALGORITHMS

AND THEIR APPLICATIONS

at

Carnegie-Mellon University

Pittsburgh, PA

Sponsored By

Texas Instruments, Inc.

U.S. Navy Center for Applied Research

in Artificial Intelligence

(NCARAI)

John J. Grefenstette

Editor

[pp 183-187]

Nichael Lynn Cramer

Texas Instruments Inc.

PO Box 226015, MS 238

Dallas, TX 75266

*
[Current address: GTE Internetworking
10 Moulton St, Cambridge MA, 02138
ncramer@bbn.com
]
*

**ABSTRACT**

An adaptive system for generating short sequential computer functions is described.
The created functions are written in the simple "number-string" language **JB**, and in **TB**, a
modified version of **JB** with a tree-like structure. These languages have the feature that
they can be used to represent well-formed, useful computer programs while still being
amenable to suitably defined genetic operators. The system isused to produce two-input,
single-output multiplication functions that are concise and well-defined. Future work,
dealing with extensions to more complicated functions and generalizations of the
techniques, is also discussed.

**INTRODUCTION**

The techniques of adaptive Genetic Algorithms [**GA**s]
^{[1]}
been shown to be
useful in many areas. Initially, these systems involved the adjusting of a fixed set of
parameters in order to optimize the performance of a given algorithm
^{[2]}
. Much work has
been done toward the goal of evolving the algorithms themselves, particularly in
Production System-like domains
^{[}
^{1(chap 8)}
^{,}
^{3}
^{,}
^{4}
^{]}
. This paper discusses work towards
developing a sequential programming language that is suitable for manipulation by **GA**s so
as to permit the adaptive generation of simple computer functions from low-level
computational primitives.

**FUNCTIONAL REPRESENTATION**

The scheme that we will follow is first to find a suitably powerful programming
language, and then encode the programs in this language in such a way as to make them
amenable to the standard Genetic Operators[**GO**s].

The basic language to be used is a variation of the algorithmic language **PL** having
the following operators:

```
(
```

**:INC** **VAR**) *;; add 1 to the variable VAR*

(**:ZERO** **VAR**) *;; set the variable VAR to 0*

(**:LOOP** **VAR** **STAT**) *;; perform the statement STAT VAR times*

(**:GOTO** **LAB**) *;; jump to the statement with label LAB*

Programs in **PL** consist of an arbitrary number of globally-scoped (positive) integer
variables and statements containing operators of the above forms. Two simple example **PL**
Programs are:

```
```

*;;Set variable V0 to have the value of* **V1**

**(:ZERO V0)**

**(:LOOP V1 (:INC V0))**

*;;Multiply* **V3** *by* **V4** *and store the result in* **V5**

**(:ZERO V5)**

**(:LOOP V3 (:LOOP V4 (:INC V5)))**

While **PL** can be shown to be Turing Equivalent
^{[5]}
we will be interested in the
language subset
**PL-{ :GOTO}**
. This language subset has two useful properties: first,
while it is not fully Turing Equivalent, it still comprises a powerful set of functions
(specifically, the set of primitive recursive functions)

`:GOTO`

}` `**:SET**

operator, which accepts two variables and sets the value of the first variable
equal to that of the second. (As can be seen in the examples above, this operation is trivially
definable in
`:GOTO`

}**:BLOCK**

operator that accepts two statements as
arguments and evaluates the two statements sequentially. (This is essentially just a
grouping operation that has no effect on the overall structure of the
language.)
Now, the encoded representation for our programs should have two characteristics:

(Goal 1) It should be amenable to the standard **GO**s.

(Goal 2) The representation should produce only well-formed programs, even
when subjected to the **GO**s. While some representations, e.g. character-strings, might be
well suited for the mechanisms of **GO**s, the random generation and/or altering of characters
is not likely to produce, say, a useful **FORTRAN** program. Consequently, it is strongly
desirable that the chosen representation be such that all such generated programs stay in the
space of syntactically correct programs. Not all such generated programs would be useful
(adapation would be expected to correct that); it is only important at this point that such
programs be well formed.

This paper will consider lists of integers as a representation for these programs
where the object the integer represents (variable, operator, etc.) is determined by the
integer's position in the list. Clearly such a representation satisfies Goal 1 above, the
standard **GO**s (Crossover, Mutation, Inversion) would be well defined on such a list. To
satisfy Goal 2, we need to define a decoding of an arbitrary list nto
a well-formed program.

**THE JB LANGUAGE**

A first attempt at such a decoding is the language **JB**. The list of integers is first
divided into statements of some length large enough for the longest statement size, (three in
the present case). Any integers left over at the end of this list are ignored. The first of these
statements is defined to be the Main Statement [**MS**] and the remaining *N _{as}* statements are
the Auxiliary Statements [

```
```

**( 0 4 2) -> (:BLOCK** *AS _{4}*

Here the symbols of the forms
` `

and
*V _{n}*

` `*AS*_{n}

represent, respectively, example
Variables and Auxiliary Statements.
This body of statements is embedded in an environment containing
```
```

body-variables (initialized to 0) and
*N _{bv}*

```
```*N*_{vtot}= (*N*_{iv} + *N*_{bv})

available variables can be
returned as ouput.
The function is entered by executing the **MS**, which, typically, will call on one or
more of the **AS**s. An example **JB** program would be:

```
```

**(0 0 1 3 5 8 1 3 2 1 4 3 4 5 9 9 2)**

This would be grouped into the following Statements:

```
```

**(0 0 1)** *;;main statement->* (**:BLOCK** *AS _{0}*

This is the same as the **PL** multiplication program above.

As can be seen, virtually (see below) any list (of sufficient length) of integers
chosen from the range
```
[0,
```

can be used to generate a well-formed *N _{rand}*-1]

```
```*N*_{rand} = *N*_{vtot} * *N*_{as} * *N*_{op}

(
A second, more serious problem is that while the mechanisms of the applications of
the **GO**s are very simple in the **JB** language, the semantic implications of their use are quite
complicated. Because of the structure of **JB**, semantic positioning of a integer-list element
is extremely sensitive to change. As a specific example, consider a large complicated
program beginning with a
```
```

statement in the top-level Main Statement. A single,
unfortunate, mutation converting this operator to a
**:BLOCK**
```
```

would destroy any useful features
of the program. Secondly, this strongly epistatic nature of **:SET**
**JB** seems incompatible with
Crossover, given Crossover's useful-feature-passing nature. A useful **JB** substructure
shifted one integer to the right will almost certainly contain none of its previously useful
properties .

**THE TB LANGUAGE**

In an effort to alleviate these problems, we consider a modified version of **JB**. This
language, called **TB**, takes advantage of the implicit tree-like nature
of **JB** programs.

**TB** is fundamentally the same as **JB** except that the Auxiliary Statements are no
longer used. Instead, when a **TB** statement is generated, either at its initial creation or as a
result of the application of a **GO** (defined below), any subsidiary statements that the
generated statement contains are recursively expanded at that time. The TB programs no
longer have the simple list structure of **JB**, but instead are tree-like. Because we are simply
recursively expanding the internal statements without altering the actual structure of the
resulting program, the **TB** programs still satisfy Goal 2. Indeed, it can be seen that,
because of its tree-like structure, **TB** does not suffer from the problem of internal loops
described above. Thus, all possible program trees do indeed describe syntactically correct
programs.

An example of a TB program is:

```
```

**(0 (3 5) (1 3 (1 4 (4 5) ) ) )**

This expands to the same **PL** and **JB** multiplication
programs given above.

The standard **GO**s are defined in the following way:

Random Mutation could be defined to be the random altering of integers in the
program tree. This would be valid but would encounter the same "catastrophic minor
change" problems as did **JB**. Instead, Random Mutation is restricted to the statements near
the fringe of the program tree. Specifically: 1) to leaf statements, i.e., those that contain
operators that do not themselves require statements as arguments
```
```

And 2) to non-leaf statements (with operators
**(:INC, :SET, :ZERO).**
```
```

) whose
sub-statement arguments are themselves leaf operators. Inside a statement, mutation of a
variable simply means randomly changing the integer representing that variable.Mutating an
operator involves randomly changing the integer representing the operator and making any
necessary changes to its arguments, keeping any of the integers as arguments that are still
appropriate, and recursively expanding the subsidiary statements as
necessary.
**:BLOCK, :LOOP**

Similarly, following Smith
^{[6]}
, we restrict the points at which Crossover can occur.
Specifically, Crossover on **TB** is defined to be the exchange of subtrees between two
parent programs; this is well-defined and clearly embodies the intuitive notion of Crossover
as the exchange of (possibly useful) substructures. This method is also without the
problems that Crossover entails in **JB**. In a similar manner, we could define Inversion to
be the exchange of one or more subtrees within a given program.

**EXAMPLE**

As a concrete example, an attempt was made to "evolve" concise,
two-input, one-
output multiplication functions from a population of randomly generated functions. As
^{[3(chap 5)]}
discussed by Smith
a major problem here is one of "hand-crafting" the evaluation
function to give partial credit to functions that, in some sense, exhibit multiplication-like
behavior, without actually doing multiplication.

After much experimentation, the following scheme for giving an evaluation score
was used. For a given program body to be scored, several instantiations of the function
were made, each having a different pair of input variables [**IV**s]. Each of these test
functions was given a number of pairs of input values and the values of all of the function's
variables were collected as output variables [**OV**s]. The resulting output values were
examined and compared against the various combinations of input values and **IV**s. The
following types of behavior were noted and each successive type given more credit:
1] **OV**s that had changed from their initial values. (Is there any activity in the function?)
2] Simple functional dependence of an **OV** on an **IV**. (Is the function noticing the input?)
3] The value of an **IV** is a factor of the value of an **OV**. (Are useful loop-like structures
developing?) 4] Multiplication. (Is an **OV** exactly the product of two
**IV**s.)

Furthermore rather than accept input and/or output in arbitrary variables, scores were given an extra weight if the input and/or output occurred in the specific target variables. To ensure that the functions remain reasonably short, functions beyond a certain length are penalized harshly. Finally, a limit is placed on the length of time a function is permitted to run; any function that has not halted within in this time is aborted.

A number of test runs were made for the system with a population size of fifty. These were compared against a set of control runs. The control runs were the same as the regular runs except that there was no partial credit given; all members of the population were given a low, nominal score until they actually started multiplying correctly. All runs were halted at the thirtieth generation. The system produced the desired multiplication functions 72% more often than the control sample.

**FUTURE WORK**

Finally, a number of questions remain concerning the present system and its
various extensions:

Extensions of the present system: Generation of other types of simple arithmetic
operations seem to be the next step in this direction. Given the looping nature of the
underlying **PL** language it seems obvious that the system should be well suited for also
generating addition functions. However, it is less clear that it would do equally well
attempting to generate, e.g., subtraction or division functions, to say nothing of more
complicated mathematical functions. Indeed, the results of the control case above show that
it is difficult not to produce multiplication in this language; generation of other types of
functions would prove an interesting result. On the other hand are there other, comparably
simple, languages that are better suited to other types of functions?

Concerning Extensions of the Language: A useful feature of the original **JB**
language is its suitability for the mechanisms of the **GO**s. Can some further modification
be made to the current **TB** language to bring it back into line with a more traditional bit-
string representation? Are these modifications, in fact, really desirable? Alternatively,
would it be useful to modify the languages to make **GO**s less standard? For example,
would it be productive to formalize the subroutine swapping nature of the present method
of Crossover and define a program as a structure comprising a number of subroutines,
where the application Crossover and Inversion was restricted to the swapping of entire
subroutines, and Random Mutation restricted to occurring inside the body of a subroutine?

**ACKNOWLEDGMENTS**

I would like to thank Dr. Dave Davis for innumerable valuable discussions and Dr.
Bruce Anderson for preserving the environment that made this work possible.

**[2]**
Bethke, A., *Genetic Algorithms as Function Optimizers*, Ph.D. Thesis, University of
Michigan, 1980.

**[3]**
Smith, S.F., *A Learning System Based on Genetic Adaptive Algorithms*, Ph.D. Thesis,
Univ. of Pittsburgh, December, 1980.

**[4]**
Holland, J.H. and J. Reitman, *Cognitive Systems Based on Adaptive Algorithms*, in
**Pattern Directed Inference Systems**, Waterman and Hayes-Roth, Ed. Academic
Press, 1978.

**[5]**
Brainerd, W.S. and Landweber L.H., **Theory of Computation**, Wiley-Interscience,
1974.

**[6]**
Smith, S.F., *Flexible Learning of Problem Solving Heuristics through Adaptive Search*,
Proc. IJCAI-83, 1983.

Return to Nichael Cramer's HomePage-- Publications work: ncramer@bbn.com

home: nichael@sover.net