A Localized Path-Finding Genetic Algorithm

Posted August 17, 2013 by Ian Kilgore

Localized Path-Finding

Below I implement a genetic algorithm which attempts the following problem:

Find a model which can, as an autonomous agent, traverse the lowest-cost path across some weighted map using only local information.

If this agent had prior knowledge of the entire map, determining the best path across it would be an objective (and well-studied) matter. In the following I attempt to train an agent to solve this problem with limited (local) information. These agents make decisions based only on four numbers: the relative heights of the terrain one "move" away in each of four directions (i.e., they can see no further than they can move).

On a uniformly-weighted space the lowest-cost path is simply the shortest, but here cost may be a proxy for difficult or rough terrain, or an abstraction. Below I will discuss the formulation and implications of the cost function used.

Terrain and Sample Problem

The figure above shows one possible problem space. The grey levels in the image can be understood as a height map, with lighter pixels representing higher altitudes. The red circle is a starting position, and an agent must attempt to cross the vertical green line on the right border of the image. Shown in yellow and blue are two possible paths from source to destination.

A successful agent should strike a balance between the shortest path and one that involves the least amount of steep climbs or descents. It should attempt to follow altitude contours when possible, but occasionally climb (or descend) a hill for a significant shortcut.

Readers uninterested in the details of implementation may wish to skip to results below.

Cost Function

It is intuitive that the yellow, longer path is a worse solution than the blue, straightforward one. This notion can be made objective by computing a cost equal to the length of the path. I also discretize the problem: an agent moves only in discrete steps of length $\epsilon$, where $\epsilon$ is measured in pixels. The cost is then proportional (or equal) to $N$, the number of steps required to complete the path:

\begin{equation} C = N \end{equation}

The shortest path (a straight line) is now always the best path, so this cost function must be augmented to take changes in altitude into account. The cost calculation becomes:

\begin{equation} C = \sum\limits_{n=1}^N g_n^2 + \lambda N \end{equation}

where

\begin{equation} g_n = \begin{cases} \Delta_n, & \text{if }\Delta_n \geq 0\\ -\frac{\Delta_n}{2}, & \text{if }\Delta_n < 0 \end{cases} \end{equation}

for each step $n$ along the path, where $\Delta_n$ is the change in height at that step. This cost is a function of both path length and the amount of climbing and descending required to complete the path. Note that the cost scales with $g_n^2$, so very steep climbs and descents are penalized more strongly than shallow ones. A descent costs less than a climb of the same magnitude, but still costs something (consider driving a vehicle down a steep, unpaved slope).

The parameter $\lambda$ controls the trade-off in cost between path length and altitude change: when $\lambda$ is very large, the path length component dominates and shorter paths will be selected at the cost of climbing. When $\lambda$ is very small, paths which minimize climbing will be favoured.

Finally, many agents never find a complete path. A (very large) fixed-cost penalty, $Q$, is added to their total costs.
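As a concrete illustration, the whole cost calculation fits in a short MATLAB function. This is a minimal sketch under the definitions above; the function and argument names are my own, not necessarily those in the actual code.

    function C = path_cost(delta, lambda, Q, reached_goal)
    % PATH_COST  Score a path from its per-step height changes.
    %   delta        - vector of height changes, one entry per step
    %   lambda       - trade-off between path length and climbing
    %   Q            - large fixed penalty for failing to reach the goal
    %   reached_goal - true if the agent crossed the finish line
    N = numel(delta);                        % path length, in steps
    g = delta;                               % climbs are penalized in full...
    g(delta < 0) = -delta(delta < 0) / 2;    % ...descents at half weight
    C = sum(g.^2) + lambda * N;              % altitude term plus length term
    if ~reached_goal
        C = C + Q;                           % incomplete paths pay the penalty
    end
    end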

Agent Model

Model Input

I model an agent as a function which, at each step, decides which direction to move based on information about the relative height of its immediate surroundings. The input at each step, $\mathbf{x}$, is a vector of length $d$ of (an approximation of) the directional derivatives around the agent's current position:

\begin{equation} x_i = \frac{h([x,y] + \epsilon\angle\theta_i)-h([x,y])}{\epsilon} \text{ for } i=1,...,d \end{equation}

Here $h(x,y)$ represents the height at position $(x,y)$, and the angles $\theta_i$ are spaced evenly around the unit circle. The vector $\mathbf{x}$ is thus the difference in height between the agent's current position and $d$ surrounding locations $\epsilon$ units away. For a differentiable surface the two components of the gradient of $h(x,y)$ would fully describe the local slope; for this discrete approximation I have chosen $d=4$ sampling directions.
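In MATLAB this input could be computed as follows. This is a minimal sketch, assuming the height map is stored as a matrix H and using interp2 to read heights at fractional pixel positions; the variable names here are mine.

    % Sketch: build the model input at position (px, py) on height map H,
    % using step length step_len (the epsilon above).
    d = 4;                                    % four sampling directions
    theta = 2*pi*(0:d-1)/d;                   % angles evenly spaced on the circle
    x = zeros(d, 1);
    h0 = interp2(H, px, py);                  % height at the current position
    for i = 1:d
        xi = px + step_len * cos(theta(i));   % probe one step away...
        yi = py + step_len * sin(theta(i));   % ...in direction theta(i)
        x(i) = (interp2(H, xi, yi) - h0) / step_len;
    end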

Model Parameters and Output

At each step the agent must choose a new direction in which to move. This is an angle, $\phi$, between $-\pi$ and $\pi$. The model becomes a function $A$ of $\mathbf{x}$, parameterized by some unknown vector $\mathbf{\alpha}$:

\begin{equation} \hat{\phi} = A_\mathbf{\alpha}(\mathbf{x}) \end{equation}

The core of this model, $\mathbf{\alpha}^\top\mathbf{x}$, is a linear combination of the inputs $\mathbf{x}$ weighted by the parameters $\mathbf{\alpha}$. The output is then "squished" with a sigmoid function and scaled to lie between $-\pi$ and $\pi$:

\begin{equation} A_\mathbf{\alpha}(\mathbf{x}) = 2\pi\left(\frac{1}{1+\exp(\mathbf{\alpha}^\top\mathbf{x})}-0.5\right) \end{equation}
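The full decision function is then one line of MATLAB. Again a sketch, with the function name my own:

    function phi = agent_decide(alpha, x)
    % AGENT_DECIDE  Map local slope information to a heading in (-pi, pi).
    %   alpha - 4x1 parameter vector (the agent's genes)
    %   x     - 4x1 vector of local directional derivatives
    phi = 2*pi * (1 / (1 + exp(alpha' * x)) - 0.5);
    end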

I show below that a model with only $d=4$ parameters is sufficient for reasonable performance as a path-finding agent. A genetic algorithm, below, is used to find near-optimal values for these four parameters.

Genetic Algorithm

A genetic algorithm attempts to solve an optimization problem by mimicking evolution. Genetic algorithms are well-studied, so I will describe only my implementation for this problem. I represent each individual as a list of floating-point "genes"; in this case, each individual has four genes, corresponding to the model parameters $\mathbf{\alpha}$. A sketch of one possible update loop appears below.
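The operator choices in this sketch (elitism, top-half selection, blend crossover, Gaussian mutation) are illustrative assumptions rather than a record of the actual implementation, and evaluate_agent, which would run an agent across the map and return its cost, is hypothetical.

    % Sketch of one possible evolutionary loop over this representation.
    pop_size = 50; n_genes = 4; n_gens = 100; sigma = 0.1;
    pop = randn(pop_size, n_genes);              % random initial population
    for gen = 1:n_gens
        cost = zeros(pop_size, 1);
        for k = 1:pop_size
            cost(k) = evaluate_agent(pop(k, :)); % run the agent, score its path
        end
        [~, order] = sort(cost);                 % rank by cost, best first
        elite = pop(order(1), :);                % remember the best individual
        parents = pop(order(1:pop_size/2), :);   % top half reproduces
        new_pop = zeros(pop_size, n_genes);
        new_pop(1, :) = elite;                   % elitism
        for k = 2:pop_size
            pair = parents(randi(pop_size/2, 1, 2), :);
            child = mean(pair, 1);               % blend crossover
            new_pop(k, :) = child + sigma * randn(1, n_genes);  % mutation
        end
        pop = new_pop;
    end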

Results

GA vs Control

In order to demonstrate that the genetic algorithm is productively solving this problem, I compare it to a control. The control algorithm keeps the best individual from the previous generation, as in the GA. Every other individual, however, is randomly generated. The following figures show that a genetic algorithm (left) outperforms a random search (right) of the parameter space. The cost function here has been tuned to discourage climbing in favour of longer, contour-following paths.
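In terms of the genetic-algorithm sketch earlier, the control replaces the entire breeding step with fresh random draws; again a sketch, reusing the hypothetical names from above:

    % Sketch of the control update: the elite individual survives, but
    % every other individual is drawn fresh at random, making this a
    % random search of the parameter space with memory of the best.
    new_pop = randn(pop_size, n_genes);    % fresh random individuals
    new_pop(1, :) = elite;                 % keep only the previous best
    pop = new_pop;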

Evolution of Best Path

The figures below show the best path at several points during evolution. The algorithm finds a rough global solution fairly quickly, then refines it around local difficulties. The first image shows the initial (random) best path, which does not reach the objective. After a few generations a reasonable solution has been found, but three problem areas (outlined in blue) remain. The next row of images focuses on the lower-right problem area: the path, while initially confused, straightens out over several generations.

Text output from this experiment is available. There it is possible to see how the percentage of the population which successfully completes a route grows as successful individuals are selected to reproduce.

Step cost and differentiability

Although the best path in the previous section eventually became somewhat smooth, its jagged nature can be improved upon. The jaggedness of that path can be attributed to two factors: the terrain is noisy and effectively non-differentiable, so the agent's local slope estimates are erratic, and the step cost $\lambda$ is too low to penalize meandering.

By smoothing the image (with a 20x20 boxcar filter) and increasing the step cost, we can find a much better route through the image, as the figures below show from left to right.
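The smoothing step is a one-liner in MATLAB; a sketch, assuming the height map H is a double-precision matrix:

    % 20x20 boxcar (moving-average) smoothing of the height map H.
    % 'same' keeps the output the same size as the input image.
    kernel = ones(20) / 400;               % 20x20 averaging kernel, sums to 1
    H_smooth = conv2(H, kernel, 'same');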

Generalization

A simple experiment shows that these agents have some limited capacity for generalization. An agent was trained on a Perlin-noise height map with an initial position near the centre of the image. It proceeded to its goal while attempting to maintain the same altitude throughout. The same agent was then run starting at the far left of the image (at a different altitude), and it again found a reasonable path to its goal, maintaining a relatively uniform altitude throughout.

This capacity for generalization has limits. In the following example, a conduit through the image forks: one fork leads through, and the other does not. Because these agents use only local knowledge, they cannot determine a priori which fork is best (even a truly intelligent agent could not, given only the information at hand).

Code

MATLAB code for this demo may be found on github: iank/erasmus.
