Tuesday, 27 March 2007

Rapidly Exploring Random Trees

So you have a map, then what?

Although maps are pretty in themselves and are proof of a certain level of awareness, there comes a point when you might actually want your robot to do go somewhere, and eventually do something!

There are many ways to plan a path, and some of the best results often come from the simplest approaches like 'try a straight line', or A* search, but sometimes this isn't enough to solve the problem at hand. Typical problems like a car stuck in a dead end, or a 'beetle trap' are problematic for greedy approaches, and sometimes your search space is too large to explore exhaustively.
Here, I'll look at one approach called RRT which searches the space very quickly and can incorporate the motion constaints of the robot.

The simplest form is a Sharp tree.

At each step:
  • A random point is chosen in free space.
  • Find the closest existing point.
  • If the two can be connected, then connect.
  • Otherwise, move the new point to furthest safe place and connect.
  • Test for solution.

The advantage of this type of search is that it quickly explores free space without searching everywhere and progressively increases its coverage. Aditional contraints can be added to keep the minimum distance between points above a given resolution, and make the definition of free space include knowledge about the size of the robot.

The downside is that the result is likely to be far from optimal. To get round this, it can be iterated a few times to see if a shorter path can be found, and then smoothed by looking for shorter lines or curves between points in the path.

However many robots can't turn sharply, or prefer smoother paths. The random approach can be adapted to this by using a slightly different approach:

  • Choose a random existing point in the tree.
  • Choose a random control. (e.g. max speed, gentle left, for 1 second)
  • If the path is safe, add this to the tree.
  • Otherwise add the longest safe path.
  • Test for solution

This method can eaily be adapted to any control space, and subtle optimisations can be made such as only allowing a certain granularity of control, and keeping the tree points a certain distance apart.

The great thing about this method is that you know that the path is possible in terms of the controls of your robot. Still, the path is far from optimal, and may well need iterating, and smoothing of the path and acceleration.

By tweaking the control space to allow reverse, it can solve three point turns and reversing out of tight corridoors.

Again, using reverse frequently is not optimal, so it would be best to use to imply some cost to reversing, and choose a less expensive path if available.

One optimisation that seems to work well, is to grow two trees at the same time, one from your current location and another from the destination.

This often succeeds at finding a solution in many fewer iterations and can help avoid the search getting stuck in beetle traps when the search resolution is too low and points near the exit of the trap are overlooked.

The same technique has been used to solve much more complex problems, where brute force is far too expensive to try.

Related links:

Animations of RRT


Steven LaValle's free book

Thursday, 22 March 2007

Grid SLAM animation

This is an animation of the best aggregate map of 200 particles doing grid SLAM. It wiggles when another particle becomes the best aggregate particle. The map represents 10cm per pixel with a robot motion of 30cm per second. It is just dumbly wandering around, using a Vector Field Histogram to avoid walls - no purposeful entropy reduction.
It is using 24 simulated sonars modeled as fairly short range (200cm), narrow beam, medium motion noise and a beam probability model whose results are trusted to the power of 0.1 (i.e. not much, so as to encourage a smooth distribution without heavy particle duplication upon resampling - typically only 5%).
When the particles are moving into unknown territory, they diverge. When they see parts of the map a second time, they converge. As all of the particles' poses are overlaid on the current best particle's map, it can look as if they go through walls - if you were to see their own map, you would see that they think they are in the middle of the corridoor, and that the corridoor is in a different place.
You can see that the particle diversity is only just enough to close the loops.

Wednesday, 21 March 2007


Lately I've been playing with a c# implementation of GridSLAM, with an aim of getting a system that could work with poor sonar data. In simulation this is starting to get good results when the model parameters are tweaked just so.

The basic idea of SLAM (Simultaneous Localization and Mapping) is, as the name implies, to make a map while guesstimating where you are in the map. The output you hope for is a good map and a good estimate of where you are (x,y,theta). The inputs you get are noisy sonar data (range, angle), and poor odometry information (bad x,y,theta from dead reckoning using wheel encoder ticks).

If you were to try to make a map using just the raw data you might get something like this:

In GridSLAM, you try to model the uncertainty of your odometry information and the uncertainty of ranging data and together make an estimate of your state (x,y,theta) using the correspondance between new data and your existing map.

The major tool is called a particle filter: Rather than model just the best guess, you model hundreds of possible states, each with their own map. Each step, each particle is moved according to odometry data plus added gaussian noise, supposedly sufficient to cover the expected range of error in the odometry data.

The image above left shows 1000 particles moving to the right, with a small clockwise rotation. You will notice that the bulk of particles stay near the middle of the distribution. This is due to an essential but dangerous step of the particle filter called re-sampling whose aim is to keep the distribution of particles closely related to the overall probability of being true. It does this by assigning a weight (0 to 1) to each particle based on the probability of the latest move - i.e. how close it was to perfect motion. It then picks particles to be promoted into the next generation with a probability related to the weight: less likely particles die, while more likely particles are duplicated to keep the total number of particles the same.

This is fine if there are enough particles to cover the space, but when there are too few particles, the resampling could favour particles that happened to move well in the last step, but aren't good representations of a longer term movement.

The image on the left shows what could happen when using only 30 particles. The resampling has killed the true distribution by duplicating the 'wrong' particles and ending up in what is called 'particle deprivation' where too many duplicates followed the wrong path and have lost coverage of the model they were trying to approximate.
Without re-sampling (left) the distribution would have been closer to the truth. Unfortunately because GridSLAM updates a map for each particle at each step, with limited memory and processor power, it is difficult to use thousands of particles. 100-200 is more typical, so great care has to applied in choosing when to resample.

Each time a particle is duplicated, part of the history of possible movements is destroyed with the lost particle. In a maze with multiple levels of loops, a small loop can exagerate confidence, loosing the particle diversity needed to later close the large loop.

Much recent research has been done on ways to avoid this problem. Some of which are:
  • Don't re-sample if the distribution is too focused
  • Compress the diversity of weights, so resampling is less vicious
  • Remember previous diversity and revert to it after a small loop

As an exmaple, consider this robot path. It starts at the top left (1), moves to the right and around the first loop (2). Just after (2) it starts to see places it has seen before, so the uncertainty about the movement in the small loop is reduced. In this case it closed the small loop pretty well, and coninues back across the top, past where it started (3) and down to the bottom left(4). From (2) to (3) it was reducing its error, then from (3) to (4) more error is creeping into the distribution in preparation for closing the larger loop at (5).

The particles are shown as a red mess near (5). The graph shows how the Root Mean Square error of the entire particle set away from perfect odometry changes over that time. Just before (5) it is high, and the current aggregate best estimate is somewhat mis-aligned, but recoverable.

A little later, it managed to recover to a decent map, but only because all the model parameters had been tweaked to make this possible. If there was more motion error, less particles or more trust of sensor data, it would have failed spectacularly.

Many of the published papers work well due to massive overlap between the current sensor data and existing map. By using long range laser with 180 data points per scan a fairly accurate match is possible that removes much of the rotational uncertainty. Poorer systems with just a few highly noisy wide-beam, short-range sonars are far less likely to be good at closing large loops. I'm hoping that inserting rotation estimates from image tracking and a digital compass will help aleviate some of these problems.

My next step is to port this code from a c# test app into an MSRS service that it can be evaluated with real world data.

Image Tracking

The last few weeks I've been playing with Image tracking algorythms. After searching around a bit I decided to start with Edard Rosten's FAST Corner Detector (PDF).

It seems to produce good results and has been used by Andrew Davison to do real time monocular SLAM (i.e. one web cam). Although Bob Motram has already done a great c# port of Davison's code (MonoSLAM), it relies on seeing a dark rectangle of known size before starting, which didn't seem appropriate for my usage on a robot and I wanted to get to know the algorythms at first hand.

Finding corners is pretty easy, the hard part is tracking them. The first phase is to find corners in the next frame that look similar, but this also generates a good deal of randomly placed matches.

The first method I attempted created a mean move from all the moving corners, then fine tunned by iteratively re-calculating a mean move by weighing moves based on the probability of being within a gaussian distribution of the last mean move. This has the effect of ignoring outliers a often converging on the true move. In retrospect, the initial mean should have been calculated according to weights based on how similar the corners looked, but it didn't seem to matter too much.

Although this worked OK-ish, unless the camera is on a perfect horizontal and all the objects in the scene are far away, the result of panning a camera really wants to be part of a sphere. My actual robot is likely to have a fixed camera, so I should have been content with only optimising the x displacement (rotation) and leaving it at that, but alas curiosity got the better of me, so I decided to try to look to higher dimensions. Bad move.

Really, there is no escaping it - full 3D is the only proper representation that can account for the movement in the pictures. Alas this isn't just an average or weighted average of pixel movement. It requires simultaneous gradient descent in all the dimensions of the model, and projection of corners into the real world. This involves some hard maths that I wasn't quite prepared for and am far from mastering...