Mathias Brandewinder on .NET, F#, VSTO and Excel development, and quantitative analysis / machine learning.
by Mathias 11. December 2011 11:06

Yesterday, I made the first version of Bumblee public on Codeplex. Version 0.1 is an Alpha release, meaning that it’s usable, but still rough around the edges.

What is Bumblebee? It is an Artificial Bee Colony (ABC) algorithm, a randomized search method which mimics the behavior of bee hives. Given a search problem, the algorithm will dispatch Scout bees to look for new solutions, and Active bees to explore around known solutions. Once their search is complete, bees return to the Hive and share information with Inactive bees, and new searches are allocated based on the information available so far.

I have multiple goals with Bumblebee. I came across the algorithm for the first time in this article, which illustrates it on the Traveling Salesman Problem with a C# implementation. I enjoyed the article, but wondered if I could

  • parallelize the algorithm to use multiple cores,
  • provide a general API to run arbitrary problems.

… and I figured it would be a good sample project to sharpen my F# skills.

For the parallelization part, I decided to use the Task Parallel Library: the Hive creates Tasks for each bee beginning a search, which returns to a Queue to be processed and share the search result with the inactive bees in the Hive (see outline here).

Deciding on a design for the API took some back and forth, and will likely evolve in the next release. The API should:

  • accommodate any problem that can be solved by that approach,
  • be reasonably simple to use by default,
  • limit parallelization problems, in particular around random numbers generation,
  • be palatable for F# and C# users.

Regarding the first point, 3 things are needed to solve a problem using ABC: Scouts need to know how to find new random solutions, Active bees need to be able to find a random neighbor of a solution, and solutions need to be comparable. I figured this could be expressed via 3 functions, held together in a Problem class:

  • a generator, which given a RNG returns a new random solution,
  • a mutator, which given a RNG + solution tuple, returns a random neighbor of the solution,
  • an evaluator, which given a solution returns a float, measuring the quality of the solution.

You can see the algorithm in action in the TspDemo console application included in the source code.

I opted to have the Random Number Generator as an argument because the algorithm is responsible for spawning Tasks, and is therefore in a good position to provide a RNG that is safe to use, relieving the client of the responsibility of creating RNGs. I’ll probably rework this a bit, though, because I don’t like the direct dependency on Random; it is not very safe, and I would like to provide the ability to use whatever RNG users may happen to prefer.

The reason I chose to have the evaluator return a float (instead of using IComparable) is because I figured it might be interesting to have a measure which allowed the computation of rates of improvements in solution quality.

As for simplicity, I ended up with a main class Solver with 2 main methods. Search(Problem) initiates a search as a Task, and goes on forever until Stop( ) is called. The Solver exposes an event, FoundSolution, which fires every time an improvement is found, and returns a SolutionMessage, containing the solution, its quality, and the time of discovery. It is the responsibility of the user to decide when to stop the search, based on the information returned via the events.

By default, the Solver is configured with “reasonable” parameters, but if needed, they are exposed via properties, which can be modified before the search is initiated.

No effort has gone for the first release to make this C# friendly – this, and abstracting the RNG, are the goals of the next release.

I would love to get feedback, both on the overall design and the code itself! You can download the project from here.

by Mathias 5. September 2011 14:20

Back in April, inspired by an MSDN article, I began looking into converting a Simulated Bee Colony algorithms from C# to F#; I thought this would be an interesting exercise on my slow path to learning F#, and I got a rough implementation going. Attention-disorder deficit and life in general got me side-tracked, but, better late than never, I am now ready to get back to it.

One aspect I am interested in, is to figure out how the original algorithm could be modified for parallelization. In its original form, the algorithm followed a turn-based approach: given the current state of the hive, process sequentially every bee (active, inactive, and scout bees), and repeat until the pre-set number of iterations is reached.

How could we approach parallelization? At a high level, two operations are taking place:

  • some bees are searching from new solutions to the problem, either by creating brand-new solutions (Scout bees), or by finding a solution close to an initial Solution (Active bees),
  • bees that have completed a search come back to the Hive, and share the Solution they found to the Inactive bees via a Waggle Dance; Inactive bees can update their state based on that new information.

The first part, the Search, is easily parallelizable: while searching, a Bee shares no data with other bees. Therefore, multiple bees could be searching for new solutions, each on its own thread. On the other hand, the information sharing part is trickier: if multiple bees were to share their new solution with inactive bees at the same time, concurrency problems would likely arise.

One approach to resolve the issue is to avoid the concurrency problem, and make sure that by design, the information-sharing part is taking place sequentially, one bee at a time. Here is how we will achieve this:

ParallelBees

A queue will hold bees returning from a Search with a new Solution, and process bees from the queue one by one, passing their information to the current inactive bees. Once it has shared its information with the inactive bees, the bee (or one of the inactive bees) is sent to Search again in parallel – and when its search completes, it returns to the queue, where it goes back in line and wait until it can be processed.

More...

by Mathias 22. May 2011 14:26

In our previous installments, we laid the groundwork of our Bee Colony Algorithm implementation. Today, it’s time to put the bees to work, searching for an acceptable solution to the Traveling Salesman problem.

We will approach the search as a Sequence: starting from an initial hive and solution, we will unfold it, updating the state of the hive and the current best solution at each step. Let’s start with the hive initialization. Starting from an initial route, we need to create a pre-defined number of each Bee type, and provide them with an initial destination:

let Initialize nScouts nActives nInactives cities (rng : Random) =
   [    
      for i in 1 .. nScouts do 
         let solution = Evaluate(Shuffle rng cities)
         yield Scout(solution)
      for i in 1 .. nActives do
         let solution = Evaluate(Shuffle rng cities)
         yield Active(solution, 0)
      for i in 1 .. nActives do
         let solution = Evaluate(Shuffle rng cities)
         yield Inactive(solution)
   ]

There is probably a more elegant way to do this, but this is good enough: we use a simple List comprehension to generate a list on the fly, yielding the appropriate number of each type of bees, and assigning them a shuffled version of the starting route.

More...

by Mathias 15. May 2011 14:57

In the last post, we ended up defining Bees of 3 types – Scout, Active and Inactive – and a Search function which described how bees search among the space of possible solutions for new “food sources”. The second part of the algorithm deals with how bees returning to the hive share that information with their colleagues, and are promoted from inactive to actively searching.

Real bees share information with each other using what is known as the Waggle Dance, a pretty amazing process by which bees convey to other bees the direction to food sources. The algorithm is much less poetic: bees who return to the Hive with a new Solution will share that information with all bees that are already inactive, or will just become inactive because they have exhausted their current food source. Inactive bees, when presented with a new solution, may – with a certain probability - adopt it as their new target if it is better than their current target.

We'll represent this with a Waggle function, which will represent how a Bee will update its target Solution when presented with a list of potential new solutions.

First, we need to filter the bees who pay attention to the dance; we’ll use a simple Active Pattern:

let tripsLimit = 100

let (|RequiresUpdate|) bee =
   match bee with 
   | Scout(solution) -> false
   | Inactive(solution) -> true
   | Active(solution, trips) -> trips > tripsLimit

While a Scout never listens to the Waggle Dance, an Inactive bee always listens to incoming bees, and Active bees who have made multiple trips to the same destination without finding any improved solution will become Inactive, and listen to other bees.

We can now apply that pattern to decide what to do with a bee:

let Waggle (solutions : List<Solution>) (bee : Bee) (rng : Random) =
   match bee with 
   | RequiresUpdate true -> 
      let currentSolution = Solution bee
      let newSolution = List.fold (fun acc element -> 
         if element.Cost < acc.Cost && rng.NextDouble() < probaConvince 
         then element else acc) currentSolution solutions
      Inactive(newSolution)      
   | _ -> bee

For bees that require an update, we retrieve their current solution (using a simple function Solution I’ll leave out for the moment), and fold the list of new potential solutions: starting from the bees’ current solution, we examine whether each candidate is an improvement, and pass the new solution to the accumulator if it is selected.

More...

by Mathias 8. May 2011 19:43

This is our third episode in attempting to convert a C# bee colony algorithm into an F# equivalent. In our previous posts, we created functions to randomly shuffle lists of cities, and to measure the length of the corresponding path. Today, it’s time to get the bees to work, bringing us new solutions to the hive.

The algorithm distinguishes 3 types of bees: Scout, Active and Inactive. Each bee type has a different role in the algorithms: Scouts keep searching for new solutions, Active bees explore around known solutions for improvements until their potential is exhausted, and Inactive bees wait for new information, and replace Active bees when they turn Inactive.

Let’s start there, and define a Bee discriminated union:

type Bee = Scout | Active | Inactive

In the original C# implementation I am starting from, the algorithm works by iteration: each bee of the hive is processed and its state in the Hive updated (see “The Solve Method” in the article), with 3 steps: the bee

  • finds a new solution and evaluates its quality,
  • shares that information with the inactive bees of the Hive by performing a Waggle Dance,
  • becomes re-allocated as Active or Inactive for the next iteration.

Rather than follow strictly the existing implementation, where all three steps are happening in one single method for a bee, I decided to re-organize it a bit, and separate each of these operations, in part to make the code easier to follow, and in part with an eye to making it run in parallel later.

In that frame, let’s begin with the first step, where bee searches for a new solution. Every bee has a current solution in memory, and after searching, they will come up with a new target solution if it is an improvement. Let’s first define for convenience what a Solution will be:

type Solution = { Route: List<City>; Cost: float }
let Evaluate (route: List<City>) = { Route = route; Cost = CircuitCost route }

A Solution wraps in a Record the Route – the ordered list of Cities travelled – and its Cost, measured by its length. We also define a convenience function Evaluate, which takes in a Route and returns the corresponding solution, with the original Route and its Cost, computed using the function we wrote in our last post.

More...

Comments

Comment RSS