Mathias Brandewinder on .NET, F#, VSTO and Excel development, and quantitative analysis / machine learning.
15. February 2014 12:51

My favorite column in MSDN Magazine is Test Run; it was originally focused on testing, but the author, James McCaffrey, has been focusing lately on topics revolving around numeric optimization and machine learning, presenting a variety of methods and approaches. I quite enjoy his work, with one minor gripe –his examples are all coded in C#, which in my opinion is really too bad, because the algorithms would gain much clarity if written in F# instead.

Back in June 2013, he published a piece on Amoeba Method Optimization using C#. I hadn’t seen that approach before, and found it intriguing. I also found the C# code a bit too hairy for my feeble brain to follow, so I decided to rewrite it in F#.

In a nutshell, the Amoeba approach is a heuristic to find the minimum of a function. Its proper respectable name is the Nelder-Nead method. The reason it is also called the Amoeba method is because of the way the algorithm works: in its simple form, it starts from a triangle, the “Amoeba”; at each step, the Amoeba “probes” the value of 3 points in its neighborhood, and moves based on how much better the new points are. As a result, the triangle is iteratively updated, and behaves a bit like an Amoeba moving on a surface.

Before going into the actual details of the algorithm, here is how my final result looks like. You can find the entire code here on GitHub, with some usage examples in the Sample.fsx script file. Let’s demo the code in action: in a script file, we load the Amoeba code, and use the same function the article does, the Rosenbrock function. We transform the function a bit, so that it takes a Point (an alias for an Array of floats, essentially a vector) as an input, and pass it to the solve function, with the domain where we want to search, in that case, [ –10.0; 10.0 ] for both x and y:

#load "Amoeba.fs"

open Amoeba
open Amoeba.Solver

let g (x:float) y =
100. * pown (y - x * x) 2 + pown (1. - x) 2

let testFunction (x:Point) =
g x.[0] x.[1]

solve Default [| (-10.,10.); (-10.,10.) |] testFunction 1000

Running this in the F# interactive window should produce the following:

val it : Solution = (0.0, [|1.0; 1.0|])
>

The algorithm properly identified that the minimum is 0, for a value of x = 1.0 and y = 1.0. Note that results may vary: this is a heuristic, which starts with a random initial amoeba, so each run could produce slightly different results, and might at times epically fail.

More...

25. November 2012 09:10

I am still working my way through “Machine Learning in Action”, converting the samples from Python to F#. I am currently in the middle of chapter 6, dedicated to Support Vector Machines, which has given me more trouble than the previous ones. This post will be sharing my current progress: the code I have so far is a working translation of the naïve SVM implementation, presented in the first half of the chapter. We’ll get to kernels, and the full Platt SMO algorithm in a later post – today will be solely discussing the simple, un-optimized version.

Two factors slowed me down with this chapter: the math, and Python.

The math behind the algorithm is significantly more involved than the other algorithms, and I won’t even try to go into why it works. I recommend reading An Idiot’s guide to SVMs, which I found a pretty complete and accessible explanation of the theory behind SVMs. I will focus instead on the implementation, which was in itself a bit challenging.

First, the Python code uses algebra quite a bit, and I found that deciphering what was going on required a bit of patience. Take a line like the following:

fXi = (float)(multiply(alphas, labelMat).T*(dataMatrix*dataMatrix[i,:].T))+b

I am reasonably well versed in linear algebra, but figuring out what this is saying takes some attention. Granted, I have no experience with Python and NumPy, so my whining is probably a bit unfair. Still, I thought the code was not very readable, and it motivated me to see if that could be improved, and as a result I ended up moving away from heavy algebra notation.

Then, the algorithm is implemented as a Deep Arrow. A main loop performs computations and evaluates conditions at multiple points, using continue to exit / short-circuit the evaluation. The code I ended up with doesn’t use mutation, but is still heavily indented, which I am not happy about - I’ll work on that later.

## Simplified algorithm implementation

Note: as the title of the post indicates, this is work in progress. The current implementation works, but has some obvious flaws (see last paragraph), which I intend to fix in upcoming posts. My intent is to share my progression through the problem – please don’t take this as a good reference SVM implementation. Hopefully we’ll get there soon, but this is not it, not yet.

More...

27. June 2012 15:06

This months’ issue of MSDN Magazine has an interesting piece on evolutionary algorithms. The article applies a genetic algorithm to identify the minimum value of a “pathological” continuous function, the Schwefel function.

The Schwefel function

For X and Y values between –500 and 500, the “correct” answer is X and Y = 420.9687.

This function is designed to give fits to optimization algorithms. The issue here is that  the function has numerous peaks and valleys. As a result, if the search strategy is to use some form of gradient approach, that is, from a starting point, follow the steepest descent until a minimum is reached, there is a big risk to end up in a place which is a local minimum: the algorithm gets stuck in a valley with no path downwards, but there are other, better solutions around, which can be reached only by “climbing out of the hole” and temporarily following a path which heads in the wrong direction.

Out of curiosity, I checked how the Excel Solver would fare on this problem:

The result was an abject failure – not even close to the true solution.

I thought it would be interesting to see how Bumblebee, my Artificial Bee Colony framework, would perform. There are some general similarities between the underlying ideas behind the articles’ algorithm and Bumblebee, the main difference being that Bumblebee simply mutates individual solutions, and doesn’t create “crossover solutions”.

Let’s open Visual Studio, create an F# Console project, grab the NuGet package for Bumblebee – and start coding.

As usual, we need 4 elements to leverage Bumblebee – a Type of Solution, and 3 functions: a Generator, which returns a brand-new, random solution, a Mutator, which transforms a known solution into a new, similar solution, and an Evaluator, which evaluates a solution and returns a float, increasing with the quality of the solution.

In this case, the Solution type is fairly straightforward. We are looking for 2 floats x and y, so we’ll go for a Tuple. Similarly, the Evaluation is a given, it is the negative of the Schwefel function. The reason we go for the negative is that Bumblebee will try to maximize the Evaluation, so if we are looking for a minimum, we need to reverse the sign – because the Minimum of a function is the Maximum of its negative.

let schwefel x =
-x * Math.Sin(Math.Sqrt(Math.Abs(x)))

let evaluate (x, y) =
- schwefel (x) - schwefel (y)

The Generation function is also trivial – we’ll simply pick random pairs of floats in the [ –500.0; 500.0 ] interval:

let min = -500.0
let max = 500.0

let generate (rng: Random) = (
rng.NextDouble() * (max - min) + min,
rng.NextDouble() * (max - min) + min)

The Mutation function takes a tiny bit more of effort. The idea I followed is essentially the same as the one used in the article: given a solution defined by a pair of floats, randomly decide if any of the elements will be mutated, and if yes, add a random permutation to that element, scaled to the precision we want to achieve:

let precision = 0.00001
let rate = 0.5

let mutate (rng: Random) solution =
let (x, y) = solution
let x =
if rng.NextDouble() < rate
then x + precision * ((max - min) * rng.NextDouble() + min)
else x
let y =
if rng.NextDouble() < rate
then y + precision * ((max - min) * rng.NextDouble() + min)
else y
(x, y)

More...

1. April 2012 09:47

Last week’s StackOverflow newsletter contained a fun problem I had never seen before: Bipartite Matching. Here is the problem:

There are N starting points (purple) and N target points (green) in 2D. I want an algorithm that connects starting points to target points by a line segment (brown) without any of these segments intersecting (red) and while minimizing the cumulative length of all segments.

Image from the original post on StackOverflow

I figured it would be fun to try out Bumblebee, my artificial bee colony library, on the problem. As the accepted answer points out, the constraint that no segment should intersect is redundant, and we only need to worry about minimizing the cumulative length, because reducing the length implies removing intersections.

As usual with Bumblebee, I’ll go first with the dumbest thing that could work. The solution involves matching points from two lists, so we’ll define a record type for Point and represent a Solution as two (ordered) lists of points, packed in a Tuple:

type Point = { X: float; Y: float }
let points = 100
let firstList = [ for i in 0 .. points -> { X = (float)i ; Y = float(i) } ]
let secondList =  [ for i in 0 .. points -> { X = (float)i ; Y = float(i) } ]

let root = firstList, secondList
We’ll start with a silly problem, where the 2 lists are identical: the trivial solution here is to match each point with itself, resulting in a zero-length, which will be convenient to see how well the algorithm is doing and how far it is from the optimum.

How can we Evaluate the quality of a solution? We need to pair up the points of each of the lists, compute the distance of each pair, and sum them up – fairly straightforward:

let distance pair =
((fst pair).X - (snd pair).X) ** 2.0 + ((fst pair).Y - (snd pair).Y) ** 2.0

let evaluate = fun (solution: Point list * Point list) ->
List.zip (fst solution) (snd solution)
|> List.sumBy (fun p -> – distance p)

More...

6. January 2012 07:08

I just came across this interesting homework problem on StackOverflow:

Given a group of n items, each with a distinct value V(i), what is the best way to divide the items into 3 groups so the group with the highest value is minimized? Give the value of this largest group.

This looks like one of these combinatorial problems where a deterministic algorithm exists, and will be guaranteed to identify the optimal solution, with the small caveat that larger problem may take an extremely long time to resolve.

Note that as ccoakley points out in his comment to the StackOverflow question, there is no reason for a greedy approach to produce the optimal answer.

I figured this would be a good test for Bumblebee, my Artificial Bee Colony algorithm, which can produce a good solution in reasonable time, at the cost of not being guaranteed to find the actual optimum solution.

While we are at it, I figured we could also relax the constraints, and break the group into an arbitrary number of groups, and have possible duplicates rather than unique values.

More...