I am digging back into the Bumblebee code base, to clean it up before talking at the New England F# user group in Boston in June. As usual for me, it’s a humbling experience to face my own code, 6 months later, or, if you are an incorrigible optimist, it’s great to see that I am so much smarter today than a few months ago…
In any case, while toying with one of the samples, I noted that performance was degrading pretty steeply as the size of the problem was increasing. Most of the action revolved around producing random shuffles of a list, so I figured it would be interesting to look into it and see where I messed up how this could be improved upon.
Here is the original code, a quick-and-dirty implementation of the Fisher-Yates shuffle:
open System
let swap fst snd i =
if i = fst then snd else
if i = snd then fst else
i
let shuffle items (rng: Random) =
let rec shuffleTo items upTo =
match upTo with
| 0 -> items
| _ ->
let fst = rng.Next(upTo)
let shuffled = List.permute (swap fst (upTo - 1)) items
shuffleTo shuffled (upTo - 1)
let length = List.length items
shuffleTo items length
[<EntryPoint>]
let main argv =
let test = [1..10000]
let random = new Random()
let shuffled = shuffle test random
System.Console.WriteLine("done...")
0
Running the test case in fsi, using #time, produces the following:
Real: 00:00:42.735, CPU: 00:00:42.734, GC gen0: 2307, gen1: 6, gen2: 0
(Digression: #time is absolutely awesome – just typing #time;; in a fsi session will automatically display performance information, allowing to quickly tweak a function and fine-tune it “on the fly”. I wish I had known about it earlier.)
My initial assumption was that the problem revolved around performing multiple permutations of a List. However, I figured it would be interesting to take the opportunity and use the Performance Analysis tools provided in VS11 – and here is what I got:

Uh-oh. Looks like the shuffle is spending most of its time doing comparisons in the swap function – and what the hell is HashCompare.GenericEqualityIntrinsic doing in here? Something is off.
Looking into the swap function provides a hint:

F# has identified that the function could be made generic. It’s great, but in our case it comes with overhead, because we simply want to compare integers. Let’s mark the function as inline, to avoid that problem (we could also make the function non-generic, by marking one of the inputs as integer):

More...