Mathias Brandewinder on .NET, F#, VSTO and Excel development, and quantitative analysis / machine learning.
28. August 2009 17:54

In the current issue of OR/MS Today, I came across this nice optimization puzzle, “Bridges to Somewhere”. There are these two islands. Five people A, B, C, D and F live on the first island, and need to commute to work to the second island. Individual A lives in the spot marked A, and needs to go to spot A on the second island – and so on for the 4 others. People can travel only vertically and horizontally (no diagonals), and will always take the shortest path available.

There is currently no bridge between the islands, but a budget for 2 bridges has been approved (the island just received a stimulus package). There are 4 bridge proposals to choose from (One, Two, Three and Four on the map). Which 2 bridges should be built to minimize the travel distance of the population?

Before trying to figure out which 2 bridges are best, I thought it would be interesting to investigate a simpler problem: if you could build one bridge anywhere, where should you build it?

There are a number of ways you could resolve this using Excel; I will illustrate how to find the best solution, using Excel Data Tables. More...

20. August 2009 05:55

I wrote a post a few days ago describing how to generate a Line – Column chart in Excel through C#. And then a few things happened. Jon Peltier proposed a much nicer approach, I realized that my code worked for Excel 2003 but not Excel 2007, and someone asked for my code, “Jon-Peltier style”. So here we go: assuming your chart has more than one series, and you want the second series to be formatted as a line, all the rest as columns, you would do something like this:

// Create your chart object first
// formatted as column
Chart chart = ExcelCharts.AddChart(targetWorkbook, "my chart", "the chart title", XlChartType.xlColumnClustered, dataRange, XlRowCol.xlRows);
// Select the second series and make it a line
Series series = (Series)chart.SeriesCollection(2);
series.ChartType = XlChartType.xlLine;

Here is a simplified version of my AddChart method, which creates the base chart. Nothing fancy, but gets the job done.

public static Excel.Chart AddChart(Workbook workbook, string chartSheetName, string title, XlChartType chartType, Range dataRange, XlRowCol byRowOrCol)
{
    Excel.Chart chart;
    chart = (Excel.Chart)workbook.Charts.Add(Type.Missing, Type.Missing, Type.Missing, Type.Missing);
    chart.ChartType = chartType;
    chart.Location(XlChartLocation.xlLocationAsNewSheet, chartSheetName);
    chart.SetSourceData(dataRange, byRowOrCol);
    chart.HasTitle = true;
    chart.ChartTitle.Text = title;
    return chart;
}

As an aside, I was not happy with myself when I realized the code didn’t run on Excel 2007. I tend to write Excel-related code against Excel 2003 first, assuming it is the lowest common denominator and will likely work with Excel 2007. This is a perfect illustration that while that assumption is usually correct, it sometimes fails, and in unexpected and trivial places, as in this example. Moral of the story: as Lenin allegedly said, “Trust is good, control is better”…

17. August 2009 09:51

I was digging through older Excel projects recently, and realized that while my recent projects haven’t become smaller in terms of data (if anything, they have become more data-intensive), they all have very small Excel files. Which got me wondering – did my workbooks really shrink, and if so, why?

One of the characteristics of my recent projects has been that I have progressively removed most of the calculations from the spreadsheets. Excel is used to store inputs and to display outputs, and computations happen outside, using either VSTO add-ins, or Interop (I realize it is a somewhat devious use of Excel, but I have my reasons for doing this). As a result, the Excel files contain raw data (input data, or output results), some formatting, and no formulas. This made me curious: does using formulas affect the file size significantly?

Here are the 2 key conclusions from my quick & dirty experiment:

• Each formula costs about twice as much as a straight, static input.
• Besides that, the size of a file grows linearly with the number of cells used.

More...

9. August 2009 17:25

The first of Scott Hanselman’s “Top 10 Tips Working Developers Should Know about Windows 7” really made my day: Win 7 includes .NET 3.5 SP1. This probably doesn’t matter to you if you are not a .NET developer, but if you are, chances are you have had the same frustration as I have. .NET 3.5 SP1 is really what you want to be developing against, because it includes so much goodness, but in my experience, most users don’t have it on their machines. As a result, your potential user has to go through a good 15 minutes of download and a reboot – and that’s assuming the IT department is fine with that, which is not a given (personal experience).

This is of particular importance to me, because my pet project Akin, a free application that helps track down differences between Excel files, is written against .NET 3.5. I really needed WPF to create the kind of user interface I wanted, but this has proven a hurdle in getting people to try it out. You might be able to convince people you know personally that they should install .NET 3.5, but for the casual visitor who stumbles across a webpage and wants to just try out an application, asking them to download a giant file for unclear purposes first is just a killer. You lost one potential user, right there.

With Windows 7 pre-sales making a solid start, hopefully I will be relieved of that issue. This is very motivating – time to get back to it, and implement some of the great suggestions I received so far!

8. August 2009 18:03

Today I came across a solution to Euler Problem 205 on The Daily Dose of Excel. The problem is stated as follows:

Peter has nine four-sided (pyramidal) dice, each with faces numbered 1, 2, 3, 4.
Colin has six six-sided (cubic) dice, each with faces numbered 1, 2, 3, 4, 5, 6.

Peter and Colin roll their dice and compare totals: the highest total wins. The result is a draw if the totals are equal.

What is the probability that Pyramidal Pete beats Cubic Colin? Give your answer rounded to seven decimal places in the form 0.abcdefg

I thought it was a pretty cool problem; I love probability problems, and had never come across something similar, so it piqued my interest. The solution presented in The Daily Dose was essentially a pretty efficient brute-force enumeration, and I wondered if it was possible to go a bit faster than 6 minutes by following a different approach – using my language of predilection, C#. [Edited August 9. Note to self: before commenting on other people’s blog posts, I should make sure I read them properly. Especially when discussing their code’s performance. Otherwise, I will look foolish.]

The probability that Pete wins can be written as:
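
One way to write it, assuming the two totals are independent, is shown here. The symbols X (Pete's total, between 9 and 36) and Y (Colin's total, between 6 and 36) are labels chosen for this sketch, not notation from the original problem:

\[
P(\text{Pete wins}) = P(X > Y) = \sum_{k=9}^{36} P(X = k)\, P(Y < k)
\]

Each term is the chance that Pete rolls exactly k multiplied by the chance that Colin rolls strictly less than k; a draw counts for neither player.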

Refreshing my memory of probability a bit through Wikipedia, “the exact probability distribution F_{s,i} of a sum of i s-sided dice can be calculated as the repeated convolution of the single-die probability distribution with itself” as follows:
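
A minimal C# sketch of that repeated convolution, not necessarily the code from the full post. The names DiceDuel, SumCounts and WinProbability are made up for this illustration, and the sketch counts outcomes with integers rather than carrying probabilities, dividing by the total number of outcomes only at the end:

```csharp
using System;
using System.Globalization;

public static class DiceDuel
{
    // Number of ways to reach each total with `dice` dice of `sides` faces.
    // Index = total, value = number of outcomes producing that total.
    public static long[] SumCounts(int dice, int sides)
    {
        long[] counts = { 1 }; // zero dice: exactly one way to total 0
        for (int d = 0; d < dice; d++)
        {
            // Convolve the current distribution with one more die.
            var next = new long[counts.Length + sides];
            for (int total = 0; total < counts.Length; total++)
                for (int face = 1; face <= sides; face++)
                    next[total + face] += counts[total];
            counts = next;
        }
        return counts;
    }

    // Probability that the first player's total is strictly higher
    // than the second player's total (draws do not count as wins).
    public static double WinProbability(int dice1, int sides1, int dice2, int sides2)
    {
        long[] first = SumCounts(dice1, sides1);
        long[] second = SumCounts(dice2, sides2);
        long wins = 0;
        long secondBelow = 0; // outcomes where the second total is below `total`
        for (int total = 0; total < first.Length; total++)
        {
            wins += first[total] * secondBelow;
            if (total < second.Length) secondBelow += second[total];
        }
        double outcomes = Math.Pow(sides1, dice1) * Math.Pow(sides2, dice2);
        return wins / outcomes;
    }

    public static void Main()
    {
        // Pete: nine 4-sided dice; Colin: six 6-sided dice.
        double p = WinProbability(9, 4, 6, 6);
        Console.WriteLine(p.ToString("F7", CultureInfo.InvariantCulture));
    }
}
```

Because each distribution has at most 37 entries, this does a few hundred additions instead of enumerating all 4^9 × 6^6 combined rolls.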

More...