Mathias Brandewinder on .NET, F#, VSTO and Excel development, and quantitative analysis / machine learning.
by Mathias 31. October 2010 12:36

Last week, an interesting problem landed on my desk. Initially, when I was asked to sort items, I didn’t think much of it. Given a list of items, it’s fairly trivial to use LINQ and sort it by whatever property you want. What I hadn’t quite anticipated was that the user needed to be able to choose between multiple sorting criteria.

If there were a single, predetermined sorting criterion, the problem would be straightforward. For instance, given a list of fruits with a name, a supplier and a price, I can easily sort them by price:

static void Main(string[] args)
{
   var apple = new Product() { Supplier = "Joe's Fruits", Name = "Apple", Price = 1.5 };
   var apricot = new Product() { Supplier = "Jack & Co", Name = "Apricot", Price = 2.5 };
   var banana = new Product() { Supplier = "Joe's Fruits", Name = "Banana", Price = 1.2 };
   var peach = new Product() { Supplier = "Jack & Co", Name = "Peach", Price = 1.5 };
   var pear = new Product() { Supplier = "Joe's Fruits", Name = "Pear", Price = 2 };

   var originalFruits = new List<Product>() { apple, apricot, banana, peach, pear };

   var sortedFruits = originalFruits
      .OrderBy(fruit => fruit.Price);

   foreach (var fruit in sortedFruits)
   {
      Console.WriteLine(string.Format("{0} from {1} has a price of {2}.", 
         fruit.Name, 
         fruit.Supplier, 
         fruit.Price));
   }

   Console.ReadLine();
}
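(The Product class itself isn’t shown above; a minimal version, inferred from the properties used in the snippet, would look something like this.)

public class Product
{
   public string Name { get; set; }
   public string Supplier { get; set; }
   public double Price { get; set; }
}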

Running this simple console application produces the following list, nicely sorted by price:

[BasicSorting: console output showing the fruits sorted by price]

However, if we want to let the user select how the fruits should be sorted, the problem becomes a bit more complicated. We could write a switch statement, along the lines of “if 1 is selected, then run this sort, else run that sort, else run that other sort”, and so on. It would work, but it would also be ugly: we would essentially be rewriting the same OrderBy statement over and over again, which reeks of code duplication. How could we avoid that, and keep our code smelling nice and fresh?
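One possible direction, sketched below before the full discussion after the jump (and not necessarily the exact approach developed there), is to store each criterion as a key-selector function, for instance in a dictionary keyed by the user’s choice, and feed the selected one to a single OrderBy call. The criterion names here are purely illustrative.

// Map each user-facing choice to a key selector; a common return type
// (IComparable) lets the selectors share one dictionary.
var sortCriteria = new Dictionary<string, Func<Product, IComparable>>()
{
   { "By name", fruit => fruit.Name },
   { "By supplier", fruit => fruit.Supplier },
   { "By price", fruit => fruit.Price }
};

// A single OrderBy covers every case; no switch statement needed.
var selectedCriterion = "By price";
var sortedFruits = originalFruits.OrderBy(sortCriteria[selectedCriterion]);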

More...

by Mathias 27. October 2010 15:14

When it comes to design discussions, I have been described as “opinionated” (at times, harsher words like pig-headed or argumentative, or way worse, have been substituted). It isn’t that I think I am usually right (I am normally a pretty gentle person); rather, I firmly believe in the Socratic method for identifying well-formulated, clear and transparent solutions.

Two ingredients are required to make it work: an unambiguous statement, and a contradictor, who will poke at the statement until it either crumbles, or gets refined until it is self-evident, with clearly identified qualities and limitations. So I like to take somewhat extreme positions in design discussions, either in my proposals, or in my criticism. It has nothing to do with how much I believe them to be right, and everything to do with understanding what is at stake in the discussion and where the tensions are.

Unfortunately, it has also gotten me into trouble at times, because, well, discussions can get heated. I will honestly try my best to find issues with any design I am presented with, and push for clarification – just like I really appreciate when people do the same with mine. It’s a tricky exercise: while it is the ideas that are under fire, it’s often difficult not to take the criticism personally. I hope to get better one day at sensing when that emotional line is being crossed, so that an honest and open discussion can be maintained.

Anyways, the reason for this rant is that I am finally reading The Structure of Scientific Revolutions, by Thomas S. Kuhn - and I am enjoying it tremendously. The book is mostly concerned with science, but some of the ideas resonate deeply with me, and in my opinion extend beyond science. The two following gems are lifted from the book:

Truth emerges more readily from error than from confusion.

Sir Francis Bacon, quoted in Kuhn

… Novelty ordinarily emerges only for the man who, knowing with precision what he should expect, is able to recognize that something has gone wrong. Anomaly appears only against the background provided by the paradigm. The more precise and far-reaching that paradigm is, the more sensitive an indicator it provides of anomaly […] By ensuring that the paradigm will not be too easily surrendered, resistance guarantees that scientists will not be lightly distracted…

Thomas S. Kuhn

I could not agree more.

by Mathias 17. October 2010 16:13

The current project I am working on requires writing large amounts of data to Excel worksheets. In this type of situation, I create an array with all the data I want to write, and set the value of the entire target range at once. I know from experience that this method is much faster than writing cells one by one, but I was curious about how much faster, so I wrote a little test, writing larger and larger chunks of data and measuring the speed of both methods:

// Fills a 2D array in memory, then writes it to the worksheet in a single
// assignment to the target range.
private static void WriteArray(int rows, int columns, Worksheet worksheet)
{
   var data = new object[rows, columns];
   for (var row = 1; row <= rows; row++)
   {
      for (var column = 1; column <= columns; column++)
      {
         data[row - 1, column - 1] = "Test";
      }
   }

   var startCell = (Range)worksheet.Cells[1, 1];
   var endCell = (Range)worksheet.Cells[rows, columns];
   var writeRange = worksheet.Range[startCell, endCell];

   writeRange.Value2 = data;
}

// Writes the same data one cell at a time, for comparison.
private static void WriteCellByCell(int rows, int columns, Worksheet worksheet)
{
   for (var row = 1; row <= rows; row++)
   {
      for (var column = 1; column <= columns; column++)
      {
         var cell = (Range)worksheet.Cells[row, column];
         cell.Value2 = "Test";
      }
   }
}
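For reference, here is a rough sketch of how the two methods could be timed against a live Excel instance with a Stopwatch; the harness I actually used isn’t shown in this post, and the grid sizes are whatever you pass in.

// Rough timing harness (sketch): runs both write methods on the same grid
// size, each on a fresh worksheet.
// Requires: using System.Diagnostics; using Microsoft.Office.Interop.Excel;
private static void CompareWriteSpeeds(int rows, int columns)
{
   var excel = new Application { Visible = false };
   var workbook = excel.Workbooks.Add();

   var arraySheet = (Worksheet)workbook.Worksheets.Add();
   var stopwatch = Stopwatch.StartNew();
   WriteArray(rows, columns, arraySheet);
   stopwatch.Stop();
   Console.WriteLine("Array: {0} ms", stopwatch.ElapsedMilliseconds);

   var cellSheet = (Worksheet)workbook.Worksheets.Add();
   stopwatch = Stopwatch.StartNew();
   WriteCellByCell(rows, columns, cellSheet);
   stopwatch.Stop();
   Console.WriteLine("Cell by cell: {0} ms", stopwatch.ElapsedMilliseconds);

   workbook.Close(false);
   excel.Quit();
}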

Clearly, the array approach is the way to go, performing close to 1,000 times faster per cell. Its relative advantage also seems to grow as the size of the data increases, but confirming that would require a bit more careful testing.

[WriteDataToExcel: chart comparing write times for the array and cell-by-cell methods]

More...

by Mathias 13. October 2010 14:19

Thanks to all of you who attended my sessions on Mocking and on TDD at Silicon Valley Code Camp 2010; I had a great time presenting, in large part because you were awesome and asked great questions!

I uploaded the slides and code for the session on Mocks here. There wasn’t much “written” material for the TDD session, so I didn’t upload it. If someone wants it, let me know in the comments and I’ll add it, too.

Feel free to let me know in the comments below if there are things I could have done better, and don’t forget to fill in your evaluations on the Code Camp website. It’s very helpful for speakers, and… you can win an iPad, courtesy of Dice.com!

On a side note, hats off to Peter Kellner and the whole crew of volunteers. Silicon Valley Code Camp gets bigger every year, and yet, the organization was flawless, and the whole event very fun. Congratulations!

by Mathias 8. October 2010 12:50


I am very honored to announce that I have become a Microsoft MVP for VSTO, Visual Studio Tools for Office. I learnt the news last week, and I am still super excited. I also feel somewhat intimidated, with a new sense of responsibility towards the community. Thank you Microsoft, and thank you to the Community - I’ll try my best to keep it up, to do my part to keep the .NET community at large a lively one, and to share my experience with VSTO and the fun of .NET + Office with you guys!

In other news, I apologize for the blog slowdown lately. In addition to being super-busy with a project (VSTO), I have had some problems with blog hosting, which have prevented me from writing new material. Things seem to be back on track now; I am sorry if you experienced some 404 inconvenience!
