Mathias Brandewinder on .NET, F#, VSTO and Excel development, and quantitative analysis / machine learning.
by Mathias 26. September 2008 06:41
I just wanted to share this solid post by Joel Spolsky on password management. Like most people, I know that it's bad to reuse the same password in multiple places, and that passwords should be changed regularly. Like most people too, I am totally convinced of that, and yet I do just what I shouldn't, because remembering many passwords is a pain, especially when they are long and human-unreadable. The problem is compounded if you work from multiple machines and need to access online services from all of them, in which case the temptation to use a few generic passwords increases quite a bit. Well, the solution Joel proposes addresses that, and I just made all my sensitive passwords completely human-unreadable; on top of that, I even turned on the option that makes me change them every 3 months. Woot! It's nice when technology makes doing the right thing easy...
by Mathias 23. September 2008 17:37

On September 2, 2008, Google launched its browser, Chrome, with great buzz in the geekosphere. I gave it a spin, but stayed with Firefox (old habits die hard), and did not give it more thought until I came across this post where Donn Felker ventures his gut feeling for what the browser market will look like in 2009.

I believe that his forecast, while totally subjective, qualifies as an “expert opinion” and is essentially correct. I wondered what quantitative analysis methods would add to it, and decided to give it a shot.

The Bass adoption model


Properly representing the introduction of a new product on the market is a classic problem in quantitative modeling. At least two factors make it tricky: there is only limited data available (because it’s a new product), and the underlying model cannot be linear (because adoption starts from 0, and its growth is bounded by the size of the market).

In 1969, Frank Bass proposed a model which is now a classic. It represents adoption as the combination of two factors: innovation and imitation. Innovators are the guys you see in line at the Apple store when a new iGizmo is launched; they have to have it first, regardless of how many people have it already. Imitators are the cautious ones, who will jump on board when enough people are using the product already; the more people have already adopted, the more imitation takes place.

In terms of dynamics, innovators determine the early pick-up of the product and create the initial critical mass of users, while imitators drive the bulk of the growth, going from early adoption to peak.

The mathematical formulation of the model goes like this:

 

(from http://www.valuebasedmanagement.net/methods_bass_curve_diffusion_innovation.html)
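
For reference, the standard discrete-time statement of the Bass model can be written as below. The notation is an assumption and may differ from the linked page: m is the total population, p the rate of innovation, q the rate of imitation, and N(t) the cumulative number of adopters at the end of period t.

```latex
N(t) - N(t-1) = \left( p + q \, \frac{N(t-1)}{m} \right) \bigl( m - N(t-1) \bigr), \qquad N(0) = 0
```

The first factor is the adoption rate among people who have not adopted yet: a constant innovation term p, plus an imitation term that grows with the share of the population already on board. The second factor is the number of people still left to convert.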


It is a very elegant and lightweight model, which takes only 3 parameters, and is surprisingly good at replicating actual adoption. The Excel model attached provides an illustration of the dynamics of the model, depending on its input parameters, the total population, and the rates of innovation and imitation.
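
To give a feel for these dynamics without opening the workbook, here is a minimal C# sketch of a discrete-time Bass simulation. It is not a transcription of the Excel model: the class and method names are hypothetical, and the parameter values in Main (a market of 1000, innovation 0.03, imitation 0.38) are illustrative assumptions, not estimates for any real product.

```csharp
using System;
using System.Collections.Generic;

public static class BassModel
{
    // Returns the cumulative number of adopters at the end of each period.
    // marketSize: total population; innovation and imitation: the two rates.
    public static List<double> Simulate(
        double marketSize, double innovation, double imitation, int periods)
    {
        var cumulative = new List<double>(periods);
        double adopters = 0.0;
        for (int t = 0; t < periods; t++)
        {
            // People who have not adopted yet.
            double remaining = marketSize - adopters;
            // Innovation is constant; imitation grows with the adopted share.
            double adoptionRate = innovation + imitation * adopters / marketSize;
            adopters += adoptionRate * remaining;
            cumulative.Add(adopters);
        }
        return cumulative;
    }

    public static void Main()
    {
        // Illustrative parameters only (assumptions, not fitted values).
        List<double> curve = Simulate(1000.0, 0.03, 0.38, 20);
        for (int t = 0; t < curve.Count; t++)
        {
            Console.WriteLine("Period {0,2}: {1,7:F1} adopters", t + 1, curve[t]);
        }
    }
}
```

Playing with the two rates shows the behavior described above: a higher innovation rate lifts the early part of the curve, while a higher imitation rate steepens the middle of the S-shape.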

Bass.xls (27.50 kb)

by Mathias 16. September 2008 16:59

Macroeconomics and public policy have never been my forte in economics, which is probably why I did not come across the Gini coefficient until now. In a nutshell, the Gini coefficient is a clever way to measure inequalities of distribution in a population.

As an illustration, imagine 4 countries, each of them with 10 inhabitants. In Equalistan, everyone owns the same amount of $100, whereas in Slaveristan, one person owns everything, and the 9 others have nothing. In between, there are Similaristan and Spreadistan.

 

If you order the population by increasing wealth and plot out the cumulative % of the total wealth they own, you will get the so-called Lorenz curve. Equalistan and Slaveristan are the two extreme possible cases; any curve must fall between these two, and the further the curve is from Equalistan, the less equal the distribution. The Gini coefficient uses that idea, and measures the surface between the Equalistan curve and your curve. Once that surface is normalized to yield 100% for the Slaveristan case, any population has an index between 0% (perfectly equal) and 100% (absolutely unequal).
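
As a rough illustration of that computation, here is a minimal C# sketch. It assumes non-negative wealth values and uses the rank-weighted sum formula, which gives the same area as the Lorenz curve construction for a discrete population; the division by (n - 1) rather than n is what makes the one-person-owns-everything case come out at exactly 100%. The class and method names are hypothetical, and only the two extreme countries are used, since their figures are given in the text.

```csharp
using System;
using System.Linq;

public static class Gini
{
    // Gini coefficient of a wealth distribution, normalized so that
    // "one person owns everything" yields 1.0. Assumes non-negative values.
    public static double Coefficient(double[] wealth)
    {
        if (wealth == null || wealth.Length < 2)
        {
            throw new ArgumentException("At least two inhabitants are required.");
        }
        // Order the population by increasing wealth, as for the Lorenz curve.
        double[] sorted = wealth.OrderBy(w => w).ToArray();
        int n = sorted.Length;
        double total = sorted.Sum();
        if (total <= 0)
        {
            throw new ArgumentException("Total wealth must be positive.");
        }
        // Rank-weighted sum: poorer ranks get negative weights, richer positive.
        double weighted = 0.0;
        for (int i = 0; i < n; i++)
        {
            weighted += (2 * (i + 1) - n - 1) * sorted[i];
        }
        return weighted / ((n - 1) * total);
    }

    public static void Main()
    {
        double[] equalistan = Enumerable.Repeat(100.0, 10).ToArray();
        double[] slaveristan = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 1000 };
        Console.WriteLine("Equalistan:  " + Coefficient(equalistan));
        Console.WriteLine("Slaveristan: " + Coefficient(slaveristan));
    }
}
```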


by Mathias 7. September 2008 10:27

Yes/No/Cancel choices are a classic in creating surprisingly confusing user interfaces out of a very simple problem, but this one takes the cake, and proves that people will find ways to create confusion even in the most unlikely situation:

fail owned pwned pictures

by Mathias 4. September 2008 18:19

 [Edit, Sept 5, 2008: nothing incorrect in the following post; however, if I had Google'd first, I would have found that DateTime date = DateTime.FromOADate(d), where d is a double, does exactly the job...]

The project I am currently working on requires reading some data from an Excel workbook into a .NET calculation engine written in C#. Most of my reads follow this pattern: read a named range into an array of objects, then convert each object to the appropriate .NET type.

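using System.Reflection;                      // Missing.Value
using Excel = Microsoft.Office.Interop.Excel; // Excel interop alias

// Reads a named range into a 2D array of cell values.
// Note: for a single-cell range, Value2 is a scalar, so the cast yields null.
public static object[,] GetRangeAsArray(Excel.Worksheet sheet, string rangeName)
{
    Excel.Range range = sheet.get_Range(rangeName, Missing.Value);
    object[,] rangeAsArray = range.Value2 as object[,];
    return rangeAsArray;
}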

However, I ran into an issue reading dates. Excel stores dates as doubles, which encode the number of days elapsed since January 0, 1900 (Yes, January 0). As a result, the object stored in the array is a double, and the Convert.ToDateTime(double) method throws an InvalidCastException, so standard conversion doesn’t work.
If you look a bit deeper into it (here is a very comprehensive page on the topic), you will discover some interesting idiosyncrasies of the date encoding in Excel. For instance, back in the day, the Excel team knowingly implemented a bug to replicate a known bug of Lotus 1-2-3, which treats 1900 as a leap year, for the sake of backwards compatibility. As a result, Excel has a February 29, 1900 (serial number 60) which never existed.
Here is the quick method I wrote to perform that conversion, addressing these issues:

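// Converts an Excel serial date (1900 date system) to a DateTime.
public static DateTime ConvertToDateTime(double excelDate)
{
    // Serial 1 is January 1, 1900; anything below that is not a valid date.
    if (excelDate < 1)
    {
        throw new ArgumentException("Excel dates cannot be smaller than 1.");
    }
    DateTime dateOfReference = new DateTime(1900, 1, 1);
    // Serial 60 is Excel's nonexistent February 29, 1900 (mapped here to March 1).
    // Past it, one extra day must be removed to compensate for the phantom day.
    if (excelDate > 60d)
    {
        excelDate = excelDate - 2;
    }
    else
    {
        excelDate = excelDate - 1;
    }
    return dateOfReference.AddDays(excelDate);
}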
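
As a quick sanity check on the built-in route mentioned in the edit note at the top of this post, the sketch below compares DateTime.FromOADate with Excel's serial numbers around the phantom leap day. The class name is hypothetical. The one caveat it illustrates is that OLE Automation dates count from December 30, 1899, so the two encodings only line up from serial 61 (March 1, 1900) onward; for earlier serials, FromOADate lands one day before the date Excel displays.

```csharp
using System;
using System.Globalization;

public static class OADateCheck
{
    public static void Main()
    {
        CultureInfo inv = CultureInfo.InvariantCulture;

        // Serial 61 is March 1, 1900 in Excel, and FromOADate agrees.
        DateTime afterPhantomDay = DateTime.FromOADate(61);
        Console.WriteLine(afterPhantomDay.ToString("yyyy-MM-dd", inv)); // 1900-03-01

        // Serial 1 is January 1, 1900 in Excel, but FromOADate returns the day before.
        DateTime firstSerial = DateTime.FromOADate(1);
        Console.WriteLine(firstSerial.ToString("yyyy-MM-dd", inv)); // 1899-12-31

        // Fractional parts encode the time of day in both systems.
        DateTime noon = DateTime.FromOADate(61.5);
        Console.WriteLine(noon.ToString("yyyy-MM-dd HH:mm", inv)); // 1900-03-01 12:00
    }
}
```

For workbooks that only contain dates from March 1900 onward, which is the common case, the built-in method is therefore enough.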

