During some recent meanderings through the confines of the internet, I ended up discovering the Winnow Algorithm. The simplicity of the approach intrigued me, so I thought it would be interesting to try and implement it in F# and see how well it worked.
The purpose of the algorithm is to train a binary classifier, based on binary features. In other words, the goal is to predict one of two states, using a collection of features which are all binary. The prediction model assigns weights to each feature; to predict the state of an observation, it checks all the features that are “active” (true), and sums up the weights assigned to these features. If the total is above a certain threshold, the result is true, otherwise it’s false. Dead simple – and so is the corresponding F# code:
type Observation = bool 
type Label = bool
type Example = Label * Observation
type Weights = float 
let predict (theta:float) (w:Weights) (obs:Observation) =
(obs,w) ||> Seq.zip
|> Seq.filter fst
|> Seq.sumBy snd
|> ((<) theta)
We create some type aliases for convenience, and write a predict function which takes in theta (the threshold), weights and and observation; we zip together the features and the weights, exclude the pairs where the feature is not active, sum the weights, check whether the threshold is lower that the total, and we are done.
In a nutshell, the learning process feeds examples (observations with known label), and progressively updates the weights when the model makes mistakes. If the current model predicts the output correctly, don’t change anything. If it predicts true but should predict false, it is over-shooting, so weights that were used in the prediction (i.e. the weights attached to active features) are reduced. Conversely, if the prediction is false but the correct result should be true, the active features are not used enough to reach the threshold, so they should be bumped up.
And that’s pretty much it – the algorithm starts with arbitrary initial weights of 1 for every feature, and either doubles or halves them based on the mistakes. Again, the F# implementation is completely straightforward. The weights update can be written as follows:
let update (theta:float) (alpha:float) (w:Weights) (ex:Example) =
let real,obs = ex
match (real,predict theta w obs) with
| (true,false) -> w |> Array.mapi (fun i x -> if obs.[i] then alpha * x else x)
| (false,true) -> w |> Array.mapi (fun i x -> if obs.[i] then x / alpha else x)
| _ -> w
Let’s check that the update mechanism works:
> update 0.5 2. [|1.;1.;|] (false,[|false;true;|]);;
val it : float  = [|1.0; 0.5|]
The threshold is 0.5, the adjustment multiplier is 2, and each feature is currently weighted at 1. The state of our example is [| false; true; |], so only the second feature is active, which means that the predicted value will be 1. (the weight of that feature). This is above the threshold 0.5, so the predicted value is true. However, because the correct value attached to that example is false, our prediction is incorrect, and the weight of the second feature is reduced, while the first one, which was not active, remains unchanged.
Let’s wrap this up in a convenience function which will learn from a sequence of examples, and give us directly a function that will classify observations:
let learn (theta:float) (alpha:float) (fs:int) (xs:Example seq) =
let updater = update theta alpha
let w0 = [| for f in 1 .. fs -> 1. |]
let w = Seq.fold (fun w x -> updater w x) w0 xs
fun (obs:Observation) -> predict theta w obs
We pass in the number of features, fs, to initialize the weights at the correct size, and use a fold to update the weights for each example in the sequence. Finally, we create and return a function that, given an observation, will predict the label, based on the weights we just learnt.
And that’s it – in 20 lines of code, we are done, the Winnow is implemented.