Mathias Brandewinder on .NET, VSTO and Excel development, and quantitative analysis.
by Mathias 21. August 2010 16:38

In a previous post, we saw how to programmatically search for text in a PowerPoint slide, by iterating over the Shapes contained in a slide, finding the ones that have a TextFrame, and accessing their TextRange property. TextRange exposes a Text property, which “represents the text contained in the specified object”.

Our goal is to translate a slide from a language to another, which means translating every chunk of text we find. However, the Text property contains a bit more than just text. Suppose you were working with a slide like the one below, which contains multiple bullet points, with various indentations:

DaftPunkSlide 

If you inspect the Text for the content area, you’ll see that it looks like this:

Work It\rMake It\rDo It\rMakes Us\rHarder\rBetter\rFaster\rStronger

At the end of each bullet point, we have a \r, which indicates a line break. If we want to maintain the formatting of our slide when we translate it, we’ll have to deal with it.

We’ll worry about the actual  translation later – for the moment we will use a fake method, which will show us what chunk of text has been translated:

public static string Translate(string text)
{
   return "Translated [" + text + "]";
}

A crude approach

A first approach would be to simply take the entire Text we find in the TextRange, manually separate it into chunks by splitting it around the carriage return character, translating the chunk, and re-composing the text, re-inserting the carriage returns.

Starting where we left off last time, let’s loop over the Shapes in the slide:

private void TranslateSlide()
{
   var powerpoint = Globals.ThisAddIn.Application;
   var presentation = powerpoint.ActivePresentation;
   var slide = (PowerPoint.Slide)powerpoint.ActiveWindow.View.Slide;
   foreach (PowerPoint.Shape shape in slide.Shapes)
   {
      if (shape.HasTextFrame == Microsoft.Office.Core.MsoTriState.msoTrue)
      {
         var textFrame = shape.TextFrame;
         var textRange = textFrame.TextRange;
         var text = textRange.Text;
         textRange.Text = CrudeApproach(text);
      }
   }
}

More...

by Mathias 13. August 2010 12:37

Today is Friday the 13th, the day when more accidents happen because Paraskevidekatriaphobics are concerned about accidents. Or is it the day when less accidents take place, because people stay home to avoid accidents? Not altogether clear, it seems.

Whether safe or dangerous, how often do these Friday the 13th take place, exactly? Are there years without it, or with more than one? That’s a question which should have a clearer answer. Let’s try to figure out the probability to observe N such days in a year picked at random.

First, note that if you knew what the first day of that year was, you could easily verify if the 13th day for each month was indeed a Friday. Would that be sufficient? Not quite – you would also need to know whether the year was a leap year, these years which happen every 4 years and have an extra day, February the 29th.

OuroborosImagine that this year started a Monday. What would next year start with? If we are in a regular year, 365 days = 52 x 7 + 1; in other words, 52 weeks will elapse, the last day of the year will also be a Monday, and next year will start a Tuesday. If this is a leap year, next year will start on a Wednesday.

Why do I care? Because now we can show that every 28 years, the same cycle of Friday the 13th will take place again. Every four consecutive years, the start day shifts by 5 positions (3 “regular” years and one leap year), and because 5 and 7 have no common denominator, after 7 4-year periods, we will be back to starting an identical 28-years cycle, where each day of the week will appear 4 times as first day of the year.

More...

by Mathias 8. August 2010 17:30

I am currently on a project which involves creating a PowerPoint VSTO add-in. I have very limited experience with PowerPoint automation, so before committing to the project, I thought it would be a good idea to explore a bit the object model, to gauge how difficult things could get, and I set to write a small PowerPoint add-in which would automatically translate slides. Sounds like a simple enough project, how difficult could it be?

Turns out, not too difficult, but not completely trivial either. I discovered quickly that the PowerPoint object model, unlike most Office applications, doesn’t have much (any?) documentation for the .Net developer; the best I found is the VBA PowerPoint 2007 developer reference, which gives a decent starting point to figure out what the objects are about. So I thought I would share my exploration of the PowerPoint jungle, and hopefully spare some trouble to other .Net developers.

The plan

The objective is simple: write an add-in which allows the user to

  • select a language to translate from, and a language to translate to,
  • create a duplicate of the current slide, translating all the text and keeping the layout

The plan will be to use Google Translate to perform the translation. In order to do that, we will nedd to extract out all pieces of text that require translating.

Finding all the text in a slide

Lets’ start by identifying where we have text in the current slide. Let’s first create a PowerPoint 2007 Add-in project in Visual Studio. To keep things simple for now, we will add a Ribbon control with a button, and when that button is clicked, we’ll start working on the current slide:

RibbonWithButton

Double-click on the Button (I renamed my button translateButton) to generate an event handler for the Click event, and get the current Slide:

private void translateButton_Click(object sender, RibbonControlEventArgs e)
{
   var powerpoint = Globals.ThisAddIn.Application;
   if (powerpoint.ActivePresentation.Slides.Count > 0)
   {
      var slide = (PowerPoint.Slide)powerpoint.ActiveWindow.View.Slide as PowerPoint.Slide;
   }
}

More...

by Mathias 5. July 2010 16:16

In our last post, we saw how to use F# to read historical stock quotes from Yahoo. Today we’ll take the raw response, which is a big block of text, and break it up into a list of individual quotes.

Breaking up the response into lines

The function we wrote last time, GetResponse, receives one chunk of text from the web service, formatted like this:


Date,Open,High,Low,Close,Volume,Adj Close
2010-03-08,28.52,28.93,28.50,28.63,39414500,28.50
2010-03-05,28.66,28.68,28.42,28.59,56001800,28.46
2010-03-04,28.46,28.65,28.27,28.63,42890600,28.50

What we need to do now is break up this into individual lines of text, and parse them to read individual quotes. The first part is straightforward: the function BreakIntoLines calls String.Split(), using char(10), the code for line break, as a delimiter, and returns an Array of strings.

let BreakIntoLines (response:string) =
  response.Split((char)10)

Note the type annotation on the function argument: without context, F# cannot infer the type of “response”, and we need to specify that this function expects a string argument.

Parsing valid lines into Quote records

The second part is a bit more complex. We need to break each line into 7 components (date, open, etc…), deal with lines that are not valid, like the header, and store the result in an appropriate structure.

We will store individual quotes into records. A Record is a data type somewhat similar to the C# struct. It has named fields, which makes it more expressive than Tuples, and is less involved than a class. Here is the declaration for our Quote record – concise, and pretty self-explanatory:

type Quote={
  Symbol:string;
  Date:DateTime;
  Open:double;
  Close:double;
  Low:double;
  High:double;
  Volume:int64}

More...

by Mathias 21. June 2010 13:12

Lately I have spent time on a pet project, which requires access to historical financial data. Mads Kristensen has a nice post where he shows how to read  stock quotes from Yahoo finance using C#, which was very helpful to get started. I figured it would be interesting to try out a conversion to F# and see what the result looked like.

Mads focus is on getting quasi real-time updates of a quote; my interest is in an easier problem: retrieving historical data. Fortunately, Yahoo provides a free service for that, too. Given a quote symbol and two dates, it returns a comma-separated file list of all the values for the quote between these 2 dates.

So what do we need to do? Given a valid symbol and 2 dates, we want to create the WebRequest to send to Yahoo, retrieve the response, break it into lines, and parse each line into a quote, which will be added to a list. The core of the resulting program will be the ReadQuotes function, which will look like this:

let ReadQuotes symbol date1 date2 = 
  CreateRequest symbol date1 date2 
  |> GetResponse 
  |> BreakIntoLines
  |> CreateQuotes symbol

Creating the WebRequest

The web request required to obtain historical data from Yahoo follows this pattern:

http://ichart.finance.yahoo.com/table.csv?s=S&a=A&b=B&c=C&d=D&e=E&f=F&g=d&ignore=.csv

where:

  • S is the symbol (ex: MSFT)
  • A, B, C are the start month, day and year, the month being coded in base 0 (i.e. January is 0)
  • D, E, F are the end month, day and year, the month being coded in base 0 (i.e. January is 0)

For instance, replacing S with MSFT, A with 0, B with 1, C with 2010, D with 1, E with 15, F with 2010, will return all the available quotes for Microsoft between January 1 and February 15, 2010.

Let’s start by creating a Console application, by selecting new F# project > F# Application, and typing in the following code:

open System;
open System.Net
open System.IO
open System.Text

let RetrieveDateInfo (date:DateTime) =
  (date.Day, date.Month-1, date.Year)

let CreateRequest symbol startDate endDate =

  let startDay, startMonth, startYear = RetrieveDateInfo startDate
  let endDay, endMonth, endYear = RetrieveDateInfo endDate

  let query = String.Format("&a={0}&b={1}&c={2}&d={3}&e={4}&f={5}&g=d&ignore=.csv", startMonth, startDay, startYear, endMonth, endDay, endYear)
  let url = "http://ichart.finance.yahoo.com/table.csv?s=" + symbol + query
  WebRequest.Create(url)

More...

Comments

Comment RSS