R in Resolver One (and perhaps IronPython generally)
We've just announced the winner of this month's round of our competition at Resolver Systems, and it's a great one; Marjan Ghahremani, a student at UC Davis, managed to work out how you can call R (a powerful statistical analysis language) from our spreadsheet product Resolver One. You can download a ZIP file with a detailed PDF describing how it works and a bunch of examples.
If you're not interested in Resolver One, but want to use R from your own IronPython scripts, you may be able to do that too, using her instructions as guidelines -- I've not tried it myself, but there are no obvious blockers. If you do try it out, I'd love to hear how it goes.
xmlrpc
One of our customers had been asking about how to call XMLRPC servers from
Resolver One. It doesn't work in version 1.3, and he was having problems getting
it to work in 1.4. The problem turned out to be simple and fixable, and unlikely
to affect other people, so I'm proud to present a really simple XMLRPC/Resolver One example
that you can use as a starting point: a Python script that creates a server
exposing an is_even function (which tells you if a number is even or not), and
a Resolver One spreadsheet that uses it. There are only two lines of code in
the spreadsheet, which is pretty cool :-)
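For readers who just want the shape of the server side, here is a minimal sketch using only the standard library. It is not the script from the download: it assumes Python 3's xmlrpc.server and xmlrpc.client modules (a script from the time would have used the Python 2 equivalents), start_server and round_trip are hypothetical helper names, and it covers only the plain-Python side, not the two lines of spreadsheet code.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def is_even(n):
    """Tell the caller whether n is even."""
    return n % 2 == 0


def start_server(host="127.0.0.1", port=0):
    """Serve is_even over XML-RPC in a background thread (hypothetical helper).

    Port 0 asks the operating system for any free port.
    """
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(is_even, "is_even")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def round_trip(numbers):
    """Start a server, ask it about each number, then shut it down again."""
    server = start_server()
    try:
        host, port = server.server_address
        proxy = ServerProxy("http://%s:%d/" % (host, port))
        # Each call below is a real XML-RPC request to the server thread.
        return [proxy.is_even(n) for n in numbers]
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(round_trip([3, 4]))  # prints [False, True]
```

On the client side the remote function looks like an ordinary method call on the proxy object, which is presumably why so little code is needed in the spreadsheet.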
The Resolver One Spreadsheet Challenge: a winner for round one!
Proving that there really is a point in having a proper PR department who think about these things, we only realised today that our choice of date for the announcement of the winner of our spreadsheet competition was not ideal -- our US customers (who make up a hefty fraction of the total) have something else on their minds, apparently.
Nonetheless, we decided it would be unfair on the entrants to delay the announcement, so here we have it: the winner of the first round of the competition is Siamak Faridani, for his spreadsheet to estimate the electrostatic field around Micro Electro Mechanical Systems. Congratulations, Siamak!
Money for spreadsheets
We've produced a lot of interesting spreadsheets in-house at Resolver Systems -- some of which I've blogged about here -- but we're really keen to see what everyone else is doing with Resolver One. So we're running a competition: every month for the next five months, we're asking people to send us interesting stuff that they've done with our product, and we'll give $2,000 to the author of the best one. After five months we'll give $15,000 to the author of "the best of the best".
It should be interesting to see what people send us :-)
VAT calculations
There's been an interesting discussion over at Smurf on Spreadsheets about the consequences of the UK government's temporary VAT rate reduction. For the benefit of non-UK readers, VAT is basically the British sales tax (it differs a little in implementation from a simple sales tax). It is currently 17.5%, but as a reaction to the financial crisis, it will be reduced to 15% from 1 December 2008 until 31 January 2010 inclusive. Whether this makes sense as a matter of economic policy is, of course, highly contentious. But this is a technical blog so I'll stick to its effect on spreadsheets :-)
Resolver One plug
A quick plug: there's only one day left to get Resolver One at the old price!
As of midnight (GMT) tomorrow, the discounted price for Resolver One 1.3 will come to an end, and the price will rise from $199 to $399. If you want to get your copy at the old price, you should buy now...
Why use IronPython?
I just posted this on the Joel on Software discussion board, in answer to someone's question about using IronPython for their new company. Hopefully it will be of interest here.
We've been using IronPython for three years now with a lot of success. The great thing about it is that it allows you to benefit from Python's syntax while getting most of the advantages of .NET:
- All of the .NET libraries are available.
- UIs look nice. I've never seen a pure traditional Python application that looked good, no matter how advanced its functionality.
- We use a bunch of third-party components -- for example, Syncfusion's Essential Grid -- without any problems.
- Reasonably decent multithreading using the .NET libraries -- CPython, the normal Python implementation, has the problem of the Global Interpreter Lock, an implementation choice that makes multithreading dodgy at best.
- We can build our GUI in Visual Studio, and then generate C# classes for each dialog, and then subclass them from IronPython to add behaviour. (We never need to look at the generated code.)
- When things go wrong, the CLR debugger works well enough -- it's not perfect, but we've never lost a significant amount of time for want of anything better.
Of course, it's not perfect. Problems versus using C#:
- It's slower, especially in terms of startup time. They are fixing this, but it's a problem in the current release. This hasn't bitten us yet -- all of the non-startup-related performance issues we've had have been due to suboptimal algorithms rather than language speed. However, if you're writing something that's very performance-intensive, you may want to look elsewhere.
- No LINQ yet.
- If you're considering IP then you presumably already know this, but dynamic languages have no compile step to perform sanity checks on your codebase, so problems can come up at runtime. We write all of our code test-first and so we aren't impacted by that. However, if you're not writing a solid amount of test code (and if you're not, you should be :-) then you might want to use a statically-typed language.
Problems versus using CPython:
- No cross-platform. Linux or Mac support is one of our more frequently-requested enhancements, and it will be a lot of work to add. The reason for this is that many third-party .NET components -- for example, the Syncfusion grid -- are not "pure" .NET; they drop into win32 for certain operations, I assume for performance reasons. This means that if you use them, your application won't run on non-Windows .NET platforms.
- No use of CPython's C extensions, like numpy (a numerical functions library). This has hit us pretty hard, so we're working on an open-source library to interface between C extensions and IronPython -- however, it's still a work in progress.
Hope this was of some help.
Evolution in action
At Resolver Systems, we've recently split into two teams; about two thirds of us work on the core Resolver One platform that is our main product (this group is inventively called the Platform team), and the other third build new spreadsheet/Python programs, using Resolver One, for specific clients' custom needs (the Apps team). This is great, because we are now building business solutions for people as well as a generic platform (which means more money for us), and we are also dogfooding -- so we can be sure we're adding features and fixing bugs which really do help our users.
The problem with doing this is that everyone in the company has different preferences about how much time they want to spend in each team. Some people really like writing programs to fix business problems, and others are keener on abstract algorithms. We could have just said "stuff it" and swapped people around so that everyone was doing a 1:2 rotation, but it was much more fun to solve the problem in software :-) My aim was to somehow generate, for each of the next twelve iterations (the two-week development cycles we work in), a list of people who would form that iteration's Apps team and the people who'd form the iteration's Platform team.
So I put together a spreadsheet: an evolutionary algorithm for team scheduling.
If you're using Windows, you can download it and take a look (you can get a
free version of Resolver One to run it on if you haven't already).
You enter your team's preferences -- in terms of the percentage of time they'd
like to spend on the Apps team -- in the "Preferences" sheet (which also shows
some results from the last run), and then some numbers to guide the evolution (number
of generations, population size, etc) in the "Parameters" sheet, and then get the
best schedule it can generate in the "Rota" sheet.
To be honest, it's using the spreadsheet more as a display mechanism than anything
else. But it's a fun bit of code, although I'm sure that anyone who actually works
on evolutionary algorithms would find it trivially simple (and probably broken :-).
The function GenerateSchedule in the pre-formulae user code (for Resolver
newbies: in the box below the grid - the section with a green background) is the
interesting bit -- everything below there is just presentation logic. Here's how
it works:
- We generate a random set of schedules, each of which is created by picking three random people from our team and putting them into the Apps team, leaving the remainder in the Platform team.
- We then run through as many generations as the user specified. In each generation:
  - Every schedule in our population is assigned a weight. This is generated by a function called WeightSchedule, which is what people who study evolutionary algorithms would call a fitness function. Basically, the higher the number it returns, the less good the schedule is.
  - We sort the schedules by their weights, and then we kill off the worst of them.
  - We then create a new generation comprising the survivors from the cull, and a set of new schedules that are "parented" by those survivors, using the function MutateSchedule. We apply a slight bias so that the fitter schedules have a better chance of reproducing than the others.
  - And on we go for another generation.
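The generation loop described in the list above can be sketched in a few lines of plain Python. This is a sketch under assumptions rather than the spreadsheet's actual GenerateSchedule: the name evolve, the survivor_fraction parameter and the rank-based form of the bias are all hypothetical, and the weighting and mutation functions are passed in as arguments so that toy stand-ins (numbers instead of schedules) keep the example self-contained.

```python
import random


def evolve(population, weight, mutate, generations,
           survivor_fraction=0.5, rng=random):
    """Return the best individual found; a lower weight means a better one."""
    size = len(population)
    for _ in range(generations):
        # Weigh everything and sort, best (lowest weight) first.
        ranked = sorted(population, key=weight)
        # Kill off the worst of them (assumed: keep a fixed fraction).
        survivors = ranked[:max(1, int(size * survivor_fraction))]
        # Slight bias (assumed form): the fittest survivor is the most likely
        # parent, the least fit survivor the least likely.
        bias = list(range(len(survivors), 0, -1))
        parents = rng.choices(survivors, weights=bias, k=size - len(survivors))
        # New generation: the survivors plus their mutated children.
        population = survivors + [mutate(parent) for parent in parents]
    return min(population, key=weight)


# Toy stand-ins: an "individual" is just a number and the target is 42.
def toy_weight(x):
    return abs(x - 42)


def toy_mutate(x):
    return x + random.choice([-1, 1])


if __name__ == "__main__":
    start = list(range(20))
    best = evolve(start, toy_weight, toy_mutate, generations=100)
    print(best, toy_weight(best))
```

Because the survivors are carried over unchanged, the best weight in the population can never get worse from one generation to the next in this sketch.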
WeightSchedule was the most difficult function in the code to get right. (This
is in keeping with what I've heard about evolutionary algorithms in general.) Its
job is to return a number that is high for bad schedules, and low for good ones. I
found I got the best results by returning an arbitrary "high" value for any schedule
that failed to meet certain must-have criteria, and then working out, for each
person, the difference between the amount of time they wanted to spend in a given
team and the actual amount of time they spent there in the current schedule. I then
raised those per-person errors to the power of four (to make it clear that three
people 5% out is better than one person 15% out) and then summed the results. This
seemed to work just fine.
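A weighting function along those lines might look like the sketch below. It is not the spreadsheet's WeightSchedule: the schedule representation (one set of Apps-team names per iteration), the team-size check standing in for the unnamed must-have criteria, and the names weight_schedule, HIGH_WEIGHT and apps_team_size are all assumptions made for illustration.

```python
HIGH_WEIGHT = 1e9  # arbitrary "high" value for schedules that break a hard rule


def weight_schedule(schedule, preferences, apps_team_size=3):
    """Return a weight for a schedule: lower is better.

    schedule: list with one set of names per iteration (that iteration's
              Apps team); everyone else is on the Platform team.
    preferences: dict mapping each name to the percentage of time (0-100)
                 that person would like to spend on the Apps team.
    """
    # Hypothetical must-have criterion: the Apps team is always the right size.
    if any(len(apps_team) != apps_team_size for apps_team in schedule):
        return HIGH_WEIGHT

    total = 0.0
    for person, wanted_percent in preferences.items():
        iterations_on_apps = sum(1 for apps_team in schedule if person in apps_team)
        actual_percent = 100.0 * iterations_on_apps / len(schedule)
        error = wanted_percent - actual_percent
        # Fourth power: one large error costs far more than several small ones.
        total += error ** 4
    return total
```

The fourth power is what makes one big miss worse than several small ones: three people 5 points out contribute 3 * 5**4 = 1875, while one person 15 points out contributes 15**4 = 50625.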
For MutateSchedule I had a bit of fun. Its purpose is to generate a new child
from a single parent schedule (I chose to use an asexual reproduction model because,
in my experience, sexual reproduction and spreadsheets rarely mix well). My initial
implementation just switched one pair of people around for every iteration -- that is,
one person who was originally on the Apps team was now on the Platform team, and vice
versa. I then made the number of such swaps a user-settable parameter, so that people
could increase the extent of mutations. This sounded like a good idea, but didn't
help much -- indeed, increasing the number of swaps invariably made the system less
likely to produce a good schedule. My "background radiation" level was clearly too
high. So I then changed things so that you could specify a fractional number of
swaps. A swap level of 0.1 meant that each iteration had a one in ten chance of
having someone swapped around. This seemed to work well -- indeed, 0.1 seemed pretty
close to the sweet spot for the number of swaps. I suppose this makes sense -- you can
imagine that a schedule with twelve iterations in it that is almost perfect is more
likely to be improved if you switch around two people in just one of its iterations
than if you make a swap for every iteration.
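The fractional swap level can be sketched as follows. Again this is an illustration, not the spreadsheet's MutateSchedule: the schedule representation (one set of Apps-team names per iteration), the name mutate_schedule and its parameters are hypothetical, and the handling of values above 1 (whole part as guaranteed swaps, fractional part as the chance of one more) is an assumption about how the two versions of the parameter might fit together.

```python
import random


def mutate_schedule(parent, team, swap_level=0.1, rng=random):
    """Return a child schedule made by randomly swapping people between teams.

    parent: list with one set of names per iteration (the Apps team).
    team: all team members; anyone not in an iteration's set is on Platform.
    """
    child = []
    for apps_team in parent:
        apps_team = set(apps_team)  # copy, so the parent is left untouched
        # Assumption: whole part = guaranteed swaps, fraction = chance of one more.
        swaps = int(swap_level)
        if rng.random() < swap_level - int(swap_level):
            swaps += 1
        for _ in range(swaps):
            platform_team = [p for p in team if p not in apps_team]
            if not apps_team or not platform_team:
                break
            # One Apps person moves to Platform, and vice versa.
            leaving = rng.choice(sorted(apps_team))
            joining = rng.choice(platform_team)
            apps_team.remove(leaving)
            apps_team.add(joining)
        child.append(apps_team)
    return child
```

With swap_level=0.1 and twelve iterations, a child differs from its parent in a little over one iteration on average, which matches the intuition that a nearly perfect schedule is best improved by small changes.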
So that's it -- a simple evolutionary algorithm in a spreadsheet. I've deliberately not over-tidied the code in the version you can download above -- I've just sanitised the data to protect the privacy of everyone on the team, and then added a few comments for the more impenetrable bits of code. But it should all be pretty easy to understand, and I'd love to hear from anyone with comments (especially if they know more about this kind of thing than me...)
Resolver One as a Python Success Story
Jonathan Hartley, a friend who is also a developer at Resolver Systems, has contributed to the set of Python Success Stories at Pythonology.org with a description of how we've benefited from using Python -- in particular the .NET variant of the language, IronPython. It's well worth a read, especially if you're interested in how a Python-based Extreme Programming team can use the language for both its internal systems and its public product.
Off to visit the Beast of Redmond ;-)
Mahesh Prakriya at Microsoft was kind enough to suggest that I give a talk at the Lang.NET symposium, and so tomorrow I'm flying to Seattle. It looks like a fantastically interesting meetup, and I'm really looking forward to it.
The one hiccup for me was trying to work out what to put in the talk. Having been on so many client and potential client visits, and done marketing material for non-technical users, it was very hard to switch over to thinking again about what Mahesh had clearly realised, and Jon Udell touched on back when he did a screencast with us: that a lot of the power behind Resolver One comes from the way it treats spreadsheets as just another .NET language.
This doesn't mean that our marketing and sales efforts are wrong -- our users and users-to-be don't really care about how the program does what it does, they care about what problems it solves for them. But it's a useful reminder to me that I need to keep both sides in mind.
[Update] The talk went well! It was videoed and I'll link to it as soon as they put it online. In the meantime, here are the slides.
[Update, later] Darryl Taft has written about the talk in eWeek.