Thursday 6 June 2013

Sustainability of software in the digital humanities

Everyone knows that it is very difficult to maintain whatever you create in this field, let alone make any progress. And once progress has been made, the result breaks very quickly. We have seen this scenario played out countless times, but I'm not going to point fingers. That's just how it is. I am reminded very much of Through the Looking-Glass, where the Red Queen takes Alice by the hand and runs with her for some time as fast as she can go. But when they finally stop, Alice realises the horrible truth:

'Why, I do believe we've been under this tree the whole time! Everything's just as it was!'

'Of course it is,' said the Queen, 'what would you have it?'

'Well, in OUR country,' said Alice, still panting a little, 'you'd generally get to somewhere else—if you ran very fast for a long time, as we've been doing.'

'A slow sort of country!' said the Queen. 'Now, HERE, you see, it takes all the running YOU can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!'

The answer to this dilemma is not to complain that you don't have enough resources, but simply to program economically. First, put everything you can into the most stable software platform you know: in my case the C language. Much of the computation you need to do can be expressed this way: programs that compute phylogenetic trees, compare and format texts, load and save formats, in fact anything that performs basic operations that you can code once, debug and forget. I have C programs from the 1980s that still compile and run flawlessly. The modern programmer seems to have forgotten that C is the bedrock of everything they do: nearly every scripting language they use and the kernel of every mainstream operating system (Windows included, I believe, although much of the rest of Windows is C++) is written in it. It cannot go away or the entire world would cease to be. And it barely changes: new versions of the standard leave old code working. If you don't have any dependencies, if your code just computes something, then you can write, debug and forget. That saves a lot of maintenance work.
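As a rough sketch of what "write, debug and forget" code can look like, here is a text-comparison routine that depends on nothing but the standard C library. The function name edit_distance and the choice of Levenshtein distance are my own assumptions for illustration, not code from any particular project, and it is written to the C99 standard:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: Levenshtein edit distance between two
 * NUL-terminated byte strings, i.e. the minimum number of single-byte
 * insertions, deletions and substitutions that turn a into b.
 * Returns -1 if memory cannot be allocated. */
long edit_distance(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t la = strlen(a);
    size_t lb = strlen(b);

    /* One row of the dynamic-programming table: row[j] is the distance
     * between the first i bytes of a and the first j bytes of b. */
    size_t *row = malloc((lb + 1) * sizeof *row);
    if (row == NULL)
        return -1;

    for (size_t j = 0; j <= lb; j++)
        row[j] = j;                     /* distance from the empty prefix of a */

    for (size_t i = 1; i <= la; i++) {
        size_t diag = row[0];           /* table value at (i-1, j-1) */
        row[0] = i;
        for (size_t j = 1; j <= lb; j++) {
            size_t up = row[j];         /* table value at (i-1, j) */
            size_t best = diag + (a[i - 1] == b[j - 1] ? 0 : 1);
            if (up + 1 < best)
                best = up + 1;          /* delete a byte from a */
            if (row[j - 1] + 1 < best)
                best = row[j - 1] + 1;  /* insert a byte from b */
            row[j] = best;
            diag = up;
        }
    }

    long result = (long)row[lb];
    free(row);
    return result;
}
```

Calling edit_distance("kitten", "sitting") returns 3. The point is that nothing here touches a file, a network or a third-party library, so there is very little that can break when everything around it changes.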

Secondly, leverage existing open source code. Write your extra stuff to use existing CMSes and other platforms, with as little glue code as you can manage. Let the wider community maintain that platform for you. Don't expect that your service will last. It won't. But you can call the C code from the service; in fact you can call C code from virtually any language and compile it for virtually any platform, so that the service need only be a wrapper around core functionality. Be ready for change: expect it. Then when you have to rewrite your code it will just be a matter of rewrapping it all in the latest style.
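To make the split between core and wrapper concrete, here is a minimal sketch, assuming a made-up word-counting core and a made-up JSON-style output as the "latest style". The names count_words and words_as_json are hypothetical. Only the second function would need rewriting when the fashion changes:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Core (hypothetical): counts whitespace-separated words.
 * It only computes; it does no I/O and knows nothing about
 * how its result will be presented. */
size_t count_words(const char *text)
{
    assert(text != NULL);

    size_t n = 0;
    int in_word = 0;
    for (const unsigned char *p = (const unsigned char *)text; *p; p++) {
        if (isspace(*p)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;    /* first byte of a new word */
            n++;
        }
    }
    return n;
}

/* Wrapper (hypothetical): the only part that knows about the output
 * format of the day. Writes a small JSON object into out.
 * Returns 0 on success, -1 if the buffer is too small. */
int words_as_json(const char *text, char *out, size_t outlen)
{
    int written = snprintf(out, outlen, "{\"words\": %zu}", count_words(text));
    return (written >= 0 && (size_t)written < outlen) ? 0 : -1;
}
```

A web service, a command-line tool or a binding from a scripting language would each replace words_as_json with their own thin layer and leave count_words untouched.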

Thirdly, don't expect that the open source community is going to either fix your code or maintain it for you, because they won't. You have to be prepared to do it all yourself, at least until you have a great product that everyone is using, like Linux. So far no one in the digital humanities has got that far. And maybe no one ever will. You have to believe in what you do, and not work on it only when you have grant money.