Sunday 15 November 2009

Progress with the rewrite

So it's decided. Digital Variants, the Harpur Archive and perhaps part of Hrit (can't say any more) will be done in Joomla! with a C++ version of nmerge with several improvements:

  1. Merging using the list format, not the explicit graph. So no more wasteful conversion to/from the graph.
  2. Support for 'plug-in' alignment modules for corpus linguistics texts, multi-lingual texts, xml-aware alignment. Also a user-accessible API for building their own alignment modules in C++.
  3. Use of a full suffix-tree for the general alignment algorithm. In this way the performance will be linear.

The GUI will become a Joomla! module, extension or whatever that will take over all the current functionality of the wiki and add a manuscript view and a tree-view, which will show the genealogical tree of the work. Maybe each of these views could be designed as modules also, so the site designer can add his/her own. I want this to be as flexible as possible.

Of course, the advantage of using Joomla! is that we leverage all its existing GUI for site management and page editing etc. So we get out of the box something that humanists can already use to build their own website.

But the keyword is: components, components, components! Like Steve Ballmer with his 'Developers! developers! developers!' Let the designer be free to customise the system.

Rahmel's book on Joomla! is fairly comprehensive, but a bit difficult to read. I have nearly finished working my way through Chapter 3. My intention is to go through it all as fast as possible and type in and test all the examples. At the end I hope that I will come out as a Joomla! guru, able to create three websites at the stroke of a key. If the goal of changing the face of digital humanities is ambitious I think it is also achievable given the tools available today.

Sunday 1 November 2009

Radical Redesign

After arguing this through with the DV people, it is clear that they want a Joomla! or PHP-based website, so that they get a nice GUI and can edit it themselves. And no Java. So I have to rewrite nmerge in C++ and develop a Joomla! plugin out of the Alpha wiki web application. This way other humanities users can get a simple, easy-to-deploy web application without paying for expensive web hosting.

Monday 24 August 2009

Some Big Changes are Afoot

OK, I've been quiet about this for a bit. But I had other things to do, like attending the Balisage Conference. In the meantime I've had a rethink. I have realised how poorly written Jetspeed is. To create your own Jetspeed website you have to modify an existing website. Yuck. It should just be a product that you ADD things to, like webapps or portlets. Drop them in and away you go. Instead there is all this tinkering with the internals of a complex program.

So I am returning to Pluto. It's much simpler and cleaner. OK, there aren't so many features, but I can add those. Also, I have decided to use JSPs to define the portlets. I can rewrite the servlet code I have for Alpha and fit it very nicely into Pluto. And the best thing is Pluto is tiny - just 20MB. That sure beats over 200MB for Jetspeed, and that was small.

Thursday 18 June 2009

Changing the database location

This would seem to be a simple task. After all, a web application which is dependent on a database being located somewhere on the disk can't store that information in the database itself. It needs some independent storage location. The obvious place to put this information is in the customised application somewhere. But after searching for it for hours I couldn't find it. No, it is in Tomcat in the /conf/Catalina/localhost directory. The file is called dv.xml (or jetspeed.xml for an uncustomised application).

The reason I wanted to know was that I wished to distribute the website and have it run on Windows. Obviously /tmp/jetspeed wouldn't be available. It would make more sense to store the database in the same folder as the website, and then have a single .bat file to start it all up. But I needed to alter the database location. Now I can.

Actually, I use a relative url. Instead of /tmp/jetspeed/derby/productiondb I used: ../../jetspeed/derby/productiondb, and it worked. My jetspeed directory was in the same directory as apache-tomcat-6.0.20, and contains derby/productiondb.
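To see why ../../ lands in the right place, here is a minimal sketch of the path arithmetic. It assumes the relative location is resolved against the directory Tomcat was launched from (taken here to be its bin folder), which I have not verified; the /opt/dv layout and the DerbyLocation class are hypothetical names used only for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: shows how a relative database location resolves
// against a working directory. Not part of Jetspeed or Tomcat.
final class DerbyLocation {

    // Resolve dbPath against workingDir and collapse any ".." segments.
    // An absolute dbPath is returned unchanged (apart from normalisation).
    static Path resolve(String workingDir, String dbPath) {
        return Paths.get(workingDir).resolve(dbPath).normalize();
    }

    public static void main(String[] args) {
        // Assumed layout: apache-tomcat-6.0.20 and jetspeed side by side in /opt/dv.
        System.out.println(resolve("/opt/dv/apache-tomcat-6.0.20/bin",
                                   "../../jetspeed/derby/productiondb"));
    }
}
```

Two levels up from bin is the folder that holds both apache-tomcat-6.0.20 and jetspeed, which is why the same setting keeps working when the whole bundle is copied to another machine.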

Changing the Navigation Menu

The navigation menu, which in the 'thesolution' layout is seen on the left, is a reflection of the links defined in /dv/WEB-INF/pages. I just deleted all the boring apache ones and replaced them with more reasonable DV ones.

Wednesday 17 June 2009

Localising Included HTML files

Of course, in a website you want lots of HTML files, and in a multi-language website you want to provide translations for them. But how do you arrange it so that if you choose French as your locale, then the correct French HTML files are loaded?

The trick is to create a portlet definition for each HTML file you want to include. A bit tedious, I know, but how else can you do it? The portlet definitions reside in portlet.xml in the j2-portal web application, inside the WEB-INF folder. In there you will find a suitable portlet to revise and take over, such as 'Welcome to Jetspeed 2'. Just copy and duplicate this as many times as you want. It uses the Apache Portal Applications FilePortlet. Now this contains a trick suggested here, although the real mechanism is a little bit different.

You first define an init-param for the portlet thus:

<init-param>
  <name>useLanguage</name>
  <value>true</value>
</init-param>
This turns on the 'fallback' mechanism for files you specify but Jetspeed doesn't find at first. It works like this: if you specify a file /WEB-INF/view/info/about.html in your portlet-preference (see the portlet.xml file) but it's not there, and you enabled the fr locale by saying <supported-locale>fr</supported-locale>, and the current locale is French, then it will generate a fallback consisting of /WEB-INF/view/info/fr/about.html, and if that is present it will load that. The same trick works for any locale, including en.
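The fallback rule just described can be sketched in a few lines of Java. This is only my reading of the behaviour, not the FilePortlet source: the LocaleFallback class and its method names are hypothetical, and the "does the file exist" check is passed in as a predicate so the example stays self-contained.

```java
import java.util.function.Predicate;

// Hypothetical illustration of the locale fallback described above.
final class LocaleFallback {

    // Insert the language code as a folder just before the file name:
    // /WEB-INF/view/info/about.html + "fr" -> /WEB-INF/view/info/fr/about.html
    static String localizedPath(String path, String language) {
        int slash = path.lastIndexOf('/');
        String dir = path.substring(0, slash + 1);  // empty if there is no slash
        String file = path.substring(slash + 1);
        return dir + language + "/" + file;
    }

    // Try the path as given first, then the localized fallback.
    // Returns null if neither candidate exists.
    static String choose(String path, String language, Predicate<String> exists) {
        if (exists.test(path)) {
            return path;
        }
        String fallback = localizedPath(path, language);
        return exists.test(fallback) ? fallback : null;
    }
}
```

So with only fr/about.html on disk and French as the current locale, choose picks the French file; with Italian as the locale and no it/about.html it finds nothing.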

The title and description strings have to be defined in resource bundles, but I will tackle that tomorrow.

Tuesday 16 June 2009

Localising the Website Title

Although my colleagues will probably want 'Digital Variants' in all languages, I wanted localised versions in French, Spanish and Italian as the website title. Here's how you do it. In default-page.psml add some meta-data for each desired version:

<metadata name="short-title" xml:lang="fr">Variantes Numériques</metadata>
etc.
Then, define a Velocity macro in decorator-macros.vm:
#macro (ShortPageTitle)$jetspeed.page.getShortTitle( $preferedLocale)#end
This calls the getShortTitle method of org.apache.jetspeed.page.document.Node.java, which retrieves the declared 'shortTitle' of the page. I now have 'Varianti Digitali' as the website title in Italian.
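For anyone wondering what a locale-aware title lookup amounts to, here is a rough sketch. It is not Jetspeed's implementation: the ShortTitles class is hypothetical, and I am assuming the lookup goes by language code and falls back to the unlocalised title when there is no match.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a page with localized short-title metadata.
final class ShortTitles {
    private final String defaultTitle;
    private final Map<String, String> byLanguage = new HashMap<>();

    ShortTitles(String defaultTitle) {
        this.defaultTitle = defaultTitle;
    }

    // Register one <metadata name="short-title" xml:lang="..."> entry.
    ShortTitles add(String language, String title) {
        byLanguage.put(language, title);
        return this;
    }

    // Return the title for the locale's language, or the default title.
    String getShortTitle(Locale locale) {
        return byLanguage.getOrDefault(locale.getLanguage(), defaultTitle);
    }
}
```

With entries added for "fr" and "it", Locale.ITALIAN gets 'Varianti Digitali' and an unsupported locale such as Locale.GERMAN gets the default 'Digital Variants'.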

Sunday 14 June 2009

Restricting the number of languages

How do you correctly restrict the number of languages supported by the DV site? I tried configuring the locales belonging to the Locale Selector portlet, but this doesn't limit the number of displayed flags.

The answer was quite simple, but ferreted away in the bowels of Jetspeed. In the j2-admin application (of which Locale Selector is but a part) you look in the WEB-INF directory for the classes folder. In there under org/apache/jetspeed/portlets/localeselector/resources there's a file LocaleSelectorResources.properties. Edit out the values in localeselector.locales that you don't want. I think that's all, but for good measure I also edited out the unwanted locale names. Ditto for the other files for the other locales in the same directory. Again this is probably redundant. Then you have to quit Tomcat. I deleted /work/Catalina for good measure (the Tomcat cache) and restarted Tomcat and the unwanted locale flags were gone. Hooray!

Why can't they just tell you this?
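If you have several of these properties files to trim, the edit to localeselector.locales can be scripted. The sketch below assumes the value is a comma-separated list of locale codes (check your own copy of the file); the LocaleTrimmer class is a hypothetical helper, not something shipped with Jetspeed.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.Set;

// Hypothetical helper for trimming the list of supported locales.
final class LocaleTrimmer {

    // Keep only the wanted codes, preserving their original order.
    // Assumes the value is comma-separated, e.g. "en,fr,de,it".
    static String keepOnly(String localesValue, Set<String> wanted) {
        List<String> kept = new ArrayList<>();
        for (String code : localesValue.split(",")) {
            String trimmed = code.trim();
            if (wanted.contains(trimmed)) {
                kept.add(trimmed);
            }
        }
        return String.join(",", kept);
    }

    // Read localeselector.locales from properties text and trim it.
    static String trimProperty(String propertiesText, Set<String> wanted)
            throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(propertiesText));
        return keepOnly(props.getProperty("localeselector.locales", ""), wanted);
    }
}
```

The result still has to be written back into the properties file by hand or by script, and Tomcat restarted, as described above.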

Progress!

So now I at last understand how to customise Jetspeed. There are four basic areas you can tweak:

  1. The administration portlets. Here under 'Portal Site Manager' you can select a page and configure it to some extent. Crucially you can't alter the layout of the stuff in the middle. For that you need:
  2. The .psml files. These specify in simple XML how to lay out the portlets on a page. Or you can edit a page directly in the portal by clicking on the pencil icon if you have admin privileges.
  3. The decorations/layout and decorations/portal folders in the jetspeed application folder (or, in my case, the 'dv' application folder) contain stylesheets for tweaking the appearance of the header, footer and portlet decorations. Overall layout is determined by choosing and manipulating:
  4. The header.vm, footer.vm Velocity templates. These are basically HTML files with embedded stuff that boggles the mind where it comes from. I wasted considerable time trying to trace it back through the Java and Velocity macros and eventually gave up. Familiarity will come in time, but it ain't easy.

But, hey, it's starting to look like the portal I wanted, and that's progress! Here's a peek at my first embarrassing effort:

Obviously I haven't finished colouring it correctly or even giving it a main menu or any content yet. But hold on, this is my first effort at building a portal and I am as proud of it as a toddler with his first painting. We're actually only going to support four languages (unless we can find someone to do a German or Hindi version), namely French, Italian, Spanish and English, so don't get carried away.

Saturday 13 June 2009

Oh oh ... spoke too soon

It looks like I spoke too soon about everything being fine with the customised version of Jetspeed. After playing around with it for a bit it got slower and slower then fell over. All attempts to restart it failed. It seems that the 64MB JVM default wasn't enough. This led to a corruption of the database when it expired and a return to all the faults of the previous attempts: failure to login etc.
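A quick way to find out what heap ceiling a JVM really has is to ask it. This is a generic sketch, not anything specific to Jetspeed or Tomcat, and HeapCheck is just a name I made up.

```java
// Prints the maximum heap the running JVM is allowed to grow to.
final class HeapCheck {

    // Maximum heap size in whole megabytes, as reported by the JVM.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it as java -Xmx64m HeapCheck mimics the 64MB limit; the figure reported can come out slightly under the requested value.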

So I tried again with the 'minimal' install of Jetspeed, without any customisation. This reduces the heavy admin portlets to the minimum, but you have to modify the Jetspeed installation itself. They warn you not to do this, but the customised version already creates another application for Tomcat to struggle with. All I'll be altering are the layout and decorators, so I don't see the problem. I can also throw out alternative decorators since they won't be needed. I've got to run in a max of 64MB.

They reckon you only need '1 MB' for the basic Jetspeed, and '2 MB' for the customised builds, which I just don't believe. You can't even run a JVM that says 'hello world' in that much RAM, but it does show that you need less for Jetspeed in the raw.

Thursday 11 June 2009

Battling with Maven

Well, after umpteen tries I finally got the customised Jetspeed portal tutorial to work. That's the problem with volunteer efforts. They're not quite professional because they have no real requirement to be bug-free. The community – that's me! – is supposed to fix the broken bits. I must say that of the various build systems around today, Maven, which is compulsory for Jetspeed, is not my favourite.

Problems experienced by me, and the rest of the community, include:

  1. difficulty of getting Maven to work through a proxy, which not everyone found so easy. Here's the cryptic error-message:
    Reason: POM 'org.apache.maven.plugins:maven-archetype-plugin' not found in repository: Unable to download the artifact from any repository
    This actually means that Maven couldn't contact any remote repository. In my case I couldn't get it to work through port 3128 with a username and password, even after specifying them in ~/.m2/settings.xml, according to the official instructions. So I gave up and did it at home, where I had a more direct Internet connection. Here's someone else who had the same problem.

  2. failure to login once you get it to build. What they failed to mention was that the tutorial ceased to work with some version of Tomcat after 5.5.6 and before 6.0.18. I tried for a long time with 6.0.13 which is missing certain security classes whose absence prevents the user from logging in. (It rejects user=admin, password=admin, or any user you define because it doesn't encrypt the password before looking it up.)
  3. failure to dispose of previous attempts to start up the database. I noticed that after any failed attempt to build and deploy jetexpress (the tutorial's customised version of Jetspeed) the java database connection was still live, causing any further attempts to connect to the Derby jdbc driver to fail. So I just rebooted the machine and all was fine.
  4. failure to provide proper instructions for configuring databases other than Derby. Although I eventually found some here, they are not explained properly and really belong in the Jetspeed documentation.
  5. etc. (yes, etc!) In particular don't use one of the older tutorials by mistake. They should be taken down, because they are worse, and obsolete. Oh, and don't launch Tomcat using ./startup.sh but using ./catalina.sh run as it says in the tutorial, or you'll miss important error messages.

OK, it works now, but it was a lot of effort to get something going that was supposed to take only five minutes. Instead I wasted four days of my precious spare time. For the record I used Tomcat 6.0.20 on Mac OSX 10.4.13, JRE 1.5, Jetspeed 2.2.0, the bundled copy of Derby, and it all worked.

Why don't I give up on Jetspeed, you ask? Because the alternative is to use one of those bloated and unusable 'free'/commercial portal products. My contempt for them and their dirty tricks is undiminished in spite of my difficulties of the past few days. At least this way I can do what I like with it and walk away at the end with my software intact.

Now to start the customisation!

Sunday 7 June 2009

Starting Up the Jetspeed Customisation

The manual says that you need only create a customisation of Jetspeed, so I followed the instructions, which are pretty sketchy. They don't tell you, for example, how to set up a Postgres database, just Derby, which I know is slow. I got the Postgres settings from here, and fired up my customisation according to the instructions. But I got an error from the Spring framework that said

java.lang.IllegalStateException: BeanFactory not initialized or already closed - call 'refresh' before accessing beans via the ApplicationContext.

So I reloaded my two Jetspeed applications in the Tomcat 6 Manager and it got a bit further ...

Then it said:

Portlet Application j2-admin not available

when it was. So I reloaded the j2-admin application also in the Tomcat Manager, and it worked. At least I thought it did. Now it complains that the password 'admin' for user 'admin' is not valid. But I checked in the database and there is something there, and when I registered myself and tried the password I had just entered it gave the same error message. Groan! Will this never end?

Why I chose Jetspeed 2

There are literally hundreds of website building tools out there. Most of them are complicated, and most of them are big. Those that aren't don't seem to work as well. What I needed was something that was standards compliant, totally free, wholly Java based and small enough to run on a commercial webserver for not too much money. The list quickly narrowed to zero and got filled instead with things that looked like they were free, but weren't.

'Free' Software Honeypots

There are lots of free Content Management Systems and portal products out there but they are not really free. They are just a free front end to a basically commercial product. They offer something quite fancy for nothing, but once you start to use it you realise that the free version doesn't have all the features you want, and for that you have to pay. Or they use a proprietary plugin architecture, which is not standards compliant. Once you develop your plugin or portlet for that platform, you are locked in. Sooner or later you need support or something more or they start charging and wham! you are caught in the trap. Or the company goes out of business, or decides to change everything overnight and your project is wrecked. The naïve young programmers fall for this every time, but us old mice have been caught by the tail once too often.

Jetspeed 2

I wanted something that was really truly 100% free, and I eventually found Jetspeed 2. It's not as fancy as some other portal offerings but it doesn't have any strings attached. I don't know at this point if it is really right for my needs but it is based on Pluto, which is the reference implementation of JSR 286, the Portlets 2 standard. Hardly any commercial products adhere to this yet, but Jetspeed does. There's no hidden commercial product behind it, and it's not too complicated. Best of all it seems to run (not sure yet) in 64MB of VM memory, so it might be deployable on a commercial webhosting site. My only worry is whether users will be able to easily update HTML type content. But to find out I have to use it first. The old Catch-22.

So this blog is really a document for me and for anyone else who wants to know how to set up their own first class customisation of Jetspeed. That's the ambition anyway. Let's see if we get there. Heh heh. :-)