Linking areas on an image to segments of text so you can highlight one or the other and show which fragment of an image produced which piece of the transcription sounds like a crazy, pedantic idea. At least that is what I thought when I first heard about it. But the fact is, if you display a facsimile image next to a transcription, the user has no easy way to tell what corresponds to what. They spend their whole time scrolling up and down, scanning the image with their eyes, going back to the text, losing their place on the image and starting again, and so on. Following the HCI notion of minimising 'excise', the user effort spent getting a task done, text-to-image links make it easy to read a manuscript facsimile. Not so crazy after all!
The images are stored in MongoDB's GridFS. The same database stores the plain text and the markup overlays, called cortex and corcode respectively. CorCode has the advantage over standard standoff or directly embedded markup that any number of markup sets can be overlaid onto the same text, and the result is valid HTML. The HTML then gets sent to the browser, which shows a simple window with two panels. The left one displays the image inside an HTML5 canvas; the right one holds a transcription of the image's contents. As the user moves over regions occupied by words in the image, those regions turn pink and the corresponding region in the text on the right is highlighted too. It also works the other way around: hovering over the text highlights the matching part of the image.
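To make the two-way highlighting concrete, here is a minimal sketch of how the browser side could wire it up, assuming each link pairs a rectangle on the facsimile with the id of a span in the transcription HTML, and that the page image has already been loaded. The names (Link, redraw, wire) are illustrative, not TILT's actual API.

```typescript
// A link joins a region on the facsimile image to a span in the transcription.
interface Link {
  rect: { x: number; y: number; w: number; h: number }; // region on the image
  spanId: string;                                        // id of the matching text span
}

const canvas = document.getElementById("facsimile") as HTMLCanvasElement;
const ctx = canvas.getContext("2d")!;
const pageImage = new Image();   // assumed already loaded with the page facsimile

function redraw(highlight?: Link): void {
  ctx.drawImage(pageImage, 0, 0);                 // repaint the page image
  if (highlight) {                                // tint the linked region pink
    ctx.fillStyle = "rgba(255, 105, 180, 0.4)";
    const { x, y, w, h } = highlight.rect;
    ctx.fillRect(x, y, w, h);
  }
}

function wire(links: Link[]): void {
  // image -> text: hovering a word region highlights the matching span
  canvas.addEventListener("mousemove", e => {
    const hit = links.find(l =>
      e.offsetX >= l.rect.x && e.offsetX <= l.rect.x + l.rect.w &&
      e.offsetY >= l.rect.y && e.offsetY <= l.rect.y + l.rect.h);
    redraw(hit);
    links.forEach(l =>
      document.getElementById(l.spanId)?.classList.toggle("highlight", l === hit));
  });
  // text -> image: hovering a span highlights its region on the canvas
  links.forEach(l => {
    const span = document.getElementById(l.spanId);
    span?.addEventListener("mouseenter", () => redraw(l));
    span?.addEventListener("mouseleave", () => redraw());
  });
}
```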
The really cool bit in TILT1 is that the user can quickly refine the guesstimate made by the server by selecting an already recognised region on each side. Re-recognising then does the same thing, but starting from two known-good end-points on either side. In the most fine-grained case a single word on each side could be chosen, but in most cases great swathes of text can be selected in one go. TILT1 uses a clever alignment algorithm, adapted from textual diff tools, to align the word-shapes on the left with words of corresponding length on the right, taking account of their order. When the user is satisfied with the alignment he or she can press "next" or "prev" to go on to a new page or to refine a previously completed page, and the work is saved automatically.
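The following is only a sketch of the length-based alignment idea, not TILT1's actual algorithm: word-shapes (represented here by their pixel widths) are aligned in order with transcription words (represented by their character counts) using a simple dynamic programme that minimises the mismatch in relative lengths, in the same spirit as a textual diff. The function name and the GAP penalty are assumptions for illustration.

```typescript
// Align image word-shapes with transcription words by relative length and order.
// Returns index pairs [shapeIndex, wordIndex] for the matched items.
function align(shapeWidths: number[], wordLengths: number[]): Array<[number, number]> {
  const n = shapeWidths.length, m = wordLengths.length;
  // normalise both sequences so pixel widths and character counts are comparable
  const norm = (xs: number[]) => {
    const total = xs.reduce((a, b) => a + b, 0) || 1;
    return xs.map(x => x / total);
  };
  const a = norm(shapeWidths), b = norm(wordLengths);
  const GAP = 0.05;                              // penalty for skipping an item
  const cost: number[][] = Array.from({ length: n + 1 },
    () => new Array<number>(m + 1).fill(0));
  for (let i = 1; i <= n; i++) cost[i][0] = i * GAP;
  for (let j = 1; j <= m; j++) cost[0][j] = j * GAP;
  for (let i = 1; i <= n; i++)
    for (let j = 1; j <= m; j++)
      cost[i][j] = Math.min(
        cost[i - 1][j - 1] + Math.abs(a[i - 1] - b[j - 1]), // match shape i with word j
        cost[i - 1][j] + GAP,                               // skip a shape (e.g. a noise blob)
        cost[i][j - 1] + GAP);                              // skip a word (not found on the image)
  // trace back through the cost table to recover the matched pairs
  const pairs: Array<[number, number]> = [];
  let i = n, j = m;
  while (i > 0 && j > 0) {
    if (cost[i][j] === cost[i - 1][j - 1] + Math.abs(a[i - 1] - b[j - 1])) {
      pairs.push([i - 1, j - 1]); i--; j--;
    } else if (cost[i][j] === cost[i - 1][j] + GAP) i--;
    else j--;
  }
  return pairs.reverse();
}
```

Restricting the alignment to the stretch between two user-confirmed end-points, as described above, simply means running the same procedure over the sub-sequences that lie between those anchors.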
The problem with this is that it is still just a design. But I need it for two projects: the De Roberto I Viceré and the Charles Harpur critical archive, both of which have extensive manuscript facsimiles to complement the texts. Without automation, text-image alignment on this scale would be infeasible. The thing I like about this design is that each software component does what it is good at and delegates the rest to the other components.
Yes, it will require HTML5, but all modern browsers support it. Without HTML5 the drawing becomes very messy: one way on IE (using VML) and another way on every other browser. If it doesn't work for you and you need it, just update your browser, or if you can't, buy a new computer or tablet. I haven't got time to support every damn browser out there.