Sunday, 29 May 2011

Extensions to CorCode

I've been thinking about the exact role of CorCode and how we render it on screen as HTML. There are some problems with the simple CorTex, CorCode, CorPix model that need figuring out. Hopefully they're just details.

TEI and XSLT

For TEI embedded markup people write XSLT stylesheets to transform the text, typically into HTML. Since a particular encoding of a text is embedded in it, and is highly customised, the XSLT stylesheet that transforms it cannot be fully portable to other people's texts:

  1. they may want to render the same data differently, or
  2. they may have custom tags and attributes that need rendering according to their local GUI requirements

The reason that TEI tries to mandate particular names for tags and attributes is so that standard stylesheets can be used. In practice this does not work very well because just as the encoding is customised the stylesheets also need customisation as required by these two points. On the other hand, if we no longer mandate any standard encoding and just allow the user to specify fully the names of properties (aka tags) then it is up to the encoder to specify what is to be done with them.

How to do this in HRIT?

One objective of designing HRIT is to make things easy and to make them work smoothly. So the user should download Cortex and CorCode and they should automatically merge locally and produce output that looks good in their local GUI tool. But how?

Both functions of the transformation - the one into a visual appearance and its interpretation by the application are specific to the encoding itself. So I am thinking that the transforming css file that performs both duties in HRIT needs to be closely associated with the CorCode. It shouldn't need to be transformed locally for a particular application because how can that transformation do anything without knowledge of the format? Imagine I write a css file to render King Lear, and use it within single view. Cool. But what happens if I download Little Dorrit by Charles Dickens in the same application, and my css file no longer works? So the css file can't belong to the application, or at least most of it can't. It has to belong to the CorCode. This is more or less what TEI already does: an XSLT stylesheet is designed for a particular collection of customised text encodings and is useless outside of that domain.

Ergo: Corform

So maybe we need a fourth type of basic data: corform that is as tightly bound to corcode as corcode is itself bound to cortex. That sounds less evil to me than extending the corcode format to support formatting. The good news is that we wouldn't need as many corforms as corcodes, because they would be version- and even file-independent. But it would be both functionally and visually specific to a specific application. We might assign each corcode a "style" and then specify corforms like render that style. We'd want many corforms for each style but not the other way around. So a suitable rest resource url might look like: http://luc.edu/hrit/corform/style-name/app-name/render-name. "app-name" could be "default", meaning any application, or it could be "hritsingle" meaning the single view application; a suitable style-name might be "TEI"; "render-name" might be "freds-format". An individual corform resource would be in css.

So the way it would work in practice the user would specify a particular corform resource manually, and it would be saved until changed. But if none was specified, the application name combined with the corcode's style name should identify a default rendering in every case.

No comments:

Post a Comment