Tuesday 25 October 2016

Formatting poetry for the Web

HTML was designed for the display of business documents, not poetry. In HTML, text is composed of a succession of flow elements, each of which contains a series of phrasing elements. So an element like <p> is flow content, and <span> is phrasing content. <div> is like the joker in cards: it can enclose anything.

Stanzas

Let's consider the first problem, how to encode stanzas:

Ah! Daniel mine, some Muse malign
    Hath skimm'd thy judgments cream away
But take a slice of "good advice"---
    Even that I proffer thee today.

If the stanza here were enclosed in a <pre> then all the lines would have the same indentation, and this could not be corrected using CSS. You could use spaces at the start of each line, but with variable-width fonts this looks awful, and you have no control over indentation when fonts are substituted by the browser. In CSS you can instead use the white-space: pre declaration to make any flow element preserve spaces and line-breaks just as <pre> does, so <pre> is not needed, especially as it uses a monospaced font by default.

An obvious alternative would seem to be <div>, which can enclose anything. So a <div class="stanza"> would be a good choice. Equally <p> is also possible, so long as it encloses only phrasing content. (A <p> can't enclose another <p>, so we can't use <p> to represent lines if <p> is already used for stanzas.) However, <p> has the distinct advantage of being the direct result of translation of Markdown's double line-breaks to separate paragraphs. This allows us to type poems online using a very simple Markdown-like syntax, and then translate it into suitable HTML. Stanzas can be styled using a CSS selector that selects all <p>-elements enclosed in a single <div class="poem">, so you don't have to keep typing "<div class="stanza">" every four lines or so.
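The translation from blank-line-separated stanzas to <p>-elements can be sketched in a few lines of JavaScript. This is only an illustration, not the code behind this page: the function names stanzasToHtml and escapeHtml are made up for the example, and it assumes the stanzas will be styled with white-space: pre so that the line-breaks inside each <p> survive.

```javascript
// Hypothetical helper: escape the characters that are special in HTML text.
function escapeHtml(text) {
    return text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Hypothetical helper: wrap each blank-line-separated stanza in a <p>,
// and the whole poem in a single <div class="poem">.
function stanzasToHtml(source) {
    var stanzas = source
        .replace(/\r\n/g, "\n")
        // A stanza break is a newline followed by one or more
        // empty (or whitespace-only) lines.
        .split(/\n(?:[ \t]*\n)+/)
        .filter(function (s) { return s.trim().length > 0; });
    var paragraphs = stanzas.map(function (s) {
        // Drop stray leading newlines and trailing whitespace,
        // but keep the indentation of the lines themselves.
        var body = s.replace(/^\n+/, "").replace(/\s+$/, "");
        return "<p>" + escapeHtml(body) + "</p>";
    });
    return '<div class="poem">\n' + paragraphs.join("\n") + "\n</div>";
}
```

Because every stanza comes out as a plain <p>, a single selector such as div.poem p is enough to style them all.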

Headings

Headings in poems are often centred. Unfortunately, poetic lines are typically much shorter than the screen-width, and since the enclosing <div> will fill all the available width on screen, a heading centred within it ends up well to the right of the text. So it won't be centred in any meaningful sense. The fix is to write a little JavaScript to measure each line of poetry, then adjust the width of the <div> so that it is slightly wider than the longest line. Something like this:

<head>
...
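<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.12.4/jquery.min.js"></script>
<script type="text/javascript">
// Once the page has loaded, shrink the poem to fit its longest line.
$(document).ready(function(){
    var maxWidth=0;
    // Each line of verse is a <span> inside a stanza <p>.
    var lines = $("div p span");
    for ( var i=0;i<lines.length;i++ ) {
        var w = lines.eq(i).width();
        if ( w > maxWidth )
            maxWidth = w;
    }
    // Make the poem slightly wider than its longest line.
    $("div.poem").width(maxWidth+10);
});
</script>
</head>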
<body>
<div class="poem">
...
</div>
</body>

Lines

Neither the <pre>-tag, which is used to format computer-code, nor the line-break tag, <br>, provides control over the indenting of specific lines. It is a common mistake of XML technicians to encode lines as "<lb/>", rather than <l>...</l>, to avoid common issues of overlap with other elements like <del> (deleted). Once a poetic line has been encoded using an empty XML-element like this it can only be translated into HTML's <br> tag, which incurs the problems just mentioned. Hence lines are better represented as <span>s within a stanza (a <p> or <div>) where white-space has been set to "pre". Lines of various indentations can then be represented by defining classes of spans such as <span class="line"> or <span class="line-indent1"> etc.
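Here is a rough sketch of how the lines of one stanza might be turned into classed spans. The function name stanzaToSpans is invented for the example, and the rule that every four extra leading spaces counts as one indent level is my assumption, not a fixed convention. The text is assumed to be HTML-escaped already.

```javascript
// Hypothetical helper: turn the lines of one stanza into <span>s whose
// class records how far each line is indented relative to the
// least-indented line of the stanza.
function stanzaToSpans(stanza, indentUnit) {
    var unit = indentUnit || 4;   // assumed: 4 spaces per indent level
    var lines = stanza.split("\n").filter(function (l) {
        return l.trim() !== "";
    });
    // Number of leading spaces on each line.
    var indents = lines.map(function (l) {
        return l.match(/^ */)[0].length;
    });
    var base = Math.min.apply(null, indents);
    return lines.map(function (line, i) {
        var level = Math.floor((indents[i] - base) / unit);
        var cls = level === 0 ? "line" : "line-indent" + level;
        return '<span class="' + cls + '">' + line.trim() + "</span>";
    }).join("\n");
}
```

The spans are joined with newlines, so with white-space set to "pre" on the stanza each span still starts on its own line, and the amount of indentation is left entirely to the CSS rules for line-indent1, line-indent2 and so on.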

Italics etc

For character formats ("phrasing content") you need to use classes, so <span class="underlined">that</span> can be styled to be in italics. If you use <i> or <em> you can't control the text appearance so well. For example, you might have stage-directions, foreign words etc that need different formatting.
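As an illustration, Markdown-style markers can be mapped onto classed spans with a couple of regular expressions. The function name formatPhrases and the class name "superscript" are made up for this sketch, and treating _..._ as "underlined" and ^...^ as superscript is an assumption about the input syntax.

```javascript
// Hypothetical helper: replace simple inline markers with classed spans.
function formatPhrases(text) {
    return text
        // _word_ becomes a span that CSS can render as italics.
        .replace(/_([^_\n]+)_/g, '<span class="underlined">$1</span>')
        // ^c^ becomes a span for raised letters ("superscript" is an assumed class name).
        .replace(/\^([^^\n]+)\^/g, '<span class="superscript">$1</span>');
}
```

Since the result is just a class name, the same mechanism would extend to stage-directions or foreign words by adding another marker and another class.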

Special characters

To get a really professional look simple typewriter codes like " and ' need to be translated into their curly equivalents. The same goes for dashes like ---, which becomes —.
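A minimal sketch of that translation follows. The function name curlify is invented, and the rules are deliberately crude: a quote mark counts as "opening" only at the start of the text or after white-space or an opening bracket (or, for double quotes, after a dash), and everything else is treated as closing or as an apostrophe. That happens to suit this poem, but a word like 'twas after a space would get the wrong mark.

```javascript
// Hypothetical helper: convert typewriter quotes and triple hyphens
// into their typographic equivalents.
function curlify(text) {
    return text
        .replace(/---/g, "\u2014")                    // --- to em dash
        .replace(/(^|[\s(\[\u2014])"/g, "$1\u201C")   // opening double quote
        .replace(/"/g, "\u201D")                      // remaining double quotes close
        .replace(/(^|[\s(\[])'/g, "$1\u2018")         // opening single quote
        .replace(/'/g, "\u2019");                     // apostrophes and closing quotes
}
```

The order matters: dashes are converted first so that a double quote following a dash can be recognised as an opening one.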

Putting it all together

The whole design looks like this. You can change the indents or define extra classes for a wider variety of indented lines; I use up to six. To copy the design, just use "view source" in your browser.

To Twank.¹

Ah! Daniel mine, some Muse malign
    Hath skimm’d thy judgments cream away
But take a slice of “good advice”—
    Even that I proffer thee today.

Again read Shakespear by the hour,—
    Read Milton more—McDonald less—
And Wordsworth for his simple power,
    Not for his namby-pamby-ness.

And know,—’twere better to esteem
    What’s best in Byron’s godless “Don”
Than with crude Browning much to dream,
    Or wire-draw through with Tennyson.

And better at the “woes of Moore”
    To shed the artifical tear,
Than doat with eunuch passion o’er
    The feeble beauties of De Vere.

The above poem was automatically formatted by simply translating a Markdown representation into HTML. This is what the user actually typed:

To Twank.
==========

    Ah! Daniel mine, some Muse malign
        Hath skimm'd thy judgments cream away
    But take a slice of "good advice"---
        Even _that_ I proffer thee today.

    Again read Shakespear by the hour,---
        Read Milton more---M^c^Donald less---
    And Wordsworth for his simple power,
        Not for his namby-pamby-ness.

    And know,---'twere better to esteem
        What's best in Byron's godless "Don"
    Than with crude Browning much to dream,
        Or wire-draw through with Tennyson.
    
    And better at the "woes of Moore"
        To shed the artifical tear,
    Than doat with eunuch passion o'er
        The _feeble_ beauties of De Vere.

Now I call that easy.

1 Charles Harpur, 'To Twank', Empire, 14 March 1860. 'Twank' was a nickname of Daniel Deniehy.
