Archive for October, 2009

Microdata vs. RDFa – What does it mean to us?

Wednesday, October 28th, 2009

Only today I became aware of microdata, the proposed way of embedding semantic annotations into HTML5. (Yes, they adopted the syntax that Michael also prefers for OMDoc, and which I personally hate, but I will get used to it.) Microdata are not to be confused with microformats, a poor man’s way of annotation that (ab)uses CSS classes and thus is compatible with HTML 4. Microdata are something like RDFa but

  1. are slightly easier to use for people who don’t understand XML namespaces
    • granted, RDFa’s excessive reliance on XML namespaces makes it hard to parse, and makes it unbearably complex to copy/paste a fragment, which is an important use case for HTML5
  2. allow for ad hoc pseudo-semantic markup when you do not use an ontology
    • What’s the point in annotating at all, then?
  3. compatible with the non-XML syntax of HTML5 (which should have been ditched IMHO, but, well, in the interest of reactionary users and software, they decided differently)

The fight for the future of RDFa in HTML is going on, but what does that mean to KWARC? We have incorporated RDFa into OMDoc as a means of extending the metadata vocabularies. RDFa, originally designed for XHTML, is prepared for being integrated into any XML language, including OMDoc. HTML5 microdata are an integral part of the HTML5 specification and would not work in other XML languages. OK, but we present OMDoc documents as HTML to make them human-readable. In this output, we want to preserve the semantics of the OMDoc markup, and for that we had always been thinking about using RDFa. (We know exactly how to do it, but just have not yet implemented that step, though.) We could use HTML5 microdata instead, but:

  1. RDFa has little software support so far, but microdata have none (beyond proofs of concept)
  2. We generate XML-compliant HTML. The non-XML syntax of HTML5 supports embedded MathML, but I doubt that it will support parallel OpenMath markup, where elements from yet another namespace are embedded into the MathML formulae.
  3. We generate HTML. The embedded annotations need not be authored manually, so they do not have to be easy to author.
  4. We are interested in using well-defined ontologies to express semantics, so we don’t need ad hoc “semantic” markup.

What do you think?

Readably and economically printing LNCS papers

Tuesday, October 20th, 2009

The LNCS format does not print nicely on A4, because the LNCS book pages are much smaller. However, most preprints, your own LNCS papers, and papers you get for reviewing are formatted for A4. Printing one page per sheet wastes a lot of paper for the wide margin, but when you print two pages per sheet you can hardly read the small text any more. Here is a fix:

pdfnup doc.pdf --nup 2x1 --trim "-6cm -6cm -6cm -6cm" --delta "-18cm -18cm" --scale 1.8

Update: Formatting your PDF right

It is even better if you already set the right paper format when creating your PDF. With appropriate printing settings, that gets the print right without the adjustments mentioned above, and it makes screen reading more convenient. Markus Kuhn explains how. However, his measurements didn’t work for me; instead of 92 112 523 778 I had to use 91 71 521 721.