Archive for the ‘JOMDoc’ Category

Microdata vs. RDFa – What does it mean to us?

Wednesday, October 28th, 2009

Only today I became aware of microdata, the proposed way of embedding semantic annotations into HTML5. (Yes, they adopted the syntax that Michael also prefers for OMDoc, and which I personally hate, but I will get used to it.) Microdata are not to be confused with microformats, a poor man’s way of annotation that (ab)uses CSS classes and thus is compatible with HTML 4. Microdata are something like RDFa but

  1. are slightly easier to use for people who don’t understand XML namespaces
    • granted, RDFa’s excessive reliance on XML namespaces makes it hard to parse, and makes it unbearably complex to copy/paste a fragment, which is an important use case for HTML5
  2. allow for ad hoc pseudo-semantic markup when you do not use an ontology
    • What’s the point in annotating at all, then?
  3. are compatible with the non-XML syntax of HTML5 (which should have been ditched IMHO, but, well, in the interest of reactionary users and software, they decided differently)
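To make the copy/paste point concrete, here is a minimal Python sketch that pulls the property names out of an RDFa fragment and a microdata fragment. Everything in it is my own illustration, not taken from either specification: the two fragments, the `http://example.org/Document` type, and the helper names `PropertyCollector`, `collect` and `expand` are hypothetical, and the microdata attribute names (`itemscope`, `itemtype`, `itemprop`) are assumed as they appear in later drafts of the proposal.

```python
from html.parser import HTMLParser

# Hypothetical fragments: the same title statement in both syntaxes.
RDFA = ('<div xmlns:dc="http://purl.org/dc/elements/1.1/" about="#doc">'
        '<span property="dc:title">OMDoc primer</span></div>')
MICRODATA = ('<div itemscope itemtype="http://example.org/Document">'
             '<span itemprop="title">OMDoc primer</span></div>')


class PropertyCollector(HTMLParser):
    """Collects namespace prefix declarations and property names."""

    def __init__(self):
        super().__init__()
        self.prefixes = {}    # prefix -> namespace URI (RDFa only)
        self.properties = []  # values of property / itemprop attributes

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
            elif name in ("property", "itemprop"):
                self.properties.append(value)


def collect(fragment):
    """Parse a fragment and return the filled collector."""
    parser = PropertyCollector()
    parser.feed(fragment)
    parser.close()
    return parser


def expand(curie, prefixes):
    """Expand a prefixed name such as 'dc:title' to a full URI.

    Raises KeyError when the prefix declaration is not in scope.
    """
    prefix, _, local = curie.partition(":")
    return prefixes[prefix] + local
```

If you copy only the inner `span` of the RDFa fragment, `collect` finds no prefix declaration and `expand` fails, whereas the microdata `span` carries its property name unchanged. That is the whole copy/paste argument in a nutshell.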

The fight for the future of RDFa in HTML is going on, but what does that mean to KWARC? We have incorporated RDFa into OMDoc as a means of extending the metadata vocabularies. RDFa, originally designed for XHTML, is prepared for being integrated into any XML language, including OMDoc. HTML5 microdata are an integral part of the HTML5 specification and would not work in other XML languages. OK, but we present OMDoc documents as HTML to make them human-readable. In this output, we want to preserve the semantics of the OMDoc markup, and for that we had always been thinking about using RDFa. (We know exactly how to do it, but have not yet implemented that step.) We could use HTML5 microdata instead, but:

  1. RDFa has little software support so far, but microdata have none (beyond proofs of concept)
  2. We generate XML-compliant HTML. The non-XML syntax of HTML5 supports embedded MathML, but I doubt that it will support parallel OpenMath markup, where elements from yet another namespace are embedded into the MathML formulae.
  3. We generate HTML. The embedded annotations need not be authored manually, so they do not have to be easy to author.
  4. We are interested in using well-defined ontologies to express semantics, so we don’t need ad hoc “semantic” markup.
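For readers who have not seen parallel markup (point 2 above), here is a minimal Python sketch of the structure I have in mind: a MathML `semantics` element whose first child is the presentation markup and whose `annotation-xml` child carries the OpenMath object in its own namespace. The function name `parallel_markup`, the `id` value `p1`, the `encoding="OpenMath"` value and the placement of `xref` on the OpenMath element are assumptions made for illustration; this is not what our generator actually emits.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
OPENMATH_NS = "http://www.openmath.org/OpenMath"


def parallel_markup(name):
    """Build parallel markup for a single variable (illustrative only)."""
    math = ET.Element(f"{{{MATHML_NS}}}math")
    semantics = ET.SubElement(math, f"{{{MATHML_NS}}}semantics")

    # Presentation branch: what the reader sees.
    mi = ET.SubElement(semantics, f"{{{MATHML_NS}}}mi", {"id": "p1"})
    mi.text = name

    # Content branch: OpenMath elements from a second namespace,
    # linked back to the presentation element (assumed xref convention).
    annotation = ET.SubElement(
        semantics, f"{{{MATHML_NS}}}annotation-xml", {"encoding": "OpenMath"}
    )
    omobj = ET.SubElement(annotation, f"{{{OPENMATH_NS}}}OMOBJ")
    ET.SubElement(omobj, f"{{{OPENMATH_NS}}}OMV", {"name": name, "xref": "p1"})
    return math


if __name__ == "__main__":
    print(ET.tostring(parallel_markup("x"), encoding="unicode"))
```

The point is that the serialized formula mixes two namespaces inside one `math` element, which is trivial in XML-compliant HTML but is exactly what I doubt the non-XML syntax will handle.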

What do you think?

Google likes us (2)

Wednesday, February 18th, 2009

I wanted to explain “parallel markup” to a colleague and was too lazy to look it up in the MathML specification, so I googled it. It turned out that KWARC ranks quite well on that topic, far ahead of the MathML spec. The first hit was a www-math mailing list thread following up a question about parallel markup that I once asked. A Trac ticket on parallel markup support in our JOMDoc library ranks #5.

Welcome to mmlkit!!!

Saturday, December 15th, 2007

mmlkit is a Java-based toolkit for building presentation engines for content markup formats for mathematics.

The next generation of content markup formats for mathematics comes with a flexible mechanism for defining mathematical notations (see e.g. the MathML3 Working Draft). mmlkit facilitates building presentation engines that use such notation definitions as a parameter.

The mmlkit toolkit currently has four components:

  • mmlproc – A library for transforming Content MathML/OpenMath into Presentation MathML.
  • collector – A library for collecting notation definitions from a document collection.
  • rndgrab – A library for selecting appropriate renderings for a content math object while considering the context and variant attributes of the rendering element.
  • general – A library for error handling and utility classes.

The initial framework of the toolkit (in particular the mmlproc library) was developed by Normen Müller and is currently maintained and extended by Christine Müller.

For more information please refer to the project webpage.