OpenMath CDs as Linked Data

Wednesday, June 30th, 2010

I am currently pursuing the integration of OpenMath Content Dictionaries (CDs) into the Web of Data. (Here is the agenda, which I will present and discuss at the upcoming OpenMath workshop.) The motivation is that mathematical knowledge is currently underrepresented on the Web of Data, but that it is needed for certain use cases, such as dealing in a reasonable way with all those numbers in statistical datasets published by governments.

Only now did I discover several blog posts, almost a year old, on the question of whether something called “Linked Data” must use RDF. In the proposed OpenMath setup, we will primarily publish the OpenMath CDs themselves according to the Linked Data principles. That works because the CDs and the symbols defined in them have URIs. The XML language in which the CDs are written is well known in the OpenMath community. It consists of a thin XML wrapper around the actual objects of interest, the so-called OpenMath objects, i.e. mathematical formulæ in a functional tree structure. When a web service wants to know how to compute, e.g., the Human Development Index of a country, assuming that the auxiliary data points LE, ALI, GEI and GDP are already known, it looks up the definition of the HDI symbol by its URI, e.g. http://example.org/statistics#hdi. It requests the CD as application/openmath+xml, locates the desired symbol, finds out that its definition is 1/3 (LE + 2/3 ALI + 1/3 GEI + GDP) (encoded as an OpenMath object), substitutes the values it knows for the parameters, and lets a computer algebra system do the computation.
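To make the last two steps concrete, here is a minimal Python sketch of substituting known values into the HDI definition and evaluating it. It is not the actual service. The OpenMath object below is my own hypothetical encoding of 1/3 (LE + 2/3 ALI + 1/3 GEI + GDP), embedded as a string instead of being fetched from http://example.org/statistics#hdi, and the tiny evaluator stands in for a real computer algebra system. It only understands the three symbols this one formula needs.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction

OM = "{http://www.openmath.org/OpenMath}"  # OpenMath 2 XML namespace

# Hypothetical encoding of the HDI definition; a real service would obtain
# this object from the CD it requested as application/openmath+xml.
HDI_DEFINITION = """<OMOBJ xmlns="http://www.openmath.org/OpenMath">
  <OMA>
    <OMS cd="arith1" name="times"/>
    <OMA><OMS cd="nums1" name="rational"/><OMI>1</OMI><OMI>3</OMI></OMA>
    <OMA>
      <OMS cd="arith1" name="plus"/>
      <OMV name="LE"/>
      <OMA>
        <OMS cd="arith1" name="times"/>
        <OMA><OMS cd="nums1" name="rational"/><OMI>2</OMI><OMI>3</OMI></OMA>
        <OMV name="ALI"/>
      </OMA>
      <OMA>
        <OMS cd="arith1" name="times"/>
        <OMA><OMS cd="nums1" name="rational"/><OMI>1</OMI><OMI>3</OMI></OMA>
        <OMV name="GEI"/>
      </OMA>
      <OMV name="GDP"/>
    </OMA>
  </OMA>
</OMOBJ>"""


def evaluate(node, env):
    """Evaluate an OpenMath object, substituting variable values from env."""
    tag = node.tag.replace(OM, "")
    if tag == "OMOBJ":  # wrapper element: evaluate its single child
        return evaluate(node[0], env)
    if tag == "OMI":  # integer literal
        return Fraction(int(node.text.strip()))
    if tag == "OMV":  # variable: substitute the known value
        return Fraction(str(env[node.get("name")]))
    if tag == "OMA":  # application: first child is the head symbol
        head, *args = list(node)
        key = (head.get("cd"), head.get("name"))
        values = [evaluate(arg, env) for arg in args]
        if key == ("arith1", "plus"):
            return sum(values, Fraction(0))
        if key == ("arith1", "times"):
            product = Fraction(1)
            for value in values:
                product *= value
            return product
        if key == ("nums1", "rational"):
            return values[0] / values[1]
        raise ValueError(f"unsupported symbol: {key}")
    raise ValueError(f"unsupported element: {tag}")


# Made-up index values, for illustration only.
known = {"LE": "0.9", "ALI": "0.99", "GEI": "0.9", "GDP": "0.9"}
hdi = evaluate(ET.fromstring(HDI_DEFINITION), known)
print(float(hdi))
```

The point of the sketch is only that everything needed for the computation is in the OpenMath object itself; no RDF is involved at any step.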

Thus, my answers to these previous blog posts are:

  • to Paul Miller’s “Does Linked Data need RDF?”: No, it does not. OpenMath CDs also work. Well, in principle, at least for entirely OpenMath-based application scenarios, as sketched above. For making a real contribution, the data should additionally be made available as RDF (which is no problem for us, we have the software for translation), so that RDF-based Linked Data applications don’t get stuck on a link saying, e.g., the function used to compute this entry of our dataset is http://www.openmath.org/cd/arith1#sum.
  • to Toby Inkster’s comment on that blog post: Yes, in principle we could convert a whole OpenMath CD to RDF. At the moment, I’m not doing this. I provide the complete structural outline of the CD (i.e. what symbols it contains, what metadata have been given for the CD and its symbols), but so far I have not implemented a translation of OpenMath objects to RDF. Why?
    1. There is no suitable RDF representation of the ordered tree nature of mathematical expressions. Several people have tried it (e.g. [1], [2]), but none of these representations have been adopted by the community, if they have been implemented at all.
    2. RDF-based reasoning engines don’t understand mathematical expressions. They don’t know, e.g., what a bound variable is, so even if we expressed a formula in RDF, it would be useless.
    3. Software that does understand mathematical expressions (e.g. a computer algebra system) can usually either process OpenMath, or a language for which translations from/to OpenMath have been implemented.

    Note that I have been thinking about what information from the OpenMath objects might reasonably be represented in RDF. In my own applications, I do make use of the information about what symbols occur in a formula (regardless of the depth at which they occur and the order), so I represent that information in the RDF I extract from OpenMath CDs. I have seen other applications that care about the symbol at the root of an expression, such as the “plus” in a+2b², so that could be represented in RDF as well. One could also think about applications that use the OpenMath objects in CDs obtaining them from the RDF representation of a CD, as XMLLiterals. (That could entirely replace the XML-based CD format without losing expressivity, but I’m sure the OpenMath community wouldn’t like that.)

  • to Ian Davis’ blog post: I do not agree with the idea that the term Linked Data may only be used together with RDF. I will continue to call what I’m doing with OpenMath “Linked Data”. However, being aware of the ubiquity of RDF and the software supporting it, I will also make RDF data available for the OpenMath CDs, so the difference is a philosophical one.
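The partial RDF extraction mentioned in my reply to Toby Inkster (which symbols occur in a formula, and which symbol sits at its root) can be sketched in a few lines of Python. This is an illustration, not my actual translation software. The vocabulary namespace http://example.org/vocab# and its properties containsSymbol and rootSymbol are hypothetical names made up for this example, as is the subject URI. The formula is a+2b², and symbol URIs are built as CD base, CD name, "#", symbol name.

```python
import xml.etree.ElementTree as ET

OM = "{http://www.openmath.org/OpenMath}"  # OpenMath 2 XML namespace
CDBASE = "http://www.openmath.org/cd"
VOCAB = "http://example.org/vocab#"  # hypothetical vocabulary, for illustration

# The formula a + 2b^2 as an OpenMath object.
FORMULA = """<OMOBJ xmlns="http://www.openmath.org/OpenMath">
  <OMA>
    <OMS cd="arith1" name="plus"/>
    <OMV name="a"/>
    <OMA>
      <OMS cd="arith1" name="times"/>
      <OMI>2</OMI>
      <OMA>
        <OMS cd="arith1" name="power"/>
        <OMV name="b"/>
        <OMI>2</OMI>
      </OMA>
    </OMA>
  </OMA>
</OMOBJ>"""


def symbol_uri(oms):
    """Build the URI of the symbol referenced by an OMS element."""
    return f"{CDBASE}/{oms.get('cd')}#{oms.get('name')}"


def symbols_in(obj):
    """All symbols occurring in the object, regardless of depth and order."""
    return {symbol_uri(oms) for oms in obj.iter(OM + "OMS")}


def root_symbol(obj):
    """The symbol at the root of the expression, or None if there is none."""
    top = obj[0]
    if top.tag == OM + "OMS":
        return symbol_uri(top)
    if top.tag == OM + "OMA" and len(top) > 0 and top[0].tag == OM + "OMS":
        return symbol_uri(top[0])
    return None


def to_ntriples(subject, obj):
    """Serialize the extracted information as N-Triples lines."""
    lines = [
        f"<{subject}> <{VOCAB}containsSymbol> <{uri}> ."
        for uri in sorted(symbols_in(obj))
    ]
    root = root_symbol(obj)
    if root is not None:
        lines.append(f"<{subject}> <{VOCAB}rootSymbol> <{root}> .")
    return lines


formula = ET.fromstring(FORMULA)
triples = to_ntriples("http://example.org/statistics#example", formula)
print("\n".join(triples))
```

Note what gets lost: the triples say that plus, times and power occur and that plus is at the root, but the ordered tree itself stays in the OpenMath XML.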

And a general remark for the RDF community: Most OpenMath users don’t care. The OpenMath community is conservative, and it has tools that work with the OpenMath knowledge model and its concrete XML representation. In fact, both communities are quite similar. Both have their own standard, with useful applications, and they say: “Why should we need any other knowledge model or format? OpenMath/RDF is fine for us. We won’t use RDF/OpenMath. But of course we’d appreciate it if you could come up with another real-world use case that uses OpenMath/RDF and shows its superiority.” (BTW, I would be interested in feedback from other communities whose original data you have published as RDF Linked Data. What attitudes do they have?)