Archive for the ‘OpenMath’ Category

TEI Guidelines mention MathML, OpenMath, and OMDoc

Saturday, July 31st, 2010

Someone in the humanities must be interested in OMDoc. I was really surprised to find a reference to OMDoc in the section “Formulæ and Mathematical Expressions” of the guidelines (a.k.a. specification) for TEI. TEI (Text Encoding Initiative) is the standard semantic markup language for the humanities, social sciences and linguistics, much like DocBook for technical manuals. All that TEI itself has is an element <formula notation="…"/>, where notation refers to the language in which the formula is represented. But the guidelines refer to some mathematical markup languages, from which the document author is asked to “make an informed choice”:

  • TeX – the obvious candidate, also used in some examples
  • MathML – the obvious candidate when XML is desired. They give one Presentation MathML example but also mention Content MathML.
  • OpenMath – much less expected. Nice to see that here. On the other hand, the links to the OpenMath standard are outdated. I should probably report that.
  • OMDoc – I didn’t expect that at all.
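
As a minimal sketch of what the notation attribute buys a consumer, the following Python snippet reads every formula element from a TEI fragment and reports which language it is written in. The sample fragment and the notation values "TeX" and "MathML" are assumptions made for illustration; only the element name, the attribute name and the TEI namespace come from TEI itself.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical TEI fragment: one TeX formula and one (empty) MathML formula.
SAMPLE = f"""
<p xmlns="{TEI_NS}">
  <formula notation="TeX">E = mc^2</formula>
  <formula notation="MathML"/>
</p>
"""

def formula_notations(tei_xml: str) -> list[tuple[str, str]]:
    """Return (notation, text content) for every <formula> element."""
    root = ET.fromstring(tei_xml)
    result = []
    for formula in root.iter(f"{{{TEI_NS}}}formula"):
        notation = formula.get("notation", "")
        text = (formula.text or "").strip()
        result.append((notation, text))
    return result

print(formula_notations(SAMPLE))
```

A real processor would dispatch on the notation value, e.g. hand TeX content to a TeX renderer and MathML content to an XML pipeline.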

OpenMath CDs as Linked Data

Wednesday, June 30th, 2010

I am currently pursuing the integration of OpenMath Content Dictionaries (CDs) into the Web of Data. (Here is the agenda, which I will present and discuss at the upcoming OpenMath workshop.) The motivation is that mathematical knowledge is currently underrepresented on the Web of Data, but that it is needed for certain use cases, such as dealing in a reasonable way with all those numbers in statistical datasets published by governments.

Only now did I discover several blog posts, almost a year old, on the question of whether something that is called “Linked Data” must use RDF. In the proposed OpenMath setup, we will primarily publish the OpenMath CDs themselves according to the Linked Data principles. That works, because the CDs and the symbols defined in them have URIs. The XML language in which the CDs are written is well known in the OpenMath community. It consists of a thin XML wrapper around the actual objects of interest, the so-called OpenMath objects, i.e. mathematical formulæ in a functional tree structure. When a web service wants to know how to compute, e.g., the Human Development Index of a country, assuming that the auxiliary data points LE, ALI, GEI and GDP are already known, it looks up the definition of the HDI symbol by its URI, e.g. http://example.org/statistics#hdi. It would request the CD as application/openmath+xml, locate the desired symbol, find out that its definition is 1/3 (LE + 2/3 ALI + 1/3 GEI + GDP) (encoded as an OpenMath object), substitute the values it knows for the parameters, and let a computer algebra system do the computation.
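
The last two steps (substitute the known values, then compute) can be sketched in a few lines of Python. This is only an illustration under several assumptions: the definition is embedded as a string instead of being fetched from a CD, the encoding of the formula with arith1 symbols is my own guess at how such a definition might look, the tiny evaluator stands in for a real computer algebra system, and the input values are made up.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction
from functools import reduce

OM = "{http://www.openmath.org/OpenMath}"

# Hypothetical definition of the HDI symbol as an OpenMath object:
# 1/3 * (LE + 2/3 * ALI + 1/3 * GEI + GDP)
HDI_DEFINITION = """
<OMOBJ xmlns="http://www.openmath.org/OpenMath">
  <OMA><OMS cd="arith1" name="times"/>
    <OMA><OMS cd="arith1" name="divide"/><OMI>1</OMI><OMI>3</OMI></OMA>
    <OMA><OMS cd="arith1" name="plus"/>
      <OMV name="LE"/>
      <OMA><OMS cd="arith1" name="times"/>
        <OMA><OMS cd="arith1" name="divide"/><OMI>2</OMI><OMI>3</OMI></OMA>
        <OMV name="ALI"/>
      </OMA>
      <OMA><OMS cd="arith1" name="times"/>
        <OMA><OMS cd="arith1" name="divide"/><OMI>1</OMI><OMI>3</OMI></OMA>
        <OMV name="GEI"/>
      </OMA>
      <OMV name="GDP"/>
    </OMA>
  </OMA>
</OMOBJ>
"""

# Only the three arith1 symbols needed here; a CAS would know many more.
OPERATORS = {
    ("arith1", "plus"): lambda args: sum(args, Fraction(0)),
    ("arith1", "times"): lambda args: reduce(lambda a, b: a * b, args, Fraction(1)),
    ("arith1", "divide"): lambda args: args[0] / args[1],
}

def evaluate(node, env):
    """Evaluate an OpenMath object, taking variable values from env."""
    tag = node.tag.removeprefix(OM)
    if tag == "OMOBJ":
        return evaluate(node[0], env)
    if tag == "OMI":
        return Fraction(int(node.text))
    if tag == "OMV":
        return Fraction(env[node.get("name")])
    if tag == "OMA":
        head, *args = list(node)
        op = OPERATORS[(head.get("cd"), head.get("name"))]
        return op([evaluate(arg, env) for arg in args])
    raise ValueError(f"unsupported element: {tag}")

# Made-up index values for one country.
values = {
    "LE": Fraction(4, 5),
    "ALI": Fraction(9, 10),
    "GEI": Fraction(3, 5),
    "GDP": Fraction(7, 10),
}
print(evaluate(ET.fromstring(HDI_DEFINITION), values))
```

Exact fractions are used so that the result does not depend on floating-point rounding; a service working with published statistics would more likely use decimals or floats.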

Thus, my answers to these previous blog posts are:

  • to Paul Miller’s “Does Linked Data need RDF?”: No, it does not. OpenMath CDs also work. Well, in principle, at least for entirely OpenMath-based application scenarios, as sketched above. For making a real contribution, the data should additionally be made available as RDF (which is no problem for us, we have the software for translation), so that RDF-based Linked Data applications don’t get stuck on a link saying, e.g., the function used to compute this entry of our dataset is http://www.openmath.org/cd/arith1#sum.
  • to Toby Inkster’s comment on that blog post: Yes, in principle we could convert a whole OpenMath CD to RDF. At the moment, I’m not doing this. I provide the complete structural outline of the CD (i.e. what symbols it contains, what metadata have been given for the CD and its symbols), but so far I have not implemented a translation of OpenMath objects to RDF. Why?
    1. There is no suitable RDF representation of the ordered tree nature of mathematical expressions. Several people have tried it (e.g. [1], [2]), but none of these representations have been adopted by the community, if they have been implemented at all.
    2. RDF-based reasoning engines don’t understand mathematical expressions. They don’t know, e.g., what a bound variable is, so even if we expressed a formula in RDF, it would be useless.
    3. Software that does understand mathematical expressions (e.g. a computer algebra system) can usually either process OpenMath, or a language for which translations from/to OpenMath have been implemented.

    Note that I have been thinking about what information from the OpenMath objects might reasonably be represented in RDF. In my own applications, I do make use of the information about what symbols occur in a formula (regardless of the depth at which they occur and the order), so I represent that information in the RDF I extract from OpenMath CDs. I have seen other applications that care about the symbol at the root of an expression, such as the “plus” in a+2b², so that could as well be represented in RDF. One could also think about applications making use of OpenMath objects in CDs obtaining them from the RDF representation of a CD, as XMLLiterals. (That could entirely replace the XML-based CD format without losing expressivity, but I’m sure the OpenMath community wouldn’t like that.)

  • to Ian Davis’ blog post: I do not agree with the idea that the term Linked Data may only be used together with RDF. I will continue to call what I’m doing with OpenMath “Linked Data”. However, being aware of the ubiquity of RDF and the software supporting it, I will also make RDF data available for the OpenMath CDs, so the difference is a philosophical one.
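
The two kinds of information mentioned in the reply to Toby Inkster (which symbols occur in a formula, and which symbol sits at its root) can be extracted with very little code. The following Python sketch does this for a+2b² and prints N-Triples. The vocabulary under http://example.org/om-rdf# and the formula URI are hypothetical, not an existing ontology, and the sketch ignores that cdbase can be inherited from ancestor elements.

```python
import xml.etree.ElementTree as ET

OM = "{http://www.openmath.org/OpenMath}"
DEFAULT_CDBASE = "http://www.openmath.org/cd"
# Hypothetical vocabulary; not an existing ontology.
VOCAB = "http://example.org/om-rdf#"

# a + 2 * b^2 as an OpenMath object
EXPRESSION = """
<OMOBJ xmlns="http://www.openmath.org/OpenMath">
  <OMA><OMS cd="arith1" name="plus"/>
    <OMV name="a"/>
    <OMA><OMS cd="arith1" name="times"/>
      <OMI>2</OMI>
      <OMA><OMS cd="arith1" name="power"/><OMV name="b"/><OMI>2</OMI></OMA>
    </OMA>
  </OMA>
</OMOBJ>
"""

def symbol_uri(oms):
    """Build cdbase/cd#name; cdbase is read from the OMS element only."""
    cdbase = oms.get("cdbase", DEFAULT_CDBASE)
    return f"{cdbase}/{oms.get('cd')}#{oms.get('name')}"

def triples(formula_uri, omobj_xml):
    """Return sorted N-Triples lines describing symbol occurrences."""
    root = ET.fromstring(omobj_xml)
    lines = set()
    # Every symbol, regardless of depth and order.
    for oms in root.iter(OM + "OMS"):
        lines.add(f"<{formula_uri}> <{VOCAB}containsSymbol> <{symbol_uri(oms)}> .")
    # The symbol at the root, if the object is an application of a symbol.
    top = root[0]
    if top.tag == OM + "OMA" and top[0].tag == OM + "OMS":
        lines.add(f"<{formula_uri}> <{VOCAB}rootSymbol> <{symbol_uri(top[0])}> .")
    return sorted(lines)

for line in triples("http://example.org/formulas#f1", EXPRESSION):
    print(line)
```

Note that this deliberately throws away the tree structure, which is exactly the part for which no accepted RDF representation exists.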

And a general remark for the RDF community: Most OpenMath users don’t care. The OpenMath community is conservative, and it has tools that work with the OpenMath knowledge model and its concrete XML representation. In fact, both communities are quite similar. Both have their own standard, with useful applications, and they say: “Why should we need any other knowledge model or format? OpenMath/RDF is fine for us. We won’t use RDF/OpenMath. But of course we’d appreciate it if you could come up with another real-world use case that uses OpenMath/RDF and shows its superiority.” (BTW, I would be interested in feedback from other communities whose original data you have published as RDF Linked Data. What attitudes do they have?)

Integrating Presentation into OMBase

Monday, March 24th, 2008

I have just been reading up on REST again, since I found a very palatable pair of articles (REST intro, and practices). This got me thinking about the state of OMBase, and the integration of our presentation pipeline into the OMBase interface. The interface is RESTful, since we have implemented MMT addressing via URIs: you just use a GET to retrieve what a URI addresses.

What I have talked about with Florian, but maybe not with the OMBase team, is how to integrate presentation. That should be very simple from the interface point of view: we just take the same URLs, but send a different HTTP Accept header.

GET /arith1/lcm
Host: cds.omdoc.org
Accept: application/omdoc+xml

gives you the OMDoc file and

GET /arith1/lcm
Host: cds.omdoc.org
Accept: application/xhtml+xml

gives you the presented version (with the standard options). Now, we have written a paper about presentation and submitted it to MKM, and Christine has spent a lot of ingenuity on defining user options for the presentation process. This should be easy to integrate with the URI query interface:

GET /arith1/lcm?ext=foo.ntn&int=lang:ntn;style:physics
Host: cds.omdoc.org
Accept: application/xhtml+xml

That should do the trick.
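
From the client side, the whole interface boils down to one URL plus an Accept header and optional query parameters. Here is a minimal Python sketch that builds (but does not send) such requests. The http scheme, the helper name build_request and the parameter names ext and int are assumptions read off the example requests; the eventual OMBase interface may differ.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://cds.omdoc.org"  # scheme is an assumption

def build_request(path, accept, options=None):
    """Build (but do not send) a GET request for one representation."""
    url = BASE + path
    if options:
        # Keep ':' and ';' readable, as in the example request.
        url += "?" + urlencode(options, safe=":;")
    return Request(url, headers={"Accept": accept}, method="GET")

# Same URL, two representations.
omdoc = build_request("/arith1/lcm", "application/omdoc+xml")
xhtml = build_request(
    "/arith1/lcm",
    "application/xhtml+xml",
    {"ext": "foo.ntn", "int": "lang:ntn;style:physics"},
)
print(omdoc.full_url, omdoc.get_header("Accept"))
print(xhtml.full_url, xhtml.get_header("Accept"))
```

Passing either object to urllib.request.urlopen would send it; that step is left out here because it depends on the server actually being deployed.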

A radical new referencing scheme for Openmath and MathML (and OMDoc)

Monday, July 2nd, 2007

We are thinking about how to reference theory-constitutive elements in content dictionaries. We had distinguished “reference by location” (via usual URIs) and “reference by context” (via the OMDoc theories and their constitutive elements) in OMDoc 1.1. It was very hard to explain the latter, and the encoding was a little weird, so I dropped it again from OMDoc 1.2. But the concept is valid and important, so here we go again.

This topic is important, since we are thinking about OpenMath3 and we are adding CDs in MathML3. And I guess that it will be quite a while until we can change these two again, so we had better get it right. Moreover, the referencing scheme had better be compatible with those two.

Here is the idea: we have nested theories in OMDoc1.2, and we need to reference symbols from them. Now, symbols are referenced by their name, which need not be document-unique (and we do not want to require that, since we want to compose theories in documents). That is why their names have three components: the theory name (cd name in OM), which is document-unique at least in OMDoc; the symbol name, which is theory/cd-unique; and, to disambiguate, a URI for the cd in the cdbase attribute.

We would like to generalize that: in OMDoc1.8, theory names should only be unique in their context (which might be the document context or a theory). So far so good, but then we need a path-like referencing scheme at least for the cd names. So we can really combine them in one path/URI as described in a post on MathML/OM referencing.
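
To make the path idea concrete, here is a small Python sketch of how such a path could be resolved when theory names are only unique within their context. The nested dictionary is a stand-in for nested theories, and the theory names in it (algebra, group, ring) are invented for the example.

```python
# Hypothetical nested theories: each theory maps local names to either
# a symbol marker or a nested theory (another dict).
SYMBOL = object()

DOCUMENT = {
    "algebra": {
        "group": {"op": SYMBOL, "unit": SYMBOL},
        "ring": {"plus": SYMBOL, "times": SYMBOL},
    },
    "nat": {"nat": SYMBOL, "zero": SYMBOL},
}

def resolve(path, context):
    """Follow a slash-separated path through nested theories.

    Returns "symbol" or "theory" depending on what the path names,
    and raises KeyError if a step is not defined in its context.
    """
    node = context
    for step in path.split("/"):
        if not isinstance(node, dict) or step not in node:
            raise KeyError(f"{step!r} is not defined on path {path!r}")
        node = node[step]
    return "symbol" if node is SYMBOL else "theory"

# Absolute path from the document context ...
print(resolve("algebra/group/op", DOCUMENT))
# ... and the same symbol relative to the enclosing theory.
print(resolve("group/op", DOCUMENT["algebra"]))
```

The point of the sketch is that each name only has to be unique among its siblings; uniqueness of the full path follows from the nesting.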

The next step in OMDoc would be to allow any content element to be theory-like, and allow it to import. Here is a somewhat extreme example of what we would be able to do.

<!-- all statements are theories, so this is also one -->
<symbol name="nat"/>

<!-- this symbol declaration imports from theory "nat" -->
<symbol name="zero">
  <imports from="nat"/>
  <type><csymbol pref="nat/nat"/></type>
</symbol>

<!-- this one also needs a function type, so we import it -->
<symbol name="suc">
  <imports from="nat"/>
  <imports from="simple-types"/>
  <type>
    <apply>
      <csymbol pref="simple-types/funtype"/>
      <csymbol pref="nat/nat"/>
      <csymbol pref="nat/nat"/>
    </apply>
  </type>
</symbol>

<!-- the third Peano Axiom (1&2 are about types) is only about suc -->
<axiom name="peano3">
  <imports from="suc"/>
  <imports from="quant1"/>
  <bind>
    <csymbol pref="quant1/forall"/>
    <bvar><ci>a</ci><ci>b</ci></bvar>
    <apply>
      <iff/>
      <apply><eq/><ci>a</ci><ci>b</ci></apply>
      <apply><eq/>
        <apply><csymbol pref="suc/suc"/><ci>a</ci></apply>
        <apply><csymbol pref="suc/suc"/><ci>b</ci></apply>
      </apply>
    </apply>
  </bind>
</axiom>
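
One consequence of letting every statement import on its own is that a tool can check, per statement, whether each referenced theory has actually been imported. A minimal Python sketch of such a check follows. It assumes that imports are written as <imports from="…"/> and that the first path segment of a pref names the imported theory; the <statements> root element is hypothetical and only there so that the fragment parses as one document. The quant1 import is left out of the axiom on purpose.

```python
import xml.etree.ElementTree as ET

# Two statements from the example, wrapped in a hypothetical root element
# so that the fragment parses as one XML document.
FRAGMENT = """
<statements>
  <symbol name="zero">
    <imports from="nat"/>
    <type><csymbol pref="nat/nat"/></type>
  </symbol>
  <axiom name="peano3">
    <imports from="suc"/>
    <bind>
      <csymbol pref="quant1/forall"/>
      <apply><csymbol pref="suc/suc"/><ci>a</ci></apply>
    </bind>
  </axiom>
</statements>
"""

def missing_imports(xml_text):
    """Map each statement name to the theories it references but does not import."""
    problems = {}
    for statement in ET.fromstring(xml_text):
        imported = {imp.get("from") for imp in statement.iter("imports")}
        used = {cs.get("pref").split("/")[0] for cs in statement.iter("csymbol")}
        missing = sorted(used - imported)
        if missing:
            problems[statement.get("name")] = missing
    return problems

print(missing_imports(FRAGMENT))
```

With the full example above, where peano3 imports both suc and quant1, the check would report nothing.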

Referencing symbols in OpenMath and MathML

Monday, July 2nd, 2007

We are currently working on an aligned OpenMath/cMathML model for mathematical objects, based on the model for OpenMath objects. This will go into the MathML3 and OpenMath3 specifications due in spring. Afterwards we will not be able to change much for a long time, I expect, so we had better get this one right.

There has been some discussion about the OpenMath referencing triplet: a symbol (OMS in OM) has three attributes, a name, a cd, and a cdbase; e.g. the symbol for addition might be
<OMS cdbase="http://openmath.org/cds" cd="arith1" name="plus"/>

The cdbase and cd attributes determine a content dictionary (in this case the file http://openmath.org/cds/arith1.ocd) and the name attribute determines a symbol declaration in it (the name must be cd-unique).
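
The mapping from the triplet to concrete addresses is mechanical, as the following Python sketch shows. The class and method names are made up for this post; the .ocd file name follows the example above, and the fragment form cdbase/cd#name is the way OpenMath 2 canonically turns a triplet into a URI.

```python
from typing import NamedTuple

class SymbolRef(NamedTuple):
    """The OpenMath referencing triplet of an OMS element."""
    cdbase: str
    cd: str
    name: str

    def cd_file(self) -> str:
        # Location of the content dictionary file, as in the example above.
        return f"{self.cdbase.rstrip('/')}/{self.cd}.ocd"

    def symbol_uri(self) -> str:
        # Fragment-style URI of the symbol: cdbase/cd#name.
        return f"{self.cdbase.rstrip('/')}/{self.cd}#{self.name}"

plus = SymbolRef("http://openmath.org/cds", "arith1", "plus")
print(plus.cd_file())
print(plus.symbol_uri())
```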

In MathML3 we want to follow the same general model, but have the definitionURL attribute for specifying meaning. Here we would use the URL http://openmath.org/cds/arith1#plus currently. There was some discussion whether we should have one big CD for MathML or many small ones, … Sam Dooley remarked that if we were to use the OM triplet, then he would like to treat the cd attribute like a cdbase now, which inherits…, then we could write <apply cd="mathml">...<csymbol name="plus"/> ...</apply> (especially if we had one big CD for all MathML, then we could make the cd="mathml" a default on the <math> element…). Frankly I find this quite attractive (after having thought about it).
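
An inheriting cd attribute is easy to process: walk the tree and carry the nearest cd value down to each csymbol. The Python sketch below illustrates this reading of Sam Dooley's suggestion; the sample markup, including the inner cd="transc1" override, is my own and not taken from any draft.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: a default cd on <math>, overridden on one <apply>.
SAMPLE = """
<math cd="mathml">
  <apply>
    <csymbol name="plus"/>
    <ci>a</ci>
    <apply cd="transc1">
      <csymbol name="sin"/>
      <ci>b</ci>
    </apply>
  </apply>
</math>
"""

def effective_cds(element, inherited=None):
    """Yield (cd, name) for each csymbol, inheriting cd from the nearest ancestor."""
    current = element.get("cd", inherited)
    if element.tag == "csymbol":
        yield current, element.get("name")
    for child in element:
        yield from effective_cds(child, current)

print(list(effective_cds(ET.fromstring(SAMPLE))))
```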

I would like to take this idea a little further in MathML3: like MathML2, we use a single URI-type attribute for symbol referencing; let’s call it pref (path ref; just to distinguish it from definitionURL for this post. It could in the end become definitionURL to keep backwards compatibility; after all, MathML does not say what kind of URLs definitionURL should be; convenient).

So if we use pref attributes on csymbols and take xml:base into the picture, we can write

<csymbol pref="http://openmath.org/cds/arith1/plus"/>

<math xml:base="http://openmath.org/cds/">.... <csymbol pref="arith1/plus"/>...</math>

and even

<math xml:base="http://openmath.org/cds/">
<apply xml:base="arith1/">
<csymbol pref="plus"/>...
</apply>
</math>
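
Whether the three spellings really denote the same URI can be checked with standard relative-reference resolution (RFC 3986), which is what xml:base uses. A small Python sketch with urllib.parse.urljoin follows; the helper name resolve_pref is made up. One detail worth noting: a nested base only adds a path level if it ends in a slash, because otherwise its last segment is replaced by the next relative reference.

```python
from urllib.parse import urljoin

def resolve_pref(bases, pref):
    """Resolve pref against a chain of xml:base values, outermost first."""
    uri = ""
    for base in bases:
        uri = urljoin(uri, base)
    return urljoin(uri, pref)

# The three spellings from above.
print(resolve_pref([], "http://openmath.org/cds/arith1/plus"))
print(resolve_pref(["http://openmath.org/cds/"], "arith1/plus"))
print(resolve_pref(["http://openmath.org/cds/", "arith1/"], "plus"))

# Without the trailing slash the inner base is replaced, not extended.
print(resolve_pref(["http://openmath.org/cds/", "arith1"], "plus"))
```

The first three calls yield the same URI; the last one resolves to a URI directly under cds/, which is probably not what one wants.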

This would make a very simple framework. All the URIs can be used for RESTful access to the relevant features (symbol declarations in the CDs), and relative URIs work as expected. And if we write content dictionaries in a somewhat atomic way, then we can even supply them on a static web server. It would be quite simple to configure Apache so that it generates the right files; for instance, in the directory …/arith1 we could have ocd.php with the CD skeleton and inclusions for the symbol declarations, which are represented as files in the directory, e.g. arith1/plus.

That would make it quite simple to set up a structure that would make the cds meaningful.