Archive for the ‘OMDoc’ Category

Referencing symbols in OpenMath and MathML

Monday, July 2nd, 2007

We are currently working on an aligned OpenMath/cMathML model for mathematical objects, based on the model for OpenMath objects. This will go into the MathML3 and OpenMath3 specifications due in spring. Afterwards we will not be able to change much for a long time, I expect, so we had better get this one right.

There has been some discussion about the OpenMath referencing triplet: a symbol (OMS in OM) has three attributes, a name, a cd, and a cdbase. For example, the symbol for addition might be
<OMS cdbase="http://openmath.org/cds" cd="arith1" name="plus"/>

The cdbase and cd attributes together determine a content dictionary (in this case the file http://openmath.org/cds/arith1.ocd), and the name attribute picks out a symbol declaration in it (so the name must be unique within the CD).
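To make the lookup concrete, here is a minimal Python sketch of how the cdbase and cd could be turned into the location of the content dictionary. The function name cd_file_url is made up for this post, and the .ocd suffix is simply read off the arith1 example; I am not claiming it is a normative rule.

```python
def cd_file_url(cdbase: str, cd: str) -> str:
    """Return the URL of the content dictionary file for a (cdbase, cd) pair.

    Assumption: the CD lives at <cdbase>/<cd>.ocd, as in the arith1 example.
    """
    return cdbase.rstrip("/") + "/" + cd + ".ocd"


# The triplet from the <OMS> element for addition.
triplet = {"cdbase": "http://openmath.org/cds", "cd": "arith1", "name": "plus"}
print(cd_file_url(triplet["cdbase"], triplet["cd"]))
```

The name attribute is not needed to find the file; it only selects a declaration inside it.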

In MathML3 we want to follow the same general model, but have the definitionURL attribute for specifying meaning. Here we would currently use the URL http://openmath.org/cds/arith1#plus. There was some discussion whether we should have one big CD for MathML or many small ones, … Sam Dooley remarked that if we were to use the OM triplet, he would like to treat the cd attribute like a cdbase that inherits…; then we could write <apply cd="mathml">...<csymbol name="plus"/> ...</apply> (especially if we had one big CD for all of MathML, we could make cd="mathml" a default on the <math> element…). Frankly, I find this quite attractive (after having thought about it).
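The inheritance idea can be sketched in a few lines of Python: walk the tree and let every csymbol pick up the cd of its nearest ancestor that has one. This only illustrates the proposal; the cd and name attributes on MathML elements are hypothetical (they are what is being discussed, not existing MathML), and resolve_symbols is a name I made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: cd on <math> acts as a default, an inner cd overrides it.
SAMPLE = """
<math cd="mathml">
  <apply>
    <csymbol name="plus"/>
    <apply cd="arith1">
      <csymbol name="times"/>
    </apply>
  </apply>
</math>
"""


def resolve_symbols(element, inherited_cd=None):
    """Yield (cd, name) for every csymbol, inheriting cd from the nearest ancestor."""
    cd = element.get("cd", inherited_cd)
    if element.tag == "csymbol":
        yield cd, element.get("name")
    for child in element:
        yield from resolve_symbols(child, cd)


symbols = list(resolve_symbols(ET.fromstring(SAMPLE)))
print(symbols)
```

Here the first csymbol ends up in the mathml CD and the second one in arith1, because the inner apply overrides the default.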

I would like to take this idea a little further in MathML3: like MathML2, we use a single URI-typed attribute for symbol referencing; let’s call it pref (path ref). The new name is just to distinguish it from definitionURL for this post; in the end it could become definitionURL to keep backwards compatibility. After all, MathML does not say what kind of URLs definitionURL should hold, which is convenient.

So if we use pref attributes on csymbols and take xml:base into the picture, we can write

<csymbol pref="http://openmath.org/cds/arith1/plus"/>

<math xml:base="http://openmath.org/cds/">.... <csymbol pref="arith1/plus"/>...</math>

and even

<math xml:base="http://openmath.org/cds/">
<apply xml:base="arith1/">
<csymbol pref="plus"/>...
</apply>
</math>
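How the nested bases combine can be checked with a short Python sketch that resolves every pref against the xml:base values of its ancestors, using the standard relative-reference rules. Two caveats: pref is still just my working name, and the sketch writes the inner base with a trailing slash (arith1/), because under those rules a base of arith1 without the slash would have its last path segment replaced rather than extended. The helper name resolve_prefs is made up.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the predefined XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

SAMPLE = """
<math xml:base="http://openmath.org/cds/">
  <apply xml:base="arith1/">
    <csymbol pref="plus"/>
  </apply>
</math>
"""


def resolve_prefs(element, base=""):
    """Yield the absolute URI for each pref attribute, honouring nested xml:base."""
    base = urljoin(base, element.get(XML_BASE, ""))
    if "pref" in element.attrib:
        yield urljoin(base, element.get("pref"))
    for child in element:
        yield from resolve_prefs(child, base)


resolved = list(resolve_prefs(ET.fromstring(SAMPLE)))
print(resolved)

# Without the trailing slash the last segment of the base is dropped.
no_slash = urljoin("http://openmath.org/cds/arith1", "plus")
print(no_slash)
```

So all three spellings can denote the same symbol URI, as long as the bases end in a slash.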

This would make a very simple framework. All the URIs can be used for RESTful access to the relevant resources (symbol declarations in the CDs), and relative URIs work as expected. And if we write content dictionaries in a somewhat atomic way, we can even supply them from a static web server. It would be quite simple to configure Apache so that it generates the right files: for instance, in the directory …/arith1 we could have ocd.php with the CD skeleton and inclusions for the symbol declarations, which are represented as files in the directory, e.g. arith1/plus.
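To illustrate the assembly step, here is a rough sketch, in Python rather than PHP, of what such a skeleton script might do: read the per-symbol files in a CD directory and wrap them in a CD skeleton. Everything in it is an assumption for illustration: the function name assemble_cd, the one-file-per-symbol layout, and the bare <CD>/<CDName> wrapper, which is a simplified stand-in for a real CD header.

```python
from pathlib import Path
import tempfile


def assemble_cd(cd_dir: Path) -> str:
    """Concatenate the per-symbol files in cd_dir into one CD document.

    Assumption: every regular file in the directory holds one symbol
    declaration as an XML fragment; the wrapper is a simplified skeleton.
    """
    parts = [f"<CD>\n<CDName>{cd_dir.name}</CDName>"]
    for symbol_file in sorted(p for p in cd_dir.iterdir() if p.is_file()):
        parts.append(symbol_file.read_text(encoding="utf-8").strip())
    parts.append("</CD>")
    return "\n".join(parts)


# Demo with a throwaway directory standing in for the arith1 directory.
with tempfile.TemporaryDirectory() as tmp:
    arith1 = Path(tmp) / "arith1"
    arith1.mkdir()
    (arith1 / "plus").write_text(
        "<CDDefinition><Name>plus</Name></CDDefinition>", encoding="utf-8")
    (arith1 / "times").write_text(
        "<CDDefinition><Name>times</Name></CDDefinition>", encoding="utf-8")
    cd_text = assemble_cd(arith1)

print(cd_text)
```

The point is only that the whole CD and the individual declarations can come from the same set of small files.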

That would make it quite simple to set up a structure that makes the CDs meaningful.

OMDoc Versions

Monday, July 2nd, 2007

I am surprised that it is already almost a year since OMDoc 1.2 appeared, and we have to do something.

We have been thinking about the future of OMDoc version numbering. Currently OMDoc is at 1.2, and we have collected a lot of ideas for 2.0. Some of them have been presented at the MKM 2007 workshops and have met with friendly comments. Somehow a direct push towards 2.0 seems scary, and it may be better to use the summer to make a revision 1.3 with these new ideas, bring it out, and move on to the next set of ideas and improvements.

We came up with a new scheme: we will try to have our cake and eat it too. We will soon (I guess in the fall) come out with a somewhat incremental specification that integrates the new ideas, and call it 1.8. That shows that we are on our way to 2.0. This version will be incremental in that we will mostly only add things, redefine the old syntax in terms of the new, and deprecate some of the old syntax (e.g. the use of XSLT in the presentation elements, like we do in content MathML3). Then we have one more version (1.9) to accumulate new stuff (e.g. the new MathML3/OM3 syntax), and in 2.0 we will throw out the accumulated deprecated functionality.

That will give us a relatively clean roadmap towards 2.0, I think.

Narrative Structure of Mathematical Text

Saturday, June 30th, 2007

Here we are again at MKM 2007, listening to Krzysztof Retel from the Ultra group at Heriot-Watt. He is talking about the narrative structure of mathematical text. This is very much related to our own MathUI paper.

He proposes to annotate text fragments with names and to annotate the relations between the boxes with RDF triples. Then the “dependency graph” is transformed into the “graph of logical precedences” by changing some directions. The first is used for checking what we call the document ontology, and the second for checking the consistency of the text. I do not see anything that we cannot do in OMDoc.

Q: are there any relations that we do not already have in OMDoc? I think not.
Q: is this more than just a standoff-version of OMDoc in RDF? I think not.

MathLang and OMDoc and Souring and Aggregation

Saturday, June 30th, 2007

I am sitting in Robert Lamar’s (from the Ultra group at Heriot-Watt) talk on MathLang. He has a very ambitious goal: he wants to restore natural language as an input method for mathematics. The idea is that he does a linguistic analysis of the mathematical text (including the formulae), and at every level the “boxes” can be annotated with meaning. I would guess that he is using a categorial grammar approach for that; in any case, the result is a nicely hierarchical phrase structure (at least for English). This seems to build on the old Nederpelt & Kamareddine weak type theory, which we have also talked about in a KWARC graduate seminar.

In any case, all he does seems to be at the text level, and does not seem to transcend sentences. So it would really work inside the OMDoc statement level. We could just come up with an XML encoding of the MathLang boxes (do they have one?) and make it an OMDoc module. That would standardize it, keep it in sync with OMDoc, and of course give OMDoc much better control over natural language. I wonder how much of this is automatic.

A wonderful concept he is introducing is “souring”, i.e. the inverse of sugaring (which makes things palatable to the human). So souring makes things palatable to the computer. We would probably call this preloading. The souring operation is used for analyzing chains of equations, … This seems quite similar to things I have done in sTeX (and was very proud of at the time). I will have to look it up and compare it.

He takes the souring notation to the extreme, so that he can even take aggregation into account, e.g. \forall x,y:A --> \forall x:A \forall y:A. This is really nice to see for a lambda-person like me, quite nifty. Is this really automated? He has the souring constructors share, chain, fold, map, and position.
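Just to pin down what the aggregation example does, here is a toy Python sketch that unfolds an aggregated binder into one binder per variable. It works on strings only and is certainly not how MathLang implements its constructors; the function name unfold_binder is made up.

```python
def unfold_binder(quantifier: str, variables: list[str], type_name: str) -> str:
    """Expand an aggregated binder 'Q x,y:A' into 'Q x:A Q y:A'."""
    return " ".join(f"{quantifier} {v}:{type_name}" for v in variables)


# The aggregated form \forall x,y:A, split into its parts by hand.
soured = unfold_binder(r"\forall", ["x", "y"], "A")
print(soured)
```

A real souring step would of course work on the annotated phrase structure rather than on strings.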

I wonder whether this gives a very strong presentation language for OMDoc. We already have map in our system; maybe we should look at this. I am quite intrigued.