Archive for June, 2007

Narrative Structure of Mathematical Text

Saturday, June 30th, 2007

Here we are again at MKM 2007, listening to Krztof Retel from the Ultra group at Heriott Watt, he is talking about the narrative structure of Mathematical Text. This is very much related to our own MathUI paper.

He proposes to annotate text fragments with names and annotate with RDF triples the relations between the boxes. Then the “dependency graph” is transformed to the “graph of logical precedences” changing some directions. The first is used for checking what we call the document ontology, and the second is the consistency of the text. I do not see anything that we cannot do in OMDoc.

Q: are there any relations that we do not already have in OMDoc? I think not.
Q: is this more than just a standoff-version of OMDoc in RDF? I think not.

MathLang and OMDoc and Souring and Aggregation

Saturday, June 30th, 2007

I am sitting in Robert Lamar’s (from the Ultra Group at Heriot Watt) talk on MathLang. He has the very ambitious goal: He wants to restore natural language as an input method for mathematics. The idea is that he does a linguistic analysis on the mathematical text (including the formulae) and at every level (I would guess that he is using a categorial grammar approach for that; in any case, the result is a nicely hiearchical phrase structure (at least for english)) the “boxes” can be annotated with meaning. This seems to build on the old Nederpelt & Kamareddine weak type theory, which we also have talked about in a KWARC graduate seminar.

In any case, all he does seems to be at the text level, and does not seem to trasncend sentences. So it would really work inside the OMDoc statement level. We could just come up with an XML encoding of the MathLang boxes (do they have one) and make it an OMDoc module. That would standardize it and would keep it in sync with OMDoc and would of course give OMDoc much better control over natural language. I wonder how much of this is automatic.

A wonderful concept he is introducing is the concept of “souring” i.e. the inverse of sugaring (i.e. making it palatable to the human). So souring makes things palatable to the computer. We would probably call this preloading. The souring operation is used for analyzing chains of equations, … This seems quite similar to things I have done in sTeX (and was very proud of at the time). I will have to look it up and compare it.

He takes the souring notation to the extreme, so that he can even include aggregation into account e.g. \forall x,y:A –> \forall x:A \forall y:A. This is really nice to see for a lambda-person like me, quite nifty. Is this really automated? He has souring constructors share, chain, fold, map, position.

I wonder whether this gives a very strong presentation language for OMDoc, we already have map in our system, maybe we should look at this. I am quite intrigued.

Lessons from the DLMF search

Saturday, June 30th, 2007

I am sitting in Abdou Youssef’s talk on his search engine on the DLMF, one thing that stuck me is that he says is that he is doing hit fragment descriptions by pre-computing the fragments at indexing times, storing them in a database and then do a fragment search, i.e. in comparison with MWS, where we compute the fragment at reporting time, he only assembles the hit page from the database, which seems more reliable and of course faster. I think that this should be a standard technique in Math Search that is independent of the search engine.

Of course they have it good, since they generate all their documents from LaTeX and have good control over what is a good fragment. If we are in the general case, this is not true. But we could use some discourse grammar techniques to do the fragment computation.

Treating Mathdex Presentation trees for MathWebSearch

Saturday, June 30th, 2007

I am sitting in Robert Miner’s (Design Science) talk on Mathdex in MKM2007, he is stressing that the most imporant thing in in their experiments turned out to be data normalization. He is actually going a long way towards semantics for the general case. At least he is generating some kind of trees, so we will be able to index them in the MWS system. They are interpretable in first-order terms, which is all that MWS needs. It would be very nice if we could build on the idea we had last MKM to have a set of challenge examples that we all can work on and compare our systems.

MKM for Tom Hales’ Proof Text

Friday, June 29th, 2007

I am sitting in Tom Hales Talk at MKM 2007, where he discusses how he is re-arranging his 1998 proof of the Kepler conjecture in a way so that it becomes more palatable to formalization process (the flyspeck project). He says that the proof text is becoming much less dense in the process.

This seems like a very good case study for MKM, most particular the SWIM semantic Wiki for Math, together with Normen’s Change managment techniques (see Locutor).

I think that I will ask him for access to the text as a case study.

Hello world!

Friday, June 29th, 2007

Peter Murray Rust (PMR) has convinced me that I should be blogging, …. and he has set such a good example, that I am trying this out now. But Christoph Lange, my Wiki conscience, told me that I should also contribute more to the MathWeb Wiki our work group wiki, I promise. Thanks to Christoph for setting up this blog for me.

In this blog I will mainly cover my thoughts about using content markup and structural semantics techniques for improving science. Of course much of this will relate to my own format OMdoc format, and the work of my KWARC group.