Archive for November, 2008

Three types of mathematicians

Wednesday, November 26th, 2008

Cristian Calude and I discussed the costs and benefits of content mark-up. He emphasized that unless we provide interesting features, mathematicians will see neither the value of content mark-up nor a reason for the additional effort (e.g. when using sTeX instead of LaTeX).

Below you find three groups of mathematicians who most likely need different amounts of convincing (please note that I am not quoting Cris; I simply present what I remember from our discussion, and I made up the names of the groups):

  • Pen-and-Paper guys: They only use computers for publishing; all their mathematics is actually developed on paper. The publishing process is seen as a tedious and inconvenient activity that takes time away from the actual job that a mathematician wants to do. Digitizing their work (including proofreading) is annoying. In earlier times, this was done by secretaries and the publisher, but nowadays publishers only accept LaTeX, which this group of mathematicians really sees as a burden.
  • LaTeX Lovers: There are mathematicians who think in LaTeX. They use it a lot for developing their ideas and incrementally revising proofs with colleagues. This group seems to have an increasing influence on scientific publishing, as most publishers (in mathematics) will nowadays reject a submission if it is not provided in LaTeX.
  • Innovators: The third group wants even more. (We didn’t really talk about this group for long.) For example, this group promotes semantic technologies and aims at making mathematics machine-processable as well as bringing mathematics to the web. I assume that this includes vast parts of the MKM community.

Maybe we need to start asking ourselves: Would we use our own tools and services? (Who is using sTeX?) And if so, for which activities? Think of the very early steps towards a new topic. Would you like to be forced to use content mark-up? Although we provide full flexibility in switching between concepts, simply having to establish theories and mark up structure really slows down creative thinking. So when is the right time to use content-based techniques? Do we restrict them to the very last stage of scientific work, i.e. the publishing process, or to teaching (the latter not even recognized as a scientific activity)?

I am collecting arguments on the gains and burdens of content mark-up (in the panta rhei trac), with a particular focus on the technologies and services provided by KWARC. I’d appreciate your feedback and comments!

Documenting XSLT

Wednesday, November 26th, 2008

A considerable part of the implementation of my research prototype(s) is done in XSLT. Now that the extraction of RDF from semantic markup is more and more turning into a project of its own, more software engineering is needed – including proper documentation.

It turned out that XSLTdoc is a really nice solution for that: Just put a few additional XML elements in front of every template or function and run a special XSLT to generate the documentation. It works like Javadoc, and the output looks nice.

Growing the Semantic Web with Inverse Semantic Search

Wednesday, November 5th, 2008

I recently read an interesting paper that was presented at the INSEMTIVE workshop (incentives for the semantic web). Hans-Jörg Happel addresses the problem that a lot of metadata (tags, annotations, etc.) exists in private knowledge spaces but is not shared on the semantic web, maybe because of privacy concerns, maybe because nobody has ever asked for it. His new idea is to design a search engine that knows which private metadata could improve the result set and then asks its authors to publish it, i.e. the opposite of the “publish first, retrieve then” process of traditional IR.