Archive for the ‘clange’ Category

German Wines according to the Wine Ontology

Tuesday, April 20th, 2010

Today I saw another Semantic Web application that used the Wine Ontology from the OWL 1 Guide as an example, and once more I – coming from a German wine-producing region – stumbled over the strange “German” wines listed in that ontology: SchlossRothermelTrochenbierenausleseRiesling and SchlossVolradTrochenbierenausleseRiesling. The ontology itself traces back to a 1991 publication on the CLASSIC knowledge representation system by Peter F. Patel-Schneider, Deborah L. McGuinness, and Alex Borgida.

I’m creating this blog post to contribute yet another occurrence of the word “Trochenbierenauslese” to the Web. All occurrences that Google currently lists are related to the Wine Ontology. The correct term would be “Trockenbeerenauslese” (literally “selected harvest of dried berries”). “Trochenbierenauslese” seems to be an uncommon misspelling; Google lists a few hits for “trochenbieren” and “bierenauslese” each. (Note that “Bier” in German means “beer” ;-) )

Then I wanted to learn where the “Schloss Volrad” and “Schloss Rothermel” wineries are. The former is actually named “Schloss Vollrads” (literally “Vollrad’s castle”) and is located in the Rheingau region. While “Rothermel” exists as a German surname, “Schloss Rothermel” does not exist except in the Wine Ontology. This is likely to frustrate any attempt to geo-tag the Wine Ontology. Or maybe one of the actual winemakers named Rothermel might want to register that brand? This one from the Baden region, for example. (That would be a nice contribution to the Semantic Web community, as it is not far from Karlsruhe.)

What to do now?  Cool URIs don’t change. So why not showcase yet another feature of OWL in the Wine Ontology?  It would be interesting to deprecate those wrong URIs and see how the multitude of examples using the Wine Ontology handles that.

Upcoming SPARQL improvements

Tuesday, February 2nd, 2010

W3C’s new SPARQL working drafts bring a lot of nice features that I hope will soon be widely supported, because our applications would greatly benefit from them.

Property paths

Property paths will make queries both more powerful and easier to write. Some cases resemble XPath/XQuery:

Find the names of people 2 “foaf:knows” links away.

{
 ?x foaf:mbox <mailto:alice@example> .
 ?x foaf:knows/foaf:knows/foaf:name ?name .
}

… whereas others generalize the idea of transitive closures, which is also relevant in our applications that work on RDF extracted from OMDoc or OpenMath (e.g. finding imported theories, computing dependencies, and checking MMT well-formedness):

Find the names of all the people that can be reached from Alice by foaf:knows:

{
 ?x foaf:mbox <mailto:alice@example> .
 ?x foaf:knows+/foaf:name ?name .
}
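
To make the difference between the two path expressions concrete, here is a minimal Python sketch that evaluates both over a toy graph. The triples, the node names, and the helper functions are all made up for illustration; this is not how a SPARQL engine is implemented, just a small model of what foaf:knows/foaf:knows and foaf:knows+ match.

```python
from collections import deque

# Hypothetical toy graph of (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "foaf:mbox", "mailto:alice@example"),
    ("alice", "foaf:knows", "bob"),
    ("bob", "foaf:knows", "carol"),
    ("carol", "foaf:knows", "alice"),  # a cycle, to show that the search terminates
    ("carol", "foaf:knows", "dave"),
    ("alice", "foaf:name", "Alice"),
    ("bob", "foaf:name", "Bob"),
    ("carol", "foaf:name", "Carol"),
    ("dave", "foaf:name", "Dave"),
]


def objects(subject, predicate):
    """Return all objects o such that (subject, predicate, o) is in the graph."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def names_two_links_away(start):
    """Model of the sequence path foaf:knows/foaf:knows/foaf:name."""
    names = set()
    for first in objects(start, "foaf:knows"):
        for second in objects(first, "foaf:knows"):
            names.update(objects(second, "foaf:name"))
    return sorted(names)


def names_reachable(start):
    """Model of foaf:knows+/foaf:name: one or more foaf:knows links."""
    seen = set()
    queue = deque(objects(start, "foaf:knows"))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited, so cycles do not loop forever
        seen.add(node)
        queue.extend(objects(node, "foaf:knows"))
    names = set()
    for node in seen:
        names.update(objects(node, "foaf:name"))
    return sorted(names)


print(names_two_links_away("alice"))
print(names_reachable("alice"))
```

In this toy graph the first query only finds Carol, whereas the second finds everybody, including Alice herself, because the cycle leads back to her after three links.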

Update language

Another feature to come is an update language, probably inspired by XQuery Update. Given a triple store that supports it, this would, for example, make it easier to integrate Krextor into applications.

Entailment regimes

Besides enhancements to simple queries, the behavior of SPARQL under different entailment regimes (e.g. RDFS or OWL – in practical terms: what happens when you attach a reasoner to your triple store) will be clarified.

Miscellaneous

In the core of the language, certain other goodies will be specified, such as an easier syntax for negation-as-failure and subqueries (nested queries).

Re: Popculture in logics

Tuesday, December 22nd, 2009

Inspired by a post from Denny Vrandečić, I came up with more quotes from pop culture, rewritten in logics – enjoy, and please correct me if anything should be wrong:

  • ¬∃knows.TroubleI’veSeen (Spiritual)
  • Bier ⊑ ¬∃gibtsAuf.Hawaii ⇒ Ich ⊑ (¬∃fahreNach.Hawaii)⊓(∀bleibe.Hier) (Kuhn, 1963)
  • ¬(⋄I ⊑ ∃get.¬“⊨”) (Jagger/Richards, 1965)
  • ¬∃ b:Business . b = ShowBusiness (Berlin, 1946)
  • I ⊑ ∃shot.Sheriff ⊓ ∀shot.¬Deputy (Marley, 1973)
  • ¬(⊥ ⊓ ¬HoundDog) (Presley, 1956)
  • ¬∃ ¬Sunshine ← gone(She) (Withers, 1971)
  • ∀x ∃y . needs(x, y) ∧ loves(x, y) (Blues Brothers, 1981)
  • I ⊑ ∃feel.(Pretty ⊓ Witty ⊓ Gay) (Bernstein/Sondheim, 1961)

Citing URLs with BibLaTeX and AUCTeX

Monday, December 7th, 2009

I recently switched to BibLaTeX and also convinced Michael. Key advantages are a huge supply of entry types and fields, comprehensive customizability, better Unicode awareness, and exhaustive documentation. Among the best features is that one can now properly cite URLs. Not only is the url field supported (and displayed!) for almost all entry types, but there is also a standard way of saying when you last visited a URL: either a combination of the fields urlyear, urlmonth and urlday, or alternatively urldate = {YYYY-MM-DD}. The only tedium that remains is entering such dates. Users who, like me, use the AUCTeX Emacs mode for editing LaTeX and BibTeX might find the following command helpful. It is ready to be used in your ~/.emacs file:

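(defun bibtex-insert-current-urldate ()
  "Insert a urldate field holding today's date as YYYY-MM-DD."
  (interactive)
  (bibtex-make-field
   ;; The third list element is called to compute the initial field content.
   '("urldate" "" (lambda () (format-time-string "%Y-%m-%d")))
   t))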

The following line binds it to the keyboard shortcut C-c u:

;; Bind the command to C-c u whenever bibtex-mode is entered.
(add-hook 'bibtex-mode-hook
          (lambda ()
            (define-key bibtex-mode-map (kbd "C-c u")
              'bibtex-insert-current-urldate)))

With the default BibLaTeX style, the urldate field will render as (visited on MM/DD/YYYY).
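
For readers who do not use Emacs, the same date handling can be sketched in a few lines of Python. The function names are hypothetical, and the second function merely mimics the rendering described above; it is not BibLaTeX’s own code.

```python
from datetime import date


def current_urldate() -> str:
    """Return today's date in the YYYY-MM-DD form expected by the urldate field."""
    return date.today().isoformat()


def render_urldate(urldate: str) -> str:
    """Mimic the rendering described in the post: (visited on MM/DD/YYYY)."""
    parsed = date.fromisoformat(urldate)
    return f"(visited on {parsed.strftime('%m/%d/%Y')})"


print(f"urldate = {{{current_urldate()}}},")
print(render_urldate("2009-12-07"))
```

The first line printed can be pasted into a bibliography entry; the second shows how the date of this post would appear in the bibliography.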

Microdata vs. RDFa – What does it mean to us?

Wednesday, October 28th, 2009

Only today I became aware of microdata, the proposed way of embedding semantic annotations into HTML5. (Yes, they adopted the syntax that Michael also prefers for OMDoc, and which I personally hate, but I will get used to it.) Microdata are not to be confused with microformats, a poor man’s way of annotation that (ab)uses CSS classes and thus is compatible with HTML 4. Microdata are something like RDFa but

  1. are slightly easier to use for people who don’t understand XML namespaces
    • granted, RDFa’s excessive reliance on XML namespaces makes it hard to parse, and makes it unbearably complex to copy/paste a fragment, which is an important use case for HTML5
  2. allow for ad hoc pseudo-semantic markup when you do not use an ontology
    • What’s the point in annotating at all, then?
  3. are compatible with the non-XML syntax of HTML5 (which should have been ditched IMHO, but, well, in the interest of reactionary users and software, they decided differently)

The fight for the future of RDFa in HTML is going on, but what does that mean to KWARC? We have incorporated RDFa into OMDoc as a means of extending the metadata vocabularies. RDFa, originally designed for XHTML, is prepared for being integrated into any XML language, including OMDoc. HTML5 microdata are an integral part of the HTML5 specification and would not work in other XML languages. OK, but we present OMDoc documents as HTML to make them human-readable. In this output, we want to preserve the semantics of the OMDoc markup, and for that we had always been thinking about using RDFa. (We know exactly how to do it, but have not yet implemented that step.) We could use HTML5 microdata instead, but:

  1. RDFa has little software support so far, but microdata have none (beyond proofs of concept)
  2. We generate XML-compliant HTML. The non-XML syntax of HTML5 supports embedded MathML, but I doubt that it will support parallel OpenMath markup, where elements from yet another namespace are embedded into the MathML formulae.
  3. We generate HTML. The embedded annotations need not be authored manually, so they do not have to be easy to author.
  4. We are interested in using well-defined ontologies to express semantics, so we don’t need ad hoc “semantic” markup.

What do you think?

Readably and economically printing LNCS papers

Tuesday, October 20th, 2009

The LNCS format does not print nicely on A4, because the LNCS book pages are much smaller. However, most preprints, your own LNCS papers, and papers you get for reviewing are formatted for A4. Printing one page per sheet wastes a lot of paper on the wide margins, but when you print two pages per sheet you can hardly read the small text any more. Here is a fix:

pdfnup doc.pdf --nup 2x1 --trim "-6cm -6cm -6cm -6cm" --delta "-18cm -18cm" --scale 1.8

Update: Formatting your PDF right

It is even better if you already set the right paper format when creating your PDF. With appropriate printing settings, that gets the print right without the adjustments mentioned above, and it makes screen reading more convenient. Markus Kuhn explains how. However, his measurements didn’t work for me; instead of 92 112 523 778 I had to use 91 71 521 721.
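
To see what these numbers mean, here is a small Python sketch that converts them to a page size. It assumes that the four values are the lower-left x, lower-left y, upper-right x and upper-right y coordinates of a page box in PostScript points (1/72 inch); the post itself does not spell that out, so treat this reading as an assumption.

```python
PT_PER_INCH = 72.0   # PostScript points per inch
MM_PER_INCH = 25.4   # millimetres per inch


def box_size_mm(llx, lly, urx, ury):
    """Width and height in whole millimetres of a box given in points.

    Assumes the order lower-left x, lower-left y, upper-right x, upper-right y.
    """
    to_mm = MM_PER_INCH / PT_PER_INCH
    width_mm = (urx - llx) * to_mm
    height_mm = (ury - lly) * to_mm
    return round(width_mm), round(height_mm)


print(box_size_mm(92, 112, 523, 778))  # the measurements from Markus Kuhn's page
print(box_size_mm(91, 71, 521, 721))   # the measurements that worked for me
```

Under that assumption, the first box comes out at roughly 152 × 235 mm and the second at roughly 152 × 229 mm, so the two differ mainly in height and in vertical position on the A4 sheet.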

Krextor Publicity

Monday, July 20th, 2009

I was surprised to find the following search result for Krextor:

The document “Krextor – An extensible XML→RDF extraction framework.pdf” is no longer available on docstoc.
It has either been removed by the original owner of the document or by the docstoc staff due to copyrighted or inappropriate content.

Isn’t that actually a proof of success, in this new age of the Pirate Party? ;-)

Here is where it was stolen from, and here is the paper.

Shiny and productive to-do notes in LaTeX

Wednesday, July 1st, 2009

I found the ultimate setup for to-do notes in LaTeX (of which my current thesis draft has a lot). Traditionally, I’ve been using Michael’s ednotes, but they didn’t look nice and they destroyed the page break by creating footnotes when enabled. Then, I switched to Henrik Skov Midtiby’s todonotes, which look great (thanks to TikZ), create a nice summary listing, and use the margin to preserve the page break. The only thing that’s missing is the possibility to annotate a complete range of text, which Michael’s ed package supports by the oldpart/newpart environments – and which he has recently spiced up with some color. So here is how to load both packages:

\usepackage{savesym}
\savesymbol{todo} % occurs in both packages
\usepackage[show]{ed}
\restoresymbol{ed}{todo} % now available as \edtodo
\usepackage{todonotes}

SlideShare

Tuesday, June 30th, 2009

Finally, after putting it off for a long time, I’m using SlideShare. Maybe it will get me more publicity, but definitely it makes publishing easier. Now that we have the publication lists on our homepage generated from BibTeX (here’s mine), I don’t want to manually maintain the old one any more (where I linked to all slides), but on the other hand I don’t want to generate BibTeX entries for the slideshows either. Therefore, I will publish them on SlideShare from now on. Hope it may be useful for the world.

KiWi Programming Camp

Monday, March 23rd, 2009

I spent all of last week at the first Programming Camp of the KiWi project (“Knowledge in a Wiki”), which is developing the successor of the IkeWiki system that SWiM has been based on so far. My plan is to port SWiM from the abandoned IkeWiki to KiWi, which will be under development in the namesake project for two years from now, and further on by the community that is now starting to grow. Version 0.1 of SWiM as a KiWi extension is not yet out, but the KiWi members, particularly those from Salzburg Research, managed to give me a good understanding of the next steps that I need to take. Some preliminary conclusions so far:

  • KiWi’s strength as a scalable and extensible platform for social software (not just as a semantic wiki!) is its architecture. Based on EJB3 and Seam, it has an incredibly steep learning curve. EJB was one of the topics that I tried my best to avoid when studying, and now I regret that; on the other hand, it also took the people at Salzburg Research several months to come up with that elaborate architecture.
  • Openness attracts the community: The KiWis decided to open their programming camp to external developers, as a first effort to start community-building.  They were really committed also to teaching me, the visitor, how to use their system (thanks, Mihai, Rolf, Sebastian, Stephanie, Szaby, Thomas – and their colleagues from Aalborg, Brno and Munich as well!). Even before the programming camp, they did not jealously lock away their sources, but gave external interested people access.  And now, with further adoption of the software in mind, they are switching to the most liberal license, i.e. BSD.
  • With our Subversion and Trac infrastructure, we have taken great steps towards more productive development. Still, the KiWis leverage more professional tools, which really make life easier. OK, most of them are not open source, but they require considerably less hacking in order to make them productive: Jira (bug tracker), Fisheye (repository browser), Crucible (code review system, not yet used), and Hudson (automated integration tester).

“KiWi knows” what else I will be able to report in the near future – stay tuned!