Archive for the ‘semantic documents’ Category

AI Mashup Challenge 2012

Sunday, February 19th, 2012

If you are developing mashups – lightweight web applications that offer new functionality by combining, aggregating and transforming web resources and services – this is a great opportunity to win a prize.

Call for Papers

AI Mashup Challenge 2012 at the 9th Extended Semantic Web Conference (ESWC) Hersonissos, Crete, Greece; May 27–31, 2012

Submission deadline: March 31, 2012

I-Semantics Conference and Linked Data Cup

Monday, January 30th, 2012

I-Semantics is a very nice conference on applied semantic web research, excellent for networking with the European research community and industry.

Call for Papers

I-SEMANTICS 2012 8th International Conference on Semantic Systems Graz, Austria, 5 – 7 September 2012

including Call for Submissions 5th Linked Data Cup

Latest News:

  • Wolters Kluwer Germany main sponsor of I-SEMANTICS 2012
  • I-SEMANTICS proceedings published by ACM ICPS
  • Important Dates (Research & Application Papers & I-Challenge)
    • Abstract Submission Deadline : April 2, 2012
    • Paper Submission Deadline : April 13, 2012
    • Notification of Acceptance: May 7, 2012
    • Camera-Ready Paper: June 4, 2012
  • Important Dates (I-Challenge)
    • Paper Submission Deadline : April 13, 2012
    • Notification of Acceptance: May 7, 2012
    • Camera-Ready Paper: June 4, 2012
  • Important Dates (Posters & Demo Papers & PhD Track)
    • Submission Deadline: May 21, 2012
    • Notification of Acceptance: June 18, 2012

SePublica@ESWC Workshop (May 27 or 28, Crete)

Monday, January 30th, 2012

An ESWC 2012 Workshop. May 27 or 28, Hersonissos, Greece.

http://sepublica.mywikipaper.org/

At SePublica we want to explore the future of scholarly communication and scientific publishing. As we are going through a transition between print media and Web media, SePublica aims to provide researchers with a venue in which this future can be shaped.

Important Dates

  • submission deadline: March 18 (extended)
  • acceptance notification: April 1
  • camera ready: April 15

SePublica@ESWC Workshop on Semantic Publication (May 30, Crete), LNCS Post-proceedings, Best Paper Award by Elsevier

Sunday, January 16th, 2011

I am a chair of the following workshop (and Michael is on the PC), which is closely related to KWARC’s research interests (specifically KWARC-relevant topics highlighted below):

1st International Workshop on Semantic Publication (SePublica 2011)
at the 8th Extended Semantic Web Conference (ESWC 2011)
May 30th, Hersonissos, Crete, Greece

Keynote by Steve Pettifer, Manchester University, UK: “Utopia Documents and The Semantic Biochemical Journal experiment”

SUBMISSION DEADLINE (extended) March 4

Highlights:

The MISSION of the SePublica workshop is to bring together researchers and practitioners dealing with different aspects of Semantic Technologies in the Publishing Industry. How is the Semantic Web impacting the publishing industry? How is our experience of publications changing because of Semantic Web technologies being applied to the publishing industry?

TEI Guidelines mention MathML, OpenMath, and OMDoc

Saturday, July 31st, 2010

Someone in the humanities must be interested in OMDoc. I was really surprised to find a reference to OMDoc in the section “Formulæ and Mathematical Expressions” of the guidelines (a.k.a. specification) for TEI. TEI (Text Encoding Initiative) is the standard semantic markup language for the humanities, social sciences and linguistics, much like DocBook for technical manuals. All that TEI itself has is an element <formula notation="…"/>, where notation refers to the language in which the formula is represented. But the guidelines refer to some mathematical markup languages, from which the document author is asked to “make an informed choice”:

  • TeX – the obvious candidate, also used in some examples
  • MathML – the obvious candidate when XML is desired. They give one Presentation MathML example but also mention Content MathML.
  • OpenMath – much less expected. Nice to see that here. On the other hand, the links to the OpenMath standard are outdated. I should probably report that.
  • OMDoc – I didn’t expect that at all.

DITA/OMDoc Compatibility (or topic-based writing in OMDoc)

Friday, April 9th, 2010

When I was at the WritersUA conference before Easter, the compatibility (and transformation) between DITA (as a topic-centered format) and DocBook (as a narrative one) was one of the topics of wider interest. In OMDoc we have always maintained that we can follow both the topic-centered approach (which is quite natural for mathematical texts and indeed for wiki-based approaches like the one in SWiM) as well as the narrative one. So I got to thinking how we would really do the topic-centered approach in OMDoc.

When I was reading Christine Müller’s Ph.D. thesis, which looked at the integration of topic-based and narrative writing styles, I noticed that she says that OMDoc does not have support for topic-style writing. I think that this is wrong. Taking her example (slightly simplified)

<concept id="A.dita">
 <title>Natural Numbers</title>
 <conbody>
 <p>The set of <term>natural numbers</term>
 defined <cite>here</cite> or in <xref href="nat.dita#nat1"/>.
 </p>
 <para conref="topic/p2"/>
 </conbody>
 <related-link>http://example.com/nats.html</related-link>
</concept>

it is obviously directly expressible in OMDoc as

<omdoc>
 <omgroup type="concept" xml:id="A.dita">
 <metadata>
 <dc:title>Natural Numbers</dc:title>
 <link rel="dita:related-link" resource="http://example.com/nats.html"/>
 </metadata>
 <omgroup type="conbody">
 <omtext>
 <CMP>
 <p>The set of <phrase role="term">natural numbers</phrase>
 defined <cite>here</cite> or in <ref type="cite" href="nat.dita#nat1"/>.
 </p>
 </CMP>
 </omtext>
 <ref href="topic/p2" type="include"/>
 </omgroup>
 </omgroup>
</omdoc>

(again slightly simplified; I am leaving out the relevant namespace declarations). It should be directly obvious that we can define an OMDoc sublanguage that is isomorphic to DITA. Indeed I think that this is an exercise that would be worth doing. After all, there was a message from Bryce Nordgren about opening up a Math domain in DITA (see http://openmath.org/pipermail/om/2009-February/001203.html for details), which could use this isomorphism as a guiding light.

Of course DITA has not only topics but also topic maps. Let me again use an example from Christine’s thesis.
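To make the claimed correspondence a bit more concrete, here is a minimal sketch of the translation for exactly the concept example above, written in Python with the standard xml.etree module. It is an illustration under assumptions, not a worked-out DITA-to-OMDoc converter: the function names are made up for this post, the Dublin Core namespace URI is my guess at the omitted declaration, and only the handful of elements that occur in the example are handled.

```python
import xml.etree.ElementTree as ET

# Namespace URIs: the post omits the declarations, so these are assumptions.
DC = "http://purl.org/dc/elements/1.1/"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
ET.register_namespace("dc", DC)

# The DITA concept from the example above.
DITA_CONCEPT = """\
<concept id="A.dita">
 <title>Natural Numbers</title>
 <conbody>
  <p>The set of <term>natural numbers</term>
  defined <cite>here</cite> or in <xref href="nat.dita#nat1"/>.
  </p>
  <para conref="topic/p2"/>
 </conbody>
 <related-link>http://example.com/nats.html</related-link>
</concept>
"""


def convert_inline(src, dst):
    """Copy mixed content from src into dst, renaming DITA inline elements."""
    dst.text = src.text
    for child in src:
        if child.tag == "term":
            new = ET.SubElement(dst, "phrase", {"role": "term"})
        elif child.tag == "xref":
            new = ET.SubElement(dst, "ref", {"type": "cite", "href": child.get("href", "")})
        else:
            # Anything else (e.g. <cite>) is copied unchanged.
            new = ET.SubElement(dst, child.tag, dict(child.attrib))
        convert_inline(child, new)
        new.tail = child.tail


def concept_to_omdoc(concept):
    """Translate one DITA <concept> (with id, title and conbody) to OMDoc."""
    omdoc = ET.Element("omdoc")
    group = ET.SubElement(omdoc, "omgroup", {"type": "concept", XML_ID: concept.get("id")})
    metadata = ET.SubElement(group, "metadata")
    ET.SubElement(metadata, f"{{{DC}}}title").text = concept.findtext("title")
    for link in concept.findall("related-link"):
        ET.SubElement(metadata, "link", {"rel": "dita:related-link", "resource": link.text})
    body = ET.SubElement(group, "omgroup", {"type": "conbody"})
    for block in concept.find("conbody"):
        if block.get("conref") is not None:
            # A content reference becomes an inclusion reference.
            ET.SubElement(body, "ref", {"href": block.get("conref"), "type": "include"})
        else:
            # An ordinary paragraph is wrapped in omtext/CMP.
            cmp_element = ET.SubElement(ET.SubElement(body, "omtext"), "CMP")
            convert_inline(block, ET.SubElement(cmp_element, block.tag))
    return omdoc


omdoc = concept_to_omdoc(ET.fromstring(DITA_CONCEPT))
print(ET.tostring(omdoc, encoding="unicode"))
```

The mapping is purely structural (rename, wrap, move to metadata), which is what one would expect if the two formats really are isomorphic on this fragment.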

<map title="title">
 <topichead navtitle="navi-title" audience="math"/>
 <topicref href="A.dita" collection-type="sequence">
 <topicref href="A1.dita"/>
 <topicref href="A2.dita"/>
 </topicref>
 <reltable>
 <relrow>
 <relcell>A.dita</relcell>
 <relcell>B.dita</relcell>
 </relrow>
 </reltable>
</map>

The first part of this map is just what we have always thought of as a narrative structure in our NarCon approach in OMDoc. So we can directly represent it as something like

<omdoc>
 <metadata>
 <dc:title>title</dc:title>
 <link rel="dita:audience" resource="something:math"/>
 <link rel="dita:navtitle" resource="navi-title"/>
 </metadata>
 <omgroup xml:id="A.narrative" type="sequence">
 <ref type="include" href="A1.omdoc"/>
 <ref type="include" href="A2.omdoc"/>
 </omgroup>
</omdoc>

I must confess that I do not really understand what the href on the top-level topicref means, so I have left it out. Note that I am only interested in the general compatibility of the formats and not the details of the translation, which will have to be worked out. That leaves us with the reltable, which (as far as I can understand it) is a way to specify cross-references that is a better alternative to <related-links>, since it is more portable and attached to DITA maps (which we can think of as a discourse-level presentation of the content structure given by the graph of DITA topics). So I would just add the following metadata section to the <omgroup> element:

<metadata>
 <link rel="dita:related-link" resource="http://example.com/nats.html"/>
</metadata>

OK, that ends our little comparison exercise. There are a couple of conclusions I would like to draw from this:

  1. OMDoc can do topic-oriented writing quite nicely
  2. the OMDoc1.3-style metadata help significantly
  3. rather than develop a DITA ontology (hinted at with the dita: namespace prefixes) we should develop ontologies that describe the various aspects of topic-based writing in generality and find the respective markup primitives. For instance dita:audience seems weird; there must be an ontology in the eLearning realm that already formalizes this.
  4. The OMDoc-1.6 idea of leaving out the <metadata> element and freely intermixing the metadata <link>, <resource> and <meta> with the OMDoc content will make the translation much simpler and more direct, e.g. for the <reltable> and <related-link> elements from DITA, which are situated at the end in the original.

OK, that is all I have to say at the moment, please give me feedback.

Microdata vs. RDFa – What does it mean to us?

Wednesday, October 28th, 2009

Only today I became aware of microdata, the proposed way of embedding semantic annotations into HTML5. (Yes, they adopted the syntax that Michael also prefers for OMDoc, and which I personally hate, but I will get used to it.) Microdata are not to be confused with microformats, a poor man’s way of annotation that (ab)uses CSS classes and thus is compatible with HTML 4. Microdata are something like RDFa but

  1. are slightly easier to use for people who don’t understand XML namespaces
    • granted, RDFa’s excessive reliance on XML namespaces makes it hard to parse, and makes it unbearably complex to copy/paste a fragment, which is an important use case for HTML5
  2. allow for ad hoc pseudo-semantic markup when you do not use an ontology
    • What’s the point in annotating at all, then?
  3. are compatible with the non-XML syntax of HTML5 (which should have been ditched IMHO, but, well, in the interest of reactionary users and software, they decided differently)

The fight for the future of RDFa in HTML is going on, but what does that mean to KWARC? We have incorporated RDFa into OMDoc as a means of extending the metadata vocabularies. RDFa, originally designed for XHTML, is prepared for being integrated into any XML language, including OMDoc. HTML5 microdata are an integral part of the HTML5 specification and would not work in other XML languages. OK, but we present OMDoc documents as HTML to make them human-readable. In this output, we want to preserve the semantics of the OMDoc markup, and for that we had always been thinking about using RDFa. (We know exactly how to do it, but just have not yet implemented that step.) We could use HTML5 microdata instead, but:

  1. RDFa has little software support so far, but microdata have none (beyond proofs of concept)
  2. We generate XML-compliant HTML. The non-XML syntax of HTML5 supports embedded MathML, but I doubt that it will support parallel OpenMath markup, where elements from yet another namespace are embedded into the MathML formulae.
  3. We generate HTML. The embedded annotations need not be authored manually, so they do not have to be easy to author.
  4. We are interested in using well-defined ontologies to express semantics, so we don’t need ad hoc “semantic” markup.

What do you think?

Submitting content to OMBase and logging

Monday, March 24th, 2008

While I was reading up on the REST papers in my last post, I stumbled upon the following best practice for making sure that material is only submitted once to a RESTful application. This is something we should adopt in OMBase as well, just to be safe.

Another thing that we should think of in this arena is to enable some form of RESTful logging facility, so that users can find out what happened to the content. The technology best suited for that seems to be RSS or Atom Syndication (probably the latter). The nice thing is that we could log all the changes to any URI we use in the system. I am not sure under which URL we would address the log. One idea is to use the MIME type application/atom+xml, just as for the XHTML presentation suggested in my last post; that would at least alleviate the choice of URL.
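To get a feeling for what such a log could look like, here is a minimal sketch in Python that builds an Atom feed of changes for a single resource. Everything specific in it is an assumption made for illustration: the function name, the author name "OMBase", the way entry ids are formed, and the example URI and change summaries. It only fills in the elements Atom requires (id, title, updated, plus an author) and is not a proposal for the actual OMBase interface.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)  # serialize Atom as the default namespace


def atom(parent, name, text=None):
    """Append an Atom-namespaced child element with optional text."""
    element = ET.SubElement(parent, f"{{{ATOM}}}{name}")
    element.text = text
    return element


def change_log_feed(resource_uri, changes):
    """Build an Atom feed listing the changes made to one resource.

    `changes` is a non-empty list of (timestamp, summary) pairs; timestamps
    are timezone-aware datetimes. Entries are emitted newest first.
    """
    feed = ET.Element(f"{{{ATOM}}}feed")
    atom(feed, "id", resource_uri)
    atom(feed, "title", f"Change log for {resource_uri}")
    atom(feed, "updated", max(ts for ts, _ in changes).isoformat())
    atom(atom(feed, "author"), "name", "OMBase")  # hypothetical author name
    for ts, summary in sorted(changes, key=lambda change: change[0], reverse=True):
        entry = atom(feed, "entry")
        # Hypothetical id scheme: resource URI plus the change timestamp.
        atom(entry, "id", f"{resource_uri}#change-{ts.isoformat()}")
        atom(entry, "title", summary)
        atom(entry, "updated", ts.isoformat())
    return feed


feed = change_log_feed(
    "http://cds.omdoc.org/arith1/lcm",
    [
        (datetime(2008, 3, 24, 10, 0, tzinfo=timezone.utc), "created"),
        (datetime(2008, 3, 25, 9, 30, tzinfo=timezone.utc), "definition revised"),
    ],
)
print(ET.tostring(feed, encoding="unicode"))
```

A server would return this document when the client asks for application/atom+xml on the resource's own URL, so no separate log URL has to be invented.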

Integrating Presentation into OMBase

Monday, March 24th, 2008

I have just been reading up on REST again, since I found a very palatable pair of articles (REST intro, and practices). This got me thinking about the state of OMBase, and the integration of our presentation pipeline into the OMBase interface. It is RESTful, since we have MMT addressing via URIs implemented. You just use a GET to retrieve them.

What I have talked with Florian about, but maybe not with the OMBase team, is how to integrate presentation. That should be very simple from the interface point of view: we just take the same URLs, but a different HTTP header.

GET /arith1/lcm
Host: cds.omdoc.org
Accept: application/omdoc+xml

gives you the OMDoc file and

GET /arith1/lcm
Host: cds.omdoc.org
Accept: application/xhtml+xml

gives you the presented version (with the standard options). Now, we have written a paper about presentation and submitted it to MKM, and Christine has spent a lot of ingenuity on defining user options to the presentation process. This should be easy to integrate with the URI query interface:

GET /arith1/lcm?ext=foo.ntn&int=lang:ntn;style:physics
Host: cds.omdoc.org
Accept: application/xhtml+xml

That should do the trick.
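From the client side, the three requests differ only in the Accept header and the optional query string. Here is a minimal sketch using Python's urllib that builds (but does not send) such requests. The function name, the http scheme and the option names passed in the example are assumptions for illustration; the host and path are the ones used above.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://cds.omdoc.org"  # scheme is an assumption; host as in the examples


def ombase_request(path, media_type, options=None):
    """Build a GET request for `path`, selecting the representation via Accept.

    `options` is an optional dict of presentation options that ends up in the
    query string; ':' and ';' are left unescaped to match the examples.
    """
    query = "?" + urlencode(options, safe=":;") if options else ""
    return Request(BASE + path + query, headers={"Accept": media_type})


# Same URL, two representations.
source = ombase_request("/arith1/lcm", "application/omdoc+xml")
presented = ombase_request("/arith1/lcm", "application/xhtml+xml")

# Presented version with user options (option names are assumptions).
custom = ombase_request(
    "/arith1/lcm",
    "application/xhtml+xml",
    {"ext": "foo.ntn", "int": "lang:ntn;style:physics"},
)
print(custom.get_method(), custom.full_url, custom.get_header("Accept"))
```

Passing any of these objects to urllib.request.urlopen would perform the actual request; whether the server honours the header is of course up to the OMBase implementation.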

Ontology repair in Physics

Thursday, February 21st, 2008

I am just sitting in the CIAO workshop and Alan Bundy and Michael Chan are talking about a very nice topic: the evolution of ontologies in Physics. They are applying this to historical examples like the latent heat problem and the MOND theory that is hot in Physics at the moment. The idea is that when experiments contradict theory, there is a clash between the theory ontology Ot and the sensory ontology Os, which they solve by renaming apart selected concepts between the ontologies to resolve the contradiction. So they change the ontologies by renaming. The nice thing is that they can interpret the operation of renaming as a conservative theory extension, which gives a nice interpretation of minimal theory change/repair.

You can find the details here.

Even though I totally buy into their observations, I think that it would be better to keep the theories as they are and interpret the repair operations as theory morphisms. That would be a non-destructive operation, and the operations would become very natural theory morphisms.