Transferred from Sundries of Anything, posted by Jochem Liem
I met Normen during the poster session at KI2006 while he was explaining his research to Christine. Basically, he uses a structured markup language (in this case LaTeX) to describe the “logical” structure of a document in terms of its parts (called infoms).
Currently, he uses an ontology of infom types and their relations that was developed in a European project, but custom ontologies can also be used to structure documents.
Structuring the document using the infom types defined in the ontology should allow versioning, because the changes to infoms (and to their relations with other infoms) are formally defined.
I did not really understand why the killer application for structured documents would be versioning, and argued that there are other applications which seem at least as sexy.
1. Collaboration. The infoms could describe the flow of an argument. When writing papers together, the changed infoms could be highlighted. Metadata (data about the changes) could be added to describe why the infom was changed. This seems like something which would be extremely handy to use. Because you structure your content using infoms, the purpose of specific parts of text is also made explicit. This makes it far easier for co-authors to understand the reasoning of the author. Collaboration is a topic which fits in the AIED conference series.
2. Text analysis, suggesting improvements. If the structure of a document in terms of content is known, it is probably possible to say something about the flow of the document. For example, if the support for a statement is extremely far away from the statement itself, the document probably has to be restructured. If the knowledge about the document is good, you might even suggest where certain infoms should be placed.
3. Text editing on a meta-level. It might be possible for a perfectly annotated document to be restructured. For example, the text could be visualised using boxes for the infoms, and arrows for the relations between them. Moving the infoms on the screen also changes their position in the document. This way the flow of a document (an argumentation for example) can be changed on a meta-level. Visualising these relations makes it intuitive for the user to see whether his new flow of infoms makes sense.
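To make idea 2 a bit more concrete, here is a minimal Python sketch of the "support is too far from its statement" check. This is not Normen's system, and I have not seen his code: the `Infom` class, the kind names "claim" and "support", the `distant_supports` function and the `max_gap` threshold are all made up for illustration. It only assumes that each infom has a position in the reading order and that "supports" relations between infoms are known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Infom:
    """A hypothetical infom: an identified part of the document."""
    id: str
    kind: str      # assumed type names, e.g. "claim" or "support"
    position: int  # index of the infom in the document's reading order


def distant_supports(infoms, relations, max_gap=3):
    """Return (support_id, claim_id, gap) for every support that sits
    more than max_gap infoms away from the claim it supports."""
    by_id = {infom.id: infom for infom in infoms}
    flagged = []
    for support_id, claim_id in relations:
        gap = abs(by_id[support_id].position - by_id[claim_id].position)
        if gap > max_gap:
            flagged.append((support_id, claim_id, gap))
    return flagged


# A toy document: s1 directly follows c1, but s2 is far away from c2.
infoms = [
    Infom("c1", "claim", 0),
    Infom("s1", "support", 1),
    Infom("c2", "claim", 2),
    Infom("s2", "support", 9),
]
relations = [("s1", "c1"), ("s2", "c2")]  # (support, claim) pairs

print(distant_supports(infoms, relations))  # [('s2', 'c2', 7)]
```

A real tool would probably measure distance in sentences or pages rather than in infom counts, and the same relation list could also drive the box-and-arrow view from idea 3.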
Somehow, I think this kind of research also fits in the semantic web vision. By structuring documents you might be able to make them machine accessible. Currently, it is not clear how text documents and ontologies should be integrated. We have markup languages like XHTML for text and ontology languages like OWL for concepts, but they each serve a different purpose. So, how do we use the semantic web accomplishments to annotate text documents in such a way that they can be used by semantic search/service discovery/brokering/etc?
Solving these problems is probably worth multiple PhDs. Normen, from what I’ve written, have I understood what you are working on? Also, please let us know when we can use your work. ;P