Semantic Tagging: Some thoughts after the WSKS on tagging

  • Tagging is used to structure content (e.g. to generate personalized sequences of lecture material)
  • Social tagging = collaborative structuring of content
  • Tags attach specific information to an object
  • Tags are usually keywords
  • Tagging creates a common vocabulary; in social tagging this is also referred to as a folksonomy
  • My suggestion: move towards a more sophisticated (semantic) tagging, in which tags are semantic concepts such as mathematical symbols (e.g. represented in OpenMath or content MathML) with commonly agreed-on or private definitions, which are stored in Content Dictionaries
  • But: maybe this conflicts with the definitions in the inclusive tagging paper (WSKS), as the authors distinguish between different levels of annotation (from tag to formal metadata). However, considering the very general definition that "tagging is attaching specific information to an object", we might want to include semantic concepts as potential tag categories.
  • We provide a corpus of semantically marked-up documents in the OMDoc format, together with workflows that allow the automatic extraction of mathematical symbols (which we want to use as semantic tags). For example, the panta rhei system provides an import for OMDoc during which it extracts all symbols; we simply need to record the relation between symbols and the imported content snippets to provide the respective tags.
  • Moreover, I suggest distinguishing two types of semantic tags: acquired symbols and required symbols (prerequisites). Based on our OMDoc markup we can identify which symbols are required for the exposition of a mathematical theory and which symbols are acquired when studying the theory. Required symbols are specified via the OMDoc import elements (which import symbols from another mathematical theory). Acquired symbols are all remaining symbols, i.e. those that are not imported from other mathematical theories but are defined/introduced in the given theory.
  • Based on the extracted tags, we can visualize a tag cloud for each content item (e.g. in panta rhei)
  • We should also provide a user interface for creating tags:
    • Users can associate symbols with content snippets in the system (in particular with non-semantic content such as the forum, the library entries, and manually entered problems; this allows us to use the semantic objects to bring order/structure into the collection of non-semantic content);
    • Users can create new tags (new symbols); this interface needs to be very intuitive, easy, and usable.
    • Maybe we can also allow users to use keywords for tagging, but these are non-semantic tags and should be distinguished from semantic ones
  • Based on the tagging structure we can implement tag-based browsing: given a tag cloud, selecting a tag provides (i) all resources tagged with this tag and (ii) all users that used this tag; clicking on a resource provides the collection of all tags of this resource and the collection of all users that tagged this resource; selecting a user provides all of that user's tags and tagged resources …
  • However: the tagging of non-semantic content restricts the granularity of the tags. We cannot annotate fine-grained content inside e.g. a post, because we do not have IDs; maybe we have to consider a different annotation approach, e.g. one based on XPointer, as Annotea does. For now we neglect the granularity. If a posting annotates a content item, we extract the symbols of the annotated area and use them to automatically tag the posting; if the posting links to other content, we propagate the tags to this content
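
To make the distinction between acquired and required symbols more concrete, here is a minimal sketch in Python. It is not the panta rhei import: the XML below is a simplified, namespace-free stand-in for OMDoc (real OMDoc uses XML namespaces and URI references in its import elements), and the names `SAMPLE` and `extract_semantic_tags` as well as the `id` attribute on theories are assumptions made for illustration. Only direct imports are followed.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free stand-in for OMDoc markup (an assumption;
# real OMDoc uses XML namespaces and URI references in its imports).
SAMPLE = """
<omdoc>
  <theory id="monoid">
    <symbol name="op"/>
    <symbol name="unit"/>
  </theory>
  <theory id="group">
    <imports from="monoid"/>
    <symbol name="inverse"/>
  </theory>
</omdoc>
"""


def extract_semantic_tags(xml_text):
    """Map each theory id to its acquired and required symbol tags."""
    root = ET.fromstring(xml_text)
    theories = {t.get("id"): t for t in root.iter("theory")}

    # Acquired symbols: those declared directly inside the theory.
    declared = {
        tid: {s.get("name") for s in theory.findall("symbol")}
        for tid, theory in theories.items()
    }

    tags = {}
    for tid, theory in theories.items():
        required = set()
        for imp in theory.findall("imports"):
            # Required symbols: those declared in a directly imported theory.
            # Transitive imports are deliberately not followed in this sketch.
            required |= declared.get(imp.get("from"), set())
        tags[tid] = {"acquired": declared[tid], "required": required}
    return tags


tags = extract_semantic_tags(SAMPLE)
print("required:", sorted(tags["group"]["required"]))
print("acquired:", sorted(tags["group"]["acquired"]))
```

In this toy corpus the theory `group` would be tagged with `inverse` as an acquired symbol and with `op` and `unit` as required symbols, while `monoid` has no prerequisites. Whether required tags should also include transitively imported symbols is an open design choice that the sketch leaves aside.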

Further Readings

  • How do others define/interpret semantic tags? See e.g. [1], [2], [3] (German)
