Semantization is the process of making the knowledge and structure in informal representations explicit, so that they can be acted upon by machines.
Currently, 99% of the available mathematical knowledge is encoded in informal mathematical documents: journal articles, books, preprints, handwritten course notes, or recordings of lectures. To make these accessible to semantic services and knowledge management systems, we must semanticize them.
The KWARC group engages in multiple projects to advance semantization. In the sTeX format, we enable authors to semantically preload LaTeX documents so that we can generate OMDoc representations from them (via LaTeXML).
In the arXMLiv project we transform the Cornell ePrint arXiv into XML with MathML and explicit document structure via LaTeXML. In the LLaMaPuN project we develop libraries for automatically identifying meaning structures in arXMLiv documents so that we will eventually be able to harvest OMDoc from the results.
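To illustrate what "XML with MathML and explicit document structure" makes possible, here is a minimal sketch in Python that pulls the formulas out of a converted document. It is not the arXMLiv or LLaMaPuN API: the `SAMPLE` document and the `extract_formulas` helper are hypothetical stand-ins, and only the standard library and the official MathML namespace URI are assumed.

```python
import xml.etree.ElementTree as ET

# Official W3C namespace for MathML elements.
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical stand-in for a document converted from LaTeX to XML + MathML.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>Let <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math> be a real number with
    <math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>x</mi><mo>&gt;</mo><mn>0</mn></mrow></math>.</p>
  </body>
</html>"""


def extract_formulas(xml_text):
    """Return one list of token strings per <math> element, in document order."""
    root = ET.fromstring(xml_text)
    formulas = []
    for math in root.iter(f"{{{MATHML_NS}}}math"):
        # Collect the text of every token element (mi, mo, mn, ...) inside the formula.
        tokens = [el.text.strip() for el in math.iter() if el.text and el.text.strip()]
        formulas.append(tokens)
    return formulas


print(extract_formulas(SAMPLE))  # prints [['x'], ['x', '>', '0']]
```

Because the formulas are explicit elements rather than typeset glyphs, a program can locate and inspect them directly. Identifying their meaning is the harder step that the linguistic analysis is meant to address.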