Archive for the ‘CoP’ Category

Stephen Watt’s talk on analyzing subject areas by symbol frequencies

Sunday, July 27th, 2008

Stephen Watt is proposing to automate subject classification from the document content. He says we should believe the document more than the classifiers. I think this is potentially very useful for our KWARC work, in particular to aid large-scale analysis of documents, e.g. in the arXiv, for instance in notation understanding (@Christine are you listening?). In fact that is what Stephen is talking about just at the moment. He is also interested in pen-based formula recognition, and it is clear that this is helpful there. This also provides the motivation for looking at formulae only, since pen-based math has only the formulae (not a lot of words there). I think there is also another motivation: formulae and text constrain each other.

They use arXiv (of course) and they also have a corpus of engineering math texts, which is a good corpus, since engineering students imprint on this (like geese on Konrad Lorenz’ rubber boots). I would like to get my hands on this corpus.

So Stephen computed the symbol frequencies on the corpora, and used pre-existing area classifications as the classes. The ranking of symbols seems to give a nice key to distinguish areas. In fact, you only need to look at the 10 most common symbols to identify the area. This really looks like CoP data. This is certainly very interesting for us.
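To make the idea concrete for myself, here is a minimal sketch of classifying a document by its top-ranked symbols. This is not Stephen's actual method: the tokenized formulas, the two area profiles, and the rank-closeness score are all assumptions made up for illustration.

```python
from collections import Counter

def top_symbols(formulas, k=10):
    """Return the k most frequent symbols across tokenized formulas."""
    counts = Counter(symbol for formula in formulas for symbol in formula)
    return [symbol for symbol, _ in counts.most_common(k)]

def rank_similarity(doc_ranking, area_ranking):
    """Score shared symbols, weighting them by how close their ranks are.

    Assumed scoring scheme, chosen only because it is easy to read.
    """
    area_rank = {symbol: i for i, symbol in enumerate(area_ranking)}
    score = 0.0
    for i, symbol in enumerate(doc_ranking):
        if symbol in area_rank:
            score += 1.0 / (1 + abs(i - area_rank[symbol]))
    return score

def classify(doc_formulas, area_profiles, k=10):
    """Pick the area whose symbol ranking is most similar to the document's."""
    doc_ranking = top_symbols(doc_formulas, k)
    return max(area_profiles,
               key=lambda area: rank_similarity(doc_ranking, area_profiles[area]))

# Hypothetical top-10 profiles; real ones would be computed from a labeled corpus.
AREA_PROFILES = {
    "analysis": ["x", "=", "f", "d", "int", "+", "n", "eps", "->", "("],
    "algebra": ["G", "=", "g", "h", "in", "*", "e", "H", "->", "("],
}

# A toy document given as two tokenized formulas.
doc = [["f", "(", "x", ")", "=", "int", "x", "d", "x"], ["x", "+", "eps"]]
print(classify(doc, AREA_PROFILES))
```

With real data the profiles would come from running `top_symbols` over each pre-classified subcorpus, and the similarity measure would probably need more care than this.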

I would really like to see whether this technique can be used to predict citation cliques/cartels or the math genealogy database.

Case Study On the KWARC Group

Thursday, April 3rd, 2008

In November 2007, I asked the members of the KWARC group to fill out questionnaires to get a better intuition about the social and scientific practice of the KWARC community. The detailed results have not been published due to data privacy issues, but the general findings are provided in evaluation.pdf.

Case Study On Proving Practice

Thursday, March 27th, 2008

After reading the book How to Solve It by Polya [Pol73], I decided to analyze different ways of problem solving based on how my colleagues and students (altogether 10 volunteers) proved the following simple lemmas: (1) Prove that for all prime numbers p≥5, z=p²-1 is divisible by 24, and (2) prove that the center of gravity of a polygon equals the average of all points of the polygon. Although no representative survey was carried out, the case study brought up some interesting findings: For (1), some test persons wrote a lot of explanatory text for almost every step of their proof, while others skipped several steps they found obvious. The level of formality varied among the test persons: some wrote their proofs very close to a form that could be verified by e.g. theorem provers, while others stuck to a rather informal writing of their proofs. Moreover, the test persons took different approaches to proving the lemma: some could easily write down a formal proof, while others started with examples and counterexamples to get a better idea of how to solve the given problem. For (2), some test persons used drawings to get a better idea of whether they had to prove or refute the given geometric problem.
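The "start with examples" approach to lemma (1) can be mimicked in a few lines of code. The sketch below is only an illustration of that exploratory style, not something any of the volunteers wrote; the function names and the bound of 1000 are arbitrary choices. The comments outline one standard argument for why the lemma holds.

```python
def is_prime(n):
    """Trial-division primality test, sufficient for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def lemma_holds(p):
    """Check whether p^2 - 1 is divisible by 24 for a given p."""
    return (p * p - 1) % 24 == 0

def check_up_to(limit):
    """Test the lemma on every prime p with 5 <= p <= limit."""
    return all(lemma_holds(p) for p in range(5, limit + 1) if is_prime(p))

# Why it works: p^2 - 1 = (p - 1)(p + 1).
# p is odd, so p - 1 and p + 1 are consecutive even numbers; one of them
# is divisible by 4, which gives a factor of 8.
# p is not divisible by 3, so one of p - 1, p + 1 is, which gives a factor of 3.
print(check_up_to(1000))
```

Such a check is of course no proof, but it is exactly the kind of evidence that some test persons gathered before attempting one.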

Instantiating CoP Models

Sunday, September 16th, 2007

In his blog, Normen Müller recently posted a comment on "Only humans are capable of understanding and making necessary intelligent choices". I agree that structural semantic techniques can relieve humans of tedious choices or analyses. Accordingly, I apply semantic techniques to extract COP-relevant information from documents in order to instantiate a mathematical COP model. If documents on their own are not sufficient to model COPs, I will also consider Web 2.0 techniques to gather further COP-relevant information.

Given that semantic techniques are the best-suited approach for the automatic extraction of COP data, a specific semantic representation technique/format needs to be found. Since my work applies COP to the domain of mathematics, the format should be suited for this domain. Looking at mathematical representation formats such as MATHML, OPENMATH, and OMDOC, OMDOC seems the best suited for mathematics.

OMDOC’s advantage lies in its three layers: the object layer, the statement layer, and the theory layer. MATHML and OPENMATH are standards that provide semantics on the object layer, i.e. on the level of formulae. For example, thanks to their mark-up, software can distinguish the different uses of a symbol such as ’+’, which can represent the arithmetic ‘addition’ operation as well as the logical ‘OR’ operation. The statement layer allows for typing of document fragments, e.g. definitions, examples, or lemmas. The theory layer provides the context of the mathematical concept. The strength of OMDOC also lies in its implementation. For example, an assertion element with types such as ’lemma’ is used instead of separate elements for every kind of mathematical statement.
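As a rough illustration of what the statement layer offers for extraction, the following sketch counts statement types in a small XML fragment. The fragment is a simplified, hypothetical stand-in written for this example: it only mirrors the idea of a typed assertion element described above and is not valid OMDOC (namespaces, identifiers, and the inner content structure are omitted).

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Simplified, hypothetical fragment; NOT schema-valid OMDOC.
FRAGMENT = """
<theory>
  <definition>A prime is a natural number with exactly two divisors.</definition>
  <assertion type="lemma">Every prime greater than 2 is odd.</assertion>
  <assertion type="theorem">There are infinitely many primes.</assertion>
  <example>7 is prime.</example>
</theory>
"""

def statement_types(xml_text):
    """Count statement-level fragments by their type.

    Assertions are counted under their 'type' attribute (e.g. 'lemma');
    definitions and examples are counted under their element name.
    """
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    for element in root.iter():
        if element.tag == "assertion":
            counts[element.get("type", "assertion")] += 1
        elif element.tag in ("definition", "example"):
            counts[element.tag] += 1
    return counts

print(statement_types(FRAGMENT))
```

Because the statement kind sits in one attribute, software only has to look at a single element name to find all lemmas, theorems, and so on, which is what makes this design convenient for collecting COP-relevant statistics.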

Using OMDOC will allow me to verify whether semantic mark-up facilitates the automatic description of COPs. I will point out whether OMDOC supports COP-information extraction or whether the format needs to be extended. Since OMDOC was invented and is continuously developed by our research group, I will not only find it easier to understand the syntax and mark-up but will also be able to suggest extensions and contribute to the further development of OMDOC.

Observing Mathematical Practice

Sunday, September 16th, 2007

To observe mathematical practice, I currently see two methods: (1) an interactive approach, in which I contact members of my group to question them about their understanding and view of mathematical practice, and (2) a document-based approach, in which I analyze their publications and lecture notes in order to describe the mathematical practice based on documents. The latter approach continues the discussion in Andrea and Michael Kohlhase’s paper at MKM06, in which they assume that mathematical practice is inscribed into documents and that an analysis of document collections allows for identifying communities of practice.

I assume that the combination of both approaches allows for (1) verifying whether a mathematical COP exists, i.e. whether the mathematical practice instantiates the model of a general COP; (2) describing mathematical practice (e.g. based on interviews and documents) and comparing it with the model; and (3) verifying the assumption of whether or not documents are suited to identify mathematical practice at all and/or on their own (without additional consideration of e.g. social networks between the authors). (4) If no mathematical COP exists, I could verify whether and how such a COP can be established (e.g. by comparing the current state with the required conditions in the COP model).

COP-supported Requirements Gathering

Sunday, September 16th, 2007

Another application is the usage of COP models to support requirements gathering: the client has a problem, and based on that the engineer extracts requirements. The COP-supported requirements gathering tool then verifies these requirements by comparing them with the requirements of the client’s COP: either approving them, extending them, or indicating conflicts between them. For example, the client does not specify the language but his COP prefers German and English. Or the client requires French, which conflicts with his COP’s requirements. Related work in this field is currently carried out by the Fraunhofer Institute for Experimental Software Engineering.
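The approve/extend/conflict comparison described above could look roughly like the following sketch. Everything here is an assumption for illustration: requirements are modeled as simple key-value pairs, the COP's requirements as sets of acceptable values, and the function name is made up.

```python
def check_requirements(client, cop):
    """Compare the client's requirements with those of the client's COP.

    client: dict mapping a requirement name to the requested value.
    cop:    dict mapping a requirement name to the set of values the COP accepts.
    Returns a report with approved, extended, and conflicting requirements.
    """
    report = {"approved": {}, "extended": {}, "conflicts": {}}
    for name, accepted in cop.items():
        if name not in client:
            # Client said nothing: extend the requirements with the COP's preference.
            report["extended"][name] = sorted(accepted)
        elif client[name] in accepted:
            report["approved"][name] = client[name]
        else:
            report["conflicts"][name] = (client[name], sorted(accepted))
    for name, value in client.items():
        if name not in cop:
            # The COP has no opinion on this requirement, so nothing speaks against it.
            report["approved"][name] = value
    return report

cop = {"language": {"German", "English"}}

# Client does not specify the language: the COP's preference is added.
print(check_requirements({}, cop))

# Client requires French: this conflicts with the COP's requirements.
print(check_requirements({"language": "French"}, cop))
```

A real tool would need a much richer requirement model than flat key-value pairs, but the three outcomes stay the same.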

Modeling COP Life Cycles

Sunday, September 16th, 2007

In their paper at the Mensch und Computer conference on the Sen-sation System, C. Beckmann et al. suggest an approach to model the stages of a team based on the economic research on team dynamics (e.g. Group Dynamics by Tuckman and Drexler et al. or Change Theory (unfreezing, transition, refreezing) by Kurt Lewin). This idea could also be transferred to the modeling of COP life cycles and would enable software to identify the current stage of a COP and to e.g. support the establishment of COPs.

Modeling COPs to improve Software

Saturday, September 15th, 2007

In contrast to designing software to support a social system, the social system could also be used to improve the software design, e.g. by enabling the software to automatically understand and identify the practice of the social system it supports. Enabling a machine to understand the concept of a COP can be achieved by semantics, either by means of symbolic semantics (e.g. logic) or statistical methods.

From this perspective, my research could aim to design a methodology for enabling software to identify, know about, and take into account the practice of a mathematical social system (e.g. the MKM community) using semantic technologies and, based on this, to offer functionalities that deploy the internal COP model to e.g. improve the system’s search or recommendations.

Designing Software to support existing COPs

Saturday, September 15th, 2007

During the doctoral seminar of the Mensch und Computer conference, Volker Wulf introduced us to the community of conferences in the field of Human Computer Interaction (HCI) and Computer Supported Cooperative Work (CSCW). He explained that most research in this community aims at developing software that has an effect on an existing social system, i.e. a specific community in the real world. Most commonly, three steps are taken: (1) a pre-study to observe the social system and to gather requirements, (2) the design of a system that solves the existing problems, (3) an evaluation of the effect of the developed system on the social system.

Accordingly, my research could focus on the development of software that supports an existing social system, e.g. the community of the Conference on Mathematical Knowledge Management (MKM). Following the suggestion by Wulf, this would require a pre-study, i.e. an observation of the MKM community, to identify the participants’ current problems and needs, and to derive requirements for a respective software solution. Afterwards, this would imply the design of software whose functionalities solve the current problems. Finally, I would evaluate the effect of the software on the MKM community and the success of the implementation.

More Scoop Musing

Thursday, August 30th, 2007

In the second invited talk, Toby White is talking about SciSpace, an experiment in social-software-mediated collaborative scientific research.

The main thrust of the intro is that a new kind of scientific practice is emerging, e.g. in the environmental sciences. This involves massive cross-institutional collaboration of scientists and programs. The problem in collaboration is not a lack of communication: we have giant bandwidth, but understanding what is communicated is the problem. Just managing e-mail discussions across multiple interlocutors is almost unworkable (think of adding a person to a long one). In particular, you are interested in the history of the project, and that is extremely hard to extract from the discussions, since it is multi-threaded and distributed.

Toby and some colleagues decided that they need something like a scientific Facebook. SciSpace is like MySpace, but for scientists. The logic is trivial to implement as a system, but it is very hard to make it look nice and easy to use. They are using an open-source social networking framework called ELGG out of Oxford. SciSpace has about 100-200 users, about 30 of them active.

Toby claims that the nice thing about SciSpace is that you kind of know which people you are blogging to, and you can just keep up with what your colleagues or boss are doing and what you may contribute to.

This would be a great thing to integrate with Panta-Rhei; maybe we can even re-implement that system in ELGG. I really wonder whether they have some kind of repository feature. Toby tells us that there are wikis, but they are not very well-integrated, at least not in the same cool way.