Archive for October, 2008

Five misunderstandings about case-study research

Sunday, October 19th, 2008

I recently read a paper by Bent Flyvbjerg in which he discusses and justifies the usefulness of the case study (case methodology) in social science. I am wondering whether and how his assumptions can be applied to mathematics. I am neither summarizing the paper nor working out its application to math; I simply sketch my thoughts.

Old-fashioned definition: “Case Study. The detailed examination of a single example of a class of phenomena, a case study cannot provide reliable information about the broader class, but it may be useful in the preliminary stages of an investigation since it provides hypotheses, which may be tested systematically with a larger number of cases. (Abercrombie, Hill, & Turner, 1984, p. 34)”

Cases in mathematics:

Claim: “General, theoretical (context-independent) knowledge is not more valuable than concrete, practical (context-dependent) knowledge”.

Bent Flyvbjerg discusses the role of cases and theory in human learning and emphasises that case study (e.g. carefully chosen experiments, experiences, cases) produces the type of (concrete, practical) context-dependent knowledge that research on learning shows to be necessary to allow people to develop from rule-based beginners to virtuoso experts. In contrast, textbooks provide (general, theoretical) context-independent theories (focus on universals) and bring the students just to a beginner’s level.

(This supports our work towards context-dependent mathematical learning objects, although we have so far defined context by the logical/narrative/social relation of mathematical knowledge; but maybe by “social” we actually mean concrete/individualized/practical aspects.)

We make use of cases in mathematics (and maybe should adapt mathematical learning accordingly rather than solely presenting abstract/universal theories): William Farmer once told me about a colleague who had to give a lecture on algebra unprepared. Consequently, when presenting a proof, he had to continuously revise his steps, wipe the board, and start over again. This incremental approach actually helped the students to really understand how mathematics is practised, i.e. that the universal and abstract theories are not invented from scratch but have to be iteratively developed based on many cases (examples, counterexamples, etc.). Cristian Calude has also recently pointed out once more that the way of doing math cannot at all be compared to the way of presenting the final results!

Generalization and Force of Example

Claim: “One can often generalize on the basis of a single case (i.e. does not necessarily need statistics/ quantitative studies), and the case study may be central to scientific development via generalization as supplement or alternative to other methods. But formal generalization is overvalued as a source of scientific development, whereas the force of example is underestimated.”

In “Mathematical Naturalism” Philip Kitcher illustrates that the development of mathematics can be seen as a stepwise process from generalizations to their symbolic substitutes. Kitcher underlines the dynamics in mathematics, i.e. the creating, revising, and dismissing of mathematical knowledge, as well as the process of abstracting experiences to gain symbolic substitutes.

So what is the value of generalization and examples in mathematics? Although finding mathematical results is based on a case-based generalization process, mathematicians are particularly interested in discussing the final general, abstract, and universal structures rather than looking at the concrete objects. In teaching, however, examples and exercises are essential: they help teachers to guide students from theory to theory (Michael Kohlhase says examples are theory morphisms, i.e. “structure-preserving mappings between two mathematical structures”).

Claim: “The case study is useful for both generating and testing of hypotheses but is not limited to these research activities alone.”

Bent Flyvbjerg cites Eckstein, John Walton, and Karl Popper, who underline that the case study is a means for testing theories (Eckstein), produces the best theories (Walton), and is ideal for generalizing using falsification (critical reflexivity) (Popper). In the paper, theory is defined in two ways …

“[..] theory in its “hard” sense, that is, comprising explanation and prediction [..] theory in the “soft” sense, that is, testing propositions or hypotheses” [..]

See the Wikipedia definitions of theory and mathematical theory, and the list of first-order theories. See also “A shorter model theory” by Wilfrid Hodges (1997).

“In mathematical logic, a theory is a set of sentences in a formal language. One way to specify a theory is to define a set of axioms. The theory can be taken to include just those axioms, or their logical or provable consequences, as desired.”

“A syntactically consistent theory is a theory from which not every sentence in the underlying language can be proved.”

“A satisfiable theory is a theory that has a model. This means there is a structure M that satisfies every sentence in the theory.”
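To make the notion of a model a bit more tangible, here is a minimal sketch (not taken from any of the sources quoted above). It assumes the group axioms as the theory and brute-force checks whether a small finite structure satisfies every axiom. The function name `satisfies_group_axioms` and the two example structures are my own illustrative choices.

```python
from itertools import product


def satisfies_group_axioms(elements, op, identity):
    """Brute-force check that (elements, op, identity) satisfies the group axioms."""
    elements = list(elements)
    # Closure: the operation never leaves the carrier set.
    closed = all(op(a, b) in elements for a, b in product(elements, repeat=2))
    # Associativity: (a*b)*c == a*(b*c) for all triples.
    associative = all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    )
    # Identity: e*a == a*e == a for all a.
    has_identity = all(
        op(identity, a) == a and op(a, identity) == a for a in elements
    )
    # Inverses: every a has some b with a*b == b*a == e.
    has_inverses = all(
        any(op(a, b) == identity and op(b, a) == identity for b in elements)
        for a in elements
    )
    return closed and associative and has_identity and has_inverses


# Addition modulo 3 satisfies every axiom, so it is a model of the theory.
z3_is_model = satisfies_group_axioms(range(3), lambda a, b: (a + b) % 3, 0)

# Subtraction modulo 3 is not associative, so it is not a model.
sub_is_model = satisfies_group_axioms(range(3), lambda a, b: (a - b) % 3, 0)

print(z3_is_model, sub_is_model)  # prints: True False
```

Since at least one structure satisfies all the axioms, the theory of groups is satisfiable in the sense of the quote.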

Falsification is widely used to test and, if needed, refute mathematical theories: “If just one observation does not fit with the proposition, it is considered not valid generally and must therefore be either revised or rejected.” (cf. Flyvbjerg).
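A small sketch of falsification by a single case, under my own choice of example (it does not come from Flyvbjerg's paper): the classical claim that n² + n + 41 is prime for every natural number n. The helper names `is_prime`, `first_counterexample`, and `euler_claim` are hypothetical.

```python
def is_prime(n: int) -> bool:
    """Trial division; sufficient for the small numbers used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def first_counterexample(proposition, candidates):
    """Return the first case that does not fit the proposition, or None."""
    for case in candidates:
        if not proposition(case):
            return case
    return None


def euler_claim(n: int) -> bool:
    """The proposition under test: n^2 + n + 41 is prime."""
    return is_prime(n * n + n + 41)


# The claim survives 40 confirming cases (n = 0..39) and is refuted by one.
print(first_counterexample(euler_claim, range(100)))  # prints: 40
```

Forty confirming observations do not establish the claim; the single case n = 40 (where the value is 1681 = 41²) is enough to force its revision or rejection.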

Claim: “The generalizability of case studies can be increased by the strategic selection of cases. When the objective is to achieve the greatest possible amount of information on a given problem or phenomenon, a representative case or a random sample may not be the most appropriate strategy. This is because the typical or average case is often not the richest in information. Atypical or extreme cases often reveal more information because they activate more actors and more basic mechanisms in the situation studied.”

This takes up the discussion by Kerber et al., who have addressed the typicality of examples. However, in mathematics (in particular in teaching) we also make use of atypical examples, counterexamples, and near-miss examples. (Note: we might want to use our study of typical examples to eventually identify atypical examples. Moreover, we might want to apply Flyvbjerg’s types of cases to mathematical examples and exercises.)

Flyvbjerg presents several strategies for selecting different cases (see figure below). Among others …

  • extreme cases: Getting a point across in an especially dramatic way, see e.g. Normen’s motivating example for management of change: Ariane 5
  • critical cases: Require experience; looking for “most likely” and “least likely” cases, i.e. cases that either clearly confirm or irrefutably falsify propositions and hypotheses
  • paradigmatic cases: Cases that highlight more general characteristics of the societies in question; Kuhn showed that scientific paradigms cannot be expressed as rules and theories, as there exists no predictive theory for how predictive theory comes about; discovering paradigmatic cases requires intuition and taken-for-granted procedures
Strategies for the Selection of Samples and Cases

Do Case Studies Contain a Subjective Bias?

According to Flyvbjerg, it is falsification, not verification, that characterizes the case study. Moreover, the question of subjectivism and bias toward verification applies to all methods, not just to the case study.

This question also applies to mathematics. Hardly any human being is able to provide a fully objective illustration; thus, mathematical results also have an individual touch and can include subjective, context-dependent parts influenced by the type of problem or personal views. In mathematics, single subjective cases do not lead to accepted and verified results. Instead, we can observe a communal, peer-reviewed process within which a given proof is tested and possibly falsified (see discussion with Cristian Calude).

Case studies often contain a substantial element of narrative. Good narratives typically approach the complexities and contradictions of real life. Accordingly, such narratives may be difficult or impossible to summarize into neat scientific formulae, general propositions, and theories.

I recommend reading pages 238-239, as Flyvbjerg’s illustration opens a very new perspective on our work on mathematical examples (see below): “The opposite of summing up and “closing” a case study is to keep it open. [..] I tell the story in its diversity [..] I avoid linking the case with the theories [..] Instead, I relate the case to broader philosophical positions that cut across specializations. In this way, I try to leave scope for readers of different backgrounds to make different interpretations and draw diverse conclusions regarding the question of what the case is a case of. [..] Case study is a “virtual reality” [..] Students can safely be let loose in this kind of reality, which provides a useful training ground with insights into real-life practices that academic teaching often does not provide. [..]” … Maybe mathematical examples are more than theory morphisms (i.e. clear mappings between theories); maybe they have to leave room for imagination and interpretation. However, math is different from social sciences such as political studies or philosophy, and mathematical knowledge is of a very different kind than social-science knowledge: it is abstract, well-structured, extraordinarily interlinked, and has a precise syntax and semantics. But these characteristics also make access to mathematical knowledge so hard for novices: maybe, before being abstracted into clear structures in the human mind, it has to take a different form to be more easily understood by novices.

“One might say that the rule formulation that takes place when researchers summarize their work into theories is characteristic of the culture of research, of researchers, and of theoretical activity, but [..] something essential may be lost by this summarizing [..]”

This is indeed a problem in mathematics: When getting ready to write down their mathematical concepts, mathematicians do not fully articulate their thoughts but leave out parts that well-experienced mathematicians can fill in. Students, however, lack the observations/experiences/examples/cases that experts have acquired throughout the years and have difficulties in understanding theoretical and abstract writings.

Further Reading

  • Robert Stake (1995): The Art of Case Study Research
  • Charles Ragin and Howard Becker (1992): What Is a Case?
  • Scenario-based techniques

Discussion with Cristian Calude

Thursday, October 16th, 2008

My first meeting with Cristian Calude has brought up many interesting aspects and thoughts, which I briefly sketch here and do not yet fully understand …
Please note that Cris and all other researchers mentioned do not necessarily share my personal opinions and summaries on this page.

  • Mathematical Collaboration: In mathematics (and other disciplines), interdisciplinary authors often only know parts of the paper they co-author. If you contact one of the authors, you are sometimes directed to the original writer, as your contact might not be able to tell you about all paragraphs in the paper. Cris has experienced this when writing papers with Chaitin: although both are well-experienced mathematicians, they have different ways of thinking (more visual in contrast to more analytical thinking), which sometimes made it difficult to understand each other’s solutions.
  • Mathematical Proofs: Cris’s friend, a logician, never looks at proofs. If he likes a theorem, he tries to prove it himself; studying other people’s proofs is of no interest to him.
  • Mathematical Practice: There is a research group of ethno-mathematicians here at the University of Auckland (e.g. see Bill Barton) looking at questions such as: How does the mother tongue language influence mathematics? Are the metaphors in different languages the same?
  • Struggling with Notations: When giving a lecture in Leipzig, Cris experienced that the students struggled with his lecture. They understood the ideas, but when it came to solving assignments and applying the concepts from the lecture, they struggled with the notations introduced by Cris, as they had learned others.
  • Mathematicians and Computers: Cris pointed to an interesting question: How can we train mathematicians to use computers and not to be afraid of their results (i.e. to trust automatic proofs)? Many parts are implicit when proving, but computers need everything made explicit. We have two ways of bringing mathematicians and computers closer together: (1) Making automatic provers more accessible and human-oriented, i.e. decreasing the level of formality needed for interacting with formal proof systems (see MKM proceedings on e.g. Mizar, i.e. “an attempt to reconstruct mathematical vernacular into a formal language which can be read by humans and also verified by software”, or Coq, Mathematica) and (2) Training mathematicians so that their encodings are more accessible for machines, i.e. in order to use computers, mathematicians need to change their way of encoding knowledge.
  • Mathematicians and Computers: For (2), Cris suggested starting by looking at referee processes: What kind of questions does a referee ask? What aspects do mathematical referees look at? They mostly do not fully check the presented proof but pose specific questions, which may help us to identify how knowledge has to be encoded to be communicated more efficiently (between humans and eventually between humans and computers).
  • Mathematical Practice: There are two processes in mathematics: (1) The process of discovery, which is very informal, includes several gaps, and is a slow and iterative approaching of a more and more coherent solution. (2) The process of presenting a result, i.e. the writing up of a solution. Both processes are valuable. We know several tools supporting (2) and in our group we also aim at supporting this process (with a Wiki for collaborative editing and a management of change system, …). However, process (1) requires tools that do not kill the creativity of mathematicians, i.e. that do not enforce notations/ styles and require tedious and correct mark-up of thoughts … but leave room for intuitions, gaps, and mistakes. And also eLearning systems (e.g. panta-rhei) should support process (1) as mathematical education should not focus on the presentation of results but should teach students how to discover math. However, towards the end of process (1), mathematicians might have to/ want to draw on computers (theorem provers and computer algebra systems) to verify their results.
  • Confidence in and Acceptance of proofs: In mathematics we observe different degrees of confidence in a proof, i.e. a proof is accepted in a social process and on different layers: (1) It can be published in a journal (and thus reviewed and accepted by the journal chairs), (2) it can be accepted by reviewers of the two reviewing databases, Mathematical Reviews and Zentralblatt, and (3) it can be certified (i.e. verified by computers; see e.g. Coquand et al). Moreover, (4) the credibility of a theory increases with the number of proofs for its theorems (see e.g. the collection of 79 approaches to proving the Pythagorean theorem) and its use in different contexts (I am not yet sure what Cris means by context. My first guess is that this relates to the reuse of a theory and its interrelation to other theories, i.e. the number of views and theory morphisms. Basically, the structures that were used by Immanuel and Florian to do their PhDs on logic translation and theory management; see next meeting). To sum up, a proof is not necessarily correct, but the confidence in it can increase.
  • Communal Acceptance: The Clay Mathematics Institute of Cambridge, Massachusetts (CMI) offers one million dollars for each of its Millennium Prize Problems. The respective proof has to be published and withstand community scrutiny for 2 years. Thus, the mathematical community is very aware of the fact that merely publishing a proof does not mean that it is correct.

Public Lecture by Prof. Alan Lightman

Thursday, October 16th, 2008

(copied from the call for participation)

Alan Lightman is a novelist, essayist, physicist, and educator. Currently, he is Adjunct Professor of Humanities at the Massachusetts Institute of Technology (MIT).

The Discoveries
Several years ago, Alan Lightman asked physicists, chemists and biologists to nominate the most important scientific discoveries in the twentieth century in their respective fields. He condensed this list to about two dozen discoveries, which included the discovery of the first hormone, the discovery of special relativity, the discovery of the uncertainty principle, the discovery of how nerves communicate with each other, the discovery of the structure of DNA, and the discovery of the chemical bond.

In this lecture, drawing on his book The Discoveries, Professor Lightman will review some of these landmark scientific discoveries and briefly describe the impact and significance of the work, as well as talk about the personalities and human drama of the scientists involved and some common patterns in the process of discovery.

Flexible, Autonomous Behavioral Control

Wednesday, October 8th, 2008

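Flexible, autonome Verhaltenskontrolle: Erlernen und Adaptation von Sensormotorischen Raumrepräsentationen (Flexible, Autonomous Behavioral Control: Learning and Adaptation of Sensorimotor Spatial Representations)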
Keynote by Martin Butz (COBOSLAB, University of Würzburg) at Lernen, Wissen, Adaptivität (LWA 2008), University of Würzburg, 6.-8. October 2008. Track: FGWM

On the Evaluation of Personal Knowledge Management Solutions

Wednesday, October 8th, 2008

Evaluating Tools of the X-COSIM Semantic Desktop
Presentation by Thomas Franz at Lernen, Wissen, Adaptivität (LWA 2008), University of Würzburg, 6.-8. October 2008. Track: FGWM

Hypothesis: PIM benefits from information linkage and information reuse across PIM applications.
Method: Using RDF and Semantic Web technologies.

X-COSIM provides the X-COSIMO ontology:
Among others, X-COSIMO defines a formal representation of contextual information. The contextual ontology includes concepts such as email, attachment, sender, and recipient. An attachment, for example, contextualizes an information object, while sender and recipient contextualize agents in the system.

Evaluation:
Does an X-COSIM enabled desktop provide better support for PIM tasks than a conventional one?
Better means: increased effectiveness (absolute time spent on each task), increased efficiency (goals reached: distance of mouse movements, number of window switches, …), increased satisfaction (questionnaire, ratings, interview).

  • 18 participants: 3 graduate students and 15 Ph.D. students; none of them had used the semantic desktop before.
  • Introduction (to the scenario, to the dataset, getting acquainted with the system), observation, and feedback phase.
  • Scenario: Real data selected from the organizers of the Night of Computer Science in Koblenz (more than 140 emails, 44 files, 40 files via eMail, …)
  • Tasks: (1) organization tasks (familiarization: all emails in one folder and participants had to create a folder structure), (2) lookup tasks (baseline: re-finding information; baseline as this feature was not expected to be of greater benefit in contrast to others), (3) multi-item tasks (evaluation), (4) document-driven collaboration (evaluation), (5) information collation (evaluation)
  • Evaluation Wizard: guiding the user through the evaluation; presenting the lookup tasks, …, and questionnaire. The wizard tracked the execution time for each task.

Results: Semantic desktop can improve PIM.

Implementation: Runs on KDE with Thunderbird.

The Anti-Social Tagger – Detecting Spam in Social Bookmarking Systems

Wednesday, October 8th, 2008

Presentation by Andreas Hotho at Lernen, Wissen, Adaptivität (LWA 2008), University of Würzburg, 6.-8. October 2008. Track: KDML

The social bookmarking system BibSonomy has to deal with a lot of spam, which hampers the quality of search results and navigation. This talk focuses on detecting users as spammers, making all their posts invisible in the system. This decision is based on their tagging and personal data such as eMail etc. The authors present a framework that allows for automatic classification of spammers.

How to detect spammers: Checking all their tags and, possibly, the bookmarked sites. Posts are identified as spam if:

  • Tags describing a web page do not fit the content of the site.
  • Tags and/or topic of a post are not interesting for the system.

Problems:

  • Subjective notion of what is spam
  • No cross-check; noise
  • Only two classes: spam or non-spam
  • Identifying spammers may not be granular enough; it may be better to flag individual posts as spam
  • User may have several accounts

Features:

  • Profile features (digits in name, digits in mails, length of the names, mails)
  • Activity features (time between registration and first post, number of tags per post – spammers use more, …)
  • Location features (number of users in the same domain or IP address)
  • Semantic features (an automatic tag from spamming software, “$Group”, which can be used to make tags public in some bookmarking systems; a blacklist of spam tags; co-occurrence information such as “a spammer shares resources with about 18 other spammers, but only with 0.5 non-spammers”)

Classification algorithms: SVM (best), J48, Logistic regression (worse), and Naive Bayes
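To illustrate what the profile and activity features listed above might look like in practice, here is a minimal sketch. It is not the authors' framework: the `User` record, its fields, and `extract_features` are hypothetical names I introduce for illustration, and the resulting feature dictionary would still have to be fed into a classifier such as an SVM, which is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class User:
    """Hypothetical user record; the real BibSonomy data model is not shown in the talk notes."""
    name: str
    email: str
    seconds_to_first_post: float  # time between registration and first post
    tags_per_post: List[int]      # number of tags on each post


def count_digits(text: str) -> int:
    """Count the digit characters in a string."""
    return sum(ch.isdigit() for ch in text)


def extract_features(user: User) -> Dict[str, float]:
    """Compute a few of the profile and activity features named above."""
    posts = user.tags_per_post
    mean_tags = sum(posts) / len(posts) if posts else 0.0
    return {
        # Profile features
        "digits_in_name": count_digits(user.name),
        "digits_in_email": count_digits(user.email),
        "name_length": len(user.name),
        "email_length": len(user.email),
        # Activity features
        "seconds_to_first_post": user.seconds_to_first_post,
        "mean_tags_per_post": mean_tags,
    }


example = User("bob123", "bob99@example.com", 30.0, [10, 20])
features = extract_features(example)
print(features["digits_in_name"], features["mean_tags_per_post"])  # prints: 3 15.0
```

Location and semantic features would need data about other users (shared domains, shared resources) and are therefore left out of this per-user sketch.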

Capturing the needs of amateur web designers by means of examples

Wednesday, October 8th, 2008

Presentation by Victor de Boer (Human-Computer Studies Laboratory; University of Amsterdam) at Lernen, Wissen, Adaptivität (LWA 2008), University of Würzburg, 6.-8. October 2008. Track: ABIS

The authors present a tool (SiteGuide) that supports amateur web users in designing web pages by helping them to select and organize information for their pages. Users input a set of example sites that are similar to their intended website. SiteGuide scrapes and analyses the sites and captures their commonalities in a web site model. From this, SiteGuide generates a site for the users. In addition, the system can show the differences between a user’s draft site and its internal model of the example sites.

Computational Intelligence for Communication and Cooperation Guidance in Adaptive eLearning Systems

Tuesday, October 7th, 2008

Presentation by Mirjam Köck at Lernen, Wissen, Adaptivität (LWA 2008), University of Würzburg, 6.-8. October 2008. Track: ABIS

Motivation: eLearning has become very popular, but simple approaches to implementing adaptivity still dominate. Computational intelligence is under-represented. The authors focus on challenges such as autonomous knowledge acquisition, autonomous pattern identification at run-time, and the expression of patterns in rules.

Adaptive learning guidance includes

  • Navigation through learning materials
  • Guiding communication and cooperation activities (suggesting communication partners, contact persons for questions, …)

Approach: Communication and collaboration activities (rather than contents/tags) are used as input for the user models. Groups of learners are identified based on communication and learning behaviours (observing the style of learning, the level of activities, activity patterns).

Collecting information such as the user’s online time, actions related to communication (read, write, update, delete), user’s current knowledge, learning activities (time needed for test, time spent on content before taking tests, performance of tests), content of communication items

Relations of interest: How is a user’s time spent on communication related to the learning curriculum?; does the knowledge state influence communication?; what is the degree of similarity between a user’s activity level in the communication area and content area?; …

Promising Technologies: Artificial Neural Networks (can discover activity clusters, can adapt components e.g. change weights, do not depend on continuous human intervention; but: blackbox-syndrome, missing explanation capability, rule extraction is difficult); Combined Neuro-Fuzzy Approaches, Bayesian Networks (combination of domain knowledge and data; derivation of causal relationships) -> Combination allows using Neural Networks e.g. for learning and making the hidden sector of Neural Networks more visible.

Prospect: Improving adaptation; reducing the human effort needed to keep model data accurate and up to date; semi-automatic pattern recognition, classification and evaluation at run-time; prediction of behaviour based on correlations; integration of CI approaches into popular learning environments (Sakai; see also Stephan Weibelzahl)

Towards an Automatic Service Composition for Generation of User-Sensitive Mashups

Tuesday, October 7th, 2008

Presentation by Thomas Fischer (University of Jena) at Lernen, Wissen, Adaptivität (LWA 2008), University of Würzburg, 6.-8. October 2008. Track: ABIS.

Mashups extract and combine data, functionalities, etc. from different websites into a single integrated tool.

See also Why Mashups = (REST + ‘Traditional SOA’) * Web 2.0

Evaluating the Usability of Adaptive Recommendations

Tuesday, October 7th, 2008

Presentation by Stephan Weibelzahl (National College of Ireland) at Lernen, Wissen, Adaptivität (LWA 2008), University of Würzburg, 6.-8. October 2008. Track: ABIS.

Stephan Weibelzahl developed HTML-Tutor, an interactive learning environment which offers an introduction to HTML and publishing on the Web.

Most systems in academia remain at the prototype level with poor usability. This is acceptable, as prototypes are vehicles to provide a proof of concept for an approach. However, in the long run scientists need to demonstrate the effects and impact of adaptive systems. Consequently, we need to start taking usability criteria into account when developing and evaluating scientific software. The authors aim at analysing the effects of usability on adaptation.

An adaptive peer finder is presented. The system is based on the adaptive learning system AHA! by de Bra and Calvi.
