A MathSearch Competition

In my last post I wrote about a new search engine I had just learned about. We should really have a competition and an example library for math search engines. We talked about this some years back, but we really need to get our act together, probably in time for the next MKM.

I can see five tasks that we have to accomplish for a competition:

  1. Collect a search corpus. The arXiv seems like the right place to start: it is big enough that we can pick competition examples at random.
  2. Cooperate on an analysis pipeline and on pre-analyzed corpora. This would allow people to take part without having a full analysis pipeline of their own.
  3. Collect a corpus of search queries. This may be the biggest hurdle, since we need a gold standard of what we expect the hits to be.
  4. Come up with “divisions”. Not all engines can do the same things, so we should only let comparable engines compete against each other; multiple divisions will also allow us to award multiple trophies.
  5. Build a competition harness, so that tests can be automated. This will require, and thus lead to, general search APIs.
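
To make tasks 3 and 5 a bit more concrete, here is a minimal sketch of what such a harness could look like. Everything in it is an assumption for illustration: the `SearchEngine` signature (a query string in, a ranked list of document IDs out), the shape of the gold standard (a set of expected hits per query), and the choice of precision and recall at k as the score. A real competition would have to agree on all of these.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical common API: an engine maps a query to a ranked list of document IDs.
SearchEngine = Callable[[str], List[str]]


def precision_recall_at_k(hits: List[str], relevant: Set[str], k: int) -> Tuple[float, float]:
    """Score one ranked result list against the gold-standard hits for one query."""
    # Drop duplicate IDs (keeping rank order) so a repeated hit is not counted twice.
    top = list(dict.fromkeys(hits))[:k]
    found = sum(1 for doc in top if doc in relevant)
    precision = found / len(top) if top else 0.0
    recall = found / len(relevant) if relevant else 0.0
    return precision, recall


def run_competition(
    engines: Dict[str, SearchEngine],
    gold: Dict[str, Set[str]],
    k: int = 10,
) -> Dict[str, Dict[str, float]]:
    """Run every engine on every gold-standard query and average the scores."""
    scores: Dict[str, Dict[str, float]] = {}
    for name, engine in engines.items():
        p_sum = r_sum = 0.0
        for query, relevant in gold.items():
            p, r = precision_recall_at_k(engine(query), relevant, k)
            p_sum += p
            r_sum += r
        n = len(gold)
        scores[name] = {
            "precision": p_sum / n if n else 0.0,
            "recall": r_sum / n if n else 0.0,
        }
    return scores
```

Divisions (task 4) would then simply be separate calls to `run_competition`, each with its own set of engines and its own subset of the gold standard.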

This is all I can think of at the moment, so give me your feedback.
