Welcome to the WebCLEF web site.
The web is the natural and most common setting for multilingual and cross-lingual retrieval. In the European context, many of the issues for which people turn to the web are essentially multilingual, including culture, economy, education, leisure, and travel. For IR folks, working with web data is simply very attractive.
WebCLEF evaluates cross-language retrieval systems in a web setting. It has been running for three years now, with a shift from navigational queries to informational ones between 2006 and 2007:
- For WebCLEF 2005, the EuroGOV collection was built and used, consisting of a
crawl of governmental sites in Europe; WebCLEF 2005 participants developed 575
known-item topics against EuroGOV.
Please consult the pages on WebCLEF 2005 for further details.
- For WebCLEF 2006, close to 20 teams signed up, but only 8 took part.
After WebCLEF 2006 we had a test collection containing over 700 known-item
topics (in close to 20 languages), and a shift in direction was deemed appropriate.
Please see the WebCLEF 2006 pages for further details.
- For WebCLEF 2007 a new, informational task was defined. Further details can
be found on this page. Briefly, we implemented
a multilingual "information synthesis" task, where, for a given topic,
web pages were fetched from the live web, from which participating
systems then had to extract important snippets.
- For WebCLEF 2008 we will build on the 2007 task and again consider informational queries, harvesting important information from web search results produced by a commercial search engine. The best-performing system from 2007 will be made available for participants to use as their baseline.
To get involved, visit the page with contact details.