CoSyne

Duration: 2010-2013
Funder: European Commission, FP7 STREP
Partners: Fondazione Bruno Kessler (Italy), Dublin City
University (Ireland), Heidelberg Institute for Theoretical Studies
(Germany), Netherlands Institute for Sound and Vision (The
Netherlands), Deutsche Welle (Germany), Dutch Chapter of the Wikimedia
Foundation (The Netherlands)

Website: http://www.cosyne.eu/
Summary: The combination of dynamic user-generated content and multi-lingual
aspects is particularly prominent in Wiki sites. Wikis have gained
popularity over the last few years as a means of collaborative
content creation because they allow users to set up and edit
web pages directly. A growing number of organizations use Wikis as an
efficient means to provide and maintain information across several
sites. Currently, multi-lingual Wikis rely on users to manually
translate different Wiki pages on the same subject. This is not only
time-consuming but also a source of many inconsistencies: users
update the different language versions separately, and keeping them
consistent would require translators to compare the versions and
synchronize the changes after every update. The overall aim of the CoSyne
project is to automate the dynamic multi-lingual synchronization
process of Wikis.

CoSyne addresses the following challenges:

  • achieve robust translation of noisy user-generated content
    between 6 languages (4 core languages and 2 languages with
    limited resources, included to demonstrate the adaptability of
    the system),

  • improve machine translation quality by segment-specific
    adaptive modeling,

  • identify textual content overlap between segments of Wiki
    pages across languages to avoid redundant machine translation,

  • identify the optimal insertion points for translated content
    to preserve coherence,

  • analyze user edits to distinguish between factual content
    changes and corrections of machine translation output, and exploit
    the latter to improve machine translation performance in a
    self-learning manner.

People

  • Amit Bronner
  • Caroline van Impelen
  • Christophe Costa-Florencio
  • Spyros Martzoukos
  • Christof Monz