Machine Translation

Covid-19 MLIA Eval

Task Description

The goal of the Machine Translation (MT) task is to evaluate translation systems on Covid-19-related text. The Covid-19 MT task addresses the following language pairs:

  • English-German
  • English-French
  • English-Spanish
  • English-Italian
  • English-Modern Greek
  • English-Swedish

The main challenge is that the text to be translated is specialized in the new and highly relevant topic of Covid-19. The task is open to beginners and established research groups from any area of interest in the scientific community, public administration, and industry. At the end of each round, participants will write or update an incremental report explaining their system. The report will highlight which methods and data have been used.

To participate in the Machine Translation task, groups need to register at the following link:

Register

Important Dates:

Release of training data: September xx, 2020

Release of test data: September xx, 2020

Results submission deadline: October xx, 2020

Report submission: November xx, 2020

Participation Guidelines

Organizers will provide training data for all language pairs. Participants can use these data and are also free to use out-of-domain data. Participants will use their systems to translate a test set of unseen sentences in the source language. The translation quality is measured by various automatic evaluation metrics. You may participate in any or all of the language pairs. Organizers provide a common framework to upload submissions where results can be compared. Multilingual systems, parsers, or morphological analyzers are allowed.
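
As an illustration of such automatic metrics, the sketch below computes corpus-level BLEU and chrF with the sacrebleu Python library. The file names are hypothetical, and the metrics shown are examples for offline sanity checks, not necessarily the official task measures.

    # Minimal self-evaluation sketch using sacrebleu (pip install sacrebleu).
    # File names are hypothetical; official scoring is performed by the organizers.
    import sacrebleu

    # One sentence per line; hypotheses aligned line-by-line with references.
    with open("system_output.txt", encoding="utf-8") as f:
        hypotheses = [line.strip() for line in f]
    with open("reference.txt", encoding="utf-8") as f:
        references = [line.strip() for line in f]

    # corpus_bleu/corpus_chrf take the hypotheses and a list of reference streams.
    bleu = sacrebleu.corpus_bleu(hypotheses, [references])
    chrf = sacrebleu.corpus_chrf(hypotheses, [references])
    print(f"BLEU: {bleu.score:.2f}")
    print(f"chrF: {chrf.score:.2f}")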

Participant Repository:

Participants are provided with a single repository for all the tasks they take part in. The repository contains the runs, resources, code, and report of each participant.

The repository is organized as follows: Covid-19 MLIA Eval consists of three tasks run in three rounds, so the submission and score folders contain sub-folders for each task and round.
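
As an illustration, a participant repository might be laid out as sketched below. This is only a sketch: the task and round numbering follows the submission example given later, and the exact names of the code and resources folders are assumptions based on the repository contents described above.

    <teamname>/
        submission/
            task1/
                round1/  round2/  round3/
            task2/
                round1/  round2/  round3/
            task3/
                round1/  round2/  round3/
        score/          (same task and round structure as submission)
        code/
        resources/
        report/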

Participants who do not take part in a given task or round can simply delete the corresponding sub-folders.

The goal of Covid-19 MLIA Eval is to speed up the creation of multilingual information access systems and (language) resources for Covid-19, and to share these systems and resources as openly as possible. Therefore, participants are strongly encouraged to share their code and any additional (language) resources they have used or created.

All the contents of these repositories are released under the Creative Commons Attribution-ShareAlike 4.0 International License.

Rolling Technical Report:

The rolling technical report should be formatted according to the Springer LNCS format, using either the LaTeX or the Word template. LaTeX is the preferred format.

Corpora:

The training data contains over 1 million examples. The dataset is available here.

The test dataset will be available soon.

Submission Guidelines

Participating teams should satisfy the following guidelines:

Submission Upload:

Runs should be uploaded in the repository provided by the organizers. Following the repository structure discussed above, for example, a run submitted for the first round of the Machine Translation task should be included in submission/task3/round1.

Runs should be uploaded with the following naming convention: <teamname>_task3_<freefield>, where teamname is the name of the participating team, task3 is the identifier of the Machine Translation task, and freefield is a free field that participants can use as they prefer.
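
For example, a hypothetical team named acme could name its primary run acme_task3_primary and, for the first round, upload it to submission/task3/round1.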

Performance scores for the submitted runs will be returned by the organizers in the score folder, which follows the same structure as the submission folder.

The rolling technical report has to be uploaded and kept up to date in the report folder.

Here, you can find a sample participant repository to get a better idea of its layout.

Evaluation:

The quality of the submitted systems will be evaluated with automatic evaluation metrics.

Organizers

Manuel Herranz, Pangeanic, Spain
m.herranz@pangeanic.es

Mercedes García-Martínez, Pangeanic, Spain
m.garcia@pangeanic.com