Organizers will provide training data for all language pairs.
Participants must submit at least a system trained only with the provided data (constrained) for each of the language pairs they would like to participate.
Additionally, participants can use additional training data (not provided by the organisers) or existing translation systems specifying a flag that the system uses additional data (unconstrained).
System submissions that used the provided training data (constrained) will be distinguished from submissions that used additional data resources (unconstrained).
Note that basic linguistic tools such as taggers, parsers, or morphological analyzers or multilingual systems are allowed in the constrained condition.
Participants will use their systems to translate a test set of unseen sentences in the source language.
The translation quality is measured by various automatic evaluation metrics (BLEU will be the main evaluation metric for the 1st round).
You may participate in any or all of the language pairs.
Organizers will provide a framework to show the results to be compared.
Participant Repository:
Participants are provided with a single repository for all the tasks they take part in.
The repository contains the runs, resources, code, and report of each participant.
The repository is organised as follows:
-
submission
: this folder contains the runs submitted for the different tasks in the different evaluation rounds.
-
score
: this folder contains the performance scores of the submitted runs.
-
code
: this folder contains the source code of the developed system.
-
resource
: this folder contains (language) resources created during the participation.
-
report
: this folder contains the rolling technical report describing the techniques applied and insights gained during participation, round after round.
Covid-19 MLIA Eval consists of three tasks run in three rounds.
Therefore, the submission
and score
folders are organized into sub-folders for each task and round as follows:
-
submission/task1/round1
: for the runs submitted to the first round of the first task. Similar structure for the other tasks and rounds.
-
score/task1/round1
: for the performance scores of the runs submitted to the first round of the first task. Similar structure for the other tasks and rounds.
Participants which do not take part in a given task or round can simply delete the corresponding sub-folders.
The goal of Covid-19 MLIA Eval is to speed up the creation of multilingual information access systems and (language) resources for Covid-19 as well as openly share these systems and resources as much as possible. Therefore, participants are more than encouraged to share their code and any additional (language) resources they have used or created.
All the contents of these repositories are released under the Creative Commons Attribution-ShareAlike 4.0 International License.
Rolling Technical Report:
The rolling technical report should be formatted according to the Springer LNCS format, using either the LaTeX template or the Word template. LaTeX is the preferred format.
Corpora:
Utilities:
Automatic Evaluation:
A ranking with the results of the automatic evaluation is available at this website. This ranking will be updated periodically until the translations submission deadline is passed. New: Final results have been published in a very rough version of the findings rolling report.