Subtracks – Multilingual clinical summarization

Subtracks Overview

MultiClinSum-2 will feature one subtrack for each of 15 languages (English, French, Spanish, Portuguese, Italian, Russian, Catalan, Norwegian, Danish, Romanian, German, Greek, Dutch, Czech, and Swedish), and each subtrack will be evaluated independently. Participants may use monolingual models, multilingual models, or both, as well as any existing summarization resources, as long as they report them.

Participation Options

Language Selection: Teams may choose to participate in as many or as few language subtracks as they wish. There is no requirement to submit results for all 15 languages, allowing participants to focus on languages relevant to their research interests or available resources.

Modeling Approaches: Both language-specific and cross-lingual approaches are welcome. Participants may develop dedicated monolingual models tailored to individual languages, deploy a single multilingual model across multiple subtracks, or combine both strategies. All approaches will be evaluated on equal terms within each language subtrack.

Resource Usage: Participants are encouraged to use any available resources, tools, or pre-existing datasets that may improve their summarization systems. These include pre-trained language models, external medical vocabularies, translation tools, and supplementary clinical corpora. All resources used must be fully disclosed in the system description paper.

Evaluation

Each language subtrack will be evaluated independently, using the same metrics for every language. Systems are therefore compared only against other systems within the same language, which keeps the comparison fair while acknowledging that different languages pose different challenges.
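
As a rough illustration of what independent, per-language scoring means in practice, the sketch below groups submissions by language and averages the same metric separately within each group. It is not the official evaluation script: the function names, the (language, prediction, reference) input format, and the toy unigram-overlap F1 metric are all assumptions made for this example, since the actual metrics are not specified in this section.

```python
from collections import Counter, defaultdict


def token_f1(prediction: str, reference: str) -> float:
    """Placeholder metric (an assumption, not the official one):
    unigram-overlap F1 computed on lowercased whitespace tokens."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both texts.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def score_by_subtrack(submissions, metric=token_f1):
    """Average one metric separately within each language subtrack.

    `submissions` is an iterable of (language, prediction, reference)
    tuples; this input format is hypothetical.
    """
    per_language = defaultdict(list)
    for language, prediction, reference in submissions:
        per_language[language].append(metric(prediction, reference))
    # One score per language; scores are never pooled across languages.
    return {lang: sum(vals) / len(vals) for lang, vals in per_language.items()}
```

Because the result holds one entry per language, a team that submits to only a few subtracks simply receives scores for those languages and nothing for the rest.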