Important Update: Proceedings paper submission deadline extended to 20 June at 15:00 CET.

MultiClinSUM Shared Task Homepage

The MultiClinSUM Track is organized by the Barcelona Supercomputing Center’s NLP for Biomedical Information Analysis group and promoted by Spanish and European projects such as DataTools4Heart, AI4HF and BARITONE.

What is MultiClinSUM?

MultiClinSUM is a shared task and set of resources focused on the multilingual summarization of clinical documents.

For more information about the MultiClinSUM task, check the Task Info tab, which includes the Motivation, Subtasks, Schedule, Registration and Submission pages.

To learn more about the MultiClinSUM corpus and how it was annotated, check the Data tab.

MultiClinSUM will be held as part of the BioASQ Workshop at the CLEF 2025 conference. For more information about both, check the Workshop tab.

MultiClinSUM is organized by the Barcelona Supercomputing Center’s NLP for Biomedical Information Analysis group (formerly Text Mining Unit).

Registration

You can register for the MultiClinSum task of the BioASQ 13 Workshop using the official BioASQ registration link. Make sure you select “Task MultiClinSum”.

Related resources

At the NLP for Biomedical Information Analysis group (formerly Text Mining Unit), one of our missions is the open publication of datasets to train and benchmark biomedical information extraction, normalization and indexing systems. For that reason, we have released multiple datasets as part of shared tasks over the years. If you are interested in MultiClinSUM, you might want to take a look at some of our resources and competitions about:

Clinical content extraction: DisTEMIST (diseases), MedProcNER/ProcTEMIST (clinical procedures), SympTEMIST (signs and findings), CANTEMIST (tumour morphology), CodiEsp (coding to ICD), PharmaCoNER (chemicals and proteins), LivingNER (species and humans), MultiCardioNER (diseases and medications, includes the DrugTEMIST corpus as well as cardiology-specific data)
Socio-demographic / Social Determinants of Health content extraction: MEDDOPLACE (locations and more), MEDDOCAN (sensitive data), MEDDOPROF (occupations), ToxHabits (new this year!)
Information extraction in social media: SocialDisNER (diseases), ProfNER (occupations)
Linguistic aspects: BARR1 and BARR2 (abbreviation resolution)
Machine Translation: ClinSpEn (EN<->ES clinical content translation)

Schedule

MultiClinSum Sample Set Release: April 9th, 2025

MultiClinSum Train/Dev Set Release: May 16th, 2025

MultiClinSum Test Set Texts Release: May 28th, 2025 (updated!)

Participant Test Predictions Deadline: June 3rd, 2025, 22:00 CET (IMPORTANT: Extended!)

Participant Evaluation Result Release: June 6th, 2025 (updated!)

Submission of Participant Papers Deadline: Update! Deadline extended to 20 June at 15:00 CET (see instructions)

Notification of Acceptance of Participant Papers: June 27th, 2025

Submission of Camera-ready Participant Papers Deadline: July 7th, 2025

BioASQ @ CLEF2025: September 9th-12th, 2025

Please refer to the Schedule page.

Motivation

There is a rapid accumulation of various types of clinical content, including medical records and specialized publications such as clinical case reports. As a result, it is becoming increasingly challenging for healthcare professionals, biomedical researchers, and patients to process lengthy clinical documents in order to gain a clear understanding and overview of the key medical insights underlying patient conditions and outcomes.

This challenge applies not only to clinical content in English but also to content in other widely spoken languages such as Spanish, French, and Portuguese. Recent advances in automatic summarization and the use of large language models (LLMs) have shown promising results in condensing lengthy clinical texts into shorter summaries while preserving essential clinical information.

Use cases and application scenarios for automatic clinical text summarization include:

Clinical Decision Support: Summarization condenses electronic health records (EHRs) into key events, diagnoses, medications, and outcomes, aiding timely and informed decisions.
Patient Discharge Summaries: Automatically generate readable, concise summaries from detailed clinical notes, enhancing continuity of care and patient understanding.
Medical Literature Review: Summarize key findings, methodologies, and outcomes from scientific articles to facilitate rapid literature reviews.
Multilingual Clinical Communication: Summarization systems, combined with translation, enable understanding across language barriers by distilling essential content from foreign-language reports.
Telemedicine & Remote Consultations: Summarized patient data supports fast review and focused diagnostics in remote care settings.
Clinical Trial Screening: Summarized profiles can accelerate eligibility assessment by highlighting key inclusion/exclusion criteria.
Medical Coding and Billing: Summarization helps highlight billable actions, diagnoses, and procedures to streamline medical coding workflows.
Patient-Facing Summaries: Summarization tools can generate simplified or lay-language versions of complex medical content to improve patient engagement and health literacy.

In light of recent technological advances, there is a pressing need to evaluate and benchmark the effectiveness of clinical summarization for case reports written in different languages.

MultiClinSum focuses on the automatic summarization of full clinical case reports in multiple languages, namely English, Spanish, French, and Portuguese. Since clinical case reports share certain similarities with electronic medical records, particularly discharge summaries, the insights derived from MultiClinSum may have practical relevance for clinical text analysis and various medical applications.

This task is based on a corpus of manually selected full clinical case reports along with their corresponding author-provided short summaries.

For evaluation, automatically generated summaries will be compared against the human-written summaries provided by the original clinical authors, using ROUGE-2 and BERTScore. Each language will be evaluated separately, and participants are encouraged to explore both monolingual and multilingual approaches. Accordingly, the following sub-tasks are posed: MultiClinSum-en (English data), MultiClinSum-es (Spanish data), MultiClinSum-fr (French data) and MultiClinSum-pt (Portuguese data).
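To give a feel for what a bigram-overlap score of this kind measures, here is a minimal Python sketch of a ROUGE-2 style precision, recall and F1 between one generated summary and one reference summary. It is an illustration only, not the official evaluation script: it assumes lowercasing and whitespace tokenization, whereas the organizers' scorer may tokenize, stem or aggregate differently. The function names and the example sentences are hypothetical and are not taken from the MultiClinSum corpus. BERTScore is not shown because it requires a pretrained language model.

```python
from collections import Counter


def bigrams(text):
    """Count the bigrams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge2(candidate, reference):
    """Return (precision, recall, f1) of the clipped bigram overlap."""
    cand, ref = bigrams(candidate), bigrams(reference)
    # Multiset intersection: a repeated bigram counts at most as often
    # as it appears in both texts.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example pair (assumption, for illustration only).
reference = "the patient was discharged after five days"
candidate = "the patient was discharged after surgery"
precision, recall, f1 = rouge2(candidate, reference)
print(f"ROUGE-2 P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this example the candidate shares four of its five bigrams with the reference, which has six bigrams, so precision is 4/5 and recall is 4/6. Since each language is evaluated separately, a score like this would be computed per sub-task rather than pooled across languages.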

Participating teams are requested to implement or evaluate automated systems that generate summaries from full clinical case reports.

Acknowledgements

The MultiClinSum track was funded by Spanish and European projects such as DataTools4Heart (Grant Agreement No. 101057849) and AI4HF (Grant Agreement No. 101080430). This publication is part of the R&D&I project TED2021-129974B-C22, funded by MICIU/AEI/10.13039/501100011033 and by the European Union Next Generation EU/PRTR (BARITONE, Proyectos de Transición Ecológica y Transición Digital 2021). We would also like to acknowledge the scientific committee members, Sophia Ananiadou, Horacio Saggion and Simon Mille, for their valuable feedback and suggestions regarding the task settings and evaluation scenarios, as well as the BioASQ organizers, especially Anastasios Nentidis, for their technical support.