July 2020 – CodiEsp

Participants' systems

NLP4LIFE: https://github.com/sarahESL/CLEFeHealth2020-multilabel-bert
Exeter: https://github.com/aollagnier/eda_classification
SWAP: https://github.com/marcopoli/CODIESP-10
IAM: https://github.com/scossin/IAMsystem
ICB-UMA: https://github.com/guilopgar/CLEF-2020-CodiEsp
Hulat: https://github.com/pqueipo/Codiesp-CLEF-2020-eHealth-Task1
IMS: https://github.com/gmdn/CLEF2020
Baseline: https://github.com/tonifuc3m/codiesp-baseline-lookup

CLEF 2020

CodiEsp is part of the eHealth track of the 2020 Conference and Labs of the Evaluation Forum (CLEF 2020), which this year is an online-only event.

Registration closes on September 19th: https://www.eventbrite.co.uk/e/clef-2020-conference-and-labs-of-the-evaluation-forum-tickets-116107862743

CodiEsp videos available on YouTube: https://www.youtube.com/playlist?list=PL5uSCzf1azhA0crlSVCYMPqMUWd4mXc4x

CLEF eHealth Schedule

Day 1: Wednesday 23rd September – Presentation

Local Time (Amsterdam)	Title	Presenter	Affiliation	More info
09:00-09:10	Welcome and Introduction	Liadh Kelly		TBC
09:10-09:25	Task 1 Overview: CodiEsp	Antonio Miranda-Escalada	Barcelona Supercomputing Center	TBC
09:25-09:40	Task 2 Overview	Lorraine Goeuriot, Zhengyang Liu, Chenchen Xu		TBC
09:40-10:25	Task 1 participant presentations
	FLE at CLEF eHealth 2020: Text Mining and Semantic Knowledge for Automated Clinical Encoding	Nuria García-Santa	Fujitsu Laboratories of Europe (FLE), Spain	TBC
	IAM at CLEF eHealth 2020: concept annotation in Spanish electronic health records	Sebastien Cossin	Univ. Bordeaux, France	TBC
	ICD-10 coding based on semantic distance: LSI_UNED at CLEF eHealth 2020 Task 1	Mario Almagro	National University of Distance Education (UNED), Spain	TBC

Day 2: Thursday 24th September – CodiEsp session

Local Time (Amsterdam)	Title	Presenter	Affiliation	More info
09:00-09:15	Introduction and recap from CodiEsp Overview	Antonio Miranda-Escalada	Barcelona Supercomputing Center	TBC
09:15-10:20	Task 1 participant presentations
	ICB-UMA at CLEF e-Health 2020 Task 1: Automatic ICD-10 coding in Spanish with BERT	Guillermo López-García	Universidad de Málaga, Spain	video
	A study of Machine Learning models for Clinical Coding of Medical Reports at CodiEsp 2020	Marco Polignano	University of Bari Aldo Moro, Italy	TBC
	Using the R Tidyverse for Multilingual Information Extraction. IMS UniPD at CLEF eHealth 2020 Task 1	Giorgio Maria Di Nunzio	University of Padua, Italy	video
	IXA-AAA at CLEF eHealth 2020 CodiEsp Automatic classification of medical records with Multi-label Classifiers and Similarity Match Coders	Alberto Blanco	University of the Basque Country UPV/EHU, Spain	TBC
	Text Augmentation Techniques for Clinical Case Classification	Anais Ollagnier	University of Exeter, UK	video
	Convolutional Attention Models with Post-Processing Heuristics at CLEF eHealth 2020	Elias Moons	KU Leuven, Belgium	TBC
	Fraunhofer AICOS at CLEF eHealth 2020 Task 1: Clinical Code Extraction From Textual Data Using Fine-Tuned BERT Models	João Costa	Fraunhofer Portugal AICOS, Portugal	video
	Multilingual ICD-10 Code Assignment with Transformer Architectures using MIMIC-III Discharge Summaries	Henning Schäfer	University of Applied Sciences and Arts Dortmund, Germany	TBC
10:20-10:30	Closing remarks and wrapup	Antonio Miranda-Escalada	Barcelona Supercomputing Center	TBC

Day 3: Friday 25th September – Consumer Health Search

Local Time (Amsterdam)	Title	Presenter	More info
09:00-09:40	TBC	Marco Viviani	TBC
09:40-10:10	Task 2 participant presentations
	SandiDoc at CLEF 2020 - Consumer Health Search : AdHoc IR Task	Sandaru Seneviratne	TBC
	A Study on Reciprocal Ranking Fusion in Consumer Health Search	Giorgio Di Nunzio	TBC
10:10-10:30	Discussion and wrapup	Lorraine Goeuriot	TBC

Citations

Participants’ working notes will be published in the CEUR-WS proceedings (http://ceur-ws.org/). The task overview paper will also be published in the CEUR-WS proceedings.

@InProceedings{CLEFeHealth2020Task1Overview,
author={Antonio Miranda-Escalada and Aitor Gonzalez-Agirre and Jordi Armengol-Estapé and Martin Krallinger},
title="Overview of automatic clinical coding: annotations, guidelines, and solutions for non-English clinical cases at CodiEsp track of {CLEF eHealth} 2020",
booktitle = {{Working Notes of Conference and Labs of the Evaluation (CLEF) Forum}},
series    = {{CEUR} Workshop Proceedings},
year      = {2020},
}

Since CodiEsp is part of the CLEF eHealth lab, the lab overview paper will be published in the Springer LNCS proceedings.

@InProceedings{CLEFeHealth2020LabOverview,
author={Lorraine Goeuriot and Hanna Suominen and Liadh Kelly and Antonio Miranda-Escalada and Martin Krallinger and Zhengyang Liu and Gabriella Pasi and Gabriela {Saez Gonzales} and Marco Viviani and Chenchen Xu},
title="Overview of the {CLEF eHealth} Evaluation Lab 2020",
booktitle = {{Experimental IR Meets Multilinguality, Multimodality, and Interaction: Proceedings of the Eleventh International Conference of the CLEF Association (CLEF 2020)}},
series    = {LNCS},
volume    = {12260},
year      = {2020},
editor = {Avi Arampatzis and Evangelos Kanoulas and Theodora Tsikrika and Stefanos Vrochidis and Hideo Joho and Christina Lioma and Carsten Eickhoff and Aurélie Névéol and Linda Cappellato and Nicola Ferro},
}

First Multilingual clinical NLP workshop: MUCLIN (MIE2020/EFMI)

There is increasing interest in exploiting clinical texts by means of language technologies and text mining approaches. Structured clinical information, in the form of codified clinical records relying on controlled vocabularies such as ICD-10, is a key resource for statistical analysis techniques applied to patient data.

Clinical natural language processing (NLP) and AI-based document indexing can yield tools for automatic clinical coding that directly exploit the unstructured content of EHRs. Such tools play an increasing role in complementing health informatics approaches to translational medicine challenges by providing relevant diagnostic information extracted from clinical narratives. Clinical codes generated by text mining can therefore provide the rich clinical context for patient health information that other data analysis processes, such as bioinformatics and omics data exploration, require.

The workshop will include a panel discussion and short flash talks with experts on the role of shared tasks to promote clinical and biomedical NLP.

Some of the proposed topics for the discussion are: generation of shareable data collections for clinical NLP and automatic coding systems, use of AI and deep learning methods applied to clinical text mining, exploitation of unstructured content of EHRs for translational medicine, explainable AI strategies, evaluation metrics and scenarios for automatic clinical coding systems, multilingual clinical coding strategies, and shared tasks.

The CodiEsp setting and results were presented at the First Multilingual Clinical NLP Workshop (MUCLIN) at MIE2020. The original MIE programme can be accessed here. This year, MIE was held virtually thanks to the collaboration of EFMI and took place on July 4th, 2020.

This workshop had two parts:

First, a general overview presentation covered shared tasks and the setting and results of one particular shared task, CodiEsp.

Then, a panel discussion featured flash talks by experts on the role of shared tasks in promoting clinical NLP, resources, tools, and evaluation methods.

The talks are available in the MUCLIN YouTube playlist.

Local Time (GMT+2)	Title	Presenter	Affiliation	More info
1:00 pm	Opening Remarks	Martin Krallinger	Barcelona Supercomputing Center	Video
1:05 pm	CodiEsp: clinical coding shared task in Spanish clinical case reports	Antonio Miranda	Barcelona Supercomputing Center	Video
1:20 pm	Panel Session: flash talks
Flash talk 1	FLE at CLEF eHealth 2020: Text Mining and Semantic Knowledge for Automated Clinical Encoding	Nuria García-Santa	Fujitsu	Video
Flash talk 2	Contribution of CLEF eHealth and WMT biomedical translation task to multilingual clinical NLP	Aurélie Névéol	CNRS, France	Video
Flash talk 3	The importance of shared tasks for biomedical text mining	Fabio Rinaldi	University of Zurich	Video
Flash talk 4	Multilingual Text Mining	Francisco Couto	University of Lisbon	Video
Flash talk 5	Lessons learnt on benchmarking at ELIXIR	Salvador Capella-Gutierrez	INB/Elixir, Barcelona Supercomputing Center	TBA
1:55 pm	Closing remarks	Martin Krallinger	Barcelona Supercomputing Center

See EFMI program.

MUCLIN program