SympTEMIST Subtasks
SympTEMIST is made up of three different subtasks, each with its own aim:
Subtask 1: SymptomNER (Symptoms, Signs & Findings Named Entity Recognition)
In this main subtask, participants are asked to automatically detect mention spans of symptoms (also covering signs and findings) in clinical reports written in Spanish. Using the SympTEMIST corpus (manually labelled mentions) as training data, they must build systems that return the start and end positions (character offsets) of all symptom entities mentioned in the text. The main evaluation metrics for this task are precision, recall and F-score.
Note that we also provide access to additional annotations of the training documents beyond symptoms, which could be exploited to further improve systems, namely the annotation of a) diseases (DisTEMIST corpus), b) clinical procedures (MedProcNER corpus) and c) chemical compounds, drugs and genes/proteins (PharmaCoNER corpus). Links to these resources will be posted on the SympTEMIST corpus Zenodo page.
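To make the evaluation concrete, the following is a minimal sketch of strict-span scoring under the assumption that both gold and predicted annotations are sets of (document id, start offset, end offset) tuples; the file format and any entity labels of the actual corpus are omitted here, and this is not the official evaluation script.

```python
# Minimal sketch of strict (exact-offset) span evaluation for SymptomNER.
# Assumes annotations are (doc_id, start, end) character-offset tuples;
# doc ids and offsets below are illustrative placeholders.

def prf(gold, pred):
    """Micro-averaged precision, recall and F1 over exact span matches."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)  # predicted spans matching a gold span exactly
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = {("doc1", 10, 25), ("doc1", 40, 52), ("doc2", 5, 18)}
pred = {("doc1", 10, 25), ("doc1", 41, 52), ("doc2", 5, 18)}  # one boundary error
p, r, f = prf(gold, pred)  # 2 exact matches out of 3 predictions and 3 gold spans
```

Under strict matching, a single-character boundary error (41 instead of 40) counts as both a false positive and a false negative, which is why exact offsets matter.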
Subtask 2: SymptomNorm (Symptom Normalization & Entity Linking)
Given a list of symptom mentions in Spanish provided by the organizers (a subset derived from Subtask 1), participating teams are asked to automatically normalize, i.e. map, these mentions to their corresponding SNOMED CT concept identifiers. The training data will consist of a set of manually mapped symptom mentions. The test set will consist of a list of symptom mentions provided by the organizers that participants must map automatically to SNOMED CT codes; these predictions will then be evaluated against the corresponding manually assigned identifiers. The evaluation metric for this task is accuracy, understood as the percentage of correctly normalized mentions out of the total.
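The accuracy computation described above can be sketched as follows; the mention-to-code dictionaries are assumptions about the submission shape made for illustration, and the SNOMED CT codes shown are examples only, not guaranteed to match the actual terminology.

```python
# Minimal sketch of the SymptomNorm accuracy metric, assuming each mention
# is mapped to exactly one SNOMED CT code. Codes are illustrative examples.

def accuracy(gold_codes, pred_codes):
    """Fraction of mentions whose predicted code equals the gold code."""
    correct = sum(1 for mention, code in gold_codes.items()
                  if pred_codes.get(mention) == code)
    return correct / len(gold_codes)

gold_codes = {"dolor torácico": "29857009", "fiebre": "386661006",
              "cefalea": "25064002", "disnea": "267036007"}
pred_codes = {"dolor torácico": "29857009", "fiebre": "386661006",
              "cefalea": "398057008", "disnea": "267036007"}  # one wrong code
accuracy(gold_codes, pred_codes)  # 3 of 4 mentions correct -> 0.75
```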
Subtask 3: SymptomMultiNorm (Experimental English/multilingual Symptom Normalization)
This is an experimental subtask aimed at promoting entity linking and clinical concept normalization approaches for several languages, namely English, Portuguese, French, Italian and Dutch. Additional resources will also be released for other languages (not included in the evaluation set) with a particular need for entity-linking resources. In this task we will explore the use of automatic translation strategies to examine the strengths and limitations of such approaches for clinical concept normalization, focusing initially on a selected number of target languages.
As training data, the organizers will release, for each target language, a list of automatically translated symptom mentions together with their SNOMED CT codes. Note that the SNOMED CT codes were derived from the corresponding original symptom annotations in Spanish (Subtask 2). For evaluation purposes we will focus on five languages: English (SymptomMultiNorm-en), French (SymptomMultiNorm-fr), Portuguese (SymptomMultiNorm-pt), Dutch (SymptomMultiNorm-nl) and Italian (SymptomMultiNorm-it). As the test set, a list of translated mentions will be provided for each language, and participants have to return the corresponding SNOMED CT code for each mention. Teams may participate in any subset of the languages; submitting for all of them is not mandatory. The main evaluation metric for this task is accuracy.
As additional resources, we will also release versions for other languages not included in the evaluation process: Catalan, Swedish, Romanian and Czech, and at a later stage also Danish and German.
Note that teams can submit predictions for any of the five official languages of the SymptomMultiNorm subtask. Since this is an experimental task, we encourage teams to approach it as they like: building a monolingual model for each language, building a truly multilingual model that tackles all of them at once, or taking a completely different approach.
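Since submissions may cover any subset of languages, evaluation naturally decomposes into one accuracy score per language. The sketch below assumes one (language, mention, gold code, predicted code) record per test mention; the language tags, mentions and codes are illustrative placeholders, and this is not the official scorer.

```python
# Minimal sketch of per-language scoring for SymptomMultiNorm.
# Assumes one (language, mention, gold_code, pred_code) record per mention;
# all values below are illustrative placeholders.
from collections import defaultdict

def per_language_accuracy(records):
    """Accuracy per language tag; absent languages are simply not scored."""
    correct, total = defaultdict(int), defaultdict(int)
    for lang, _mention, gold, pred in records:
        total[lang] += 1
        correct[lang] += (gold == pred)
    return {lang: correct[lang] / total[lang] for lang in total}

records = [
    ("en", "chest pain", "29857009", "29857009"),
    ("en", "fever", "386661006", "386661006"),
    ("fr", "céphalée", "25064002", "398057008"),  # wrong code
    ("fr", "dyspnée", "267036007", "267036007"),
]
per_language_accuracy(records)  # {"en": 1.0, "fr": 0.5}
```

Scoring each language independently means a team submitting only, say, English and French receives scores for exactly those two tracks.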