As a complement to the Gold Standard MEDDOPROF corpus, we have updated the training data to include additional, automatically labelled annotations that participants may optionally use in their systems. This complement is called MEDDOPROF-CE (Complementary Entities).
The CE version of the training data includes the Shared Task’s original manual annotations (with the labels for tasks one and two joined together, e.g. “PACIENTE-PROFESION”) and automatically generated clinical and linguistic entities. In total, nine new entity types have been included: “síntoma” (symptom), “enfermedad” (disease), “procedimiento” (procedure), “fármaco” (drug), “org_vivo” (living organisms), “neg”/“nsco” (negation trigger and scope) and “unc”/“usco” (uncertainty trigger and scope).
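As a rough illustration of how these labels might be handled in a participant's pipeline, the Python sketch below separates the complementary entity types from the original Shared Task labels and splits a joined label at its hyphen. The function names are hypothetical, and the sketch assumes that labels appear in the files as spelled above (possibly with different casing) and that joined labels use a single hyphen as in “PACIENTE-PROFESION”; the actual files should be checked before relying on either assumption.

```python
from typing import Iterable, List, Tuple

# Entity type names exactly as listed in the description above.
# Assumption: the files use these spellings, possibly in another casing.
COMPLEMENTARY_TYPES = {
    "síntoma", "enfermedad", "procedimiento", "fármaco", "org_vivo",
    "neg", "nsco", "unc", "usco",
}


def is_complementary(label: str) -> bool:
    """Return True if the label is one of the nine complementary types."""
    return label.lower() in COMPLEMENTARY_TYPES


def split_joint_label(label: str) -> Tuple[str, str]:
    """Split a joined label such as 'PACIENTE-PROFESION' at the first hyphen.

    Which half belongs to which task is not stated in the description,
    so the two parts are returned in the order they appear.
    """
    first, sep, second = label.partition("-")
    if not sep:
        raise ValueError(f"not a joined label: {label!r}")
    return first, second


def partition_labels(labels: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate original Shared Task labels from complementary ones."""
    original: List[str] = []
    complementary: List[str] = []
    for label in labels:
        if is_complementary(label):
            complementary.append(label)
        else:
            original.append(label)
    return original, complementary
```

Keeping the two groups apart in this way makes it straightforward to train with and without the complementary layers and compare the results.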
The entities in the MEDDOPROF-CE version will not be evaluated in the task, but they can be used to test the impact of other entity types on the Shared Task’s tracks or for information discovery. We encourage participants to be creative and incorporate these additional layers into their systems as they wish.
The Complementary Entities dataset can be downloaded together with the training data on Zenodo.