Evaluation Script

  • Official evaluation script. Available on GitHub (beta version).

Word embeddings

  • Spanish Medical Word Embeddings. Download them from Zenodo.
    Word embeddings generated from Spanish medical corpora.
    They can be used as a building block for clinical NLP systems applied to Spanish texts.

Baseline

Dictionary lookup based on Levenshtein distance: the annotations from the train and development sets are searched for in the test set.
Results (precision, recall, F1; CODING is reported as MAP only):

  • NER: 0.181, 0.737, 0.291
  • NORM: 0.18, 0.73, 0.288
  • CODING (MAP): 0.584
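
The idea behind the baseline can be sketched as follows. This is only an illustrative sketch, not the official baseline code: the function names (levenshtein, lookup), the token n-gram windowing and the max_ratio threshold are all assumptions made for the example, and the sample mentions and sentence are invented.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def lookup(text, known_mentions, max_ratio=0.15):
    """Find token n-grams in `text` that are close to a known mention.

    `known_mentions` stands in for the mentions annotated in the train and
    development sets. `max_ratio` is a hypothetical threshold: a window
    matches when its edit distance to the mention is at most
    max_ratio * len(mention).
    Returns a list of (start, end, surface, matched_mention) tuples.
    """
    # Character offsets of every word token in the text.
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    hits = []
    for mention in known_mentions:
        n = len(re.findall(r"\w+", mention))  # window size in tokens
        if n == 0:
            continue
        target = mention.lower()
        for i in range(len(tokens) - n + 1):
            start, end = tokens[i][0], tokens[i + n - 1][1]
            surface = text[start:end]
            if levenshtein(surface.lower(), target) <= max_ratio * len(target):
                hits.append((start, end, surface, mention))
    return hits


if __name__ == "__main__":
    # Invented example data, for illustration only.
    known = ["carcinoma ductal", "metástasis"]
    sentence = "Paciente con carcinoma ductual infiltrante y metastasis hepáticas."
    for hit in lookup(sentence, known):
        print(hit)
```

In a sketch like this, a looser threshold generally finds more mentions at the cost of more spurious matches, so the value of max_ratio would need to be tuned on the development set.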

Linguistic Resources

  • CUTEXT. See it on GitHub.
    Medical term extraction tool.
    It can be used to extract relevant medical terms from clinical cases.
  • SPACCC POS Tagger. See it on Zenodo.
    Part-of-speech tagger for Spanish medical-domain corpora.
    It can be used as a component of your system.
  • NegEx-MES. See it on Zenodo.
    A system for negation detection in Spanish clinical texts based on the NegEx algorithm.
    It can be used as a component of your system.
  • Negation corpus. See it on GitHub.
    A corpus of negation and uncertainty in Spanish clinical texts (with instructions to train the system).
  • AbreMES-X. See it on Zenodo.
    Software used to generate the Spanish Medical Abbreviation DataBase.
  • AbreMES-DB. See it on Zenodo.
    Spanish Medical Abbreviation DataBase.
    It can be used to fine-tune your system.
  • MeSpEn Glossaries. See it on Zenodo.
    Repository of bilingual medical glossaries made by professional translators.
    It can be used to fine-tune your system.

Terminological Resources

Other Relevant Systems

  • Live demo of an NER system for drugs, chemicals and genes in Spanish clinical texts, here.
  • NER for drugs, chemicals and genes with BERT on Spanish clinical texts, here.
  • Alternative NER for drugs, chemicals and genes on Spanish clinical documents, here.