DISTEMIST gazetteer

  • DISTEMIST gazetteer: contains the main terms and synonyms from the relevant branches of Snomed-CT, for grounding disease mentions. Relevant for NER and entity linking. Find it on Zenodo.
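
As a rough illustration of how a gazetteer of terms and synonyms can ground a disease mention, here is a minimal sketch. It assumes the gazetteer has already been loaded as (term, code) pairs. The function names, the toy entries and the placeholder codes are hypothetical and do not reflect the actual file layout on Zenodo.

```python
import unicodedata
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def normalize(term: str) -> str:
    """Lowercase, strip diacritics and collapse whitespace.

    Note: stripping diacritics also maps 'ñ' to 'n', which is a deliberate
    simplification in this sketch.
    """
    decomposed = unicodedata.normalize("NFKD", term)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(without_marks.casefold().split())


def build_index(entries: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each normalized term or synonym to the set of codes it names."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for term, code in entries:
        index[normalize(term)].add(code)
    return dict(index)


def ground(mention: str, index: Dict[str, Set[str]]) -> Set[str]:
    """Return the candidate codes for a mention (empty set if unknown)."""
    return index.get(normalize(mention), set())


# Toy entries with placeholder codes (assumption: not real gazetteer rows).
toy_entries = [
    ("Diabetes mellitus", "CODE_1"),
    ("Diabetes sacarina", "CODE_1"),
    ("Hipertensión arterial", "CODE_2"),
]
toy_index = build_index(toy_entries)
print(ground("hipertension  arterial", toy_index))
```

Exact matching after normalization only covers mentions that appear in the gazetteer verbatim. Fuzzy matching or a learned linker would be needed for anything else.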

Evaluation Script

Word embeddings

  • Spanish Medical Word Embeddings. Word embeddings generated from Spanish medical corpora. Download them from Zenodo.
    They can be used as a building block for clinical NLP systems applied to Spanish texts.


Dictionary lookup based on Levenshtein distance. It searches the test set for mentions annotated in the training and development sets.
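
A dictionary lookup of this kind can be sketched as follows. This is an illustrative approximation under stated assumptions, not the organizers' script: the function names, the word-level sliding window and the `max_distance` threshold are all hypothetical choices.

```python
import re
from typing import Iterable, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the DP row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # drop ca
                               current[j - 1] + 1,       # drop cb
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]


def find_mentions(text: str, known_mentions: Iterable[str],
                  max_distance: int = 1) -> List[Tuple[int, int, str]]:
    """Return (start, end, dictionary_term) for near matches in text.

    For a dictionary term of n words, every window of n consecutive words
    in the text is compared with the term, ignoring case.
    """
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    hits = []
    for term in known_mentions:
        n = len(term.split())
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            start, end = tokens[i][0], tokens[i + n - 1][1]
            if levenshtein(text[start:end].lower(), term.lower()) <= max_distance:
                hits.append((start, end, term))
    return sorted(hits)


# Hypothetical usage: the terms stand in for train/development annotations.
sample = "Paciente con diabetes melitus tipo 2 y hipertensión."
for start, end, term in find_mentions(sample, ["diabetes mellitus", "hipertension"]):
    print(start, end, sample[start:end], "->", term)
```

A fixed threshold is easy to reason about but tends to over-match short terms, so a length-relative threshold may be preferable in practice.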

Linguistic Resources

  • CUTEXT. See it on GitHub.
    Medical term extraction tool.
    It can be used to extract relevant medical terms from clinical cases.
  • SPACCC POS Tagger. See it on Zenodo.
    Part-of-speech tagger for Spanish medical-domain corpora.
    It can be used as a component of your system.
  • NegEx-MES. See it on Zenodo.
    A system for negation detection in Spanish clinical texts based on the NegEx algorithm.
    It can be used as a component of your system.
  • Negation corpus. See it on GitHub.
    A Corpus of Negation and Uncertainty in Spanish Clinical Texts (and instructions to train the system).
  • AbreMES-X. See it on Zenodo.
    Software used to generate the Spanish Medical Abbreviation DataBase.
  • AbreMES-DB. See it on Zenodo.
    Spanish Medical Abbreviation DataBase.
    It can be used to fine-tune your system.
  • MeSpEn Glossaries. See it on Zenodo.
    Repository of bilingual medical glossaries made by professional translators.
    It can be used to fine-tune your system.

Terminological Resources

  • List of valid Snomed-CT codes. To be published.

Other Relevant Systems

  • Live demo of a NER for drug/chemical/gene in Spanish clinical texts, here.
  • NER for drug/chemical/gene with BERT on Spanish clinical texts, here.
  • Alternative NER for drug/chemical/gene on Spanish clinical documents, here.