DeCS (Health Sciences Descriptors) is a trilingual, structured vocabulary created by BIREME to serve as a common language for indexing articles from scientific journals, books, conference proceedings, technical reports, and other types of materials, and for searching and retrieving scientific literature in the information sources available on the Virtual Health Library (VHL), such as LILACS and MEDLINE.

It was developed from MeSH (Medical Subject Headings), the vocabulary of the U.S. National Library of Medicine (NLM).

DeCS is part of the LILACS Methodology and is an integrating component of the Virtual Health Library. DeCS also participates in the NLM's unified terminology development project, UMLS (Unified Medical Language System), where it is responsible for contributing the terms in Portuguese and Spanish.

The concepts that make up the DeCS vocabulary are organized in a tree structure, which allows searching on broader terms, on narrower terms, or on all terms from the same tree within the hierarchy.
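As a rough illustration of broader and narrower lookups in such a hierarchy, the sketch below assumes that each concept carries a dot-separated tree number in which a child's number extends its parent's, as MeSH tree numbers do. This description does not specify how DeCS encodes positions, so the encoding, the function names, and the sample tree numbers are all assumptions made for the example.

```python
def broader(tree_number):
    """Return the parent tree number, or None at the top of a tree."""
    head, sep, _ = tree_number.rpartition(".")
    return head if sep else None


def narrower(tree_number, all_numbers):
    """Return every tree number that sits anywhere below `tree_number`."""
    prefix = tree_number + "."
    return sorted(n for n in all_numbers if n.startswith(prefix))


# Hypothetical tree numbers, for illustration only (not real DeCS entries).
TREE = {"T01", "T01.100", "T01.100.050", "T01.200", "T02"}

print(broader("T01.100.050"))
print(narrower("T01", TREE))
```

Matching on the prefix plus a trailing dot keeps sibling trees apart, so a lookup under "T01" never picks up a hypothetical "T010".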

In addition to the original MeSH terms, DeCS includes terminology for specific areas such as Public Health, Homeopathy, Science and Health, and Health Surveillance.

The training data set lists the DeCS codes assigned to each record in the source data. The original XML data contains descriptors, not codes. This year, we have converted those descriptors to their corresponding DeCS code (decsCode) using the DeCS 2020 conversion table, which is available on the Zenodo page together with the rest of the data needed to participate in MESINESP. The format of this file is:

  • DeCS code
  • Preferred descriptor (the label used in the European DeCS 2019 set)
  • List of synonyms (the descriptors and synonyms from both the European and Latin Spanish DeCS 2019 data sets, separated by pipes)
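A minimal sketch of reading a table with these three fields and looking up the code for a descriptor might look as follows. It assumes the columns are tab-separated and that the file has no header row; neither is stated here, so check both against the actual file. The function name `load_decs_table` and the sample rows are hypothetical.

```python
import csv
import io


def load_decs_table(handle, delimiter="\t"):
    """Read rows of (code, preferred descriptor, pipe-separated synonyms).

    The tab column delimiter is an assumption; adjust it to the real file.
    """
    code_to_label = {}
    term_to_code = {}
    reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        code, label = row[0].strip(), row[1].strip()
        synonyms = row[2].split("|") if len(row) > 2 else []
        code_to_label[code] = label
        # Index the preferred label and every synonym, case-insensitively.
        for term in [label, *synonyms]:
            term = term.strip()
            if term:
                # Keep the first code seen if a term appears more than once.
                term_to_code.setdefault(term.casefold(), code)
    return code_to_label, term_to_code


# Hypothetical sample rows, for illustration only (not real DeCS entries).
SAMPLE = "X001\tterm one\tterm one|first term\nX002\tterm two\tsecond term|term 2\n"
code_to_label, term_to_code = load_decs_table(io.StringIO(SAMPLE))
print(term_to_code["first term".casefold()])
```

With a real file, the same function can be called on `open(path, encoding="utf-8", newline="")`. Indexing synonyms alongside the preferred label lets a descriptor from either the European or the Latin Spanish set resolve to the same code.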

The DeCS 2020 version does not include very specific descriptors for COVID. Thanks to the collaboration of BIREME, we have been able to include some new COVID-related descriptors that will be used in future versions of DeCS. The training articles do not use these terms, but the terms will appear in a future version of the development set, which will enable systems to classify this type of content properly.

For more information, see the BioASQ Spanish Track.

Acknowledgement:

Thanks to BIREME/OPS, copyright owner of DeCS 2020, for expressly allowing us to use the DeCS vocabulary in the MESINESP task.