Generation of definitions
To briefly summarize BARR task 2, the core task of BARR2: the manual annotation process requires providing, for every abbreviation manually labelled in the clinical case texts, its most probable correct definition (or long form). To determine it, domain experts read the context (the clinical case) and, depending on that clinical context, assign the most probable and commonly used definition.
This definition annotation process relies on several steps, which include:
- providing the correct definition directly, when it is entirely clear from the annotator's expert domain knowledge.
- checking whether the clinical case itself already provides a candidate definition in the text.
- searching for the definition of a particular abbreviation in the full-text article corresponding to the clinical case.
- searching the short form-long form (SF-LF) pairs derived from a large collection of Spanish medical articles (BARR-2017 document collections, SCIELO, IBECS, etc.).
- searching for candidate definitions in the “Diccionario de Siglas Medicas”, using the original abbreviation as the query.
- searching other available online abbreviation resources, such as ALLIE and Acromine.
- running Google searches with the abbreviation as the query and, if necessary, additional keywords likely to return definitions of the candidate abbreviation.
- in some cases, especially when several alternative definitions or definition variants are found, using the total number of hits returned by internet searches to decide which definition variant is the most prevalent.
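The lookup steps above can be pictured as a fallback cascade. The following is only a minimal sketch, not the annotators' actual tooling: the process described here is manual and relies on expert judgement. It assumes the resources are consulted in the order listed (which the text does not strictly mandate), and the function name `resolve_definition`, the stand-in lookup functions, and the hit counts are all hypothetical placeholders invented for illustration.

```python
from typing import Callable, Dict, List, Optional

# A source maps a short form to its candidate long forms. Each one is a
# hypothetical stand-in for a resource named above (case text, full-text
# article, SF-LF lists, dictionary, online resources, web search).
Source = Callable[[str], List[str]]


def resolve_definition(
    short_form: str,
    sources: List[Source],
    hit_counts: Optional[Dict[str, int]] = None,
) -> Optional[str]:
    """Return a long form from the first source that yields any candidate.

    When a source returns several variants, the one with the highest
    search hit count wins (ties go to the first variant listed).
    """
    hit_counts = hit_counts or {}
    for lookup in sources:
        candidates = lookup(short_form)
        if not candidates:
            continue  # nothing found here, fall through to the next resource
        if len(candidates) == 1:
            return candidates[0]
        return max(candidates, key=lambda lf: hit_counts.get(lf, 0))
    return None  # unresolved: left to the domain expert


# Toy data, made up for this example only.
def case_text(sf: str) -> List[str]:
    return []  # the clinical case does not spell the abbreviation out


def dictionary(sf: str) -> List[str]:
    entries = {
        "TAC": [
            "tomografía axial computarizada",
            "tomografía axial computerizada",
        ]
    }
    return entries.get(sf, [])


hits = {  # invented numbers, not real search results
    "tomografía axial computarizada": 120,
    "tomografía axial computerizada": 45,
}

best = resolve_definition("TAC", [case_text, dictionary], hits)
unknown = resolve_definition("XYZ", [case_text, dictionary], hits)
```

In this sketch a source that returns nothing is skipped, and the hit counts only matter once a source returns more than one variant, which mirrors the tie-breaking role that hit totals play in the last step.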
Note that, in general, the annotation process focuses on the “best/most adequate” definition: not only the correct long form but also the most appropriate language for the abbreviation definition. If the definition of an abbreviation can be correctly written in Spanish, the Spanish definition is preferred; but if, based on the letters of the abbreviation and its practical usage, the definition is not Spanish (often English), the non-Spanish definition is annotated.
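The "abbreviation letters" part of this language rule can be approximated by comparing the initials of each candidate long form with the abbreviation. The sketch below is an assumption-laden heuristic, not a rule stated in these guidelines: `initials`, `pick_language`, and the stopword list are hypothetical, and the heuristic ignores practical usage, which the annotator still has to judge.

```python
import re
from typing import Dict, Optional, Tuple

# Small, hypothetical stopword list; function words are skipped when
# building initials.
STOPWORDS = {"de", "del", "la", "el", "y", "of", "the", "and"}


def initials(long_form: str) -> str:
    """Upper-case initials of the content words in a long form."""
    words = re.findall(r"[^\W\d_]+", long_form)  # runs of Unicode letters
    return "".join(w[0] for w in words if w.lower() not in STOPWORDS).upper()


def pick_language(
    short_form: str, candidates: Dict[str, str]
) -> Optional[Tuple[str, str]]:
    """Choose a (language, long form) pair from per-language candidates.

    Spanish is preferred when its initials match the abbreviation;
    otherwise a matching non-Spanish form is used. If no candidate
    matches, Spanish is the fallback when available.
    """
    target = short_form.upper()
    spanish = candidates.get("es")
    if spanish and initials(spanish) == target:
        return ("es", spanish)
    for lang, long_form in candidates.items():
        if lang != "es" and initials(long_form) == target:
            return (lang, long_form)
    if spanish:
        return ("es", spanish)
    # No Spanish form and no match: take any available candidate, if one exists.
    return next(iter(candidates.items()), None)


psa = pick_language(
    "PSA",
    {"es": "antígeno prostático específico", "en": "prostate-specific antigen"},
)
tac = pick_language(
    "TAC",
    {"es": "tomografía axial computarizada", "en": "computed axial tomography"},
)
```

Here "PSA" resolves to the English long form because only its initials spell the abbreviation, while "TAC" resolves to the Spanish one.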
We require detection of all abbreviations, whether medical or non-medical.