Annotation Guidelines

This page gives an overview of the annotation and normalization scheme and process for the MedProcNER/ProcTEMIST corpus. More detailed information is available in the Annotation Guidelines, a document of more than 30 pages that describes how the corpus was created and annotated. The guidelines are available on Zenodo.

The MedProcNER/ProcTEMIST guidelines were created by clinical experts at the same time as the DisTEMIST guidelines. Once defined, the guidelines were refined through several cycles of quality control and annotation consistency analysis before the entire dataset was annotated, with a final agreement of… Additionally, once the manual annotation phase was finished, the corpus was thoroughly revised in a post-processing step to maximize consistency.

This page has two parts:

  1. Annotation
  2. Normalization

Annotation

The corpus uses a single label: PROCEDIMIENTO (clinical procedure). Despite having only one label, the corpus is very varied: the annotated mentions include diagnostic, therapeutic, preventive and supportive procedures. Some specific examples are:

  • Simple medical exploration and inspection methods (that require little or no instrumentation): These procedures involve the use of basic diagnostic techniques to examine a patient’s body for signs of illness or disease. Examples include listening to the lungs with a stethoscope (“auscultación pulmonar”, pulmonary auscultation), feeling the abdomen for abnormalities (“palpación abdominal”, abdominal palpation), or checking the patient’s neurological responses (“exploración neurológica”, neurological examination).
  • Imaging tests: These procedures involve the use of advanced medical technology to produce images of the inside of the body, which can be used to diagnose and monitor various conditions. Examples include magnetic resonance imaging (MRI) of the brain (“RMN cerebral”), computed tomography (CT) of the chest with contrast (“TAC torácico con contraste”), or X-rays of the femur in the anteroposterior (AP) view (“RX de fémur AP”).
  • Other medical tests: These procedures involve the use of laboratory tests or other diagnostic tools to evaluate a patient’s health status or monitor their condition. Examples include a complete blood count (“hemograma”, hemogram), electrocardiogram (“electrocardiograma” or “ECG”) to measure heart function, or electroencephalogram (“electroencefalograma” or “EEG”) to measure brain activity.
  • Administration of medications: These procedures involve the delivery of medications to treat or manage a patient’s medical condition. Examples include antibiotic therapy to treat bacterial infections (“antibioterapia”) or corticosteroids (“corticosteroides”) to reduce inflammation.
  • Administration of blood, plasma, serums, bolus and continuous medication pumps: These procedures involve the delivery of fluids, nutrients, or medications directly into a patient’s bloodstream. Examples include a blood transfusion to replace lost blood (“transfusión de 2 concentrados de hematíes”) or fluid therapy to treat dehydration (“sueroterapia”).
  • Simplified surgical treatments: These are minimally invasive or straightforward surgical procedures that can be performed relatively quickly and easily. Examples include removal of the prostate gland through an incision in the lower abdomen (“adenomectomía retropúbica”, retropubic adenomectomy) or placement of a testicular prosthesis (“se coloca prótesis testicular”).
  • Surgical descriptions: These are detailed accounts of surgical procedures, including the steps involved, the instruments used, and any complications that may arise. Examples include “se reconstruyó con injerto de mentón y placa de titanio arqueada” (reconstructed with a chin graft and an arched titanium plate) or “los gaps intercorticales se rellenaron de hueso esponjoso obtenido de la zona donante” (the intercortical gaps were filled with cancellous bone obtained from the donor area).

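As with many other clinical entities, detecting and annotating procedures in unstructured clinical text can be quite complicated because of descriptive language, abbreviations, mentions made up of multiple parts (e.g. anatomical entities or instruments) and even ambiguous wording. The following excerpt from a clinical case report shows how many procedure mentions can appear in a short passage: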

Physical examination revealed a good general state of health, with normal abdomen and genitalia; rectal examination was compatible with grade I/IV prostate adenoma.
Urinalysis showed 4 red blood cells/field and 0-5 leukocytes/field; the rest of the sediment was normal.
Normal haemogram; biochemistry showed glycaemia of 169 mg/dl and triglycerides of 456 mg/dl; normal liver and kidney function. PSA of 1.16 ng/ml.
Urine cytology was repeatedly suspicious for malignancy.
Simple abdominal X-ray showed degenerative changes in the lumbar spine and vascular calcifications in both hypochondria and the pelvis.
Urological ultrasound revealed the existence of simple cortical cysts in the right kidney, bladder without alterations with good capacity and prostate weighing 30g.
IVU showed normal bilateral renal function, calcifications over the right renal silhouette and beaded ureters with addition images in the upper third...
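
As a rough illustration of how single-label annotations over a passage like the one above could be represented, the sketch below stores each mention as a character-offset span carrying the PROCEDIMIENTO label. The `Span` class, the `annotate` helper and the choice of "Urine cytology" as the mention are assumptions made for this example only; they are not the corpus's file format or its gold annotations.

```python
from dataclasses import dataclass

LABEL = "PROCEDIMIENTO"  # the single label used in the corpus


@dataclass(frozen=True)
class Span:
    """Hypothetical span record: character offsets, label and surface text."""
    start: int
    end: int
    label: str
    text: str


def annotate(text: str, mentions: list[str]) -> list[Span]:
    """Locate the first occurrence of each mention and return spans sorted by offset.

    Plain string matching stands in for a human annotator's decision here;
    it is only meant to show the shape of the resulting annotations.
    """
    spans = []
    for mention in mentions:
        start = text.find(mention)
        if start == -1:
            continue  # mention does not occur in the text, so nothing to record
        spans.append(Span(start, start + len(mention), LABEL, mention))
    return sorted(spans, key=lambda s: s.start)


sentence = "Urine cytology was repeatedly suspicious for malignancy."
# Illustrative mention chosen for this example, not a gold annotation.
spans = annotate(sentence, ["Urine cytology"])
for span in spans:
    print(span.start, span.end, span.label, span.text)
```

Offsets are kept alongside the text so that a span can always be checked against the source passage with `sentence[span.start:span.end]`.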

Normalization

All entities in the corpus were normalized to SNOMED CT concepts.
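
This page does not describe how the mapping to SNOMED CT was carried out or stored. Purely as a sketch of what pairing a mention with a concept identifier might look like, the example below uses a small lookup table. The `LEXICON` table and the `normalize` helper are hypothetical, and the identifiers are placeholders, not real SNOMED CT codes.

```python
from typing import Optional

# Hypothetical lookup table. The identifiers are placeholders and are
# NOT real SNOMED CT concept IDs.
LEXICON = {
    "hemograma": "PLACEHOLDER-0001",
    "sueroterapia": "PLACEHOLDER-0002",
}


def normalize(mention: str) -> Optional[str]:
    """Return the placeholder concept ID for a mention, or None if unknown.

    The mention is lowercased and its whitespace collapsed before lookup,
    so trivial surface variation does not prevent a match.
    """
    key = " ".join(mention.lower().split())
    return LEXICON.get(key)


print(normalize("  Hemograma "))
print(normalize("antibioterapia"))
```

Returning `None` for unmatched mentions keeps unmapped entities visible instead of silently assigning them a wrong concept.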