MEDDOPLACE Data

The MEDDOPLACE corpus is a collection of 1,000 clinical cases in Spanish from different medical specialties, annotated with locations and location-related content such as nationalities, languages, and travel-related terms. Every mention in the corpus has been normalized using GeoNames, PlusCodes, or SNOMED-CT to support the development and benchmarking of geographical normalization and geocoding systems.

  • For more information about the corpus content and format, check the Corpus Description page.
  • For more information about the annotation and normalization of locations, including corpus examples in Spanish and English, check the Annotation Guidelines page.
  • To download the corpus and some additional resources, check the Download page.