MEDDOPROF: MEDical DOcuments PROFessions recognition shared task

The MEDDOPROF Track is sponsored by Plan de Impulso de las Tecnologías del Lenguaje (Plan TL) and is part of the IberLEF 2021 evaluation campaign.

Please cite us as:

Lima-López, Salvador, Eulàlia Farré-Maduell, Antonio Miranda-Escalada, Vicent Brivá-Iglesias, & Martin Krallinger. “NLP applied to occupational health: MEDDOPROF shared task at IberLEF 2021 on automatic recognition, classification and normalization of professions and occupations from medical texts.” Procesamiento del Lenguaje Natural [Online], 67 (2021): 243-256.

@article{meddoprof,
  title   = {NLP applied to occupational health: MEDDOPROF shared task at IberLEF 2021 on automatic recognition, classification and normalization of professions and occupations from medical texts},
  author  = {Lima-López, Salvador and Farré-Maduell, Eulàlia and Miranda-Escalada, Antonio and Brivá-Iglesias, Vicent and Krallinger, Martin},
  journal = {Procesamiento del Lenguaje Natural},
  volume  = {67},
  year    = {2021},
  pages   = {243--256},
  issn    = {1989-7553},
  url     = {http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6393}
}

Figure 1. Overview of the Shared Task

About the task

Professions and employment status are crucial to our identity. As children, we are often asked what we want to do when we grow up. Occupations have a radical impact on physical and mental health, habits and lifestyle choices. An entire medical specialty, occupational medicine, is devoted to the prevention and management of the deleterious effects of our jobs on health (workplace accidents, short- and long-term effects of exposure to toxic substances and pathogens, and work-related mental health issues such as overwork and stress). The COVID-19 pandemic has also underscored this influence, as many people in specific occupations have been especially affected (for instance, health professionals and other essential workers). Unemployment, working illegally, and temporary or precarious job conditions are other elements that can greatly affect people’s lives.

Tools that automatically detect these sociodemographic factors can help researchers better characterize multiple health aspects related to specific occupations. Until now, however, these entities have mostly been ignored. The MEDDOPROF Shared Task takes a more comprehensive look at occupations, also considering employment statuses and non-paid activities.

Outside medicine, we anticipate that systems resulting from MEDDOPROF may be used in fields such as social care, human resources, legal NLP and even gender studies. Other NLP tasks such as anonymization might also benefit from more exhaustively annotated data in this regard.

Figure 2. Examples from the MEDDOPROF corpus related to multiple use cases. [Full text: (1), (2), (3)]