MEDDOPROF will be evaluated with the official evaluation library, which can be downloaded from GitHub. The library is written in Python 3 and is intended to be run from the command line:

$ python main.py -g ../gs-data/ner/ -p ../toy-data/ner/ -s ner
$ python main.py -g ../gs-data/class/ -p ../toy-data/class/ -s class
$ python main.py -g ../gs-data/gs-norm.tsv -p ../toy-data/pred-norm.tsv -c ../meddoprof_valid_codes.tsv -s norm

For all subtasks, the relevant metrics are precision, recall, and F1-score. The F1-score will be used to rank systems and decide the award winners.
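As a reference, strict-match micro-averaged precision, recall, and F1-score for a NER-style subtask can be sketched as below. This is an illustrative implementation only, not the official library; the entity tuple format and example label names are assumptions.

```python
# Illustrative sketch (NOT the official MEDDOPROF library): micro-averaged
# precision, recall and F1 over predicted vs. gold entity mentions.
# Entities are assumed to be (doc_id, start, end, label) tuples; only an
# exact match on all four fields counts as a true positive (strict NER).

def precision_recall_f1(gold, pred):
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)                       # exact-match true positives
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)       # harmonic mean of P and R
    return precision, recall, f1

if __name__ == "__main__":
    # Hypothetical example: 2 gold mentions, 2 predictions, 1 exact match.
    gold = [("doc1", 0, 10, "PROFESION"), ("doc1", 15, 22, "SITUACION_LABORAL")]
    pred = [("doc1", 0, 10, "PROFESION"), ("doc1", 30, 35, "PROFESION")]
    p, r, f = precision_recall_f1(gold, pred)
    print(f"P={p:.2f} R={r:.2f} F1={f:.2f}")  # P=0.50 R=0.50 F1=0.50
```

Under strict matching, a prediction with the right span but the wrong label (or an off-by-one boundary) counts as both a false positive and a false negative, which is why boundary errors are doubly penalized.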