Evaluation

Evaluation will be carried out by comparing the automatically generated results against gold-standard annotations produced manually by experts. The main metrics are micro-averaged precision, recall and F1-score.
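For reference, micro-averaging pools true positives, false positives and false negatives across all documents before computing the ratios, rather than averaging per-document scores. Below is a minimal sketch of how these metrics can be computed; the `(doc_id, start, end, label)` tuple representation and the exact-match criterion are illustrative assumptions, not the official scoring specification.

```python
from typing import Iterable, Set, Tuple

# Hypothetical mention representation: (doc_id, start_offset, end_offset, label).
Mention = Tuple[str, int, int, str]

def micro_prf(gold: Iterable[Mention], pred: Iterable[Mention]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over exact-match mentions."""
    gold_set: Set[Mention] = set(gold)
    pred_set: Set[Mention] = set(pred)
    tp = len(gold_set & pred_set)  # true positives: span and label both match exactly
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy example: one correct mention, one missed, one spurious -> P = R = F1 = 0.5.
gold = [("doc1", 0, 10, "PROC"), ("doc1", 15, 22, "PROC")]
pred = [("doc1", 0, 10, "PROC"), ("doc1", 30, 35, "PROC")]
print(micro_prf(gold, pred))  # (0.5, 0.5, 0.5)
```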

A dedicated evaluation library will be released for this task. In the meantime, you can use the MedProcNER evaluation library with the options for sub-task 1, as the evaluation setup will be quite similar.