The LivingNER evaluation library is available on GitHub.

Please make sure you have the latest version.

Evaluation process: (1) you submit your results, (2) we perform the evaluation offline, and (3) we return the final scores.

These scripts are distributed as part of the LivingNER shared task. They are written in Python 3 and intended to be run from the command line:

$> python main.py -g ../gs-data/sample_entities_subtask1.tsv -p ../toy-data/sample_entities_subtask1_MISSING_ONE_FILE.tsv -s ner
$> python main.py -g ../gs-data/sample_entities_subtask2.tsv -p ../toy-data/sample_entities_subtask2_MISSING_ONE_FILE.tsv -s norm 
$> python main.py -g ../gs-data/sample_entities_subtask3.tsv -p ../toy-data/pred_sample_entities_subtask3.tsv -s app
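For batch runs, the three commands can be wrapped in a short driver script. The following is a minimal sketch assuming the sample file layout shown above; main.py and its -g/-p/-s flags come from the commands, while the RUNS mapping and loop are purely illustrative.

import subprocess

# Sub-track identifier -> (gold standard file, predictions file),
# mirroring the example commands above; adjust the paths to your own data.
RUNS = {
    "ner": ("../gs-data/sample_entities_subtask1.tsv",
            "../toy-data/sample_entities_subtask1_MISSING_ONE_FILE.tsv"),
    "norm": ("../gs-data/sample_entities_subtask2.tsv",
             "../toy-data/sample_entities_subtask2_MISSING_ONE_FILE.tsv"),
    "app": ("../gs-data/sample_entities_subtask3.tsv",
            "../toy-data/pred_sample_entities_subtask3.tsv"),
}

for subtask, (gold, pred) in RUNS.items():
    # Equivalent to: python main.py -g <gold> -p <pred> -s <subtask>
    subprocess.run(["python", "main.py", "-g", gold, "-p", pred, "-s", subtask],
                   check=True)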

They produce the evaluation metrics for the corresponding sub-tracks: precision, recall, and F-score for LivingNER Species NER and LivingNER Species NORM.
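For reference, precision, recall, and F-score over entity predictions are conventionally computed from the true positives against the gold standard, micro-averaged across documents. The sketch below illustrates these standard set-based definitions; it is not the actual logic of main.py, and compute_metrics is a hypothetical helper.

def compute_metrics(gold: set, predicted: set):
    """Micro-averaged precision/recall/F1 over sets of
    (document_id, entity) pairs. Hypothetical helper, not part of main.py."""
    tp = len(gold & predicted)  # true positives: exact matches
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy example: two of three predictions are correct, against four gold entities.
gold = {("doc1", "humano"), ("doc1", "perro"), ("doc2", "gato"), ("doc2", "vaca")}
pred = {("doc1", "humano"), ("doc2", "gato"), ("doc2", "raton")}
print(compute_metrics(gold, pred))  # -> (0.667, 0.5, 0.571), approximately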