The examples below use the toy data shipped with the evaluation library, which is available on GitHub.

Example 1: Evaluate the toy data output for CANTEMIST-NER

$> cd src
$> python main.py -g ../gs-data/ -p ../toy-data/ -s ner

-----------------------------------------------------
Clinical case name			Precision
-----------------------------------------------------
cc_onco1.ann		0.5
-----------------------------------------------------
cc_onco3.ann		1.0
-----------------------------------------------------

Micro-average precision = 0.846


-----------------------------------------------------
Clinical case name			Recall
-----------------------------------------------------
cc_onco1.ann		0.667
-----------------------------------------------------
cc_onco3.ann		1.0
-----------------------------------------------------

Micro-average recall = 0.917


-----------------------------------------------------
Clinical case name			F-score
-----------------------------------------------------
cc_onco1.ann		0.571
-----------------------------------------------------
cc_onco3.ann		1.0
-----------------------------------------------------

Micro-average F-score = 0.88
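
The micro-averaged figures above are not the mean of the per-file scores. They are consistent with pooling the counts of all files before computing each metric. The sketch below illustrates that arithmetic. The per-file counts are hypothetical: they are one set of values that reproduces the reported NER scores, not numbers read from the toy data, and the helper prf is an illustrative name, not part of the evaluation library.

```python
# Hypothetical per-file counts: (true positives, predicted, gold).
# Chosen so that the results match the NER output above; the real
# toy-data counts are not shown in this document.
counts = {
    "cc_onco1.ann": (2, 4, 3),
    "cc_onco3.ann": (9, 9, 9),
}


def prf(tp, n_pred, n_gold):
    """Return (precision, recall, F-score) computed from raw counts."""
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Per-file scores, one (precision, recall, F-score) tuple per clinical case.
per_file = {name: prf(*c) for name, c in counts.items()}

# Micro-average: sum the counts over all files, then compute the metrics once.
total_tp = sum(c[0] for c in counts.values())
total_pred = sum(c[1] for c in counts.values())
total_gold = sum(c[2] for c in counts.values())
micro = prf(total_tp, total_pred, total_gold)

for name, scores in per_file.items():
    print(name, [round(s, 3) for s in scores])
print("micro-average", [round(s, 3) for s in micro])
```

With these assumed counts, the pooled totals are 11 true positives, 13 predictions and 12 gold annotations, which gives 0.846, 0.917 and 0.88.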

Example 2: Evaluate the toy data output for CANTEMIST-NORM

$> cd src
$> python main.py -g ../gs-data/ -p ../toy-data/ -s norm

-----------------------------------------------------
Clinical case name			Precision
-----------------------------------------------------
cc_onco1.ann		0.25
-----------------------------------------------------
cc_onco3.ann		1.0
-----------------------------------------------------

Micro-average precision = 0.769


-----------------------------------------------------
Clinical case name			Recall
-----------------------------------------------------
cc_onco1.ann		0.333
-----------------------------------------------------
cc_onco3.ann		1.0
-----------------------------------------------------

Micro-average recall = 0.833


-----------------------------------------------------
Clinical case name			F-score
-----------------------------------------------------
cc_onco1.ann		0.286
-----------------------------------------------------
cc_onco3.ann		1.0
-----------------------------------------------------

Micro-average F-score = 0.8

Example 3: Evaluate the toy data output for CANTEMIST-CODING

$> cd src
$> python main.py -g ../gs-data/gs-coding.tsv -p ../toy-data/pred-coding.tsv -c ../valid-codes.tsv -s coding

MAP estimate: 0.75
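
For readers unfamiliar with MAP (mean average precision), the sketch below shows the standard ranked-retrieval definition. It is an assumption that the library follows this definition; the sketch does not model the valid-codes filtering implied by the -c option. The document names, the codes and both function names are made up for illustration, so the result does not reproduce the 0.75 above.

```python
def average_precision(ranked_codes, gold_codes):
    """Average precision of one ranked code list against a set of gold codes.

    Assumes ranked_codes contains no duplicates.
    """
    if not gold_codes:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, code in enumerate(ranked_codes, start=1):
        if code in gold_codes:
            hits += 1
            # Precision at this rank, counted only where a correct code appears.
            precision_sum += hits / rank
    return precision_sum / len(gold_codes)


def mean_average_precision(predictions, gold):
    """Mean of the per-document average precision over all gold documents."""
    scores = [
        average_precision(predictions.get(doc, []), codes)
        for doc, codes in gold.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical data: two documents with made-up codes.
gold = {
    "doc_a": {"8000/6", "8140/3"},
    "doc_b": {"8070/3"},
}
predictions = {
    "doc_a": ["8000/6", "9999/9", "7777/7", "8140/3"],  # hits at ranks 1 and 4
    "doc_b": ["8070/3"],                                # hit at rank 1
}

map_score = mean_average_precision(predictions, gold)
print("MAP estimate:", round(map_score, 3))
```

Here doc_a scores (1/1 + 2/4) / 2 = 0.75 and doc_b scores 1.0, so the mean is 0.875.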