The examples below use the toy data shipped with the evaluation library, available on GitHub.
Example 1: Evaluate the toy data output for CANTEMIST-NER
    $> cd src
    $> python main.py -g ../gs-data/ -p ../toy-data/ -s ner

    -----------------------------------------------------
    Clinical case name        Precision
    -----------------------------------------------------
    cc_onco1.ann              0.5
    -----------------------------------------------------
    cc_onco3.ann              1.0
    -----------------------------------------------------
    Micro-average precision = 0.846

    -----------------------------------------------------
    Clinical case name        Recall
    -----------------------------------------------------
    cc_onco1.ann              0.667
    -----------------------------------------------------
    cc_onco3.ann              1.0
    -----------------------------------------------------
    Micro-average recall = 0.917

    -----------------------------------------------------
    Clinical case name        F-score
    -----------------------------------------------------
    cc_onco1.ann              0.571
    -----------------------------------------------------
    cc_onco3.ann              1.0
    -----------------------------------------------------
    Micro-average F-score = 0.88
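A micro-average pools the counts of all clinical cases before computing a score, which is why it differs from the plain mean of the per-file values. The following is a minimal sketch of that arithmetic, not the library's own code. The per-file counts and the helper `prf` are hypothetical: the counts were chosen only because they reproduce the NER scores printed above, and they were not read from the toy data.

```python
# Hypothetical per-file counts (true positives, predicted mentions, gold
# mentions). They reproduce the printed NER scores but are assumptions,
# not values read from the toy data.
counts = {
    "cc_onco1.ann": {"tp": 2, "pred": 4, "gold": 3},
    "cc_onco3.ann": {"tp": 9, "pred": 9, "gold": 9},
}


def prf(tp, pred, gold):
    """Return (precision, recall, F-score) computed from raw counts."""
    p = tp / pred if pred else 0.0
    r = tp / gold if gold else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f


# Per-file scores.
per_file = {name: prf(**c) for name, c in counts.items()}

# Micro-average: sum the counts over all files, then score once.
total_tp = sum(c["tp"] for c in counts.values())
total_pred = sum(c["pred"] for c in counts.values())
total_gold = sum(c["gold"] for c in counts.values())
micro = prf(total_tp, total_pred, total_gold)

for name, (p, r, f) in per_file.items():
    print(name, round(p, 3), round(r, 3), round(f, 3))
print("micro", round(micro[0], 3), round(micro[1], 3), round(micro[2], 3))
```

With these counts the pooled totals are 11 true positives out of 13 predictions and 12 gold mentions, so precision is 11/13, recall is 11/12, and the F-score is 22/25.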
Example 2: Evaluate the toy data output for CANTEMIST-NORM
    $> cd src
    $> python main.py -g ../gs-data/ -p ../toy-data/ -s norm

    -----------------------------------------------------
    Clinical case name        Precision
    -----------------------------------------------------
    cc_onco1.ann              0.25
    -----------------------------------------------------
    cc_onco3.ann              1.0
    -----------------------------------------------------
    Micro-average precision = 0.769

    -----------------------------------------------------
    Clinical case name        Recall
    -----------------------------------------------------
    cc_onco1.ann              0.333
    -----------------------------------------------------
    cc_onco3.ann              1.0
    -----------------------------------------------------
    Micro-average recall = 0.833

    -----------------------------------------------------
    Clinical case name        F-score
    -----------------------------------------------------
    cc_onco1.ann              0.286
    -----------------------------------------------------
    cc_onco3.ann              1.0
    -----------------------------------------------------
    Micro-average F-score = 0.8
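The NORM scores for cc_onco1.ann are lower than the NER scores for the same file. One plausible reading, which this page does not state, is that NER counts a prediction as correct when its span matches the gold standard, while NORM also requires the assigned code to match. The sketch below illustrates that assumption only. The offsets, the codes, and the helper `score` are hypothetical and are not taken from the toy data or the library.

```python
# Hypothetical annotations for one clinical case, as (start, end, code).
# Offsets and codes are made up for illustration.
gold = {
    (10, 19, "8000/1"),
    (42, 56, "8140/3"),
    (80, 90, "8000/6"),
}
pred = {
    (10, 19, "8000/1"),   # span and code both correct
    (42, 56, "8140/6"),   # span correct, code wrong
    (60, 70, "8000/3"),   # spurious mention
    (95, 101, "8010/3"),  # spurious mention
}


def score(gold_items, pred_items):
    """Return (precision, recall) using exact set matching."""
    tp = len(gold_items & pred_items)
    p = tp / len(pred_items) if pred_items else 0.0
    r = tp / len(gold_items) if gold_items else 0.0
    return p, r


# Assumed NER-style matching: compare spans only.
ner = score({(s, e) for s, e, _ in gold}, {(s, e) for s, e, _ in pred})

# Assumed NORM-style matching: compare spans together with codes.
norm = score(gold, pred)

print("ner ", round(ner[0], 3), round(ner[1], 3))
print("norm", round(norm[0], 3), round(norm[1], 3))
```

Under this assumption the second prediction counts as a true positive for NER but not for NORM, which is enough to halve the precision and recall for the file.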
Example 3: Evaluate the toy data output for CANTEMIST-CODING
    $> cd src
    $> python main.py -g ../gs-data/gs-coding.tsv -p ../toy-data/pred-coding.tsv -c ../valid-codes.tsv -s coding

    MAP estimate: 0.75
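MAP here presumably stands for mean average precision, computed over one ranked list of predicted codes per document. The sketch below shows one common definition and is not the library's implementation. The document names, the codes, and the function `average_precision` are hypothetical, the data was chosen so that the result equals 0.75, and the sketch does not model the valid-codes file passed with `-c`.

```python
def average_precision(gold, ranked):
    """Average precision of a ranked code list against a set of gold codes.

    Precision is accumulated at every rank holding a not-yet-seen gold code,
    then divided by the number of gold codes.
    """
    hits = 0
    total = 0.0
    seen = set()
    for rank, code in enumerate(ranked, start=1):
        if code in gold and code not in seen:
            hits += 1
            total += hits / rank
        seen.add(code)
    return total / len(gold) if gold else 0.0


# Hypothetical gold codes and ranked predictions per document.
gold_codes = {
    "doc_a": {"8000/6", "8140/3"},
    "doc_b": {"8500/3"},
}
ranked_predictions = {
    "doc_a": ["8000/6", "8140/3"],  # both gold codes ranked first: AP = 1.0
    "doc_b": ["8000/3", "8500/3"],  # gold code at rank 2: AP = 0.5
}

ap_per_doc = {
    doc: average_precision(gold_codes[doc], ranked_predictions.get(doc, []))
    for doc in gold_codes
}

# MAP is the mean of the per-document average precisions.
map_score = sum(ap_per_doc.values()) / len(ap_per_doc)
print("MAP estimate:", round(map_score, 3))
```

Because average precision rewards correct codes placed early in the list, the order of the predicted codes matters for this subtask, unlike the set-based precision and recall of the other two.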