The CodiEsp evaluation script can be downloaded from GitHub.

Please make sure you have the latest version.


Introduction

These scripts are distributed as part of the Clinical Cases Coding in Spanish language Track (CodiEsp). They are intended to be run from the command line:

$> python3 codiespD_P_evaluate.py -g /path/to/gold_standard.tsv -p /path/to/predictions.tsv -c /path/to/codes.tsv
$> python3 codiespX_evaluate.py -g /path/to/gold_standard.tsv -p /path/to/predictions.tsv -cD /path/to/codes-D.tsv -cP /path/to/codes-P.tsv

They compute the evaluation metrics for the corresponding sub-tracks: Mean Average Precision for CodiEsp-D and CodiEsp-P, and the custom evaluation score for CodiEsp-X.
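
As a rough illustration of the metric reported for CodiEsp-D and CodiEsp-P, the sketch below computes Mean Average Precision using its standard textbook definition. It is not the code of the evaluation scripts, which may differ in details such as code filtering or duplicate handling. The function names, case identifiers, and codes are hypothetical and chosen only for the example.

```python
def average_precision(ranked_codes, gold_codes):
    """Average precision of one ranked code list against a set of gold codes.

    Assumes ranked_codes contains no duplicates.
    """
    gold = set(gold_codes)
    if not gold:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, code in enumerate(ranked_codes, start=1):
        if code in gold:
            hits += 1
            # Precision at this rank, counted only at ranks holding a gold code.
            precision_sum += hits / rank
    return precision_sum / len(gold)


def mean_average_precision(predictions, gold):
    """Mean of the per-case average precision over all gold-standard cases.

    predictions: dict mapping clinical case -> list of codes, best first.
    gold: dict mapping clinical case -> set of correct codes.
    Cases with no predictions contribute an average precision of 0.
    """
    if not gold:
        return 0.0
    total = sum(
        average_precision(predictions.get(case, []), codes)
        for case, codes in gold.items()
    )
    return total / len(gold)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the CodiEsp corpus.
    toy_predictions = {"case-001": ["a", "b", "c"], "case-002": ["x", "y"]}
    toy_gold = {"case-001": {"a", "c"}, "case-002": {"y"}}
    print(mean_average_precision(toy_predictions, toy_gold))
```

With this toy data, the first case scores (1/1 + 2/3) / 2 = 5/6 and the second scores 1/2, so the mean is 2/3. The example also shows why the order of the codes matters: moving a correct code further down the list lowers the score.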

gold_standard.tsv must be one of the gold standard files distributed on the CodiEsp Track webpage, or a file in the same format.

predictions.tsv must be the predictions file. For CodiEsp-D and CodiEsp-P, it is a tab-separated file with two columns: clinical case and code. Codes must be ordered by rank. For example:

CodiEsp-D and CodiEsp-P predictions example screenshot.
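
The two-column layout described above can also be sketched in code. The snippet below writes a CodiEsp-D or CodiEsp-P predictions file with one line per (clinical case, code) pair, keeping the codes of each case in rank order. It assumes the file has no header row, which the text above does not state explicitly. The function name, case identifiers, and codes are hypothetical placeholders.

```python
import csv


def write_ranked_predictions(path, ranked_by_case):
    """Write a two-column, tab-separated predictions file.

    ranked_by_case: dict mapping clinical case -> list of codes, best first.
    Assumption: the file has no header row.
    """
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for case_id, ranked_codes in ranked_by_case.items():
            # Codes are written in the order given, so rank order is preserved.
            for code in ranked_codes:
                writer.writerow([case_id, code])


if __name__ == "__main__":
    # Hypothetical identifiers and codes, for illustration only.
    write_ranked_predictions(
        "predictions.tsv",
        {"case-001": ["code-a", "code-b"], "case-002": ["code-c"]},
    )
```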

For CodiEsp-X, predictions.tsv is also a tab-separated file, in this case with four columns: clinical case, reference position, code label, and code. For example:

CodiEsp-X Predictions example screenshot.
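
Before running the evaluation, it can help to confirm that every line of a CodiEsp-X predictions file really has the four columns listed above. The following is a minimal sketch of such a check. It only counts columns and rejects empty fields; it does not validate the reference positions, labels, or codes, whose exact formats are not described here. The function name is hypothetical.

```python
import csv


def find_malformed_rows(path, expected_columns=4):
    """Return (line_number, row) pairs that do not have exactly
    expected_columns non-empty, tab-separated fields."""
    problems = []
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for line_number, row in enumerate(reader, start=1):
            wrong_count = len(row) != expected_columns
            has_empty_field = any(not field.strip() for field in row)
            if wrong_count or has_empty_field:
                problems.append((line_number, row))
    return problems


if __name__ == "__main__":
    for line_number, row in find_malformed_rows("predictions.tsv"):
        print(f"line {line_number}: unexpected row {row!r}")
```

Passing expected_columns=2 applies the same check to a CodiEsp-D or CodiEsp-P predictions file.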

codes.tsv (or codes-D.tsv and codes-P.tsv for CodiEsp-X) must be the files with the valid codes downloaded from Zenodo.

Prerequisites

This software requires Python 3 with the libraries Pandas, NumPy, SciPy, Matplotlib, and trectools installed.
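
A quick way to check that these libraries are available in the current Python environment is sketched below. It is not part of the evaluation library. It only tests whether each module can be located, without importing it or checking its version. The function name is hypothetical.

```python
import importlib.util

# Import names of the libraries listed in the prerequisites.
REQUIRED_MODULES = ["pandas", "numpy", "scipy", "matplotlib", "trectools"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing libraries: " + ", ".join(missing))
    else:
        print("All required libraries were found.")
```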

Directory structure

You do not need to preserve the directory structure of the evaluation library GitHub repository to run the Python scripts.

Usage

Both scripts accept the same two parameters:

  • The --gs_path (-g) option specifies the path to the Gold Standard file.
  • The --pred_path (-p) option specifies the path to the predictions file.

In addition, codiespD_P_evaluate.py requires an extra parameter:

  • The --valid_codes_path (-c) option specifies the path to the list of valid codes for the CodiEsp sub-track being evaluated.

Finally, codiespX_evaluate.py requires two extra parameters:

  • The --valid_codes_D_path (-cD) option specifies the path to the list of valid codes for the CodiEsp-D sub-track.
  • The --valid_codes_P_path (-cP) option specifies the path to the list of valid codes for the CodiEsp-P sub-track.
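
To call the scripts from another Python program, the options above can be assembled into an argument list for the standard subprocess module. This is only a sketch: the helper names are hypothetical, and it assumes that the scripts sit in the current working directory and that the interpreter running the helper has the required libraries installed.

```python
import subprocess
import sys


def build_dp_command(gold, predictions, valid_codes,
                     script="codiespD_P_evaluate.py"):
    """Argument list for the CodiEsp-D / CodiEsp-P evaluation script."""
    return [sys.executable, script,
            "-g", gold, "-p", predictions, "-c", valid_codes]


def build_x_command(gold, predictions, valid_codes_d, valid_codes_p,
                    script="codiespX_evaluate.py"):
    """Argument list for the CodiEsp-X evaluation script."""
    return [sys.executable, script,
            "-g", gold, "-p", predictions,
            "-cD", valid_codes_d, "-cP", valid_codes_p]


if __name__ == "__main__":
    command = build_dp_command(
        "/path/to/gold_standard.tsv",
        "/path/to/predictions.tsv",
        "/path/to/codes.tsv",
    )
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(command, check=True)
```

Passing the arguments as a list, rather than as a single shell string, avoids quoting problems when a path contains spaces.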

Contact for technical issues

Antonio Miranda-Escalada (antonio.miranda@bsc.es)

License