Submission

Submission instructions (quick guide)

  1. Compile the 5 prediction files of each subtask into a separate ZIP file (see the example command after this list).
  2. Access https://easychair.org/conferences/?conf=clefehealth2020runs
  3. Select "enter as author".
  4. Read the instructions.
    Tips:
    – Use your registration team name and the subtask name in the title.
    – Do not include files from different subtasks in the same ZIP.
    – Verify that the files have the correct format.
    – Include in the ZIP file a text document stating who you are, your registration e-mail, and your registration name.
  5. Fill in the form.
  6. Upload one ZIP file and click Submit.
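
For example, a CodiEsp-D submission containing two run files and a short description document could be packaged from the command line as follows (the file names are purely illustrative):

$> zip codiespD_myteam.zip run1_pred.tsv run2_pred.tsv team_description.txt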

A PDF with more detailed submission instructions is also available.

Submission file format

Prediction files SHOULD NOT have headers (in the pictures below, headers are added only to ease interpretation). The first line of the prediction file should already be a prediction.

For CodiEsp-D and CodiEsp-P, the submission file is tab-separated with two columns: clinical case and code. Codes must be ordered by rank/confidence, with more relevant codes first. For example:

[Screenshot: CodiEsp-D and CodiEsp-P predictions example.]
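
For illustration, the first lines of a hypothetical CodiEsp-D prediction file could look as follows (the document identifiers and codes are invented; columns are separated by a tab and codes are listed in decreasing confidence):

S0000-000S0000000000000-01	i10
S0000-000S0000000000000-01	e11.9
S0000-000S0000000000000-02	r52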

For CodiEsp-X, the submission file is also tab-separated, in this case with four columns: clinical case, reference position, code label, and code. Codes do not need to be ordered by rank here. For example:

[Screenshot: CodiEsp-X predictions example.]
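
As a purely illustrative sketch, a CodiEsp-X prediction file following the column order above could contain lines such as the ones below. The identifiers, positions, and codes are invented, and the label values (DIAGNOSTICO / PROCEDIMIENTO) are an assumption based on the two CodiEsp code types:

S0000-000S0000000000000-01	12 24	DIAGNOSTICO	i10
S0000-000S0000000000000-01	87 103	PROCEDIMIENTO	bw03zzz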

Examples

The CodiEsp evaluation script can be downloaded from GitHub.

Please make sure you have the latest version.


Example 1: CodiEsp-D or CodiEsp-P
Evaluate the system output pred_D.tsv against the gold standard gs_D.tsv (both inside toy_data subfolders).

$>  python3 codiespD_P_evaluation.py -g gold/toy_data/gs_D.tsv -p system/toy_data/pred_D.tsv -c codiesp_codes/codiesp-D_codes.tsv

MAP estimate: 0.444

Example 2: CodiEsp-X
Evaluate the system output pred_X.tsv against the gold standard gs_X.tsv (both inside toy_data subfolders).

$> python3 codiespX_evaluation.py -g gold/toy_data/gs_X.tsv -p system/toy_data/pred_X.tsv -cD codiesp_codes/codiesp-D_codes.tsv -cP codiesp_codes/codiesp-P_codes.tsv 

-----------------------------------------------------
Clinical case name			Precision
-----------------------------------------------------
S0000-000S0000000000000-00		nan
-----------------------------------------------------
S1889-836X2016000100006-1		0.625
-----------------------------------------------------
codiespX_evaluation.py:248: UserWarning: Some documents do not have predicted codes, document-wise Precision not computed for them.

Micro-average precision = 0.556


-----------------------------------------------------
Clinical case name			Recall
-----------------------------------------------------
S0000-000S0000000000000-00		nan
-----------------------------------------------------
S1889-836X2016000100006-1		0.455
-----------------------------------------------------
codiespX_evaluation.py:260: UserWarning: Some documents do not have Gold Standard codes, document-wise Recall not computed for them.

Micro-average recall = 0.385


-----------------------------------------------------
Clinical case name			F-score
-----------------------------------------------------
S0000-000S0000000000000-00		nan
-----------------------------------------------------
S1889-836X2016000100006-1		0.526
-----------------------------------------------------
codiespX_evaluation.py:271: UserWarning: Some documents do not have predicted codes, document-wise F-score not computed for them.
codiespX_evaluation.py:274: UserWarning: Some documents do not have Gold Standard codes, document-wise F-score not computed for them.

Micro-average F-score = 0.455


__________________________________________________________

MICRO-AVERAGE STATISTICS:

Micro-average precision = 0.556

Micro-average recall = 0.385

Micro-average F-score = 0.455

Contact for technical issues

Antonio Miranda-Escalada (antonio.miranda@bsc.es)

Evaluation Library

The CodiEsp evaluation script can be downloaded from GitHub.

Please make sure you have the latest version.


Introduction

These scripts are distributed as part of the Clinical Cases Coding in Spanish language Track (CodiEsp). They are intended to be run from the command line:

$> python3 codiespD_P_evaluation.py -g /path/to/gold_standard.tsv -p /path/to/predictions.tsv -c /path/to/codes.tsv
$> python3 codiespX_evaluation.py -g /path/to/gold_standard.tsv -p /path/to/predictions.tsv -cD /path/to/codes-D.tsv -cP /path/to/codes-P.tsv

They produce the evaluation metrics for the corresponding sub-tracks: Mean Average Precision for CodiEsp-D and CodiEsp-P, and the custom evaluation score for CodiEsp-X.
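
As a rough illustration of what Mean Average Precision measures here, the following is a simplified sketch (the official scripts rely on trectools; the helper functions below are not part of them). For each clinical case, the average precision of the ranked code list against the gold-standard codes is computed, and these values are then averaged over all cases:

def average_precision(ranked_codes, gold_codes):
    # Average precision of one ranked prediction list against a gold set.
    gold = set(gold_codes)
    hits, running_sum = 0, 0.0
    for rank, code in enumerate(ranked_codes, start=1):
        if code in gold:
            hits += 1
            running_sum += hits / rank  # precision at this rank
    return running_sum / len(gold) if gold else 0.0

def mean_average_precision(predictions, gold_standard):
    # Both arguments map a clinical-case id to a list of codes.
    scores = [average_precision(predictions.get(doc, []), codes)
              for doc, codes in gold_standard.items()]
    return sum(scores) / len(scores) if scores else 0.0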

gold_standard.tsv must be one of the gold standard files distributed on the CodiEsp Track webpage, or a file with the same format.

predictions.tsv must be the predictions file. For CodiEsp-D and CodiEsp-P, it is a tab-separated file with two columns: clinical case and code. Codes must be ordered by rank. For example:

[Screenshot: CodiEsp-D and CodiEsp-P predictions example.]

For CodiEsp-X, the file predictions.tsv is also tab-separated, in this case with four columns: clinical case, reference position, code label, and code. For example:

[Screenshot: CodiEsp-X predictions example.]

codes.tsv must be one of the files with the valid codes downloaded from Zenodo.

Prerequisites

This software requires Python 3 installed on your system, together with the Pandas, NumPy, SciPy, Matplotlib, and trectools libraries.
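
Assuming pip is available for your Python 3 installation, the dependencies can typically be installed with:

$> pip install pandas numpy scipy matplotlib trectools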

Directory structure

Replicating the directory structure of the evaluation library's GitHub repository is not required to run the Python scripts.

Usage

Both scripts accept the same two parameters:

  • The --gs_path (-g) option specifies the path to the Gold Standard file.
  • The --pred_path (-p) option specifies the path to the predictions file.

In addition, codiespD_P_evaluation.py requires one extra parameter:

  • The --valid_codes_path (-c) option specifies the path to the list of valid codes for the CodiEsp subtask being evaluated.

Finally, codiespX_evaluation.py requires two extra parameters (full example invocations follow the list below):

  • The --valid_codes_D_path (-cD) option specifies the path to the list of valid codes for the CodiEsp-D subtask.
  • The --valid_codes_P_path (-cP) option specifies the path to the list of valid codes for the CodiEsp-P subtask.
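
Using the long option names listed above (and assuming they accept the same values as their short forms), the earlier example invocations are equivalent to:

$> python3 codiespD_P_evaluation.py --gs_path /path/to/gold_standard.tsv --pred_path /path/to/predictions.tsv --valid_codes_path /path/to/codes.tsv
$> python3 codiespX_evaluation.py --gs_path /path/to/gold_standard.tsv --pred_path /path/to/predictions.tsv --valid_codes_D_path /path/to/codes-D.tsv --valid_codes_P_path /path/to/codes-P.tsv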

Contact for technical issues

Antonio Miranda-Escalada (antonio.miranda@bsc.es)

License