{"id":12,"date":"2022-03-31T12:53:05","date_gmt":"2022-03-31T12:53:05","guid":{"rendered":"https:\/\/temu.bsc.es\/clinspen\/?page_id=12"},"modified":"2023-03-09T09:52:35","modified_gmt":"2023-03-09T09:52:35","slug":"clinical-cases","status":"publish","type":"page","link":"https:\/\/temu.bsc.es\/clinspen\/clinical-cases\/","title":{"rendered":"ClinSpEn-Clinical Cases"},"content":{"rendered":"\n<h1 class=\"wp-block-heading\"><strong>ClinSpEn-Clinical Cases<\/strong><\/h1>\n\n\n\n<p>The <strong>ClinSpEn-CC (clinical cases)<\/strong> dataset is a collection of <strong>EN-ES parallel COVID-19 clinical cases<\/strong>.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Track direction: EN &gt; ES<\/li>\n\n\n\n<li>Link to data: <a href=\"https:\/\/doi.org\/10.5281\/zenodo.6497350\" target=\"_blank\" rel=\"noreferrer noopener\">Zenodo<\/a>.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Overview<\/h3>\n\n\n\n<p>Clinical cases are a text genre in which a patient&#8217;s current condition, medical history, clinical presentation, examinations, treatment and diagnosis are described. They can closely resemble Electronic Health Records (EHRs) in both form and content. However, unlike EHRs, clinical cases are often free of privacy-related issues. This means that they can be used as a substitute for EHRs when training NLP systems for the clinical domain.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Corpus Description<\/h3>\n\n\n\n<p>The dataset&#8217;s clinical cases were carefully selected to cover a wide range of aspects related to COVID-19: different types of patients (children, adults, elderly and pregnant people, babies), different comorbidities (cancer, mental health issues, immunosuppressed patients) and symptomatology (mild and severe presentations, dermatologic, immunologic and psychiatric manifestations, thrombosis, &#8230;). 
The reports were first translated from English to Spanish by a professional medical translator and then revised by a clinical expert.<\/p>\n\n\n\n<p>ClinSpEn-CC includes a total of 202 parallel clinical cases. Each case is provided as a pair of files: the Spanish version has a &#8220;.es&#8221; extension and the English version has a &#8220;.en&#8221; extension. The reports are aligned at the sentence level, so that each sentence appears on the same line number in both languages.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Corpus Partitions<\/h3>\n\n\n\n<p>ClinSpEn-CC is divided as follows:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>ClinSpEn-CC Sample Set<\/strong> (50 documents)<\/li>\n\n\n\n<li><strong>ClinSpEn-CC Test Set<\/strong> (152 documents)<\/li>\n<\/ul>\n\n\n\n<p>In addition, we include a larger collection of 9,804 monolingual (English) clinical cases covering different topics and specialties (<em>background set<\/em>) that can be used to evaluate the systems&#8217; performance on new, unseen data.<\/p>\n\n\n\n<p>Below is an example of a parallel case report taken from the sample set:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"928\" height=\"348\" src=\"https:\/\/temu.bsc.es\/clinspen\/wp-content\/uploads\/2022\/07\/clinspen-clinicalcase-example.png\" alt=\"\" class=\"wp-image-168\" srcset=\"https:\/\/temu.bsc.es\/clinspen\/wp-content\/uploads\/2022\/07\/clinspen-clinicalcase-example.png 928w, https:\/\/temu.bsc.es\/clinspen\/wp-content\/uploads\/2022\/07\/clinspen-clinicalcase-example-300x113.png 300w, https:\/\/temu.bsc.es\/clinspen\/wp-content\/uploads\/2022\/07\/clinspen-clinicalcase-example-768x288.png 768w\" sizes=\"auto, (max-width: 928px) 100vw, 928px\" \/><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>ClinSpEn-Clinical Cases The ClinSpEn-CC (clinical cases) dataset is a collection of EN-ES parallel COVID-19 clinical cases. 
Overview Clinical cases are a text genre where a patient&#8217;s current condition, medical history, clinical presentation, examinations, treatment and diagnosis are described. They can be pretty similar to Electronic Health Records (EHRs) both in form and content. However, unlike [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-12","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/temu.bsc.es\/clinspen\/wp-json\/wp\/v2\/pages\/12","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/temu.bsc.es\/clinspen\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/temu.bsc.es\/clinspen\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/temu.bsc.es\/clinspen\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/temu.bsc.es\/clinspen\/wp-json\/wp\/v2\/comments?post=12"}],"version-history":[{"count":23,"href":"https:\/\/temu.bsc.es\/clinspen\/wp-json\/wp\/v2\/pages\/12\/revisions"}],"predecessor-version":[{"id":260,"href":"https:\/\/temu.bsc.es\/clinspen\/wp-json\/wp\/v2\/pages\/12\/revisions\/260"}],"wp:attachment":[{"href":"https:\/\/temu.bsc.es\/clinspen\/wp-json\/wp\/v2\/media?parent=12"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}