{"id":406,"date":"2022-05-03T14:15:41","date_gmt":"2022-05-03T13:15:41","guid":{"rendered":"https:\/\/temu.bsc.es\/livingner\/?p=406"},"modified":"2022-05-09T12:48:39","modified_gmt":"2022-05-09T11:48:39","slug":"multilingual-corpus","status":"publish","type":"post","link":"https:\/\/temu.bsc.es\/livingner\/2022\/05\/03\/multilingual-corpus\/","title":{"rendered":"Multilingual corpus"},"content":{"rendered":"\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p><em>Download the LivingNER corpus (including the multilingual corpus) from&nbsp;<\/em><a rel=\"noreferrer noopener\" href=\"https:\/\/doi.org\/10.5281\/zenodo.6376662\" target=\"_blank\">Zenodo<\/a><\/p><\/blockquote>\n\n\n\n<p>We have generated the annotated (and normalized to NCBI Taxonomy) training and validation sets in 6 languages: English, Portuguese, Catalan, Italian, French, and Romanian.&nbsp;The process was:<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>The&nbsp;text files were translated with a neural machine translation system.<\/li><li>The annotations were translated with the same&nbsp;neural machine translation system.<\/li><li>The translated annotations were transferred to the translated&nbsp;text files using an annotation transfer technology.<\/li><\/ol>\n\n\n\n<p>If you want to visualize the multilingual resources, check out this Brat server:\u00a0<a rel=\"noreferrer noopener\" href=\"https:\/\/temu.bsc.es\/mLivingNER\/#\/translations\/\" target=\"_blank\">https:\/\/temu.bsc.es\/mLivingNER\/#\/translations\/<\/a><br>For instance, you can see the parallel annotations\u00a0in\u00a0<a rel=\"noreferrer noopener\" href=\"https:\/\/temu.bsc.es\/mLivingNER\/diff.xhtml#\/translations\/en\/annotation_transfer\/train\/casos_clinicos_cardiologia34?diff=\/translations\/fr\/annotation_transfer\/train\/\" target=\"_blank\">English vs.\u00a0French<\/a>, or in\u00a0<a rel=\"noreferrer noopener\" 
href=\"https:\/\/temu.bsc.es\/mLivingNER\/diff.xhtml#\/translations\/cat\/annotation_transfer\/train\/casos_clinicos_cardiologia35?diff=\/gold-standard\/train\/\" target=\"_blank\">Spanish (the gold standard) vs. Catalan<\/a>.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/temu.bsc.es\/livingner\/wp-content\/uploads\/2022\/05\/LivingNER_multilingual-1-1024x576.png\" alt=\"\" class=\"wp-image-428\" srcset=\"https:\/\/temu.bsc.es\/livingner\/wp-content\/uploads\/2022\/05\/LivingNER_multilingual-1-1024x576.png 1024w, https:\/\/temu.bsc.es\/livingner\/wp-content\/uploads\/2022\/05\/LivingNER_multilingual-1-300x169.png 300w, https:\/\/temu.bsc.es\/livingner\/wp-content\/uploads\/2022\/05\/LivingNER_multilingual-1-768x432.png 768w, https:\/\/temu.bsc.es\/livingner\/wp-content\/uploads\/2022\/05\/LivingNER_multilingual-1-1536x864.png 1536w, https:\/\/temu.bsc.es\/livingner\/wp-content\/uploads\/2022\/05\/LivingNER_multilingual-1-2048x1152.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>LivingNER Multilingual corpus overview<\/figcaption><\/figure><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Download the LivingNER corpus (including the multilingual corpus) from&nbsp;Zenodo We have generated the annotated (and normalized to NCBI Taxonomy) training and validation sets in 6 languages: English, Portuguese, Catalan, Italian, French, and Romanian.&nbsp;The process was: The&nbsp;text files were translated with a neural machine translation system. The annotations were translated with the same&nbsp;neural machine translation system. 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4],"tags":[],"class_list":["post-406","post","type-post","status-publish","format-standard","hentry","category-data"],"_links":{"self":[{"href":"https:\/\/temu.bsc.es\/livingner\/wp-json\/wp\/v2\/posts\/406","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/temu.bsc.es\/livingner\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/temu.bsc.es\/livingner\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/temu.bsc.es\/livingner\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/temu.bsc.es\/livingner\/wp-json\/wp\/v2\/comments?post=406"}],"version-history":[{"count":3,"href":"https:\/\/temu.bsc.es\/livingner\/wp-json\/wp\/v2\/posts\/406\/revisions"}],"predecessor-version":[{"id":429,"href":"https:\/\/temu.bsc.es\/livingner\/wp-json\/wp\/v2\/posts\/406\/revisions\/429"}],"wp:attachment":[{"href":"https:\/\/temu.bsc.es\/livingner\/wp-json\/wp\/v2\/media?parent=406"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/temu.bsc.es\/livingner\/wp-json\/wp\/v2\/categories?post=406"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/temu.bsc.es\/livingner\/wp-json\/wp\/v2\/tags?post=406"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}