PubMed is a free search engine used to access the MEDLINE database, a bibliographical database of references and abstracts on life sciences and biomedical topics. It is maintained by the U.S. National Library of Medicine (NLM).

This corpus contains titles and abstracts from 127,619 records. The metadata of each record is provided in Dublin Core format, together with the original XML file of the record as supplied by PubMed.

Languages:
- English
- Spanish

This corpus contains the following files:

- Pubmed-dublin_core-Sp-En.tar.bz2: This file contains the metadata files for each PubMed record in Dublin Core format. The titles and abstracts in English and Spanish are found inside these XML files.
  - '' contains the title in Spanish.
  - '' contains the title in English.
  - '' contains the abstract in Spanish.
  - '' contains the abstract in English.

  Other XML nodes include the following information:
  - '': PubMed identifier (PMID).
  - '': ISSN code of the journal.
  - '': authors of the article.
  - '': language the article was originally written in.
  - '': name of the journal the record was published in.
  - '': PubMed identifier (PMID).
  - '': record's keywords.
  - '': record type.
  - '': date of publication (year and month).

- Pubmed-original_xml-Sp-En.tar.bz2: This file contains the original metadata for each record found in the PubMed repository, in XML format. The nodes listed here are the ones mapped to their equivalents in the Dublin Core metadata standard:
  - '': title in Spanish.
  - '': title in English.
  - '': abstract in Spanish.
  - '': abstract in English.
  - '': PubMed identifier.
  - '': ISSN code of the journal.
  - '': authors of the article.
  - '': language the article was originally written in.
  - '' (inside node '<Journal>'): name of the journal the record was published in.
  - '<Keyword>': record's keywords.
  - '<PublicationType>': record type.
  - '<DateCompleted>': date of publication (year and month).

This corpus is available for free use.
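
As a rough illustration of how the original-XML archive might be read, the sketch below walks a .tar.bz2 file and pulls out the '<Keyword>', '<PublicationType>' and '<DateCompleted>' nodes named above. It rests on assumptions not stated in this description: that the archive stores one .xml file per record, and that '<DateCompleted>' holds '<Year>' and '<Month>' child nodes. The function names `parse_record` and `iter_records` are hypothetical helpers, not part of the corpus.

```python
import tarfile
import xml.etree.ElementTree as ET


def parse_record(xml_bytes):
    """Extract keywords, record types and completion date from one XML record."""
    root = ET.fromstring(xml_bytes)
    # iter() searches the whole tree, so the nesting depth of each node does not matter.
    keywords = ["".join(el.itertext()).strip() for el in root.iter("Keyword")]
    pub_types = ["".join(el.itertext()).strip() for el in root.iter("PublicationType")]
    date = next(root.iter("DateCompleted"), None)
    year = month = None
    if date is not None:
        # Assumption: <DateCompleted> contains <Year> and <Month> children.
        year = date.findtext("Year")
        month = date.findtext("Month")
    return {"keywords": keywords, "types": pub_types, "year": year, "month": month}


def iter_records(archive_path):
    """Yield (member name, parsed fields) for every .xml file in the archive."""
    with tarfile.open(archive_path, "r:bz2") as archive:
        for member in archive:
            # Assumption: each record is stored as a regular file ending in .xml.
            if not member.isfile() or not member.name.endswith(".xml"):
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            yield member.name, parse_record(handle.read())
```

A call such as `for name, fields in iter_records("Pubmed-original_xml-Sp-En.tar.bz2"): ...` would then stream the records one at a time without unpacking the archive to disk. The same pattern should carry over to the Dublin Core archive once the node names in those files are known.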