OLAC Record oai:catalogue.elra.info:ELRA-W0090 |
Metadata | ||
Title: | EUROPARL Corpus Parallel Corpora: Portuguese-English | |
Access Rights: | Rights available for: nonCommercialUse, commercialUse | |
Date Available (W3CDTF): | 2016-01-20 | |
Date Issued (W3CDTF): | 2016-01-20 | |
Date Modified (W3CDTF): | 2016-01-20 | |
Description: | The EUROPARL Corpus (Portuguese-English subpart of the parallel corpora) was extracted from the proceedings of the European Parliament. It contains transcriptions of sessions dating from 1996 to 2011, with a total of approximately 58,324,562 tokens of European Portuguese (L1) and 49,216,896 tokens of English (translation). The EUROPARL Corpus is composed of one text file for the English corpus and two files for the Portuguese version: a text file and an annotated file. The text version contains plain text and no further annotation. The Portuguese annotated file is a four-column file with one token per line, followed by a PoS tag and a lemma. The corpus was automatically PoS-tagged with the MBT tagger (http://ilk.uvt.nl/mbt/) and lemmatized with MBLEM (http://ilk.uvt.nl/mbma/), following the annotation scheme of the Corpus of Reference of Contemporary Portuguese. | |
Identifier: | ELRA-W0090 | |
ISLRN: 435-502-922-727-2 | ||
Identifier (URI): | https://catalog.elra.info/en-us/repository/browse/ELRA-W0090/ | |
Language: | English | |
Portuguese | ||
Language (ISO639): | eng | |
por | ||
Medium: | Not specified | |
Publisher: | ELRA (European Language Resources Association) | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info |
||
Archive: | ELRA Catalogue of Language Resources | |
Description: | http://www.language-archives.org/archive/catalogue.elra.info | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:catalogue.elra.info:ELRA-W0090 | |
DateStamp: | 2016-01-20 | |
GetRecord: | OAI-PMH request for simple DC format | |
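The GetRecord entries above correspond to OAI-PMH requests for this record. As a minimal sketch, such a request URL can be assembled from the OaiIdentifier and a metadata prefix. The base URL below is a hypothetical placeholder, since this record does not state the repository endpoint. `oai_dc` is the standard OAI-PMH prefix for simple Dublin Core; `olac` as the prefix for the OLAC format is an assumption. The function name is illustrative only, and no network request is made.

```python
from urllib.parse import urlencode


def build_get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Return an OAI-PMH GetRecord request URL for a single record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Hypothetical endpoint: replace with the repository's real OAI-PMH base URL.
BASE_URL = "https://example.org/oai"
OAI_IDENTIFIER = "oai:catalogue.elra.info:ELRA-W0090"

# Simple DC format (standard prefix).
dc_url = build_get_record_url(BASE_URL, OAI_IDENTIFIER, "oai_dc")
# OLAC format (prefix name assumed).
olac_url = build_get_record_url(BASE_URL, OAI_IDENTIFIER, "olac")

print(dc_url)
print(olac_url)
```

The identifier is percent-encoded by `urlencode`, so the colons in the OaiIdentifier appear as `%3A` in the resulting URL.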
Search Info | ||
Citation: | n.a. 2016. ELRA (European Language Resources Association). | |
Terms: | area_Europe country_GB country_PT dcmi_Text iso639_eng iso639_por olac_primary_text |
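The Description field in this record says the annotated Portuguese file has one token per line, followed by a PoS tag and a lemma, and calls it a four-column file. The following is a minimal sketch of reading such a file, under several assumptions not confirmed by the record: columns are tab-separated, the first three are token, PoS tag, and lemma in that order, and any further column is kept uninterpreted. The names `AnnotatedToken` and `parse_annotated_lines` are hypothetical, and the sample lines and tags are made up for illustration; they are not taken from the corpus or its tagset.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class AnnotatedToken(NamedTuple):
    token: str
    pos: str
    lemma: str
    extra: Tuple[str, ...]  # any remaining columns, left uninterpreted


def parse_annotated_lines(
    lines: Iterable[str], sep: str = "\t"
) -> Iterator[AnnotatedToken]:
    """Yield one AnnotatedToken per non-blank line (separator is an assumption)."""
    for line_no, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(sep)
        if len(fields) < 3:
            raise ValueError(f"line {line_no}: expected at least 3 columns, got {len(fields)}")
        token, pos, lemma, *rest = fields
        yield AnnotatedToken(token, pos, lemma, tuple(rest))


# Invented sample lines; the tags are placeholders, not the corpus tagset.
sample = ["casa\tN\tcasa\n", "\n", "falou\tV\tfalar\tX\n"]
parsed = list(parse_annotated_lines(sample))
for item in parsed:
    print(item.token, item.pos, item.lemma, item.extra)
```

For a real file, the same function can be given an open file handle, for example `parse_annotated_lines(open(path, encoding="utf-8"))`, where the path and the encoding are likewise assumptions to check against the distributed data.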