OLAC Record oai:www.ldc.upenn.edu:LDC2018T17 |
Metadata | ||
Title: | TRAD Chinese-French Parallel Text -- Broadcast News | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Linguistic Data Consortium, and ELDA. TRAD Chinese-French Parallel Text -- Broadcast News LDC2018T17. Web Download. Philadelphia: Linguistic Data Consortium, 2018 | |
Contributor: | Linguistic Data Consortium | |
ELDA | ||
Date (W3CDTF): | 2018 | |
Date Issued (W3CDTF): | 2018-07-16 | |
Description: | *Introduction*

TRAD Chinese-French Parallel Text -- Broadcast News was developed by ELDA as part of the PEA-TRAD project. It contains French translations of a subset of approximately 30,000 Chinese characters from GALE Phase 1 Chinese Broadcast News Parallel Text - Part 3 (LDC2008T18).

The PEA-TRAD project (Translation as a Support for Document Analysis) was supported by the French Ministry of Defense (DGA). Its purpose was to develop speech-to-speech translation technology for multiple languages (e.g., Arabic, Chinese, Pashto) from a variety of domains. ELDA developed several corpora for this effort.

The Linguistic Data Consortium (LDC) has also released the following TRAD corpora:

* TRAD Chinese-French Parallel Text -- Blog (LDC2018T02)
* TRAD Arabic-French Parallel Text -- Newsgroup (LDC2018T13)
* TRAD Arabic-French Parallel Text -- Newswire (LDC2018T21)

*Data*

This release consists of 977 segments (translation units) from 139 documents. The source data is Chinese broadcast news collected and translated into English by LDC for the DARPA GALE (Global Autonomous Language Exploitation) program.

Information about the ELDA translation team, translation guidelines and validation results is contained in the documentation accompanying this release.

The Chinese source file contains 33,571 characters and the French reference translation contains 22,424 words. The data is presented in two Unicode-encoded XML files along with an associated DTD.

*Samples*

Please view this source sample and reference sample.

*Updates*

None at this time. | |
Extent: | Corpus size: 1232 KB | |
Identifier: | LDC2018T17 | |
https://catalog.ldc.upenn.edu/LDC2018T17 | ||
ISBN: 1-58563-853-6 | ||
ISLRN: 294-874-641-186-6 | ||
DOI: 10.35111/7fw4-ev85 | ||
Language: | Mandarin Chinese | |
French | ||
Language (ISO639): | cmn | |
fra | ||
License: | TRAD Chinese-French Parallel Text – Broadcast News Agreement (For-Profit): https://catalog.ldc.upenn.edu/license/trad-chinese-french-parallel-text-broadcast-news-agreement-for-profit.pdf | |
TRAD Chinese-French Parallel Text – Broadcast News Agreement (Non-Member): https://catalog.ldc.upenn.edu/license/trad-chinese-french-parallel-text-broadcast-news-agreement-non-member.pdf | ||
TRAD Chinese-French Parallel Text – Broadcast News Agreement (Not-For-Profit): https://catalog.ldc.upenn.edu/license/trad-chinese-french-parallel-text-broadcast-news-agreement-not-for-profit.pdf | ||
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2018T17 | |
Rights Holder: | Portions © 2005, 2006 China Central TV, © 2005, 2006 Phoenix TV, © 2018 ELDA, © 2005-2006, 2008, 2018 Trustees of the University of Pennsylvania | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info |
||
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2018T17 | |
DateStamp: | 2020-11-30 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Linguistic Data Consortium; ELDA. 2018. Linguistic Data Consortium. | |
Terms: | area_Asia area_Europe country_CN country_FR dcmi_Text iso639_cmn iso639_fra olac_primary_text |
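
The GetRecord entries above refer to OAI-PMH requests for this record. As a minimal sketch of how such a request URL could be assembled, the example below combines the standard OAI-PMH arguments (`verb`, `identifier`, `metadataPrefix`) with the OaiIdentifier listed in this record. The base URL is a placeholder, since the record does not state the endpoint, and the helper name `get_record_url` is hypothetical. `oai_dc` is the prefix the OAI-PMH specification defines for simple Dublin Core; the `olac` prefix for the OLAC format is an assumption here.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): the record does not give the OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# OAI identifier taken from the "OaiIdentifier" field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2018T17"


def get_record_url(identifier: str, metadata_prefix: str, base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    params = {
        "verb": "GetRecord",                 # OAI-PMH verb for fetching a single record
        "identifier": identifier,            # the record's OAI identifier
        "metadataPrefix": metadata_prefix,   # requested metadata format
    }
    # urlencode percent-encodes reserved characters such as ":" in the identifier.
    return f"{base_url}?{urlencode(params)}"


# Simple Dublin Core request ("oai_dc" is defined by the OAI-PMH specification).
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

# OLAC-format request (the "olac" prefix is an assumption).
olac_url = get_record_url(OAI_IDENTIFIER, "olac")

print(dc_url)
print(olac_url)
```

Replacing `BASE_URL` with the archive's actual endpoint would be needed before sending either request.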