OLAC Record oai:lindat.mff.cuni.cz:11858/00-097C-0000-0023-10B2-F |
Metadata | ||
Title: | Many Czech References for 50 Sentences Selected from WMT11 Data | |
Bibliographic Citation: | http://hdl.handle.net/11858/00-097C-0000-0023-10B2-F | |
Creator: | Bojar, Ondřej | |
Macháček, Matouš | ||
Tamchyna, Aleš | ||
Zeman, Daniel | ||
Date (W3CDTF): | 2013-12-10T13:41:44Z | |
Date Available: | 2013-12-10T13:41:44Z | |
Description: | This dataset contains the complete, very large set of Czech translations for 50 English source sentences from the WMT11 test set (http://www.statmt.org/wmt11). In total, there are 15431447 Czech sentences, i.e. about 300k reference translations per English source sentence on average, although the exact number varies greatly across sentences. More details are in the included README file. If you use this dataset, please cite the following paper, which describes the technique used to construct the Czech translations: Bojar Ondřej, Macháček Matouš, Tamchyna Aleš, Zeman Daniel: Scratching the Surface of Possible Translations. Lecture Notes in Computer Science, Vol. 8082, Text, Speech and Dialogue: 16th International Conference, TSD 2013. Proceedings, Copyright © Springer Verlag, Berlin / Heidelberg, ISBN 978-3-642-40584-6, ISSN 0302-9743, pp. 465-474, 2013, DOI: 10.1007/978-3-642-40585-3_59 | |
P406/11/1499 of the Grant Agency of the Czech Republic, FP7-ICT-2011-7-288487 (MosesCore) of the European Union and 1356213 of the Grant Agency of the Charles University | ||
Identifier (URI): | http://hdl.handle.net/11858/00-097C-0000-0023-10B2-F | |
Language: | Czech | |
Language (ISO639): | ces | |
Publisher: | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) | |
Rights: | Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) | |
http://creativecommons.org/licenses/by-sa/3.0/ | ||
Subject: | machine translation | |
automatic machine translation evaluation | ||
reference translation | ||
Type: | corpus | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
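The "300k on average" figure in the description can be checked directly from the two counts it states. The short Python sketch below only redoes that arithmetic; the variable names are illustrative and not part of the record.

```python
# Counts taken from the Description field of this record.
TOTAL_CZECH_SENTENCES = 15431447
SOURCE_SENTENCES = 50

# Mean number of Czech reference translations per English source sentence.
average = TOTAL_CZECH_SENTENCES / SOURCE_SENTENCES

print(f"{average:,.2f} references per source sentence on average")
```

The result is an average only; the description notes that the per-sentence counts vary greatly, so it says nothing about any individual sentence.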
OLAC Info |
Archive: | LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University | |
Description: | http://www.language-archives.org/archive/lindat.mff.cuni.cz | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
OaiIdentifier: | oai:lindat.mff.cuni.cz:11858/00-097C-0000-0023-10B2-F | |
DateStamp: | 2021-06-29 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Bojar, Ondřej; Macháček, Matouš; Tamchyna, Aleš; Zeman, Daniel. 2013. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL). | |
Terms: | area_Europe country_CZ dcmi_Text iso639_ces olac_primary_text |
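The GetRecord entries above refer to OAI-PMH requests for this record in OLAC and simple DC format. As a rough sketch of how such a request URL is put together, the Python below uses the standard OAI-PMH query parameters (`verb`, `identifier`, `metadataPrefix`). The base URL is a placeholder, since this record does not state the archive's OAI-PMH endpoint, and the `olac` prefix is assumed from common OLAC practice; `oai_dc` is the prefix OAI-PMH defines for simple Dublin Core.

```python
from urllib.parse import urlencode

# Placeholder only: the real OAI-PMH endpoint is not given in this record.
BASE_URL = "https://example.org/oai/request"

# Identifier copied from the OaiIdentifier field.
OAI_ID = "oai:lindat.mff.cuni.cz:11858/00-097C-0000-0023-10B2-F"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL (illustrative helper)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


olac_url = get_record_url(OAI_ID, "olac")   # OLAC format (assumed prefix)
dc_url = get_record_url(OAI_ID, "oai_dc")   # simple Dublin Core format

print(olac_url)
print(dc_url)
```

Replacing `BASE_URL` with the repository's actual endpoint, if known, should yield requests equivalent to the GetRecord links listed in this record.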