OLAC Record oai:catalogue.elra.info:ELRA-S0388 |
Metadata | ||
Title: | GlobalPhone Bulgarian Pronunciation Dictionary 260k entries (extended version) | |
Access Rights: | Rights available for: nonCommercialUse, commercialUse | |
Date Available (W3CDTF): | 2017-04-06 | |
Date Issued (W3CDTF): | 2017-04-06 | |
Date Modified (W3CDTF): | 2017-04-06 | |
Description: | This extended version of the Bulgarian Pronunciation Dictionary, called Bulgarian-Dict260k, contains pronunciations of more than 260,000 word forms. It matches the phone set and format of the original GlobalPhone Bulgarian Pronunciation Dictionary (see ELRA-S0351), which covers 20,000 word forms. Bulgarian-Dict260k was built from an extension of the Bulgarian GlobalPhone text database, with the aim of improving language modeling and reducing the high out-of-vocabulary (OOV) rate that results from the rich morphology of Bulgarian. For this purpose, roughly 9 million word tokens of national, international, and economic news were collected from the websites of the online newspapers "Banker" (http://www.banker.bg/), "Kesh" (http://www.cash.bg), and "Sega" (http://www.segabg.com/). After text cleaning and normalization, all word forms were extracted. Pronunciations were generated automatically using hand-crafted grapheme-to-phoneme rules and then manually cross-checked by native speakers, who corrected any errors introduced by the automatic generation. | |
Identifier: | ELRA-S0388 | |
Identifier: | ISLRN: 799-402-906-876-5 | |
Identifier (URI): | https://catalog.elra.info/en-us/repository/browse/ELRA-S0388/ | |
Language: | Bulgarian | |
Language (ISO639): | bul | |
Medium: | Not specified | |
Publisher: | ELRA (European Language Resources Association) | |
Type (DCMI): | Text | |
Type (DCMI): | Sound | |
Type (OLAC): | lexicon | |
OLAC Info | ||
Archive: | ELRA Catalogue of Language Resources | |
Description: | http://www.language-archives.org/archive/catalogue.elra.info | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info | ||
OaiIdentifier: | oai:catalogue.elra.info:ELRA-S0388 | |
DateStamp: | 2017-04-06 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | n.a. 2017. ELRA (European Language Resources Association). | |
Terms: | area_Europe country_BG dcmi_Sound dcmi_Text iso639_bul olac_lexicon |