OLAC Record oai:www.ldc.upenn.edu:LDC2014S02 |
Metadata | ||
Title: | King Saud University Arabic Speech Database | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Alsulaiman, Mansour, et al. King Saud University Arabic Speech Database LDC2014S02. Web Download. Philadelphia: Linguistic Data Consortium, 2014 | |
Contributor: | Alsulaiman, Mansour | |
Muhammad, Ghulam | ||
Abdelkader, Bencherif Mohamed | ||
Mahmood, Awais | ||
Ali, Zulfiqar | ||
Date (W3CDTF): | 2014 | |
Date Issued (W3CDTF): | 2014-02-17 | |
Description: | *Introduction* King Saud University Arabic Speech Database was developed by the Speech Group (SG) at King Saud University and contains 590 hours of recorded Arabic speech from 269 male and female speakers. The utterances include read and spontaneous speech. The recordings were conducted in varied environments representing quiet and noisy settings.
*Data* The corpus was designed principally for speaker recognition research. However, other possible applications include first language recognition, mobile effect, multichannel effect, and the use of different types of microphones. The speech sources are word lists, sentence lists, paragraphs, and question and answer sessions. Read speech text includes the following:
* Sets of sentences devised to cover allophones of each phoneme, phonetic balance, and differentiation of accents.
* Word lists developed to minimize missing phonemes and to represent nasals, fricatives, commonly used words, and numbers.
* Two paragraphs selected because they included all letters of the alphabet and were easy to read.
Spontaneous speech was captured through question and answer sessions in which speakers answered questions displayed on screen. The questions were on general topics such as the weather and food and included the speaker name or number. The speakers were Saudis and non-Saudis. Among the non-Saudi participants were Arabs and non-Arabs. All female speakers were either Saudis or non-Saudi Arabs. Male speakers included non-Arabs from the Indian subcontinent, Africa, South East Asia and East Europe. Non-Arab participants were required to be able to read Arabic at an acceptable level. Most of the non-Arab speakers were from the fourth level in the Arabic Linguistics Institute at King Saud University. The non-Saudi participants represented 28 nationalities and were chosen from clusters of areas or countries. Each speaker was recorded in three different environments: in a soundproof room, in an office, and in a cafeteria. The recordings were collected via different microphones and a mobile phone and averaged between 16 and 19 minutes. The recordings were done in three sessions with a gap of approximately 6 weeks between sessions. The data was checked for missing recordings, problems with the recording system, and errors in the recording process. All files are presented as two-channel, 48 kHz, 16-bit FLAC-compressed PCM wav files. Note that sizes and file names in the documentation are for the uncompressed wav files.
*Samples* Please view this male sample and female sample.
*Updates* None at this time. | |
Extent: | Corpus size: 148897792 KB | |
Format: | Sampling Rate: 48000 | |
Sampling Format: pcm | ||
Identifier: | LDC2014S02 | |
https://catalog.ldc.upenn.edu/LDC2014S02 | ||
ISBN: 1-58563-669-X | ||
ISLRN: 789-673-729-277-5 | ||
DOI: 10.35111/vpqe-bz17 | ||
Language: | Arabic | |
Language (ISO639): | ara | |
License: | King Saud University Arabic Speech Database: https://catalog.ldc.upenn.edu/license/ksu-arabic-speech-database.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2014S02 | |
Rights Holder: | Portions © 2014 King Saud University, © 2014 Trustees of the University of Pennsylvania | |
Type (DCMI): | Sound | |
Type (OLAC): | primary_text | |
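The Description notes that the documented sizes refer to uncompressed wav files, while the Extent field gives the size of the FLAC distribution. As a rough sanity check, the sketch below estimates the uncompressed size from the figures stated in this record. It rests on assumptions the record does not confirm: that the 590 hours count file duration, that every file carries two 16-bit channels at 48 kHz, and that "KB" means 1024 bytes. The function names are illustrative only.

```python
# Figures quoted in the record above.
SAMPLE_RATE_HZ = 48_000
BITS_PER_SAMPLE = 16
CHANNELS = 2
RECORDED_HOURS = 590
CORPUS_SIZE_KB = 148_897_792

# Assumption: "KB" in the Extent field means 1024 bytes.
BYTES_PER_KB = 1024


def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


def uncompressed_bytes(hours):
    """Size in bytes if every recorded hour were stored as two-channel PCM."""
    rate = pcm_bytes_per_second(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
    return hours * 3600 * rate


if __name__ == "__main__":
    raw = uncompressed_bytes(RECORDED_HOURS)
    distributed = CORPUS_SIZE_KB * BYTES_PER_KB
    print(f"Estimated uncompressed size: {raw / 1e9:.1f} GB")
    print(f"Distributed (FLAC) size:     {distributed / 1e9:.1f} GB")
    print(f"Distributed / uncompressed:  {distributed / raw:.2f}")
```

The result is only an order-of-magnitude guide for planning storage; the corpus documentation remains the authority on actual file sizes.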
OLAC Info |
||
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2014S02 | |
DateStamp: | 2020-11-30 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Alsulaiman, Mansour; Muhammad, Ghulam; Abdelkader, Bencherif Mohamed; Mahmood, Awais; Ali, Zulfiqar. 2014. Linguistic Data Consortium. | |
Terms: | dcmi_Sound iso639_ara olac_primary_text |
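The GetRecord entries above refer to OAI-PMH requests for this record. A minimal sketch of how such a request URL is assembled follows. The `verb`, `identifier`, and `metadataPrefix` parameters and the `oai_dc` prefix come from the OAI-PMH protocol; the base URL is a hypothetical placeholder because this record does not state the archive's endpoint, and the `olac` prefix for the OLAC format is an assumption.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the record does not give the OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2014S02"


def get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL with percent-encoded arguments."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Simple Dublin Core, which every OAI-PMH repository must support.
    print(get_record_url(BASE_URL, OAI_IDENTIFIER, "oai_dc"))
    # OLAC format; the prefix name "olac" is assumed here.
    print(get_record_url(BASE_URL, OAI_IDENTIFIER, "olac"))
```

Replacing `BASE_URL` with the archive's actual endpoint would yield a request that returns the record as XML.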