OLAC Record oai:www.ldc.upenn.edu:LDC2021S04 |
Metadata | ||
Title: | The SSNCE Database of Tamil Dysarthric Speech | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Vijayalakshmi, P., T. A. Mariya Celin, and T. Nagarajan. The SSNCE Database of Tamil Dysarthric Speech LDC2021S04. Web Download. Philadelphia: Linguistic Data Consortium, 2021 | |
Contributor: | Vijayalakshmi, P. | |
Mariya Celin, T. A. | ||
Nagarajan, T. | ||
Date (W3CDTF): | 2021 | |
Date Issued (W3CDTF): | 2021-05-17 | |
Description: | *Introduction* The SSNCE Database of Tamil Dysarthric Speech was developed by the Speech Lab, SSN College of Engineering, India, in collaboration with the Indian National Institute of Empowerment of Persons with Multiple Disabilities (NIEPMD). It contains approximately eight hours of Tamil speech data, time-aligned transcripts and metadata collected from 30 speakers (20 dysarthric speakers and 10 non-dysarthric speakers). Dysarthria is a speech disorder caused by muscle weakness, which can result in slowed and slurred speech that is difficult to understand. Common causes of dysarthria include nervous system disorders and conditions that cause facial paralysis or tongue or throat muscle weakness. *Data* The non-dysarthric speakers consisted of five female and five male subjects. The dysarthric speakers (7 female, 13 male) reported a diagnosis of cerebral palsy and ranged in age from 12 to 37 years. The speech data was collected between 2015 and 2017 in two sessions at NIEPMD. In total, each speaker recorded 365 utterances consisting of single words and of sentences that included a combination of common and uncommon Tamil phrases. The corpus includes time-aligned phonetic transcripts for all collected speech data. Additional documentation includes phoneme mappings and speaker metadata. Audio data is presented as 16-bit, 16 kHz linear PCM WAV with FLAC compression. Transcripts are presented as UTF-8 encoded plain text. *Samples* Please view the following samples: * Audio sample (FLAC) * Phonetic Transcript (TXT) * Word Transcript (TXT) * Plain Transcript (TXT) *Updates* None at this time. | |
Extent: | Corpus size: 614629 KB | |
Format: | Sampling Rate: 16000 | |
Sampling Format: pcm | ||
Identifier: | LDC2021S04 | |
https://catalog.ldc.upenn.edu/LDC2021S04 | ||
ISBN: 1-58563-965-6 | ||
ISLRN: 064-987-156-004-1 | ||
DOI: 10.35111/hkh2-vh40 | ||
Language: | Tamil | |
Language (ISO639): | tam | |
License: | The SSNCE Database of Tamil Dysarthric Speech Agreement: https://catalog.ldc.upenn.edu/license/the-ssnce-database-of-tamil-dysarthric-speech-agreement.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2021S04 | |
Rights Holder: | Portions © 2021 Speech Lab, SSN College of Engineering, © 2021 Trustees of the University of Pennsylvania | |
Type (DCMI): | Sound | |
Text | ||
Type (OLAC): | primary_text | |
OLAC Info |
||
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2021S04 | |
DateStamp: | 2022-01-01 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Vijayalakshmi, P.; Mariya Celin, T. A.; Nagarajan, T. 2021. Linguistic Data Consortium. | |
Terms: | area_Asia country_IN dcmi_Sound dcmi_Text iso639_tam olac_primary_text |
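
The GetRecord entries above refer to OAI-PMH requests for this record in OLAC and simple DC formats. As a rough sketch of how such request URLs are formed, the Python snippet below builds them from the OaiIdentifier listed in this record. The base URL is a placeholder, since the record does not state the repository's OAI-PMH endpoint, and the helper name get_record_url is hypothetical. The metadataPrefix values "olac" and "oai_dc" are the conventional prefixes for those two formats, but a given repository's supported prefixes should be confirmed with a ListMetadataFormats request.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): the record does not give the real base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2021S04"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode({
        "verb": "GetRecord",                # OAI-PMH verb for a single record
        "identifier": identifier,           # the record's OAI identifier
        "metadataPrefix": metadata_prefix,  # requested metadata format
    })
    return f"{base_url}?{query}"


# One request per format mentioned in the record.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

print(olac_url)
print(dc_url)
```

Replacing BASE_URL with the archive's actual endpoint should yield requests equivalent to the two GetRecord links, assuming the repository follows the standard OAI-PMH query parameters.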