OLAC Record oai:www.ldc.upenn.edu:LDC2024S06 |
Metadata | ||
Title: | Diaspora Tibetan Speech | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Geissler, Christopher, Sarah Babinski, and Jason Shaw. Diaspora Tibetan Speech LDC2024S06. Web Download. Philadelphia: Linguistic Data Consortium, 2024 | |
Contributor: | Geissler, Christopher | |
Babinski, Sarah | ||
Shaw, Jason | ||
Date (W3CDTF): | 2024 | |
Date Issued (W3CDTF): | 2024-06-17 | |
Description: | *Introduction* Diaspora Tibetan Speech was developed at Yale University. It contains approximately 28 hours of elicited Tibetan speech from 73 speakers in the diaspora Tibetan community in Kathmandu, Nepal, along with transcripts, elicitation materials, and speaker demographic information. *Data* Recordings were collected in 2016. All speakers were adults; they varied in age and in age at diaspora. A substantial number of speakers were born in Nepal. Each speaker contributed one recording comprising a series of elicitation tasks: some demographic information; a word list and numbers; some sentences in isolation; a scripted story; and free speech based on "frog story" type illustrations. All elicitation materials are included with the corpus documentation in PDF format. The word-list and number-list sections of the recordings were time-aligned at the word level as Praat TextGrids. Five recordings were fully transcribed word-for-word by a native Tibetan speaker and are presented in both Microsoft Word and PDF formats to preserve font encoding. These transcripts are not time-aligned but include general time stamps. Other transcripts are available as Excel spreadsheets with word-to-word correspondence of Tibetan script, phonetic transcription, and English translation. Demographic information includes age at recording, age at diaspora, and other information. The audio data is presented as single-channel, 16 kHz, 16-bit wav files. *Sample* Please view the following samples: * Audio (wav) * Transcription (docx) * Dictionary (xlsx) * TextGrid *Updates* None at this time. | |
Extent: | Corpus size: 3100518 KB | |
Format: | Sampling Rate: 16000 | |
Sampling Format: pcm | ||
Identifier: | LDC2024S06 | |
https://catalog.ldc.upenn.edu/LDC2024S06 | ||
ISLRN: 883-684-044-738-1 | ||
DOI: 10.35111/b8wr-w485 | ||
Language: | Tibetan | |
Language (ISO639): | bod | |
License: | LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2024S06 | |
Rights Holder: | Portions © 2024 Dr. Christopher Geissler, © 2024 Dr. Sarah Babinski, © 2024 Dr. Jason Shaw, © 2024 Trustees of the University of Pennsylvania | |
Type (DCMI): | Sound | |
Text | ||
Type (OLAC): | primary_text | |
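The audio format stated in this record (single channel, 16000 Hz sampling rate, 16-bit PCM) can be checked against a downloaded file. The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav_format` and the constant names are illustrative assumptions and are not part of the corpus or its documentation.

```python
import wave

# Expected values taken from the Description and Format fields of this record.
EXPECTED_CHANNELS = 1       # "single channel"
EXPECTED_RATE_HZ = 16000    # "Sampling Rate: 16000"
EXPECTED_SAMPLE_BYTES = 2   # "16-bit" PCM is 2 bytes per sample


def check_wav_format(path):
    """Return a list of mismatches between a WAV header and the stated format.

    An empty list means the header agrees with the record.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channels: {wav.getnchannels()}")
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate (Hz): {wav.getframerate()}")
        if wav.getsampwidth() != EXPECTED_SAMPLE_BYTES:
            problems.append(f"sample width (bytes): {wav.getsampwidth()}")
    return problems
```

Calling `check_wav_format` with the path to one of the corpus wav files would be expected to return an empty list if the file matches the record. Only the header is inspected, not the audio content.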
OLAC Info |
||
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2024S06 | |
DateStamp: | 2024-06-20 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Geissler, Christopher; Babinski, Sarah; Shaw, Jason. 2024. Linguistic Data Consortium. | |
Terms: | area_Asia country_CN dcmi_Sound dcmi_Text iso639_bod olac_primary_text |
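The GetRecord entries above refer to OAI-PMH requests for this record in two metadata formats. As a rough sketch, such a request URL is built from the `verb`, `identifier`, and `metadataPrefix` parameters defined by OAI-PMH. The base URL below is a hypothetical placeholder, since this record does not state the repository endpoint, and the helper name `get_record_url` is illustrative. `oai_dc` is the standard prefix for simple Dublin Core; `olac` is assumed here to be the prefix for the OLAC format.

```python
from urllib.parse import urlencode


def get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Hypothetical placeholder: substitute the repository's actual OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# OaiIdentifier as listed in this record.
IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2024S06"

# Simple DC format (oai_dc) and, by assumption, OLAC format (olac).
print(get_record_url(BASE_URL, IDENTIFIER, "oai_dc"))
print(get_record_url(BASE_URL, IDENTIFIER, "olac"))
```

`urlencode` percent-encodes the colons in the identifier, which keeps the query string valid.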