OLAC Record oai:www.ldc.upenn.edu:LDC2024S08 |
Metadata | ||
Title: | Dialogs Re-Enacted Across Languages | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Ward, Nigel G., et al. Dialogs Re-Enacted Across Languages LDC2024S08. Web Download. Philadelphia: Linguistic Data Consortium, 2024 | |
Contributor: | Ward, Nigel G. | |
Avila, Jonathan E. | ||
Rivas, Emilia | ||
Marco, Divette | ||
Date (W3CDTF): | 2024 | |
Date Issued (W3CDTF): | 2024-07-15 | |
Description: | *Introduction* Dialogs Re-Enacted Across Languages was developed at the University of Texas at El Paso. It contains approximately 17 hours of conversational speech in English and Spanish by 129 unique bilingual speakers: short fragments extracted from spontaneous conversations, paired with close re-enactments in the other language by the original speakers, for a total of 3816 pairs of matching utterances. *Data* Data was collected in 2022-2023. Participants were recruited from among students at the University of Texas at El Paso, which is located on the US-Mexico border. All participants were bilingual speakers of General American English and of Mexico-Texas Border Spanish. Their self-described dialects were El Paso for English and, mostly, "El Paso/Juarez" for Spanish. Each speaker pair had a ten-minute conversation in one language. Various fragments of these conversations were chosen for re-enactment, and the original speakers produced equivalents in the other language. Each re-enactment was vetted for fidelity to the original and naturalness in the target language. After recording, fragments were mapped to the translated re-enactments using ELAN, an annotation tool for audio and video recordings. Metadata about conversations, participants, re-enactments and utterances is included in this release. The audio data is presented as FLAC-compressed, single-channel, 16 kHz, 16-bit linear PCM. *Samples* Please listen to the following samples: * English utterance (FLAC) * Spanish utterance (FLAC) *Updates* None at this time. | |
Extent: | Corpus size: 974652 KB | |
Format: | Sampling Rate: 16000 | |
Sampling Format: pcm | ||
Identifier: | LDC2024S08 | |
https://catalog.ldc.upenn.edu/LDC2024S08 | ||
ISLRN: 859-445-294-766-0 | ||
DOI: 10.35111/2pac-j365 | ||
Language: | English | |
Spanish | ||
Language (ISO639): | eng | |
spa | ||
License: | LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2024S08 | |
Rights Holder: | Portions © 2024 The University of Texas at El Paso, © 2024 Trustees of the University of Pennsylvania | |
Type (DCMI): | Sound | |
Type (OLAC): | primary_text | |
OLAC Info | ||
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info | ||
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2024S08 | |
DateStamp: | 2024-09-27 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Ward, Nigel G.; Avila, Jonathan E.; Rivas, Emilia; Marco, Divette. 2024. Linguistic Data Consortium. | |
Terms: | area_Europe country_ES country_GB dcmi_Sound iso639_eng iso639_spa olac_primary_text |