OLAC Record
oai:www.ldc.upenn.edu:LDC2024S08

Metadata
Title:Dialogs Re-Enacted Across Languages
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Ward, Nigel G., et al. Dialogs Re-Enacted Across Languages LDC2024S08. Web Download. Philadelphia: Linguistic Data Consortium, 2024
Contributor:Ward, Nigel G.
Avila, Jonathan E.
Rivas, Emilia
Marco, Divette
Date (W3CDTF):2024
Date Issued (W3CDTF):2024-07-15
Description:*Introduction* Dialogs Re-Enacted Across Languages was developed at the University of Texas at El Paso. It contains approximately 17 hours of conversational speech in English and Spanish by 129 unique bilingual speakers. Specifically, it consists of short fragments extracted from spontaneous conversations, each paired with a close re-enactment in the other language by the original speakers, for a total of 3816 pairs of matching utterances.
*Data* Data was collected in 2022-2023. Participants were recruited from among students at the University of Texas at El Paso, which is located on the US-Mexico border. All participants were bilingual speakers of General American English and of Mexico-Texas Border Spanish. Their self-described dialect was El Paso for English and mostly "El Paso/Juarez" for Spanish. Each speaker pair had a ten-minute conversation in one language. Various fragments of these conversations were chosen for re-enactment, and the original speakers produced equivalents in the other language. Each re-enactment was vetted for fidelity to the original and for naturalness in the target language. After recording, fragments were mapped to the translated re-enactments using ELAN, an annotation tool for audio and video recordings. Metadata about conversations, participants, re-enactments and utterances is included in this release. The audio data is presented as FLAC-compressed, single-channel, 16 kHz, 16-bit linear PCM.
*Samples* English utterance (FLAC); Spanish utterance (FLAC).
*Updates* None at this time.
Extent:Corpus size: 974652 KB
Format:Sampling Rate: 16000
Sampling Format: pcm
Identifier:LDC2024S08
https://catalog.ldc.upenn.edu/LDC2024S08
ISLRN: 859-445-294-766-0
DOI: 10.35111/2pac-j365
Language:English
Spanish
Language (ISO639):eng
spa
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2024S08
Rights Holder:Portions © 2024 The University of Texas at El Paso, © 2024 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text
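
As a rough consistency check on the figures in the Description, Format, and Extent fields above, the following Python sketch estimates the size of the audio as raw PCM and compares it with the stated corpus size. It assumes exactly 17 hours of audio (the record says "approximately"), treats 1 KB as 1024 bytes (the record does not say which convention applies), and ignores the fact that the release also contains metadata files. The function name is illustrative only.

```python
# Figures copied from the record; everything else is an assumption.
HOURS = 17                # "approximately 17 hours", taken as exact here
SAMPLE_RATE_HZ = 16_000   # Format: Sampling Rate
BYTES_PER_SAMPLE = 2      # 16-bit linear PCM
CHANNELS = 1              # single channel
CORPUS_SIZE_KB = 974_652  # Extent: Corpus size


def uncompressed_pcm_bytes(hours, rate_hz, bytes_per_sample, channels):
    """Return the size in bytes of raw PCM audio with the given parameters."""
    seconds = hours * 3600
    return int(seconds * rate_hz * bytes_per_sample * channels)


raw_bytes = uncompressed_pcm_bytes(
    HOURS, SAMPLE_RATE_HZ, BYTES_PER_SAMPLE, CHANNELS
)

# Assumption: "KB" in the Extent field means 1024 bytes.
release_bytes = CORPUS_SIZE_KB * 1024

# Fraction of the raw PCM size that the whole release occupies.
ratio = release_bytes / raw_bytes

print(f"raw PCM estimate: {raw_bytes} bytes")
print(f"release size:     {release_bytes} bytes")
print(f"release / raw:    {ratio:.2f}")
```

Under these assumptions the release is about half the size of the raw PCM estimate, which is plausible for lossless FLAC compression of speech, but the estimate is only as good as the "approximately 17 hours" figure.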

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2024S08
DateStamp:  2024-09-27
GetRecord:  OAI-PMH request for simple DC format
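
The GetRecord entries above refer to OAI-PMH requests for this record. The Python sketch below shows how such a request URL is generally assembled from the `verb`, `identifier`, and `metadataPrefix` arguments defined by OAI-PMH. The base URL is a placeholder, because this record does not state the repository's actual endpoint. The `oai_dc` prefix is the simple Dublin Core format required by OAI-PMH, while `olac` as the prefix for the OLAC format is an assumption that should be checked against the repository's ListMetadataFormats response.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real base URL is not given in this record.
BASE_URL = "https://example.org/oai"

# Identifier copied from the OaiIdentifier field above.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2024S08"


def get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Assumed prefix for the OLAC format.
olac_url = get_record_url(BASE_URL, OAI_IDENTIFIER, "olac")
# Simple Dublin Core, which every OAI-PMH repository must support.
dc_url = get_record_url(BASE_URL, OAI_IDENTIFIER, "oai_dc")

print(olac_url)
print(dc_url)
```

`urlencode` percent-encodes the colons in the identifier, so the identifier can be passed exactly as it appears in the record.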

Search Info

Citation: Ward, Nigel G.; Avila, Jonathan E.; Rivas, Emilia; Marco, Divette. 2024. Linguistic Data Consortium.
Terms: area_Europe country_ES country_GB dcmi_Sound iso639_eng iso639_spa olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2024S08
Up-to-date as of: Thu Oct 24 7:31:29 EDT 2024