OLAC Record
oai:www.ldc.upenn.edu:LDC98T29

Metadata
Title:1997 Spanish Broadcast News Transcripts (HUB4-NE)
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Munoz, Elisa, Jennifer Alabiso, and David Graff. 1997 Spanish Broadcast News Transcripts (HUB4-NE) LDC98T29. Web Download. Philadelphia: Linguistic Data Consortium, 1998
Contributor:Munoz, Elisa
Alabiso, Jennifer
Graff, David
Date (W3CDTF):1998
Description:*Introduction* This corpus contains a portion of the acoustic data designated as the training set for the 1997 DARPA HUB4 Spanish Benchmark. It contains speech and transcripts of 30 hours of broadcast news from the following sources: Televisa, Univision and VOA. Corresponding speech data is released as 1997 Spanish Broadcast News Speech (HUB4-NE) (LDC98S74). *Data* All acoustic files are in NIST SPHERE format, without compression. The sample data are 16-bit linear PCM, 16 kHz sample frequency, single channel. Most files contain 30 minutes of recorded material, and some contain approximately 60 or 120 minutes; the sampling format requires roughly two megabytes (MB) per minute of recording, so the file sizes are typically around 60 MB, with some files ranging up to 120 or 240 MB. The transcripts are in SGML format, using the same markup conventions that have been applied to the other 1997 Broadcast News speech corpora (in English and Mandarin). *Samples* Please view this SGML sample. *Updates* There are no updates at this time.
Extent:Corpus size: 3824 KB
Identifier:LDC98T29
https://catalog.ldc.upenn.edu/LDC98T29
ISBN: 1-58563-128-0
ISLRN: 873-191-836-513-0
DOI: 10.35111/1b28-g771
Language:Spanish
Language (ISO639):spa
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC98T29
Rights Holder:Portions © 1997 Televisa S.A. de C.V., © 1997 Univision Network Limited Partnership, © 1997, 1998 Trustees of the University of Pennsylvania
Type (DCMI):Text
Type (OLAC):primary_text
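
Note on file sizes: the Description field gives roughly two megabytes per minute for 16-bit linear PCM at a 16 kHz sample frequency on a single channel. The following is a minimal sketch that checks this arithmetic. The function name is hypothetical, the SPHERE file header is ignored, and 1 MB is assumed to mean 1,000,000 bytes.

```python
def pcm_bytes_per_minute(sample_rate_hz=16_000, bits_per_sample=16, channels=1):
    """Bytes of uncompressed linear PCM audio per minute (file header not counted)."""
    bytes_per_sample = bits_per_sample // 8
    return sample_rate_hz * bytes_per_sample * channels * 60


per_minute = pcm_bytes_per_minute()

# Durations named in the Description field: 30, 60 and 120 minutes.
for minutes in (30, 60, 120):
    megabytes = per_minute * minutes / 1_000_000  # assumption: 1 MB = 10^6 bytes
    print(f"{minutes} min: about {megabytes:.1f} MB")
```

Under these assumptions a 30-minute file comes to a little under 60 MB, which is consistent with the "typically around 60 MB" figure in the Description.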

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC98T29
DateStamp:  2021-06-16
GetRecord:  OAI-PMH request for simple DC format
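
Note on GetRecord requests: an OAI-PMH GetRecord request is an HTTP request carrying the arguments verb, identifier and metadataPrefix. The sketch below builds such a URL for this record. The base URL is a placeholder, not the real endpoint of this archive, and the function name is hypothetical. The prefix oai_dc selects simple Dublin Core; the prefix olac for the OLAC format is an assumption about how the repository names it.

```python
from urllib.parse import urlencode


def get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL from its three required arguments."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Placeholder endpoint (assumption): replace with the repository's actual base URL.
BASE_URL = "https://example.org/oai"
RECORD_ID = "oai:www.ldc.upenn.edu:LDC98T29"

print(get_record_url(BASE_URL, RECORD_ID, "oai_dc"))  # simple DC format
print(get_record_url(BASE_URL, RECORD_ID, "olac"))    # OLAC format (assumed prefix)
```

The colons in the identifier are percent-encoded by urlencode, which OAI-PMH requires for special characters in request URLs.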

Search Info

Citation: Munoz, Elisa; Alabiso, Jennifer; Graff, David. 1998. Linguistic Data Consortium.
Terms: area_Europe country_ES dcmi_Text iso639_spa olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC98T29
Up-to-date as of: Thu Oct 24 7:30:06 EDT 2024