OLAC Record: Speech Sentiment Annotations

OLAC Record
oai:www.ldc.upenn.edu:LDC2020T14

Metadata

Title: Speech Sentiment Annotations

Access Rights: Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining

Bibliographic Citation: Chen, Eric Y., et al. Speech Sentiment Annotations LDC2020T14. Web Download. Philadelphia: Linguistic Data Consortium, 2020

Contributor: Chen, Eric Y.; Lu, Zhiyun; Xu, Hao; Cao, Liangliang; Zhang, Yu; Fan, James

Date (W3CDTF): 2020

Date Issued (W3CDTF): 2020-07-15

Description: *Introduction* Speech Sentiment Annotations was developed by Google Inc. It consists of sentiment labels (positive, negative, neutral) for approximately 49,500 utterances covering 140 hours of audio from Switchboard-1 Release 2 (LDC97S62). Switchboard-1 Release 2 consists of approximately 260 hours of telephone speech from 543 speakers across the United States (302 male speakers, 241 female speakers). A computer-driven telephone collection platform paired two subjects for each conversation and provided a discussion topic. No two speakers conversed together more than once and no one speaker talked more than once on a given topic.

*Data* Switchboard speech files were segmented based on the start and end time of transcript turns. Annotators listened to the audio corresponding to each segment (utterance) and classified each into positive, negative or neutral categories based on the emotion and attitude of the speaker. Annotators provided a justification for positive and negative classifications using a flow chart. Further information about the methodology and annotation process is contained in the documentation accompanying this release. The data is stored as a single UTF-8 encoded tab-delimited file. The annotation column in each row includes judgments from at least three annotators.

*Samples* Please view the following sample (TXT).

*Updates* None at this time.
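
The description states that the data is a single UTF-8 encoded tab-delimited file whose annotation column holds judgments from at least three annotators. The following is a minimal Python sketch of how such a file might be read and reduced to one label per utterance by majority vote. The exact column layout is not given in this record, so the position of the annotation column (`annotation_col`), the separator between judgments (`judgment_sep`), and the sample rows are assumptions made for illustration only; the documentation accompanying the release should be consulted for the real layout.

```python
import csv
import io
from collections import Counter

# The three categories named in the description.
LABELS = ("positive", "negative", "neutral")


def majority_labels(stream, annotation_col=-1, judgment_sep="#"):
    """Yield (row, label) for each row of a tab-delimited stream.

    annotation_col and judgment_sep are ASSUMPTIONS about the file layout,
    not facts taken from the corpus documentation.
    label is None when no judgment is recognized or the top vote is tied.
    """
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:  # skip blank lines
            continue
        votes = Counter()
        for judgment in row[annotation_col].split(judgment_sep):
            text = judgment.strip().lower()
            for label in LABELS:
                # Count a judgment by the category it starts with, so any
                # trailing justification text is ignored.
                if text.startswith(label):
                    votes[label] += 1
                    break
        if not votes:
            yield row, None
            continue
        (top, top_n), *rest = votes.most_common()
        tied = bool(rest) and rest[0][1] == top_n
        yield row, (None if tied else top)


if __name__ == "__main__":
    # Hypothetical rows; the identifiers and columns are invented for the demo.
    sample = (
        "utt_a\t0.0\t3.2\tPositive#Neutral#Positive\n"
        "utt_b\t3.2\t5.0\tNegative#Neutral#Positive\n"
    )
    for row, label in majority_labels(io.StringIO(sample)):
        print(row[0], label)
```

For a file on disk, the same function could be given `open(path, encoding="utf-8", newline="")` in place of the in-memory sample. Returning `None` on a tie is one possible choice; the record does not say how disagreements between annotators should be resolved.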

Extent: Corpus size: 2290 KB

Identifier: LDC2020T14

https://catalog.ldc.upenn.edu/LDC2020T14

ISBN: 1-58563-939-7

ISLRN: 793-392-711-640-2

DOI: 10.35111/HBK2-0H71

Language: English

Language (ISO639): eng

License: LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf

Medium: Distribution: Web Download

Publisher: Linguistic Data Consortium

Publisher (URI): https://www.ldc.upenn.edu

Relation (URI): https://catalog.ldc.upenn.edu/docs/LDC2020T14

Rights Holder: Portions © 1997, 2020 Trustees of the University of Pennsylvania

Type (DCMI): Text

Type (OLAC): primary_text

OLAC Info

Archive: The LDC Corpus Catalog

Description: http://www.language-archives.org/archive/www.ldc.upenn.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:www.ldc.upenn.edu:LDC2020T14

DateStamp: 2021-01-01

GetRecord: OAI-PMH request for simple DC format
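
The GetRecord entries above refer to OAI-PMH requests for this record. As a rough sketch, such a request is an HTTP GET whose query string carries the `verb`, `identifier`, and `metadataPrefix` arguments defined by the OAI-PMH protocol. The base URL below is a placeholder, not the archive's real endpoint, and the helper name `get_record_url` is hypothetical. `oai_dc` is the protocol's standard prefix for simple Dublin Core; using `olac` as the prefix for the OLAC format is an assumption here.

```python
from urllib.parse import urlencode


def get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


if __name__ == "__main__":
    base = "https://example.org/oai"  # placeholder endpoint (assumption)
    record_id = "oai:www.ldc.upenn.edu:LDC2020T14"  # OaiIdentifier from this record
    print(get_record_url(base, record_id, "oai_dc"))  # simple DC format
    print(get_record_url(base, record_id, "olac"))  # OLAC format (assumed prefix)
```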

Search Info
Citation: Chen, Eric Y.; Lu, Zhiyun; Xu, Hao; Cao, Liangliang; Zhang, Yu; Fan, James. 2020. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Text iso639_eng olac_primary_text

http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2020T14
Up-to-date as of: Wed Oct 29 7:02:01 EDT 2025
