OLAC Record oai:www.ldc.upenn.edu:LDC2015T10 |
| Metadata | ||
| Title: | RST Signalling Corpus | |
| Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
| Bibliographic Citation: | Das, Debopam, Maite Taboada, and Paul McFetridge. RST Signalling Corpus LDC2015T10. Web Download. Philadelphia: Linguistic Data Consortium, 2015 | |
| Contributor: | Das, Debopam | |
| Taboada, Maite | ||
| McFetridge, Paul | ||
| Date (W3CDTF): | 2015 | |
| Date Issued (W3CDTF): | 2015-06-15 | |
| Description: | *Introduction* RST Signalling Corpus was developed at Simon Fraser University and contains annotations for signalling information added to RST Discourse Treebank (LDC2002T07). RST Discourse Treebank (RST-DT) is a collection of English news texts annotated for rhetorical relations under the RST (Rhetorical Structure Theory) framework. In RST Signalling Corpus, information about textual signals (such as although, because, thus) and about other signals such as tense, lexical chains or punctuation was added as an annotation layer to examine how rhetorical relations are signalled in discourse. *Data* The source data consists of 385 Wall Street Journal news articles from the Penn Treebank annotated for rhetorical relations in RST Discourse Treebank. As in RST-DT, the data in this release is divided into a training set (347 articles) and a test set (38 articles). The signalling annotation in this data set was performed using UAM CorpusTool version 2.8.12. Files are presented as UTF-8 encoded XML and plain text. The corpus is divided into three annotation sub-directories: training, test and full. All sub-directories include source, metadata, signalling annotation, and DTD files. *Samples* Please view the following samples: * Metadata Sample * Signal Sample * Text Sample *Updates* None at this time. | |
| Extent: | Corpus size: 38176 KB | |
| Identifier: | LDC2015T10 | |
| https://catalog.ldc.upenn.edu/LDC2015T10 | ||
| ISBN: 1-58563-719-X | ||
| ISLRN: 256-234-245-630-4 | ||
| DOI: 10.35111/5sm9-m096 | ||
| Language: | English | |
| Language (ISO639): | eng | |
| License: | LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf | |
| Medium: | Distribution: Web Download | |
| Publisher: | Linguistic Data Consortium | |
| Publisher (URI): | https://www.ldc.upenn.edu | |
| Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2015T10 | |
| Rights Holder: | Portions © 1987-1989 Dow Jones & Company, Inc., © 2015 Debopam Das, © 2015 Maite Taboada, © 1995, 1999, 2002, 2015 Trustees of the University of Pennsylvania | |
| Type (DCMI): | Text | |
| Type (OLAC): | primary_text | |
OLAC Info |
||
| Archive: | The LDC Corpus Catalog | |
| Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
| GetRecord: | OAI-PMH request for OLAC format | |
| GetRecord: | Pre-generated XML file | |
OAI Info |
||
| OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2015T10 | |
| DateStamp: | 2020-11-30 | |
| GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
| Citation: | Das, Debopam; Taboada, Maite; McFetridge, Paul. 2015. Linguistic Data Consortium. | |
| Terms: | area_Europe country_GB dcmi_Text iso639_eng olac_primary_text | |