OLAC Linguistic Data Type Vocabulary

Date issued: 2002-06-28
Status of document: Withdrawn Recommendation.
This version: http://www.language-archives.org/REC/type-20020628.html
Latest version: http://www.language-archives.org/REC/type.html
Previous version: http://www.language-archives.org/REC/type-20020612.html
Abstract:

This document specifies the controlled vocabulary of language resource types used by OLAC. The linguistic data type vocabulary describes the nature of the content of a resource from a linguistic standpoint.

Editors: Heidi Johnson (mailto:ailla@ailla.org)
Helen Aristar Dry (mailto:hdry@linguistlist.org)
Changes since previous version:

Adds: transcription/phonemic, transcription/kinesic, annotation/translation, annotation/phonological, annotation/semantic, annotation/eye-gaze, annotation/facial-expression, description/phonological, description/kinesic, description/pedagogical, description/comparative, dataset/kinesic.

Deletes: transcription/eye-gaze, transcription/facial-expression, transcription/translation, transcription/phonological, transcription/semantic, description/eye-gaze, description/facial-expression, dataset/eye-gaze, dataset/facial-expression. Genre Type section.

Copyright © Heidi Johnson (University of Texas at Austin), Helen Aristar Dry (Eastern Michigan University). This material may be distributed and repurposed subject to the terms and conditions set forth in the Creative Commons Attribution-ShareAlike 2.5 License.

Table of contents

  1. Introduction
  2. Linguistic Data Type
References

1. Introduction

Key points: two-level systems, multiple categories for a single resource, parallelism of the transcription and annotation subcategories.

2. Linguistic Data Type

Each term of the controlled vocabulary is described in one of the following subsections. The heading gives the encoded value for the term that is to be used as the value of the code attribute of the Type.linguistic metadata element [OLAC-MS]. Under the heading, the term is described in four ways. Name gives a descriptive label for the term. Definition is a one-line summary of what the term means. Comments offers more details on what the term represents. Examples may also be given to illustrate how the term is meant to be applied.

A further label, Subterms, appears when the term permits more specific refinements. In such cases, either the generic (top-level) term or one of its more specific refinements may be chosen.
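
As an illustration of the two-level scheme, the following Python sketch splits an encoded value into its top-level term and optional refinement and checks it against a vocabulary table. The table holds only a handful of codes copied from this document, and the names `VOCABULARY`, `parse_code`, and `is_valid` are hypothetical; none of this is part of the OLAC specification.

```python
from typing import Optional, Tuple

# Illustrative subset of the vocabulary: top-level term -> permitted subterms.
# A complete table would list every code defined in this document.
VOCABULARY = {
    "transcription": {"phonetic", "phonemic", "prosodic", "orthographic"},
    "annotation": {"phonetic", "morphological", "translation"},
    "lexicon": {"dictionary", "wordlist"},
}


def parse_code(code: str) -> Tuple[str, Optional[str]]:
    """Split an encoded value into (top-level term, subterm or None)."""
    top, sep, sub = code.partition("/")
    return top, (sub if sep else None)


def is_valid(code: str) -> bool:
    """Accept a generic top-level term or one of its listed refinements."""
    top, sub = parse_code(code)
    if top not in VOCABULARY:
        return False
    return sub is None or sub in VOCABULARY[top]


# A single resource may carry several types at once (hypothetical record).
resource_types = ["transcription/orthographic", "annotation/translation", "lexicon"]
print([is_valid(code) for code in resource_types])
```

Keeping the subterms in a per-term set means that a generic value such as `lexicon` and a refined value such as `lexicon/wordlist` pass through the same check.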

transcription

NameTranscription
DefinitionA transcription is a written representation of an audio or visual signal.
Comments

A resource can be identified as a transcription if it includes a type of transcription as part of the content; for example, the first line of an interlinear analysis is often some type of transcription.

Subterms

transcription/phonetic

NamePhonetic transcription
DefinitionA phonetic transcription represents the signal at the phonetic level.
Comments

Phonetic transcription may be narrow or broad, and will typically use the International Phonetic Alphabet [IPA] in a standard encoding (e.g. [Unicode-IPA], [SAMPA]). Phonological transcriptions are also classified here.

transcription/phonemic

NamePhonemic transcription
DefinitionA phonemic transcription represents the signal at the level of the phoneme.
Comments

Phonemic transcription may use the International Phonetic Alphabet [IPA] or a practical orthography.

transcription/prosodic

NameProsodic transcription
DefinitionThe resource includes prosodic transcription.
Comments

A prosodic transcription is a symbolic record of intonation, stress, tone or other suprasegmental features that is expressed independently of regular phonetic transcription.

transcription/orthographic

NameOrthographic transcription
DefinitionAn orthographic transcription uses a standard or conventional orthography.
Comments

Orthographic transcriptions differ from phonemic transcriptions that use a practical orthography in that they include orthographic conventions for punctuation, capitalization, etc.

transcription/gestural

NameGestural transcription
DefinitionThe resource includes gestural transcription.

transcription/kinesic

NameKinesic transcription
DefinitionA kinesic transcription represents eye, face, and body movements.
Comments

Kinesic transcriptions represent systematic uses of facial expressions, body language, and eye gaze that are used to communicate meaning.

transcription/musical

NameMusical transcription
DefinitionA musical transcription represents music.

annotation

NameAnnotation
DefinitionThe resource includes information which annotates some other linguistic record.
Comments

A linguistic annotation is defined as structured linguistic information that is explicitly aligned to some spatial and/or temporal extent of some other linguistic record.

Subterms

annotation/phonetic

NamePhonetic annotation
DefinitionThe resource includes phonetic annotation.
Comments

An example of a phonetic annotation is the TIMIT database, in which each element of phonetic transcription is associated with a range of samples in a digital audio file [TIMIT].
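
The alignment idea can be sketched as a small data structure. The Python below is a minimal illustration and does not reproduce the TIMIT file format: the class name `AlignedSegment`, its fields, the helper `label_at`, and the sample numbers are all invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AlignedSegment:
    """One annotation label tied to a half-open sample range [start, end)."""
    label: str
    start: int  # index of the first sample covered
    end: int    # index one past the last sample covered


def label_at(segments: List[AlignedSegment], sample: int) -> Optional[str]:
    """Return the label whose range contains the sample, or None."""
    for seg in segments:
        if seg.start <= sample < seg.end:
            return seg.label
    return None


# Hypothetical alignment of three phonetic symbols to an audio file.
segments = [
    AlignedSegment("p", 0, 800),
    AlignedSegment("ae", 800, 2400),
    AlignedSegment("t", 2400, 3100),
]
print(label_at(segments, 1000))
```

The explicit `start` and `end` fields are what distinguish an annotation from a plain transcription in this sketch: each symbol can be traced back to an extent of the annotated record.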

annotation/phonological

NamePhonological annotation
Definition

annotation/prosodic

NameProsodic annotation
DefinitionThe resource includes prosodic annotation.

annotation/gestural

NameGestural annotation
DefinitionThe resource includes gestural annotation.

annotation/kinesic

NameKinesic annotation
DefinitionThe resource includes aligned representations of eye, face, and body movements.

annotation/morphological

NameMorphological annotation
DefinitionThe resource includes morphological annotation.
Comments

A morphological annotation is a morphological transcription where the component morphemes are aligned with some other linguistic record, such as an orthographic transcription or a speech signal. An example of morphological annotation is interlinear text with aligned morphemic glosses.
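
A minimal Python sketch of interlinear text with aligned morphemic glosses follows. The English example, the gloss labels, and the helper name `interlinear` are chosen for illustration only; real interlinear text usually follows a project-specific glossing convention.

```python
from typing import List, Tuple

# Each word is a list of (morpheme, gloss) pairs, so every gloss stays
# aligned with exactly one morpheme of the transcription.
Word = List[Tuple[str, str]]

sentence: List[Word] = [
    [("the", "DEF")],
    [("cat", "cat"), ("s", "PL")],
    [("sleep", "sleep")],
]


def interlinear(words: List[Word]) -> Tuple[str, str]:
    """Render a morpheme line and a gloss line with column-aligned words."""
    top_cells, bottom_cells = [], []
    for word in words:
        morphs = "-".join(m for m, _ in word)
        glosses = "-".join(g for _, g in word)
        # Pad both cells to the wider of the two so the next word lines up.
        width = max(len(morphs), len(glosses))
        top_cells.append(morphs.ljust(width))
        bottom_cells.append(glosses.ljust(width))
    return " ".join(top_cells).rstrip(), " ".join(bottom_cells).rstrip()


morpheme_line, gloss_line = interlinear(sentence)
print(morpheme_line)
print(gloss_line)
```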

annotation/part-of-speech

NamePart-of-speech annotation
DefinitionThe resource includes aligned part-of-speech tags.

annotation/syntactic

NameSyntactic annotation
DefinitionThe resource includes aligned syntactic transcription.
Comments

A syntactic annotation might include supra-lexical features such as word order or auxiliary phrase constructions. It may thus be aligned with phrases or clauses rather than with smaller segments of the annotated record.

annotation/semantic

NameSemantic annotation
Definition

annotation/discourse

NameDiscourse annotation
DefinitionThe resource includes aligned discourse transcription.

annotation/translation

NameTranslation
DefinitionA translation is a version of the resource in another language.
Comments

Translations may align with different linguistic levels of the resource: morpheme-by-morpheme translation, word-by-word translation, sentence-level free translation, or discourse-level free translation.

annotation/musical

NameMusical annotation
DefinitionThe resource includes musical annotation.

dataset

NameDataset
DefinitionThe resource is a structured set of data items.
Comments

A dataset is a collection of items organized in a structured format for some specific research purpose. Examples of datasets are: a database of sentences illustrating deictic terms; an inflectional affix paradigm; a list of utterance tokens in a uniform context (e.g. "Say [pat] now.").

Subterms

dataset/phonetic

NamePhonetic dataset
DefinitionThe dataset is comprised of phonetic data.

dataset/phonological

NamePhonological dataset
DefinitionThe dataset is comprised of phonological data.

dataset/prosodic

NameProsodic dataset
DefinitionThe dataset is comprised of prosodic data.

dataset/orthographic

NameOrthographic dataset
DefinitionThe dataset is comprised of orthographic data.

dataset/gestural

NameGestural dataset
DefinitionThe dataset is comprised of gestural data.

dataset/kinesic

NameKinesic dataset
DefinitionThe dataset is comprised of kinesic data.

dataset/morphological

NameMorphological dataset
DefinitionThe dataset is comprised of morphological data.

dataset/part-of-speech

NamePart-of-speech dataset
DefinitionThe dataset is comprised of part-of-speech data.

dataset/syntactic

NameSyntactic dataset
DefinitionThe dataset is comprised of syntactic data.

dataset/semantic

NameSemantic dataset
DefinitionThe dataset is comprised of semantic data.

dataset/discourse

NameDiscourse dataset
DefinitionThe dataset is comprised of discourse data.

dataset/musical

NameMusical dataset
DefinitionThe dataset is comprised of musical data.

description

NameDescription
DefinitionThe resource is a linguistic description.
Comments

A description is any description or analysis of a language. Unlike a transcription or an annotation, the structure of a description is independent of the structure of the linguistic events that it describes.

Subterms

description/phonetic

NamePhonetic description
DefinitionThe resource includes description of phonetic characteristics.

description/phonological

NamePhonological description
DefinitionThe resource includes description of phonological characteristics.

description/prosodic

NameProsodic description
DefinitionThe resource includes description of prosodic characteristics.

description/orthographic

NameOrthographic description
DefinitionThe resource includes documentation of a writing system.

description/gestural

NameGestural description
DefinitionThe resource includes description of gestural characteristics.

description/kinesic

NameKinesic description
DefinitionThe resource includes description of kinesic characteristics.

description/morphological

NameMorphological description
DefinitionThe resource includes description of morphological characteristics.

description/part-of-speech

NamePart-of-speech description
DefinitionThe resource includes description of part-of-speech characteristics.

description/syntactic

NameSyntactic description
DefinitionThe resource includes description of syntactic characteristics.

description/semantic

NameSemantic description
DefinitionThe resource includes description of semantic characteristics.

description/discourse

NameDiscourse description
DefinitionThe resource includes description of discourse characteristics.

description/pedagogical

NamePedagogical description
DefinitionThe resource includes pedagogical description.
Comments

A pedagogical description is a style of presentation intended for use in teaching people to use the language.

description/comparative

NameComparative description
DefinitionThe resource includes comparative or typological description.

lexicon

NameLexicon
DefinitionThe resource includes a systematic listing of lexical items.
Subterms

lexicon/dictionary

NameDictionary
DefinitionThe resource includes a dictionary.
Comments

This includes any resource that lists words or morphemes and defines them. It contrasts with a word list in that the definitions are complex (rather than being one-word equivalents) and the entries may include other information like part of speech, related words, and illustrative sentences.

lexicon/wordlist

NameWord list
DefinitionThe resource includes a word list.
Comments

A word list is a list of reference words in a major language for which the nearest equivalent word in a target language has been elicited (for instance, the Swadesh 100-word list).
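
The contrast between a word-list entry and a dictionary entry can be sketched as two record types. The Python below is only an illustration: the class names, field names, and sample entries (English as the reference language, Spanish as the target language) are assumptions made for the example and are not defined by this vocabulary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WordListEntry:
    """A reference word paired with its one-word equivalent."""
    reference_word: str  # word in the major (elicitation) language
    target_word: str     # nearest equivalent elicited in the target language


@dataclass
class DictionaryEntry:
    """A headword with a fuller definition and optional extra information."""
    headword: str
    definition: str
    part_of_speech: str = ""
    related_words: List[str] = field(default_factory=list)
    examples: List[str] = field(default_factory=list)


# Two word-list rows: one-word equivalents only.
word_list = [WordListEntry("water", "agua"), WordListEntry("sun", "sol")]

# One dictionary entry: a complex definition plus further information.
entry = DictionaryEntry(
    headword="agua",
    definition="Colorless liquid that falls as rain and fills rivers and seas.",
    part_of_speech="noun",
    related_words=["aguacero"],
    examples=["El agua está fría."],
)
print(len(word_list), entry.part_of_speech)
```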

lexicon/wordnet

NameWordNet
DefinitionThe resource includes a semantic wordnet.
Comments

Whereas a dictionary documents the meanings of words by means of definitions, a wordnet documents meanings by building a web of semantic relationships [WordNet].

lexicon/thesaurus

NameThesaurus
DefinitionThe resource includes a thesaurus.
Comments

A thesaurus is a list of words or concepts arranged according to sense.

lexicon/terminology

NameTerminology
DefinitionThe resource includes a terminological lexicon.
Comments

A terminological lexicon is a glossary of domain-specific terms. Examples are technical terminology, kinship terms, color terms, acronyms, ...

lexicon/proper-names

NameName Dictionary
DefinitionThe resource includes only proper names as dictionary headwords.

lexicon/bilingual

NameBilingual Lexicon
DefinitionThe resource includes definitions in another language.

lexicon/etymological

NameEtymological Lexicon
DefinitionThe lexicon contains etymological information.

lexicon/phonetic

NamePhonetic Lexicon
DefinitionThe lexicon contains phonetic information, including pronunciation, phonology, stress, rhymes.

lexicon/frequency

NameFrequency Lexicon
DefinitionThe lexicon contains frequency information.

lexicon/analytical

NameAnalytical Lexicon
DefinitionThe lexicon contains analytical information.
Comments

Analytical information includes such things as morphological derivation, grammatically related forms, argument structure, ...

text

NameText
DefinitionThis is a primary resource: the object of study.
Comments

A text is defined as any primary resource or research material, such as a literary work, film, or recording of natural discourse.

Subterms

text/narrative

NameNarrative
DefinitionA monologic discourse which represents temporally organized events.
Comments

Types of narratives include historical, traditional, and personal narratives, myths, folktales, fables, and humorous stories.

text/oratory

NameOratory
Definition"The art of public speaking, or of speaking eloquently according to rules or conventions."
Comments

Examples of oratory include sermons, lectures, political speeches, and invocations.

text/dialogue

NameDialogue
DefinitionAn interactive discourse with two or more participants.
Comments

Examples of dialogues include conversations, interviews, correspondence, consultations, greetings and leave-takings.

text/singing

NameSinging
Definition"Words or sounds [articulated] in succession with musical inflections or modulations of the voice" [OED].
Comments

Examples of singing include chants, songs, and choruses.

text/drama

NameDrama
DefinitionA planned, creative rendition of discourse with two or more participants.

text/formulaic

NameFormulaic
DefinitionThe resource is a ritually or conventionally structured discourse.
Comments

Examples of formulaic discourse are prayers, curses, blessings, charms, curing rituals, marriage vows, and oaths.

text/procedural

NameProcedural
DefinitionAn explanation or description of a method, process, or situation having ordered steps.
Comments

Examples of procedural discourses include recipes, instructions, and plans.

text/report

NameReport
DefinitionA factual account of some event or circumstance.
Comments

Examples of reports include news reports, essays, and commentaries.

text/ludic

NameLudic
DefinitionLudic discourse is language whose primary function is to be part of play, or a style of speech that involves a creative manipulation of the structures of the language.
Comments

Examples of ludic discourse are play languages, jokes, secret languages, and speech disguises.

text/unintelligible speech

NameUnintelligible speech
DefinitionThe resource consists of utterances that are not intended to be interpretable as ordinary language.
Comments

Examples of unintelligible speech include sacred languages, speaking in tongues, and singing syllables (fa-la-la).


To do

Write the introduction. Explain that typical resources will contain multiple types.


References

[OED] Oxford English Dictionary.
<http://dictionary.oed.com/entrance.dtl>
[OLAC-MS] OLAC Metadata Set.
<http://www.language-archives.org/OLAC/olacms.html>
[SAMPA] Speech Assessment Methods Phonetic Alphabet.
<http://www.phon.ucl.ac.uk/home/sampa/home.htm>
[TIMIT] TIMIT Acoustic-Phonetic Continuous Speech Corpus.
<http://www.ldc.upenn.edu/Catalog/LDC93S1.html>
[Unicode-IPA] Unicode IPA Extensions.
<http://www.unicode.org/unicode/uni2book/ch07.pdf>
[WordNet] WordNet - a Lexical Database for English.
<http://www.cogsci.princeton.edu/~wn/>