OLAC Record oai:www.ldc.upenn.edu:LDC2008L01 |
Metadata | ||
Title: | An English Dictionary of the Tamil Verb | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Schiffman, Harold, and Vasu Renganathan. An English Dictionary of the Tamil Verb LDC2008L01. Web Download. Philadelphia: Linguistic Data Consortium, 2008 | |
Contributor: | Schiffman, Harold | |
Renganathan, Vasu | ||
Date (W3CDTF): | 2008 | |
Date Issued (W3CDTF): | 2008-04-22 | |
Description: | *Introduction* An English Dictionary of the Tamil Verb represents over twenty-five years of work led by Harold F. Schiffman, Professor Emeritus of Dravidian Linguistics and Culture at the University of Pennsylvania's Department of South Asia Studies. It contains translations for 6597 English verbs and defines 9716 Tamil verbs. This release presents the dictionary in two formats: Adobe PDF and XML. The PDF format displays the dictionary in a human-readable form and is suitable for printing. The XML version is a purely electronic form which, while readable by humans, is intended mainly for application development and the creation of searchable electronic databases. In the electronic XML version each entry contains the following: the English entry or head word; the Tamil equivalent (in Tamil script and transliteration); the verb class and transitivity specification; the spoken Tamil pronunciation (audio files in mp3 format); the English definition(s); additional Tamil entries (if applicable); example sentences or phrases in Literary Tamil, Spoken Tamil (with a corresponding audio file in .mp3 format) and an English translation; and Tamil synonyms or near-synonyms, where appropriate. Some foods referenced in the example sentences are illustrated in HTML files that include a detailed description of each dish. It is expected that the dictionary will be useful for Tamil learners, scholars and others interested in the Tamil language. *The Tamil Verb* Tamil is an official language of India, Singapore and Sri Lanka and has roughly 66 million native speakers worldwide. Most Tamil speakers live in the Tamil Nadu State of India and northeastern Sri Lanka, but the extended diaspora includes Malaysia, Mauritius and Singapore. Tamil is also a Classical Language of India. A member of the Dravidian language family, it boasts a rich literary tradition stretching back over 2200 years. 
Tamil is a diglossic language, meaning that it consists of at least two distinct forms. Spoken Tamil (ST) refers to the numerous vernacular dialects, and Literary Tamil (LT) refers to the form of the language used in print and most broadcast news media. The dialects of Spoken Tamil fall along regional divisions and along caste lines; there is no widely adopted standard for ST, although one seems to be emerging. Educated Tamil speakers, on the other hand, generally use LT with little variation in written communication. It appears, however, that a common dialect of ST may be emerging as a result of growing broadcast media and increased rates of higher education. That dialect resembles the upper caste (non-Brahman) dialects spoken in the urban centers of Tamil Nadu and borrows verbs from LT. The spoken examples in An English Dictionary of the Tamil Verb reflect this emerging common dialect. Tamil is also an agglutinative language, meaning that it constructs verbs by appending inflections in the form of suffixes onto basic verb-stem morphemes. These inflections primarily denote tense, aspect, voice and mood. As far as voice is concerned, however, though LT may mark a verb as passive on occasion, ST rarely makes this distinction. These suffixes mark whether a verb is transitive or intransitive, that is, they indicate whether the subject is acted on by the verb or whether the subject is the actor. Mood is implied by verb tense, but may also be provided by verbal auxiliaries expressing various degrees of probability, futurity, ability, and their negatives. Tamil also may add suffixes that mark aspectual distinctions, such as whether an action is considered to be perfective ('complete' and/or 'definite') or whether it is ongoing or imperfective ('continuous' or 'durative'), as well as other distinctions. Aspect is a category that is undergoing increasing grammaticalization and is therefore more usual in ST than in LT. 
As is common with this process, aspectual distinctions are 'speaker-centered', i.e. they provide personal observations (some analysts have referred to this as 'attitude' or 'point of view', which is of course what the word 'aspect' originally means) which describe the speaker's frame of mind concerning the event depicted in the sentence -- whether it is perceived to be beneficial or detrimental, positive or negative, voluntary or involuntary, etc. Aspectual distinctions vary widely among dialects, both because of the variability of the grammaticalization process and for historical reasons, and Tamil speakers can code-switch among different dialects depending on context and audience. In addition, ST and LT treat the grammar of verbs differently. Finding exact equivalents between English and Tamil verbs is very difficult as a result of Tamil's diglossic nature and because of the difficulty of mapping English aspectual distinctions onto Tamil aspectual categories. An English Dictionary of the Tamil Verb seeks to meet needs not currently addressed by existing English-Tamil dictionaries. The main goal of this dictionary is to get an English-knowing user to a Tamil verb, irrespective of whether he or she begins with an English verb or some other item, such as an adjective; this is because what may be a verb in Tamil may in fact not be a verb in English, and vice versa. Since the number of English entries is limited (slightly fewer than 10,000), there may not be main entries for certain low-frequency items like 'pounce', but this item does appear as a synonym for 'jump, leap', and some other verbs, so searching for 'pounce' will get the user to a Tamil verb via the synonym field. The main goal is therefore to concentrate specifically on supplying the kinds of information lacking in all previous attempts to capture the equivalencies between English and Tamil. *Data* The text in the XML version of the dictionary is UTF-8. A DTD and a W3C Schema have been provided for validation. 
In addition, an example XSLT style sheet has been provided to assist the novice in XML transformations. *Samples* For an example of the material in this publication, please examine this screen capture of a typical dictionary entry and its accompanying audio file (mp3). *Updates* A new browsing utility developed by Vasu Renganathan, University of Pennsylvania, is now available for download. It allows users to interactively search the XML version of the dictionary, and it presents the entries in a user-friendly format. This tool works only on Internet Explorer with ActiveX Objects activated. To install, simply unpack the package and place the resulting directory labeled "tools" in the top level of the dictionary installation. New downloads after 5/5/2017 will have this browsing utility already as part of the corpus. | |
Extent: | Corpus size: 871424 KB | |
Identifier: | LDC2008L01 | |
https://catalog.ldc.upenn.edu/LDC2008L01 | ||
ISBN: 1-58563-464-6 | ||
ISLRN: 054-578-209-297-7 | ||
DOI: 10.35111/1m5x-cw94 | ||
Language: | Tamil | |
English | ||
Language (ISO639): | tam | |
eng | ||
License: | LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2008L01 | |
Rights Holder: | Portions © 1978-2008 Trustees of the University of Pennsylvania | |
Subject: | Tamil language | |
Subject (ISO639): | tam | |
Type (DCMI): | Sound | |
Text | ||
Type (OLAC): | lexicon | |
OLAC Info |
||
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2008L01 | |
DateStamp: | 2020-11-30 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Schiffman, Harold; Renganathan, Vasu. 2008. Linguistic Data Consortium. | |
Terms: | area_Asia area_Europe country_GB country_IN dcmi_Sound dcmi_Text iso639_eng iso639_tam olac_lexicon | |
Inferred Metadata | ||
Country: | India | |
Area: | Asia |
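
The GetRecord entries above refer to OAI-PMH requests for this record in OLAC and simple DC formats. As a minimal sketch of how such a request URL could be assembled: the parameter names `verb`, `identifier` and `metadataPrefix` come from the OAI-PMH protocol, and the identifier is the OaiIdentifier listed in this record. The base URL is a placeholder assumption, since the record does not state the repository endpoint, and the function name `get_record_url` is purely illustrative.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: placeholder endpoint; this record does not give the real base URL.
BASE_URL = "https://example.org/oai"

# OaiIdentifier as listed in this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2008L01"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# "olac" for the OLAC format, "oai_dc" for simple Dublin Core.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

# Decode the query string again to confirm the parameters round-trip.
olac_params = parse_qs(urlsplit(olac_url).query)
print(olac_url)
print(dc_url)
print(olac_params)
```

Using `urlencode` rather than manual string concatenation percent-encodes the colons in the identifier, so the URL stays valid whatever characters the identifier contains.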