OLAC Record: 2007 CoNLL Shared Task - Basque, Catalan, Czech & Turkish

OLAC Record
oai:catalogue.elra.info:ELRA-W0121

Metadata

Title: 2007 CoNLL Shared Task - Basque, Catalan, Czech & Turkish

Access Rights: Rights available for: nonCommercialUse

Date Available (W3CDTF): 2017-12-21

Date Issued (W3CDTF): 2017-12-21

Date Modified (W3CDTF): 2017-12-21

Description: 2007 CoNLL Shared Task - Basque, Catalan, Czech & Turkish consists of dependency treebanks in four languages used as part of the CoNLL 2007 shared task on multilingual dependency parsing and domain adaptation. The languages covered in this release are Basque, Catalan, Czech and Turkish.
The Conference on Computational Natural Language Learning (CoNLL) is accompanied every year by a shared task intended to promote natural language processing applications and evaluate them in a standard setting. In 2006 and 2007, the shared task was devoted to the parsing of syntactic dependencies using corpora from up to thirteen languages. The task aimed to define and extend the then-current state of the art in dependency parsing, a technology that complemented previous tasks by producing a different kind of syntactic description of input text. The 2007 shared task added a domain adaptation track for English in addition to the multilingual track. More information about CoNLL and the 2007 shared task is available, respectively, at: http://www.signll.org/conll/ and http://www.conll.org/previous-tasks.
The source data in the treebanks in this release consists principally of various texts (e.g., textbooks, news, literature) annotated in dependency format. In general, dependency grammar is based on the idea that the verb is the center of the clause structure and that other units in the sentence are connected to the verb by directed links, or dependencies. This is a one-to-one correspondence: for every element in the sentence there is one node in the sentence structure that corresponds to that element. In constituency or phrase structure grammars, on the other hand, clauses are divided into noun phrases and verb phrases, and in each sentence one or more nodes may correspond to one element.
All of the data sets in this release are dependency treebanks. The individual data sets are: the 3LB Treebank (Basque), the CESS-Cat Dependency Treebank (Catalan), the Prague Dependency Treebank 2.0 (Czech), and the METU-Sabanci Turkish Treebank (Turkish). This corpus is distributed jointly with LDC. The LDC Catalogue reference is: https://catalog.ldc.upenn.edu/LDC2017T19.

Identifier: ELRA-W0121

ISLRN: 769-620-932-723-2

Identifier (URI): https://catalog.elra.info/en-us/repository/browse/ELRA-W0121/

Language: Turkish

Language: Czech

Language: Basque

Language: Catalan; Valencian

Language (ISO639): tur

Language (ISO639): ces

Language (ISO639): eus

Language (ISO639): cat

Medium: downloadable

Publisher: ELRA (European Language Resources Association)

Type (DCMI): Text

Type (OLAC): primary_text

OLAC Info

Archive: ELRA Catalogue of Language Resources

Description: http://www.language-archives.org/archive/catalogue.elra.info

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:catalogue.elra.info:ELRA-W0121

DateStamp: 2017-12-21

GetRecord: OAI-PMH request for simple DC format

Search Info
Citation: n.a. 2017. ELRA (European Language Resources Association).
Terms: area_Asia area_Europe country_CZ country_ES country_TR dcmi_Text iso639_cat iso639_ces iso639_eus iso639_tur olac_primary_text

http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-W0121
Up-to-date as of: Wed Oct 1 0:57:10 EDT 2025
