OLAC Record oai:catalogue.elra.info:ELRA-W0030 |
Metadata | ||
Title: | Al-Hayat Arabic Corpus | |
Access Rights: | Rights available for: nonCommercialUse, commercialUse | |
Date Available (W3CDTF): | 2002-01-15 | |
Date Issued (W3CDTF): | 2002-01-15 | |
Date Modified (W3CDTF): | 2008-11-19 | |
Description: | The corpus was developed in the course of a research project at the University of Essex, in collaboration with the Open University. The corpus contains Al-Hayat newspaper articles with added value for developing Language Engineering and Information Retrieval applications. The data have been distributed into 7 subject-specific databases, following the Al-Hayat subject tags: General, Car, Computer, News, Economics, Science, and Sport. Mark-up, numbers, special characters and punctuation have been removed. The total file size is 268 MB. The dataset contains 18,639,264 distinct tokens in 42,591 articles, organised in 7 domains. | |
Identifier: | ELRA-W0030 | |
ISLRN: | 365-777-769-398-7 | |
Identifier (URI): | https://catalog.elra.info/en-us/repository/browse/ELRA-W0030/ | |
Language: | Arabic | |
Language (ISO639): | ara | |
Medium: | Not specified | |
Publisher: | ELRA (European Language Resources Association) | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info |
||
Archive: | ELRA Catalogue of Language Resources | |
Description: | http://www.language-archives.org/archive/catalogue.elra.info | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:catalogue.elra.info:ELRA-W0030 | |
DateStamp: | 2002-01-15 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | n.a. 2002. ELRA (European Language Resources Association). | |
Terms: | dcmi_Text iso639_ara olac_primary_text |
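The GetRecord rows above refer to OAI-PMH requests for this record in OLAC and simple DC formats. As a minimal sketch, such a request URL could be assembled as follows. The base URL is a hypothetical placeholder (the record does not state the endpoint), and the `olac` metadata prefix is an assumption; `oai_dc` is the prefix that OAI-PMH defines for simple Dublin Core.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record above does not give the OAI-PMH base URL,
# so replace this with the one the archive actually advertises.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier row of the record.
OAI_IDENTIFIER = "oai:catalogue.elra.info:ELRA-W0030"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# "olac" is assumed to be the prefix for the OLAC format.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
# "oai_dc" is the simple Dublin Core prefix required by OAI-PMH.
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

print(olac_url)
print(dc_url)
```

The function only builds the URL; fetching and parsing the XML response is left out, since that depends on the real endpoint.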