OLAC Record
oai:catalogue.elra.info:ELRA-W0030

Metadata
Title: Al-Hayat Arabic Corpus
Access Rights: Rights available for: nonCommercialUse, commercialUse
Date Available (W3CDTF): 2002-01-15
Date Issued (W3CDTF): 2002-01-15
Date Modified (W3CDTF): 2008-11-19
Description: The corpus was developed in the course of a research project at the University of Essex, in collaboration with the Open University. The corpus contains Al-Hayat newspaper articles with value added for Language Engineering and Information Retrieval applications development purposes. The data have been distributed into 7 subject-specific databases, thus following the Al-Hayat subject tags: General, Car, Computer, News, Economics, Science, and Sport. Mark-up, numbers, special characters and punctuation have been removed. The size of the total file is 268 MB. The dataset contains 18,639,264 distinct tokens in 42,591 articles, organised in 7 domains.
Identifier: ELRA-W0030
ISLRN: 365-777-769-398-7
Identifier (URI): https://catalog.elra.info/en-us/repository/browse/ELRA-W0030/
Language: Arabic
Language (ISO639): ara
Medium: Not specified
Publisher: ELRA (European Language Resources Association)
Type (DCMI): Text
Type (OLAC): primary_text

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-W0030
DateStamp:  2002-01-15
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: n.a. 2002. ELRA (European Language Resources Association).
Terms: dcmi_Text iso639_ara olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-W0030
Up-to-date as of: Fri Apr 19 6:28:18 EDT 2024