OLAC Record oai:catalogue.elra.info:ELRA-W0325 |
Metadata | ||
Title: | Wojood - A corpus for nested Arabic Named Entity Recognition | |
Access Rights: | Rights available for: nonCommercialUse, commercialUse | |
Date Available (W3CDTF): | 2022-09-27 | |
Date Issued (W3CDTF): | 2022-09-27 | |
Description: | Wojood consists of about 550,000 tokens (Modern Standard Arabic and dialect) that are manually annotated with 21 entity types (person, group of people, occupation, organization, geopolitical entity, location, facility, event, date, time, language, website, law, product, cardinal number, ordinal number, percent, quantity, unit, money, currency). It covers multiple domains (Media, History, Culture, Health, Finance, ICT, Law, Elections, Politics, Migration, Terrorism, social media) and was annotated with nested entities. The corpus contains about 75K entities, 22.5% of which are nested. The corpus was annotated using the IOB2 tagging scheme and is available in CSV format. | |
Identifier: | ELRA-W0325 | |
ISLRN: 688-718-284-176-0 | ||
Identifier (URI): | https://catalog.elra.info/en-us/repository/browse/ELRA-W0325/ | |
Language: | Arabic | |
Language (ISO639): | ara | |
Medium: | Not specified | |
Publisher: | ELRA (European Language Resources Association) | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info |
||
Archive: | ELRA Catalogue of Language Resources | |
Description: | http://www.language-archives.org/archive/catalogue.elra.info | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info |
||
OaiIdentifier: | oai:catalogue.elra.info:ELRA-W0325 | |
DateStamp: | 2022-09-27 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | n.a. 2022. ELRA (European Language Resources Association). | |
Terms: | dcmi_Text iso639_ara olac_primary_text |
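
The description above says the corpus uses the IOB2 tagging scheme with nested entities. As a minimal sketch of what that implies for a consumer of the data, the Python function below decodes one layer of IOB2 tags into entity spans. This record does not document the CSV column layout or the exact tag names, so the labels `ORG` and `GPE` and the idea of one tag sequence per nesting level are assumptions made only for illustration.

```python
from typing import List, Tuple


def iob2_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Decode one layer of IOB2 tags into (entity_type, start, end) spans.

    `end` is exclusive. In IOB2 every entity starts with "B-<type>",
    continues with "I-<type>", and tokens outside any entity are "O".
    """
    spans: List[Tuple[str, int, int]] = []
    start = None
    label = None
    for i, tag in enumerate(tags):
        # The open span continues only on an "I-" tag of the same type.
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.append((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        # An "I-" tag with no matching open span is ignored in this sketch.
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans


# Hypothetical five-token example with two annotation layers:
# the outer layer marks an organization, the inner layer marks a
# geopolitical entity nested inside it.
outer_layer = ["B-ORG", "I-ORG", "I-ORG", "O", "O"]
inner_layer = ["O", "O", "B-GPE", "O", "O"]

outer_spans = iob2_to_spans(outer_layer)  # [("ORG", 0, 3)]
inner_spans = iob2_to_spans(inner_layer)  # [("GPE", 2, 3)]
```

Decoding each layer separately keeps the function simple; a nested entity is then any inner span whose token range falls inside an outer span.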