OLAC Record
oai:lindat.mff.cuni.cz:11234/1-2591

Metadata
Title:Somali Web Corpus
Bibliographic Citation:http://hdl.handle.net/11234/1-2591
Creator:Suchomel, Vít; Rychlý, Pavel
Date (W3CDTF):2018-01-11T15:32:25Z
Date Available:2018-01-11T15:32:25Z
Description:Somali web corpus. Crawled by SpiderLing in January 2016. Encoded in UTF-8, cleaned, deduplicated.
Identifier (URI):http://hdl.handle.net/11234/1-2591
Language:Somali
Language (ISO639):som
Publisher:Masaryk University, NLP Centre
Rights:NLP Centre Web Corpus License; https://lindat.mff.cuni.cz/repository/xmlui/page/license-NLPC-WeC
Subject:text corpora; Ethiopian languages; web corpora; under-resourced languages; Somali
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-2591
DateStamp:  2021-06-29

Search Info

Citation: Suchomel, Vít; Rychlý, Pavel. 2018. Somali Web Corpus. Masaryk University, NLP Centre.
Terms: area_Africa country_SO dcmi_Text iso639_som olac_primary_text


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-2591
Up-to-date as of: Thu Oct 5 0:40:50 EDT 2023