OLAC Record oai:www.ldc.upenn.edu:LDC2016T21 |
Metadata | ||
Title: | KAFD: Arabic Font Database | |
Access Rights: | Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining | |
Bibliographic Citation: | Luqman, Hamzah, Sabri Mahmoud, and Sameh Awaida. KAFD: Arabic Font Database LDC2016T21. Web Download. Philadelphia: Linguistic Data Consortium, 2016 | |
Contributor: | Luqman, Hamzah | |
Mahmoud, Sabri A. | ||
Awaida, Sameh | ||
Date (W3CDTF): | 2016 | |
Date Issued (W3CDTF): | 2016-10-19 | |
Description: | *Introduction* KAFD: Arabic Font Database was developed by King Fahd University of Petroleum & Minerals and Qassim University. It comprises approximately 2.5 million scanned pages of printed Arabic text in a variety of fonts, sizes and resolutions, along with corresponding transcripts. KAFD was designed for research in Arabic text recognition. *Data* The scanned Arabic texts were collected from publications covering subjects such as religion, medicine, science and history. Texts were printed in 40 fonts, 10 sizes and four styles. Scans were made at 100, 200, 300 and 600 dpi (dots per inch). The database is available in two forms: page level and line level. Images are in TIFF format and transcripts are in plain text. Individual font folders are compressed into RAR archives. The data is divided into training, validation and test sets. *Updates* None at this time. | |
Extent: | Corpus size: 276843952 KB | |
Identifier: | LDC2016T21 | |
https://catalog.ldc.upenn.edu/LDC2016T21 | ||
ISBN: 1-58563-773-4 | ||
ISLRN: 859-947-665-680-4 | ||
DOI: 10.35111/hmnj-ks39 | ||
Language: | Arabic | |
Language (ISO639): | ara | |
License: | LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf | |
Medium: | Distribution: Web Download | |
Publisher: | Linguistic Data Consortium | |
Publisher (URI): | https://www.ldc.upenn.edu | |
Relation (URI): | https://catalog.ldc.upenn.edu/docs/LDC2016T21 | |
Rights Holder: | Portions © 2016 King Fahd University of Petroleum & Minerals, Trustees of the University of Pennsylvania | |
Type (DCMI): | StillImage | |
Text | ||
Type (OLAC): | primary_text | |
OLAC Info | ||
Archive: | The LDC Corpus Catalog | |
Description: | http://www.language-archives.org/archive/www.ldc.upenn.edu | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info | ||
OaiIdentifier: | oai:www.ldc.upenn.edu:LDC2016T21 | |
DateStamp: | 2021-07-19 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Luqman, Hamzah; Mahmoud, Sabri A.; Awaida, Sameh. 2016. KAFD: Arabic Font Database. Linguistic Data Consortium. | |
Terms: | dcmi_StillImage dcmi_Text iso639_ara olac_primary_text |