OLAC Record oai:lindat.mff.cuni.cz:11234/1-4935 |
Metadata | ||
Title: | A Human-Annotated Dataset of Scanned Images and OCR Texts from Medieval Documents: Supplementary Materials | |
Bibliographic Citation: | http://hdl.handle.net/11234/1-4935 | |
Creator: | Novotný, Vít | |
Horák, Aleš | ||
Date (W3CDTF): | 2022-12-05T11:18:51Z | |
Date Available: | 2022-12-05T11:18:51Z | |
Description: | These are supplementary materials for an open dataset of scanned images and OCR texts from 19th- and 20th-century letterpress reprints of documents from the Hussite era. The dataset contains human annotations for layout analysis, OCR evaluation, and language identification, and is available at http://hdl.handle.net/11234/1-4615. These supplementary materials contain OCR texts produced by different OCR engines for the book pages for which we have both high-resolution scanned images and annotations for OCR evaluation. | |
Identifier (URI): | http://hdl.handle.net/11234/1-4935 | |
Language: | Czech | |
English | ||
German | ||
Latin | ||
Language (ISO639): | ces | |
eng | ||
deu | ||
lat | ||
Publisher: | Masaryk University, Brno | |
Rights: | Public Domain Dedication (CC Zero) | |
http://creativecommons.org/publicdomain/zero/1.0/ | ||
Subject: | ocr | |
optical character recognition | ||
language identification | ||
image super-resolution | ||
sr | ||
Medieval | ||
Type: | corpus | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info | ||
Archive: | LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University | |
Description: | http://www.language-archives.org/archive/lindat.mff.cuni.cz | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info | ||
OaiIdentifier: | oai:lindat.mff.cuni.cz:11234/1-4935 | |
DateStamp: | 2022-12-07 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Novotný, Vít; Horák, Aleš. 2022. Masaryk University, Brno. | |
Terms: | area_Europe country_CZ country_DE country_GB country_VA dcmi_Text iso639_ces iso639_deu iso639_eng iso639_lat olac_primary_text |
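
The GetRecord entries in this record refer to OAI-PMH requests for the OLAC and simple DC formats. As a minimal sketch of how such request URLs could be assembled, the snippet below uses the standard OAI-PMH query parameters (verb, identifier, metadataPrefix) together with the OaiIdentifier listed above. The base URL is a hypothetical placeholder, since the record does not state the archive's OAI-PMH endpoint, and the helper name build_getrecord_url is likewise introduced only for illustration.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the record does not give the archive's real
# OAI-PMH endpoint, so substitute the actual base URL before use.
BASE_URL = "https://example.org/oai/request"

# OaiIdentifier taken from this record.
OAI_ID = "oai:lindat.mff.cuni.cz:11234/1-4935"


def build_getrecord_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL (illustrative helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# "olac" is the metadata prefix conventionally used for the OLAC format;
# "oai_dc" is the prefix for simple (unqualified) Dublin Core.
olac_url = build_getrecord_url(BASE_URL, OAI_ID, "olac")
dc_url = build_getrecord_url(BASE_URL, OAI_ID, "oai_dc")

print(olac_url)
print(dc_url)
```

The two URLs differ only in the metadataPrefix parameter, which selects the format of the returned record.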