OLAC Record oai:lindat.mff.cuni.cz:11234/1-4615 |
Metadata | ||
Title: | A Human-Annotated Dataset of Scanned Images and OCR Texts from Medieval Documents | |
Bibliographic Citation: | http://hdl.handle.net/11234/1-4615 | |
Creator: | Novotný, Vít; Seidlová, Kristýna; Vrabcová, Tereza; Horák, Aleš | |
Date (W3CDTF): | 2021-12-10T12:28:54Z | |
Date Available: | 2021-12-10T12:28:54Z | |
Description: | This is an open dataset of scanned images and OCR texts from 19th- and 20th-century letterpress reprints of documents from the Hussite era. The dataset contains human annotations for layout analysis, OCR evaluation, and language identification. | |
Identifier (URI): | http://hdl.handle.net/11234/1-4615 | |
Language: | German; Czech; Latin; English | |
Language (ISO639): | deu; ces; lat; eng | |
Publisher: | Masaryk University, Brno | |
Rights: | Public Domain Dedication (CC Zero), http://creativecommons.org/publicdomain/zero/1.0/ | |
Subject: | ocr; optical character recognition; language identification; image super-resolution; sr; Medieval | |
Type: | corpus | |
Type (DCMI): | Text | |
Type (OLAC): | primary_text | |
OLAC Info | ||
Archive: | LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University | |
Description: | http://www.language-archives.org/archive/lindat.mff.cuni.cz | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info | ||
OaiIdentifier: | oai:lindat.mff.cuni.cz:11234/1-4615 | |
DateStamp: | 2021-12-10 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | Novotný, Vít; Seidlová, Kristýna; Vrabcová, Tereza; Horák, Aleš. 2021. Masaryk University, Brno. | |
Terms: | area_Europe country_CZ country_DE country_GB country_VA dcmi_Text iso639_ces iso639_deu iso639_eng iso639_lat olac_primary_text |
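
The GetRecord entries above refer to OAI-PMH requests for this record in the OLAC and simple DC formats. As a minimal sketch, such a request URL could be assembled as follows. The base URL is a placeholder assumption (the record does not state the repository endpoint), the helper name `get_record_url` is hypothetical, and the metadata prefixes `olac` and `oai_dc` are assumed to be the ones the repository exposes.

```python
from urllib.parse import urlencode

# Assumption: placeholder endpoint, NOT taken from the record.
# Replace it with the repository's actual OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# Identifier as listed under "OaiIdentifier" in the record.
OAI_IDENTIFIER = "oai:lindat.mff.cuni.cz:11234/1-4615"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Assumed prefixes: "olac" for the OLAC format, "oai_dc" for simple DC.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")
print(olac_url)
print(dc_url)
```

The identifier is percent-encoded by `urlencode`, so the colons and the slash in it do not interfere with the query string.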