OLAC Record
oai:lindat.mff.cuni.cz:11372/LRT-5196

Metadata
Title:OpenLegalData (2022 - Corpus)
Bibliographic Citation:http://hdl.handle.net/11372/LRT-5196
Creator:Rüdiger, Jan Oliver
Date (W3CDTF):2023-08-01T09:25:22Z
Date Available:2023-08-01T09:25:22Z
Description:OpenLegalData is a free and open platform that makes legal documents and information available to the public. The platform aims to improve the transparency of jurisprudence through open data and to help people without legal training understand the justice system. The project is committed to the Open Data principles and the Free Access to Justice Movement. This corpus was created from the OpenLegalData dump of 2022-10-18. The data was cleaned, automatically annotated with TreeTagger (POS and lemma), and grouped by metadata into files named jurisdiction_BundeslandID_sub-corpus, where the sub-corpus number is present only if applicable. For example, Verwaltungsgerichtsbarkeit_11_05.cec6.gz means jurisdiction = Verwaltungsgerichtsbarkeit (administrative jurisdiction), BundeslandID = 11, sub-corpus = 05. Sub-corpora are produced by random splitting into parts of 50 MB each. Corpus data is provided in CEC6 format, which can be converted into many other corpus formats, if necessary with the CorpusExplorer software (www.CorpusExplorer.de).
Identifier (URI):http://hdl.handle.net/11372/LRT-5196
Language:German
Language (ISO639):deu
Publisher:Rüdiger, Jan Oliver
Rights:Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
http://creativecommons.org/licenses/by-nc-sa/4.0/
Subject:corpus; legal texts; legal domain; annotated corpus; NLP; CorpusExplorer
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text
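
The file naming scheme given in the Description can be split into its parts programmatically. The following is a minimal sketch only: the pattern is inferred from the single example file name in the Description, and the names CorpusFileName and parse_corpus_file_name are hypothetical and not part of the corpus or of CorpusExplorer. Real file names in the archive may deviate from this pattern.

```python
import re
from typing import NamedTuple, Optional


class CorpusFileName(NamedTuple):
    """Parts of a corpus file name (hypothetical helper type)."""
    jurisdiction: str
    bundesland_id: str
    sub_corpus: Optional[str]  # None when the file has no sub-corpus number


# Assumed pattern: <jurisdiction>_<BundeslandID>[_<sub-corpus>].cec6.gz
# Numbers are kept as strings so that zero padding such as "05" is preserved.
_FILE_NAME = re.compile(
    r"^(?P<jurisdiction>.+?)_(?P<bundesland>\d+)(?:_(?P<sub>\d+))?\.cec6\.gz$"
)


def parse_corpus_file_name(name: str) -> CorpusFileName:
    """Split a file name into jurisdiction, BundeslandID and sub-corpus."""
    match = _FILE_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return CorpusFileName(
        jurisdiction=match.group("jurisdiction"),
        bundesland_id=match.group("bundesland"),
        sub_corpus=match.group("sub"),
    )


if __name__ == "__main__":
    # Example file name taken from the Description field.
    print(parse_corpus_file_name("Verwaltungsgerichtsbarkeit_11_05.cec6.gz"))
```

Because the sub-corpus number is optional in the pattern, a name without it yields sub_corpus = None instead of raising an error.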

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11372/LRT-5196
DateStamp:  2023-08-01
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: Rüdiger, Jan Oliver. 2023. OpenLegalData (2022 - Corpus). Rüdiger, Jan Oliver.
Terms: area_Europe country_DE dcmi_Text iso639_deu olac_primary_text
http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11372/LRT-5196
Up-to-date as of: Thu Oct 5 0:43:36 EDT 2023