OLAC Record
oai:lindat.mff.cuni.cz:11234/1-3293

Metadata
Title:Posts of German PC Games Online Forum
Bibliographic Citation:http://hdl.handle.net/11234/1-3293
Creator:Kissling, Jürg
Date (W3CDTF):2020-09-29T15:25:47Z
Date Available:2020-09-29T15:25:47Z
Description:Contains linguistically annotated data from the online forum PC Games (https://forum.pcgames.de), a forum about gaming. All posts (approx. 2.4 million) were scraped in April 2019 (for details see Kissling 2019), yielding 120 million tokens from almost 70,000 authors. The data is stored in an SQL database and can be accessed using e.g. pg_restore. The database itself and its tables contain detailed self-descriptions. The database holds tokenized, part-of-speech-tagged and partly lemmatized information for every token in the forum, together with its metadata (usernames and their location in the forum structure, e.g. which post(s), thread and subforum it belongs to). The order of the words in a post cannot be reconstructed from this corpus. Usernames were replaced with author_ids to protect the personal rights of the post authors. Additional information: As this corpus was analyzed in terms of productivity and language contact between German and English (Kissling 2020), it includes additional information about German base forms found in present-day English, mainly focusing on the formula "German_verb_stem + -en = English verb infinitive". For this purpose, the API of the Oxford Dictionary of English was used; the results of these API requests are stored in the table infinitives. The corpus can also be used without this information. Calculations were performed at the sciCORE (http://scicore.unibas.ch/) scientific computing core facility at the University of Basel on 2019-09-10. This database contains the entire primary corpus of Kissling (2020). Sources: Kissling, J. (2019). Computerunterstütztes Verfahren zur Erhebung eigener Textkorpus-Daten. Methodenentwicklung und Anwendung auf 2.4 Mio. Posts des Forums PC Games.de [certification thesis]. Universität Basel. Kissling, J. (2020). Produktivität englischer Verben im Deutschen [master's thesis]. Universität Basel.
The scraper used is available on GitHub: https://github.com/vizzerdrix55/web-scraping-vBulletin-forum
Identifier (URI):http://hdl.handle.net/11234/1-3293
Language:German
Language (ISO639):deu
Publisher:University of Basel, Switzerland
Rights:Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
http://creativecommons.org/licenses/by-nc-sa/4.0/
Subject:forum
gaming
productivity
language contact
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text
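
Usage sketch: the description above says the database can be accessed using e.g. pg_restore. The following is a minimal, hedged illustration of how such a call might be assembled in Python. The dump file name (pcgames_forum.dump), the target database name (pcgames) and the helper build_restore_command are hypothetical and do not come from the record; the sketch also assumes the download is an archive that pg_restore can read. Only the standard pg_restore options --list, --no-owner and --dbname are used.

```python
import shlex


def build_restore_command(dump_path, dbname, list_only=False):
    """Return a pg_restore argument list for the given dump file.

    dump_path and dbname are placeholders chosen for illustration;
    the record does not state the actual file or database names.
    """
    if list_only:
        # --list prints the archive's table of contents and needs no database
        return ["pg_restore", "--list", dump_path]
    # --no-owner skips restoring the original object ownership,
    # --dbname names the (already existing) database to restore into
    return ["pg_restore", "--no-owner", "--dbname", dbname, dump_path]


if __name__ == "__main__":
    # Hypothetical names; print the command instead of running it
    cmd = build_restore_command("pcgames_forum.dump", "pcgames")
    print(shlex.join(cmd))
```

The printed command can be inspected and then run in a shell, or passed to subprocess.run. The target database must already exist (for example, created with createdb) before restoring into it. Calling the helper with list_only=True gives a command that only lists the archive contents, which is a low-risk way to look at the tables before restoring anything.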

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-3293
DateStamp:  2021-06-29
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: Kissling, Jürg. 2020. University of Basel, Switzerland.
Terms: area_Europe country_DE dcmi_Text iso639_deu olac_primary_text


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-3293
Up-to-date as of: Thu Oct 5 0:41:07 EDT 2023