OLAC Record
oai:lindat.mff.cuni.cz:11234/1-5139

Metadata
Title:LongEval Test Collection
Bibliographic Citation:http://hdl.handle.net/11234/1-5139
Creator:Galuščáková, Petra
Devaud, Romain
Gonzalez-Saez, Gabriela
Mulhem, Philippe
Goeuriot, Lorraine
Piroi, Florina
Popel, Martin
Date (W3CDTF):2023-04-28T08:50:24Z
Date Available:2023-04-28T08:50:24Z
Description:The collection consists of queries and documents provided by the Qwant search engine (https://www.qwant.com). The queries, which were issued by Qwant users, are based on selected trending topics. The documents in the collection are the webpages that were selected with respect to these queries using the Qwant click model. In addition to the documents selected using this model, the collection also contains randomly selected documents from the Qwant index. The collection serves as the official test collection for the 2023 LongEval Information Retrieval Lab (https://clef-longeval.github.io/) organised at CLEF. It contains test datasets for the lab's two sub-tasks: short-term persistence (sub-task A) and long-term persistence (sub-task B). The data for the short-term persistence sub-task was collected over July 2022; this dataset contains 1,593,376 documents and 882 queries. The data for the long-term persistence sub-task was collected over September 2022; this dataset contains 1,081,334 documents and 923 queries. In addition to the original French versions of the webpages and queries, the collection also contains their English translations.
Identifier (URI):http://hdl.handle.net/11234/1-5139
Language:French
English
Language (ISO639):fra
eng
Publisher:Université Grenoble Alpes
Qwant
Research Studios Austria
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (ÚFAL)
Rights:Qwant LongEval Attribution-NonCommercial-ShareAlike License
https://lindat.mff.cuni.cz/repository/xmlui/page/Qwant_LongEval_BY-NC-SA_License
Subject:information retrieval
cross-language
cross-lingual information retrieval
parallel corpus
search
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-5139
DateStamp:  2023-04-28
GetRecord:  OAI-PMH request for simple DC format
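The GetRecord entries above refer to OAI-PMH requests for this record. As a minimal sketch, such a request URL could be assembled as follows. The query parameters (verb, identifier, metadataPrefix) are those defined by the OAI-PMH protocol, and oai_dc is its standard prefix for simple DC. The base URL below is a placeholder assumption, not the archive's actual endpoint, and the helper name build_get_record_url is hypothetical.

```python
from urllib.parse import urlencode


def build_get_record_url(base_url: str, identifier: str,
                         metadata_prefix: str = "oai_dc") -> str:
    """Assemble an OAI-PMH GetRecord request URL for a single record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Placeholder endpoint (assumption): replace with the repository's real
# OAI-PMH base URL, which is not given in this record.
BASE_URL = "https://example.org/oai/request"

# OAI identifier taken from the OaiIdentifier field of this record.
url = build_get_record_url(BASE_URL, "oai:lindat.mff.cuni.cz:11234/1-5139")
print(url)
```

Requesting the OLAC-format record instead would presumably only require a different metadata_prefix value; which prefixes an endpoint supports can be checked with the protocol's ListMetadataFormats verb.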

Search Info

Citation: Galuščáková, Petra; Devaud, Romain; Gonzalez-Saez, Gabriela; Mulhem, Philippe; Goeuriot, Lorraine; Piroi, Florina; Popel, Martin. 2023. Université Grenoble Alpes.
Terms: area_Europe country_FR country_GB dcmi_Text iso639_eng iso639_fra olac_primary_text


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-5139
Up-to-date as of: Thu Oct 5 0:43:34 EDT 2023