OLAC Record
oai:lindat.mff.cuni.cz:11234/1-5847

Metadata
Title: Debiasing Algorithm through Model Adaptation
Bibliographic Citation: http://hdl.handle.net/11234/1-5847
Creator: Limisiewicz, Tomasz; Mareček, David; Musil, Tomáš
Date (W3CDTF): 2025-03-10T10:47:40Z
Date Available: 2025-03-10T10:47:40Z
Description: Debiasing Algorithm through Model Adaptation (DAMA) is based on guarding stereotypical gender signals and model editing. DAMA is applied to the specific modules that causal tracing shows are prone to conveying gender bias. Our novel method effectively reduces gender bias in LLaMA models on three diagnostic tests: generation, coreference (WinoBias), and stereotypical sentence likelihood (StereoSet). The method does not change the model’s architecture, parameter count, or inference cost. We also show that the model’s performance in language modeling and on a diverse set of downstream tasks is almost unaffected. This package contains both the source code and the English, English-to-Czech, and English-to-German datasets.
Identifier (URI): http://hdl.handle.net/11234/1-5847
Language: English; Czech; German
Language (ISO639): eng; ces; deu
Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Rights: The MIT License (MIT); http://opensource.org/licenses/mit-license.php
Subject: gender-bias; large language models; machine translation
Type: toolService
Type (DCMI): Software

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-5847
DateStamp:  2025-03-10
GetRecord:  OAI-PMH request for simple DC format
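
The GetRecord entries above refer to OAI-PMH requests whose URLs are not preserved in this record. As a minimal sketch of how such a request is formed, the snippet below builds GetRecord URLs for this record's OaiIdentifier. The base URL is a hypothetical placeholder, since the repository's actual OAI-PMH endpoint is not stated here, and the `olac` metadata prefix is an assumption; `oai_dc` is the simple Dublin Core prefix defined by the OAI-PMH protocol.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: this record does not state the repository's OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_ID = "oai:lindat.mff.cuni.cz:11234/1-5847"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record and one format."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# "olac" is an assumed prefix for the OLAC format; "oai_dc" is simple Dublin Core.
olac_url = get_record_url(OAI_ID, "olac")
dc_url = get_record_url(OAI_ID, "oai_dc")
```

`urlencode` percent-encodes the colons and slash in the identifier, so the resulting URLs can be passed directly to an HTTP client once the placeholder base URL is replaced with the real endpoint.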

Search Info

Citation: Limisiewicz, Tomasz; Mareček, David; Musil, Tomáš. 2025. Debiasing Algorithm through Model Adaptation. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL).
Terms: area_Europe country_CZ country_DE country_GB dcmi_Software iso639_ces iso639_deu iso639_eng


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-5847
Up-to-date as of: Tue Mar 11 0:32:48 EDT 2025