What does a Text Classifier Learn about Morality? An Explainable Method for Cross-Domain Comparison of Moral Rhetoric - code

doi: 10.4121/1e71138c-be26-4652-971a-48a84837df8e.v1
The doi above is for this specific version of this dataset, which is currently the latest. Newer versions may be published in the future. For a link that will always point to the latest version, please use
doi: 10.4121/1e71138c-be26-4652-971a-48a84837df8e
Datacite citation style:
Liscio, Enrico; Araque, Oscar; Gatti, Lorenzo; Constantinescu, Ionut; C.M. (Catholijn) Jonker et al. (2023): What does a Text Classifier Learn about Morality? An Explainable Method for Cross-Domain Comparison of Moral Rhetoric - code. Version 1. 4TU.ResearchData. software. https://doi.org/10.4121/1e71138c-be26-4652-971a-48a84837df8e.v1
Software

Code for the paper "What does a Text Classifier Learn about Morality? An Explainable Method for Cross-Domain Comparison of Moral Rhetoric", published at ACL '23. The code implements Tomea, an explainable AI method for investigating how language models represent morality differently across domains. Given a pair of datasets and the models trained on them, Tomea uses the SHAP method to generate ten m-distances and one d-distance that measure the difference between the two datasets. We make pairwise comparisons of seven models trained on the MFTC datasets (available at DOI: 10.4121/646b20e3-e24f-452d-938a-bcb6ce30913c).
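
As a rough illustration of the idea, the sketch below compares two word-importance lexicons and aggregates the results. It is not taken from the repository. It assumes that each m-distance compares the two domains on one moral element, that a lexicon maps words to importance scores (for example, aggregated SHAP values), that the m-distance is a Euclidean distance over shared words divided by the number of shared words, and that the d-distance is the Euclidean norm of the m-distances. The function names `m_distance` and `d_distance` and the exact normalization are hypothetical; the paper and the code may define them differently.

```python
import math


def m_distance(lexicon_a, lexicon_b):
    """Distance between two word -> score lexicons for one moral element.

    Assumption: Euclidean distance over the words both lexicons share,
    divided by the number of shared words.
    """
    shared = set(lexicon_a) & set(lexicon_b)
    if not shared:
        raise ValueError("the two lexicons share no words")
    squared = sum((lexicon_a[w] - lexicon_b[w]) ** 2 for w in shared)
    return math.sqrt(squared) / len(shared)


def d_distance(m_distances):
    """Aggregate per-element distances (assumption: Euclidean norm)."""
    return math.sqrt(sum(d ** 2 for d in m_distances))


# Toy lexicons for a single moral element in two hypothetical domains.
domain_a = {"protect": 0.5, "hurt": 0.1, "safe": 0.3}
domain_b = {"protect": 0.2, "hurt": 0.5, "police": 0.4}

print(m_distance(domain_a, domain_b))
```

In this toy setup, only the words "protect" and "hurt" contribute, because the comparison is restricted to the shared vocabulary.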

history
  • 2023-12-18 first online, published, posted
publisher
4TU.ResearchData
format
Python code
funding
  • Hybrid Intelligence Center (a 10-year programme funded by the Dutch Ministry of Education, Culture and Science through the Netherlands Organisation for Scientific Research).
  • European Research Council, under the European Union's Horizon 2020 research and innovation program (grant code STG–677576)
organizations
TU Delft, Faculty of Electrical Engineering, Mathematics and Computer Science, Department of Intelligent Systems
Universidad Politécnica de Madrid, Departamento de Ingeniería de Sistemas Telemáticos
University of Twente, Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS), Human Media Interaction (HMI)
ETH Zürich, Department of Computer Science
ISI Foundation, Data Science Laboratory
Leiden University, Leiden Institute of Advanced Computer Science

DATA

To access the source code, use the following command:

git clone https://data.4tu.nl/v3/datasets/14693061-797f-4105-b4b3-812c7cbcd759.git