Data associated with the article: "Exploring the Viability of ChatGPT for Personal Data Anonymization in Government: A Comprehensive Analysis of Possibilities, Risks, and Ethical Implications"

doi:10.4121/a1dfacbe-b463-404f-a3d7-dab8485e6458.v1
The DOI above refers to this specific version of the dataset, which is currently the latest. Newer versions may be published in the future. For a link that will always point to the latest version, please use
doi: 10.4121/a1dfacbe-b463-404f-a3d7-dab8485e6458
Datacite citation style:
van Staalduine, Nina (2024): Data associated with the article: "Exploring the Viability of ChatGPT for Personal Data Anonymization in Government: A Comprehensive Analysis of Possibilities, Risks, and Ethical Implications". Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/a1dfacbe-b463-404f-a3d7-dab8485e6458.v1
Other citation styles (APA, Harvard, MLA, Vancouver, Chicago, IEEE) available at Datacite
Dataset

Artificial Intelligence (AI) applications are expected to improve government service delivery and quality, enable more efficient handling of cases, and reduce bias in decision-making. One potential benefit of the AI tool ChatGPT is that it may support governments in anonymizing data. However, it is unclear whether ChatGPT is suitable for supporting data anonymization in public organizations. This study therefore examines the possibilities, risks, and ethical implications of government organizations employing ChatGPT to anonymize personal data. We use a case study approach, combining informal conversations, formal interviews, a literature review, document analysis, and experiments in a three-step study. First, we describe the technology behind ChatGPT and how it operates. Second, experiments with three types of data (fake data, original literature, and modified literature) show that ChatGPT performs strongly in anonymizing these three types of texts. Third, we generate an overview of significant risks and ethical issues related to ChatGPT and its use for anonymization within a specific government organization, covering themes such as privacy, responsibility, transparency, bias, human intervention, and sustainability. One significant risk in the current form of ChatGPT is a privacy risk, as inputs are stored and forwarded to OpenAI and potentially other parties. This is unacceptable when texts containing personal data are anonymized with ChatGPT. We discuss several potential solutions to address these risks and ethical issues. This study contributes to the scarce scientific literature on the potential value of employing ChatGPT for personal data anonymization in government. In addition, it has practical value for civil servants who face the challenges of data anonymization in practice, including resource-intensive and costly processes.

history
  • 2024-02-02 first online, published, posted
publisher
4TU.ResearchData
format
docx
language
nl
funding
  • Graduation internship (afstudeerstage), Justitiële Informatiedienst
organizations
Ministerie van Justitie en Veiligheid, Justitiële Informatiedienst; TU Delft, Faculty of Technology, Policy and Management

DATA

files (1)