Data associated with the article: "Exploring the Viability of ChatGPT for Personal Data Anonymization in Government: A Comprehensive Analysis of Possibilities, Risks, and Ethical Implications"
doi: 10.4121/a1dfacbe-b463-404f-a3d7-dab8485e6458
Artificial Intelligence (AI) applications are expected to promote government service delivery and quality, more efficient handling of cases, and bias reduction in decision-making. One potential benefit of the AI tool ChatGPT is that it may support governments in the anonymization of data. However, it is not clear whether ChatGPT is appropriate to support data anonymization for public organizations. Hence, this study examines the possibilities, risks, and ethical implications for government organizations to employ ChatGPT in the anonymization of personal data. We use a case study approach, combining informal conversations, formal interviews, a literature review, document analysis and experiments to conduct a three-step study. First, we describe the technology behind ChatGPT and its operation. Second, experiments with three types of data (fake data, original literature and modified literature) show that ChatGPT exhibits strong performance in anonymizing these three types of texts. Third, an overview of significant risks and ethical issues related to ChatGPT and its use for anonymization within a specific government organization was generated, including themes such as privacy, responsibility, transparency, bias, human intervention, and sustainability. One significant risk in the current form of ChatGPT is a privacy risk, as inputs are stored and forwarded to OpenAI and potentially other parties. This is unacceptable if texts containing personal data are anonymized with ChatGPT. We discuss several potential solutions to address these risks and ethical issues. This study contributes to the scarce scientific literature on the potential value of employing ChatGPT for personal data anonymization in government. In addition, this study has practical value for civil servants who face the challenges of data anonymization in practice including resource-intensive and costly processes.
- 2024-02-02 first online, published, posted
- Afstudeerstage Justitiƫle Informatiedienst
DATA
- 91,452 bytesMD5:
f99266ff22e1a2d90b219d0b70f54757
ChatGPT Anonymisation.zip -
download all files (zip)
91,452 bytes unzipped