Dataset and Analyses for Extracting Schemas from Thought Records using Natural Language Processing

Download (1.58 GB)
posted on 29.09.2021, 14:31 by Franziska Burger
This dataset contains all data and analysis scripts for the research reported in the PLOS ONE paper “Natural language processing for cognitive therapy: extracting schemas from thought records.” The cognitive approach to psychotherapy aims to change patients' maladaptive schemas, that is, overly negative views of themselves, the world, or the future. To become aware of these views, patients record their thought processes in situations that caused pathogenic emotional responses. To date, the schemas underlying such thought records have largely been identified manually. Using recent advances in natural language processing, we take this one step further by automatically extracting schemas from thought records. We used the Amazon Mechanical Turk crowdsourcing platform to collect a set of 1600 thought records. In total, these thought records contain 5747 thoughts of various depth levels, with the automatic thought constituting the shallowest level and the core belief the deepest.
We provide:

1. a natural language dataset: the thoughts delineated by participants in the scenario-based and open thought records
2. reliability analyses: the first author labeled all thoughts with respect to the degree to which they reflect each of 9 possible schemas. An independent second coder also labeled a sample of the thoughts.
3. analyses to determine whether automatic identification of schemas from thoughts is possible.
4. additional materials (scenarios, instruction videos, Qualtrics survey, OSF preregistration form) that could assist in replicating the study.


4TU research centre for Humans and Technology




csv, docx, html, ipynb, pdf, png, rnw


TU Delft, Faculty of Electrical Engineering, Mathematics and Computer Science; 4TU research centre for Humans and Technology