Dataset and Analyses for Extracting Schemas from Thought Records using Natural Language Processing (dataset)

The DOI in the citation below (10.4121/16685347.v1) refers to this specific version of the dataset, which is currently the latest. Newer versions may be published in the future. For a link that always points to the latest version, please use
doi: 10.4121/16685347

Datacite citation style:

Franziska Burger (2021): Dataset and Analyses for Extracting Schemas from Thought Records using Natural Language Processing. Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/16685347.v1

Dataset

usage stats: 2189 views, 560 downloads

keywords: cognitive therapy, core beliefs, natural language processing, thought records

time coverage: 2019, 2020

licence: CC0

by Franziska Burger

This dataset contains all data and analysis scripts for the research reported in the PLOS ONE paper “Natural language processing for cognitive therapy: extracting schemas from thought records.” The cognitive approach to psychotherapy aims to change patients' maladaptive schemas, that is, overly negative views of themselves, the world, or the future. To become aware of these views, patients record their thought processes in situations that caused pathogenic emotional responses. To date, the schemas underlying such thought records have largely been identified manually. Using recent advances in natural language processing, we take this one step further by automatically extracting schemas from thought records. We used the Amazon Mechanical Turk crowdsourcing platform to collect a set of 1600 thought records. In total, these thought records contain 5747 thoughts of various depth levels, with the automatic thought constituting the shallowest level and the core belief the deepest.

This dataset provides:

1. a natural language dataset: the thoughts delineated by participants in the scenario-based and open thought records

2. reliability analyses: the first author labeled all thoughts with the degree to which they reflect each of a set of 9 possible schemas. An independent second coder also labeled a sample of the thoughts.

3. analyses to determine whether automatic identification of the schemas underlying the thoughts is possible.

4. additional materials (scenarios, instruction videos, Qualtrics survey, OSF preregistration form) that could assist in replicating the study.
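
To illustrate what the reliability analyses in item 2 involve, the sketch below computes agreement between two coders who rated the same thoughts for one schema. It is not taken from the dataset's scripts: the agreement statistic (Cohen's kappa with linear weights), the 0 to 3 rating scale, the function name, and the example ratings are all assumptions made for illustration.

```python
from collections import Counter


def linear_weighted_kappa(ratings_a, ratings_b, categories):
    """Cohen's kappa with linear disagreement weights for two coders.

    `categories` lists the ordinal rating levels from lowest to highest.
    Hypothetical helper, not part of the published analysis scripts.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    if len(categories) < 2:
        raise ValueError("need at least two rating categories")

    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)
    k = len(categories)

    observed = Counter(zip(ratings_a, ratings_b))  # joint rating counts
    marg_a = Counter(ratings_a)                    # coder A marginals
    marg_b = Counter(ratings_b)                    # coder B marginals

    obs_dis = 0.0  # observed weighted disagreement
    exp_dis = 0.0  # disagreement expected by chance
    for a in categories:
        for b in categories:
            w = abs(index[a] - index[b]) / (k - 1)
            obs_dis += w * observed[(a, b)] / n
            exp_dis += w * (marg_a[a] / n) * (marg_b[b] / n)

    if exp_dis == 0:
        raise ValueError("kappa is undefined when both coders use one category")
    return 1.0 - obs_dis / exp_dis


# Made-up ratings of six thoughts for a single schema (assumed 0-3 scale).
first_coder = [0, 1, 2, 3, 0, 1]
second_coder = [0, 1, 2, 2, 0, 2]
print(linear_weighted_kappa(first_coder, second_coder, [0, 1, 2, 3]))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance. Linear weights penalize a one-step disagreement less than a three-step one, which suits ordinal "degree" ratings; the statistic reported in the paper may differ.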
