Users' needs for a digital smoking cessation application and how to address them: Data and analysis code

doi: 10.4121/20284131.v2
The DOI above identifies this specific version of the dataset, which is currently the latest. Newer versions may be published in the future. For a link that always points to the latest version, please use
doi: 10.4121/20284131
Datacite citation style:
Albers, Nele; Neerincx, Mark A.; Penfornis, Kristell M.; Brinkman, Willem-Paul (2022): Users' needs for a digital smoking cessation application and how to address them: Data and analysis code. Version 2. 4TU.ResearchData. dataset.
Versions:
version 2 - 2022-07-13 (latest)
version 1 - 2022-07-12

This is the data and analysis code underlying the paper "Users' needs for a digital smoking cessation application and how to address them: A mixed-methods study" by Nele Albers, Mark A. Neerincx, Kristell M. Penfornis, and Willem-Paul Brinkman. The paper aims to assess users' needs for eHealth applications for behavior change more accurately, with a focus on smoking cessation.


The paper is based on a longitudinal study conducted on the crowdsourcing platform Prolific between 20 May 2021 and 30 June 2021. The Human Research Ethics Committee of Delft University of Technology granted ethical approval for the research (Letter of Approval number: 1523).

In this study, smokers who were contemplating or preparing to quit smoking interacted with the text-based virtual coach Sam in up to five conversational sessions. In each session, participants were assigned a new preparatory activity for quitting smoking, such as thinking of and writing down reasons for quitting. Since becoming more physically active may make it easier to quit smoking, half of the activities addressed becoming more physically active. In the following session, participants gave feedback on their activity: a rating of the effort they had spent on it and a free-text response describing their experience with it. After the five sessions, participants filled in a post-questionnaire that asked for feedback on their activity from the last session, barriers and motivators for doing their activities, and views on videos of interaction scenarios. These videos showed scenarios such as receiving motivational messages from the virtual coach, telling one's social environment about one's quit attempt, or consulting a general practitioner in case of a smoking relapse. There were 13 different videos, and each participant saw 2 of them.

The study was pre-registered in the Open Science Framework (OSF). The pre-registration describes the study design, measures, and related details. Note that the data we provide here is only part of the data collected in the study, namely the data related to studying user needs. Data on the acceptance of the virtual coach is published separately.

The implementation of the virtual coach Sam is published separately.

The formulations for the 24 preparatory activities used in the study can be found in Table S1 in the paper.


We collected four main types of data from participants:

  • Activity effort and experience: The effort people spent on their activity from the last session and their experience with it.
  • Barriers and motivators for doing the activities: Participants answered two free-text questions in the post-questionnaire.
  • Views on interaction scenarios: For each interaction scenario, participants provided a rating on a scale from -5 to 5 describing their intention to engage in the interaction and a free-text response after the prompt "Why do you think so?".
  • User characteristics: We measured several user characteristics (e.g., quitter self-identity, smoking frequency, weekly exercise amount, Big-Five personality, gender).

Please consult the "Readme" file in the "Data" folder for more information on the data we collected.
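
The interaction-scenario ratings described above lie on a scale from -5 to 5, so a simple range check can catch entry or parsing problems before analysis. The following is only a minimal sketch of such a check, not part of the published analysis code. The column names (`participant_id`, `scenario_id`, `rating`), the function `find_invalid_ratings`, and the sample rows are all hypothetical and made up for illustration; the actual file and column names are documented in the "Readme" file.

```python
import csv
import io

# Scale bounds as stated in the dataset description.
RATING_MIN, RATING_MAX = -5, 5


def find_invalid_ratings(rows, rating_column="rating"):
    """Return (row_index, raw_value) pairs for ratings that are missing,
    non-numeric, or outside the range [RATING_MIN, RATING_MAX].

    `rating_column` is a hypothetical column name, not the dataset's own.
    """
    invalid = []
    for index, row in enumerate(rows):
        raw = (row.get(rating_column) or "").strip()
        try:
            value = float(raw)
        except ValueError:
            # Empty or non-numeric entry.
            invalid.append((index, raw))
            continue
        if not RATING_MIN <= value <= RATING_MAX:
            invalid.append((index, raw))
    return invalid


# Made-up rows for illustration only; NOT taken from the dataset.
SAMPLE_CSV = """participant_id,scenario_id,rating
p01,3,4
p01,11,-5
p02,7,6
p02,2,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(find_invalid_ratings(rows))
```

With the made-up rows above, the check flags the third row (a rating of 6, which is above the scale) and the fourth row (an empty rating). To apply it to a real file, the in-memory sample would be replaced by an opened CSV file and the column name adjusted to match the "Readme" file.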


We conducted a mixed-methods analysis: a thematic analysis, triangulated with literature and quantitative data.

History:
  • 2022-07-12 first online
  • 2022-07-13 published, posted
Format: .zip, .csv, .xlsx, .md, .pdf, .txt, .py, .ipynb, .Rmd, .bib, .tar.gz, .jpg
Funding:
  • This work is part of the multidisciplinary research project Perfect Fit, which is supported by several funders organized by the Netherlands Organization for Scientific Research (NWO), program Commit2Data - Big Data & Health (project number 628.011.211).
Organization: TU Delft, Faculty of Electrical Engineering, Mathematics and Computer Science, Department of Intelligent Systems, Interactive Intelligence

