A Day in the Life with an Automated Vehicle: Supporting Data and Software

Supporting data and software for the paper 'A Day in the Life with an Automated Vehicle: Empirical Analysis of Data from an Interactive Stated Activity-Travel Survey'
Authors: B. Pudane, S. van Cranenburgh, C. Chorus
Transport and Logistics Group, Department of Engineering Systems and Services, Faculty of Technology, Policy and Management, TU Delft
Email: b.pudane@tudelft.nl



This dataset contains data from an interactive stated activity-travel survey that investigated the daily schedules of future automated vehicle users.
The survey was conducted in the Netherlands in July 2019 and was completed by 509 respondents.
This dataset also contains the survey tool (in Dutch) and data analysis files supporting the paper.

The survey in brief.
The goal of the survey was to collect two daily activity schedules from travellers: first, an approximate schedule of a recent workday, and second, the same workday reimagined in the future with automated vehicles (AVs).
In the present schedule, respondents reported an approximate recent day, constrained by the fact that only a single present-day travel mode (car, public transport, bicycle, etc., selected by the respondent) could be used.
In the future schedule, respondents redesigned the same day while imagining that only fully automated vehicles are available for all trips.
The survey placed special emphasis on the activities performed during travel. It was hypothesised that AVs may offer more or enhanced activities during travel, and that this availability could reshape entire daily schedules.
The survey tool was custom-built for this survey (by Game Tailors: https://gametailors.com/), and it includes an interactive interface that allows the respondents to visually design their schedules.
Read more about the survey in the above-referenced paper.



The following information documents all the files available in the folders.

The folder 'Raw data' contains the collected data in three formats.
First, there is an Excel summary of the responses: 'Survey_data_AV_Time-use.xlsx'. 
The main sheet 'Data' contains responses to all survey questions and a summarised version of the schedules. 
That is, the summary contains the total durations of each selected activity (which is differentiated between stationary and on-board locations), but it has no information on the clock-times of the activities.
The next sheet 'Codebook' contains the key for the encoded answers to multiple choice questions and the categorical socio-demographic data.
The last sheet 'Error count' summarises the cleaning of the 'Data' sheet. The cleaning corrected 6 main types of errors; however, we were conservative in labelling a situation as erroneous.
An example of a corrected error is the following: if a respondent selected the activity type 'Other' and then keyed in a description that matches an existing activity type (e.g., 'watch Netflix'), then the type was changed to the existing one (e.g., 'Leisure').
The cells in the 'Data' sheet that were altered in the cleaning process have an added comment with the error code.
In addition to cleaning the data, we also excluded 13 of the 509 responses from the MDCEV analysis. Three conditions led to an exclusion: (1) no work activity in the data; (2) no sleep activity in the data; (3) the first trip not starting at home or the last trip not ending at home.
These 13 rows are labelled in the Excel sheet, in the right-most columns of sheet 'Data'.
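
The three exclusion conditions can be sketched as a small check. This is only an illustration and not the procedure used to produce the labels in the Excel sheet: the Schedule and Trip classes, the activity names and the location label "home" are hypothetical, and the sketch assumes that a schedule without any trips is not flagged by condition (3).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Trip:
    origin: str       # hypothetical location label, e.g. "home"
    destination: str


@dataclass
class Schedule:
    activities: List[str]  # hypothetical activity names, e.g. "work", "sleep"
    trips: List[Trip]      # trips in chronological order


def exclusion_reasons(schedule: Schedule) -> List[str]:
    """Return the exclusion conditions that a schedule triggers (empty if none)."""
    reasons = []
    if "work" not in schedule.activities:
        reasons.append("no work activity")
    if "sleep" not in schedule.activities:
        reasons.append("no sleep activity")
    # Assumption: a day without trips is not flagged by condition (3).
    if schedule.trips:
        if schedule.trips[0].origin != "home":
            reasons.append("first trip does not start at home")
        if schedule.trips[-1].destination != "home":
            reasons.append("last trip does not end at home")
    return reasons


kept = Schedule(["sleep", "get ready", "work", "leisure"],
                [Trip("home", "office"), Trip("office", "home")])
dropped = Schedule(["sleep", "leisure"],
                   [Trip("home", "gym"), Trip("gym", "friend")])
print(exclusion_reasons(kept))     # []
print(exclusion_reasons(dropped))  # ['no work activity', 'last trip does not end at home']
```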

Second, the complete information of the responses (except for the socio-demographic pairing) can be found in 'survey.json' in coded form. Unlike the Excel file, the JSON file also includes the clock-times of activities.
The identified errors, as described above, were also corrected in this file.
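
The relation between the two formats (clock-times in the JSON file, total durations per activity and location in the Excel summary) can be illustrated as follows. This is a minimal sketch under stated assumptions: the layout of 'survey.json' is not documented here, so the episode fields "activity", "location", "start" and "end" are hypothetical, as is the rule that an episode whose end is not later than its start wraps past midnight.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

MINUTES_PER_DAY = 24 * 60


def to_minutes(clock: str) -> int:
    """Convert an 'HH:MM' clock-time into minutes since midnight."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


def total_durations(episodes: List[Dict[str, str]]) -> Dict[Tuple[str, str], int]:
    """Sum episode durations (in minutes) per (activity, location) pair."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for episode in episodes:
        start = to_minutes(episode["start"])
        end = to_minutes(episode["end"])
        # Assumption: an end before the start means the episode crosses midnight.
        duration = (end - start) % MINUTES_PER_DAY
        totals[(episode["activity"], episode["location"])] += duration
    return dict(totals)


# Hypothetical episodes; the field names are not taken from survey.json.
day = [
    {"activity": "sleep", "location": "stationary", "start": "23:00", "end": "07:00"},
    {"activity": "work", "location": "onboard", "start": "08:15", "end": "08:45"},
    {"activity": "work", "location": "stationary", "start": "09:00", "end": "17:00"},
    {"activity": "work", "location": "onboard", "start": "17:10", "end": "17:30"},
]
print(total_durations(day))
```

As in the Excel summary, the result keeps stationary and on-board durations apart but no longer carries any clock-time information.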

Third, a graphical representation of the schedules can be found in 'Survey schedules graphically.pdf'. These graphical representations are not screenshots of the created schedules: activities have been colour-coded for clarity.
The pdf file excludes all other data besides schedules (i.e., socio-demographics and answers to other questions). The identified errors, as described above, are labelled using comments in this file.



The folder 'Data and codes of MDCEV analysis' contains processed data for convenient use in estimating MDCEV (multiple discrete-continuous extreme value) models. In addition, it contains the R code of our MDCEV models and descriptive statistics.

The data files (csv) use long format (instead of the wide format of the raw data file), exclude the qualitative open-ended answers from the raw data, and use short names for activities.
For example, t_a01_o is the sleep duration during travel, t_a01_s is the stationary sleep duration, and t_a01 is the total sleep duration. The remaining activities are numbered as follows: 2 - get ready; 3 - work; 4 - meal; 5 - shop; 6 - service; 7 - leisure; 8 - household; 9 - other; 10 - travel time; 11 - do nothing during travel.
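
The naming scheme can be decoded with a short helper such as the one below. It is a sketch, not part of the dataset: the function describe_column is hypothetical, and it assumes that every activity follows the pattern shown for sleep (a two-digit activity number, optionally followed by _o or _s). The csv files may not contain every combination, in particular for activities 10 and 11.

```python
import re

# Activity numbers as listed in this README.
ACTIVITIES = {
    1: "sleep", 2: "get ready", 3: "work", 4: "meal", 5: "shop",
    6: "service", 7: "leisure", 8: "household", 9: "other",
    10: "travel time", 11: "do nothing during travel",
}
# Suffix meanings; a missing suffix denotes the total duration.
LOCATIONS = {"o": "during travel", "s": "stationary", None: "total"}

# Assumed pattern: t_aNN, t_aNN_o or t_aNN_s.
COLUMN_PATTERN = re.compile(r"^t_a(\d{2})(?:_([os]))?$")


def describe_column(name: str) -> str:
    """Translate a short duration column name into a readable description."""
    match = COLUMN_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a duration column: {name!r}")
    number = int(match.group(1))
    if number not in ACTIVITIES:
        raise ValueError(f"unknown activity number in {name!r}")
    return f"{ACTIVITIES[number]} duration, {LOCATIONS[match.group(2)]}"


print(describe_column("t_a01_o"))  # sleep duration, during travel
print(describe_column("t_a03"))    # work duration, total
```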

Files 'mdcev_onboard_const-only.R' and 'mdcev_onboard_with_AV_per_SD.R' estimate the two models for on-board activities. They use the data file 'Survey_data_MDCEV_onboard_model.csv'.
Files 'mdcev_stationary_const-only.R' and 'mdcev_stationary_with_AV_per_SD.R' estimate the two models for stationary activities. They use the data file 'Survey_data_MDCEV_stationary_model.csv'.

File 'All_MDCEV_results.xlsx' contains all results in a table (Tables 2 and 3 in the paper), and a simple computation for the prediction results (for Tables 4 and 5 in the paper).
File 'Descriptive_analysis_graphs.xlsx' contains Figures 2 and 3 from the paper, which are obtained using results from the 'descriptives.R' script. This script also includes the code for Figure 4 from the paper.



The folder 'Survey tool' contains the offline versions of the survey (in Dutch), which can be run on Windows, Macintosh or Linux operating systems.
It also contains the video introduction to the survey in a separate file, and the script of the video text (in Dutch).
Finally, it has a screenshot of the main survey task, which is also translated into English.
