**** Data and Analysis Underlying the Research into The Artificial Social Agent Questionnaire (ASAQ) - Development and Evaluation of a Validated Instrument for Capturing Human Interaction Experiences with Artificial Social Agents ****
Authors: Siska Fitrianie, Merijn Bruijnes, Amal Abdulrahman, Willem-Paul Brinkman
Corresponding author: Siska Fitrianie, s.fitrianie@tudelft.nl
https:

Delft University of Technology
P.O. Box 5031,
2600 GA Delft,
The Netherlands



Description of the dataset
------------------------------------------------------------------------------------------------------------------------------------
Archive.zip is supplementary material for the journal article:
Siska Fitrianie, Merijn Bruijnes, Amal Abdulrahman, Willem-Paul Brinkman. (????) The Artificial Social Agent Questionnaire (ASAQ) - Development and Evaluation of a Validated Instrument for Capturing Human Interaction Experiences with Artificial Social Agents. International Journal of Human-Computer Studies. Elsevier

We present data, scripts, and analyses from studies on the construct definition, reliability, content validity, construct validity, cross-validity, predictive validity, concurrent validity, and interpretability of the Artificial Social Agent Questionnaire (ASAQ). The calculation of sample sizes for using the constructs/dimensions of the short and long versions of the ASAQ is also given. To access the content, follow the table of contents in underlyingIJHCS_ASAQ.pdf, which corresponds to the table of contents (outline) of the journal article. The captions of the tables and figures whose results correspond to the tables and figures in the journal article are also listed. Additionally, the folder "ASAQ Chart generator" contains the script to generate an ASAQ chart, including example input based on the long and short ASAQ versions for the ASAQ representative set 2024.

Various data were collected between 2019 and 2022 to support the analysis. Parts of this work were funded by the 4TU Pride and Prejudice project. We acknowledge the efforts of all OSF workgroup members, and in particular the large recurring efforts of Nele Albers, Andrea Bönsch, Jonathan Ehret, Fengxian Li, and Deborah Richards.



Abstract of the journal
------------------------------------------------------------------------------------------------------------------------------------
Validating claims and replicating findings on the impact of artificial social agents (ASA), such as virtual agents, conversational agents and social robots, require a standardised measurement instrument that researchers can employ in different settings and for various agents. Such an instrument would allow researchers to evaluate their agents and establish insights beyond their specific study context. Therefore, we present the long and short versions of the ASA questionnaire (ASAQ) for evaluating human-ASA interaction on 19 constructs, such as the agent's believability, sociability, and coherence. It has been developed by an international workgroup with more than 100 ASA-researchers over multiple years who identified community-relevant constructs and associated questionnaire items and examined the questionnaire's reliability, validity, and interpretability. The result is a questionnaire that can capture more than 80% of the constructs that studies in the intelligent virtual agent community investigate, with acceptable levels of reliability, content validity, construct validity, and cross-validity. We suggest that ASA-researchers use the ASAQ short version to report their agent's psychographic information and the ASAQ long version to analyse any constructs in depth that are specifically relevant to their agent or study. Finally, this paper gives clear instructions for practical use, such as sample size estimations, and how to present results.



Requirements
------------------------------------------------------------------------------------------------------------------------------------
1. PDF reader to open the rendered R Markdown file (underlyingIJHCS_ASAQ.pdf)
2. Text editor to read the comma-separated-values (.csv) files of the dataset
3. Internet browser for generating ASAQ Charts
4. Excel to read "ASAQ Chart Generator/csv_asaq_29agents.xlsx"
5. Image viewer to display "ASAQ Chart Generator/howto.png"



Data and Analysis Code
------------------------------------------------------------------------------------------------------------------------------------
The analysis code is included in the rendered R Markdown file underlyingIJHCS_ASAQ.pdf, which was generated from underlyingIJHCS_ASAQ.Rmd using R (v4.2.1).

The following data are used in the analysis:
- General purpose data:
  1. data/constructASAQdesc.csv 
  # Description of the ASAQ's constructs/dimensions
  # CID = the construct/dimension id used before Study Mid 2021. It uses the notation "C<construct-number>D<dimension-number>"
  # CABR = the construct/dimension id used after Study Mid 2021. It is the abbreviation of the construct/dimension's name
  # CNAME = the construct/dimension's name (see Table 1 for the complete version)
  # NUM_ITEM = the number of items in the construct/dimension
  # SHORT_ITEM_ID = the ids of the 24 representative items that make up the short version of the ASAQ. These ids use the notation from before Study Mid 2021
  # SHORT_ITEM_REF = the ids of the short-version ASAQ items that correspond to the long version of the ASAQ

  2. data/itemASAQdesc.csv
  # Description of the ASAQ's items
  # ITEM_ID = the item id used before Study Mid 2021. It uses the notation "C<construct-number>D<dimension-number>Q<item-number>"
  # ITEM_REF = the item id used after Study Mid 2021. It uses the notation <construct-abbreviation><item-number>
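
The two id notations described above can be told apart and split into their parts with a small parser. The following is a minimal Python sketch, not part of the archive (the authors' own analysis is written in R). The function names and the example ids "C01D01Q02" and "AB3" are hypothetical, and the sketch assumes that construct abbreviations consist of letters only.

```python
import re

# Notation used before Study Mid 2021:
#   C<construct-number>D<dimension-number>Q<item-number>
OLD_ITEM_ID = re.compile(r"^C(\d+)D(\d+)Q(\d+)$")

# Notation used after Study Mid 2021:
#   <construct-abbreviation><item-number>
# Assumption: the abbreviation contains letters only.
NEW_ITEM_REF = re.compile(r"^([A-Za-z]+)(\d+)$")


def parse_old_item_id(item_id):
    """Split an ITEM_ID into construct, dimension, and item numbers."""
    match = OLD_ITEM_ID.match(item_id.strip())
    if match is None:
        raise ValueError(f"not an ITEM_ID in the old notation: {item_id!r}")
    construct, dimension, item = (int(part) for part in match.groups())
    return {"construct": construct, "dimension": dimension, "item": item}


def parse_new_item_ref(item_ref):
    """Split an ITEM_REF into construct abbreviation and item number."""
    match = NEW_ITEM_REF.match(item_ref.strip())
    if match is None:
        raise ValueError(f"not an ITEM_REF in the new notation: {item_ref!r}")
    return {"abbreviation": match.group(1), "item": int(match.group(2))}


# Hypothetical ids, used only to illustrate the two notations.
old_example = parse_old_item_id("C01D01Q02")
new_example = parse_new_item_ref("AB3")
```

Because the old notation always contains the letters D and Q between digits, an id in the old notation never matches the new pattern, so the two parsers cannot be confused with each other.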
  
- Expert votes on classifying 189 constructs found in IVA Proceedings 2013-2018: 
  data/constructDefining/expertReview.csv
  # Expert voting on classifying 189 constructs found in IVA proceedings 2013-2018 into 7 concepts
  # No = IVA construct id [C1__ .. C189]
  # Construct = the construct's name 
  # AProp, ASoc, ARole, IntQ, HImpr, AImpr, HAttr = the number of voters classifying the corresponding construct into each concept: respectively, Agent's Properties, Agent's Social Trait, Agent's Role (Performance), Interaction Quality, Human's Impression of the Interaction, Impression of the agent's after the Interaction, and Human's Attributes (to Support the Interaction)
  # Other = the number of voters classifying the corresponding construct as something other than the 7 concepts
  # nVoters = the total number of voters for the corresponding construct
  # SingleCls = the concept chosen by a majority (>= 50%) of voters for the corresponding construct
  # PairCls_1 = the concept with the largest number of voters, but below 50% of voters, for the corresponding construct
  # PairCls_2 = the concept with the second largest number of voters, but below 50% of voters, for the corresponding construct
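
The relation between the vote counts and the SingleCls, PairCls_1, and PairCls_2 columns can be sketched as follows. This is a minimal Python illustration of the rule as described above, not the authors' code. It assumes that "Other" votes cannot win a classification and that ties are broken by the column order, neither of which is stated in the dataset description. The function name and the example vote counts are hypothetical.

```python
# The seven concepts, in the column order of expertReview.csv.
CONCEPTS = ["AProp", "ASoc", "ARole", "IntQ", "HImpr", "AImpr", "HAttr"]


def classify_construct(votes, n_voters):
    """Derive SingleCls or PairCls_1/PairCls_2 from per-concept vote counts.

    votes    -- dict mapping concept name to number of voters
    n_voters -- total number of voters for the construct (nVoters)
    """
    # Rank concepts by vote count, highest first. sorted() is stable,
    # so ties keep the column order (an assumption of this sketch).
    ranked = sorted(CONCEPTS, key=lambda c: votes.get(c, 0), reverse=True)
    top = ranked[0]
    if n_voters > 0 and votes.get(top, 0) / n_voters >= 0.5:
        # One concept holds the majority (>= 50%) of voters.
        return {"SingleCls": top, "PairCls_1": None, "PairCls_2": None}
    # No majority: report the two concepts with the most voters.
    return {"SingleCls": None, "PairCls_1": ranked[0], "PairCls_2": ranked[1]}


# Hypothetical vote counts, for illustration only.
majority = classify_construct({"AProp": 6, "ASoc": 2, "Other": 2}, 10)
no_majority = classify_construct(
    {"IntQ": 4, "HImpr": 3, "ASoc": 2, "Other": 1}, 10
)
```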

- Content validation dataset:  
  1. data/contentValidation/q431_expertGeneratedItems.csv 
  # In the content validation study, experts (n=31) were asked 223 questions. In each question, they were asked whether each presented item was intended for the questioned construct/dimension. Each question presented at most two target items (i.e., items that are intended for the questioned construct/dimension) and two random distractor items (i.e., items that are not intended for the questioned construct/dimension).
  # QUESTIONID = the question id, composed of C?? (the construct id), D?? (the dimension id), and S* (the question number [1..223])
  # CONID = the construct id of the target. It uses the notation from before Study Mid 2021.
  # CONNAME = the construct name of the target
  # DIMID = the dimension id of the target. It uses the notation from before Study Mid 2021.
  # DIMNAME = the dimension name of the target
  # ITEMID = the item id of the target. It uses the notation from before Study Mid 2021.
  # ITEMQID = the id of the (YES or NO) question in Qualtrics for the target item, composed of <QUESTIONID>_1 or <QUESTIONID>_3
  # ITEMRID = the reference id of the target items: C??D??S*_1_TRGT and C??D??S*_2_TRGT
  # ITEMQ = the questionnaire item statement

  2. data/contentValidation/qTargets_vs_Distractors.csv
  # The result of the content validity study with 31 experts. It contains the list of pairings of one target item and one distractor item in a (Qualtrics) question, and the experts' answers for both items.
  # PRID = anonymous id of the expert. Each expert was asked to answer 223 questions and was allowed to leave questions unanswered.
  # QUESTIONID = the question id (i.e., task number), composed of C?? (the construct id), D?? (the dimension id), and S* (the question number [1..223])
  # TRGT_ITEMID = the item id of the target. It uses the notation from before Study Mid 2021.
  # TRGT_ITEMQID = the id of the (YES or NO) question in Qualtrics for the target item, composed of <QUESTIONID>_1 or <QUESTIONID>_3
  # TRGT_ITEMRID = the reference id of the target items: C??D??S*_1_TRGT and C??D??S*_2_TRGT
  # TRGT_ITEMQ = the questionnaire item statement of the target
  # DSTR_ITEMID = the item id of the distractor. It uses the notation from before Study Mid 2021.
  # DSTR_ITEMQID = the id of the (YES or NO) question in Qualtrics for the distractor item, composed of <QUESTIONID>_2 or <QUESTIONID>_4
  # DSTR_ITEMRID = the reference id of the distractor items: C??D??S*_1_DSTR and C??D??S*_2_DSTR
  # DSTR_ITEMQ = the questionnaire item statement of the distractor
  # TRGT_ANS = the answer of the expert for the target item
  # DSTR_ANS = the answer of the expert for the distractor item
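
The Qualtrics question ids described above encode whether an item is a target (suffix _1 or _3) or a distractor (suffix _2 or _4). A minimal Python sketch of that convention follows. It is an illustration only: the function name and the example id "C01D01S12_3" are hypothetical, and the exact spelling of QUESTIONID values in the files should be checked before relying on it.

```python
def split_item_qid(item_qid):
    """Split an ITEMQID into (QUESTIONID, slot number, role).

    Per the column descriptions, slots 1 and 3 hold target items
    and slots 2 and 4 hold distractor items.
    """
    question_id, _, slot = item_qid.strip().rpartition("_")
    if not question_id or slot not in {"1", "2", "3", "4"}:
        raise ValueError(f"unexpected ITEMQID: {item_qid!r}")
    role = "target" if slot in {"1", "3"} else "distractor"
    return question_id, int(slot), role


# Hypothetical ids, for illustration only.
target_example = split_item_qid("C01D01S12_3")
distractor_example = split_item_qid("C01D01S12_2")
```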

- ASAQ Representative set 2024:
  # 29 agents of the ASAQ Representative Set 2024 (14 agents in the Study Mid 2021 and 15 in the Study 2022):
  # "AIBO", "ALEXA", "ALICE", "AMY", "CHAPPIE", "C3PO", "DEEPBLUE", "DOG", "EFFIE", "THE FISH", "FURBY", "FURHAT", "GEMINOID", "HAL 9000", "iCAT", "KITT", "LOLA", "MARCUS", "NAO", "PARO", "POPPIE", "ROBOT BOSS", "SAM", "SARAH", "SIM SENSEI", "SIRI", "THE AMBIENT LIGHT TV", "THE NEGOTIATOR", "THE ZOMBIE"
  Note:
  * SAM is used in the dataset to refer to Samantha (from the movie HER),
  * POPPIE is used in the dataset to refer to POPPY (from the Semaine project),
  * ROBOT BOSS is one of the Double Robot agents,
  * SARAH is one of the Soul Machine agents, and
  * MARCUS is one of the characters in the movie Terminator.
  * For more information about the other ASAs, see: https://osf.io/7wmjh/

- ASAQ datasets:
  Note: 
  * In all datasets, the data of questionnaire items whose column names start with "R_" have been reversed.
  - Study Early '21 dataset:
    1. data/1stReliabilityAnalysis/result_part_1.csv
    2. data/1stReliabilityAnalysis/result_part_2.csv
    # The Study Early '21 with 192 crowd-workers on Prolific
    # PRID = Anonymous id of the participant
    # STARTDATE = starting datetime of the data entry
    # ENDDATE = ending datetime of the data entry
    # RECORDDATE = datetime when the data was recorded
    # GROUP = group of participants, i.e., one of four groups - 1stPOV_part_1, 1stPOV_part_2, 3rdPOV_part_1,  and 3rdPOV_part_2
    # CheckScore = the number of attention-check questions that the participant answered correctly
    # C01D01Q?? .. C19D00Q?? = the participant's ratings of the questionnaire items. The item ids use the notation from before Study Mid 2021
  
  - Predictive validation dataset (3 experts' ratings) 
    1. data/predictiveValidation/scoreExpertA.csv
    2. data/predictiveValidation/scoreExpertB.csv 
    3. data/predictiveValidation/scoreExpertC.csv
    # Three experts' predictions of the scores of the ASAQ constructs for 54 agents.
    # One dataset file per expert
    # ASA = the agent's name
    # C1D1 .. C19 = the expert's predicted scores for the constructs/dimensions. The construct ids use the notation from before Study Mid 2021

  - Study Mid '21 dataset:
    data/constructValidation/result_all.csv 
    # The Study Mid 2021, involving 532 crowd-workers on Prolific
    # PRID = Anonymous id of the participant
    # STARTDATE = starting datetime of the data entry
    # ENDDATE = ending datetime of the data entry
    # RECORDDATE = datetime when the data was recorded
    # GROUP = group of participants, i.e., one of the 14 agents
    # CheckScore = the number of attention-check questions that the participant answered correctly
    # C01D01Q?? .. C19D00Q?? = the participant's ratings of the questionnaire items. The item ids use the notation from before Study Mid 2021

  - Study '22 dataset:
    data/crossValidation/result_all.csv
    # The Cross Validation study involving 534 crowd-workers on Prolific
    # PRID = Anonymous id of the participant
    # STARTDATE = starting datetime of the data entry
    # ENDDATE = ending datetime of the data entry
    # RECORDDATE = datetime when the data was recorded
    # GROUP = group of participants, i.e., one of the 15 agents
    # CheckScore = the number of attention-check questions that the participant answered correctly
    # C01D01Q?? .. C19D00Q?? = the participant's ratings of the questionnaire items. The item ids use the notation from before Study Mid 2021
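
As an illustration of how the item columns of the study datasets might be aggregated, the following minimal Python sketch groups a participant's item ratings by construct/dimension id and averages them. It is not the authors' analysis code (that is in underlyingIJHCS_ASAQ.Rmd). It assumes that the "R_" prefix precedes the item id in the column name (e.g., a hypothetical "R_C01D01Q02"), that a construct/dimension score is the mean of its item ratings, and that empty cells mark unanswered items. Because "R_" columns have been reversed in the files, no further sign change is applied.

```python
import csv
import re
from collections import defaultdict
from statistics import mean

# Item column: optional "R_" prefix, then C<construct>D<dimension>Q<item>.
# Assumption: the "R_" prefix comes directly before the item id.
ITEM_COLUMN = re.compile(r"^(?:R_)?(C\d+D\d+)Q\d+$")


def construct_means(row):
    """Average one participant's item ratings per construct/dimension id."""
    grouped = defaultdict(list)
    for column, value in row.items():
        match = ITEM_COLUMN.match(column)
        if match is None or value in ("", None):
            continue  # not an item column, or an unanswered item
        grouped[match.group(1)].append(float(value))
    return {cid: mean(values) for cid, values in grouped.items()}


def load_rows(path):
    """Read a study CSV file into a list of dicts, one per participant."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Hypothetical row, for illustration only.
example_row = {
    "PRID": "p1",
    "CheckScore": "4",
    "C01D01Q01": "2",
    "R_C01D01Q02": "-1",
    "C02D01Q01": "",
    "C19D00Q01": "3",
}
example_scores = construct_means(example_row)
```

For example, `load_rows("data/constructValidation/result_all.csv")` could supply the rows, after which `construct_means` can be applied to each row.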



ASAQ Chart Generator
------------------------------------------------------------------------------------------------------------------------------------
In the folder "ASAQ Chart generator," we provide code to generate ASAQ charts.

- ASAQ Chart generator/asaqchart.html
  The interface to generate an ASAQ chart for visualising ASAQ ratings of one to four ASAs on the original -3 to 3 ASAQ scale
- ASAQ Chart generator/asaqpercchart.html
  The interface to generate an ASAQ chart for contrasting one to four ASAs with the ASAQ representative set 2024
- ASAQ Chart generator/asaqchart.js
  JavaScript library for displaying ASAQ charts
- ASAQ Chart generator/howto.png
  A figure that shows the steps to generate an ASAQ chart
- ASAQ Chart generator/csv_asaq_29agents.xlsx
  Example input for generating ASAQ charts using data from the ASAQ representative set 2024
  - Sheet "long": the mean scores of the 24 ASAQ constructs/dimensions of the long ASAQ version for 29 agents
  - Sheet "short": the mean scores of the 24 items of the short ASAQ version for 29 agents



Quick Instruction
------------------------------------------------------------------------------------------------------------------------------------
1. Unzip Archive.zip 
2. Follow the table of contents in underlyingIJHCS_ASAQ.pdf, which corresponds to the outline of the journal article, to see the analysis results
3. Follow the function names in underlyingIJHCS_ASAQ.pdf to see the analysis process and code
4. Use R (v4.2.1) to generate the pdf from underlyingIJHCS_ASAQ.Rmd
5. Follow the instructions in the figure "ASAQ Chart Generator/howto.png" to generate ASAQ Charts






