TY - DATA
T1 - Multimodal SKEP dataset for attention regulation behaviors, knowledge gain, perceived learning experience, and perceived social presence in e-learning with a conversational agent
PY - 2023/04/21
AU - Yoon Lee
AU - Marcus Specht
UR -
DO - 10.4121/4c9de645-ca88-4b45-8fc7-2fc325f191dc.v1
KW - attention regulation
KW - behavior
KW - e-learning
KW - knowledge gain
KW - perceived learning experience
KW - perceived social presence
KW - Human-Robot Interaction
KW - conversational agents
N2 -

Reading on digital devices has become increasingly commonplace, and it often challenges learners' attention. In this study, we hypothesized that allowing learners to reflect on their reading phases with an empathic social robot companion might enhance their attention during e-reading. To test this assumption, we collected a novel SKEP dataset in an e-reading setting with social robot support.


We designed two interfaces: 1) a GUI-based system comprising a monitor, mouse, and eye tracker, and 2) an HRI-based system comprising a monitor, mouse, eye tracker, and a Furhat robot as physical components. See the footnote for the specifications of the Pupil Core eye tracker and Logitech C505 HD webcam used. In both conditions, a technically informative e-reading text, "Waste management and critical raw materials," was presented through a screen-based reader that we developed explicitly for this study. The content was chosen to provide a comparable knowledge baseline for general readers. The text contains 4,750 words, divided into 29 pages covering seven subtopics. It was rendered at 47 pt on a 27-inch monitor with a 2560x1440 resolution. This setting was optimized for the eye tracker, which requires a larger font size than typical PDF readers to enable high-resolution gaze data collection.


We implemented four measurements covering direct and indirect attentional cues. Data features and granularity vary with the collection method, collection timing, and post-processing. Learners' self-regulatory behavior was captured through a video feed and annotated second by second post hoc by human labelers. The labels are observable behavioral cues that indicate learners' attentional shifts. Movements of the 1) eyebrows, 2) blinks, 3) mumbling, 4) hands, and 5) body serve as good predictors of learners' self-awareness of attention loss; we annotated 60 video samples with six labels, adding 6) a neutral state alongside the five attention regulation behavior labels. Additionally, we examined multimodal cues that are direct and indirect indicators of attention: knowledge gain, perceived learning experience, and perceived social presence with the interfaces (see readme.txt for descriptions of the indicators).

ER -