%0 Generic %A Raman, Chirag %A Vargas Quiros, Jose %A Tan, Stephanie %A Islam, Ashraful %A Gedik, Ekin %A Hung, Hayley %D 2022 %T Annotations for ConfLab A Rich Multimodal Multisensor Dataset of Free-Standing Social Interactions In-the-Wild %U https://data.4tu.nl/articles/dataset/Annotations_for_ConfLab_A_Rich_Multimodal_Multisensor_Dataset_of_Free-Standing_Social_Interactions_In-the-Wild/20017664/1 %R 10.4121/20017664.v1 %K annotations %K conflab %K pose %K actions %K f-formations %X

This file contains the annotations for the ConfLab dataset, including actions (speaking status), pose, and F-formations. 

------------------

./actions/speaking_status:

./processed: the processed speaking status files, aggregated into a single data frame per segment. Skipped rows in the raw data (see https://josedvq.github.io/covfee/docs/output for details) have been imputed using the code at:  https://github.com/TUDelft-SPC-Lab/conflab/tree/master/preprocessing/speaking_status

    The processed annotations consist of:

        ./speaking: The first row contains person IDs matching the sensor IDs. The remaining rows contain binary speaking status annotations at 60 fps for the corresponding 2-min video segment (7200 frames).

        ./confidence: Same layout as ./speaking. The values are the annotators' continuous-valued ratings of confidence in their speaking annotations.

To load these files with pandas (p is the path to one CSV file): pd.read_csv(p, index_col=False)
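A minimal sketch of working with one processed file, assuming the layout described above (header row of person IDs, one row per frame at 60 fps). The helper names `load_speaking_status` and `frame_to_seconds` are hypothetical, and the in-memory CSV is a tiny synthetic stand-in, not real ConfLab data.

```python
import io

import pandas as pd

FPS = 60  # annotation rate stated in this README


def load_speaking_status(path_or_buffer):
    """Load one processed speaking-status (or confidence) CSV.

    Hypothetical helper: wraps the read_csv call given in the README.
    """
    # index_col=False: the first column holds data, not a row index.
    df = pd.read_csv(path_or_buffer, index_col=False)
    # The header row holds person IDs; normalise them to stripped strings.
    df.columns = [str(c).strip() for c in df.columns]
    return df


def frame_to_seconds(frame_idx, fps=FPS):
    """Convert a frame (row) index into seconds from the segment start."""
    return frame_idx / fps


# Synthetic stand-in: three people, three frames (a real file has 7200 rows).
example_csv = io.StringIO("1,2,3\n0,1,0\n1,1,0\n0,0,0\n")
example = load_speaking_status(example_csv)

# Fraction of frames in which each person is annotated as speaking.
speaking_fraction = example.mean()
```

With a real file, pass its path instead of the `io.StringIO` buffer; `frame_to_seconds(7200)` gives 120.0, matching the 2-min segment length.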


./raw.zip: the raw outputs from speaking status annotation for each of the eight annotated 2-min video segments. These were output by the covfee annotation tool (https://github.com/josedvq/covfee).

Annotations were done at 60 fps.

--------------------

./pose:

./coco: the processed pose files in COCO JSON format, aggregated into a single data frame per video segment. These files have been generated from the raw files using the code at: https://github.com/TUDelft-SPC-Lab/conflab-keypoints

    To load in Python: f = json.load(open('/path/to/cam2_vid3_seg1_coco.json'))

    The skeleton structure (limbs) is contained within each file in:

        f['categories'][0]['skeleton']

    and keypoint names at:

        f['categories'][0]['keypoints']
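    A minimal sketch of turning the skeleton into named limbs, assuming each skeleton entry is a pair of keypoint indices. The helper names, the example keypoint names, and the `index_base` parameter are hypothetical: the standard COCO convention uses 1-based indices, but whether these files follow it should be checked against the data.

```python
import json


def load_coco(path):
    """Read one COCO-format pose file and close it afterwards."""
    with open(path) as fh:
        return json.load(fh)


def named_limbs(coco, index_base=1):
    """Return the skeleton as (keypoint_name, keypoint_name) pairs.

    index_base is an assumption: 1 follows the usual COCO convention,
    use 0 if the indices in the file turn out to be zero-based.
    """
    category = coco["categories"][0]
    names = category["keypoints"]
    return [
        (names[a - index_base], names[b - index_base])
        for a, b in category["skeleton"]
    ]


# Synthetic stand-in with made-up keypoint names, not real ConfLab content.
example = {
    "categories": [
        {
            "keypoints": ["head", "neck", "leftShoulder"],
            "skeleton": [[1, 2], [2, 3]],
        }
    ]
}
limbs = named_limbs(example, index_base=1)
```

    For a real file, use `named_limbs(load_coco('/path/to/cam2_vid3_seg1_coco.json'))` and confirm that the resulting limb names look anatomically plausible before relying on the chosen `index_base`.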

./raw.zip: the raw outputs from continuous pose annotation. These were output by the covfee annotation tool (https://github.com/josedvq/covfee).

    Annotations were done at 60 fps.

---------------------

./f_formations:

seg 2: 14:00 onwards, for videos of the form x2xxx.MP4 in /video/raw/ for the relevant cameras (2,4,6,8,10). 

seg 3: for videos of the form x3xxx.MP4 in /video/raw/ for the relevant cameras (2,4,6,8,10). 

Note that camera 10 doesn't include meaningful subject information/body parts that are not already covered in camera 8. 

First column: time stamp

Second column: "()" delineates groups, "<>" delineates subjects, and "cam X" indicates the camera with the best view of a particular group.
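A minimal sketch of parsing the second column, assuming only what is stated above: groups are wrapped in "()" and each subject in "<>". The exact spacing, the placement of the "cam X" label, and the example string below are assumptions, so the sketch ignores the camera label and extracts group membership only. `parse_groups` is a hypothetical helper name.

```python
import re

# A group is everything between one pair of parentheses.
GROUP_RE = re.compile(r"\(([^()]*)\)")
# A subject is everything between one pair of angle brackets.
SUBJECT_RE = re.compile(r"<\s*([^<>]+?)\s*>")


def parse_groups(cell):
    """Return a list of groups, each a list of subject ID strings."""
    groups = []
    for group_match in GROUP_RE.finditer(cell):
        members = SUBJECT_RE.findall(group_match.group(1))
        groups.append(members)
    return groups


# Made-up example string; the real files may format this differently.
example_cell = "(<1> <2> <3>) cam 2 (<4> <5>) cam 4"
example_groups = parse_groups(example_cell)
```

Subject IDs are kept as strings so they can be matched against the person IDs in the other annotation files without assuming a numeric type.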


phone.csv: time stamp (pertaining to seg3), corresponding group, ID of person using the phone

%I 4TU.ResearchData