In this file we explain all columns of the dataset `thesis_dataset_anonymized.csv`.
Word count - Number of words, given as a range in [lowerbound-upperbound] format. The upper bound may be omitted for an open-ended range, as in [47000-].
PI Openness - Openness score obtained with Personality Insights. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
PI Conscientiousness - Conscientiousness score obtained with Personality Insights. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
PI Extraversion - Extraversion score obtained with Personality Insights. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
PI Agreeableness - Agreeableness score obtained with Personality Insights. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
PI Neuroticism - Neuroticism score obtained with Personality Insights. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
Yarkoni Openness - Openness score obtained with Yarkoni. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
Yarkoni Conscientiousness - Conscientiousness score obtained with Yarkoni. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
Yarkoni Extraversion - Extraversion score obtained with Yarkoni. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
Yarkoni Agreeableness - Agreeableness score obtained with Yarkoni. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
Yarkoni Neuroticism - Neuroticism score obtained with Yarkoni. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
Golbeck Openness - Openness score obtained with Golbeck. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
Golbeck Conscientiousness - Conscientiousness score obtained with Golbeck. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
Golbeck Extraversion - Extraversion score obtained with Golbeck. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
Golbeck Agreeableness - Agreeableness score obtained with Golbeck. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
Golbeck Neuroticism - Neuroticism score obtained with Golbeck. Value between 0 and 1, rounded to two decimal places. All scores are normalized relative to the whole dataset.
Ground-truth Openness - Openness score obtained from the ground truth (questionnaire). May be empty. Value between 0 and 1, rounded to two decimal places.
Ground-truth Conscientiousness - Conscientiousness score obtained from the ground truth (questionnaire). May be empty. Value between 0 and 1, rounded to two decimal places.
Ground-truth Extraversion - Extraversion score obtained from the ground truth (questionnaire). May be empty. Value between 0 and 1, rounded to two decimal places.
Ground-truth Agreeableness - Agreeableness score obtained from the ground truth (questionnaire). May be empty. Value between 0 and 1, rounded to two decimal places.
Ground-truth Neuroticism - Neuroticism score obtained from the ground truth (questionnaire). May be empty. Value between 0 and 1, rounded to two decimal places.
Continent - The continent in which the participant spent most of their youth. May be empty.
Fluent - Whether the participant considers their written English to be fluent. Values can be 'Yes', 'No', 'Maybe', or empty.
Mothertongue - Whether the participant has English as their mother tongue. Values can be 'Yes', 'No', 'Maybe', or empty.
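
The column formats described above could be read along the following lines. This is only a minimal sketch under several assumptions: that the CSV headers match the column names listed here, that empty values are stored as empty cells, and that the word count cell may or may not include the square brackets. The helper names `parse_word_count` and `parse_score` and the two sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io
from typing import Optional, Tuple


def parse_word_count(value: str) -> Tuple[int, Optional[int]]:
    """Parse 'lower-upper' or 'lower-' (brackets optional) into (lower, upper)."""
    text = value.strip().strip("[]")
    lower, _, upper = text.partition("-")
    # An omitted upper bound (as in [47000-]) becomes None.
    return int(lower), (int(upper) if upper.strip() else None)


def parse_score(value: str) -> Optional[float]:
    """Return the score as a float, or None when the cell is empty."""
    value = value.strip()
    return float(value) if value else None


# Hypothetical two-row sample; the values are made up for illustration only.
sample = io.StringIO(
    "Word count,Ground-truth Openness,Fluent\n"
    "[1000-2000],0.62,Yes\n"
    "[47000-],,\n"
)

rows = []
for row in csv.DictReader(sample):
    rows.append({
        "word_count": parse_word_count(row["Word count"]),
        "gt_openness": parse_score(row["Ground-truth Openness"]),
        "fluent": row["Fluent"] or None,  # empty cell -> None
    })

print(rows)
```

To read the real file, the `io.StringIO` sample would be replaced by `open("thesis_dataset_anonymized.csv", newline="")`, assuming the file uses a comma delimiter.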