Supplementary data for the Master Thesis: Inferring Personality from GitHub Communication Data: Promises & Perils (dataset)

doi: 10.4121/uuid:6b648676-26f4-4eb1-89dc-050810909b3b

Datacite citation style:

van Mil, Frenk (2020): Supplementary data for the Master Thesis: Inferring Personality from GitHub Communication Data: Promises & Perils. Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/uuid:6b648676-26f4-4eb1-89dc-050810909b3b

Dataset

usage stats

1167 views, 313 downloads

categories

Other Information and Computing Sciences

keywords

master thesis, Personality Inference, Psycholinguistic models, Software engineering

licence

CC0

by Frenk van Mil

This dataset contains the personality scores of participants in the study "Inferring Personality from GitHub Communication Data: Promises & Perils", which investigates unconventional methods to infer personality from developers' communication on GitHub. The data contains Big Five personality scores from three psycholinguistic methods (i.e., Yarkoni, Golbeck, and Personality Insights) for more than 83,000 people. The scores are derived from the comments people wrote on GitHub. For each person, the data includes the personality scores, the number of words analyzed, whether the person indicated being fluent in English, and whether English is their mother tongue.
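A first look at the data could start by listing the columns and counting the records. The sketch below is a minimal example under assumptions this page does not confirm: that thesis_dataset_anonymized.csv is comma-delimited, UTF-8 encoded, and has a header row. The actual column names are not listed here (README.txt should document them), so the function `summarize_csv` is a hypothetical helper that only reports what it finds.

```python
import csv


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file.

    Assumes a comma-delimited, UTF-8 file whose first row is a header;
    these are assumptions, not facts stated on the dataset page.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])           # empty list if the file is empty
        row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return header, row_count


# Example use, once the file has been downloaded:
# columns, n_rows = summarize_csv("thesis_dataset_anonymized.csv")
# print(columns, n_rows)
```

If the assumptions hold, the row count should be in line with the "more than 83,000 people" stated in the description.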

history

2020-06-25 first online, published, posted

publisher

4TU.Centre for Research Data

format

media types: text/csv, text/plain

references

http://resolver.tudelft.nl/uuid:c58343a1-3700-42c4-bc81-aacca9f54248

contributors

Rastogi, A. (Ayushi)
Zaidman, A. (Andy)

DATA

files (2)

README.txt (4,018 bytes, MD5: 936590a38aebbf47cd80b304179492bc)
thesis_dataset_anonymized.csv (2,643,552 bytes, MD5: e44198463f0847bc041ea5afde7ae508)
all files: 2,647,570 bytes unzipped
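The MD5 checksums listed for the two files can be used to check that a download is intact. The following is a minimal sketch using Python's standard `hashlib`; the checksum values are copied from the file list above, while the function names `md5_of` and `verify` and the assumption that both files sit in one local directory are illustrative only.

```python
import hashlib
from pathlib import Path

# Checksums as listed on the dataset page.
EXPECTED_MD5 = {
    "README.txt": "936590a38aebbf47cd80b304179492bc",
    "thesis_dataset_anonymized.csv": "e44198463f0847bc041ea5afde7ae508",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each expected file name to True (match), False (mismatch),
    or None (file not found in the given directory)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == expected) if path.exists() else None
    return results


# Example use, assuming the files were unzipped into the current directory:
# print(verify("."))
```

MD5 is adequate here for detecting accidental corruption during download; it is not a safeguard against deliberate tampering.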