Author: Andrei Stefan
Date: 09-11-2023
Required files: data/anonymised_data/anonymised_data_demographic.csv, data/anonymised_data/anonymised_data_prescreening.csv, data/anonymised_data/anonymised_data_final.csv, data/anonymised_data/anonymised_data_post_questionnaire.csv
Output files: no output files
This file contains the code to reproduce the demographic results, which form part of Table 4.1.
# import all required packages
import pandas as pd
For the demographic data, we report the mean, standard deviation and range of the participants' ages, as well as the gender that participants identify as. In the prescreening questionnaire, participants completed the Godin-Shephard Leisure-Time Physical Activity Questionnaire; we count how many were insufficiently active/sedentary (a Godin score below 14) and how many were moderately active (a Godin score of 14 or higher). They were also asked a question that helps determine their transtheoretical model (TTM) stage.
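As a minimal sketch of the statistics described above, the snippet below computes the age summary and the Godin-Shephard split on a small made-up DataFrame. The data values and the helper name `summarise` are hypothetical and serve only as illustration; the column names `Age` and `Godin score` and the cut-off of 14 are taken from the analysis code in this file. Note that pandas' `std()` returns the sample standard deviation (`ddof=1`) by default.

```python
import pandas as pd

# Hypothetical toy data, for illustration only (not the study data)
toy = pd.DataFrame({
    "Age": [20, 25, 30, 35, 40],
    "Godin score": [5, 13, 14, 20, 8],
})


def summarise(df, cutoff=14):
    """Return age statistics and Godin-Shephard group counts as a dict."""
    age = df["Age"]
    godin = df["Godin score"]
    return {
        "age_mean": age.mean(),
        "age_std": age.std(),  # sample standard deviation (ddof=1)
        "age_range": (age.min(), age.max()),
        # scores below the cut-off count as insufficiently active/sedentary
        "insufficiently_active": int((godin < cutoff).sum()),
        # scores at or above the cut-off count as moderately active
        "moderately_active": int((godin >= cutoff).sum()),
    }


summary = summarise(toy)
print(summary)
```

Summing a boolean mask counts the matching rows, which is equivalent to taking `len()` of the filtered DataFrame.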
def demographic():
    """
    Function to print the demographic data, number of participants and samples, and number of people who finished the post-questionnaire.
    Args: none
    Returns: none.
    """
    # get the personal anonymised data (age and gender) into a dataframe
    df_age_gender = pd.read_csv("../data/anonymised_data/anonymised_data_demographic.csv")
    # get the prescreening data into another dataframe
    df_prescreening = pd.read_csv("../data/anonymised_data/anonymised_data_prescreening.csv")

    # calculate the necessary demographic details
    print("Total number of participants", len(df_age_gender))
    print()
    print("Total number of samples", len(pd.read_csv("../data/anonymised_data/anonymised_data_final.csv")))
    print()
    print("Number of people who finished the post-questionnaire", len(pd.read_csv("../data/anonymised_data/anonymised_data_post_questionnaire.csv")))
    print()
    # age statistics (std() is the sample standard deviation)
    print("Age mean", df_age_gender.loc[:, 'Age'].mean())
    print("Age standard deviation", df_age_gender.loc[:, 'Age'].std())
    print("Age range", df_age_gender.loc[:, 'Age'].min(), "-", df_age_gender.loc[:, 'Age'].max())
    print()
    # number of participants per gender
    print("Number of people who identify as a man", len(df_age_gender[df_age_gender["Gender"] == "Man (including Trans Male/Trans Man)"]))
    print("Number of people who identify as a woman", len(df_age_gender[df_age_gender["Gender"] == "Woman (including Trans Female/Trans Woman)"]))
    print("Number of people who identify as non-binary", len(df_age_gender[df_age_gender["Gender"] == "Non-binary (would like to give more detail)"]))
    print()
    # Godin-Shephard activity level, split at a score of 14
    print("Godin-Shephard Insufficiently Active/Sedentary", len(df_prescreening[df_prescreening["Godin score"] < 14]))
    print("Godin-Shephard Moderately Active", len(df_prescreening[14 <= df_prescreening["Godin score"]]))
    print()
    # number of participants per TTM stage
    print("TTM contemplating", len(df_prescreening[df_prescreening["TTM stage"] == "Contemplating"]))
    print("TTM preparing", len(df_prescreening[df_prescreening["TTM stage"] == "Preparing"]))

demographic()

Total number of participants 114

Total number of samples 519

Number of people who finished the post-questionnaire 110

Age mean 29.56140350877193
Age standard deviation 7.888644339628687
Age range 20 - 56

Number of people who identify as a man 56
Number of people who identify as a woman 54
Number of people who identify as non-binary 4

Godin-Shephard Insufficiently Active/Sedentary 98
Godin-Shephard Moderately Active 16

TTM contemplating 60
TTM preparing 54
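The per-category counts for a column such as `TTM stage` could also be obtained in a single call with `value_counts()`, which avoids writing one filter per category and makes any unexpected label visible. This is only a sketch of an alternative, not the approach used in this file, and the toy data below is hypothetical.

```python
import pandas as pd

# Hypothetical toy data, for illustration only (not the study data)
toy = pd.DataFrame({"TTM stage": ["Contemplating", "Preparing", "Contemplating"]})

# Count every distinct label in the column in one pass
stage_counts = toy["TTM stage"].value_counts()
print(stage_counts)
```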