Demographic data

Author: Andrei Stefan
Date: 09-11-2023
Required files: data/anonymised_data/anonymised_data_demographic.csv, data/anonymised_data/anonymised_data_prescreening.csv, data/anonymised_data/anonymised_data_final.csv, data/anonymised_data/anonymised_data_post_questionnaire.csv
Output files: no output files

This file contains the code to reproduce the demographic results, which form part of Table 4.1.

In [1]:
# import all required packages
import pandas as pd

Demographic data, Table 4.1

For the demographic data, we report the mean, standard deviation and range of the participants' ages, together with the gender that participants identify as. In the prescreening questionnaire, participants completed the Godin-Shephard leisure-time physical activity questionnaire, from which we count how many were insufficiently active/sedentary (score below 14) and how many were moderately active (score of 14 or above). They also answered a question that helps determine their transtheoretical model (TTM) stage.
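As a minimal sketch of the activity split described above, the snippet below labels a few made-up Godin scores using the cutoff of 14 that the next cell applies. The scores and the `classify_godin` helper are hypothetical and serve only as illustration; they are not part of the study data or the original analysis.

```python
import pandas as pd

# Hypothetical Godin scores, invented for illustration (not study data)
scores = pd.Series([5, 13, 14, 20], name="Godin score")


def classify_godin(score, cutoff=14):
    """Label a score using a single cutoff (assumed to be 14, as in the cell below)."""
    if score >= cutoff:
        return "Moderately Active"
    return "Insufficiently Active/Sedentary"


# apply the label to every score and count how often each label occurs
labels = scores.apply(classify_godin)
counts = labels.value_counts()
print(counts)
```

A score exactly equal to the cutoff falls into the moderately active group, which matches the `>=` comparison used in the analysis.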

In [2]:
def demographic():
    """
    Function to print the demographic data, number of participants and samples, and number of people who finished the post-questionnaire.
    
    Args: none
    Returns: none.
    """
    
    # get the personal anonymised data (age and gender) into a dataframe
    df_age_gender = pd.read_csv("../data/anonymised_data/anonymised_data_demographic.csv")
    
    # get the prescreening data into another dataframe
    df_prescreening = pd.read_csv("../data/anonymised_data/anonymised_data_prescreening.csv")
    
    # calculate the necessary demographic details
    print("Total number of participants", len(df_age_gender))
    print()
    print("Total number of samples", len(pd.read_csv("../data/anonymised_data/anonymised_data_final.csv")))
    print()
    print("Number of people who finished the post-questionnaire", len(pd.read_csv("../data/anonymised_data/anonymised_data_post_questionnaire.csv")))
    print()
    print("Age mean", df_age_gender.loc[:, 'Age'].mean())
    # pandas computes the sample standard deviation (ddof=1) by default
    print("Age standard deviation", df_age_gender.loc[:, 'Age'].std())
    print("Age range", df_age_gender.loc[:, 'Age'].min(),"-", df_age_gender.loc[:, 'Age'].max())
    print()
    print("Number of people who identify as a man", len(df_age_gender[df_age_gender["Gender"] == "Man (including Trans Male/Trans Man)"]))
    print("Number of people who identify as a woman", len(df_age_gender[df_age_gender["Gender"] == "Woman (including Trans Female/Trans Woman)"]))
    print("Number of people who identify as non-binary", len(df_age_gender[df_age_gender["Gender"] == "Non-binary (would like to give more detail)"]))
    print()
    print("Godin-Shephard Insufficiently Active/Sedentary", len(df_prescreening[df_prescreening["Godin score"] < 14]))
    print("Godin-Shephard Moderately Active", len(df_prescreening[df_prescreening["Godin score"] >= 14]))
    print()
    print("TTM contemplating", len(df_prescreening[df_prescreening["TTM stage"] == "Contemplating"]))
    print("TTM preparing", len(df_prescreening[df_prescreening["TTM stage"] == "Preparing"]))
    
demographic()
Total number of participants 114

Total number of samples 519

Number of people who finished the post-questionnaire 110

Age mean 29.56140350877193
Age standard deviation 7.888644339628687
Age range 20 - 56

Number of people who identify as a man 56
Number of people who identify as a woman 54
Number of people who identify as non-binary 4

Godin-Shephard Insufficiently Active/Sedentary 98
Godin-Shephard Moderately Active 16

TTM contemplating 60
TTM preparing 54
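The same kind of summary can be written more compactly. The sketch below is an assumed alternative, not the code used for Table 4.1: it runs on a small hypothetical dataframe (`df_toy`, with placeholder gender labels "A" and "B") and uses `Series.agg` for the age statistics and `Series.value_counts` for the per-category counts.

```python
import pandas as pd

# Hypothetical toy data reusing the column names of the demographic file
df_toy = pd.DataFrame({
    "Age": [20, 30, 40],
    "Gender": ["A", "B", "A"],  # placeholder labels, not the real categories
})

# mean, sample standard deviation (ddof=1, the pandas default), min and max in one call
age_summary = df_toy["Age"].agg(["mean", "std", "min", "max"])

# number of rows per category, instead of one boolean filter per label
gender_counts = df_toy["Gender"].value_counts()

print(age_summary)
print(gender_counts)
```

One difference to keep in mind: `value_counts` only lists categories that occur in the data, whereas the explicit filters in the cell above print a count of 0 for a category with no participants.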