TY - DATA
T1 - R code underlying the analysis reported in the manuscript "Machine learning approach for pitch type classification based on pelvis and trunk kinematics captured with wearable sensors"
PY - 2023/11/09
AU - Larisa Gomaz
UR -
DO - 10.4121/e339176b-0ecd-48e5-bc7e-9b587c0a8959.v1
KW - baseball
KW - kinematics
KW - pitch type
KW - classification
KW - wearables
KW - sensors
N2 -
The study used classifiers available through the caret R package: K-Nearest Neighbours (KNN), Naive Bayes (NB), Random Forest (RF) and Support Vector Machine (SVM). We evaluated these classifiers on both binary and multiclass classification, adding Logistic Regression (LOGREG) for the binary task and Multinomial Logistic Regression (MNOM) for the multiclass task.
We used a database created by PITCHPERFECT that characterises each pitch with three features taken directly from the system: pelvis peak angular velocity, trunk peak angular velocity and the separation time between them. Data were pre-processed and analysed using the R programming language (version 4.3.1). All continuous features were scaled and centred.
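As a rough illustration of what scaling and centring means for the three features, the sketch below standardises each column to zero mean and unit sample standard deviation. It is written in plain Python for illustration only and is not the deposited R code. The function names, the feature values and their units are hypothetical, and estimating the parameters on the training rows alone is an assumption that the record does not confirm.

```python
from statistics import mean, stdev


def fit_scaler(rows):
    """Estimate a (mean, sample sd) pair for every feature column."""
    columns = list(zip(*rows))
    return [(mean(col), stdev(col)) for col in columns]


def apply_scaler(rows, params):
    """Centre and scale every row using previously estimated parameters."""
    return [[(x - m) / s for x, (m, s) in zip(row, params)] for row in rows]


# Hypothetical rows: pelvis peak angular velocity, trunk peak angular
# velocity, separation time. Values and units are invented for illustration.
train_rows = [
    [600.0, 900.0, 0.030],
    [650.0, 950.0, 0.040],
    [700.0, 1000.0, 0.050],
]
test_rows = [[675.0, 975.0, 0.045]]

params = fit_scaler(train_rows)                  # estimated on training rows only (assumption)
train_scaled = apply_scaler(train_rows, params)
test_scaled = apply_scaler(test_rows, params)    # reuse the training parameters
```

Reusing the training parameters for the held-out rows keeps information from the test set out of the pre-processing step, which is one common way to arrange it.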
We split the data into training (80%) and testing (20%) sets. To obtain a fair estimate of how well the classifiers generalise, Leave-One-Group-Out Cross-Validation (LOGO-CV) was carried out on the training set. Classifier performance was evaluated using four criteria: Accuracy, Sensitivity, Precision and F1-score. Hyperparameters were tuned using grid search, the default method for optimising tuning parameters in the caret package. Feature selection was based on correlation analysis. Because the correlation between the features was low, the models were trained and tested using all variables derived from the PITCHPERFECT system.
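The fold construction behind LOGO-CV and the four evaluation criteria can be sketched as follows. This is a minimal plain-Python illustration, not the deposited R/caret code. The grouping variable (shown here as pitcher IDs), the pitch labels "FB" and "CB", and all function names are hypothetical, since the record does not state what defines a group or which pitch types were classified.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out_group, train_idx, test_idx), one fold per distinct group."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for g, test_idx in by_group.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(groups)) if i not in held_out]
        yield g, train_idx, test_idx


def binary_metrics(y_true, y_pred, positive):
    """Accuracy, sensitivity (recall), precision and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "precision": precision, "f1": f1}


# Hypothetical group labels: every pitch of one group is held out together.
groups = ["p1", "p1", "p2", "p2", "p3"]
folds = list(leave_one_group_out(groups))

# Hypothetical true and predicted pitch types for a binary task.
y_true = ["FB", "FB", "CB", "FB", "CB"]
y_pred = ["FB", "CB", "CB", "FB", "FB"]
scores = binary_metrics(y_true, y_pred, positive="FB")
```

Holding out a whole group at a time means no fold is scored on pitches from a group that also contributed to training. For the multiclass task, the same per-class counts could be computed for each pitch type in turn and then averaged.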
ER -