STAT-1828: Predictive Modeling by Using Integrative Machine Learning Algorithms for Healthcare Data | [4th International Conference of Sciences “Revamped Scientific Outlook of 21st Century, 2025”, November 12, 2025]

Author(s):

1. Nimra Ali: Rawalpindi Women University, Rawalpindi, Pakistan.

2. Saba Riaz: Rawalpindi Women University, Rawalpindi, Pakistan.

Abstract:

This study compares the predictive performance of ten machine learning algorithms for diabetes detection on two distinct datasets, one from the United States and one from Pakistan. The models are evaluated on six performance metrics: Accuracy, Precision, Recall, F1 Score, AUC, and Specificity. Ensemble-based models such as Random Forest, XGBoost, and AdaBoost performed exceptionally well on the Pakistani dataset, achieving near-perfect AUC values and accuracies exceeding 99%. Their performance declined markedly on the U.S. dataset, where the Neural Network achieved the highest accuracy, 75.25%. This disparity underscores the importance of regional data characteristics and suggests that predictive healthcare models must be tailored to specific population contexts. Overall, the findings indicate that region-specific training data and customized model selection are critical for improving prediction accuracy in clinical applications.
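The six metrics named in the abstract can be illustrated with a minimal sketch for a binary classifier. This is not the authors' evaluation code: the function names (`evaluate`, `roc_auc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made here for illustration, and AUC is computed with the standard pairwise-ranking definition rather than any particular library routine.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    specificity: float
    f1: float
    auc: float


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a metric is undefined (empty denominator).
    return numerator / denominator if denominator else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    # AUC as the probability that a randomly chosen positive case receives
    # a higher score than a randomly chosen negative case (ties count 0.5).
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> BinaryMetrics:
    # Hypothetical helper: threshold the scores, then count confusion-matrix cells.
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)

    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)  # also called sensitivity
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, len(y_true)),
        precision=precision,
        recall=recall,
        specificity=_ratio(tn, tn + fp),
        f1=_ratio(2 * precision * recall, precision + recall),
        auc=roc_auc(y_true, scores),
    )


# Toy data (made up for illustration; 1 = diabetic, 0 = non-diabetic).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
toy_metrics = evaluate(toy_labels, toy_scores)
```

Reporting specificity alongside recall matters in a screening setting because accuracy alone can look strong on an imbalanced dataset while one of the two error types remains high.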

Page(s): 185

DOI: not available

Published: Journal: 4th International Conference of Sciences “Revamped Scientific Outlook of 21st Century, 2025”, November 12, 2025, Volume: 1, Issue: 1, Year: 2025

Keywords:

Machine learning, BRFSS, Predictive modeling, USA diabetes dataset, Pakistani diabetes dataset
