Feature Selection Using Weight Methods in Cardiovascular Disease Dataset

Sri Sumarlinda; Wiji Lestari; Faulinda Ely Nastiti

doi:10.32628/CSEIT2511164

Authors

Sri Sumarlinda, Information System Department, Universitas Duta Bangsa Surakarta, Surakarta, Indonesia
Wiji Lestari, Information System Department, Universitas Duta Bangsa Surakarta, Surakarta, Indonesia
Faulinda Ely Nastiti, Information System Department, Universitas Duta Bangsa Surakarta, Surakarta, Indonesia

Keywords:

Feature Selection, Weight Methods, Cardiovascular Diseases, Prediction Model, Feature Weight

Abstract

This study investigates the significance of clinical features in predicting cardiovascular disease through feature weighting using six selection methods: Information Gain, Gain Ratio, By Rule, Gini Index, Support Vector Machine (SVM), and Principal Component Analysis (PCA). The analysis reveals notable variations in feature importance across methods, reflecting the strengths and limitations of each approach. Among the six evaluated features—age, body mass index (BMI), systolic blood pressure, diastolic blood pressure, cholesterol, and blood sugar—age consistently emerges as the most influential predictor, achieving the highest scores in SVM (1.33) and By Rule (0.91), and demonstrating strong relevance in Information Gain (0.49) and Gain Ratio (0.59). In contrast, BMI exhibits inconsistent importance, with moderate scores in By Rule (0.67) and Gain Ratio (0.28), but negligible or negative values in SVM (–0.11) and PCA (0.00), indicating method-dependent relevance. Systolic blood pressure shows a moderate and stable influence, while diastolic blood pressure contributes minimally across most methods. Cholesterol is particularly significant in PCA (0.94), suggesting its importance in multivariate contexts, despite low scores in other methods. Blood sugar demonstrates moderate relevance, with its highest scores in By Rule (0.67) and PCA (0.30). Overall, the results highlight age and cholesterol as the most consistent and influential features, while other attributes show varying levels of importance depending on the analytical technique applied.
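As a rough illustration of how three of the weighting methods named above (Information Gain, Gain Ratio, and Gini Index) assign a score to a feature, the following minimal sketch computes them on a tiny made-up dataset. The toy records, the band labels, and all function names are assumptions for illustration only; they are not the study's data, tooling, or reported weights, and the other three methods (By Rule, SVM, PCA) are not covered here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gini(values):
    """Gini impurity of a list of discrete values."""
    n = len(values)
    return 1.0 - sum((c / n) ** 2 for c in Counter(values).values())


def split_by_value(feature, labels):
    """Group the class labels by the value the feature takes."""
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    return groups


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on the feature."""
    n = len(labels)
    groups = split_by_value(feature, labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


def gini_decrease(feature, labels):
    """Reduction in Gini impurity after splitting on the feature."""
    n = len(labels)
    groups = split_by_value(feature, labels)
    weighted = sum(len(g) / n * gini(g) for g in groups.values())
    return gini(labels) - weighted


# Hypothetical, hand-made records: features are already discretised into bands.
toy_features = {
    "age_band": ["old", "old", "old", "old", "young", "young", "young", "young"],
    "bmi_band": ["high", "low", "high", "low", "high", "low", "high", "low"],
}
disease = [1, 1, 1, 0, 0, 0, 0, 1]  # 1 = disease present (made-up labels)

weights = {
    name: {
        "information_gain": information_gain(column, disease),
        "gain_ratio": gain_ratio(column, disease),
        "gini_decrease": gini_decrease(column, disease),
    }
    for name, column in toy_features.items()
}

for name, scores in weights.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

In this toy data the age band separates the classes better than the BMI band, so it receives the larger weight under all three criteria. Continuous clinical measurements would need to be discretised (or handled with threshold splits) before these formulas apply, which is a design choice this sketch leaves open.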
