Research Article

EVALUATION OF PERFORMANCE OF CLASSIFICATION ALGORITHMS IN PREDICTION OF HEART FAILURE DISEASE

Year 2022, Volume 25, Issue 4, 622 - 632, 03.12.2022
https://doi.org/10.17780/ksujes.1144570

Abstract

The success rates and performance of five machine learning classifiers (Gaussian Naive Bayes, Support Vector Machines, Linear Discriminant Analysis, Decision Tree, and Random Forest) were evaluated on the Heart Failure Prediction dataset. In preprocessing, a label encoder was applied first, converting the five categorical fields in the dataset into numerical data. In addition, one field was found to contain negative values; it was rescaled to the range 0 - 1 with min-max normalization. After preprocessing, the data were analyzed with the classification algorithms, using 80% of the data for training and 20% for testing. A success rate of 90.76% was achieved with the random forest algorithm, which is an ensemble classifier. Of the 184 observations in the test set, 102 were patients with heart failure and 82 were people without the disease. The random forest algorithm correctly identified 93.1% of those with heart failure (95 observations) and 87.8% of those without the disease (72 observations).
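The preprocessing steps and the reported rates can be illustrated with a minimal Python sketch. It is not the authors' code: the helper names `label_encode` and `min_max_scale` and the toy column values are hypothetical, and the count of test observations without the disease is assumed to be 184 - 102 = 82, which is the value consistent with the reported 87.8% and 90.76%.

```python
def label_encode(values):
    """Map each distinct category to an integer code, in sorted order."""
    codes = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes


def min_max_scale(values):
    """Rescale numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical toy columns, not taken from the dataset.
category_column = ["ATA", "NAP", "ASY", "NAP"]
numeric_column = [-1.0, 0.0, 1.5, 3.0]  # contains a negative value

encoded, mapping = label_encode(category_column)
scaled = min_max_scale(numeric_column)

# Counts reported in the abstract for the random forest classifier.
test_size = 184
with_disease = 102
without_disease = test_size - with_disease  # assumed: 82
correct_with_disease = 95
correct_without_disease = 72

# Per-class success rates and overall accuracy.
rate_with_disease = correct_with_disease / with_disease
rate_without_disease = correct_without_disease / without_disease
accuracy = (correct_with_disease + correct_without_disease) / test_size

print(encoded, mapping)
print(scaled)
print(f"with disease:    {rate_with_disease:.1%}")
print(f"without disease: {rate_without_disease:.1%}")
print(f"overall:         {accuracy:.2%}")
```

Under these assumptions the three computed rates round to 93.1%, 87.8%, and 90.76%, matching the figures in the abstract.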

Acknowledgements

This study was carried out in Siirt University Engineering Faculty Human-Computer Interaction Laboratory. The authors of this article thank the Human-Computer Interaction Laboratory staff for their support.



Details

Primary Language English
Subjects Computer Software
Journal Section Computer Engineering
Authors

Cevdet Coşkun 0000-0002-1351-0590

Fatma Kuncan 0000-0003-0712-6426

Publication Date December 3, 2022
Submission Date July 18, 2022
Published in Issue Year 2022, Volume 25, Issue 4

Cite

APA Coşkun, C., & Kuncan, F. (2022). EVALUATION OF PERFORMANCE OF CLASSIFICATION ALGORITHMS IN PREDICTION OF HEART FAILURE DISEASE. Kahramanmaraş Sütçü İmam Üniversitesi Mühendislik Bilimleri Dergisi, 25(4), 622-632. https://doi.org/10.17780/ksujes.1144570