Different Approaches to Reducing Bias in Classification of Medical Data by Ensemble Learning Methods

Adem Doganer
Copyright: © 2021 | Volume: 6 | Issue: 2 | Pages: 16
ISSN: 2379-738X | EISSN: 2379-7371 | EISBN13: 9781799862994 | DOI: 10.4018/IJBDAH.20210701.oa2

Citation (APA)

Doganer, A. (2021). Different Approaches to Reducing Bias in Classification of Medical Data by Ensemble Learning Methods. International Journal of Big Data and Analytics in Healthcare (IJBDAH), 6(2), 15-30. http://doi.org/10.4018/IJBDAH.20210701.oa2



Abstract

In this study, different models were created to reduce bias using ensemble learning methods, since reducing the bias error improves classification performance. To maximize classification performance, the most appropriate ensemble learning method and the ideal sample size were investigated, and the bias values and learning performances of the different ensemble learning methods were compared. The AdaBoost ensemble learning method yielded the lowest bias value at a sample size of n = 250, while the Stacking ensemble learning method yielded the lowest bias at n = 500, 750, 1000, 2000, 4000, 6000, 8000, 10000, and 20000. When learning performances were compared, the AdaBoost ensemble learning method with an RBF classifier achieved the best performance at n = 250 (ACC = 0.956, AUC = 0.987), and the AdaBoost ensemble learning method with a REPTree classifier achieved the best performance at n = 20000 (ACC = 0.990, AUC = 0.999). In conclusion, for reducing bias, methods based on stacking displayed higher performance than the other methods.
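The kind of comparison described in the abstract can be sketched with scikit-learn. This is a minimal illustration, not the study's actual pipeline: the synthetic dataset, the base learners in the stack, and the cross-validated AUC metric are all assumptions standing in for the paper's medical data and full method grid.

```python
# Hedged sketch: compare AdaBoost and Stacking ensembles on a synthetic
# binary-classification task, scoring each by cross-validated AUC.
# Dataset, base learners, and sample size are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a medical dataset at one of the studied sample sizes.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

ada = AdaBoostClassifier(n_estimators=50, random_state=0)
stack = StackingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)

# 5-fold cross-validated AUC for each ensemble method.
for name, model in [("AdaBoost", ada), ("Stacking", stack)]:
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: mean AUC = {auc:.3f}")
```

Repeating this loop over several values of `n_samples` would mirror the study's design of comparing methods across sample sizes.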