Proximate Breast Cancer Factors Using Data Mining Classification Techniques

Alice Constance Mensah, Isaac Ofori Asare

Source Title: International Journal of Big Data and Analytics in Healthcare (IJBDAH) 4(1)

ISSN: 2379-738X | EISSN: 2379-7371 | EISBN13: 9781522568605 | DOI: 10.4018/IJBDAH.2019010104

MLA

Mensah, Alice Constance, and Isaac Ofori Asare. "Proximate Breast Cancer Factors Using Data Mining Classification Techniques." IJBDAH vol.4, no.1 2019: pp.47-56. http://doi.org/10.4018/IJBDAH.2019010104

APA

Mensah, A. C. & Asare, I. O. (2019). Proximate Breast Cancer Factors Using Data Mining Classification Techniques. International Journal of Big Data and Analytics in Healthcare (IJBDAH), 4(1), 47-56. http://doi.org/10.4018/IJBDAH.2019010104

Chicago

Mensah, Alice Constance, and Isaac Ofori Asare. "Proximate Breast Cancer Factors Using Data Mining Classification Techniques," International Journal of Big Data and Analytics in Healthcare (IJBDAH) 4, no.1: 47-56. http://doi.org/10.4018/IJBDAH.2019010104

Abstract

Breast cancer is the most common of all cancers and the leading cause of cancer deaths in women worldwide. Classifying breast cancer data can help predict disease outcomes and reveal the genetic behavior of tumors. Data mining techniques support this by identifying potential cancer patients from an analysis of patient data alone. This study examines the determinant factors of breast cancer and uses breast cancer patient data to build a useful classification model with a data mining approach. Of the 2397 women in the study, 1022 (42.64%) were diagnosed with breast cancer. Four learning techniques were used: Random Forest, Naive Bayes, Classification and Regression Tree (CART), and Boosted Tree. Random Forest had the highest accuracy, 0.9892 (95% CI, 0.9832-0.9935), and a sensitivity of about 92%. These results indicate that, among the four models compared, Random Forest is the best for classifying and predicting breast cancer from its associated factors.
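The accuracy, confidence interval, and sensitivity quoted above are all derived from a classifier's confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' code: the confusion-matrix counts are hypothetical, the function names are illustrative, and the Wilson score interval is an assumption, since the abstract does not state which interval method was used. Only the cohort figures (1022 of 2397) come from the abstract.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp, fn):
    """Share of true cancer cases that the model flagged (recall)."""
    return tp / (tp + fn)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%).

    Assumption: the abstract does not name its interval method, so this
    is one common choice rather than the one used in the study.
    """
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Cohort figures reported in the abstract.
n_women, n_cancer = 2397, 1022
prevalence_pct = round(100 * n_cancer / n_women, 2)

# Hypothetical test-set counts, chosen only to illustrate the formulas.
tp, fn, tn, fp = 92, 8, 130, 2

acc = accuracy(tp, fp, tn, fn)
sens = sensitivity(tp, fn)
ci_low, ci_high = wilson_interval(tp + tn, tp + fp + tn + fn)

print(f"prevalence: {prevalence_pct}%")
print(f"accuracy: {acc:.4f} (95% CI, {ci_low:.4f}-{ci_high:.4f})")
print(f"sensitivity: {sens:.2%}")
```

The interval narrows as the number of evaluated cases grows, which is why a tight interval such as the one reported implies a reasonably large evaluation set.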