Microclustering-Based Multi-Class Classification on Imbalanced Multi-Relational Datasets

Hemlata Pant, Reena Srivastava
Copyright: © 2022 | Volume: 17 | Issue: 1 | Pages: 13
ISSN: 1554-1045 | EISSN: 1554-1053 | EISBN13: 9781799894001 | DOI: 10.4018/IJITWE.304053
APA

Pant, H. & Srivastava, R. (2022). Microclustering-Based Multi-Class Classification on Imbalanced Multi-Relational Datasets. International Journal of Information Technology and Web Engineering (IJITWE), 17(1), 1-13. http://doi.org/10.4018/IJITWE.304053

Abstract

In a relational database, classification algorithms look for patterns across several interconnected relations. Most multi-relational classification methods implicitly assume that the classes in the target relation are equally represented, so they tend to perform poorly on imbalanced datasets. In this paper, the authors propose an algorithm-level method, MCMRC_IB, for classifying imbalanced multi-relational datasets. The proposed method extends MCMRC, which is designed for balanced datasets. MCMRC_IB exploits a property of imbalanced datasets: the minority class is represented by a smaller number of records, usually 20-30% of the total, and these records are handled accordingly by assigning them weights. The proposed method can handle multiple classes. Experimental results confirm the efficiency of the proposed method in terms of predictive accuracy, F-measure, and G-mean.
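
The abstract does not spell out how MCMRC_IB weights minority records or how the reported metrics are computed. The sketch below is only an illustration under stated assumptions: it uses a common inverse-frequency weighting (not necessarily the authors' scheme) and the usual multi-class definitions of G-mean (geometric mean of per-class recall) and macro-averaged F-measure. All function names are hypothetical.

```python
from collections import Counter
from math import prod


def inverse_frequency_weights(labels):
    """Assumed weighting: weight(c) = n / (k * count(c)), so rarer classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall; zero if any class is never recovered."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1 / len(recalls))


def macro_f_measure(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    totals = Counter(y_true)
    predicted = Counter(y_pred)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    scores = []
    for c in totals:
        precision = hits[c] / predicted[c] if predicted[c] else 0.0
        recall = hits[c] / totals[c]
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy target relation with three classes; "c" is the minority class.
y_true = ["a"] * 6 + ["b"] * 3 + ["c"] * 1
majority_only = ["a"] * 10  # a classifier that ignores the minority classes

weights = inverse_frequency_weights(y_true)
accuracy = sum(t == p for t, p in zip(y_true, majority_only)) / len(y_true)
print(weights)
print(accuracy, g_mean(y_true, majority_only), macro_f_measure(y_true, majority_only))
```

On this toy data the majority-only predictor still reaches 60% accuracy while its G-mean is zero, which is why imbalanced-data studies such as this one report F-measure and G-mean alongside accuracy.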