Author(s):
1. Ghulam Fatima:
College of Statistical and Actuarial Sciences, University of the Punjab, Pakistan
2. Sana Saeed:
College of Statistical and Actuarial Sciences, University of the Punjab, Pakistan
Abstract:
In the data mining community, data sets with imbalanced class distributions have received growing attention. The
evolving field of data mining and knowledge discovery seeks to establish precise and effective computational tools
for analysing such data sets and extracting new information from them. Sampling methods re-balance imbalanced
data sets and consequently improve the performance of classifiers. In the classification of imbalanced data
sets, over-fitting and under-fitting are the two prominent problems. In this study, a novel weighted ensemble method is
proposed to reduce the influence of over-fitting and under-fitting when classifying these kinds of data sets. Forty
imbalanced data sets with varying imbalance ratios are used to conduct a comparative study. The performance of the
proposed method is compared with four conventional classifiers: decision tree (DT), k-nearest neighbor (KNN),
support vector machines (SVM), and neural network (NN). This evaluation is carried out with two over-sampling
procedures, the adaptive synthetic sampling approach (ADASYN) and the synthetic minority over-sampling technique
(SMOTE). The proposed scheme was effective in reducing the impact of over-fitting and under-fitting on
the classification of these data sets.
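To make the over-sampling step concrete, here is a minimal sketch of the interpolation idea behind SMOTE: each synthetic minority point is placed on the line segment between a minority sample and one of its nearest minority neighbours. This illustrates the general technique only and is not the authors' implementation; the function name `smote_sketch`, its parameters, and the toy data are assumptions introduced for this example.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; assumes at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Up to k nearest minority neighbours of the base point, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor, uniform in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class (made-up values, not taken from the study).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority points, it always lies within the region already spanned by the minority class. ADASYN uses the same interpolation but generates more points around minority samples that are harder to learn.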
Page(s):
483-496
Published:
Journal: Pakistan Journal of Statistics and Operation Research, Volume: 17, Issue: 2, Year: 2021
Keywords:
Weighted Method, Imbalanced Data Sets, OverSampling Techniques, Ensemble Method, OverFitting, UnderFitting