An Ensemble Framework of Multi-ratio Undersampling-based Imbalanced Classification (pp030-046)
Takahiro Komamizu, Yasuhiro Ogawa, and Katsuhiko Toyama
doi: https://doi.org/10.26421/JDI2.1-2
Abstract:
Class imbalance is commonly observed in real-world data, and it is problematic because it degrades classification performance through biased supervision. Undersampling is an effective resampling approach to class imbalance. Conventional undersampling-based approaches use a single fixed sampling ratio. However, different sampling ratios have different preferences toward classes. In this paper, an undersampling-based ensemble framework, MUEnsemble, is proposed. The framework combines weak classifiers trained at different sampling ratios, and it allows a flexible design for weighting the weak classifiers at each sampling ratio. To demonstrate the principle of the design, a uniform weighting function and a Gaussian weighting function are presented. An extensive experimental evaluation shows that MUEnsemble outperforms state-of-the-art undersampling-based and oversampling-based methods in terms of recall, G-mean, F-measure, and ROC-AUC. The evaluation also shows that the Gaussian weighting function is superior to the uniform weighting function, which indicates that the Gaussian weighting function can capture the different preferences of sampling ratios toward classes. An investigation into the effects of the parameters of the Gaussian weighting function shows that these parameters can be chosen in terms of recall, which is preferred in many real-world applications.
Key words: imbalanced classification, resampling, undersampling, ensemble
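
The idea described in the abstract, weak classifiers trained at several sampling ratios with a weighting function deciding how much each ratio counts, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a sampling ratio means the number of sampled majority examples per minority example, that a weighting function allocates a number of weak classifiers to each ratio, and that predictions are combined by majority vote. The names `uniform_weights`, `gaussian_weights`, `CentroidClassifier`, `train_ensemble`, and `predict`, as well as the parameters `mu`, `sigma`, and `total`, are hypothetical and chosen only for illustration.

```python
import math
import random


def uniform_weights(ratios, total):
    """Assumed uniform weighting: the same number of weak classifiers per ratio."""
    per_ratio = max(1, total // len(ratios))
    return [per_ratio] * len(ratios)


def gaussian_weights(ratios, mu, sigma, total):
    """Assumed Gaussian weighting: more weak classifiers for ratios near mu."""
    raw = [math.exp(-((r - mu) ** 2) / (2 * sigma ** 2)) for r in ratios]
    scale = total / sum(raw)
    # Keep at least one weak classifier per ratio so every ratio is represented.
    return [max(1, round(w * scale)) for w in raw]


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


class CentroidClassifier:
    """Stand-in weak learner (nearest centroid); any classifier could be used."""

    def fit(self, majority, minority):
        self.c_maj = centroid(majority)
        self.c_min = centroid(minority)
        return self

    def predict(self, x):
        # Label 1 is the minority class, label 0 the majority class.
        return 1 if math.dist(x, self.c_min) < math.dist(x, self.c_maj) else 0


def train_ensemble(majority, minority, ratios, counts, seed=0):
    """Train counts[i] weak classifiers at sampling ratio ratios[i]."""
    rng = random.Random(seed)
    ensemble = []
    for ratio, count in zip(ratios, counts):
        # Assumption: ratio = sampled majority size / minority size.
        k = min(len(majority), max(1, round(ratio * len(minority))))
        for _ in range(count):
            subset = rng.sample(majority, k)  # undersample without replacement
            ensemble.append(CentroidClassifier().fit(subset, minority))
    return ensemble


def predict(ensemble, x):
    """Majority vote over all weak classifiers; ties go to the minority class."""
    votes = sum(clf.predict(x) for clf in ensemble)
    return 1 if 2 * votes >= len(ensemble) else 0


if __name__ == "__main__":
    rng = random.Random(42)
    majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    minority = [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(20)]
    ratios = [0.5, 1.0, 2.0]
    counts = gaussian_weights(ratios, mu=1.0, sigma=0.5, total=10)
    model = train_ensemble(majority, minority, ratios, counts)
    print(counts, predict(model, (5.0, 5.0)), predict(model, (0.0, 0.0)))
```

In this sketch, swapping `gaussian_weights` for `uniform_weights` changes only how many weak classifiers each ratio receives, which mirrors the flexible weighting design the abstract mentions. A nearest-centroid learner is used only to keep the example dependency-free; it is largely insensitive to the sampling ratio, so a real experiment would need a weak learner whose decisions shift with class proportions.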