An Iterated Greedy Algorithm for Improving the Generation of Synthetic Patterns in Imbalanced Learning

Authors

Francisco Javier Maestre-Garcia, Carlos Garcia-Martinez, María Pérez Ortiz, Pedro Antonio Gutierrez

External publication

No

Venue

Lect. Notes Comput. Sci.

Scope

Proceedings Paper

Nature

Scientific

JCR Quartile

SJR Quartile

SJR Impact

0.295

Publication date

01/01/2017

ISI

000443108700044

Scopus Id

2-s2.0-85020877672

Abstract

Real-world classification datasets often present a skewed distribution of patterns, where one or more classes are under-represented with respect to the rest. One of the most successful approaches for alleviating this problem is the generation of synthetic minority samples by convex combination of available ones. Within this framework, adaptive synthetic (ADASYN) sampling is a relatively new method which imposes weights on minority examples according to their learning complexity, in such a way that difficult examples are more prone to be over-sampled. This paper proposes an improvement of the ADASYN method, where the learning complexity of these patterns is also used to decide which sample of the neighbourhood is selected. Moreover, to avoid suboptimal results when performing the random convex combination, this paper explores the application of an iterative greedy algorithm which refines the synthetic patterns by repeatedly replacing a part of them. For the experiments, six binary datasets and four over-sampling methods are considered. The results show that the new version of ADASYN leads to more robust results and that the application of the iterative greedy metaheuristic significantly improves the quality of the generated patterns, presenting a positive effect on the final classification model.
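The ADASYN idea summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and covers only the baseline scheme: each minority example is weighted by the share of majority examples among its k nearest neighbours (its learning complexity), seeds are drawn in proportion to those weights, and each synthetic pattern is a random convex combination of a seed and one of its minority neighbours. The function names `knn_indices` and `adasyn_sketch`, the parameter names, the Euclidean distance, and the uniform choice of neighbour are all assumptions made for illustration; neither the paper's complexity-guided neighbour selection nor its greedy refinement step is reproduced here.

```python
import math
import random


def knn_indices(point, data, k, skip=None):
    """Return the indices of the k rows of `data` closest to `point` (Euclidean)."""
    candidates = (i for i in range(len(data)) if i != skip)
    return sorted(candidates, key=lambda i: math.dist(point, data[i]))[:k]


def adasyn_sketch(minority, majority, n_new, k=3, seed=0):
    """Hypothetical ADASYN-style over-sampler (illustrative only).

    Returns (weights, synthetic), where weights[i] is the normalised
    learning complexity of minority[i] and synthetic holds n_new points.
    Assumes len(minority) > k so every seed has k minority neighbours.
    """
    rng = random.Random(seed)
    everything = minority + majority  # indices >= len(minority) are majority
    n_min = len(minority)

    # Learning complexity: fraction of majority points among the k nearest
    # neighbours of each minority example, searched over the whole dataset.
    ratios = []
    for i, x in enumerate(minority):
        neighbours = knn_indices(x, everything, k, skip=i)
        ratios.append(sum(1 for j in neighbours if j >= n_min) / k)

    total = sum(ratios)
    # Fall back to uniform weights if no minority example has majority neighbours.
    weights = [r / total for r in ratios] if total > 0 else [1.0 / n_min] * n_min

    synthetic = []
    for _ in range(n_new):
        # Harder examples (larger weight) are more likely to be picked as seeds.
        i = rng.choices(range(n_min), weights=weights)[0]
        # Assumption: the neighbour is drawn uniformly among minority neighbours.
        j = rng.choice(knn_indices(minority[i], minority, k, skip=i))
        lam = rng.random()  # convex-combination coefficient in [0, 1)
        synthetic.append(
            [a + lam * (b - a) for a, b in zip(minority[i], minority[j])]
        )
    return weights, synthetic
```

Drawing seeds at random in proportion to the weights is a simplification; an ADASYN implementation may instead allocate a fixed number of synthetic patterns to each minority example from the same weights.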

Keywords

Over-sampling; Imbalanced classification; ADASYN; Iterative greedy algorithm; Metaheuristics