Publications

An Iterated Greedy Algorithm for Improving the Generation of Synthetic Patterns in Imbalanced Learning

Authors

Francisco Javier Maestre-Garcia, Carlos Garcia-Martinez, María Pérez Ortiz, Pedro Antonio Gutierrez

External publication

No

Venue

Lect. Notes Comput. Sci.

Scope

Proceedings Paper

Nature

Scientific

JCR quartile

SJR quartile

SJR impact

0.295

Publication date

01/01/2017

ISI

000443108700044

Scopus Id

2-s2.0-85020877672

Abstract

Real-world classification datasets often present a skewed distribution of patterns, where one or more classes are under-represented with respect to the rest. One of the most successful approaches for alleviating this problem is the generation of synthetic minority samples by convex combination of available ones. Within this framework, adaptive synthetic (ADASYN) sampling is a relatively new method which imposes weights on minority examples according to their learning complexity, in such a way that difficult examples are more prone to be over-sampled. This paper proposes an improvement of the ADASYN method, where the learning complexity of these patterns is also used to decide which sample of the neighbourhood is selected. Moreover, to avoid suboptimal results when performing the random convex combination, this paper explores the application of an iterative greedy algorithm which refines the synthetic patterns by repeatedly replacing a part of them. For the experiments, six binary datasets and four over-sampling methods are considered. The results show that the new version of ADASYN leads to more robust results and that the application of the iterative greedy metaheuristic significantly improves the quality of the generated patterns, presenting a positive effect on the final classification model.
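As a rough illustration of the baseline idea the abstract builds on, the sketch below generates synthetic minority samples by convex combination, choosing harder minority examples more often. It is a simplified ADASYN-style sketch under stated assumptions, not the authors' implementation: the function name `adasyn_sketch`, its parameters, and the use of the fraction of majority neighbours as the "learning complexity" are illustrative choices. It does not include the paper's difficulty-guided neighbour selection or the iterated greedy refinement.

```python
import random


def adasyn_sketch(minority, majority, n_new, k=3, seed=0):
    """Return n_new synthetic minority points (simplified, hypothetical sketch)."""
    rng = random.Random(seed)

    def dist2(a, b):
        # Squared Euclidean distance between two points given as tuples.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    points = list(minority) + list(majority)
    labels = [0] * len(minority) + [1] * len(majority)  # 1 marks majority

    # Assumed difficulty measure: fraction of majority points among the
    # k nearest neighbours of each minority point in the whole dataset.
    difficulty = []
    for i in range(len(minority)):
        nearest = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist2(points[i], points[j]),
        )[:k]
        difficulty.append(sum(labels[j] for j in nearest) / k)

    # Fall back to uniform weights if no minority point has majority neighbours.
    weights = difficulty if sum(difficulty) > 0 else [1.0] * len(minority)

    synthetic = []
    for _ in range(n_new):
        # Harder minority points are more likely to be picked as the base.
        i = rng.choices(range(len(minority)), weights=weights, k=1)[0]
        base = minority[i]
        # Choose one of the k nearest minority neighbours uniformly at random.
        near = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: dist2(base, minority[j]),
        )[:k]
        other = minority[rng.choice(near)]
        lam = rng.random()  # convex-combination coefficient in [0, 1)
        synthetic.append(tuple(b + lam * (o - b) for b, o in zip(base, other)))
    return synthetic
```

Each synthetic point lies on the segment between a minority example and one of its minority neighbours, which is why a poorly chosen neighbour or coefficient can yield a suboptimal pattern; that is the step the abstract proposes to refine.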

Keywords

Over-sampling; Imbalanced classification; ADASYN; Iterative greedy algorithm; Metaheuristics