Title Multi-objective evolutionary optimization using the relationship between F1 and accuracy metrics in classification tasks
Authors Fernández J.C., CARBONERO RUZ, MARIANO, Gutiérrez P.A., Hervás-Martínez C.
External publication No
Venue Appl. Intell.
Scope Article
Nature Scientific
JCR quartile 2
SJR quartile 2
JCR impact 3.32500
Geographic scope International
Web https://www.scopus.com/inward/record.uri?eid=2-s2.0-85064481663&doi=10.1007%2fs10489-019-01447-y&partnerID=40&md5=c72b94749e129275557d10ce159d94ae
Publication date 01/01/2019
ISI 000482434300019
Scopus Id 2-s2.0-85064481663
DOI 10.1007/s10489-019-01447-y
Abstract This work analyses the complementarity and contrast between two metrics commonly used for evaluating the quality of a binary classifier: the correct classification rate or accuracy, C, and the F1 metric, which is very popular when dealing with imbalanced datasets. Based on this analysis, a set of constraints relating C and F1 is defined as a function of the ratio of positive patterns in the dataset. We evaluate the possibility of using a multi-objective evolutionary algorithm guided by this pair of metrics to optimise binary classification models. To check the validity of the constraints, we perform an empirical analysis considering 26 benchmark datasets obtained from the UCI repository and an interesting liver transplant dataset. The results show that the relation is fulfilled and that the use of the algorithm for simultaneously optimising the pair (C,F1) leads to a generally balanced accuracy for both classes. The experiments also reveal that, in some cases, better results are obtained by using the majority class as the positive class instead of using the minority one, which is the most common approach with imbalanced datasets. © 2019, Springer Science+Business Media, LLC, part of Springer Nature.
Keywords Classification (of information); Data mining; Optimization; Binary classification; Classification rates; Classification tasks; Evaluation metrics; F1-metric; Imbalanced Data-sets; Multi objective evol
Members of Universidad Loyola