Title Oversampling the Minority Class in the Feature Space
Authors PÉREZ ORTIZ, MARÍA, Gutierrez P.A., Tino P., Hervas-Martinez C.
External publication No
Journal IEEE Trans. Neural Networks Learn. Sys.
Scope Article
Nature Scientific
JCR Quartile 1
SJR Quartile 1
JCR Impact 6.10800
SJR Impact 2.56400
Area International
Web https://www.scopus.com/inward/record.uri?eid=2-s2.0-84940706270&doi=10.1109%2fTNNLS.2015.2461436&partnerID=40&md5=c711298bc8bcff5abab8173ffa59d40b
Publication date 01/01/2016
ISI 000382175500012
Scopus Id 2-s2.0-84940706270
DOI 10.1109/TNNLS.2015.2461436
Abstract The imbalanced nature of some real-world data is one of the current challenges for machine learning researchers. One common approach oversamples the minority class through convex combination of its patterns. We explore the general idea of synthetic oversampling in the feature space induced by a kernel function (as opposed to input space). If the kernel function matches the underlying problem, the classes will be linearly separable and synthetically generated patterns will lie on the minority class region. Since the feature space is not directly accessible, we use the empirical feature space (EFS) (a Euclidean space isomorphic to the feature space) for oversampling purposes. The proposed method is framed in the context of support vector machines, where the imbalanced data sets can pose a serious hindrance. The idea is investigated in three scenarios: 1) oversampling in the full and reduced-rank EFSs; 2) a kernel learning technique maximizing the data class separation to study the influence of the feature space structure (implicitly defined by the kernel function); and 3) a unified framework for preferential oversampling that spans some of the previous approaches in the literature. We support our investigation with extensive experiments over 50 imbalanced data sets. © 2012 IEEE.
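Illustration The core idea in the abstract (map the training patterns to the empirical feature space, then create synthetic minority patterns as convex combinations there) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the RBF kernel, the random pairing of minority patterns, and all function names (`rbf_kernel`, `empirical_feature_space`, `oversample_minority`) are hypothetical choices made for illustration. The EFS coordinates are taken from the eigendecomposition of the kernel matrix, so that their inner products reproduce the kernel values.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Gaussian kernel matrix (an assumed kernel choice for this sketch).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def empirical_feature_space(K, rank=None, tol=1e-10):
    # K = P diag(lam) P^T; training patterns in the EFS are the rows of
    # P_r diag(lam_r)^(1/2), so that Z @ Z.T approximates K.
    lam, P = np.linalg.eigh(K)
    order = np.argsort(lam)[::-1]          # largest eigenvalues first
    lam, P = lam[order], P[:, order]
    keep = lam > tol * lam[0]              # drop numerically zero directions
    if rank is not None:
        keep[rank:] = False                # optional reduced-rank EFS
    return P[:, keep] * np.sqrt(lam[keep])

def oversample_minority(Z, y, minority_label, n_new, rng):
    # Each synthetic pattern is a convex combination of two minority
    # patterns drawn at random (assumed pairing strategy).
    Z_min = Z[y == minority_label]
    i = rng.integers(0, len(Z_min), size=n_new)
    j = rng.integers(0, len(Z_min), size=n_new)
    delta = rng.random((n_new, 1))         # mixing weights in [0, 1)
    Z_new = Z_min[i] + delta * (Z_min[j] - Z_min[i])
    y_new = np.full(n_new, minority_label)
    return np.vstack([Z, Z_new]), np.concatenate([y, y_new]), Z_new

# Toy imbalanced data: 30 majority patterns (label 0), 6 minority (label 1).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (30, 2)), rng.normal(3.0, 0.5, (6, 2))])
y = np.array([0] * 30 + [1] * 6)

K = rbf_kernel(X, gamma=0.5)
Z = empirical_feature_space(K)
Z_bal, y_bal, Z_new = oversample_minority(Z, y, minority_label=1, n_new=24, rng=rng)
```

A linear classifier (for example a linear SVM) could then be trained on `Z_bal` and `y_bal`. Passing `rank=r` to `empirical_feature_space` gives a reduced-rank variant of the same sketch.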
Keywords Artificial intelligence; Convex combinations; Empirical feature space; Euclidean spaces; Imbalanced Data-sets; Kernel function; Kernel learning; Linearly separable; Unified framework; Learning systems
Universidad Loyola members