Title Floating search methodology for combining classification models for site recognition in DNA sequences
Authors Pérez Rodríguez, Javier; de Haro García, Aida; García Pedrajas, Nicolás
External publication No
Venue IEEE/ACM Trans. Comput. Biol. Bioinf.
Scope Article
Nature Scientific
JCR quartile 1
SJR quartile 2
JCR impact 3.71000
SJR impact 0.74500
Reach International
Web https://www.scopus.com/inward/record.uri?eid=2-s2.0-85121687757&doi=10.1109%2fTCBB.2020.2974221&partnerID=40&md5=a4ce9cdfe9c3859099a992fc52e6696b
Publication date 17/02/2020
ISI 000728193500040
Scopus Id 2-s2.0-85121687757
DOI 10.1109/TCBB.2020.2974221
Abstract Recognition of the functional sites of genes, such as translation initiation sites, donor and acceptor splice sites, and stop codons, is a relevant part of many current problems in bioinformatics. The best approaches use sophisticated classifiers, such as support vector machines. However, with the rapid accumulation of sequence data, methods for combining many sources of evidence are necessary, as it is unlikely that a single classifier can solve this problem with the best possible performance. A major issue is that the number of possible models to combine is large, and using all of them is impractical. In this paper we present a methodology for combining many sources of information to recognize any functional site using "floating search", a powerful heuristic that is applicable when the cost of evaluating each solution is high. We present experiments on four functional sites in the human genome, which is used as the target genome, and use another 20 species as sources of evidence. The proposed methodology shows a significant improvement over state-of-the-art methods. The results also challenge the standard assumption that only genomes that are neither very close to nor very far from the human genome should be used to improve the recognition of functional sites.
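The floating search idea mentioned in the abstract can be illustrated with a minimal sketch of sequential floating forward selection over a pool of candidate models. This is not the authors' implementation: the function name `floating_search`, the `toy_score` function, and the model labels `a` to `d` are hypothetical, and in practice the score would be something like the validation performance of the combined classifier. Because each evaluation is assumed to be expensive, the sketch caches every subset score.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple


def floating_search(
    candidates: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
    max_size: int,
) -> Tuple[float, FrozenSet[str]]:
    """Sequential floating forward selection (illustrative sketch)."""
    pool = list(candidates)
    cache: Dict[FrozenSet[str], float] = {}
    best: Dict[int, Tuple[float, FrozenSet[str]]] = {}  # subset size -> (score, subset)

    def evaluate(subset: FrozenSet[str]) -> float:
        # Each subset is scored once; repeated requests hit the cache.
        if subset not in cache:
            cache[subset] = score(subset)
        return cache[subset]

    def record(subset: FrozenSet[str]) -> bool:
        # Store the subset if it strictly beats the best known one of its size.
        value, size = evaluate(subset), len(subset)
        if size not in best or value > best[size][0]:
            best[size] = (value, subset)
            return True
        return False

    current: FrozenSet[str] = frozenset()
    limit = min(max_size, len(pool))
    while len(current) < limit:
        # Forward step: add the single model that helps the most.
        current = max((current | {c} for c in pool if c not in current), key=evaluate)
        record(current)
        # Floating step: keep dropping a model while that strictly improves
        # on the best subset already known for the smaller size.
        while len(current) > 2:
            reduced = max((current - {c} for c in current), key=evaluate)
            if not record(reduced):
                break
            current = reduced
    return max(best.values(), key=lambda pair: pair[0])


# Hypothetical toy score: "a" is the best single model but combines badly,
# while "b" and "c" are complementary.
GAIN = {"a": 5, "b": 4, "c": 3, "d": 1}


def toy_score(subset: FrozenSet[str]) -> int:
    total = sum(GAIN[m] for m in subset)
    if {"b", "c"} <= subset:
        total += 6
    if "a" in subset and len(subset) > 1:
        total -= 8
    return total


best_score, best_subset = floating_search("abcd", toy_score, max_size=3)
print(best_score, sorted(best_subset))  # prints: 14 ['b', 'c', 'd']
```

In this toy run a purely greedy forward search would end at the subset a, b, c with score 10, whereas the floating step drops a after c is added and the search then reaches b, c, d with score 14.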
Keywords Biological cells; Bioinformatics; Genomics; Computational modeling; Support vector machines; Biological system modeling; Search problems; Site recognition; gene prediction; models combination
Universidad Loyola members
