Publications

Authors

Guijo Rubio, David; Duran Rosal, Antonio Manuel; Gutiérrez Peña, Pedro Antonio; Troncoso, Alicia; Hervás Martínez, César

External publication

No

Venue

IEEE T. Cybern.

Scope

Article

Nature

Scientific

JCR quartile

1

SJR quartile

1

JCR impact factor

11.448

SJR impact factor

3.109

Web

https://www.scopus.com/inward/record.uri?eid=2-s2.0-85119434666&doi=10.1109%2fTCYB.2019.2962584&partnerID=40&md5=efc7d3cc299735184e7f138683829e0f

Publication date

15/01/2020

ISI

000716697700022

Scopus Id

2-s2.0-85119434666

DOI

10.1109/TCYB.2019.2962584

Abstract

Time-series clustering is the process of grouping time series with respect to their similarity or characteristics. Previous approaches usually combine a specific distance measure for time series and a standard clustering method. However, these approaches do not take the similarity of the different subsequences of each time series into account, which can be used to better compare the time-series objects of the dataset. In this article, we propose a novel technique of time-series clustering consisting of two clustering stages. In a first step, a least-squares polynomial segmentation procedure is applied to each time series, which is based on a growing window technique that returns different-length segments. Then, all of the segments are projected into the same dimensional space, based on the coefficients of the model that approximates the segment and a set of statistical features. After mapping, a first hierarchical clustering phase is applied to all mapped segments, returning groups of segments for each time series. These clusters are used to represent all time series in the same dimensional space, after defining another specific mapping process. In a second and final clustering stage, all the time-series objects are grouped. We consider internal clustering quality to automatically adjust the main parameter of the algorithm, which is an error threshold for the segmentation. The results obtained on 84 datasets from the UCR Time Series Classification Archive have been compared against three state-of-the-art methods, showing that the performance of this methodology is very promising, especially on larger datasets.
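The first stage described in the abstract (growing-window least-squares polynomial segmentation, followed by mapping each segment to a fixed-length vector) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the RMSE error measure, the default polynomial degree, and the choice of mean and variance as the "statistical features" are all assumptions made for illustration; the abstract specifies only that an error threshold controls the segmentation.

```python
import numpy as np


def fit_error(segment, degree):
    """RMSE of a least-squares polynomial fit (error measure is an assumption)."""
    x = np.arange(len(segment))
    deg = min(degree, len(segment) - 1)  # avoid an under-determined fit
    coeffs = np.polyfit(x, segment, deg)
    residuals = segment - np.polyval(coeffs, x)
    return float(np.sqrt(np.mean(residuals ** 2)))


def growing_window_segmentation(series, max_error, degree=1):
    """Return half-open (start, end) index pairs of variable-length segments.

    Each window grows one point at a time until adding the next point would
    push the fit error above `max_error` (hypothetical parameter name).
    """
    y = np.asarray(series, dtype=float)
    n = len(y)
    min_len = degree + 2  # smallest window with a meaningful residual
    segments = []
    start = 0
    while start < n:
        end = min(start + min_len, n)
        while end < n and fit_error(y[start:end + 1], degree) <= max_error:
            end += 1
        segments.append((start, end))
        start = end
    return segments


def segment_features(segment, degree=1):
    """Map a segment to a fixed-length vector: coefficients plus statistics.

    Mean and variance are placeholder features, not the paper's feature set.
    """
    seg = np.asarray(segment, dtype=float)
    x = np.arange(len(seg))
    deg = min(degree, len(seg) - 1)
    coeffs = np.polyfit(x, seg, deg)
    padded = np.zeros(degree + 1)  # zero-pad so short segments keep the length
    padded[degree - deg:] = coeffs
    return np.concatenate([padded, [seg.mean(), seg.var()]])
```

Because every segment maps to a vector of the same length regardless of how many points it spans, the resulting vectors could then be passed to any standard hierarchical clustering routine, which is the role the abstract assigns to the first clustering stage.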

Keywords

Time series analysis; Hidden Markov models; Clustering algorithms; Time measurement; Autoregressive processes; Data mining; Proposals; Feature extraction; Segmentation; Time-series clustering

Universidad Loyola members

Antonio Manuel Duran Rosal

Time series clustering based on the characterisation of segment typologies