Unlocking Insights: Time Series Clustering Techniques
Conference
65th ISI World Statistics Congress 2025
Format: CPS Abstract - WSC 2025
Keywords: #statistics, clustering, timeseries
Session: CPS 5 - Time Series Analysis
Tuesday 7 October 4 p.m. - 5 p.m. (Europe/Amsterdam)
Abstract
Time series data are prevalent across many fields, including finance, economics, engineering, medicine, and operations management. Much research has centered on developing similarity measures for tasks such as clustering and classification. This paper focuses on clustering time series, an unsupervised learning problem in which objects are grouped according to a distance or similarity measure. Since clusters can be formally defined as subsets of the data set, clustering methods can be classified according to whether those subsets are fuzzy (soft) or crisp (hard). In contrast to hard clustering, fuzzy clustering methods permit objects to belong to multiple clusters simultaneously, with varying degrees of membership. However, the performance of fuzzy clustering is sensitive to the fuzzifier parameter. The choice of an appropriate dissimilarity measure is also crucial in time series clustering, as conventional dissimilarities may not adequately capture the interdependence between values.
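The sensitivity to the fuzzifier can be illustrated with a minimal sketch. It uses the standard fuzzy c-means membership formula, which is an assumption here since the abstract does not name a specific fuzzy algorithm; the function name fcm_memberships and the toy distances are hypothetical.

```python
import numpy as np

def fcm_memberships(distances, m):
    """Fuzzy c-means membership degrees for fixed object-to-centre distances.

    distances: array of shape (n_objects, n_clusters), strictly positive.
    m: fuzzifier, must be greater than 1.
    """
    d = np.asarray(distances, dtype=float)
    power = 2.0 / (m - 1.0)
    inv = d ** (-power)                      # closer centres get larger weights
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1

# One object, twice as far from the second centre as from the first (toy values).
d = np.array([[1.0, 2.0]])
u_crisp = fcm_memberships(d, m=1.1)  # fuzzifier near 1: almost a hard assignment
u_soft = fcm_memberships(d, m=3.0)   # larger fuzzifier: memberships are blurred
```

With the same distances, a fuzzifier close to 1 yields a nearly crisp assignment, while a larger value spreads the membership across both clusters, so the resulting partition can depend strongly on this choice.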
Given the growing interest in time series, clustering of dynamic data has gained significance with useful applications in several fields.
We consider clustering techniques for time series data that account for the challenges posed by the dynamic nature of the series and the high dimensionality of the data. To address the dimensionality issue, we propose a modelling approach based on penalised spline (P-spline) smoothers, in which each series is represented by its estimated P-spline coefficients. This approach reduces the dimensionality of the problem while preserving the essential features of the series. Additionally, we introduce a fuzzy clustering procedure for time series that combines the Probabilistic Distance clustering procedure with the Boosting philosophy, eliminating the need for a fuzzifier. Experimental results demonstrate the effectiveness of the proposed methods in grouping time series data.
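The dimensionality reduction step can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses a cubic B-spline basis on equally spaced knots with a second-order difference penalty, and the function name pspline_coefficients, the number of segments, and the smoothing parameter lam are all hypothetical choices not given in the abstract.

```python
import numpy as np
from scipy.interpolate import BSpline

def pspline_coefficients(y, n_segments=10, degree=3, lam=1.0, diff_order=2):
    """Penalised least-squares B-spline coefficients for one series.

    The series is assumed to be observed on an equally spaced grid,
    rescaled here to the interval [0, 1].
    """
    y = np.asarray(y, dtype=float)
    x = np.linspace(0.0, 1.0, y.size)
    h = 1.0 / n_segments
    # Equally spaced knots, extended beyond [0, 1] by `degree` intervals.
    knots = np.arange(-degree, n_segments + degree + 1) * h
    n_basis = n_segments + degree
    # Basis matrix: column j holds the j-th B-spline evaluated at x.
    B = BSpline(knots, np.eye(n_basis), degree)(x)
    # Difference matrix penalising roughness of adjacent coefficients.
    D = np.diff(np.eye(n_basis), n=diff_order, axis=0)
    return np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)

# A series of 200 observations is summarised by 13 coefficients.
t = np.linspace(0.0, 1.0, 200)
series = np.sin(2.0 * np.pi * t)
coef = pspline_coefficients(series)
```

Under these assumptions, series of any length are mapped to coefficient vectors of the same fixed length, so a distance between series can be computed on the coefficients rather than on the raw observations.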
Our strategies offer a comprehensive framework for time series clustering, addressing the challenges of dissimilarity measures, high dimensionality, and the choice of clustering algorithms.