The clustering performance of a weighted combined distance between time series

Margarida G. M. S. Cardoso; Ana Alexandra A. F. Martins; João Lagarto

Communication at a scientific event

Margarida G. M. S. Cardoso (Cardoso, M. G. M. S.); Ana Alexandra A. F. Martins (Martins, A. A. A. F.); João Lagarto (Lagarto, J.)

Event Title

17th conference of the International Federation of Classification Societies

Year (final publication)

2022

Language

English

Country

Portugal

Web of Science®

This publication is not indexed in Web of Science®

Scopus

This publication is not indexed in Scopus

Google Scholar

This publication is not indexed in Google Scholar

Overton

This publication is not indexed in Overton

Abstract

Recently [1], we proposed a new dissimilarity measure between time series, COMB, a uniform convex combination of four (normalized) distance measures: Euclidean, Pearson correlation based, periodogram based, and a distance between estimated autocorrelation structures. In this work, we propose a method to determine the weights of the convex combination of distances in COMB: it relies on the concordance between the clustering obtained with each individual distance measure and the clustering derived from COMB. A weighted COMB measure, WCOMB, is thus obtained. We then test the clustering performance of WCOMB vs. COMB by conducting an experimental analysis on all the time series datasets of the UCR archive. We evaluate the concordance between the clusters obtained using K-Medoids and the original classes (using the adjusted Rand index) as well as the cohesion-separation of the clusters (using the Silhouette index). In addition, we consider a clustering application, with data from the Portuguese Transmission System Operator on time series of electricity consumption (2014 to 2019), to compare the performance of both methods. Significant differences between the average Silhouette values of the clusters obtained were found. The concordance with the original class structure is similar in both approaches. We conclude that, for unsupervised learning, it can be worthwhile to invest in deriving specific weights for the distances integrated in COMB.
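To make the idea of a convex combination of normalized distances concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not give the exact formulas, so several choices here are assumptions: the correlation-based distance is taken as sqrt(2(1 - r)), the periodogram and autocorrelation distances are Euclidean distances between those estimates, each distance matrix is normalized by its maximum, and all function names (`comb_matrix`, `pairwise`, and so on) are hypothetical. The series are assumed to be non-constant and of equal length.

```python
import numpy as np

def euclidean_dist(x, y):
    return float(np.linalg.norm(x - y))

def pearson_dist(x, y):
    # Assumed form: sqrt(2 * (1 - r)); other correlation-based forms exist.
    r = np.corrcoef(x, y)[0, 1]
    return float(np.sqrt(max(0.0, 2.0 * (1.0 - r))))

def periodogram(x):
    # Raw periodogram of the mean-centred series, zero frequency dropped.
    f = np.fft.rfft(x - x.mean())
    return (np.abs(f[1:]) ** 2) / len(x)

def periodogram_dist(x, y):
    return float(np.linalg.norm(periodogram(x) - periodogram(y)))

def acf(x, max_lag):
    # Sample autocorrelations at lags 1..max_lag (requires max_lag < len(x)).
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    return np.array([np.dot(xc[:-k], xc[k:]) / denom
                     for k in range(1, max_lag + 1)])

def acf_dist(x, y, max_lag=10):
    return float(np.linalg.norm(acf(x, max_lag) - acf(y, max_lag)))

def pairwise(series, dist):
    # Symmetric matrix of pairwise distances with a zero diagonal.
    n = len(series)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = dist(series[i], series[j])
    return d

def normalize(d):
    # Assumed normalization: divide by the largest pairwise distance.
    m = d.max()
    return d / m if m > 0 else d

def comb_matrix(series, weights=None):
    """Convex combination of four normalized distance matrices.

    weights=None gives the uniform combination (0.25 each); other
    non-negative weights summing to 1 give a weighted variant.
    """
    dists = (euclidean_dist, pearson_dist, periodogram_dist, acf_dist)
    w = np.full(4, 0.25) if weights is None else np.asarray(weights, float)
    if w.shape != (4,) or np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be 4 non-negative values summing to 1")
    mats = [normalize(pairwise(series, f)) for f in dists]
    return sum(wk * m for wk, m in zip(w, mats))

# Usage on synthetic data (illustrative only, not data from the study).
toy_rng = np.random.default_rng(42)
toy_series = [toy_rng.normal(size=64) for _ in range(6)]
D_uniform = comb_matrix(toy_series)
D_weighted = comb_matrix(toy_series, [0.4, 0.3, 0.2, 0.1])  # arbitrary weights
```

The resulting matrix can be passed to any clustering routine that accepts precomputed dissimilarities, such as a K-Medoids implementation. The weights in the last line are arbitrary placeholders; the concordance-based procedure for deriving the WCOMB weights is not specified in the abstract and is not reproduced here.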

Acknowledgements

Keywords

clustering, distance measures, time series

Publication Identifiers

Ciência_Iscte ID

ci-pub-96087

Other Publication Details

Peer Reviewed	Yes
City	Porto
Event Type	Conference
Event Classification	International
Presentation Type at Event	--
Publication Date (online)
Publication Date (print)