The performance of a combined distance between time series
Event Title
XXV Congress of the Portuguese Statistical Society
Year (definitive publication)
2021
Language
English
Country
Portugal
Abstract
Dissimilarity measures between time series are critical in several data analysis
tasks, ranging from simple querying to classification, clustering and anomaly
detection. Recently, we proposed a new dissimilarity measure: a convex combination
of four (normalized) distance measures that offer complementary perspectives on the
differences between two time series. These are the Euclidean distance, which
captures differences in scale; a Pearson correlation-based measure, which takes into
account linear increasing and decreasing trends over time; a periodogram-based
measure, which expresses the dissimilarities between the frequencies or cyclical
components of the series; and a distance between estimated autocorrelation
structures, which compares the series in terms of their dependence on past
observations. We conduct an experimental analysis to evaluate the comparative
performance of this combined distance measure, using the UCR Time-Series Archive,
which includes time series data sets from a wide variety of application domains. We
follow a methodology suggested in previous studies [?] that compared several
dissimilarity measures and their variants: we use a one-nearest-neighbor (1NN)
classifier on labelled data to evaluate the efficacy of the distance measures.
Because 1NN accuracy depends critically on the distance measure used, it directly
reflects the effectiveness of that measure. We conclude that the proposed combined
measure is competitive in several settings. Finally, we suggest further research
that takes normalization methods into account.
Keywords
clustering, distance measures, time series