Abstract
Many approaches to time series classification rely on machine learning methods. However, there is growing interest in going beyond black box prediction models to understand discriminatory features of the time series and their associations with outcomes. One promising method is time-series shapelets (TSS), which identifies maximally discriminative subsequences of time series. For example, in environmental health applications, TSS could be used to identify short-term patterns in exposure time series (shapelets) associated with adverse health outcomes. Identification of candidate shapelets in TSS is computationally intensive. The original TSS algorithm used exhaustive search. Subsequent algorithms introduced efficiencies by trimming or aggregating the set of candidates, or by training candidates from initialized values, but these approaches have limitations. In this paper, we introduce Wavelet-TSS (W-TSS), a novel intelligent method for identifying candidate shapelets in TSS using wavelet transformation. We tested W-TSS on two datasets: (1) a synthetic example used in previous TSS studies and (2) a panel study relating exposures from residential air pollution sensors to symptoms in participants with asthma. Compared to previous TSS algorithms, W-TSS was more computationally efficient and more accurate, and it discovered more discriminative shapelets. In addition, W-TSS does not require pre-specification of shapelet length.
Highlights
Time series classification methodology is of growing interest in health research, especially given recent advances in sensor technology.
While it could be feasible to input all candidate shapelets into a prediction model that can handle high-dimensional data, the clear groupings of the candidate shapelets suggested that preliminary dimension reduction would be reasonable and would likely improve interpretation.
We reduced the number of candidate shapelets using global alignment kernel k-means [20], with the number of clusters arbitrarily set to k = 12.
Summary
Time series classification methodology is of growing interest in health research, especially given recent advances in sensor technology. For example, environmental health researchers may be interested in using daily exposure time series to distinguish between days on which a study participant does or does not report respiratory symptoms. Many time series classification methods distinguish between classes using global summary statistics (e.g., mean, standard deviation) or global shapes (e.g., dynamic time warping methods) [1,2]. However, methods using global summaries may miss discriminative local shapes (e.g., peaks in exposure indicating proximity to a source). One promising method using local features is time series shapelets (TSS), first introduced by Ye and Keogh [3]. TSS classifies time series based on similarity to local shapes and has the potential to outperform other state-of-the-art time series classifiers using global features, especially in applications with discriminative local shapes and in the presence of general noise and distortion [4].
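To make "similarity to local shapes" concrete, the following is a minimal sketch of the distance that shapelet methods typically compute: the smallest Euclidean distance between a shapelet and any same-length window of a time series. It is an illustration only, not the W-TSS implementation. The function names, the toy data, and the fixed threshold are hypothetical, and the sketch assumes raw (not z-normalized) values and a hand-picked threshold, whereas a TSS algorithm would learn the threshold from labeled data.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between `shapelet` and any
    same-length sliding window of `series` (no normalization)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best


def contains_shape(series, shapelet, threshold):
    """Hypothetical decision rule: the series 'contains' the local shape
    if its shapelet distance does not exceed `threshold`."""
    return shapelet_distance(series, shapelet) <= threshold


# Hypothetical toy data: a short exposure peak and two daily series.
peak = [0.0, 5.0, 0.0]
day_with_peak = [1.0, 1.0, 0.0, 5.0, 0.0, 1.0]
flat_day = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

d_peak = shapelet_distance(day_with_peak, peak)  # exact match exists
d_flat = shapelet_distance(flat_day, peak)       # every window is [1, 1, 1]
```

In this toy example the series containing the peak has distance zero, while the flat series is far from the shapelet, so a single threshold on the distance separates the two. Because only the best-matching window matters, the peak can occur anywhere in the day, which is what distinguishes a local-shape feature from a global summary such as the mean.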