Time series data correspond to observations of phenomena that are recorded over time <xref ref-type="bibr" rid="ref1" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">[1]</xref> . Such data are encountered regularly in a wide range of applications, such as speech and music recognition, monitoring health and medical diagnosis, financial analysis, motion tracking, and shape identification, to name a few. With such a diversity of applications and the large variations in their characteristics, time series classification is a complex and challenging task. One of the fundamental steps in the design of time series classifiers is that of defining or constructing the <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">discriminant features</i> that help differentiate between classes. This is typically achieved by designing novel <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">representation techniques</i> <xref ref-type="bibr" rid="ref2" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">[2]</xref> that transform the raw time series data to a new data domain, where subsequently a classifier is trained on the transformed data, such as one-nearest neighbors <xref ref-type="bibr" rid="ref3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">[3]</xref> or random forests <xref ref-type="bibr" rid="ref4" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">[4]</xref> . In recent time series classification approaches, deep neural network models have been employed that are able to jointly learn a representation of time series and perform classification <xref ref-type="bibr" rid="ref5" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">[5]</xref> . In many of these sophisticated approaches, the discriminant features tend to be complicated to analyze and interpret, given the high degree of nonlinearity.