Elastic measures are popular for time series analysis.
Using elastic measures, the two time series do not need to be the same length, providing much more flexibility.
The elastic measures implemented in this library include Dynamic Time Warping (DTW),
its variations such as weighted-DTW and derivative-DTW,
and edit-based distance measures such as Longest Common Subsequence (LCSS),
Edit Distance with Real Penalty (ERP), Edit Distance on Real Sequences (EDR), Move Split Merge (MSM),
and Time Warp Edit Distance (TWED).
In general, elastic measures use a dynamic programming algorithm with the goal of minimize the total distance between the two series.
This has an effect of “aligning” the time series which is useful for time-series that are out-of-sync
or otherwise unaligned in ways that non-elastic measures could not distinguish.
Through this section, we will have two time series X and Y represented as such:
\[\begin{split}\begin{aligned}
X = x_1,\ x_2,\ x_3,\ ...,\ x_n \\
Y = y_1,\ y_2,\ y_3,\ ...,\ y_m \\
\end{aligned}\end{split}\]
Additionally, we will have a array of size \((m,n)\) represented as:
As the elastic measures runs, it produces a warping path \(W = w_1, w_2, ..., w_k = (x_1,y_1), (x_i, y_i), ..., (x_n,y_m)\)
where \(max(m,n) \leq k \leq m + n - 1\). In warping path element, the algorithm takes one of three steps.
If \(w_{k-1} = (x_{i-1}, y_i)\), then the step was horizontal. If \(w_{k-1} = (x_{i}, y_{i-1})\), then the step was vertical.
And if \(w_{k-1} = (x_{i-1}, y_{i-1})\), the step was diagonal.
Along with this, each elastic measure uses a distance function D to determine the distance between the elements of each time series.
D can be divided into 4 components, \(D^{v},D^{h},D^d\), and $D^u$ or distance from a vertical step, distance from a horizontal step,
distance from a diagonal step, and undirectional distance. These components are the same in some measures and differ in others.
The elastic distance of the two time series is \(\sum_{i = 1}^{k}D(w_i)\) where \(D\) is one of the component listed above.
Alternatively, the elastic distance can be written as:
The function \(\pi\) is a final function depending on the measure.
Throughout this paper, we will define \(\pi\) and D as a simplistic way to define each function.
If the the distance function differs depending on the direction that will be detailed in the measure.
Additionally, certain measures will have other parameters that must be specified. These will be discussed in where applicable.
In the worst case,
elastic measures will take \(O(mn)\) time to complete as each item within the time series are compared to each other before the final comparison is computed.
There are various methods to improve this run-time, such as restricting the range of comparing values.
Additionally,
the space complexity in the worst case is an array of \(O(mn)\)
but this can be reduced to a linear size complexity as only the immediately previous values are used.
Dynamic Time Warping (DTW) [1]_ utilizes dynamic programming to find
the optimal alignment between elements of times series \(X = (x_{1}, x_{2}, ..., x_{n})\)
and \(Y = (y_{1}, y_{2}, ..., y_m)\) by constructing
a distance matrix \(M\) of shape \((n, m)\) with the following forumla:
w (float, optional) – If constraint="Sakoe-Chiba" , w is the largest temporal shift allowed between two time series. w defaults to None, in which case the warping window is the length of the longest time series.
fast (bool, optional) – whether or not to use fast (Numba) implementation, default to True
The Generalized Lower Bound (GLB) variant for DTW [1] that achieves high pruning power while caches reusable computations.
GLB_DTW is a state-of-the-art DTW lower bound for DTW. The function can be imported using fromtsdistance.elasticimportglb_dtw.
LB_Keogh [1]_ is one of the most popular dtw lower bounds.
LB_Keogh constructs upper and lower envelopes of query series and compute
distance between these query envelopes and the candidate series,
while utilizing a window :math:w to limit warping (shifted alignements).
LB_Keogh is formally defined as:
The Generalized Lower Bound (GLB) variant for LCSS [1] that achieves high pruning power while caching reusable computations.
GLB_LCSS is a state-of-the-art LCSS lower bound. The function can be imported using fromtsdistance.elasticimportglb_lcss.
Parameters:
y (np.array) – a time series
x (np.array) – another time series
epsilon (float) – the matching threshold
w (int) – w is the largest temporal shift allowed between two time series
The LCSS lower bound, LB_Keogh_LCSS [1]_ is adapted from
LB_Keogh [2] and considers the LCSS matching threshold
:math:`epsilon`in constructing envelopes.
tsdistance.elastic.glb_swale(x, y, p, r, epsilon, w)¶
The Generalized Lower Bound (GLB) variant for SWALE [1] that achieves high pruning power while caching reusable computations.
The function can be imported using fromtsdistance.elasticimportglb_swale.
Parameters:
x (np.array) – a time series
y (np.array) – another time series
p (float) – the penalty for mismatches
r (float) – the reward for matches
epsilon (float) – the matching threshold
w (int) – the largest temporal shift allowed between two time series
The Generalized Lower Bound (GLB) variant for EDR [1] that achieves high pruning power while caching reusable computations.
The function can be imported using fromtsdistance.elasticimportglb_edr.
Parameters:
x (np.array) – a time series
y (np.array) – another time series
epsilon (float) – the matching threshold
w (int) – the largest temporal shift allowed between two time series
The Generalized Lower Bound (GLB) variant for MSM [1] that achieves high pruning power while caching reusable computations.
GLB_MSM is a state-of-the-art MSM lower bound, which can be imported using fromtsdistance.elasticimportglb_msm.
Parameters:
x (np.array) – a time series
y (np.array) – another time series
c (float) – the cost for move or split operation for MSM
w (int) – the largest temporal shift allowed between two time series