Kernel Measures
Kernel Dynamic Time Warping (kdtw)
- tsdistance.kernel.kdtw(x, y, sigma)
Kernel Dynamic Time Warping (KDTW) [1]_ is a similarity measure constructed from DTW with the property that it is a positive definite kernel (homogeneous to an inner product in a Reproducing Kernel Hilbert Space). It builds on earlier work by Cuturi et al. [2]_ on the Global Alignment kernel (GA-kernel); the derivation of KDTW is detailed in Marteau & Gibet 2014 [1]_. KDTW is a convolution kernel as defined in [3]_. The formula for KDTW is shown below:
\[\begin{equation*} k(X_i,Y_j,\sigma) = e^{-(X_i-Y_j)^2/\sigma} \end{equation*}\]
\[\begin{split}\begin{equation*} KDTW^{xy}(X_i,Y_j,\sigma) = \beta \cdot k(X_i,Y_j,\sigma) \cdot \sum \begin{cases} h(i-1,j) \cdot KDTW^{xy}(X_{i-1},Y_j) \\ h(i-1,j-1) \cdot KDTW^{xy}(X_{i-1},Y_{j-1}) \\ h(i,j-1) \cdot KDTW^{xy}(X_i,Y_{j-1}) \\ \end{cases} \end{equation*}\end{split}\]
\[\begin{split}\begin{equation*} KDTW^{xx}(X_i,Y_j,\sigma) = \beta \cdot \sum \begin{cases} h(i-1,j) \cdot KDTW^{xx}(X_{i-1},Y_j) \cdot k(X_i,Y_i,\sigma) \\ \Delta_{i,j} \cdot h(i,j) \cdot KDTW^{xx}(X_{i-1},Y_{j-1}) \cdot k(X_i,Y_j,\sigma) \\ h(i,j-1) \cdot KDTW^{xx}(X_i,Y_{j-1}) \cdot k(X_j,Y_j,\sigma) \\ \end{cases} \end{equation*}\end{split}\]
\[\begin{equation*} KDTW(X,Y) = KDTW^{xy}(X_n,Y_m) + KDTW^{xx}(X_n,Y_m) \end{equation*}\]
- Parameters:
x (np.array) – first time series
xlen (int) – length of x
y (np.array) – second time series
ylen (int) – length of y
sigma (float) – bandwidth parameter which weights the local contributions
- Returns:
the KDTW distance
Example:
>>> from tsdistance.kernel import kdtw
>>> import numpy as np
>>> ts1 = np.array([1, 2, 3, 4, 5, 9, 7])
>>> ts2 = np.array([8, 9, 9, 7, 3, 1, 2])
>>> kdtw_dist = kdtw(ts1, ts2, 0.5)
>>> print(kdtw_dist)
4.796391482673881e-51
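To make the local kernel \(k(X_i, Y_j, \sigma)\) from the formula above concrete, the following minimal sketch evaluates it for every pair of points. The helper name `local_kernel_matrix` is hypothetical and not part of tsdistance; it covers only the local term, not the full KDTW recursion.

```python
import numpy as np


def local_kernel_matrix(x, y, sigma):
    """Hypothetical helper (not part of tsdistance).

    Returns K with K[i, j] = exp(-(x[i] - y[j])**2 / sigma),
    i.e. the local kernel k(X_i, Y_j, sigma) for every pair of points.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Broadcasting builds the full table of pairwise squared differences.
    sq_diff = (x[:, None] - y[None, :]) ** 2
    return np.exp(-sq_diff / sigma)


ts1 = np.array([1, 2, 3, 4, 5, 9, 7])
ts2 = np.array([8, 9, 9, 7, 3, 1, 2])
K = local_kernel_matrix(ts1, ts2, 0.5)
print(K.shape)
```

Every entry of the matrix lies in (0, 1], and entries shrink quickly as sigma decreases. Since the recursion multiplies such factors along alignment paths, this is one plausible reason the KDTW value in the example is so small for sigma = 0.5.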
Reference
Shift Invariant Kernel (SINK)
- tsdistance.kernel.SINK(x, y, gamma, e)
Shift Invariant Kernel (SINK) [1]_ [2]_ computes the distance between time series X and Y by summing all weighted elements of the Coefficient Normalized Cross-Correlation (\(NCC_c\)) sequence between \(X\) and \(Y\). Formally, SINK is defined as follows:
\[\begin{equation} SINK(x,y,\gamma) = \sum_{i=1}^{n} e^{\gamma \cdot {NCC_c}_i(x,y)} \end{equation}\] where \(\gamma > 0\).
- Parameters:
x (np.array) – first time series
y (np.array) – second time series
gamma (float, \(\gamma\) > 0) – bandwidth parameter that determines the weight of each inner product through \(k'(\vec{x}, \vec{y}, \gamma) = e^{\gamma \langle \vec{x}, \vec{y} \rangle}\)
e – constant, defaults to \(e\)
- Returns:
the SINK distance
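The SINK formula above can be sketched directly with NumPy. This is an illustration under stated assumptions, not the tsdistance implementation: `sink_sketch` is a hypothetical name, the cross-correlation is computed with `np.correlate` in `"full"` mode (so the sum runs over every shift), the coefficient normalization divides by the product of the Euclidean norms, and both inputs are assumed to be non-zero.

```python
import numpy as np


def sink_sketch(x, y, gamma):
    """Hypothetical sketch of the SINK sum (not the tsdistance code).

    Sums exp(gamma * NCC_c) over all shifts of the cross-correlation.
    Assumes x and y are non-zero 1-D sequences.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Cross-correlation at every possible shift (len(x) + len(y) - 1 values).
    cc = np.correlate(x, y, mode="full")
    # Coefficient normalization: divide by the product of the norms.
    ncc_c = cc / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.sum(np.exp(gamma * ncc_c)))


a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.0, 0.0, 1.0, 5.0])
print(sink_sketch(a, b, gamma=2.0))
```

Swapping the two inputs only reverses the order of the cross-correlation sequence, so the sum is symmetric in x and y.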
References
[1] John Paparrizos and Michael Franklin. “GRAIL: Efficient Time-Series Representation Learning”. In: Proceedings of the VLDB Endowment 12 (2019)
[2] Amaia Abanda, Usue Mori, and Jose A. Lozano. “A review on distance based time series classification”. In: Data Mining and Knowledge Discovery 33, 378–412 (2019)
Log Global Alignment Kernel (LGAK)
- tsdistance.kernel.LGAK(x, y, sigma)
This function computes the log Global Alignment Kernel (LGAK) described in Cuturi (2011) [1]_. The formula for LGAK is as follows:
\[LGAK(x, y,\sigma) = \prod_{i=1}^{|\pi|} e^{\frac{1}{2\sigma^2}\left(x_{\pi_1(i)} - y_{\pi_2(i)}\right)^2 + \log\left(e^{-\frac{\left(x_{\pi_1(i)} - y_{\pi_2(i)}\right)^2}{2\sigma^2}}\right)}\]
- Parameters:
x (np.array) – first time series
y (np.array) – second time series
sigma (float) – parameter of the Gaussian kernel
- Returns:
the LGAK distance
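As a rough illustration of how a log global alignment kernel can be evaluated, the sketch below runs the usual dynamic program over all alignments in log space. It rests on assumptions that may differ from tsdistance: `log_gak_sketch` is a hypothetical name, the local kernel is a plain Gaussian \(e^{-(x_i - y_j)^2 / (2\sigma^2)}\), and contributions from all alignment paths are summed.

```python
import numpy as np


def log_gak_sketch(x, y, sigma):
    """Hypothetical log-space global alignment sketch (not the tsdistance code).

    Assumes a plain Gaussian local kernel and sums over all alignments
    that move right, down, or diagonally through the cost table.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # Log of the Gaussian local kernel for every pair of points.
    log_k = -((x[:, None] - y[None, :]) ** 2) / (2.0 * sigma ** 2)
    # log_m[i, j] holds the log of the summed path weights ending at (i, j).
    log_m = np.full((n + 1, m + 1), -np.inf)
    log_m[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Log-sum-exp over the three predecessor cells.
            prev = np.logaddexp(
                np.logaddexp(log_m[i - 1, j], log_m[i, j - 1]),
                log_m[i - 1, j - 1],
            )
            log_m[i, j] = log_k[i - 1, j - 1] + prev
    return float(log_m[n, m])


print(log_gak_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 4.0], sigma=1.0))
```

Working in log space avoids the underflow that the raw product of many small kernel values would otherwise cause on long series.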