pyscamp API documentation
pyscamp: Python bindings for SCAMP
selfjoin(a, m, **kwargs) | Computes the matrix profile for time series A.
abjoin(a, b, m, **kwargs) | For each subsequence in time series A, finds the nearest neighbor in time series B.
selfjoin_sum(a, m, **kwargs) | Returns the sum of the correlations above the specified threshold (default 0) for each subsequence in a time series.
abjoin_sum(a, b, m, **kwargs) | For each subsequence in time series A, returns the sum of the correlations to subsequences in time series B above the specified threshold (default 0).
selfjoin_knn(a, m, k, **kwargs) | [GPU ONLY, EXPERIMENTAL] Returns the approximate k nearest neighbors for each subsequence in a time series.
abjoin_knn(a, b, m, k, **kwargs) | [GPU ONLY, EXPERIMENTAL] For each subsequence in time series A, returns its approximate k nearest neighbors in time series B.
selfjoin_matrix(a, m, **kwargs) | [EXPERIMENTAL] Returns a pooled version of the distance matrix of size mheight x mwidth (height x width). The pooling operation is max() for Pearson correlation and min() for Euclidean distance.
abjoin_matrix(a, b, m, **kwargs) | [EXPERIMENTAL] Returns a pooled version of the distance matrix of size mheight x mwidth (height x width). The pooling operation is max() for Pearson correlation and min() for Euclidean distance.
pyscamp.abjoin(a: List[float], b: List[float], m: int, **kwargs) → Tuple[numpy.ndarray[numpy.float32], numpy.ndarray[numpy.int32]]
For each subsequence in time series A, finds the nearest neighbor in time series B.
Parameters: - a (1D array) – Time series whose subsequences are used as queries against b.
- b (1D array) – Time series in which to search for matches for subsequences in a.
- m (int) – Subsequence length to use for computing the matrix profile.
Returns: A tuple. First element: The nearest neighbor distance of subsequences in a to time series b. Second element: The index (in b) of each nearest neighbor.
Return type: Tuple of np.ndarray[float32] and np.ndarray[int32]
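To make the return values concrete, here is a brute-force reference sketch in plain Python. It is not how SCAMP computes the result; it assumes the distance is the z-normalized Euclidean distance between length-m windows, and the names `znorm` and `naive_abjoin` are hypothetical helpers introduced only for illustration.

```python
import math

def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros (sketch choice)."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    if std == 0:
        return [0.0] * len(window)
    return [(v - mean) / std for v in window]

def naive_abjoin(a, b, m):
    """For each length-m window of a, find the closest length-m window of b."""
    wins_a = [znorm(a[i:i + m]) for i in range(len(a) - m + 1)]
    wins_b = [znorm(b[j:j + m]) for j in range(len(b) - m + 1)]
    profile, index = [], []
    for wa in wins_a:
        # Assumption: distance is z-normalized Euclidean distance.
        dists = [math.dist(wa, wb) for wb in wins_b]
        best = min(range(len(dists)), key=dists.__getitem__)
        profile.append(dists[best])  # nearest neighbor distance
        index.append(best)           # position of that neighbor in b
    return profile, index

a = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
b = [5.0, 5.0, 2.0, 4.0, 2.0, 0.0, 2.0]
profile, index = naive_abjoin(a, b, 3)
```

Both outputs have one entry per subsequence of a (here `len(a) - m + 1 = 4`). Assuming the signature documented above, the corresponding library call would be `pyscamp.abjoin(a, b, 3)`.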
pyscamp.abjoin_knn(a: List[float], b: List[float], m: int, k: int, **kwargs) → List[Tuple[int, int, float]]
[GPU ONLY, EXPERIMENTAL] For each subsequence in time series A, returns its approximate k nearest neighbors in time series B.
Parameters: - a (1D array) – Time series to compute the KNN matrix profile for.
- b (1D array) – Time series in which to search for matches.
- m (int) – Subsequence length to use for computing the matrix profile.
- k (int) – Number of neighbors to return for each subsequence.
- threshold (float, optional) – Correlation threshold in [0, 1] (default 0). Matches with a correlation below the threshold are ignored.
Returns: List of tuples (col, row, distance) containing the matches (up to k) for each column of the distance matrix. col is the index in A, row is the index in B of the match, and distance is the distance between the two subsequences.
Return type: List of tuple[int, int, float]
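The flat list-of-tuples output can be illustrated with an exact brute-force sketch. The library function is approximate and GPU only, so this is only a reference for the output layout, not its algorithm. It assumes the reported distance is the z-normalized Euclidean distance, which relates to Pearson correlation by d = sqrt(2m(1 - corr)); `pearson` and `naive_abjoin_knn` are hypothetical names.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length windows (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = math.sqrt(sum((p - mx) ** 2 for p in x))
    sy = math.sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def naive_abjoin_knn(a, b, m, k, threshold=0.0):
    """Exact top-k matches in b for every length-m window of a."""
    matches = []
    for col in range(len(a) - m + 1):
        candidates = []
        for row in range(len(b) - m + 1):
            corr = pearson(a[col:col + m], b[row:row + m])
            if corr >= threshold:  # matches below the threshold are ignored
                # Assumption: distance is z-normalized Euclidean distance.
                dist = math.sqrt(max(0.0, 2 * m * (1 - corr)))
                candidates.append((col, row, dist))
        candidates.sort(key=lambda t: t[2])
        matches.extend(candidates[:k])  # up to k tuples per column
    return matches

a = [0.0, 1.0, 0.0, -1.0]
b = [2.0, 4.0, 2.0, 0.0, 2.0]
matches = naive_abjoin_knn(a, b, 3, k=1)
```

Because of the threshold, a column can contribute fewer than k tuples, which is why the result is a flat list rather than a fixed-shape array.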
pyscamp.abjoin_matrix(a: List[float], b: List[float], m: int, **kwargs) → numpy.ndarray[numpy.float32]
[EXPERIMENTAL] Returns a pooled version of the distance matrix of size mheight x mwidth (height x width). The pooling operation is max() for Pearson correlation and min() for Euclidean distance.
Parameters: - a (1D array) – Time series corresponding to the columns of the distance matrix.
- b (1D array) – Time series corresponding to the rows of the distance matrix.
- m (int) – Subsequence length to use for computing the matrix profile.
- mheight (int, optional) – Height of the pooled distance matrix to output (default 50).
- mwidth (int, optional) – Width of the pooled distance matrix to output (default 50).
- threshold (float, optional) – Correlation threshold in [0, 1] (default 0). Matches with a correlation below the threshold are ignored.
Returns: A 2D array of height mheight and width mwidth. This is a pooled version of the full distance matrix.
Return type: 2D array
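The pooling step can be sketched as follows for the Pearson correlation case, where each output cell keeps the maximum correlation of the block it covers. How SCAMP assigns rows and columns to cells is not stated here, so the equal-size contiguous binning below is an assumption, the threshold is left out, and `pearson` and `naive_pooled_matrix` are hypothetical names.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length windows (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = math.sqrt(sum((p - mx) ** 2 for p in x))
    sy = math.sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def naive_pooled_matrix(a, b, m, mheight, mwidth):
    """Max-pool the full correlation matrix (rows from b, columns from a)."""
    n_cols = len(a) - m + 1
    n_rows = len(b) - m + 1
    # Start every cell at the lowest possible correlation.
    pooled = [[-1.0] * mwidth for _ in range(mheight)]
    for r in range(n_rows):
        pr = r * mheight // n_rows          # assumed: contiguous row bins
        for c in range(n_cols):
            pc = c * mwidth // n_cols       # assumed: contiguous column bins
            corr = pearson(a[c:c + m], b[r:r + m])
            pooled[pr][pc] = max(pooled[pr][pc], corr)
    return pooled

series = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
pooled = naive_pooled_matrix(series, series, 3, mheight=2, mwidth=2)
```

In this toy case the diagonal cells contain perfect matches (correlation close to 1) while the off-diagonal cells peak near 0. For Euclidean distance the same loop would keep the minimum instead of the maximum.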
pyscamp.abjoin_sum(a: List[float], b: List[float], m: int, **kwargs) → numpy.ndarray[numpy.float64]
For each subsequence in time series A, returns the sum of the correlations to subsequences in time series B above the specified threshold (default 0).
Parameters: - a (1D array) – Time series to compute matrix profile for.
- b (1D array) – Time series to search for matches.
- m (int) – Subsequence length to use for computing the matrix profile.
- threshold (float, optional) – Correlation threshold in [0, 1] (default 0). Matches with a correlation below the threshold are ignored.
Returns: For each subsequence in A, the sum of its correlations to subsequences in B that are at or above the specified threshold.
Return type: np.ndarray[float64]
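The thresholded sum can be written out directly as a brute-force sketch. It is meant only to show what each output element represents; `pearson` and `naive_abjoin_sum` are hypothetical names, and treating a constant window as having correlation 0 is a choice made for this sketch.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length windows (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = math.sqrt(sum((p - mx) ** 2 for p in x))
    sy = math.sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def naive_abjoin_sum(a, b, m, threshold=0.0):
    """Sum, per window of a, of all correlations to windows of b at or above threshold."""
    sums = []
    for i in range(len(a) - m + 1):
        total = 0.0
        for j in range(len(b) - m + 1):
            corr = pearson(a[i:i + m], b[j:j + m])
            if corr >= threshold:  # correlations below the threshold are ignored
                total += corr
        sums.append(total)
    return sums

a = [0.0, 1.0, 0.0, -1.0]
b = [2.0, 4.0, 2.0, 0.0, 2.0]
sums = naive_abjoin_sum(a, b, 3)
```

Each window of a has exactly one perfectly correlated window in b here, and the remaining correlations are 0 or negative, so both sums come out at approximately 1.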
pyscamp.gpu_supported() → bool
Returns True if both (1) the module was compiled with GPU support and (2) GPUs are available.
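A typical use is to check for GPU support before calling the GPU-only functions. The sketch below guards the import so it also runs where pyscamp is not installed; `pick_backend` is a hypothetical helper, not part of the library.

```python
import importlib.util

def pick_backend():
    """Report which pyscamp code path is usable in this environment."""
    if importlib.util.find_spec("pyscamp") is None:
        return "unavailable"  # pyscamp is not installed
    import pyscamp
    # gpu_supported() is the function documented above.
    return "gpu" if pyscamp.gpu_supported() else "cpu"

backend = pick_backend()
print(f"pyscamp backend: {backend}")
```

When the result is not "gpu", the functions marked GPU ONLY (selfjoin_knn, abjoin_knn) should be avoided.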
pyscamp.selfjoin(a: List[float], m: int, **kwargs) → Tuple[numpy.ndarray[numpy.float32], numpy.ndarray[numpy.int32]]
Computes the matrix profile for time series A.
Parameters: - a (1D array) – Time series to compute matrix profile for.
- m (int) – Subsequence length to use for computing the matrix profile.
Returns: A tuple containing the matrix profile as the first element and the matrix profile indices as the second element.
Return type: Tuple of np.ndarray[float32] and np.ndarray[int32]
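A self-join differs from an AB-join in that each subsequence must not be matched with itself or with its immediate neighbors. The brute-force sketch below illustrates this with an exclusion zone; the zone width of m // 4 is an assumption made for the sketch, the distance is assumed to be z-normalized Euclidean, and `znorm` and `naive_selfjoin` are hypothetical names.

```python
import math

def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros (sketch choice)."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    if std == 0:
        return [0.0] * len(window)
    return [(v - mean) / std for v in window]

def naive_selfjoin(a, m, exclusion=None):
    """Matrix profile and profile index of a, by exhaustive comparison."""
    if exclusion is None:
        exclusion = max(1, m // 4)  # assumed exclusion zone width
    wins = [znorm(a[i:i + m]) for i in range(len(a) - m + 1)]
    profile, index = [], []
    for i, wi in enumerate(wins):
        best_d, best_j = math.inf, -1
        for j, wj in enumerate(wins):
            if abs(i - j) < exclusion:
                continue  # skip trivial matches close to i
            d = math.dist(wi, wj)
            if d < best_d:
                best_d, best_j = d, j
        profile.append(best_d)
        index.append(best_j)
    return profile, index

a = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0]  # repeats every 4 samples
profile, index = naive_selfjoin(a, 4)
motif_start = min(range(len(profile)), key=profile.__getitem__)
```

Low values in the profile mark repeated patterns (motifs) and high values mark unusual subsequences. In this periodic example the window at position 0 reappears at position 4, so the profile is 0 there.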
pyscamp.selfjoin_knn(a: List[float], m: int, k: int, **kwargs) → List[Tuple[int, int, float]]
[GPU ONLY, EXPERIMENTAL] Returns the approximate k nearest neighbors for each subsequence in a time series.
Parameters: - a (1D array) – Time series to compute the KNN matrix profile for.
- m (int) – Subsequence length to use for computing the matrix profile.
- k (int) – Number of neighbors to return for each subsequence.
- threshold (float, optional) – Correlation threshold in [0, 1] (default 0). Matches with a correlation below the threshold are ignored.
Returns: List of tuples (col, row, distance) containing the matches (up to k) for each column of the distance matrix. col is the index of the subsequence, row is the index of the match, and distance is the distance between the two subsequences.
Return type: List of tuple[int, int, float]
pyscamp.selfjoin_matrix(a: List[float], m: int, **kwargs) → numpy.ndarray[numpy.float32]
[EXPERIMENTAL] Returns a pooled version of the distance matrix of size mheight x mwidth (height x width). The pooling operation is max() for Pearson correlation and min() for Euclidean distance.
Parameters: - a (1D array) – Time series to compute matrix profile for.
- m (int) – Subsequence length to use for computing the matrix profile.
- mheight (int, optional) – Height of the pooled distance matrix to output (default 50).
- mwidth (int, optional) – Width of the pooled distance matrix to output (default 50).
- threshold (float, optional) – Correlation threshold in [0, 1] (default 0). Matches with a correlation below the threshold are ignored.
Returns: A 2D array of height mheight and width mwidth. This is a pooled version of the full distance matrix.
Return type: 2D array
pyscamp.selfjoin_sum(a: List[float], m: int, **kwargs) → numpy.ndarray[numpy.float64]
Returns the sum of the correlations above the specified threshold (default 0) for each subsequence in a time series.
Parameters: - a (1D array) – Time series to compute matrix profile for.
- m (int) – Subsequence length to use for computing the matrix profile.
- threshold (float, optional) – Correlation threshold in [0, 1] (default 0). Matches with a correlation below the threshold are ignored.
Returns: For each subsequence in A, the sum of its correlations to other subsequences in A that are at or above the specified threshold.
Return type: np.ndarray[float64]