pairwise_kernels

cuml.metrics.pairwise_kernels(X, Y=None, metric='linear', *, filter_params=False, convert_dtype=True, **kwds)

Compute the kernel between arrays X and optional array Y.

This method takes either a vector array or a kernel matrix, and returns a kernel matrix. If the input is a vector array, the kernels are computed. If the input is a kernel matrix, it is returned instead. This method provides a safe way to take a kernel matrix as input, while preserving compatibility with many other algorithms that take a vector array.

If Y is given (default is None), then the returned matrix is the pairwise kernel between the arrays from both X and Y.

Valid values for metric are: ['additive_chi2', 'chi2', 'linear', 'poly', 'polynomial', 'rbf', 'laplacian', 'sigmoid', 'cosine']

Parameters:
X : Dense matrix (device or host) of shape (n_samples_X, n_samples_X) or (n_samples_X, n_features)

Array of pairwise kernels between samples, or a feature array. The shape of the array should be (n_samples_X, n_samples_X) if metric == "precomputed" and (n_samples_X, n_features) otherwise. Acceptable formats: cuDF DataFrame, NumPy ndarray, Numba device ndarray, or any CUDA array interface compliant array such as CuPy.

Y : Dense matrix (device or host) of shape (n_samples_Y, n_features), default=None

A second feature array, used only if X has shape (n_samples_X, n_features). Acceptable formats: cuDF DataFrame, NumPy ndarray, Numba device ndarray, or any CUDA array interface compliant array such as CuPy.

metric : str or callable (numba device function), default="linear"

The metric to use when calculating the kernel between instances in a feature array. If metric is "precomputed", X is assumed to be a kernel matrix. Alternatively, if metric is a callable, it is called on each pair of instances (rows) and the resulting value is recorded. The callable should take two rows as input (one from X and one from Y, or both from X when Y is None) and return the corresponding kernel value as a single number.

filter_params : bool, default=False

Whether to filter out keyword arguments in **kwds that are not valid parameters of the selected kernel.

convert_dtype : bool, default=True

When set to True, the method will, when necessary, convert Y to be the same data type as X if they differ. This will increase memory used for the method.

**kwds : optional keyword parameters

Any further parameters are passed directly to the kernel function.
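The callable contract and the forwarding of **kwds described above can be sketched on the CPU in plain Python. This is an illustrative reference only, not cuML's implementation: `pairwise_reference` and `rbf` are hypothetical helper names introduced here, and the sketch assumes that extra keyword arguments reach the kernel unchanged.

```python
import math


def rbf(x, y, gamma=None):
    # Plain-Python RBF kernel on two rows; gamma defaults to 1 / n_features.
    if gamma is None:
        gamma = 1.0 / len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def pairwise_reference(X, Y, kernel, **kwds):
    # Hypothetical CPU helper (not a cuML function): call `kernel` once per
    # (row of X, row of Y) pair and forward extra keyword arguments as-is.
    return [[kernel(x, y, **kwds) for y in Y] for x in X]


X = [[2, 3], [3, 5], [5, 8]]
Y = [[1, 0], [2, 1]]

K_default = pairwise_reference(X, Y, rbf)             # gamma falls back to 1 / 2
K_custom = pairwise_reference(X, Y, rbf, gamma=2.0)   # gamma forwarded via **kwds
```

Here `K_default` uses the kernel's own default, while `K_custom` shows a keyword argument being passed through to the kernel function.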

Returns:
K : ndarray of shape (n_samples_X, n_samples_X) or (n_samples_X, n_samples_Y)

A kernel matrix K such that K_{i, j} is the kernel between the ith and jth row vectors of X if Y is None. If Y is not None, then K_{i, j} is the kernel between the ith row of X and the jth row of Y.
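The two possible output shapes can be illustrated with a small pure-Python sketch using the linear kernel. `kernel_matrix` and `linear` are hypothetical helpers written for this illustration, not part of cuML; the sketch assumes only the behaviour stated above, namely that X is compared with itself when Y is None.

```python
def linear(x, y):
    # Linear kernel: dot product of two rows.
    return sum(a * b for a, b in zip(x, y))


def kernel_matrix(X, Y=None):
    # Hypothetical CPU helper (not a cuML function).
    # With Y=None the rows of X are compared with each other.
    if Y is None:
        Y = X
    return [[linear(x, y) for y in Y] for x in X]


X = [[2, 3], [3, 5], [5, 8]]
Y = [[1, 0], [2, 1]]

K_xx = kernel_matrix(X)      # shape (n_samples_X, n_samples_X)
K_xy = kernel_matrix(X, Y)   # shape (n_samples_X, n_samples_Y)

shape_xx = (len(K_xx), len(K_xx[0]))
shape_xy = (len(K_xy), len(K_xy[0]))
```

With Y omitted the result is square and symmetric. With Y given, it has one row per sample in X and one column per sample in Y.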

Notes

If metric is "precomputed", Y is ignored and X is returned.

Examples

>>> import cupy as cp
>>> from cuml.metrics import pairwise_kernels
>>> from numba import cuda
>>> import math

>>> X = cp.array([[2, 3], [3, 5], [5, 8]])
>>> Y = cp.array([[1, 0], [2, 1]])

>>> pairwise_kernels(X, Y, metric='linear')
array([[ 2,  7],
       [ 3, 11],
       [ 5, 18]])
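>>> @cuda.jit(device=True)
... def custom_rbf_kernel(x, y, gamma=None):
...     # Default gamma is 1 / n_features, where n_features == len(x).
...     if gamma is None:
...         gamma = 1.0 / len(x)
...     # Accumulate the squared Euclidean distance between x and y.
...     sq_dist = 0.0
...     for i in range(len(x)):
...         sq_dist += (x[i] - y[i]) ** 2
...     return math.exp(-gamma * sq_dist)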

>>> pairwise_kernels(X, Y, metric=custom_rbf_kernel)
array([[6.73794700e-03, 1.35335283e-01],
       [5.04347663e-07, 2.03468369e-04],
       [4.24835426e-18, 2.54366565e-13]])
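As a sanity check, the first row of the output above can be reproduced by hand from the formula that custom_rbf_kernel implements. The notation below (n for the number of features, γ for gamma) is introduced here for illustration only.

```latex
k(x, y) = \exp\!\Big(-\gamma \sum_{i=1}^{n} (x_i - y_i)^2\Big),
\qquad \gamma = \frac{1}{n} = \frac{1}{2}

% First row of X, x = (2, 3), against both rows of Y:
k\big((2,3),(1,0)\big) = \exp\!\big(-\tfrac{1}{2}\,(1^2 + 3^2)\big) = e^{-5} \approx 6.738 \times 10^{-3}

k\big((2,3),(2,1)\big) = \exp\!\big(-\tfrac{1}{2}\,(0^2 + 2^2)\big) = e^{-2} \approx 1.353 \times 10^{-1}
```

Both values agree with the first row of the printed kernel matrix.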