cudf.core.column.string.StringMethods.edit_distance_matrix
- StringMethods.edit_distance_matrix() -> SeriesOrIndex
Computes the pairwise edit distance between the strings in the series.
The series must contain more than 2 strings and must not contain nulls.
Edit distance is measured using the Levenshtein algorithm.
- Returns:
- Series of ListDtype(int64)
Assume N is the length of this series. The returned series contains N lists of size N, where the j-th number in the i-th row of the series is the edit distance between the i-th string and the j-th string of this series. The matrix is symmetric. Diagonal elements are 0.
Examples
>>> import cudf
>>> s = cudf.Series(['abc', 'bc', 'cba'])
>>> s.str.edit_distance_matrix()
0    [0, 1, 2]
1    [1, 0, 2]
2    [2, 2, 0]
dtype: list
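To make the meaning of each entry concrete, the following is a minimal pure-Python sketch of a Levenshtein distance matrix. It is not the cuDF implementation, which runs on the GPU. The names `levenshtein` and `edit_distance_matrix_reference` are hypothetical helpers introduced here for illustration only. The sketch compares Python characters (code points), and it is an assumption that this matches cuDF's behaviour for non-ASCII input.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance via the standard row-by-row dynamic programme.

    Hypothetical reference helper, not part of cuDF.
    """
    # prev[j] holds the distance between the empty prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Distance between a[:i] and the empty string is i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete ca
                    curr[j - 1] + 1,     # insert cb
                    prev[j - 1] + cost,  # substitute ca -> cb (or match)
                )
            )
        prev = curr
    return prev[-1]


def edit_distance_matrix_reference(strings):
    """Return an N x N list of lists; entry [i][j] is the distance
    between strings[i] and strings[j]. Hypothetical helper."""
    return [[levenshtein(a, b) for b in strings] for a in strings]


if __name__ == "__main__":
    for row in edit_distance_matrix_reference(["abc", "bc", "cba"]):
        print(row)
```

For the input `['abc', 'bc', 'cba']` this sketch yields the same rows as the example above. A reference like this can be handy for checking small inputs on a machine without a GPU.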