cudf.core.column.string.StringMethods.edit_distance

StringMethods.edit_distance(targets) -> SeriesOrIndex

Compute the Levenshtein edit distance between each string in this instance and the corresponding string in targets. For a description of the algorithm, see https://www.cuelogic.com/blog/the-levenshtein-algorithm

The targets parameter may also be a single string, in which case every string in this instance is measured against that one string.

Parameters:
targets : array-like, Sequence or Series or str

The string(s) to measure against each string.

Returns:
Series or Index of int32.

Examples

>>> import cudf
>>> sr = cudf.Series(["puppy", "doggy", "kitty"])
>>> targets = cudf.Series(["pup", "dogie", "kitten"])
>>> sr.str.edit_distance(targets=targets)
0    2
1    2
2    2
dtype: int32
>>> sr.str.edit_distance("puppy")
0    0
1    4
2    4
dtype: int32
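
For readers who want to check the values above by hand, the Levenshtein distance can be sketched in plain Python. This is only an illustrative CPU reference under the standard definition (unit cost for insertion, deletion, and substitution); the helper name `levenshtein` is hypothetical and is not part of cuDF, and the sketch says nothing about how cuDF computes the result on the GPU.

```python
def levenshtein(source: str, target: str) -> int:
    """Illustrative reference: edit distance via a two-row dynamic program."""
    # previous[j] is the distance between the source prefix processed so far
    # and target[:j]; with an empty source prefix that is j insertions.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Turning source[:i] into an empty target takes i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


strings = ["puppy", "doggy", "kitty"]
targets = ["pup", "dogie", "kitten"]

# Element-wise comparison, mirroring edit_distance(targets=targets).
pairwise = [levenshtein(s, t) for s, t in zip(strings, targets)]

# Every string against one target, mirroring edit_distance("puppy").
against_one = [levenshtein(s, "puppy") for s in strings]

print(pairwise)     # [2, 2, 2]
print(against_one)  # [0, 4, 4]
```

The two lists match the Series values shown in the examples above, which makes the sketch a convenient cross-check on small inputs.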