Compute the edit distance between individual strings in two strings columns.
output[i] is the edit distance between input[i] and targets[i]. The calculation uses the Levenshtein algorithm, as documented here: https://www.cuelogic.com/blog/the-levenshtein-algorithm
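For readers unfamiliar with the Levenshtein algorithm, the following host-side C++ sketch shows the classic dynamic-programming formulation. It is an illustration only, not the library's CUDA implementation: the function name `levenshtein` is hypothetical, and the sketch compares bytes, which matches character counts only for ASCII input.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical host-side helper for illustration; not part of the documented API.
// Returns the Levenshtein distance between a and b, counted in bytes.
std::size_t levenshtein(const std::string& a, const std::string& b)
{
  // prev[j] is the distance between the first i-1 bytes of a and the first j bytes of b.
  std::vector<std::size_t> prev(b.size() + 1);
  std::vector<std::size_t> curr(b.size() + 1);

  // Turning an empty prefix of a into the first j bytes of b takes j insertions.
  for (std::size_t j = 0; j <= b.size(); ++j) { prev[j] = j; }

  for (std::size_t i = 1; i <= a.size(); ++i) {
    curr[0] = i;  // i deletions turn the first i bytes of a into an empty string
    for (std::size_t j = 1; j <= b.size(); ++j) {
      std::size_t const substitution = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
      std::size_t const deletion     = prev[j] + 1;
      std::size_t const insertion    = curr[j - 1] + 1;
      curr[j] = std::min({substitution, deletion, insertion});
    }
    std::swap(prev, curr);  // the row just computed becomes the previous row
  }
  return prev[b.size()];
}
```

Only two rows of the distance table are kept at a time, so memory use is proportional to the length of `b` rather than to the product of both lengths.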
Example:
s = ["hello", "", "world"]
t = ["hallo", "goodbye", "world"]
d = edit_distance(s, t)
d is now [1, 7, 0]
Null entries in either input or targets are treated as empty strings when computing the edit distance.
targets.size() must equal input.size() unless targets.size() == 1, in which case every string in input is compared against the single string targets[0].
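The null-handling and size rules can be modeled on the host as in the sketch below. This is an assumption-laden illustration, not the library's implementation: `host_column`, `distance`, and `edit_distance_host` are hypothetical names, `std::nullopt` stands in for a null row, and strings are compared byte by byte. The two `std::invalid_argument` checks mirror the conditions documented for this function.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical host-side model of a strings column; std::nullopt models a null row.
using host_column = std::vector<std::optional<std::string>>;

// Byte-wise Levenshtein distance using a single rolling row.
std::size_t distance(const std::string& a, const std::string& b)
{
  std::vector<std::size_t> row(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j) { row[j] = j; }
  for (std::size_t i = 1; i <= a.size(); ++i) {
    std::size_t diagonal = row[0];  // table value for (i-1, j-1)
    row[0] = i;
    for (std::size_t j = 1; j <= b.size(); ++j) {
      std::size_t const above = row[j];  // table value for (i-1, j)
      std::size_t const cost  = (a[i - 1] == b[j - 1]) ? 0 : 1;
      row[j]   = std::min({diagonal + cost, above + 1, row[j - 1] + 1});
      diagonal = above;
    }
  }
  return row[b.size()];
}

// Hypothetical model of the documented row-wise semantics.
std::vector<std::size_t> edit_distance_host(const host_column& input, const host_column& targets)
{
  bool const broadcast = targets.size() == 1;
  if (!broadcast && targets.size() != input.size()) {
    throw std::invalid_argument("targets.size() must equal input.size() or be 1");
  }
  if (broadcast && !targets[0].has_value()) {
    throw std::invalid_argument("a single target must not be null");
  }

  std::string const empty;  // null rows are compared as empty strings
  std::vector<std::size_t> output;
  output.reserve(input.size());
  for (std::size_t i = 0; i < input.size(); ++i) {
    auto const& s = input[i];
    auto const& t = targets[broadcast ? 0 : i];  // reuse targets[0] when broadcasting
    output.push_back(distance(s.has_value() ? *s : empty, t.has_value() ? *t : empty));
  }
  return output;
}
```

With a single non-null target such as "abc", an input of a null row followed by "abc" yields distances of 3 and 0 in this model, because the null row is compared as an empty string.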
- Exceptions
-
| std::invalid_argument | if targets.size() != input.size() and targets.size() != 1 |
| std::invalid_argument | if targets.size() == 1 and targets[0].is_null() |
- Parameters
-
| input | Strings column of input strings |
| targets | Strings to compute edit distance against input |
| stream | CUDA stream used for device memory operations and kernel launches |
| mr | Device memory resource used to allocate the returned column's device memory |
- Returns
- New column of edit distance values, one per row of input