cudf.core.column.string.StringMethods.jaccard_index

StringMethods.jaccard_index(input: cudf.Series, width: int) -> SeriesOrIndex

Compute the Jaccard index between this column and the given input strings column.

Parameters:
input : Series

The input strings column to compute the Jaccard index against. Must have the same number of strings as this column.

width : int

The number of characters in each sliding-window substring used for the comparison.

Examples

>>> import cudf
>>> str1 = cudf.Series(["the brown dog", "jumped about"])
>>> str2 = cudf.Series(["the black cat", "jumped around"])
>>> str1.str.jaccard_index(str2, 5)
0    0.058824
1    0.307692
dtype: float32
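
For intuition, the values above can be reproduced with a plain-Python sketch that treats each string as the set of its `width`-character substrings and divides the size of the intersection by the size of the union. This is only an illustration, not cudf's implementation: the helper names `char_ngrams` and `jaccard_index_sketch` are hypothetical, and the result for strings shorter than `width` is an assumption that may differ from what cudf returns.

```python
def char_ngrams(text: str, width: int) -> set:
    """Return the set of all substrings of `text` that are `width` characters long."""
    # range() is empty when the string is shorter than `width`.
    return {text[i:i + width] for i in range(len(text) - width + 1)}


def jaccard_index_sketch(a: str, b: str, width: int) -> float:
    """Jaccard index |A & B| / |A | B| over the two character n-gram sets."""
    set_a, set_b = char_ngrams(a, width), char_ngrams(b, width)
    union = set_a | set_b
    if not union:
        # Assumption for this sketch only: no substrings at all scores 0.0.
        return 0.0
    return len(set_a & set_b) / len(union)


pairs = [("the brown dog", "the black cat"), ("jumped about", "jumped around")]
scores = [jaccard_index_sketch(a, b, 5) for a, b in pairs]
print([round(s, 6) for s in scores])  # [0.058824, 0.307692]
```

In the first pair only the substring "the b" is shared out of 17 distinct substrings (1/17); in the second pair 4 of 13 are shared (4/13).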