minhash
- pylibcudf.nvtext.minhash.minhash(Column input, uint32_t seed, Column a, Column b, size_type width) -> Column
Returns the minhash values for each string. This function uses MurmurHash3_x86_32 for the hash algorithm.
For details, see minhash().
- Parameters:
- input : Column
Strings column for which to compute minhash values
- seed : uint32_t
Seed used for the hash function
- a : Column
First parameter values used for the minhash algorithm
- b : Column
Second parameter values used for the minhash algorithm
- width : size_type
Character width of the substrings that are hashed from each string
- Returns:
- Column
List column of minhash values for each string per seed
- pylibcudf.nvtext.minhash.minhash64(Column input, uint64_t seed, Column a, Column b, size_type width) -> Column
Returns the minhash values for each string. This function uses MurmurHash3_x64_128 for the hash algorithm.
For details, see minhash64().
- Parameters:
- input : Column
Strings column for which to compute minhash values
- seed : uint64_t
Seed used for the hash function
- a : Column
First parameter values used for the minhash algorithm
- b : Column
Second parameter values used for the minhash algorithm
- width : size_type
Character width of the substrings that are hashed from each string
- Returns:
- Column
List column of minhash values for each string per seed
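
To show how the parameters above might fit together, here is a minimal pure-Python sketch of a minhash computation. It is an illustration under stated assumptions, not the library's implementation: `minhash_sketch` and `stand_in_hash` are hypothetical names, CRC-32 stands in for MurmurHash3_x86_32, and both the permutation formula `(hash * a + b) % prime` and the handling of strings shorter than `width` are assumptions. Plain Python lists take the place of `Column` objects.

```python
import zlib

# Assumed modulus for the permutation step (a Mersenne prime).
MERSENNE_PRIME = (1 << 61) - 1
MAX_HASH_32 = (1 << 32) - 1


def stand_in_hash(data: bytes, seed: int) -> int:
    """Placeholder 32-bit hash (CRC-32). This is NOT MurmurHash3_x86_32."""
    return zlib.crc32(data, seed & MAX_HASH_32)


def minhash_sketch(strings, seed, a, b, width):
    """Return one list of minhash values per input string.

    Each inner list holds one value per (a[i], b[i]) parameter pair.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    result = []
    for s in strings:
        # Overlapping substrings of `width` characters; a string shorter
        # than `width` is hashed as a single substring (assumption).
        count = max(len(s) - width + 1, 1)
        hashes = [
            stand_in_hash(s[i:i + width].encode("utf-8"), seed)
            for i in range(count)
        ]
        # Permute every substring hash with each (a, b) pair and keep the minimum.
        row = [
            min(((h * ai + bi) % MERSENNE_PRIME) & MAX_HASH_32 for h in hashes)
            for ai, bi in zip(a, b)
        ]
        result.append(row)
    return result


signatures = minhash_sketch(
    ["hello world", "hi"], seed=0, a=[1, 3, 5], b=[0, 7, 11], width=4
)
```

The result has one row per input string and one value per parameter pair, which mirrors the list-column shape described under Returns. A 64-bit variant would presumably follow the same flow with a 64-bit hash and seed.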