Column Hash

group column_hash

Typedefs

using hash_value_type = uint32_t

Type of hash value.

Enums

enum class hash_id

Identifies the hash function to be used.

Values:

enumerator HASH_IDENTITY

Identity hash function that simply returns the key to be hashed.

enumerator HASH_MURMUR3

Murmur3 hash function.

enumerator HASH_SPARK_MURMUR3

Spark Murmur3 hash function.

enumerator HASH_MD5

MD5 hash function.
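
As a rough illustration of what selecting a hash function by enumerator means, the following host-side sketch mirrors the enumerators above and dispatches on them for a single 32-bit key. It is not libcudf code: the namespace `hash_sketch` and the helpers `murmur3_32` and `hash_one` are hypothetical names introduced here, and `murmur3_32` follows the public reference MurmurHash3_x86_32 algorithm, which is an assumption about what "Murmur3" refers to. Spark Murmur3 and MD5 are not covered, and results are not expected to match libcudf's output bit for bit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>

namespace hash_sketch {

// Local mirror of the enumerators listed above (illustrative, not the libcudf header).
enum class hash_id { HASH_IDENTITY, HASH_MURMUR3, HASH_SPARK_MURMUR3, HASH_MD5 };

using hash_value_type = std::uint32_t;

inline std::uint32_t rotl32(std::uint32_t x, int r) { return (x << r) | (x >> (32 - r)); }

// Finalization mix of MurmurHash3: spreads every input bit across the output.
inline std::uint32_t fmix32(std::uint32_t h)
{
  h ^= h >> 16;
  h *= 0x85ebca6bu;
  h ^= h >> 13;
  h *= 0xc2b2ae35u;
  h ^= h >> 16;
  return h;
}

// Reference MurmurHash3_x86_32 over a byte buffer (4-byte blocks read in host byte order).
inline hash_value_type murmur3_32(void const* data, std::size_t len, std::uint32_t seed)
{
  assert(data != nullptr || len == 0);
  auto const* bytes = static_cast<unsigned char const*>(data);
  constexpr std::uint32_t c1 = 0xcc9e2d51u;
  constexpr std::uint32_t c2 = 0x1b873593u;
  std::uint32_t h            = seed;

  // Body: mix one 4-byte block at a time into the running state.
  std::size_t const nblocks = len / 4;
  for (std::size_t i = 0; i < nblocks; ++i) {
    std::uint32_t k;
    std::memcpy(&k, bytes + 4 * i, sizeof(k));
    k *= c1;
    k = rotl32(k, 15);
    k *= c2;
    h ^= k;
    h = rotl32(h, 13);
    h = h * 5 + 0xe6546b64u;
  }

  // Tail: fold in the remaining 1 to 3 bytes, if any.
  unsigned char const* tail = bytes + 4 * nblocks;
  std::uint32_t k           = 0;
  switch (len & 3) {
    case 3: k ^= static_cast<std::uint32_t>(tail[2]) << 16;  // fall through
    case 2: k ^= static_cast<std::uint32_t>(tail[1]) << 8;   // fall through
    case 1:
      k ^= tail[0];
      k *= c1;
      k = rotl32(k, 15);
      k *= c2;
      h ^= k;
      break;
    default: break;
  }

  h ^= static_cast<std::uint32_t>(len);
  return fmix32(h);
}

// Hash a single 32-bit key with the function selected by `id`.
inline hash_value_type hash_one(hash_id id, std::uint32_t key, std::uint32_t seed = 0)
{
  switch (id) {
    case hash_id::HASH_IDENTITY: return key;  // the sketch ignores the seed here
    case hash_id::HASH_MURMUR3: return murmur3_32(&key, sizeof(key), seed);
    default: throw std::invalid_argument("hash function not covered by this sketch");
  }
}

}  // namespace hash_sketch
```

In this sketch, `hash_one(hash_id::HASH_IDENTITY, 42u)` returns the key unchanged, while `HASH_MURMUR3` yields a seed-dependent 32-bit value.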

Functions

std::unique_ptr<column> hash(table_view const &input, hash_id hash_function = hash_id::HASH_MURMUR3, uint32_t seed = DEFAULT_HASH_SEED, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::mr::device_memory_resource *mr = rmm::mr::get_current_device_resource())

Computes the hash value of each row in the input set of columns.

Deprecated:

Since 23.08

Parameters:
  • input – The table of columns to hash

  • hash_function – Identifier (a hash_id value) of the hash function to use

  • seed – Optional seed value to use for the hash function

  • stream – CUDA stream used for device memory operations and kernel launches

  • mr – Device memory resource used to allocate the returned column’s device memory

Returns:

A column where each row is the hash of the corresponding row of the input table
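
To make the shape of the result concrete, here is a minimal host-side sketch of row-wise hashing: one `hash_value_type` per input row, computed across all columns and starting from the seed. Everything in it is an assumption made for illustration, not libcudf's device implementation: `host_table`, `mix`, `combine`, and `hash_rows` are hypothetical names, the element hash is just the MurmurHash3 finalizer, and the combiner is a Boost-style mixing step. It will not reproduce the values returned by `hash`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace row_hash_sketch {

using hash_value_type = std::uint32_t;
constexpr std::uint32_t DEFAULT_HASH_SEED = 0;

// Hypothetical host-side stand-in for a table: equally sized int32 columns.
using host_column = std::vector<std::int32_t>;
using host_table  = std::vector<host_column>;

// Stand-in element hash (assumption): the MurmurHash3 32-bit finalizer.
inline hash_value_type mix(std::uint32_t h)
{
  h ^= h >> 16;
  h *= 0x85ebca6bu;
  h ^= h >> 13;
  h *= 0xc2b2ae35u;
  h ^= h >> 16;
  return h;
}

// Boost-style combiner (assumption): folds one element hash into the row state.
inline hash_value_type combine(hash_value_type lhs, hash_value_type rhs)
{
  return lhs ^ (rhs + 0x9e3779b9u + (lhs << 6) + (lhs >> 2));
}

// Returns one hash per row; the result has as many entries as the table has rows.
inline std::vector<hash_value_type> hash_rows(host_table const& input,
                                              std::uint32_t seed = DEFAULT_HASH_SEED)
{
  std::size_t const num_rows = input.empty() ? 0 : input.front().size();
  std::vector<hash_value_type> out(num_rows, seed);  // every row starts from the seed
  for (auto const& col : input) {
    assert(col.size() == num_rows);  // all columns must have the same length
    for (std::size_t r = 0; r < num_rows; ++r) {
      out[r] = combine(out[r], mix(static_cast<std::uint32_t>(col[r])));
    }
  }
  return out;
}

}  // namespace row_hash_sketch
```

For a two-column table with three rows, `hash_rows` returns three values, and rows holding identical values in every column receive identical hashes.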

Variables

static constexpr uint32_t DEFAULT_HASH_SEED = 0

The default seed value for hash functions.