Sampling

Uniform Random Walks

cugraph_error_code_t cugraph_uniform_random_walks(const cugraph_resource_handle_t *handle, cugraph_rng_state_t *rng_state, cugraph_graph_t *graph, const cugraph_type_erased_device_array_view_t *start_vertices, size_t max_length, cugraph_random_walk_result_t **result, cugraph_error_t **error)#

Compute uniform random walks.

Parameters:
  • handle[in] Handle for accessing resources

  • rng_state[inout] State of the random number generator, updated with each call

  • graph[in] Pointer to graph. NOTE: Graph might be modified if the storage needs to be transposed

  • start_vertices[in] Array of source vertices

  • max_length[in] Maximum length of the generated path

  • result[out] Output from the random walk call

  • error[out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS

Returns:

error code
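
To make the walk semantics concrete, here is a minimal CPU sketch in plain C. It is not the cuGraph implementation and does not call the library. It assumes a host-side CSR graph (offsets, indices), int32_t vertex ids, and that a walk stops early at a vertex with no out-edges. The names uniform_walk_sketch and next_u64 are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* splitmix64 step: a small, well-known 64-bit generator, used here only so
 * the sketch is deterministic and portable. */
static uint64_t next_u64(uint64_t *state)
{
  uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
  z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
  z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
  return z ^ (z >> 31);
}

/* Walk from `start` over a CSR graph, where the out-neighbors of v are
 * indices[offsets[v]] .. indices[offsets[v + 1] - 1].
 * `path` must have room for max_length + 1 vertices.
 * Returns the number of edges traversed, which is smaller than max_length
 * if the walk reaches a vertex with no out-edges (assumed behavior). */
size_t uniform_walk_sketch(const size_t *offsets, const int32_t *indices,
                           int32_t start, size_t max_length,
                           uint64_t *rng_state, int32_t *path)
{
  assert(offsets != NULL && indices != NULL && rng_state != NULL && path != NULL);
  int32_t current = start;
  size_t steps = 0;
  path[0] = current;
  while (steps < max_length) {
    size_t begin  = offsets[current];
    size_t degree = offsets[current + 1] - begin;
    if (degree == 0) break; /* sink vertex: nothing left to sample */
    /* Every out-neighbor is equally likely (modulo bias ignored here). */
    size_t pick = (size_t)(next_u64(rng_state) % degree);
    current = indices[begin + pick];
    path[++steps] = current;
  }
  return steps;
}
```

As with rng_state in the function above, the generator state is passed by pointer and advanced on every draw, so repeated calls produce different walks.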

Biased Random Walks

cugraph_error_code_t cugraph_biased_random_walks(const cugraph_resource_handle_t *handle, cugraph_rng_state_t *rng_state, cugraph_graph_t *graph, const cugraph_type_erased_device_array_view_t *start_vertices, size_t max_length, cugraph_random_walk_result_t **result, cugraph_error_t **error)#

Compute biased random walks.

Parameters:
  • handle[in] Handle for accessing resources

  • rng_state[inout] State of the random number generator, updated with each call

  • graph[in] Pointer to graph. NOTE: Graph might be modified if the storage needs to be transposed

  • start_vertices[in] Array of source vertices

  • max_length[in] Maximum length of the generated path

  • result[out] Output from the random walk call

  • error[out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS

Returns:

error code
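
A biased walk differs from a uniform one only in how the next neighbor is chosen. The sketch below assumes the bias of each out-edge is its edge weight, which this page does not state, and picks a neighbor with probability proportional to that weight. The helper pick_weighted_neighbor is a hypothetical name, and the code is a CPU illustration rather than the library's method.

```c
#include <assert.h>
#include <stddef.h>

/* Choose index i with probability weights[i] / sum(weights), given a
 * uniform draw u in [0, 1). Assumes non-negative weights with a positive
 * sum. This is inverse-CDF sampling over the out-edges of one vertex. */
size_t pick_weighted_neighbor(const double *weights, size_t degree, double u)
{
  assert(weights != NULL && degree > 0 && u >= 0.0 && u < 1.0);
  double total = 0.0;
  for (size_t i = 0; i < degree; ++i) total += weights[i];
  assert(total > 0.0);

  double target  = u * total; /* point on the cumulative weight line */
  double running = 0.0;
  for (size_t i = 0; i < degree; ++i) {
    running += weights[i];
    if (target < running) return i;
  }
  return degree - 1; /* guard against floating-point round-off */
}
```

With weights {1, 3}, draws below 0.25 select the first neighbor and all other draws select the second.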

Random Walks via Node2Vec

cugraph_error_code_t cugraph_node2vec_random_walks(const cugraph_resource_handle_t *handle, cugraph_rng_state_t *rng_state, cugraph_graph_t *graph, const cugraph_type_erased_device_array_view_t *start_vertices, size_t max_length, double p, double q, cugraph_random_walk_result_t **result, cugraph_error_t **error)#

Compute random walks using the node2vec framework.

Parameters:
  • handle[in] Handle for accessing resources

  • rng_state[inout] State of the random number generator, updated with each call

  • graph[in] Pointer to graph. NOTE: Graph might be modified if the storage needs to be transposed

  • start_vertices[in] Array of source vertices

  • max_length[in] Maximum length of the generated path

  • p[in] The return parameter

  • q[in] The in/out parameter

  • result[out] Output from the node2vec call

  • error[out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS

Returns:

error code
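
The roles of p and q are easiest to see in the unnormalized transition bias from the node2vec framework: when stepping from the current vertex, returning to the previous vertex is weighted 1/p, moving to a vertex that is also adjacent to the previous vertex is weighted 1, and moving further away is weighted 1/q. The following plain C sketch computes that bias on a host-side CSR graph. It is an illustration under those assumptions, not the cuGraph implementation, and has_edge and node2vec_bias are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Linear scan of the CSR adjacency list of u, looking for v. */
static bool has_edge(const size_t *offsets, const int32_t *indices,
                     int32_t u, int32_t v)
{
  for (size_t e = offsets[u]; e < offsets[u + 1]; ++e) {
    if (indices[e] == v) return true;
  }
  return false;
}

/* Unnormalized node2vec bias for stepping to `next`, given that the walk
 * arrived at the current vertex from `prev`. For a weighted graph this
 * value would additionally be multiplied by the edge weight. */
double node2vec_bias(const size_t *offsets, const int32_t *indices,
                     int32_t prev, int32_t next, double p, double q)
{
  assert(p > 0.0 && q > 0.0);
  if (next == prev) return 1.0 / p;                           /* return    */
  if (has_edge(offsets, indices, prev, next)) return 1.0;     /* stay near */
  return 1.0 / q;                                             /* move away */
}
```

A large p makes immediate backtracking unlikely, and a small q pushes the walk outward.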

Sampling Support Functions

cugraph_type_erased_device_array_view_t *cugraph_lookup_result_get_srcs(const cugraph_lookup_result_t *result)#

Get the edge sources from the lookup result.

Parameters:

result[in] The result from src-dst lookup using edge ids and type(s)

Returns:

type erased array pointing to the edge sources

cugraph_type_erased_device_array_view_t *cugraph_lookup_result_get_dsts(const cugraph_lookup_result_t *result)#

Get the edge destinations from the lookup result.

Parameters:

result[in] The result from src-dst lookup using edge ids and type(s)

Returns:

type erased array pointing to the edge destinations

void cugraph_lookup_result_free(cugraph_lookup_result_t *result)#

Free a src-dst lookup result.

Parameters:

result[in] The result from src-dst lookup using edge ids and type(s)

void cugraph_lookup_container_free(cugraph_lookup_container_t *container)#

Free a sampling lookup map.

Parameters:

container[in] The sampling lookup map (a.k.a. container).

size_t cugraph_random_walk_result_get_max_path_length(cugraph_random_walk_result_t *result)#

Get the max path length from random walk result.

Parameters:

result[in] The result from random walks

Returns:

maximum path length

cugraph_type_erased_device_array_view_t *cugraph_random_walk_result_get_paths(cugraph_random_walk_result_t *result)#

Get the matrix (row major order) of vertices in the paths.

Parameters:

result[in] The result from a random walk algorithm

Returns:

type erased array pointing to the path matrix in device memory
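
Row-major order means the vertices of one walk are stored contiguously, one walk per row. The sketch below shows how such a matrix could be indexed after it has been copied to host memory. It assumes int32_t vertex ids, a caller-supplied row length, and a caller-supplied padding value for walks that ended early; none of these are specified on this page, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Vertex at position `step` of walk `walk` in a row-major matrix whose
 * rows each hold `row_length` vertices. */
int32_t path_vertex_at(const int32_t *paths, size_t row_length,
                       size_t walk, size_t step)
{
  assert(paths != NULL && step < row_length);
  return paths[walk * row_length + step];
}

/* Number of leading entries in a row that are not `pad_value`.
 * The padding convention is an assumption made for this sketch. */
size_t path_valid_length(const int32_t *paths, size_t row_length,
                         size_t walk, int32_t pad_value)
{
  size_t count = 0;
  while (count < row_length &&
         paths[walk * row_length + count] != pad_value) {
    ++count;
  }
  return count;
}
```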

cugraph_type_erased_device_array_view_t *cugraph_random_walk_result_get_weights(cugraph_random_walk_result_t *result)#

Get the matrix (row major order) of edge weights in the paths.

Parameters:

result[in] The result from a random walk algorithm

Returns:

type erased array pointing to the path edge weights in device memory

cugraph_type_erased_device_array_view_t *cugraph_random_walk_result_get_path_sizes(cugraph_random_walk_result_t *result)#

If the random walk result is compressed, get the path sizes.

Deprecated:

This call will no longer be relevant once the new node2vec functions are used.

Parameters:

result[in] The result from a random walk algorithm

Returns:

type erased array pointing to the path sizes in device memory

void cugraph_random_walk_result_free(cugraph_random_walk_result_t *result)#

Free random walks result.

Parameters:

result[in] The result from random walks

cugraph_error_code_t cugraph_sampling_options_create(cugraph_sampling_options_t **options, cugraph_error_t **error)#

Create sampling options object.

All sampling options are initially set to FALSE.

Parameters:
  • options[out] Opaque pointer to the sampling options

  • error[out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS

Returns:

error code

void cugraph_sampling_set_retain_seeds(cugraph_sampling_options_t *options, bool_t value)#

Set flag to retain seeds (original sources)

Parameters:
  • options – Opaque pointer to the sampling options

  • value – Boolean value to assign to the option

void cugraph_sampling_set_renumber_results(cugraph_sampling_options_t *options, bool_t value)#

Set flag to renumber results.

Parameters:
  • options – Opaque pointer to the sampling options

  • value – Boolean value to assign to the option

void cugraph_sampling_set_compress_per_hop(cugraph_sampling_options_t *options, bool_t value)#

Set whether to compress per-hop (True) or globally (False)

Parameters:
  • options – Opaque pointer to the sampling options

  • value – Boolean value to assign to the option

void cugraph_sampling_set_with_replacement(cugraph_sampling_options_t *options, bool_t value)#

Set flag to sample with replacement.

Parameters:
  • options – Opaque pointer to the sampling options

  • value – Boolean value to assign to the option

void cugraph_sampling_set_return_hops(cugraph_sampling_options_t *options, bool_t value)#

Set flag to return hops in the sampling result.

Parameters:
  • options – Opaque pointer to the sampling options

  • value – Boolean value to assign to the option

void cugraph_sampling_set_compression_type(cugraph_sampling_options_t *options, cugraph_compression_type_t value)#

Set compression type.

Parameters:
  • options – Opaque pointer to the sampling options

  • value – Enum defining the compression type

void cugraph_sampling_set_prior_sources_behavior(cugraph_sampling_options_t *options, cugraph_prior_sources_behavior_t value)#

Set prior sources behavior.

Parameters:
  • options – Opaque pointer to the sampling options

  • value – Enum defining prior sources behavior

void cugraph_sampling_set_dedupe_sources(cugraph_sampling_options_t *options, bool_t value)#

Set flag to deduplicate sources prior to sampling.

Parameters:
  • options – Opaque pointer to the sampling options

  • value – Boolean value to assign to the option

void cugraph_sampling_set_temporal_sampling_comparison(cugraph_sampling_options_t *options, cugraph_temporal_sampling_comparison_t comparison)#

Set the comparison used for temporal sampling.

Parameters:
  • options – Opaque pointer to the sampling options

  • comparison – Comparison value to assign to the option

void cugraph_sampling_set_disjoint_sampling(cugraph_sampling_options_t *options, bool_t value)#

Set flag to perform disjoint sampling.

Note: This flag is not supported in the current implementation.

Parameters:
  • options – Opaque pointer to the sampling options

  • value – Boolean value to assign to the option

void cugraph_sampling_options_free(cugraph_sampling_options_t *options)#

Free sampling options object.

Parameters:

options[in] Opaque pointer to the sampling options object

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_majors(const cugraph_sample_result_t *result)#

Get the major vertices from the sampling algorithm result.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the major vertices in device memory

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_minors(const cugraph_sample_result_t *result)#

Get the minor vertices from the sampling algorithm result.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the minor vertices in device memory

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_major_offsets(const cugraph_sample_result_t *result)#

Get the major offsets from the sampling algorithm result.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the major offsets in device memory

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_start_labels(const cugraph_sample_result_t *result)#

Get the start labels from the sampling algorithm result.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the start labels

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_edge_id(const cugraph_sample_result_t *result)#

Get the edge_id from the sampling algorithm result.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the edge_id

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_edge_type(const cugraph_sample_result_t *result)#

Get the edge_type from the sampling algorithm result.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the edge_type

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_edge_weight(const cugraph_sample_result_t *result)#

Get the edge_weight from the sampling algorithm result.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the edge_weight

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_edge_start_time(const cugraph_sample_result_t *result)#

Get the edge_start_time from the sampling algorithm result.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the edge_start_time

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_edge_end_time(const cugraph_sample_result_t *result)#

Get the edge_end_time from the sampling algorithm result.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the edge_end_time

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_hop(const cugraph_sample_result_t *result)#

Get the hop from the sampling algorithm result.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the hop

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_label_hop_offsets(const cugraph_sample_result_t *result)#

Get the label-hop offsets from the sampling algorithm result.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the label-hop offsets
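
An offsets array of this kind is typically read as a list of half-open ranges into the edge arrays. The sketch below assumes a layout in which the entry for a given label and hop sits at index label * num_hops + hop and the array has one extra trailing entry. That layout is an assumption made for illustration, not something this page confirms, and edge_range_t and label_hop_range are hypothetical names operating on a host copy of the offsets.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open range [begin, end) of sampled edges. */
typedef struct {
  size_t begin;
  size_t end;
} edge_range_t;

/* Range of edges sampled for (label, hop), assuming the offsets are laid
 * out label-major with num_hops entries per label plus one final entry. */
edge_range_t label_hop_range(const size_t *label_hop_offsets,
                             size_t num_hops, size_t label, size_t hop)
{
  assert(label_hop_offsets != NULL && hop < num_hops);
  size_t slot = label * num_hops + hop;
  edge_range_t range = {label_hop_offsets[slot], label_hop_offsets[slot + 1]};
  return range;
}
```

An empty range (begin equal to end) means no edges were sampled for that label at that hop.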

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_label_type_hop_offsets(const cugraph_sample_result_t *result)#

Get the label-type-hop offsets from the sampling algorithm result.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the label-type-hop offsets

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_index(const cugraph_sample_result_t *result)#

Get the index from the sampling algorithm result.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the index

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_renumber_map(const cugraph_sample_result_t *result)#

Get the renumber map.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the renumber map

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_renumber_map_offsets(const cugraph_sample_result_t *result)#

Get the renumber map offsets.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the renumber map offsets
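
When results are renumbered, the renumber map is what translates a renumbered vertex id back to its original id. The sketch below assumes that the map is segmented per label by the renumber map offsets and that a renumbered id is an index into its label's segment. This layout is an assumption for illustration only, and original_vertex_id is a hypothetical helper working on host copies of both arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Translate a renumbered (local) vertex id back to the original id,
 * assuming renumber_map_offsets[label] .. renumber_map_offsets[label + 1]
 * delimits the segment of renumber_map that belongs to `label`. */
int32_t original_vertex_id(const int32_t *renumber_map,
                           const size_t *renumber_map_offsets,
                           size_t label, size_t local_id)
{
  assert(renumber_map != NULL && renumber_map_offsets != NULL);
  size_t begin = renumber_map_offsets[label];
  size_t end   = renumber_map_offsets[label + 1];
  assert(local_id < end - begin); /* id must fall inside the segment */
  return renumber_map[begin + local_id];
}
```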

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_edge_renumber_map(const cugraph_sample_result_t *result)#

Get the edge renumber map.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the edge renumber map

cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_edge_renumber_map_offsets(const cugraph_sample_result_t *result)#

Get the edge renumber map offsets.

Parameters:

result[in] The result from a sampling algorithm

Returns:

type erased array pointing to the edge renumber map offsets

void cugraph_sample_result_free(cugraph_sample_result_t *result)#

Free a sampling result.

Parameters:

result[in] The result from a sampling algorithm

cugraph_error_code_t cugraph_test_sample_result_create(const cugraph_resource_handle_t *handle, const cugraph_type_erased_device_array_view_t *srcs, const cugraph_type_erased_device_array_view_t *dsts, const cugraph_type_erased_device_array_view_t *edge_id, const cugraph_type_erased_device_array_view_t *edge_type, const cugraph_type_erased_device_array_view_t *wgt, const cugraph_type_erased_device_array_view_t *hop, const cugraph_type_erased_device_array_view_t *label, cugraph_sample_result_t **result, cugraph_error_t **error)#

Create a sampling result (testing API)

Parameters:
  • handle[in] Handle for accessing resources

  • srcs[in] Device array view to populate srcs

  • dsts[in] Device array view to populate dsts

  • edge_id[in] Device array view to populate edge_id (can be NULL)

  • edge_type[in] Device array view to populate edge_type (can be NULL)

  • wgt[in] Device array view to populate wgt (can be NULL)

  • hop[in] Device array view to populate hop

  • label[in] Device array view to populate label (can be NULL)

  • result[out] Pointer to the location to store the cugraph_sample_result_t*

  • error[out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS

Returns:

error code

cugraph_error_code_t cugraph_test_uniform_neighborhood_sample_result_create(const cugraph_resource_handle_t *handle, const cugraph_type_erased_device_array_view_t *srcs, const cugraph_type_erased_device_array_view_t *dsts, const cugraph_type_erased_device_array_view_t *edge_id, const cugraph_type_erased_device_array_view_t *edge_type, const cugraph_type_erased_device_array_view_t *weight, const cugraph_type_erased_device_array_view_t *hop, const cugraph_type_erased_device_array_view_t *label, cugraph_sample_result_t **result, cugraph_error_t **error)#

Create a sampling result (testing API)

Parameters:
  • handle[in] Handle for accessing resources

  • srcs[in] Device array view to populate srcs

  • dsts[in] Device array view to populate dsts

  • edge_id[in] Device array view to populate edge_id

  • edge_type[in] Device array view to populate edge_type

  • weight[in] Device array view to populate weight

  • hop[in] Device array view to populate hop

  • label[in] Device array view to populate label

  • result[out] Pointer to the location to store the cugraph_sample_result_t*

  • error[out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS

Returns:

error code

cugraph_error_code_t cugraph_select_random_vertices(const cugraph_resource_handle_t *handle, const cugraph_graph_t *graph, cugraph_rng_state_t *rng_state, size_t num_vertices, cugraph_type_erased_device_array_t **vertices, cugraph_error_t **error)#

Select random vertices from the graph.

Parameters:
  • handle[in] Handle for accessing resources

  • graph[in] Pointer to graph

  • rng_state[inout] State of the random number generator, updated with each call

  • num_vertices[in] Number of vertices to sample

  • vertices[out] Pointer to the location to store the device array of selected vertices

  • error[out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS

Returns:

error code
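
As a CPU illustration of what selecting random vertices can mean, the sketch below draws num_to_select distinct vertex ids from 0 .. num_graph_vertices - 1 with a partial Fisher-Yates shuffle. Sampling without replacement, int32_t ids, and a contiguous id range are assumptions made for this sketch, and the function names are hypothetical. It does not reproduce the library's algorithm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* splitmix64 step, used only to keep the sketch deterministic. */
static uint64_t next_u64(uint64_t *state)
{
  uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
  z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
  z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
  return z ^ (z >> 31);
}

/* Write num_to_select distinct ids into `selected`.
 * Returns 0 on success and -1 if the scratch allocation fails. */
int select_random_vertices_sketch(size_t num_graph_vertices,
                                  size_t num_to_select,
                                  uint64_t *rng_state, int32_t *selected)
{
  assert(rng_state != NULL && num_to_select <= num_graph_vertices);
  if (num_graph_vertices == 0) return 0;

  int32_t *pool = malloc(num_graph_vertices * sizeof *pool);
  if (pool == NULL) return -1;
  for (size_t i = 0; i < num_graph_vertices; ++i) pool[i] = (int32_t)i;

  /* Partial Fisher-Yates: fix one position per iteration by swapping in a
   * random element from the not-yet-fixed tail (modulo bias ignored). */
  for (size_t i = 0; i < num_to_select; ++i) {
    size_t remaining = num_graph_vertices - i;
    size_t j = i + (size_t)(next_u64(rng_state) % remaining);
    int32_t tmp = pool[i];
    pool[i] = pool[j];
    pool[j] = tmp;
    selected[i] = pool[i];
  }
  free(pool);
  return 0;
}
```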

cugraph_error_code_t cugraph_negative_sampling(const cugraph_resource_handle_t *handle, cugraph_rng_state_t *rng_state, cugraph_graph_t *graph, const cugraph_type_erased_device_array_view_t *vertices, const cugraph_type_erased_device_array_view_t *src_biases, const cugraph_type_erased_device_array_view_t *dst_biases, size_t num_samples, bool_t remove_duplicates, bool_t remove_existing_edges, bool_t exact_number_of_samples, bool_t do_expensive_check, cugraph_coo_t **result, cugraph_error_t **error)#

Perform negative sampling.

Negative sampling generates a COO structure defining edges according to the specified parameters

Parameters:
  • handle[in] Handle for accessing resources

  • rng_state[inout] State of the random number generator, updated with each call

  • graph[in] Pointer to graph

  • vertices[in] Vertex ids corresponding to the entries of src_biases and dst_biases. Ignored if neither src_biases nor dst_biases is specified. If vertices is specified, vertices[i] is the vertex id of src_biases[i] and dst_biases[i]. If vertices is not specified, i is the vertex id of src_biases[i] and dst_biases[i]

  • src_biases[in] Bias for selecting source vertices. If NULL, sample uniformly. If provided, the probability of selecting vertex i is src_biases[i] / (sum of all source biases)

  • dst_biases[in] Bias for selecting destination vertices. If NULL, sample uniformly. If provided, the probability of selecting vertex i is dst_biases[i] / (sum of all destination biases)

  • num_samples[in] Number of negative samples to generate

  • remove_duplicates[in] If true, remove duplicates from sampled edges

  • remove_existing_edges[in] If true, remove sampled edges that actually exist in the graph

  • exact_number_of_samples[in] If true, the result contains exactly num_samples samples. If false, the code generates num_samples samples and then applies any filtering specified, so the result may contain fewer

  • do_expensive_check[in] A flag to run expensive checks for input arguments (if set to true)

  • result[out] Opaque pointer to generated coo list

  • error[out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS

Returns:

error code
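
The relationship between vertices, the bias arrays, and the resulting selection probabilities described above can be sketched on the CPU as follows. The code turns one bias array into a per-vertex probability table: uniform when the biases are NULL, otherwise bias divided by the sum of all biases, with vertices[i] naming the vertex for entry i when it is provided. The function bias_to_probability is a hypothetical helper, double-precision biases and int32_t ids are assumptions, and this is not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill probability[0 .. num_vertices - 1] from an optional bias array.
 * vertices may be NULL, in which case entry i refers to vertex i.
 * biases may be NULL, in which case every vertex is equally likely. */
void bias_to_probability(const int32_t *vertices, const double *biases,
                         size_t num_biases, size_t num_vertices,
                         double *probability)
{
  assert(probability != NULL && num_vertices > 0);
  for (size_t v = 0; v < num_vertices; ++v) probability[v] = 0.0;

  if (biases == NULL) { /* uniform sampling */
    for (size_t v = 0; v < num_vertices; ++v) {
      probability[v] = 1.0 / (double)num_vertices;
    }
    return;
  }

  double total = 0.0;
  for (size_t i = 0; i < num_biases; ++i) total += biases[i];
  assert(total > 0.0);

  for (size_t i = 0; i < num_biases; ++i) {
    size_t v = (vertices != NULL) ? (size_t)vertices[i] : i;
    assert(v < num_vertices);
    probability[v] = biases[i] / total; /* bias[i] / (sum of all biases) */
  }
}
```

Vertices that do not appear in vertices receive probability zero in this sketch.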