Sampling#
Uniform Random Walks#
-
cugraph_error_code_t cugraph_uniform_random_walks(const cugraph_resource_handle_t *handle, cugraph_graph_t *graph, const cugraph_type_erased_device_array_view_t *start_vertices, size_t max_length, cugraph_random_walk_result_t **result, cugraph_error_t **error)#
Compute uniform random walks.
- Parameters:
handle – [in] Handle for accessing resources
graph – [in] Pointer to graph. NOTE: Graph might be modified if the storage needs to be transposed
start_vertices – [in] Array of source vertices
max_length – [in] Maximum length of the generated path
result – [out] Output from the uniform random walks call
error – [out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS
- Returns:
error code
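As a conceptual illustration of what a uniform random walk computes, the following is a minimal host-side sketch in plain C. It is not the cuGraph implementation and does not call the cuGraph API: the CSR arrays (`offsets`, `indices`), the helper names `next_random` and `uniform_walk`, and the choice to stop at a vertex with no outgoing edges are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tiny linear congruential generator so the sketch has no external dependencies. */
static uint32_t next_random(uint32_t *state)
{
  *state = *state * 1664525u + 1013904223u;
  return *state >> 8; /* drop the weakest low bits */
}

/*
 * Walk up to max_length edges from `start` over a CSR graph, choosing each
 * outgoing edge of the current vertex with (approximately) equal probability.
 * The modulo below adds a negligible bias, which is acceptable for a sketch.
 * Writes the visited vertices, including `start`, to `path` and returns how
 * many were written. `path` must have room for max_length + 1 entries.
 */
size_t uniform_walk(const size_t *offsets, const int32_t *indices, int32_t start,
                    size_t max_length, uint32_t *rng_state, int32_t *path)
{
  assert(offsets != NULL && indices != NULL && rng_state != NULL && path != NULL);
  size_t count    = 0;
  int32_t current = start;
  path[count++]   = current;
  for (size_t step = 0; step < max_length; ++step) {
    size_t degree = offsets[current + 1] - offsets[current];
    if (degree == 0) { break; } /* dead end: this sketch stops the walk early */
    size_t pick = next_random(rng_state) % degree;
    current     = indices[offsets[current] + pick];
    path[count++] = current;
  }
  return count;
}
```

For a chain graph 0 -> 1 -> 2, a walk from vertex 0 with `max_length` 5 visits 0, 1, 2 and then stops because vertex 2 has no outgoing edges. How the library itself represents walks that end early is not described in this reference.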
Biased Random Walks#
-
cugraph_error_code_t cugraph_biased_random_walks(const cugraph_resource_handle_t *handle, cugraph_graph_t *graph, const cugraph_type_erased_device_array_view_t *start_vertices, size_t max_length, cugraph_random_walk_result_t **result, cugraph_error_t **error)#
Compute biased random walks.
- Parameters:
handle – [in] Handle for accessing resources
graph – [in] Pointer to graph. NOTE: Graph might be modified if the storage needs to be transposed
start_vertices – [in] Array of source vertices
max_length – [in] Maximum length of the generated path
result – [out] Output from the biased random walks call
error – [out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS
- Returns:
error code
Random Walks via Node2Vec#
-
cugraph_error_code_t cugraph_node2vec_random_walks(const cugraph_resource_handle_t *handle, cugraph_graph_t *graph, const cugraph_type_erased_device_array_view_t *start_vertices, size_t max_length, double p, double q, cugraph_random_walk_result_t **result, cugraph_error_t **error)#
Compute random walks using the node2vec framework.
- Parameters:
handle – [in] Handle for accessing resources
graph – [in] Pointer to graph. NOTE: Graph might be modified if the storage needs to be transposed
start_vertices – [in] Array of source vertices
max_length – [in] Maximum length of the generated path
p – [in] The return parameter
q – [in] The in/out parameter
result – [out] Output from the node2vec call
error – [out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS
- Returns:
error code
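The roles of `p` and `q` can be illustrated with the standard node2vec formulation, in which the unnormalized bias for the next step depends on how far the candidate vertex is from the previously visited one. The sketch below is plain C written for illustration only; the helper names `is_neighbor` and `node2vec_bias` are hypothetical, and this reference does not state that cuGraph computes the bias in exactly this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if `v` appears in the given neighbor list. */
bool is_neighbor(const int *neighbors, size_t count, int v)
{
  for (size_t i = 0; i < count; ++i) {
    if (neighbors[i] == v) { return true; }
  }
  return false;
}

/*
 * Unnormalized node2vec bias for stepping to `next` when the walk arrived at
 * the current vertex from `prev`:
 *   1/p  if `next` is `prev` itself (returning to where the walk came from),
 *   1    if `next` is also a neighbor of `prev` (staying close),
 *   1/q  otherwise (moving farther away).
 * For weighted graphs this factor is multiplied by the edge weight.
 */
double node2vec_bias(int prev, int next, const int *prev_neighbors,
                     size_t prev_neighbor_count, double p, double q)
{
  assert(p > 0.0 && q > 0.0);
  if (next == prev) { return 1.0 / p; }
  if (is_neighbor(prev_neighbors, prev_neighbor_count, next)) { return 1.0; }
  return 1.0 / q;
}
```

With `p = 2` and `q = 4`, returning to the previous vertex gets bias 0.5, a shared neighbor gets 1.0, and any other vertex gets 0.25. A large `p` therefore discourages backtracking, and a large `q` keeps the walk local.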
Node2Vec#
-
cugraph_error_code_t cugraph_node2vec(const cugraph_resource_handle_t *handle, cugraph_graph_t *graph, const cugraph_type_erased_device_array_view_t *sources, size_t max_depth, bool_t compress_result, double p, double q, cugraph_random_walk_result_t **result, cugraph_error_t **error)#
Compute random walks using the node2vec framework.
- Deprecated:
This call should be replaced with cugraph_node2vec_random_walks
- Parameters:
handle – [in] Handle for accessing resources
graph – [in] Pointer to graph. NOTE: Graph might be modified if the storage needs to be transposed
sources – [in] Array of source vertices
max_depth – [in] Maximum length of the generated path
compress_result – [in] If true, return the paths as a compressed sparse row matrix, otherwise return as a dense matrix
p – [in] The return parameter
q – [in] The in/out parameter
result – [out] Output from the node2vec call
error – [out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS
- Returns:
error code
Uniform Neighbor Sampling#
-
cugraph_error_code_t cugraph_uniform_neighbor_sample(const cugraph_resource_handle_t *handle, cugraph_graph_t *graph, const cugraph_type_erased_device_array_view_t *start_vertices, const cugraph_type_erased_device_array_view_t *start_vertex_labels, const cugraph_type_erased_device_array_view_t *label_list, const cugraph_type_erased_device_array_view_t *label_to_comm_rank, const cugraph_type_erased_device_array_view_t *label_offsets, const cugraph_type_erased_host_array_view_t *fan_out, cugraph_rng_state_t *rng_state, const cugraph_sampling_options_t *options, bool_t do_expensive_check, cugraph_sample_result_t **result, cugraph_error_t **error)#
Uniform Neighborhood Sampling.
- Deprecated:
This API will be deleted, use cugraph_homogeneous_uniform_neighbor_sample
Returns a sample of the neighborhood around specified start vertices. Optionally, each start vertex can be associated with a label, allowing the caller to specify multiple batches of sampling requests in the same function call - which should improve GPU utilization.
If start_vertex_labels is NULL then all start vertices will be considered part of the same batch and the return value will not have a label column.
- Parameters:
handle – [in] Handle for accessing resources
graph – [in] Pointer to graph. NOTE: Graph might be modified if the storage needs to be transposed
start_vertices – [in] Device array of start vertices for the sampling
start_vertex_labels – [in] Device array of start vertex labels for the sampling. The label associated with each start vertex will be included in the output for results that were derived from that start vertex. Only labels of type INT32 are supported. If start_vertex_labels is NULL, the returned data will not be labeled.
label_list – [in] Device array of the labels included in start_vertex_labels. If label_to_comm_rank is not specified this parameter is ignored. If specified, label_list must be sorted in ascending order.
label_to_comm_rank – [in] Device array identifying which comm rank the output for a particular label should be shuffled to. If specified, all data from label_list[i] will be shuffled to rank label_to_comm_rank[i]. This cannot be specified unless start_vertex_labels is also specified. If not specified, the output data will not be shuffled between ranks.
label_offsets – [in] Device array of the offsets for each label in the seed list. This parameter is only used with the retain_seeds option.
fan_out – [in] Host array defining the fan out at each step in the sampling algorithm. We only support fan_out values of type INT32
rng_state – [inout] State of the random number generator, updated with each call
options – [in] Opaque pointer defining the sampling options.
do_expensive_check – [in] A flag to run expensive checks for input arguments (if set to true)
result – [out] Output from the uniform_neighbor_sample call
error – [out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS
- Returns:
error code
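To make the effect of `fan_out` concrete, the sketch below computes an upper bound on the number of edges a single batch can return. It is a plain C illustration, not part of the cuGraph API: the function name `max_sampled_edges` is hypothetical, and the bound assumes every fan-out value is positive and that the frontier at each hop consists only of the vertices reached by the previous hop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Upper bound on the number of sampled edges for one batch.
 * At hop h, every frontier vertex contributes at most fan_out[h] edges, and
 * the next frontier can be no larger than the number of edges just sampled.
 */
size_t max_sampled_edges(size_t num_start_vertices, const int32_t *fan_out, size_t num_hops)
{
  size_t frontier = num_start_vertices;
  size_t total    = 0;
  for (size_t h = 0; h < num_hops; ++h) {
    assert(fan_out[h] > 0); /* assumption of this sketch */
    frontier *= (size_t)fan_out[h];
    total += frontier;
  }
  return total;
}
```

For example, one start vertex with a fan-out of 3 at the first hop and 2 at the second yields at most 3 + 3 * 2 = 9 edges. The actual count is usually lower, because vertices may have fewer neighbors than the requested fan-out.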
Sampling Support Functions#
-
cugraph_type_erased_device_array_view_t *cugraph_lookup_result_get_srcs(const cugraph_lookup_result_t *result)#
Get the edge sources from the lookup result.
- Parameters:
result – [in] The result from src-dst lookup using edge ids and type(s)
- Returns:
type erased array pointing to the edge sources
-
cugraph_type_erased_device_array_view_t *cugraph_lookup_result_get_dsts(const cugraph_lookup_result_t *result)#
Get the edge destinations from the lookup result.
- Parameters:
result – [in] The result from src-dst lookup using edge ids and type(s)
- Returns:
type erased array pointing to the edge destinations
-
void cugraph_lookup_result_free(cugraph_lookup_result_t *result)#
Free a src-dst lookup result.
- Parameters:
result – [in] The result from src-dst lookup using edge ids and type(s)
-
void cugraph_lookup_container_free(cugraph_lookup_container_t *container)#
Free a sampling lookup map.
- Parameters:
container – [in] The sampling lookup map (a.k.a. container).
-
size_t cugraph_random_walk_result_get_max_path_length(cugraph_random_walk_result_t *result)#
Get the max path length from random walk result.
- Parameters:
result – [in] The result from random walks
- Returns:
maximum path length
-
cugraph_type_erased_device_array_view_t *cugraph_random_walk_result_get_paths(cugraph_random_walk_result_t *result)#
Get the matrix (row major order) of vertices in the paths.
- Parameters:
result – [in] The result from a random walk algorithm
- Returns:
type erased array pointing to the path matrix in device memory
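Because the paths are stored as a row-major matrix, a single vertex can be located with one multiply and one add once the data is on the host. The helper below is a plain C sketch, not a cuGraph function: it assumes 32-bit vertex ids, that the array has already been copied to host memory, and that the caller knows `row_stride`, the number of entries stored per walk. This reference does not state how `row_stride` relates to the maximum path length, so the caller should derive it from the result (for example, the array size divided by the number of start vertices).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the vertex at position `step` of walk number `walk` in a row-major
 * path matrix held in host memory. Each walk occupies one row of
 * `row_stride` consecutive entries.
 */
int32_t path_vertex(const int32_t *h_paths, size_t row_stride, size_t walk, size_t step)
{
  assert(h_paths != NULL);
  assert(step < row_stride);
  return h_paths[walk * row_stride + step];
}
```

With two walks of three entries each stored as `{0, 1, 2, 5, 6, 7}`, `path_vertex(h_paths, 3, 1, 2)` reads the last vertex of the second walk.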
-
cugraph_type_erased_device_array_view_t *cugraph_random_walk_result_get_weights(cugraph_random_walk_result_t *result)#
Get the matrix (row major order) of edge weights in the paths.
- Parameters:
result – [in] The result from a random walk algorithm
- Returns:
type erased array pointing to the path edge weights in device memory
-
cugraph_type_erased_device_array_view_t *cugraph_random_walk_result_get_path_sizes(cugraph_random_walk_result_t *result)#
If the random walk result is compressed, get the path sizes.
- Deprecated:
This call will no longer be relevant once callers have moved to the new node2vec functions
- Parameters:
result – [in] The result from a random walk algorithm
- Returns:
type erased array pointing to the path sizes in device memory
-
void cugraph_random_walk_result_free(cugraph_random_walk_result_t *result)#
Free random walks result.
- Parameters:
result – [in] The result from random walks
-
cugraph_error_code_t cugraph_sampling_options_create(cugraph_sampling_options_t **options, cugraph_error_t **error)#
Create sampling options object.
All sampling options are initially set to FALSE.
- Parameters:
options – [out] Opaque pointer to the sampling options
error – [out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS
-
void cugraph_sampling_set_retain_seeds(cugraph_sampling_options_t *options, bool_t value)#
Set flag to retain seeds (original sources)
- Parameters:
options – Opaque pointer to the sampling options
value – Boolean value to assign to the option
-
void cugraph_sampling_set_renumber_results(cugraph_sampling_options_t *options, bool_t value)#
Set flag to renumber results.
- Parameters:
options – Opaque pointer to the sampling options
value – Boolean value to assign to the option
-
void cugraph_sampling_set_compress_per_hop(cugraph_sampling_options_t *options, bool_t value)#
Set whether to compress per-hop (True) or globally (False)
- Parameters:
options – Opaque pointer to the sampling options
value – Boolean value to assign to the option
-
void cugraph_sampling_set_with_replacement(cugraph_sampling_options_t *options, bool_t value)#
Set flag to sample with replacement.
- Parameters:
options – Opaque pointer to the sampling options
value – Boolean value to assign to the option
-
void cugraph_sampling_set_return_hops(cugraph_sampling_options_t *options, bool_t value)#
Set flag to return hops.
- Parameters:
options – Opaque pointer to the sampling options
value – Boolean value to assign to the option
-
void cugraph_sampling_set_compression_type(cugraph_sampling_options_t *options, cugraph_compression_type_t value)#
Set compression type.
- Parameters:
options – Opaque pointer to the sampling options
value – Enum defining the compression type
-
void cugraph_sampling_set_prior_sources_behavior(cugraph_sampling_options_t *options, cugraph_prior_sources_behavior_t value)#
Set prior sources behavior.
- Parameters:
options – Opaque pointer to the sampling options
value – Enum defining prior sources behavior
-
void cugraph_sampling_set_dedupe_sources(cugraph_sampling_options_t *options, bool_t value)#
Set flag to dedupe sources prior to sampling.
- Parameters:
options – Opaque pointer to the sampling options
value – Boolean value to assign to the option
-
void cugraph_sampling_options_free(cugraph_sampling_options_t *options)#
Free sampling options object.
- Parameters:
options – [in] Opaque pointer to the sampling options object
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_majors(const cugraph_sample_result_t *result)#
Get the major vertices from the sampling algorithm result.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the major vertices in device memory
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_minors(const cugraph_sample_result_t *result)#
Get the minor vertices from the sampling algorithm result.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the minor vertices in device memory
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_major_offsets(const cugraph_sample_result_t *result)#
Get the major offsets from the sampling algorithm result.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the major offsets in device memory
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_start_labels(const cugraph_sample_result_t *result)#
Get the start labels from the sampling algorithm result.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the start labels
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_edge_id(const cugraph_sample_result_t *result)#
Get the edge_id from the sampling algorithm result.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the edge_id
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_edge_type(const cugraph_sample_result_t *result)#
Get the edge_type from the sampling algorithm result.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the edge_type
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_edge_weight(const cugraph_sample_result_t *result)#
Get the edge_weight from the sampling algorithm result.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the edge_weight
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_hop(const cugraph_sample_result_t *result)#
Get the hop from the sampling algorithm result.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the hop
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_label_hop_offsets(const cugraph_sample_result_t *result)#
Get the label-hop offsets from the sampling algorithm result.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the label-hop offsets
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_index(const cugraph_sample_result_t *result)#
Get the index from the sampling algorithm result.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the index
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_renumber_map(const cugraph_sample_result_t *result)#
Get the renumber map.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the renumber map
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_renumber_map_offsets(const cugraph_sample_result_t *result)#
Get the renumber map offsets.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the renumber map offsets
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_edge_renumber_map(const cugraph_sample_result_t *result)#
Get the edge renumber map.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the edge renumber map
-
cugraph_type_erased_device_array_view_t *cugraph_sample_result_get_edge_renumber_map_offsets(const cugraph_sample_result_t *result)#
Get the edge renumber map offsets.
- Parameters:
result – [in] The result from a sampling algorithm
- Returns:
type erased array pointing to the edge renumber map offsets
-
void cugraph_sample_result_free(cugraph_sample_result_t *result)#
Free a sampling result.
- Parameters:
result – [in] The result from a sampling algorithm
-
cugraph_error_code_t cugraph_test_sample_result_create(const cugraph_resource_handle_t *handle, const cugraph_type_erased_device_array_view_t *srcs, const cugraph_type_erased_device_array_view_t *dsts, const cugraph_type_erased_device_array_view_t *edge_id, const cugraph_type_erased_device_array_view_t *edge_type, const cugraph_type_erased_device_array_view_t *wgt, const cugraph_type_erased_device_array_view_t *hop, const cugraph_type_erased_device_array_view_t *label, cugraph_sample_result_t **result, cugraph_error_t **error)#
Create a sampling result (testing API)
- Parameters:
handle – [in] Handle for accessing resources
srcs – [in] Device array view to populate srcs
dsts – [in] Device array view to populate dsts
edge_id – [in] Device array view to populate edge_id (can be NULL)
edge_type – [in] Device array view to populate edge_type (can be NULL)
wgt – [in] Device array view to populate wgt (can be NULL)
hop – [in] Device array view to populate hop
label – [in] Device array view to populate label (can be NULL)
result – [out] Pointer to the location to store the cugraph_sample_result_t*
error – [out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS
- Returns:
error code
-
cugraph_error_code_t cugraph_test_uniform_neighborhood_sample_result_create(const cugraph_resource_handle_t *handle, const cugraph_type_erased_device_array_view_t *srcs, const cugraph_type_erased_device_array_view_t *dsts, const cugraph_type_erased_device_array_view_t *edge_id, const cugraph_type_erased_device_array_view_t *edge_type, const cugraph_type_erased_device_array_view_t *weight, const cugraph_type_erased_device_array_view_t *hop, const cugraph_type_erased_device_array_view_t *label, cugraph_sample_result_t **result, cugraph_error_t **error)#
Create a sampling result (testing API)
- Parameters:
handle – [in] Handle for accessing resources
srcs – [in] Device array view to populate srcs
dsts – [in] Device array view to populate dsts
edge_id – [in] Device array view to populate edge_id
edge_type – [in] Device array view to populate edge_type
weight – [in] Device array view to populate weight
hop – [in] Device array view to populate hop
label – [in] Device array view to populate label
result – [out] Pointer to the location to store the cugraph_sample_result_t*
error – [out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS
- Returns:
error code
-
cugraph_error_code_t cugraph_select_random_vertices(const cugraph_resource_handle_t *handle, const cugraph_graph_t *graph, cugraph_rng_state_t *rng_state, size_t num_vertices, cugraph_type_erased_device_array_t **vertices, cugraph_error_t **error)#
Select random vertices from the graph.
- Parameters:
handle – [in] Handle for accessing resources
graph – [in] Pointer to graph
rng_state – [inout] State of the random number generator, updated with each call
num_vertices – [in] Number of vertices to sample
vertices – [out] Device array containing the randomly selected vertices
error – [out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS
- Returns:
error code
-
cugraph_error_code_t cugraph_negative_sampling(const cugraph_resource_handle_t *handle, cugraph_rng_state_t *rng_state, cugraph_graph_t *graph, const cugraph_type_erased_device_array_view_t *vertices, const cugraph_type_erased_device_array_view_t *src_biases, const cugraph_type_erased_device_array_view_t *dst_biases, size_t num_samples, bool_t remove_duplicates, bool_t remove_existing_edges, bool_t exact_number_of_samples, bool_t do_expensive_check, cugraph_coo_t **result, cugraph_error_t **error)#
Perform negative sampling.
Negative sampling generates a COO structure defining edges according to the specified parameters
- Parameters:
handle – [in] Handle for accessing resources
rng_state – [inout] State of the random number generator, updated with each call
graph – [in] Pointer to graph
vertices – [in] Vertex ids for the source biases. If src_biases and dst_biases are not specified this is ignored. If vertices is specified then vertices[i] is the vertex id of src_biases[i] and dst_biases[i]. If vertices is not specified then i is the vertex id of src_biases[i] and dst_biases[i].
src_biases – [in] Bias for selecting source vertices. If NULL, do uniform sampling. If provided, the probability of vertex i will be src_biases[i] / (sum of all source biases)
dst_biases – [in] Bias for selecting destination vertices. If NULL, do uniform sampling. If provided, the probability of vertex i will be dst_biases[i] / (sum of all destination biases)
num_samples – [in] Number of negative samples to generate
remove_duplicates – [in] If true, remove duplicates from sampled edges
remove_existing_edges – [in] If true, remove sampled edges that actually exist in the graph
exact_number_of_samples – [in] If true, result should contain exactly num_samples. If false the code will generate num_samples and then do any filtering as specified
do_expensive_check – [in] A flag to run expensive checks for input arguments (if set to true)
result – [out] Opaque pointer to generated coo list
error – [out] Pointer to an error object storing details of any error. Will be populated if error code is not CUGRAPH_SUCCESS
- Returns:
error code
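The bias parameters above describe a simple normalization: each vertex is selected with probability equal to its bias divided by the sum of all biases. The plain C sketch below works through that arithmetic on the host. The function name `biases_to_probabilities` is hypothetical and the code is not part of the cuGraph API; it only shows how the documented formula turns raw biases into selection probabilities.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill prob[i] = bias[i] / (sum of all biases).
 * Returns 0 on success, or -1 if the biases do not sum to a positive value
 * (in which case no distribution can be formed and prob is left untouched).
 */
int biases_to_probabilities(const double *bias, size_t n, double *prob)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; ++i) {
    assert(bias[i] >= 0.0); /* assumption of this sketch: no negative biases */
    sum += bias[i];
  }
  if (sum <= 0.0) { return -1; }
  for (size_t i = 0; i < n; ++i) {
    prob[i] = bias[i] / sum;
  }
  return 0;
}
```

For biases `{1, 1, 2}` the resulting probabilities are 0.25, 0.25 and 0.5, so the third vertex is twice as likely to be chosen as either of the others.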