Link Analysis#

template<typename vertex_t, typename edge_t, typename weight_t, typename result_t, bool multi_gpu>
void pagerank(raft::handle_t const &handle, graph_view_t<vertex_t, edge_t, true, multi_gpu> const &graph_view, std::optional<edge_property_view_t<edge_t, weight_t const*>> edge_weight_view, std::optional<weight_t const*> precomputed_vertex_out_weight_sums, std::optional<vertex_t const*> personalization_vertices, std::optional<result_t const*> personalization_values, std::optional<vertex_t> personalization_vector_size, result_t *pageranks, result_t alpha, result_t epsilon, size_t max_iterations = 500, bool has_initial_guess = false, bool do_expensive_check = false)#

Compute PageRank scores.

Deprecated:

This API will be deprecated to replaced by the new version below that returns metadata about the algorithm.

This function computes general (if personalization_vertices is nullptr) or personalized (if personalization_vertices is not nullptr.) PageRank scores.

Throws:

cugraph::logic_error – on erroneous input arguments or if fails to converge before max_iterations.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

edge_t – Type of edge identifiers. Needs to be an integral type.

weight_t – Type of edge weights. Needs to be a floating point type.

result_t – Type of PageRank scores.

multi_gpu – Flag indicating whether template instantiation should target single-GPU (false) or multi-GPU (true).

Parameters:

handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

graph_view – Graph view object.

edge_weight_view – Optional view object holding edge weights for graph_view. If edge_weight_view.has_value() == false, edge weights are assumed to be 1.0.

precomputed_vertex_out_weight_sums – Pointer to an array storing sums of out-going edge weights for the vertices (for re-use) or std::nullopt. If std::nullopt, these values are freshly computed. Computing these values outside this function reduces the number of memory allocations/deallocations and computing if a user repeatedly computes PageRank scores using the same graph with different personalization vectors.

personalization_vertices – Pointer to an array storing personalization vertex identifiers (compute personalized PageRank) or std::nullopt (compute general PageRank).

personalization_values – Pointer to an array storing personalization values for the vertices in the personalization set. Relevant only if personalization_vertices is not std::nullopt.

personalization_vector_size – Size of the personalization set. If @personalization_vertices is not std::nullopt, the sizes of the arrays pointed by personalization_vertices and personalization_values should be personalization_vector_size.

pageranks – Pointer to the output PageRank score array.

alpha – PageRank damping factor.

epsilon – Error tolerance to check convergence. Convergence is assumed if the sum of the differences in PageRank values between two consecutive iterations is less than the number of vertices in the graph multiplied by epsilon.

max_iterations – Maximum number of PageRank iterations.

has_initial_guess – If set to true, values in the PageRank output array (pointed by pageranks) is used as initial PageRank values. If false, initial PageRank values are set to 1.0 divided by the number of vertices in the graph.

do_expensive_check – A flag to run expensive checks for input arguments (if set to true).

template<typename vertex_t, typename edge_t, typename weight_t, typename result_t, bool multi_gpu>
std::tuple<rmm::device_uvector<result_t>, centrality_algorithm_metadata_t> pagerank(raft::handle_t const &handle, graph_view_t<vertex_t, edge_t, true, multi_gpu> const &graph_view, std::optional<edge_property_view_t<edge_t, weight_t const*>> edge_weight_view, std::optional<raft::device_span<weight_t const>> precomputed_vertex_out_weight_sums, std::optional<std::tuple<raft::device_span<vertex_t const>, raft::device_span<result_t const>>> personalization, std::optional<raft::device_span<result_t const>> initial_pageranks, result_t alpha, result_t epsilon, size_t max_iterations = 500, bool do_expensive_check = false)#

Compute PageRank scores.

.*

This function computes general (if personalization_vertices is nullptr) or personalized (if personalization_vertices is not nullptr.) PageRank scores.

Throws:

cugraph::logic_error – on erroneous input arguments or if fails to converge before max_iterations.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

edge_t – Type of edge identifiers. Needs to be an integral type.

weight_t – Type of edge weights. Needs to be a floating point type.

result_t – Type of PageRank scores.

multi_gpu – Flag indicating whether template instantiation should target single-GPU (false) or multi-GPU (true).

Parameters:

handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

graph_view – Graph view object.

edge_weight_view – Optional view object holding edge weights for graph_view. If edge_weight_view.has_value() == false, edge weights are assumed to be 1.0.

precomputed_vertex_out_weight_sums – Pointer to an array storing sums of out-going edge weights for the vertices (for re-use) or std::nullopt. If std::nullopt, these values are freshly computed. Computing these values outside this function reduces the number of memory allocations/deallocations and computing if a user repeatedly computes PageRank scores using the same graph with different personalization vectors.

personalization – Optional tuple containing device spans of vertex identifiers and personalization values for the vertices (compute personalized PageRank) or std::nullopt (compute general PageRank).

initial_pageranks – Optional device span containing initial PageRank values. If specified this array will be used as the initial values and the PageRank values will be updated in place. If not specified then the initial values will be set to 1.0 divided by the number of vertices in the graph and the return value will contain an rmm::device_uvector containing the resulting PageRank values.

alpha – PageRank damping factor.

epsilon – Error tolerance to check convergence. Convergence is assumed if the sum of the differences in PageRank values between two consecutive iterations is less than the number of vertices in the graph multiplied by epsilon.

max_iterations – Maximum number of PageRank iterations.

do_expensive_check – A flag to run expensive checks for input arguments (if set to true).

Returns:

tuple containing the optional pagerank results (populated if initial_pageranks is set to std::nullopt) and a metadata structure with metadata indicating how many iterations were run and whether the algorithm converged or not.

template<typename vertex_t, typename edge_t, typename result_t, bool multi_gpu>
std::tuple<result_t, size_t> hits(raft::handle_t const &handle, graph_view_t<vertex_t, edge_t, true, multi_gpu> const &graph_view, result_t *hubs, result_t *authorities, result_t epsilon, size_t max_iterations, bool has_initial_hubs_guess, bool normalize, bool do_expensive_check)#

Compute HITS scores.

.*

This function computes HITS scores for the vertices of a graph

Throws:

cugraph::logic_error – on erroneous input arguments

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

edge_t – Type of edge identifiers. Needs to be an integral type.

multi_gpu – Flag indicating whether template instantiation should target single-GPU (false) or multi-GPU (true).

Parameters:

handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

graph_view – Graph view object.

hubs – Pointer to the input/output hub score array.

authorities – Pointer to the output authorities score array.

epsilon – Error tolerance to check convergence. Convergence is assumed if the sum of the differences in hub values between two consecutive iterations is less than epsilon

max_iterations – Maximum number of HITS iterations.

has_initial_guess – If set to true, values in the hubs output array (pointed by hubs) is used as initial hub values. If false, initial hub values are set to 1.0 divided by the number of vertices in the graph.

normalize – If set to true, final hub and authority scores are normalized (the L1-norm of the returned hub and authority score arrays is 1.0) before returning.

do_expensive_check – A flag to run expensive checks for input arguments (if set to true).

Returns:

std::tuple<result_t, size_t> A tuple of sum of the differences of hub scores of the last two iterations and the total number of iterations taken to reach the final result

Link Analysis#

This Page