Graph Generators#

template<typename vertex_t>
std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> generate_rmat_edgelist(raft::handle_t const &handle, size_t scale, size_t num_edges, double a = 0.57, double b = 0.19, double c = 0.19, uint64_t seed = 0, bool clip_and_flip = false, bool scramble_vertex_ids = false)#

generate an edge list for an R-mat graph.

Deprecated:

This function will be deprectated and should be replaced with the version that takes raft::random::RngState as a parameter

This function allows multi-edges and self-loops similar to the Graph 500 reference implementation.

NOTE: The scramble_vertex_ids function needs to be called in order to generate a graph conforming to the Graph 500 specification (note that scrambling does not affect cuGraph’s graph construction performance, so this is generally unnecessary). If edge_factor is given (e.g. Graph 500), set num_edges to (size_t{1} << scale) * edge_factor. To generate an undirected graph, set b == c and clip_and_flip = true. All the resulting edges will be placed in the lower triangular part (including the diagonal) of the graph adjacency matrix.

For multi-GPU generation with P GPUs, seed should be set to different values in different GPUs to avoid every GPU generating the same set of edges. num_edges should be adjusted as well; e.g. assuming edge_factor is given, set num_edges = (size_t{1} << scale) * edge_factor / P + (rank < (((size_t{1} << scale) * edge_factor) % P) ? 1 : 0).

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • scale – Scale factor to set the number of verties in the graph. Vertex IDs have values in [0, V), where V = 1 << scale.

  • num_edges – Number of edges to generate.

  • a – a, b, c, d (= 1.0 - (a + b + c)) in the R-mat graph generator (vist https://graph500.org for additional details). a, b, c, d should be non-negative and a + b + c should be no larger than 1.0.

  • b – a, b, c, d (= 1.0 - (a + b + c)) in the R-mat graph generator (vist https://graph500.org for additional details). a, b, c, d should be non-negative and a + b + c should be no larger than 1.0.

  • c – a, b, c, d (= 1.0 - (a + b + c)) in the R-mat graph generator (vist https://graph500.org for additional details). a, b, c, d should be non-negative and a + b + c should be no larger than 1.0.

  • seed – Seed value for the random number generator.

  • clip_and_flip – Flag controlling whether to generate edges only in the lower triangular part (including the diagonal) of the graph adjacency matrix (if set to true) or not (if set to false).

  • scramble_vertex_ids – Flag controlling whether to scramble vertex ID bits (if set to true) or not (if set to false); scrambling vertex ID bits breaks correlation between vertex ID values and vertex degrees.

Returns:

std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> A tuple of rmm::device_uvector objects for edge source vertex IDs and edge destination vertex IDs.

template<typename vertex_t>
std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> generate_rmat_edgelist(raft::handle_t const &handle, raft::random::RngState &rng_state, size_t scale, size_t num_edges, double a = 0.57, double b = 0.19, double c = 0.19, bool clip_and_flip = false, bool scramble_vertex_ids = false)#

generate an edge list for an R-mat graph.

This function allows multi-edges and self-loops similar to the Graph 500 reference implementation.

NOTE: The scramble_vertex_ids function needs to be called in order to generate a graph conforming to the Graph 500 specification (note that scrambling does not affect cuGraph’s graph construction performance, so this is generally unnecessary). If edge_factor is given (e.g. Graph 500), set num_edges to (size_t{1} << scale) * edge_factor. To generate an undirected graph, set b == c and clip_and_flip = true. All the resulting edges will be placed in the lower triangular part (including the diagonal) of the graph adjacency matrix.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • rng_state – RAFT RNG state, updated with each call

  • scale – Scale factor to set the number of vertices in the graph. Vertex IDs have values in [0, V), where V = 1 << scale.

  • num_edges – Number of edges to generate.

  • a – a, b, c, d (= 1.0 - (a + b + c)) in the R-mat graph generator (vist https://graph500.org for additional details). a, b, c, d should be non-negative and a + b + c should be no larger than 1.0.

  • b – a, b, c, d (= 1.0 - (a + b + c)) in the R-mat graph generator (vist https://graph500.org for additional details). a, b, c, d should be non-negative and a + b + c should be no larger than 1.0.

  • c – a, b, c, d (= 1.0 - (a + b + c)) in the R-mat graph generator (vist https://graph500.org for additional details). a, b, c, d should be non-negative and a + b + c should be no larger than 1.0.

  • clip_and_flip – Flag controlling whether to generate edges only in the lower triangular part (including the diagonal) of the graph adjacency matrix (if set to true) or not (if set to false).

  • scramble_vertex_ids – Flag controlling whether to scramble vertex ID bits (if set to true) or not (if set to false); scrambling vertex ID bits breaks correlation between vertex ID values and vertex degrees.

Returns:

std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> A tuple of rmm::device_uvector objects for edge source vertex IDs and edge destination vertex IDs.

template<typename vertex_t>
std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> generate_bipartite_rmat_edgelist(raft::handle_t const &handle, raft::random::RngState &rng_state, size_t src_scale, size_t dst_scale, size_t num_edges, double a = 0.57, double b = 0.19, double c = 0.19)#

generate an edge list for a bipartite R-mat graph.

The source vertex IDs will be in the range of [0, 2^src_scale) and the destination vertex IDs will be in the range of [0, 2^dst_scale). This function allows multi-edges.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • rng_state – RAFT RNG state, updated with each call

  • src_scale – Scale factor to set the range of source vertex IDs (or the first vertex set) in the bipartite graph. Vertex IDs have values in [0, V_src), where V_src = 1 << src_scale.

  • dst_scale – Scale factor to set the range of destination vertex IDs (or the second vertex set) in the bipartite graph. Vertex IDs have values in [0, V_dst), where V_dst = 1 << dst_scale.

  • num_edges – Number of edges to generate.

  • a – a, b, c, d (= 1.0 - (a + b + c)) in the R-mat graph generator (vist https://graph500.org for additional details). a, b, c, d should be non-negative and a + b + c should be no larger than 1.0.

  • b – a, b, c, d (= 1.0 - (a + b + c)) in the R-mat graph generator (vist https://graph500.org for additional details). a, b, c, d should be non-negative and a + b + c should be no larger than 1.0.

  • c – a, b, c, d (= 1.0 - (a + b + c)) in the R-mat graph generator (vist https://graph500.org for additional details). a, b, c, d should be non-negative and a + b + c should be no larger than 1.0.

Returns:

std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> A tuple of rmm::device_uvector objects for edge source vertex IDs and edge destination vertex IDs.

template<typename vertex_t>
std::vector<std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>>> generate_rmat_edgelists(raft::handle_t const &handle, size_t n_edgelists, size_t min_scale, size_t max_scale, size_t edge_factor = 16, generator_distribution_t size_distribution = generator_distribution_t::POWER_LAW, generator_distribution_t edge_distribution = generator_distribution_t::POWER_LAW, uint64_t seed = 0, bool clip_and_flip = false, bool scramble_vertex_ids = false)#

generate multiple edge lists using the R-mat graph generator.

Deprecated:

This function will be deprectated and should be replaced with the version that takes raft::random::RngState as a parameter

This function allows multi-edges and self-loops similar to the Graph 500 reference implementation.

NOTE: The scramble_vertex_ids function needs to be called in order to generate a graph conforming to the Graph 500 specification (note that scrambling does not affect cuGraph’s graph construction performance, so this is generally unnecessary). If edge_factor is given (e.g. Graph 500), set num_edges to (size_t{1} << scale) * edge_factor. To generate an undirected graph, set b == c and clip_and_flip = true. All the resulting edges will be placed in the lower triangular part (including the diagonal) of the graph adjacency matrix.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • n_edgelists – Number of edge lists (graphs) to generate

  • min_scale – Scale factor to set the minimum number of verties in the graph.

  • max_scale – Scale factor to set the maximum number of verties in the graph.

  • edge_factor – Average number of edges per vertex to generate.

  • size_distribution – Distribution of the graph sizes, impacts the scale parameter of the R-MAT generator

  • edge_distribution – Edges distribution for each graph, impacts how R-MAT parameters a,b,c,d, are set.

  • seed – Seed value for the random number generator.

  • clip_and_flip – Flag controlling whether to generate edges only in the lower triangular part (including the diagonal) of the graph adjacency matrix (if set to true) or not (if set to false).

  • scramble_vertex_ids – Flag controlling whether to scramble vertex ID bits (if set to true) or not (if set to false); scrambling vertex ID bits breaks correlation between vertex ID values and vertex degrees.

Returns:

A vector of std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> of size n_edgelists, each vector element being a tuple of rmm::device_uvector objects for edge source vertex IDs and edge destination vertex IDs.

template<typename vertex_t>
std::vector<std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>>> generate_rmat_edgelists(raft::handle_t const &handle, raft::random::RngState &rng_state, size_t n_edgelists, size_t min_scale, size_t max_scale, size_t edge_factor = 16, generator_distribution_t size_distribution = generator_distribution_t::POWER_LAW, generator_distribution_t edge_distribution = generator_distribution_t::POWER_LAW, bool clip_and_flip = false, bool scramble_vertex_ids = false)#

generate multiple edge lists using the R-mat graph generator.

This function allows multi-edges and self-loops similar to the Graph 500 reference implementation.

NOTE: The scramble_vertex_ids function needs to be called in order to generate a graph conforming to the Graph 500 specification (note that scrambling does not affect cuGraph’s graph construction performance, so this is generally unnecessary). If edge_factor is given (e.g. Graph 500), set num_edges to (size_t{1} << scale) * edge_factor. To generate an undirected graph, set b == c and clip_and_flip = true. All the resulting edges will be placed in the lower triangular part (including the diagonal) of the graph adjacency matrix.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • rng_state – RAFT RNG state, updated with each call

  • n_edgelists – Number of edge lists (graphs) to generate

  • min_scale – Scale factor to set the minimum number of verties in the graph.

  • max_scale – Scale factor to set the maximum number of verties in the graph.

  • edge_factor – Average number of edges per vertex to generate.

  • size_distribution – Distribution of the graph sizes, impacts the scale parameter of the R-MAT generator

  • edge_distribution – Edges distribution for each graph, impacts how R-MAT parameters a,b,c,d, are set.

  • clip_and_flip – Flag controlling whether to generate edges only in the lower triangular part (including the diagonal) of the graph adjacency matrix (if set to true) or not (if set to false).

  • scramble_vertex_ids – Flag controlling whether to scramble vertex ID bits (if set to true) or not (if set to false); scrambling vertex ID bits breaks correlation between vertex ID values and vertex degrees.

Returns:

A vector of std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> of size n_edgelists, each vector element being a tuple of rmm::device_uvector objects for edge source vertex IDs and edge destination vertex IDs.

template<typename vertex_t>
std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> generate_path_graph_edgelist(raft::handle_t const &handle, std::vector<std::tuple<vertex_t, vertex_t>> const &component_parameters_v)#

generate an edge list for path graph

A path graph of size n connects the vertices from 0 to (n - 1) in a single long path: ((0,1), (1,2), …, (n - 2, n - 1)

If executed in a multi-gpu context (handle comms has been initialized) the path will span all GPUs including an edge from the last vertex on GPU i to the first vertex on GPU (i+1)

This function will generate a collection of path graphs. component_parameters_v defines the parameters for generating each component. Each element of component_parameters_v defines a tuple consisting of the number of vertices and the base vertex id for the component.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • component_parameters_v – A vector containing tuples consisting of the number of vertices and base vertex id for each component to generate.

Returns:

std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> A tuple of rmm::device_uvector objects for edge source vertex IDs and edge destination vertex IDs.

template<typename vertex_t>
std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> generate_2d_mesh_graph_edgelist(raft::handle_t const &handle, std::vector<std::tuple<vertex_t, vertex_t, vertex_t>> const &component_parameters_v)#

generate an edge list for a 2D Mesh Graph

A sequence of 2D mesh graphs will be constructed according to the component specifications. Each 2D mesh graph is configured with a tuple containing (x, y, base_vertex_id). component_parameters_v will contain a tuple for each component.

If executed in a multi-gpu context (handle comms has been initialized) each GPU will generate disjoint 2D mesh constructs of equal size.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • component_parameters_v – Vector containing tuple defining the configuration of each component

Returns:

std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> A tuple of rmm::device_uvector objects for edge source vertex IDs and edge destination vertex IDs.

template<typename vertex_t>
std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> generate_3d_mesh_graph_edgelist(raft::handle_t const &handle, std::vector<std::tuple<vertex_t, vertex_t, vertex_t, vertex_t>> const &component_parameters_v)#

generate an edge list for a 3D Mesh Graph

A sequence of 3D mesh graphs will be constructed according to the component specifications. Each 3D mesh graph is configured with a tuple containing (x, y, z, base_vertex_id). component_parameters_v will contain a tuple for each component.

If executed in a multi-gpu context (handle comms has been initialized) each GPU will generate disjoint 3D mesh constructs of equal size.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • component_parameters_v – Vector containing tuple defining the configuration of each component

Returns:

std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> A tuple of rmm::device_uvector objects for edge source vertex IDs and edge destination vertex IDs.

template<typename vertex_t>
std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> generate_complete_graph_edgelist(raft::handle_t const &handle, std::vector<std::tuple<vertex_t, vertex_t>> const &component_parameters_v)#

generate an edge lists for some complete graphs

A sequence of complete graphs will be constructed according to the component specifications. Each complete graph is configured with a tuple containing (n, base_vertex_id). component_parameters_v will contain a tuple for each component.

If executed in a multi-gpu context (handle comms has been initialized) each GPU will generate disjoint complete graph constructs of equal size.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • component_parameters_v – Vector containing tuple defining the configuration of each component

Returns:

std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> A tuple of rmm::device_uvector objects for edge source vertex IDs and edge destination vertex IDs.

template<typename vertex_t>
std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> generate_erdos_renyi_graph_edgelist_gnp(raft::handle_t const &handle, vertex_t num_vertices, float p, vertex_t base_vertex_id, uint64_t seed = 0)#

generate an edge lists for an Erdos-Renyi graph

This API supports the G(n,p) model which requires O(n^2) work.

If executed in a multi-gpu context (handle comms has been initialized) each GPU will generate Erdos-Renyi edges for its portion of the 2D partitioning of the adjacency matrix.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • num_vertices – Number of vertices to use in the generated graph

  • p – Probability for edge creation

  • base_vertex_id – Starting vertex id for the generated graph

  • seed – Seed value for the random number generator.

Returns:

std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> A tuple of rmm::device_uvector objects for edge source vertex IDs and edge destination vertex IDs.

template<typename vertex_t>
std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> generate_erdos_renyi_graph_edgelist_gnm(raft::handle_t const &handle, vertex_t num_vertices, size_t m, vertex_t base_vertex_id, uint64_t seed = 0)#

generate an edge lists for an Erdos-Renyi graph

This API supports the G(n,m) model

If executed in a multi-gpu context (handle comms has been initialized) each GPU will generate Erdos-Renyi edges for its portion of the 2D partitioning of the adjacency matrix.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • num_vertices – Number of vertices to use in each complete graph

  • m – Number of edges to generate

  • base_vertex_id – Starting vertex id for the generated graph

  • seed – Seed value for the random number generator.

Returns:

std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> A tuple of rmm::device_uvector objects for edge source vertex IDs and edge destination vertex IDs.

template<typename vertex_t, typename weight_t>
std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>, std::optional<rmm::device_uvector<weight_t>>> symmetrize_edgelist_from_triangular(raft::handle_t const &handle, rmm::device_uvector<vertex_t> &&d_src_v, rmm::device_uvector<vertex_t> &&d_dst_v, std::optional<rmm::device_uvector<weight_t>> &&optional_d_weights_v, bool check_diagonal = false)#

symmetrize an edgelist from the edges in the lower (or upper but not both) triangular part of a graph adjacency matrix

Template Parameters:
  • vertex_t – Type of vertex identifiers. Needs to be an integral type.

  • weight_t – Type of weights.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • d_src_v – Vector of source vertices

  • d_dst_v – Vector of destination vertices

  • d_weights_v – Optional vector of edge weights

  • check_diagonal – Flag indicating whether to check for diagonal edges or not. If set to true, symmetrize only the edges with source != destination (to avoid duplicating every self-loops).

Returns:

std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> A tuple of rmm::device_uvector objects for edge source vertex IDs and edge destination vertex IDs.

template<typename vertex_t>
rmm::device_uvector<vertex_t> scramble_vertex_ids(raft::handle_t const &handle, rmm::device_uvector<vertex_t> &&vertices, size_t lgN)#

scramble vertex IDs in a graph

Given a vertex list for a graph, scramble the input vertex IDs.

The scramble code here follows the algorithm in the Graph 500 reference implementation version 3.0.0.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • vertices – Vector of input vertices

  • lgN – The input & output (scrambled) vertex IDs are assumed to be in [0, 2^lgN).

Returns:

rmm::device_uvector object storing scrambled vertex IDs.

template<typename vertex_t>
std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>> scramble_vertex_ids(raft::handle_t const &handle, rmm::device_uvector<vertex_t> &&srcs, rmm::device_uvector<vertex_t> &&dsts, size_t lgN)#

scramble vertex ids in a graph

Given an edge list for a graph, scramble the input vertex IDs.

The scramble code here follows the algorithm in the Graph 500 reference implementation version 3.0.0.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • d_src_v – Vector of input source vertices

  • d_dst_v – Vector of input destination vertices

  • lgN – The input & output (scrambled) vertex IDs are assumed to be in [0, 2^lgN).

Returns:

Tuple of two rmm::device_uvector objects storing scrambled source & destination vertex IDs, respectively.

template<typename vertex_t, typename weight_t>
std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>, std::optional<rmm::device_uvector<weight_t>>> combine_edgelists(raft::handle_t const &handle, std::vector<rmm::device_uvector<vertex_t>> &&d_sources, std::vector<rmm::device_uvector<vertex_t>> &&d_dests, std::optional<std::vector<rmm::device_uvector<weight_t>>> &&optional_d_weights, bool remove_multi_edges = true)#

Combine edgelists from multiple sources into a single edgelist.

If executed in a multi-gpu context (handle comms has been initialized) each GPU will operate only on its subset of data. Any shuffling to get edges onto the same GPU should be done prior to calling this function.

Template Parameters:

vertex_t – Type of vertex identifiers. Needs to be an integral type.

Parameters:
  • handle – RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and handles to various CUDA libraries) to run graph algorithms.

  • sources – The source vertex ids to combine

  • dests – The destination vertex ids to combine

  • weights – Optional vector of weights to combine

  • remove_multi_edges – If true (the default) then remove multi edges, if false leave them in

Returns:

std::tuple<rmm::device_uvector<vertex_t>, rmm::device_uvector<vertex_t>, rmm::device_uvector<weight_t>> A tuple of rmm::device_uvector objects for edge source vertex IDs and edge destination vertex IDs and edge weights.