cugraph.node2vec

cugraph.node2vec(G, start_vertices, max_depth=1, compress_result=True, p=1.0, q=1.0)

Computes random walks for each node in ‘start_vertices’, under the node2vec sampling framework.

Parameters:

G: cuGraph.Graph or networkx.Graph: The graph can be either directed or undirected. Weights in the graph are ignored.
start_vertices: int or list or cudf.Series or cudf.DataFrame: A single node, a list of nodes, or a cudf.Series of nodes from which to run the random walks. For multi-column vertices, pass a cudf.DataFrame. Only int32 vertex IDs are currently supported.
max_depth: int, optional (default=1): The maximum depth of the random walks.
compress_result: bool, optional (default=True): If True, coalesced paths are returned with a sizes array with offsets. Otherwise padded paths are returned with an empty sizes array.
p: float, optional (default=1.0, [0 < p]): Return factor, which represents the likelihood of backtracking to a previous node in the walk. A higher value makes it less likely to sample a previously visited node, while a lower value makes it more likely to backtrack, keeping the walk “local”. Must be a positive float.
q: float, optional (default=1.0, [0 < q]): In-out factor, which represents the likelihood of visiting nodes closer to or further from the outgoing node. If q > 1, the random walk is likelier to visit nodes closer to the outgoing node. If q < 1, the random walk is likelier to visit nodes further from the outgoing node. Must be a positive float.
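
As a rough illustration of how p and q shape a walk, the sketch below computes next-step probabilities using the search bias described in the Grover and Leskovec paper listed under References. It is plain Python on a toy adjacency dict, not cuGraph's implementation; the names node2vec_bias, next_step_probabilities, and neighbors are hypothetical and exist only for this example.

```python
def node2vec_bias(prev, candidate, neighbors, p=1.0, q=1.0):
    """Unnormalized bias for stepping to `candidate` when the walk
    arrived at the current node from `prev` (hypothetical helper)."""
    if candidate == prev:
        return 1.0 / p          # backtrack to the previous node
    if candidate in neighbors[prev]:
        return 1.0              # stay at distance 1 from the previous node
    return 1.0 / q              # move further away from the previous node


def next_step_probabilities(prev, curr, neighbors, p=1.0, q=1.0):
    """Normalize the biases over the neighbors of `curr` (edge weights ignored)."""
    weights = {
        nbr: node2vec_bias(prev, nbr, neighbors, p, q)
        for nbr in sorted(neighbors[curr])
    }
    total = sum(weights.values())
    return {nbr: w / total for nbr, w in weights.items()}


# Toy undirected graph with edges 0-1, 0-2, 1-2, 1-3.
neighbors = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}

# The walk went 0 -> 1; a low p favors returning to 0, a high q discourages node 3.
probs = next_step_probabilities(prev=0, curr=1, neighbors=neighbors, p=0.5, q=2.0)
```

With p=0.5 and q=2.0 the unnormalized biases for nodes 0, 2, and 3 are 2.0, 1.0, and 0.5, so the walk is most likely to backtrack and least likely to move outward.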

Returns:

vertex_paths: cudf.Series or cudf.DataFrame: Series containing the vertices of edges/paths in the random walk.
edge_weight_paths: cudf.Series: Series containing the edge weights of edges represented by the returned vertex_paths.
sizes: int or cudf.Series: The path size or sizes in case of coalesced paths.
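
The following sketch shows one way the flat vertex_paths output might be split back into individual walks. It works on plain Python lists (for instance after copying the returned Series to the host) and rests on assumptions not confirmed here: that sizes holds one vertex count per path in the coalesced case, and that padded paths share a fixed row width and use a sentinel value for padding. The helpers split_coalesced and split_padded, the width argument, and the pad value -1 are all hypothetical.

```python
def split_coalesced(vertex_paths, sizes):
    """Split a flat list of vertices into walks, assuming `sizes`
    gives the number of vertices in each walk (assumption)."""
    walks, start = [], 0
    for size in sizes:
        walks.append(vertex_paths[start:start + size])
        start += size
    return walks


def split_padded(vertex_paths, width, pad=-1):
    """Split fixed-width rows and drop padding entries.
    `width` and the `pad` sentinel are assumptions for this sketch."""
    rows = [vertex_paths[i:i + width] for i in range(0, len(vertex_paths), width)]
    return [[v for v in row if v != pad] for row in rows]


# Two toy walks: 0 -> 1 -> 2 and 2 -> 3.
coalesced_walks = split_coalesced([0, 1, 2, 2, 3], sizes=[3, 2])
padded_walks = split_padded([0, 1, 2, 2, 3, -1], width=3)
```

Both calls recover the same two walks; check the layout actually returned by your cuGraph version before relying on either helper.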

References

A. Grover, J. Leskovec. node2vec: Scalable Feature Learning for Networks. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016. https://arxiv.org/abs/1607.00653

Examples

>>> import cudf
>>> import cugraph
>>> import numpy as np
>>> from cugraph.datasets import karate
>>> G = karate.get_graph(download=True)
>>> start_vertices = cudf.Series([0, 2], dtype=np.int32)
>>> paths, weights, path_sizes = cugraph.node2vec(
...     G, start_vertices, max_depth=3, compress_result=True, p=0.8, q=0.5)