cugraph.node2vec

cugraph.node2vec(G, start_vertices, max_depth=1, compress_result=True, p=1.0, q=1.0)

Computes random walks for each node in ‘start_vertices’, under the node2vec sampling framework.

Parameters:
G: cuGraph.Graph or networkx.Graph

The graph can be either directed or undirected. Weights in the graph are ignored.

start_vertices: int or list or cudf.Series or cudf.DataFrame

A single node or a list or a cudf.Series of nodes from which to run the random walks. In case of multi-column vertices it should be a cudf.DataFrame. Only supports int32 currently.

max_depth: int, optional (default=1)

The maximum depth of each random walk. Defaults to 1.

compress_result: bool, optional (default=True)

If True, coalesced paths are returned with a sizes array with offsets. Otherwise padded paths are returned with an empty sizes array.

p: float, optional (default=1.0, [0 < p])

Return factor, which represents the likelihood of backtracking to a previous node in the walk. A higher value makes it less likely to sample a previously visited node, while a lower value makes it more likely to backtrack, making the walk “local”. A positive float.

q: float, optional (default=1.0, [0 < q])

In-out factor, which represents the likelihood of visiting nodes closer or further from the outgoing node. If q > 1, the random walk is likelier to visit nodes closer to the outgoing node. If q < 1, the random walk is likelier to visit nodes further from the outgoing node. A positive float.
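To make the roles of p and q concrete, here is a minimal sketch of the unnormalized second-order bias described in the node2vec paper (Grover and Leskovec), for an unweighted graph. It is an illustration only and not cuGraph's internal implementation; the function names `node2vec_bias` and `transition_probabilities`, and the plain Python sets used for adjacency, are assumptions made for this example.

```python
def node2vec_bias(prev, candidate, neighbors_of_prev, p=1.0, q=1.0):
    """Unnormalized bias for stepping from the current vertex to `candidate`.

    `prev` is the vertex visited just before the current one, and
    `neighbors_of_prev` is the set of vertices adjacent to `prev`.
    (Hypothetical helper for illustration, not part of cuGraph.)
    """
    if p <= 0 or q <= 0:
        raise ValueError("p and q must be positive")
    if candidate == prev:
        return 1.0 / p  # backtrack to the previous vertex
    if candidate in neighbors_of_prev:
        return 1.0      # candidate is also adjacent to the previous vertex
    return 1.0 / q      # candidate moves further away from the previous vertex


def transition_probabilities(prev, current_neighbors, neighbors_of_prev,
                             p=1.0, q=1.0):
    """Normalize the biases over the neighbors of the current vertex."""
    weights = {
        c: node2vec_bias(prev, c, neighbors_of_prev, p, q)
        for c in current_neighbors
    }
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# The walk arrived at vertex 1 from vertex 0.
# Vertex 1 is adjacent to {0, 2, 3}; vertex 0 is adjacent to {1, 2}.
probs = transition_probabilities(
    prev=0, current_neighbors={0, 2, 3}, neighbors_of_prev={1, 2},
    p=0.5, q=2.0,
)
```

With p = 0.5 and q = 2.0, the backtracking step to vertex 0 gets the largest share of the probability and the outward step to vertex 3 the smallest, which matches the "local" behaviour described for a low p and a high q.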

Returns:
vertex_paths: cudf.Series or cudf.DataFrame

Series or DataFrame containing the vertices visited along each random-walk path.

edge_weight_paths: cudf.Series

Series containing the weights of the edges represented by the returned vertex_paths.

sizes: int or cudf.Series

The path size or sizes in case of coalesced paths.
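When coalesced paths are returned, the flat vertex sequence has to be cut back into individual walks. The sketch below shows one way to do that on plain Python lists (for example after copying the results to the host). It assumes that each entry of sizes is the number of vertices in the corresponding walk; if your cuGraph version returns offsets instead, the slicing needs to be adjusted. `split_walks` is a hypothetical helper, not a cuGraph function.

```python
def split_walks(flat_vertices, sizes):
    """Split a flat list of vertices into one list per walk.

    Assumption: sizes[i] is the number of vertices in walk i.
    """
    if sum(sizes) != len(flat_vertices):
        raise ValueError("sizes do not add up to the number of vertices")
    walks = []
    start = 0
    for n in sizes:
        # Slice out the next n vertices as one walk.
        walks.append(flat_vertices[start:start + n])
        start += n
    return walks


# Two walks stored back to back: one with 3 vertices, one with 2.
walks = split_walks([0, 1, 2, 2, 3], [3, 2])
```

Here `walks` holds `[0, 1, 2]` for the first start vertex and `[2, 3]` for the second.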

References

A Grover, J Leskovec: node2vec: Scalable Feature Learning for Networks, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, https://arxiv.org/abs/1607.00653

Examples

>>> import cudf
>>> import cugraph
>>> import numpy as np
>>> from cugraph.datasets import karate
>>> G = karate.get_graph(download=True)
>>> start_vertices = cudf.Series([0, 2], dtype=np.int32)
>>> paths, weights, path_sizes = cugraph.node2vec(G, start_vertices, 3,
...                                               True, 0.8, 0.5)