cugraph.node2vec
- cugraph.node2vec(G, start_vertices, max_depth=1, compress_result=True, p=1.0, q=1.0)
Computes random walks for each node in ‘start_vertices’, under the node2vec sampling framework.
- Parameters:
- G: cuGraph.Graph or networkx.Graph
The graph can be either directed or undirected. Weights in the graph are ignored.
- start_vertices: int or list or cudf.Series or cudf.DataFrame
A single node, a list, or a cudf.Series of nodes from which to run the random walks. For multi-column vertices, pass a cudf.DataFrame. Only int32 vertex IDs are currently supported.
- max_depth: int, optional (default=1)
The maximum depth of the random walks. If not specified, the maximum depth is set to 1.
- compress_result: bool, optional (default=True)
If True, coalesced paths are returned together with a sizes array with offsets. Otherwise, padded paths are returned with an empty sizes array.
- p: float, optional (default=1.0, [0 < p])
Return factor, which controls the likelihood of backtracking to a previous node in the walk. A higher value makes it less likely to sample a previously visited node, while a lower value makes backtracking more likely, keeping the walk “local”. Must be a positive float.
- q: float, optional (default=1.0, [0 < q])
In-out factor, which controls whether the walk favors nodes closer to or further from the outgoing node. If q > 1, the random walk is likelier to visit nodes closer to the outgoing node. If q < 1, the random walk is likelier to visit nodes further from the outgoing node. Must be a positive float.
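The effect of p and q can be illustrated with a minimal pure-Python sketch of the second-order bias described in the node2vec paper for an unweighted graph. This is not cuGraph's implementation; the helper names `node2vec_weights` and `node2vec_probabilities` and the toy adjacency dictionary are hypothetical and exist only for illustration.

```python
def node2vec_weights(adj, prev, curr, p=1.0, q=1.0):
    """Unnormalized weights for stepping from `curr`, given the walk arrived from `prev`.

    `adj` maps each vertex to the set of its neighbors (unweighted graph).
    """
    weights = {}
    for nxt in adj[curr]:
        if nxt == prev:
            weights[nxt] = 1.0 / p  # backtrack to the previous vertex
        elif nxt in adj[prev]:
            weights[nxt] = 1.0      # stays at distance 1 from the previous vertex
        else:
            weights[nxt] = 1.0 / q  # moves to distance 2 from the previous vertex
    return weights


def node2vec_probabilities(adj, prev, curr, p=1.0, q=1.0):
    """Normalize the weights above into transition probabilities."""
    weights = node2vec_weights(adj, prev, curr, p, q)
    total = sum(weights.values())
    return {vertex: w / total for vertex, w in weights.items()}


# Toy undirected graph (hypothetical data): edges 0-1, 0-2, 1-2, 1-3.
adj = {
    0: {1, 2},
    1: {0, 2, 3},
    2: {0, 1},
    3: {1},
}

# The walk has just moved 0 -> 1; with q < 1 the far vertex 3 is favored.
probs = node2vec_probabilities(adj, prev=0, curr=1, p=0.8, q=0.5)
for vertex in sorted(probs):
    print(vertex, round(probs[vertex], 3))
```

With p = q = 1.0 every neighbor gets the same weight, so the walk reduces to a uniform random walk.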
- Returns:
- vertex_paths: cudf.Series or cudf.DataFrame
Series containing the vertices of edges/paths in the random walk.
- edge_weight_paths: cudf.Series
Series containing the weights of the edges represented by the returned vertex_paths.
- sizes: int or cudf.Series
The path size or sizes in case of coalesced paths.
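As a rough sketch of how coalesced results might be consumed, the helper below splits flat vertex and weight sequences back into individual walks. It assumes that sizes holds the number of vertices in each path and that a path of n vertices contributes n - 1 edge weights; both are assumptions that should be checked against the installed cuGraph version. The function name `split_coalesced_paths` and the sample data are hypothetical, and the inputs are plain Python lists (the cudf results would first need to be copied to the host).

```python
def split_coalesced_paths(vertex_paths, edge_weight_paths, sizes):
    """Split coalesced walk output into a list of (vertices, weights) pairs.

    Assumes `sizes[i]` is the vertex count of walk i and that each walk
    with n vertices has n - 1 edge weights (assumption, not confirmed API).
    """
    walks = []
    v_start = 0
    w_start = 0
    for n in sizes:
        n_edges = max(n - 1, 0)
        vertices = list(vertex_paths[v_start:v_start + n])
        weights = list(edge_weight_paths[w_start:w_start + n_edges])
        walks.append((vertices, weights))
        v_start += n
        w_start += n_edges
    return walks


# Illustrative values only, not the output of a real cugraph.node2vec call.
vertex_paths = [0, 1, 3, 2, 0]
edge_weight_paths = [1.0, 1.0, 1.0]
sizes = [3, 2]

for vertices, weights in split_coalesced_paths(vertex_paths, edge_weight_paths, sizes):
    print(vertices, weights)
```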
References
A Grover, J Leskovec: node2vec: Scalable Feature Learning for Networks, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, https://arxiv.org/abs/1607.00653
Examples
>>> import cudf
>>> import numpy as np
>>> import cugraph
>>> from cugraph.datasets import karate
>>> G = karate.get_graph(download=True)
>>> start_vertices = cudf.Series([0, 2], dtype=np.int32)
>>> paths, weights, path_sizes = cugraph.node2vec(G, start_vertices, 3,
...                                               True, 0.8, 0.5)