cugraph.dask.sampling.node2vec_random_walks.node2vec_random_walks
- cugraph.dask.sampling.node2vec_random_walks.node2vec_random_walks(input_graph, start_vertices: int | list | Series | DataFrame = None, max_depth: int = 1, p: float = 1.0, q: float = 1.0, random_state: int = None) → Tuple[Series | DataFrame, Series, int]
Compute random walks under the node2vec sampling framework for each node in ‘start_vertices’ and return a padded result along with the maximum path length. Walks that reach a vertex with no outgoing edges are padded with -1.
- Parameters:
- input_graph: cuGraph.Graph
The graph can be either directed or undirected.
- start_vertices: int or list or cudf.Series or cudf.DataFrame
A single node, a list, or a cudf.Series of nodes from which to start the random walks. For multi-column vertices, pass a cudf.DataFrame. Only int32 vertex IDs are currently supported.
- max_depth: int, optional (default=1)
The maximum depth of the random walks. Must be a positive integer.
- p: float, optional (default=1.0, [0 < p])
Return factor, which represents the likelihood of backtracking to a previous node in the walk. A higher value makes it less likely to sample a previously visited node, while a lower value makes it more likely to backtrack, making the walk “local”. A positive float.
- q: float, optional (default=1.0, [0 < q])
In-out factor, which represents the likelihood of visiting nodes closer to or farther from the outgoing node. If q > 1, the random walk is likelier to visit nodes closer to the outgoing node. If q < 1, the random walk is likelier to visit nodes farther from the outgoing node. A positive float.
- random_state: int, optional
Random seed to use when making sampling calls.
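To make the roles of p and q concrete, the following is a minimal pure-Python sketch of the standard node2vec transition bias on an unweighted graph. It is an illustration of the general technique, not cuGraph's implementation; the helper name `node2vec_weights` and the toy adjacency dictionary are hypothetical.

```python
def node2vec_weights(adj, prev, curr, p=1.0, q=1.0):
    """Unnormalized weights for stepping out of `curr`, given the walk arrived from `prev`.

    Hypothetical helper for illustration only; `adj` maps each node to a set of neighbors.
    """
    weights = {}
    for nxt in adj[curr]:
        if nxt == prev:
            weights[nxt] = 1.0 / p  # backtrack to the previous node
        elif nxt in adj[prev]:
            weights[nxt] = 1.0      # stays close: also a neighbor of the previous node
        else:
            weights[nxt] = 1.0 / q  # moves outward: not adjacent to the previous node
    return weights


# Toy undirected graph (illustrative data, not from the documentation).
adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}

# The walk arrived at node 1 from node 0; a large p discourages backtracking,
# and q < 1 favors moving away from node 0.
weights = node2vec_weights(adj, prev=0, curr=1, p=4.0, q=0.5)
```

Normalizing these weights gives the probability of each next step. On a weighted graph, each bias would additionally be multiplied by the corresponding edge weight.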
- Returns:
- vertex_paths: dask_cudf.Series or dask_cudf.DataFrame
Series containing the vertices of edges/paths in the random walk.
- edge_weight_paths: dask_cudf.Series
Series containing the edge weights of the edges represented by the returned vertex_paths.
- max_path_length: int
The maximum path length.
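One way to consume the padded result is to cut the flat vertex sequence into fixed-size rows and drop the padding. The sketch below uses plain Python lists and assumes the vertex paths have already been gathered on the host, that each walk occupies exactly max_path_length consecutive entries, and that -1 marks padding. That layout is an assumption made for illustration, and `split_padded_walks` is a hypothetical helper, not part of cuGraph.

```python
def split_padded_walks(vertex_paths, max_path_length, pad=-1):
    """Split a flat, padded vertex sequence into one list per walk.

    Assumes a fixed stride of `max_path_length` entries per walk (an assumption,
    not confirmed by the documentation) and removes the `pad` entries.
    """
    walks = []
    for start in range(0, len(vertex_paths), max_path_length):
        row = vertex_paths[start:start + max_path_length]
        walks.append([v for v in row if v != pad])
    return walks


# Illustrative data: two walks with max_path_length = 4; the first ended early.
flat = [0, 1, 3, -1, 2, 0, 1, 2]
walks = split_padded_walks(flat, max_path_length=4)
```

With this toy input, the first walk keeps three vertices and the second keeps all four.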