cugraph.dask.traversal.bfs.bfs#

cugraph.dask.traversal.bfs.bfs(input_graph, start, depth_limit=None, return_distances=True, check_start=True)[source]#

Find the distances and predecessors for a breadth-first traversal of a graph. The input graph must contain edge list as a dask-cudf dataframe with one partition per GPU.

Parameters:
input_graphcugraph.Graph

cuGraph graph instance, should contain the connectivity information as dask cudf edge list dataframe (edge weights are not used for this algorithm).

startInteger or list or cudf object or dask_cudf object

The id(s) of the graph vertex from which the traversal begins in each component of the graph. Only one vertex per connected component of the graph is allowed.

depth_limitInteger or None, optional (default=None)

Limit the depth of the search

return_distancesbool, optional (default=True)

Indicates if distances should be returned

check_startbool, optional (default=True)

If True, performs more extensive tests on the start vertices to ensure validitity, at the expense of increased run time.

Returns:
dfdask_cudf.DataFrame

df[‘vertex’] gives the vertex id

df[‘distance’] gives the path distance from the starting vertex (Only if return_distances is True)

df[‘predecessor’] gives the vertex it was reached from in the traversal

Examples

>>> import cugraph.dask as dcg
>>> import dask_cudf
>>> # ... Init a DASK Cluster
>>> #    see https://docs.rapids.ai/api/cugraph/stable/dask-cugraph.html
>>> # Download dataset from https://github.com/rapidsai/cugraph/datasets/..
>>> chunksize = dcg.get_chunksize(datasets_path / "karate.csv")
>>> ddf = dask_cudf.read_csv(datasets_path / "karate.csv",
...                          chunksize=chunksize, delimiter=" ",
...                          names=["src", "dst", "value"],
...                          dtype=["int32", "int32", "float32"])
>>> dg = cugraph.Graph(directed=True)
>>> dg.from_dask_cudf_edgelist(ddf, source='src', destination='dst',
...                            edge_attr='value')
>>> df = dcg.bfs(dg, 0)