cugraph.centrality.edge_betweenness_centrality

cugraph.centrality.edge_betweenness_centrality(G, k: Optional[Union[int, list, cudf.Series, cudf.DataFrame]] = None, normalized: bool = True, weight: Optional[cudf.DataFrame] = None, seed: Optional[int] = None, result_dtype: Union[numpy.float32, numpy.float64] = numpy.float64) -> Union[cudf.DataFrame, dict]

Compute the edge betweenness centrality for all edges of the graph G. Betweenness centrality is a measure of the number of shortest paths that pass over an edge. An edge with a high betweenness centrality score has more paths passing over it and is therefore believed to be more important.

To improve performance, a sample of k starting vertices can be used instead of computing shortest paths between all pairs of vertices.

cuGraph does not currently support the ‘weight’ parameter.
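To make the definition above concrete, here is a minimal pure-Python sketch of edge betweenness on an undirected, unweighted graph, using one BFS per source vertex (Brandes-style accumulation). It is an illustration only and not cuGraph's GPU implementation; the function name `edge_betweenness_bfs` and the adjacency-dict input are assumptions made for this example.

```python
from collections import deque


def edge_betweenness_bfs(adj, normalized=True):
    """Edge betweenness for an undirected, unweighted graph (illustrative).

    adj: dict mapping each vertex to a list of its neighbours; every edge
         must appear in both directions.
    Returns a dict keyed by (min(u, v), max(u, v)) tuples.
    """
    n = len(adj)
    scores = {tuple(sorted((u, v))): 0.0 for u in adj for v in adj[u]}

    for s in adj:
        # Forward phase: BFS from s, counting shortest paths (sigma)
        # and recording shortest-path predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Backward phase: push path counts back towards s, crediting edges.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                scores[tuple(sorted((v, w)))] += share
                delta[v] += share

    # Every unordered pair was visited from both endpoints, hence the 0.5.
    scale = 0.5
    if normalized and n > 1:
        scale *= 2.0 / (n * (n - 1))  # undirected normalization factor
    return {edge: value * scale for edge, value in scores.items()}
```

For the three-vertex path 0 - 1 - 2, `edge_betweenness_bfs({0: [1], 1: [0, 2], 2: [1]}, normalized=False)` gives a raw score of 2 for each edge, since each edge lies on two of the three shortest paths; with normalization the score becomes 2/3.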

Parameters:
G : cuGraph.Graph or networkx.Graph

The graph can be either directed (Graph(directed=True)) or undirected. The current implementation uses BFS traversals, so edge weights are not taken into account. The weight parameter is intended for weighted graphs but is not currently supported.

k : int or list or None, optional (default=None)

If k is not None, use k node samples to estimate betweenness. Higher values give a better approximation. If k is a list, a cudf DataFrame, or a dask_cudf DataFrame, its contents are assumed to be the vertex identifiers to use for estimation. If k is None (the default), all vertices are used, which gives the exact betweenness. Vertices obtained through sampling or given explicitly are used as the sources of the traversals inside the algorithm.

normalized : bool, optional (default=True)

If True, the betweenness values are normalized by 2 / (n * (n - 1)) for undirected graphs and by 1 / (n * (n - 1)) for directed graphs, where n is the number of nodes in G. Normalization ensures that values lie in [0, 1]: it scales by the highest possible value, which is reached when a single edge is crossed by every shortest path.

weight : cudf.DataFrame, optional (default=None)

Specifies the weights to be used for each edge. Should contain a mapping between edges and weights. (Not currently supported.)

seed : int, optional (default=None)

If k is specified and is an integer, seed is used to initialize the random number generator. Using None as the seed relies on random.seed() behavior, which uses the current system time. If k is None or a list, the seed parameter is ignored.

result_dtype : np.float32 or np.float64, optional (default=np.float64)

Indicates the data type of the betweenness centrality scores. Using double precision (np.float64) automatically switches the implementation to “default”.
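The interaction between k and seed described above can be sketched in plain Python. This is a hedged illustration of the documented behavior, not cuGraph's internal code: the helper name `select_sources` is hypothetical, and using `random.Random(seed).sample` is an assumption about how the sampling could be done.

```python
import random


def select_sources(vertices, k=None, seed=None):
    """Resolve k into the list of traversal sources (hypothetical helper)."""
    vertices = list(vertices)
    if k is None:
        # Exact computation: every vertex is a source; seed is ignored.
        return vertices
    if isinstance(k, int):
        # Sample k distinct vertices; a fixed seed makes the sample repeatable.
        # seed=None lets Python choose a nondeterministic seed.
        rng = random.Random(seed)
        return rng.sample(vertices, k)
    # k already holds explicit vertex identifiers; seed is ignored.
    return list(k)
```

Calling `select_sources(range(10), k=3, seed=42)` twice returns the same three vertices, whereas passing `k=[3, 1]` returns exactly those identifiers regardless of the seed.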

Returns:
df : cudf.DataFrame, or dictionary if using NetworkX

GPU data frame containing three cudf.Series of size E: the vertex identifiers of the sources, the vertex identifiers of the destinations, and the corresponding betweenness centrality values. Note that the resulting ‘src’ and ‘dst’ columns might not be in ascending order.

df['src'] : cudf.Series

Contains the vertex identifiers of the source of each edge

df['dst'] : cudf.Series

Contains the vertex identifiers of the destination of each edge

df['betweenness_centrality'] : cudf.Series

Contains the betweenness centrality of edges

df['edge_id'] : cudf.Series

Contains the edge ids of edges if present.

Examples

>>> import cugraph
>>> from cugraph.datasets import karate
>>> G = karate.get_graph(download=True)
>>> bc = cugraph.edge_betweenness_centrality(G)