cugraph.centrality.betweenness_centrality
- cugraph.centrality.betweenness_centrality(G, k: Optional[Union[int, list, cudf.Series, cudf.DataFrame]] = None, normalized: bool = True, weight: Optional[cudf.DataFrame] = None, endpoints: bool = False, seed: Optional[int] = None, random_state: Optional[int] = None, result_dtype: Union[numpy.float32, numpy.float64] = numpy.float64) -> Union[DataFrame, dict]
Compute the betweenness centrality for all vertices of the graph G. Betweenness centrality is a measure of the number of shortest paths that pass through a vertex. A vertex with a high betweenness centrality score has more paths passing through it and is therefore believed to be more important.
To improve performance, rather than computing all-pairs shortest paths, a sample of k starting vertices can be used.
cuGraph does not currently support the ‘weight’ parameter.
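To make the description above concrete, here is a minimal single-threaded sketch of Brandes-style accumulation on an unweighted graph. It is an illustration only, not cuGraph's GPU implementation: the function name `brandes_dependencies` and the adjacency-dict input are assumptions made for this example. It returns raw, unnormalized sums; when an undirected graph is traversed from every vertex, each pair of endpoints is counted from both ends. Passing a subset of vertices as `sources` mimics the sampling idea, and no rescaling for the sample size is applied here.

```python
from collections import deque


def brandes_dependencies(adj, sources=None):
    """Sum Brandes dependencies over the given source vertices.

    adj     : dict mapping each vertex to an iterable of its neighbours
    sources : vertices to start traversals from (None means every vertex)
    """
    scores = {v: 0.0 for v in adj}
    for s in (adj if sources is None else sources):
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                scores[w] += delta[w]
    return scores


# Path graph 0 - 1 - 2: only vertex 1 lies between two other vertices.
path = {0: [1], 1: [0, 2], 2: [1]}
exact = brandes_dependencies(path)               # every vertex is a source
sampled = brandes_dependencies(path, sources=[0])  # one hand-picked source
```

With all three vertices as sources, vertex 1 accumulates the pair (0, 2) once from each end; with the single source 0 it accumulates it once.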
- Parameters:
- G : cuGraph.Graph or networkx.Graph
The graph can be either directed (Graph(directed=True)) or undirected. The current implementation uses a parallel variation of the Brandes Algorithm (2001) to compute exact or approximate betweenness. If weights are provided in the edgelist, they will not be used.
- k : int, list or cudf object or None, optional (default=None)
If k is not None, use k node samples to estimate betweenness. Higher values give a better approximation. If k is a list, a cudf DataFrame, or a dask_cudf DataFrame, its contents are assumed to be the vertex identifiers to use for the estimation. If k is None (the default), all vertices are used. Vertices obtained through sampling or given as a list are used as the sources for traversals inside the algorithm.
- normalized : bool, optional (default=True)
If True, the betweenness values are normalized by 2 / ((n - 1) * (n - 2)) for undirected graphs and by 1 / ((n - 1) * (n - 2)) for directed graphs, where n is the number of nodes in G. Normalization ensures that values are in [0, 1]: it scales by the highest possible value, which is reached when a single node lies on every shortest path.
- weight : cudf.DataFrame, optional (default=None)
Specifies the weights to be used for each edge. Should contain a mapping between edges and weights.
(Not supported) If weights are provided at Graph creation, they will not be used.
- endpoints : bool, optional (default=False)
If True, include the endpoints in the shortest path counts.
- seed : int, optional (default=None)
If k is specified and is an integer, seed is used to initialize the random number generator. None defaults to a hash of the process id, time, and hostname. If k is either None or a list, seed is ignored.
This parameter is kept for backwards compatibility and is identical to ‘random_state’.
- random_state : int, optional (default=None)
If k is specified and is an integer, random_state is used to initialize the random number generator. None defaults to a hash of the process id, time, and hostname. If k is either None or a list, random_state is ignored.
- result_dtype : np.float32 or np.float64, optional (default=np.float64)
Data type of the betweenness centrality scores.
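The normalization described for the `normalized` parameter can be checked by hand on a small graph. The sketch below is plain arithmetic and does not call cuGraph; the helper name `normalization_factor` is hypothetical and exists only for this example. It uses a star graph with one hub and four leaves, where the hub lies on the single shortest path between every pair of leaves.

```python
def normalization_factor(n, directed=False):
    """Scale factor quoted above: 2 or 1 divided by (n - 1) * (n - 2)."""
    if n <= 2:
        raise ValueError("normalization needs at least 3 vertices")
    numerator = 1.0 if directed else 2.0
    return numerator / ((n - 1) * (n - 2))


n = 5  # one hub plus four leaves

# Undirected: the hub sits between each of the C(4, 2) = 6 leaf pairs.
hub_undirected = 6 * normalization_factor(n, directed=False)

# Directed (edges in both directions): 4 * 3 = 12 ordered leaf pairs.
hub_directed = 12 * normalization_factor(n, directed=True)
```

Both values come out at 1 (up to floating-point rounding), the top of the [0, 1] range, because the hub is crossed by every shortest path between other vertices.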
- Returns:
- df : cudf.DataFrame, or dict if using NetworkX
GPU data frame containing two cudf.Series of size V: the vertex identifiers and the corresponding betweenness centrality values. Note that the resulting ‘vertex’ column might not be in ascending order. When a NetworkX graph is used, a dictionary holding the same vertex and score information is returned instead.
- df[‘vertex’] : cudf.Series
Contains the vertex identifiers
- df[‘betweenness_centrality’] : cudf.Series
Contains the betweenness centrality of vertices
Examples
>>> import cugraph
>>> from cugraph.datasets import karate
>>> G = karate.get_graph(download=True)
>>> bc = cugraph.betweenness_centrality(G)
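As a further illustration, the sampling parameters can be combined in a small wrapper. This is a sketch under stated assumptions rather than an official recipe: the wrapper name `approximate_betweenness` and the default values for `k` and `seed` are made up for this example, and `G` is assumed to be a cuGraph graph so that the result is a cudf.DataFrame. Only the `k` and `seed` arguments documented above are passed through.

```python
def approximate_betweenness(G, k=10, seed=42):
    """Estimate betweenness from k randomly sampled source vertices."""
    # Imported lazily: cuGraph needs a CUDA-capable GPU to be installed.
    import cugraph

    # An integer k samples k sources; a fixed seed makes the sample repeatable.
    df = cugraph.betweenness_centrality(G, k=k, seed=seed)

    # The 'vertex' column is not guaranteed to be in ascending order,
    # so sort explicitly before comparing or displaying results.
    return df.sort_values("vertex")
```

With the karate graph from the example above, `approximate_betweenness(G, k=10)` would run 10 sampled traversals instead of one per vertex. If a NetworkX graph is passed, the return value is a dictionary and the `sort_values` step does not apply.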