cugraph.jaccard_coefficient

cugraph.jaccard_coefficient(G: Union[Graph, networkx.Graph], ebunch: Union[DataFrame, Iterable[Union[int, str, float]]] = None)

For NetworkX compatibility. See jaccard.

Parameters:
G : cugraph.Graph or NetworkX.Graph

A cuGraph or NetworkX Graph instance containing the connectivity information as an edge list. The graph should be undirected, with each undirected edge represented by a directed edge in both directions. The adjacency list will be computed if not already present.

This implementation only supports undirected, non-multi Graphs.

ebunch : cudf.DataFrame or iterable of node pairs, optional (default=None)

A GPU dataframe consisting of two columns representing pairs of vertices, or an iterable of 2-tuples (u, v) where u and v are nodes in the graph.

If provided, the Jaccard coefficient is computed for the given vertex pairs. Otherwise, the current implementation computes the Jaccard coefficient for all adjacent vertices in the graph.

Returns:
df : cudf.DataFrame

GPU dataframe of size E (the default) or the size of the given pairs (first, second), containing the Jaccard coefficients. The ordering is relative to the adjacency list, or that given by the specified vertex pairs.

df['first'] : cudf.Series

The first vertex ID of each pair (will be identical to first if specified).

df['second'] : cudf.Series

The second vertex ID of each pair (will be identical to second if specified).

df['jaccard_coeff'] : cudf.Series

The computed Jaccard coefficient between the first and the second vertex ID.

Examples

>>> from cugraph.datasets import karate
>>> from cugraph import jaccard_coefficient
>>> G = karate.get_graph(download=True)
>>> df = jaccard_coefficient(G)
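
To make the meaning of the jaccard_coeff column concrete, here is a minimal CPU-only sketch of the standard Jaccard definition, |N(u) ∩ N(v)| / |N(u) ∪ N(v)|, where N(x) is the neighbor set of x. It does not use cuGraph and is not meant to reflect how the library computes its result. The helper name jaccard_pairs, the toy edge list, and the choice to return 0.0 when both neighbor sets are empty are assumptions made for illustration only.

```python
from collections import defaultdict


def jaccard_pairs(edges, ebunch=None):
    """Hypothetical pure-Python helper (not part of cuGraph).

    edges  : iterable of undirected (u, v) edges
    ebunch : optional iterable of (u, v) pairs; defaults to the edge list
    Returns a list of (first, second, jaccard_coeff) tuples.
    """
    edges = list(edges)

    # Build neighbor sets, storing each undirected edge in both directions.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # With no ebunch, score every adjacent pair (one row per input edge).
    pairs = edges if ebunch is None else list(ebunch)

    rows = []
    for u, v in pairs:
        nu = neighbors.get(u, set())
        nv = neighbors.get(v, set())
        union = nu | nv
        # Assumption: define the coefficient as 0.0 when the union is empty.
        coeff = len(nu & nv) / len(union) if union else 0.0
        rows.append((u, v, coeff))
    return rows


# Toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3)]

all_rows = jaccard_pairs(toy_edges)                             # adjacent pairs
some_rows = jaccard_pairs(toy_edges, ebunch=[(0, 3), (1, 3)])   # chosen pairs
```

In this sketch, passing ebunch plays the same role as in the function above: only the listed pairs are scored, and the output rows follow the order in which the pairs were given.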