cugraph.symmetrize

cugraph.symmetrize(input_df, source_col_name, dest_col_name, value_col_name=None, multi=False, symmetrize=True)

Take a dataframe of source-destination pairs and their associated values, stored on a single GPU or distributed, and create a COO set of source-destination pairs and values in which every edge exists in both directions.

The call returns a COO stored as two or three cudf/dask_cudf Series/DataFrames: the symmetrized source column, the symmetrized destination column, and, only if values were passed in, a cudf/dask_cudf Series/DataFrame containing the associated values.
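To make the COO layout and the two-or-three-column return concrete, here is a minimal pure-Python sketch. It is not cugraph code: `symmetrize_coo` is a hypothetical stand-in that uses plain lists instead of cudf Series, and keeping the first value seen for a repeated pair is an assumption made only for this illustration.

```python
# Hypothetical pure-Python stand-in for illustration; not the cugraph implementation.
def symmetrize_coo(src, dst, val=None):
    """Return parallel COO lists in which every edge appears in both directions."""
    values = val if val is not None else [None] * len(src)
    seen = {}  # (source, destination) -> value, in insertion order
    for s, d, v in zip(src, dst, values):
        # Record the edge and its reverse. setdefault keeps the first value
        # seen for a pair (an assumption for this sketch).
        seen.setdefault((s, d), v)
        seen.setdefault((d, s), v)
    out_src = [s for s, _ in seen]
    out_dst = [d for _, d in seen]
    if val is None:
        # No values passed in: only the two vertex columns are returned.
        return out_src, out_dst
    return out_src, out_dst, list(seen.values())


# Edge 0-1 is already present in both directions; edge 0-2 is not.
src, dst, val = symmetrize_coo([0, 0, 1], [1, 2, 0], [1.0, 2.0, 1.0])
pairs_only = symmetrize_coo([0, 0, 1], [1, 2, 0])
```

The three returned lists are parallel: position `i` of `src`, `dst`, and `val` together describes one edge, which is the COO convention the description above refers to.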

Parameters:
input_df : cudf.DataFrame or dask_cudf.DataFrame

The edgelist as a cudf.DataFrame or dask_cudf.DataFrame.

source_col_name : str or list

Source column name(s).

dest_col_name : str or list

Destination column name(s).

value_col_name : str or None, optional (default=None)

Weights column name.

multi : bool, optional (default=False)

Set to True if the graph is a Multi(Di)Graph. This allows multiple edges instead of dropping them.

symmetrize : bool, optional (default=True)

If True, perform symmetrization. If False, only duplicate edges are dropped.
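The effect of the `multi` and `symmetrize` flags can be sketched in plain Python. This is one reading of the parameter descriptions above, not the library's implementation; `symmetrize_edges` is a hypothetical helper that works on a list of `(src, dst)` tuples rather than on a dataframe.

```python
# Hypothetical helper illustrating the flag semantics; not cugraph code.
def symmetrize_edges(edges, multi=False, symmetrize=True):
    """edges is a list of (src, dst) tuples; returns a new list of tuples."""
    result = list(edges)
    if symmetrize:
        # Append the reverse of every edge so each one exists in both directions.
        result += [(d, s) for s, d in edges]
    if not multi:
        # Drop duplicate edges while keeping first-seen order.
        result = list(dict.fromkeys(result))
    return result


edges = [(0, 1), (0, 1), (1, 2)]  # (0, 1) appears twice
default = symmetrize_edges(edges)                       # symmetrize and deduplicate
multi = symmetrize_edges(edges, multi=True)             # symmetrize, keep duplicates
dedup_only = symmetrize_edges(edges, symmetrize=False)  # deduplicate only
```

Under this reading, the defaults produce four distinct directed edges, `multi=True` keeps all six, and `symmetrize=False` leaves only the two distinct input edges.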

Examples

>>> import cudf
>>> from cugraph.structure.symmetrize import symmetrize
>>> # Download dataset from https://github.com/rapidsai/cugraph/datasets/..
>>> # datasets_path is assumed to be a pathlib.Path to the downloaded datasets
>>> M = cudf.read_csv(datasets_path / 'karate.csv', delimiter=' ',
...                   dtype=['int32', 'int32', 'float32'], header=None)
>>> df = cudf.DataFrame()
>>> df['sources'] = cudf.Series(M['0'])
>>> df['destinations'] = cudf.Series(M['1'])
>>> df['values'] = cudf.Series(M['2'])
>>> src, dst, val = symmetrize(df, 'sources', 'destinations', 'values')