cugraph-pyg API Reference#


Graph Storage#

cugraph_pyg.data.graph_store.GraphStore()

cuGraph-backed PyG GraphStore implementation that distributes the graph across workers.
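A minimal sketch, assuming the standard PyG GraphStore interface that this class implements; the edge type, layout, and size values are illustrative, and the import follows the module path listed above:

    import torch
    from cugraph_pyg.data.graph_store import GraphStore

    # Toy COO edge list; in practice this comes from your dataset.
    edge_index = torch.tensor([[0, 1, 2],
                               [1, 2, 0]])

    graph_store = GraphStore()
    # put_edge_index is part of the PyG GraphStore interface.
    graph_store.put_edge_index(
        edge_index,
        edge_type=("paper", "cites", "paper"),  # illustrative edge type
        layout="coo",
        size=(3, 3),                            # (num_src_nodes, num_dst_nodes)
    )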

Feature Storage#

cugraph_pyg.data.feature_store.FeatureStore([...])

A basic implementation of the PyG FeatureStore interface that stores feature data in WholeGraph WholeMemory.
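A minimal sketch, assuming the standard PyG FeatureStore interface that this class implements; the group and attribute names are illustrative, and constructor options are left at their defaults:

    import torch
    from cugraph_pyg.data.feature_store import FeatureStore

    # Toy node feature matrix.
    x = torch.randn(3, 16)

    feature_store = FeatureStore()  # constructor options left at their defaults
    # put_tensor is part of the PyG FeatureStore interface.
    feature_store.put_tensor(x, group_name="paper", attr_name="x", index=None)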

Tensors and Embeddings#

cugraph_pyg.tensor.dist_tensor.DistTensor([...])

WholeGraph-backed Distributed Tensor Interface for PyTorch.

Parameters
----------
src : Optional[Union[torch.Tensor, str, List[str]]]
    The source of the tensor. It can be a torch.Tensor on host, a file path, or a list of file paths. When the source is omitted, the tensor will be loaded later.
shape : Optional[list, tuple]
    The shape of the tensor. It must be one- or two-dimensional for now. When the shape is omitted, src must be specified and must be pt or npy file paths.
dtype : Optional[torch.dtype]
    The dtype of the tensor. When the dtype is omitted, src must be specified and must be pt or npy file paths.
device : Optional[Literal["cpu", "cuda"]] = "cpu"
    The desired location to store the tensor ["cpu" | "cuda"]. Default is "cpu", i.e., host-pinned memory (UVA).
partition_book : Union[List[int], None] = None
    1-D range partition based on entry (dim 0). partition_book[i] determines the entry count of rank i and should be a positive integer; the sum of partition_book should equal shape[0]. Entries are partitioned equally if None.
backend : Optional[Literal["vmm", "nccl", "nvshmem", "chunked"]] = "nccl"
    The backend used for communication. Default is "nccl".
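A minimal sketch based on the parameters documented above, assuming the distributed environment (torch.distributed / WholeGraph) has already been initialized on every rank:

    import torch
    from cugraph_pyg.tensor.dist_tensor import DistTensor

    # Host tensor, partitioned along dim 0 across ranks.
    src = torch.randn(1024, 128)

    features = DistTensor(
        src=src,
        device="cpu",     # host-pinned (UVA) memory; use "cuda" for device memory
        backend="nccl",   # communication backend
    )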

cugraph_pyg.tensor.dist_tensor.DistEmbedding([...])

WholeGraph-backed Distributed Embedding Interface for PyTorch.

Parameters
----------
src : Optional[Union[torch.Tensor, str, List[str]]]
    The source of the tensor. It can be a torch.Tensor on host, a file path, or a list of file paths. When the source is omitted, the tensor will be loaded later.
shape : Optional[list, tuple]
    The shape of the tensor. It must be one- or two-dimensional for now. When the shape is omitted, src must be specified and must be pt or npy file paths.
dtype : Optional[torch.dtype]
    The dtype of the tensor. When the dtype is omitted, src must be specified and must be pt or npy file paths.
device : Optional[Literal["cpu", "cuda"]] = "cpu"
    The desired location to store the embedding ["cpu" | "cuda"]. Default is "cpu", i.e., host-pinned memory (UVA).
partition_book : Union[List[int], None] = None
    1-D range partition based on entry (dim 0). partition_book[i] determines the entry count of rank i and should be a positive integer; the sum of partition_book should equal shape[0]. Entries are partitioned equally if None.
backend : Optional[Literal["vmm", "nccl", "nvshmem", "chunked"]] = "nccl"
    The backend used for communication. Default is "nccl".
cache_policy : Optional[WholeMemoryCachePolicy] = None
    The cache policy for the tensor if it is an embedding. Default is None.
gather_sms : Optional[int] = -1
    Whether to gather the embeddings on all GPUs. Default is False.
round_robin_size : int = 0
    Contiguous embedding size per rank when using the round-robin shard strategy.
name : Optional[str]
    The name of the tensor.
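A minimal sketch based on the parameters documented above, allocating the embedding table by shape and dtype rather than from an existing tensor; it again assumes the distributed environment has been initialized on every rank:

    import torch
    from cugraph_pyg.tensor.dist_tensor import DistEmbedding

    emb = DistEmbedding(
        shape=(100_000, 256),   # (num_embeddings, embedding_dim)
        dtype=torch.float32,
        device="cpu",           # host-pinned (UVA) memory
        backend="nccl",
    )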

cugraph_pyg.tensor.dist_matrix.DistMatrix([...])

WholeGraph-backed Distributed Matrix Interface for PyTorch.

Data Loaders#

cugraph_pyg.loader.node_loader.NodeLoader(...)

Duck-typed version of torch_geometric.loader.NodeLoader.

cugraph_pyg.loader.neighbor_loader.NeighborLoader(...)

Duck-typed version of torch_geometric.loader.NeighborLoader.
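A minimal sketch of the duck-typed interface, assuming graph_store and feature_store have been populated as in the Graph Storage and Feature Storage examples above; the fan-out, seed nodes, and batch size are illustrative:

    import torch
    from cugraph_pyg.loader.neighbor_loader import NeighborLoader

    loader = NeighborLoader(
        data=(feature_store, graph_store),
        num_neighbors=[10, 10],       # fan-out per hop
        input_nodes=torch.arange(3),  # seed nodes (illustrative)
        batch_size=2,
    )

    for batch in loader:
        ...  # each batch is a PyG-style minibatch ready for a GNN model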

cugraph_pyg.loader.link_loader.LinkLoader(...)

Duck-typed version of torch_geometric.loader.LinkLoader.

cugraph_pyg.loader.link_neighbor_loader.LinkNeighborLoader(...)

Duck-typed version of torch_geometric.loader.LinkNeighborLoader.
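A minimal sketch of the duck-typed interface for link-level sampling, again assuming graph_store and feature_store have been populated as in the storage examples above; the supervision edges and labels are illustrative:

    import torch
    from cugraph_pyg.loader.link_neighbor_loader import LinkNeighborLoader

    loader = LinkNeighborLoader(
        data=(feature_store, graph_store),
        num_neighbors=[10, 10],
        edge_label_index=torch.tensor([[0, 1],
                                       [1, 2]]),  # supervision edges (illustrative)
        edge_label=torch.ones(2),
        batch_size=2,
    )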

Samplers#

cugraph_pyg.sampler.sampler.BaseSampler(...)

cugraph_pyg.sampler.sampler.SampleReader(...)

Iterator that processes results from the cuGraph distributed sampler.

cugraph_pyg.sampler.sampler.HomogeneousSampleReader(...)

Subclass of SampleReader that reads homogeneous output samples produced by the cuGraph distributed sampler.

cugraph_pyg.sampler.sampler.HeterogeneousSampleReader(...)

Subclass of SampleReader that reads heterogeneous output samples produced by the cuGraph distributed sampler.

cugraph_pyg.sampler.sampler.SampleIterator(...)

Iterator that combines output graphs with their features to produce final output minibatches that can be fed into a GNN model.

cugraph_pyg.sampler.distributed_sampler.BaseDistributedSampler(...)

Base class for distributed graph sampling using cuGraph.

cugraph_pyg.sampler.distributed_sampler.DistributedNeighborSampler(...)