pylibwholegraph API doc

APIs
Init WholeGraph environment for PyTorch.

Init WholeGraph environment for PyTorch and create WholeMemory Communicator.

Finalize WholeGraph.
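The entry points above are normally driven by the environment set up by torchrun or a similar launcher. A minimal sketch, assuming the package-level helpers init_torch_env_and_create_wm_comm and finalize, and that the former returns the global and local-node communicators (names and return values may differ between versions):

    import os
    import torch
    import pylibwholegraph.torch as wgth

    # Rank/size information as exported by torchrun.
    world_rank = int(os.environ.get("RANK", 0))
    world_size = int(os.environ.get("WORLD_SIZE", 1))
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    local_size = int(os.environ.get("LOCAL_WORLD_SIZE", 1))

    torch.cuda.set_device(local_rank)

    # Initialize the WholeGraph environment and create communicators in one call.
    global_comm, local_comm = wgth.init_torch_env_and_create_wm_comm(
        world_rank, world_size, local_rank, local_size
    )

    # ... allocate WholeMemory tensors / embeddings and run training ...

    wgth.finalize()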
WholeMemory Communicator.

Set the global world's information.

Create WholeMemory Communicator. For example, 24 ranks with group_size = 4 and comm_stride = 2 will create the following groups: [0, 2, 4, 6], [1, 3, 5, 7], [8, 10, 12, 14], [9, 11, 13, 15], [16, 18, 20, 22], [17, 19, 21, 23].
:param group_size: size of each group; -1 means to use all ranks in one single group
:param comm_stride: stride of each rank in each group
:return: WholeMemoryCommunicator

Destroy WholeMemoryCommunicator.
:param wm_comm: WholeMemoryCommunicator to destroy
:return: None

Get the global communicator of this job.
:return: WholeMemoryCommunicator that has all GPUs in it

Get the local node communicator of this job.
:return: WholeMemoryCommunicator that has the GPUs in the same node

Get the local device communicator of this job.
:return: WholeMemoryCommunicator that has only the GPU belonging to the current process
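A short sketch of how the communicator helpers above are typically used; the function names (get_global_communicator, get_local_node_communicator, create_group_communicator, destroy_communicator) follow the pylibwholegraph.torch package as commonly imported and should be checked against the installed version:

    import pylibwholegraph.torch as wgth

    # Communicator spanning every GPU in the job.
    global_comm = wgth.get_global_communicator()

    # Communicator spanning only the GPUs on this node.
    node_comm = wgth.get_local_node_communicator()

    # Custom grouping: with 24 ranks, group_size=4 and comm_stride=2 yields
    # groups such as [0, 2, 4, 6] and [1, 3, 5, 7], as described above.
    group_comm = wgth.create_group_communicator(group_size=4, comm_stride=2)
    print(group_comm.get_rank(), group_comm.get_size())

    # Release a custom communicator once it is no longer needed.
    wgth.destroy_communicator(group_comm)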
WholeMemory Tensor

Create empty WholeMemory Tensor. Currently only dim = 1 or 2 is supported.
:param comm: WholeMemoryCommunicator
:param memory_type: WholeMemory type, should be continuous, chunked or distributed
:param memory_location: WholeMemory location, should be cpu or cuda
:param sizes: size of the tensor
:param dtype: data type of the tensor
:param strides: strides of the tensor
:param tensor_entry_partition: rank partition based on entry; tensor_entry_partition[i] determines the entry count of rank i and should be a positive integer; the sum of tensor_entry_partition should equal the total entry count; entries will be equally partitioned if None
:return: Allocated WholeMemoryTensor

Create WholeMemory Tensor from a list of binary files.
:param comm: WholeMemoryCommunicator
:param memory_type: WholeMemory type, should be continuous, chunked or distributed
:param memory_location: WholeMemory location, should be cpu or cuda
:param filelist: list of binary files
:param dtype: data type of the tensor
:param last_dim_size: 0 to create a 1-D array, a positive value for the column size of a matrix
:param last_dim_strides: stride of the last dim, -1 for the same as the size of the last dim
:param tensor_entry_partition: rank partition based on entry; tensor_entry_partition[i] determines the entry count of rank i and should be a positive integer; the sum of tensor_entry_partition should equal the total entry count; entries will be equally partitioned if None
:return: WholeMemoryTensor

Destroy allocated WholeMemory Tensor.
:param wm_tensor: WholeMemory Tensor
:return: None
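Allocating and releasing a sharded tensor with the parameter order documented above; the function names create_wholememory_tensor and destroy_wholememory_tensor are my assumption of how these entries are exposed in pylibwholegraph.torch:

    import torch
    import pylibwholegraph.torch as wgth

    comm = wgth.get_global_communicator()

    # A 1,000,000 x 128 float32 tensor, sharded across ranks ("distributed"),
    # with the payload resident in GPU memory ("cuda").
    wm_tensor = wgth.create_wholememory_tensor(
        comm,
        "distributed",    # memory_type: continuous, chunked or distributed
        "cuda",           # memory_location: cpu or cuda
        [1000000, 128],   # sizes
        torch.float32,    # dtype
        None,             # strides: None keeps the default row-major layout
    )

    # ... read/write through the WholeMemoryTensor API ...

    wgth.destroy_wholememory_tensor(wm_tensor)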
Sparse Optimizer for WholeMemoryEmbedding.

Create WholeMemoryOptimizer.

Destroy WholeMemoryOptimizer.
:param optimizer: WholeMemoryOptimizer to destroy
:return: None
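A sketch of creating and releasing the sparse optimizer; the call create_wholememory_optimizer("adam", {}) (an optimizer type name plus a dict of hyperparameters) is how I recall the API and should be verified against the installed version:

    import pylibwholegraph.torch as wgth

    # Sparse Adam optimizer; the dict carries optimizer hyperparameters
    # (an empty dict keeps the defaults).
    optimizer = wgth.create_wholememory_optimizer("adam", {})

    # ... attach it to a trainable WholeMemoryEmbedding, then train ...

    wgth.destroy_wholememory_optimizer(optimizer)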
Cache policy to create WholeMemoryEmbedding.

Create WholeMemoryCachePolicy. NOTE: in most cases, the builtin cache policy below is sufficient.

Create builtin cache policy.

Destroy WholeMemoryCachePolicy.
:param cache_policy: WholeMemoryCachePolicy to destroy
:return: None
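A hedged sketch of building a builtin cache policy; I am not certain of the exact argument order of create_builtin_cache_policy, so the positional layout below (builtin cache type, embedding memory type, embedding memory location, access type, cache ratio) is an assumption to confirm against the installed version:

    import pylibwholegraph.torch as wgth

    # Cache the hottest 20% of a host-resident embedding table on each device.
    cache_policy = wgth.create_builtin_cache_policy(
        "local_device",   # builtin cache type (assumed name)
        "distributed",    # memory_type of the embedding being cached
        "cpu",            # memory_location of the embedding being cached
        "readonly",       # access type
        0.2,              # cache ratio
    )

    # ... pass as cache_policy when creating the WholeMemoryEmbedding ...

    # Destroy only after the embedding that uses it has been destroyed.
    wgth.destroy_wholememory_cache_policy(cache_policy)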
WholeMemory Embedding

Create embedding.
:param comm: WholeMemoryCommunicator
:param memory_type: WholeMemory type, should be continuous, chunked or distributed
:param memory_location: WholeMemory location, should be cpu or cuda
:param dtype: data type
:param sizes: size of the embedding, must be 2D
:param cache_policy: cache policy
:param embedding_entry_partition: rank partition based on entry; embedding_entry_partition[i] determines the entry count of rank i and should be a positive integer; the sum of embedding_entry_partition should equal the total entry count; entries will be equally partitioned if None
:param gather_sms: the number of SMs used in the gather process
:param round_robin_size: continuous embedding size of a rank using the round-robin shard strategy
:return: WholeMemoryEmbedding

Create embedding from a file list.
:param comm: WholeMemoryCommunicator
:param memory_type: WholeMemory type, should be continuous, chunked or distributed
:param memory_location: WholeMemory location, should be cpu or cuda
:param filelist: list of files
:param dtype: data type
:param last_dim_size: size of the last dim
:param cache_policy: cache policy
:param embedding_entry_partition: rank partition based on entry; embedding_entry_partition[i] determines the entry count of rank i and should be a positive integer; the sum of embedding_entry_partition should equal the total entry count; entries will be equally partitioned if None
:param gather_sms: the number of SMs used in the gather process
:param round_robin_size: continuous embedding size of a rank using the round-robin shard strategy
:return: WholeMemoryEmbedding

Destroy WholeMemoryEmbedding.
:param wm_embedding: WholeMemoryEmbedding to destroy
:return: None
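Creating an embedding table and gathering rows by index, following the parameter order documented above; create_embedding, the gather method and destroy_embedding are the names I expect in pylibwholegraph.torch and should be treated as assumptions:

    import torch
    import pylibwholegraph.torch as wgth

    comm = wgth.get_global_communicator()

    # 10,000,000 x 256 float16 embedding table, chunked across GPU memory.
    wm_embedding = wgth.create_embedding(
        comm,
        "chunked",            # memory_type
        "cuda",               # memory_location
        torch.float16,        # dtype
        [10000000, 256],      # sizes: must be 2D
        cache_policy=None,    # optionally a WholeMemoryCachePolicy
    )

    # Gather rows by index; indices live on the current GPU.
    indices = torch.randint(0, 10000000, (1024,), dtype=torch.int64, device="cuda")
    features = wm_embedding.gather(indices)   # shape (1024, 256)

    wgth.destroy_embedding(wm_embedding)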
torch.nn.Module wrapper of WholeMemoryEmbedding
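The wrapper lets an embedding lookup sit inside an ordinary model; assuming the class is WholeMemoryEmbeddingModule, constructed from an existing WholeMemoryEmbedding and called with an index tensor:

    import torch
    import torch.nn as nn
    import pylibwholegraph.torch as wgth

    class NodeEncoder(nn.Module):
        """Looks up WholeMemory-backed node features inside a normal nn.Module."""

        def __init__(self, wm_embedding):
            super().__init__()
            # Wrap the embedding so lookups behave like a module call.
            self.gather_fn = wgth.WholeMemoryEmbeddingModule(wm_embedding)

        def forward(self, node_ids: torch.Tensor) -> torch.Tensor:
            return self.gather_fn(node_ids)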
Graph structure storage. It stores the graph structure of one relation, represented in CSR format.
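The CSR layout referred to above stores one relation as a row-pointer array plus a column-index array. A small illustration with plain torch tensors (this shows the data layout only, not the storage class's own API):

    import torch

    # CSR for a 4-node relation with edges 0->1, 0->2, 1->2, 3->0:
    # row_ptr[i] .. row_ptr[i + 1] delimits the neighbors of node i in col_ind.
    csr_row_ptr = torch.tensor([0, 2, 3, 3, 4], dtype=torch.int64)
    csr_col_ind = torch.tensor([1, 2, 2, 0], dtype=torch.int64)

    # Neighbors of node 0 -> tensor([1, 2])
    print(csr_col_ind[csr_row_ptr[0]:csr_row_ptr[1]])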