Attention

The vector search and clustering algorithms in RAFT are being migrated to cuVS, a new library dedicated to vector search. We will continue to support the vector search algorithms in RAFT during this move, but will no longer update them after the RAPIDS 24.06 (June) release. We plan to complete the migration by the RAPIDS 24.10 (October) release, and the algorithms will be removed from RAFT altogether in the 24.12 (December) release.

Sparse Matrix Operations

template<typename T>
void raft::sparse::op::coo_remove_scalar(const int *rows, const int *cols, const T *vals, int nnz, int *crows, int *ccols, T *cvals, int *cnnz, int *cur_cnnz, T scalar, int n, cudaStream_t stream)

Removes the values matching a particular scalar from a COO formatted sparse matrix.

Parameters:
  • rows – input array of rows (size nnz)

  • cols – input array of cols (size nnz)

  • vals – input array of vals (size nnz)

  • nnz – size of current rows/cols/vals arrays

  • crows – compressed array of rows

  • ccols – compressed array of cols

  • cvals – compressed array of vals

  • cnnz – array of per-row counts of the entries to keep, i.e. those not equal to scalar (size n)

  • cur_cnnz – array of per-row counts of all current entries (size n)

  • scalar – scalar to remove from arrays

  • n – number of rows in dense matrix

  • stream – cuda stream to use
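To make the intended result concrete, the following is a minimal host-side reference sketch in plain C++. It is not the RAFT implementation, which runs on the GPU and works from the per-row counts; `HostCoo` and `remove_scalar_reference` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side COO container used only for this illustration.
template <typename T>
struct HostCoo {
  std::vector<int> rows;
  std::vector<int> cols;
  std::vector<T> vals;
};

// Returns a copy of `in` without the entries whose value equals `scalar`.
// The relative order of the remaining entries is preserved.
template <typename T>
HostCoo<T> remove_scalar_reference(const HostCoo<T>& in, T scalar)
{
  assert(in.rows.size() == in.vals.size() && in.cols.size() == in.vals.size());
  HostCoo<T> out;
  for (std::size_t i = 0; i < in.vals.size(); ++i) {
    if (in.vals[i] != scalar) {
      out.rows.push_back(in.rows[i]);
      out.cols.push_back(in.cols[i]);
      out.vals.push_back(in.vals[i]);
    }
  }
  return out;
}
```

Calling the sketch with a scalar of 0 gives the behaviour described for coo_remove_zeros.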

template<typename T>
void raft::sparse::op::coo_remove_scalar(COO<T> *in, COO<T> *out, T scalar, cudaStream_t stream)

Removes the values matching a particular scalar from a COO formatted sparse matrix.

Parameters:
  • in – input COO matrix

  • out – output COO matrix

  • scalar – scalar to remove from arrays

  • stream – cuda stream to use

template<typename T>
void raft::sparse::op::coo_remove_zeros(COO<T> *in, COO<T> *out, cudaStream_t stream)

Removes zeros from a COO formatted sparse matrix.

Parameters:
  • in – input COO matrix

  • out – output COO matrix

  • stream – cuda stream to use

template<typename value_idx>
void raft::sparse::op::compute_duplicates_mask(value_idx *mask, const value_idx *rows, const value_idx *cols, size_t nnz, cudaStream_t stream)

Computes a mask from a sorted COO matrix where 0’s denote duplicate values and 1’s denote new values. This mask can be useful for computing an exclusive scan to pre-build offsets for reducing duplicates, such as when symmetrizing or taking the min of each duplicated value.

Note that this function always marks the first element as 0 so that a cumulative sum can be performed over the mask as a follow-on step. If the mask is used directly instead, treat the first element as a 1, since the first occurrence of any value always counts as new.

Template Parameters:

value_idx

Parameters:
  • mask – [out] output mask, size nnz

  • rows – [in] COO rows array, size nnz

  • cols – [in] COO cols array, size nnz

  • nnz – [in] number of nonzeros in input arrays

  • stream – [in] cuda ops will be ordered wrt this stream
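The mask and the follow-on cumulative sum can be illustrated with a small host-side sketch. This is a plain C++ reference written from the description above, not the RAFT kernel; the function names are hypothetical, and the use of a running (inclusive) sum to obtain output slots is one possible way to consume the mask.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Host-side reference: mask[i] is 1 when entry i starts a new (row, col) pair
// and 0 when it repeats the previous pair. The first entry is forced to 0.
std::vector<int> duplicates_mask_reference(const std::vector<int>& rows,
                                           const std::vector<int>& cols)
{
  assert(rows.size() == cols.size());
  std::vector<int> mask(rows.size(), 0);
  for (std::size_t i = 1; i < rows.size(); ++i) {
    bool is_new = rows[i] != rows[i - 1] || cols[i] != cols[i - 1];
    mask[i]     = is_new ? 1 : 0;
  }
  return mask;
}

// Cumulative sum of the mask: the zero-based output slot of each input entry.
std::vector<int> output_slots_reference(const std::vector<int>& mask)
{
  std::vector<int> slots(mask.size());
  std::partial_sum(mask.begin(), mask.end(), slots.begin());
  return slots;
}
```

Because the first mask element is 0, the cumulative sum starts at slot 0, and every duplicate lands in the same slot as the first occurrence of its (row, col) pair.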

template<typename value_idx, typename value_t>
void raft::sparse::op::max_duplicates(raft::resources const &handle, raft::sparse::COO<value_t, value_idx> &out, const value_idx *rows, const value_idx *cols, const value_t *vals, size_t nnz, size_t m, size_t n)

Performs a COO reduce of duplicate columns per row, taking the max weight for duplicate columns in each row. This function assumes the input COO has been sorted by both row and column but makes no assumption on the sorting of values.

Template Parameters:
  • value_idx

  • value_t

Parameters:
  • handle – [in] RAFT resources handle

  • out – [out] output COO; its nnz is computed and allocate() is called inside this function.

  • rows – [in] COO rows array, size nnz

  • cols – [in] COO cols array, size nnz

  • vals – [in] COO vals array, size nnz

  • nnz – [in] number of nonzeros in COO input arrays

  • m – [in] number of rows in COO input matrix

  • n – [in] number of columns in COO input matrix
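The reduction described above can be sketched on the host as a single pass over the sorted entries. This is an illustrative plain C++ reference, not the RAFT code path; `CooEntries` and `max_duplicates_reference` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side COO container used only for this illustration.
struct CooEntries {
  std::vector<int> rows;
  std::vector<int> cols;
  std::vector<float> vals;
};

// Collapses runs of identical (row, col) pairs, keeping the largest value.
// The input must already be sorted by row and then by column.
CooEntries max_duplicates_reference(const CooEntries& in)
{
  assert(in.rows.size() == in.vals.size() && in.cols.size() == in.vals.size());
  CooEntries out;
  for (std::size_t i = 0; i < in.vals.size(); ++i) {
    bool same_as_last = !out.rows.empty() && out.rows.back() == in.rows[i] &&
                        out.cols.back() == in.cols[i];
    if (same_as_last) {
      out.vals.back() = std::max(out.vals.back(), in.vals[i]);
    } else {
      out.rows.push_back(in.rows[i]);
      out.cols.push_back(in.cols[i]);
      out.vals.push_back(in.vals[i]);
    }
  }
  return out;
}
```

The values inside a run of duplicates can appear in any order, which matches the statement that no assumption is made on the sorting of values.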

template<typename Index_>
void raft::sparse::op::csr_row_op(const Index_ *row_ind, Index_ n_rows, Index_ nnz, Lambda op, cudaStream_t stream)

Perform a custom row operation on a CSR matrix in batches.

Template Parameters:
  • Index_ – numerical type of row_ind array

  • TPB_X – number of threads per block to use for underlying kernel

  • Lambda – type of custom operation function

Parameters:
  • row_ind – the CSR row_ind array to perform parallel operations over

  • n_rows – total number of rows (vertices in the graph)

  • nnz – number of non-zeros

  • op – custom row operation functor accepting the row index and the beginning and ending offsets of that row's entries.

  • stream – cuda stream to use
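A sequential host-side sketch can help show what the functor sees for each row. It is a plain C++ illustration rather than the GPU kernel; passing the stop offset to the functor and treating row_ind as having n_rows entries (with nnz closing the last row) are assumptions of this sketch, and `csr_row_op_reference` is a hypothetical name.

```cpp
#include <cassert>
#include <vector>

// Calls op(row, start, stop) once per row, where [start, stop) is the range of
// that row's entries in the CSR column/value arrays.
template <typename Index, typename Op>
void csr_row_op_reference(const std::vector<Index>& row_ind, Index nnz, Op op)
{
  Index n_rows = static_cast<Index>(row_ind.size());
  for (Index row = 0; row < n_rows; ++row) {
    Index start = row_ind[row];
    // The last row has no following row_ind entry, so nnz closes its range.
    Index stop = (row + 1 < n_rows) ? row_ind[row + 1] : nnz;
    assert(start <= stop);
    op(row, start, stop);
  }
}
```

For example, a functor that stores stop - start per row computes the row lengths. On the GPU each row would be handled by its own thread, so the functor should not depend on rows being visited in order.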

template<typename value_idx>
void raft::sparse::op::csr_row_slice_indptr(value_idx start_row, value_idx stop_row, const value_idx *indptr, value_idx *indptr_out, value_idx *start_offset, value_idx *stop_offset, cudaStream_t stream)

Slices consecutive rows from a CSR array and populates the newly sliced indptr array.

Template Parameters:

value_idx

Parameters:
  • start_row – [in] beginning row to slice

  • stop_row – [in] ending row to slice

  • indptr – [in] indptr of input CSR to slice

  • indptr_out – [out] output sliced indptr to populate

  • start_offset – [out] beginning column offset of the slice, read from the input indptr

  • stop_offset – [out] ending column offset of the slice, read from the input indptr

  • stream – [in] cuda stream for ordering events
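The relationship between the input indptr, the sliced indptr, and the two offsets can be sketched on the host as follows. This is an illustrative plain C++ reference, not the RAFT implementation; it assumes that stop_row is inclusive and that the two offsets are results of the call, and `RowSlice` and `slice_indptr_reference` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: the rebased indptr plus the offsets of the slice
// in the original column/data arrays.
struct RowSlice {
  std::vector<int> indptr_out;
  int start_offset;
  int stop_offset;
};

// Slices rows [start_row, stop_row] (inclusive, by assumption) from a CSR
// indptr that has n_rows + 1 entries.
RowSlice slice_indptr_reference(const std::vector<int>& indptr, int start_row, int stop_row)
{
  assert(0 <= start_row && start_row <= stop_row);
  assert(static_cast<std::size_t>(stop_row) + 1 < indptr.size());
  RowSlice s;
  s.start_offset = indptr[start_row];
  s.stop_offset  = indptr[stop_row + 1];
  // Rebase the copied indptr entries so the slice starts at offset 0.
  for (int r = start_row; r <= stop_row + 1; ++r) {
    s.indptr_out.push_back(indptr[r] - s.start_offset);
  }
  return s;
}
```

The two offsets are what a follow-on step needs in order to copy the matching range of column indices and values.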

template<typename value_idx, typename value_t>
void raft::sparse::op::csr_row_slice_populate(value_idx start_offset, value_idx stop_offset, const value_idx *indices, const value_t *data, value_idx *indices_out, value_t *data_out, cudaStream_t stream)

Slices rows from a CSR and populates the column and data arrays.

Template Parameters:
  • value_idx – data type of CSR index arrays

  • value_t – data type of CSR data array

Parameters:
  • start_offset – [in] beginning column offset to slice

  • stop_offset – [in] ending column offset to slice

  • indices – [in] column indices array from input CSR

  • data – [in] data array from input CSR

  • indices_out – [out] output column indices array

  • data_out – [out] output data array

  • stream – [in] cuda stream for ordering events
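Populating the sliced arrays amounts to copying one contiguous range from the input arrays. The host-side sketch below illustrates this in plain C++; treating [start_offset, stop_offset) as a half-open range is an assumption, and `slice_populate_reference` is a hypothetical name rather than part of RAFT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies the entries in [start_offset, stop_offset) from the input CSR
// column-index and data arrays into the output arrays.
void slice_populate_reference(int start_offset,
                              int stop_offset,
                              const std::vector<int>& indices,
                              const std::vector<float>& data,
                              std::vector<int>& indices_out,
                              std::vector<float>& data_out)
{
  assert(0 <= start_offset && start_offset <= stop_offset);
  assert(static_cast<std::size_t>(stop_offset) <= indices.size());
  assert(indices.size() == data.size());
  indices_out.assign(indices.begin() + start_offset, indices.begin() + stop_offset);
  data_out.assign(data.begin() + start_offset, data.begin() + stop_offset);
}
```

Column indices are copied unchanged, because slicing rows does not alter the column space of the matrix.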

template<typename T, typename IdxT = int>
void raft::sparse::op::coo_sort(IdxT m, IdxT n, IdxT nnz, IdxT *rows, IdxT *cols, T *vals, cudaStream_t stream)

Sorts the arrays that comprise the coo matrix by row and then by column.

Parameters:
  • m – number of rows in coo matrix

  • n – number of cols in coo matrix

  • nnz – number of non-zeros

  • rows – rows array from coo matrix

  • cols – cols array from coo matrix

  • vals – vals array from coo matrix

  • stream – cuda stream to use
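The ordering this produces can be illustrated with a host-side sketch that sorts a permutation by (row, column) and then applies it to all three arrays. This is a plain C++ reference for the resulting order only; it says nothing about how RAFT performs the sort on the GPU, and `coo_sort_reference` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Reorders rows, cols and vals in place so that entries are sorted by row and,
// within a row, by column.
void coo_sort_reference(std::vector<int>& rows, std::vector<int>& cols, std::vector<float>& vals)
{
  assert(rows.size() == vals.size() && cols.size() == vals.size());
  std::vector<std::size_t> perm(rows.size());
  std::iota(perm.begin(), perm.end(), std::size_t{0});
  std::stable_sort(perm.begin(), perm.end(), [&](std::size_t a, std::size_t b) {
    if (rows[a] != rows[b]) { return rows[a] < rows[b]; }
    return cols[a] < cols[b];
  });
  std::vector<int> r(rows.size()), c(cols.size());
  std::vector<float> v(vals.size());
  for (std::size_t i = 0; i < perm.size(); ++i) {
    r[i] = rows[perm[i]];
    c[i] = cols[perm[i]];
    v[i] = vals[perm[i]];
  }
  rows.swap(r);
  cols.swap(c);
  vals.swap(v);
}
```

A COO sorted this way satisfies the precondition stated for max_duplicates and compute_duplicates_mask.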

template<typename T, typename IdxT = int>
void raft::sparse::op::coo_sort(COO<T, IdxT> *const in, cudaStream_t stream)

Sort the underlying COO arrays by row.

Template Parameters:

T – the type name of the underlying value array

Parameters:
  • in – COO to sort by row

  • stream – the cuda stream to use

template<typename value_idx, typename value_t>
void raft::sparse::op::coo_sort_by_weight(value_idx *rows, value_idx *cols, value_t *data, value_idx nnz, cudaStream_t stream)

Sorts a COO by its edge weights.

Template Parameters:
  • value_idx

  • value_t

Parameters:
  • rows – [inout] source edges

  • cols – [inout] dest edges

  • data – [inout] edge weights

  • nnz – [in] number of edges in edge list

  • stream – [in] cuda stream for which to order cuda operations
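Sorting an edge list by weight keeps each (source, destination, weight) triple together while reordering by the weight. The host-side sketch below illustrates this in plain C++; ascending order is an assumption of the sketch, and `sort_by_weight_reference` is a hypothetical name rather than part of RAFT.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Reorders the edge list in place so that weights are in ascending order
// (assumed direction), moving each edge's endpoints along with its weight.
void sort_by_weight_reference(std::vector<int>& rows,
                              std::vector<int>& cols,
                              std::vector<float>& data)
{
  assert(rows.size() == data.size() && cols.size() == data.size());
  std::vector<std::size_t> perm(data.size());
  std::iota(perm.begin(), perm.end(), std::size_t{0});
  std::stable_sort(perm.begin(), perm.end(),
                   [&](std::size_t a, std::size_t b) { return data[a] < data[b]; });
  std::vector<int> r(rows.size()), c(cols.size());
  std::vector<float> d(data.size());
  for (std::size_t i = 0; i < perm.size(); ++i) {
    r[i] = rows[perm[i]];
    c[i] = cols[perm[i]];
    d[i] = data[perm[i]];
  }
  rows.swap(r);
  cols.swap(c);
  data.swap(d);
}
```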