Attention
The vector search and clustering algorithms in RAFT are being migrated to cuVS, a new library dedicated to vector search. We will continue to support the vector search algorithms in RAFT during this move, but will no longer update them after the RAPIDS 24.06 (June) release. We plan to complete the migration by the RAPIDS 24.10 (October) release, and the algorithms will be removed from RAFT altogether in the 24.12 (December) release.
Sampling Without Replacement
#include <raft/random/sample_without_replacement.cuh>
namespace raft::random
template<typename DataT, typename IdxT, typename WeightsVectorType, class OutIndexVectorType>
void sample_without_replacement(raft::resources const &handle, RngState &rng_state, raft::device_vector_view<const DataT, IdxT> in, WeightsVectorType &&weights_opt, raft::device_vector_view<DataT, IdxT> out, OutIndexVectorType &&outIdx_opt)
Sample the input vector without replacement, optionally based on the input weight vector for each element in the array.
The implementation is based on the one-pass sampling algorithm described in “Accelerating weighted random sampling without replacement,” a technical report by Kirill Mueller.
If no input weight vector is provided, then input elements will be sampled uniformly. Otherwise, the elements sampled from the input vector will always appear in increasing order of their weights as computed using the exponential distribution. So, if you are particular about the order (e.g., for array permutations), then this might not be the right choice.
Note
Please do not specify template parameters explicitly, as the compiler can deduce them from the arguments.
- Template Parameters:
DataT – type of each element of the input array in
IdxT – type of the dimensions of the arrays; output index type
WeightsVectorType – std::optional<raft::device_vector_view<const weight_type, IdxT>> holding the elements of the weights array weights_opt
OutIndexVectorType – std::optional<raft::device_vector_view<IdxT, IdxT>> of output indices outIdx_opt
- Parameters:
handle – [in] RAFT handle containing (among other resources) the CUDA stream on which to run.
rng_state – [inout] Pseudorandom number generator state.
in – [in] Input vector to be sampled.
weights_opt – [in] std::optional weights vector. If not provided, uniform sampling will be used.
out – [out] Vector of samples from the input vector.
outIdx_opt – [out] std::optional vector of the indices sampled from the input array.
- Pre:
The number of samples out.extent(0) is less than or equal to the number of inputs in.extent(0).
- Pre:
If weights_opt has a value, the number of weights (*weights_opt).extent(0) equals the number of inputs in.extent(0).
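The weighted case described above can be illustrated on the host with a minimal sketch. It assumes the common exponential-key formulation: each element draws a key from an exponential distribution whose rate is its weight, and the n_samples elements with the smallest keys are kept, in key order. The function name sample_indices_without_replacement and the use of std::vector and std::mt19937_64 are assumptions made for illustration. This is not RAFT's GPU implementation and is not expected to reproduce its output.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Host-only sketch (hypothetical helper, not part of RAFT).
// Returns the indices of n_samples distinct elements, ordered by their random keys.
std::vector<std::size_t> sample_indices_without_replacement(
  std::mt19937_64& rng, const std::vector<double>& weights, std::size_t n_samples)
{
  // Mirrors the documented precondition: cannot draw more samples than inputs.
  assert(n_samples <= weights.size());

  // One exponentially distributed key per element; larger weights give a
  // larger rate and therefore tend to produce smaller keys.
  std::vector<double> keys(weights.size());
  for (std::size_t i = 0; i < weights.size(); ++i) {
    assert(weights[i] > 0.0);  // assumption of this sketch: strictly positive weights
    std::exponential_distribution<double> expo(weights[i]);
    keys[i] = expo(rng);
  }

  // Keep the n_samples indices with the smallest keys, sorted by key.
  std::vector<std::size_t> idx(weights.size());
  std::iota(idx.begin(), idx.end(), std::size_t{0});
  std::partial_sort(idx.begin(),
                    idx.begin() + static_cast<std::ptrdiff_t>(n_samples),
                    idx.end(),
                    [&keys](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });
  idx.resize(n_samples);
  return idx;
}
```

Because the result is ordered by key rather than shuffled, heavier elements tend to appear earlier, which is why the description above cautions against using this approach when the output order matters.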
template<typename ...Args, typename = std::enable_if_t<sizeof...(Args) == 5>>
void sample_without_replacement(Args... args)
Overload of sample_without_replacement to help the compiler find the above overload, in case users pass in std::nullopt for one or both of the optional arguments.
Please see above for documentation of sample_without_replacement.
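Why such an overload helps can be seen in plain C++: a parameter declared as std::optional<T> cannot have T deduced from std::nullopt, while a forwarding-reference parameter deduces std::nullopt_t and then has to be converted to the intended optional type. The following is a minimal host-only sketch of that conversion. The names as_optional and count_weights are hypothetical and are not part of RAFT, and RAFT's actual dispatch may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical helper: normalize "a value, an optional, or std::nullopt"
// into std::optional<T>.
template <typename T, typename Arg>
std::optional<T> as_optional(Arg&& arg)
{
  if constexpr (std::is_same_v<std::decay_t<Arg>, std::nullopt_t>) {
    return std::nullopt;  // caller passed std::nullopt directly
  } else {
    return std::optional<T>(std::forward<Arg>(arg));
  }
}

// Hypothetical function with an optional argument taken by forwarding reference,
// so that std::nullopt is accepted without explicit template arguments.
template <typename WeightsArg>
std::size_t count_weights(WeightsArg&& weights_opt)
{
  auto weights = as_optional<std::vector<double>>(std::forward<WeightsArg>(weights_opt));
  return weights.has_value() ? weights->size() : 0;
}
```

With this shape, count_weights(std::nullopt) compiles and reports zero weights, which corresponds to the uniform-sampling path in the documentation above.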
#include <raft/random/permute.cuh>
namespace raft::random
template<typename InputOutputValueType, typename IntType, typename IdxType, typename Layout>
void permute(raft::resources const &handle, raft::device_matrix_view<const InputOutputValueType, IdxType, Layout> in, std::optional<raft::device_vector_view<IntType, IdxType>> permsOut, std::optional<raft::device_matrix_view<InputOutputValueType, IdxType, Layout>> out)
Randomly permute the rows of the input matrix.
We do not support in-place permutation, so that we can compute in parallel without race conditions. This function is useful for shuffling input data sets in machine learning algorithms.
Note
This is NOT a uniform permutation generator! It only generates a small fraction of all possible random permutations. If your application needs a high-quality permutation generator, then we recommend Knuth Shuffle.
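For reference, the Knuth shuffle (also known as Fisher-Yates) mentioned in the note can be sketched on the host as follows. The function names and the choice of std::mt19937_64 are assumptions for illustration. This is not RAFT code, and it runs sequentially rather than on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Knuth (Fisher-Yates) shuffle: returns a permutation of 0..n-1 in which
// every ordering is equally likely, given an ideal random source.
std::vector<std::size_t> knuth_shuffle_indices(std::mt19937_64& rng, std::size_t n)
{
  std::vector<std::size_t> perm(n);
  std::iota(perm.begin(), perm.end(), std::size_t{0});
  // Walk from the back; swap position i-1 with a uniformly chosen position in [0, i-1].
  for (std::size_t i = n; i > 1; --i) {
    std::uniform_int_distribution<std::size_t> pick(0, i - 1);
    std::swap(perm[i - 1], perm[pick(rng)]);
  }
  return perm;
}

// Debug helper: asserts that perm contains each of 0..perm.size()-1 exactly once.
void assert_is_permutation(const std::vector<std::size_t>& perm)
{
  std::vector<bool> seen(perm.size(), false);
  for (std::size_t p : perm) {
    assert(p < perm.size() && !seen[p]);
    seen[p] = true;
  }
}
```

The resulting index vector can then be used to gather rows out of place.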
- Template Parameters:
InputOutputValueType – Type of each element of the input matrix, and the type of each element of the output matrix (if provided)
IntType – Integer type of each element of permsOut
IdxType – Integer type of the extents of the mdspan parameters
Layout – Either raft::row_major or raft::col_major
- Parameters:
handle – [in] RAFT handle containing the CUDA stream on which to run.
in – [in] input matrix
permsOut – [out] If provided, the indices of the permutation.
out – [out] If provided, the output matrix, containing the permuted rows of the input matrix in. (Not providing this is only useful if you provide permsOut.)
- Pre:
If permsOut.has_value() is true, then (*permsOut).extent(0) == in.extent(0) is true.
- Pre:
If out.has_value() is true, then (*out).extents() == in.extents() is true.
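The out-of-place contract and the two preconditions can be illustrated with a small host-only sketch. Several things here are assumptions: the name permute_rows is hypothetical, nullable pointers stand in for the std::optional views, the permutation is supplied by the caller instead of being generated, and rows are gathered as out[i] = in[perm[i]] (the documentation above does not state which direction RAFT uses).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Host-only sketch (not RAFT code): permute the rows of a row-major matrix.
// perms_out and out may each be nullptr, standing in for an empty std::optional.
void permute_rows(const std::vector<float>& in,
                  std::size_t n_rows,
                  std::size_t n_cols,
                  const std::vector<std::size_t>& perm,
                  std::vector<std::size_t>* perms_out,
                  std::vector<float>* out)
{
  assert(in.size() == n_rows * n_cols);
  assert(perm.size() == n_rows);

  if (perms_out != nullptr) {
    // Precondition analogue: the index output has one entry per input row.
    assert(perms_out->size() == n_rows);
    std::copy(perm.begin(), perm.end(), perms_out->begin());
  }

  if (out != nullptr) {
    // Precondition analogue: the output has the same extents as the input,
    // and it must be a different buffer (no in-place permutation).
    assert(out->size() == in.size());
    assert(out != &in);
    for (std::size_t i = 0; i < n_rows; ++i) {
      for (std::size_t j = 0; j < n_cols; ++j) {
        (*out)[i * n_cols + j] = in[perm[i] * n_cols + j];
      }
    }
  }
}
```

Writing into a separate buffer means each output row depends only on the read-only input, so the rows could be processed independently, which is the reason given above for not supporting in-place permutation.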
template<typename InputOutputValueType, typename IdxType, typename Layout, typename PermsOutType, typename OutType>
void permute(raft::resources const &handle, raft::device_matrix_view<const InputOutputValueType, IdxType, Layout> in, PermsOutType &&permsOut, OutType &&out)
Overload of permute that compiles if users pass in std::nullopt for either or both of permsOut and out.