Attention
The vector search and clustering algorithms in RAFT are being migrated to a new library dedicated to vector search called cuVS. We will continue to support the vector search algorithms in RAFT during this move, but will no longer update them after the RAPIDS 24.06 (June) release. We plan to complete the migration by the RAPIDS 24.10 (October) release, and the algorithms will be removed from RAFT altogether in the 24.12 (December) release.
Sampling Without Replacement
#include <raft/random/sample_without_replacement.cuh>
namespace raft::random
-
template<typename DataT, typename IdxT, typename WeightsVectorType, class OutIndexVectorType>
void sample_without_replacement(raft::resources const &handle, RngState &rng_state, raft::device_vector_view<const DataT, IdxT> in, WeightsVectorType &&weights_opt, raft::device_vector_view<DataT, IdxT> out, OutIndexVectorType &&outIdx_opt)
Sample the input vector without replacement, optionally based on the input weight vector for each element in the array.
The implementation is based on the one-pass sampling algorithm described in “Accelerating weighted random sampling without replacement,” a technical report by Kirill Mueller.
If no input weight vector is provided, then input elements will be sampled uniformly. Otherwise, the elements sampled from the input vector will always appear in increasing order of their weights as computed using the exponential distribution. So, if the order of the samples matters to you (for example, for array permutations), then this might not be the right choice.
Note
Please do not specify template parameters explicitly, as the compiler can deduce them from the arguments.
- Template Parameters:
DataT – type of each element of the input array in
IdxT – type of the dimensions of the arrays; output index type
WeightsVectorType – std::optional<raft::device_vector_view<const weight_type, IdxT>> holding the elements of the weights array weights_opt
OutIndexVectorType – std::optional<raft::device_vector_view<IdxT, IdxT>> of output indices outIdx_opt
- Parameters:
handle – [in] RAFT handle containing (among other resources) the CUDA stream on which to run.
rng_state – [inout] Pseudorandom number generator state.
in – [in] Input vector to be sampled.
weights_opt – [in] std::optional weights vector. If not provided, uniform sampling will be used.
out – [out] Vector of samples from the input vector.
outIdx_opt – [out] std::optional vector of the indices sampled from the input array.
- Pre:
The number of samples out.extent(0) is less than or equal to the number of inputs in.extent(0).
- Pre:
The number of weights wts.extent(0) equals the number of inputs in.extent(0).
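The exponential-key idea described above can be sketched on the host in plain C++. This is a minimal illustration under stated assumptions, not RAFT's CUDA implementation: the helper name sample_indices_without_replacement, the use of std::mt19937_64, and the key formula Exp(1) / weight are all choices made for this sketch, and weights are assumed to be strictly positive. The two assertions mirror the preconditions listed above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <random>
#include <utility>
#include <vector>

// Hypothetical host-side helper, not part of RAFT.
// Returns the indices of n_samples distinct elements out of n_inputs.
// Assumption: every weight, if provided, is strictly positive.
std::vector<std::size_t> sample_indices_without_replacement(
  std::size_t n_inputs,
  std::size_t n_samples,
  const std::optional<std::vector<double>>& weights,
  std::mt19937_64& rng)
{
  // Preconditions: no more samples than inputs, one weight per input.
  assert(n_samples <= n_inputs);
  assert(!weights.has_value() || weights->size() == n_inputs);

  // Give each element a key Exp(1) / weight. A larger weight tends to
  // produce a smaller key. Without weights, every element uses weight 1.
  std::exponential_distribution<double> exp_dist(1.0);
  std::vector<std::pair<double, std::size_t>> keyed(n_inputs);
  for (std::size_t i = 0; i < n_inputs; ++i) {
    const double w = weights.has_value() ? (*weights)[i] : 1.0;
    keyed[i] = {exp_dist(rng) / w, i};
  }

  // Keep the n_samples smallest keys, ordered by key.
  const auto middle = keyed.begin() + static_cast<std::ptrdiff_t>(n_samples);
  std::partial_sort(keyed.begin(), middle, keyed.end());

  std::vector<std::size_t> out(n_samples);
  for (std::size_t k = 0; k < n_samples; ++k) { out[k] = keyed[k].second; }
  return out;
}
```

In this sketch the returned indices are ordered by key, so with a weight vector the heavier elements tend to appear earlier. That ordering is why a scheme like this is a poor fit when the order of the samples must itself be uniformly random.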
-
template<typename ...Args, typename = std::enable_if_t<sizeof...(Args) == 5>>
void sample_without_replacement(Args... args)
Overload of sample_without_replacement to help the compiler find the above overload, in case users pass in std::nullopt for one or both of the optional arguments.
Please see above for documentation of sample_without_replacement.
#include <raft/random/permute.cuh>
namespace raft::random
-
template<typename InputOutputValueType, typename IntType, typename IdxType, typename Layout>
void permute(raft::resources const &handle, raft::device_matrix_view<const InputOutputValueType, IdxType, Layout> in, std::optional<raft::device_vector_view<IntType, IdxType>> permsOut, std::optional<raft::device_matrix_view<InputOutputValueType, IdxType, Layout>> out)
Randomly permute the rows of the input matrix.
We do not support in-place permutation, so that we can compute in parallel without race conditions. This function is useful for shuffling input data sets in machine learning algorithms.
Note
This is NOT a uniform permutation generator! It only generates a small fraction of all possible random permutations. If your application needs a high-quality permutation generator, then we recommend a Knuth shuffle.
- Template Parameters:
InputOutputValueType – Type of each element of the input matrix, and the type of each element of the output matrix (if provided)
IntType – Integer type of each element of permsOut
IdxType – Integer type of the extents of the mdspan parameters
Layout – Either raft::row_major or raft::col_major
- Parameters:
handle – [in] RAFT handle containing the CUDA stream on which to run.
in – [in] input matrix
permsOut – [out] If provided, the indices of the permutation.
out – [out] If provided, the output matrix, containing the permuted rows of the input matrix in. (Not providing this is only useful if you provide permsOut.)
- Pre:
If permsOut.has_value() is true, then (*permsOut).extent(0) == in.extent(0) is true.
- Pre:
If out.has_value() is true, then (*out).extents() == in.extents() is true.
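For applications that need a uniform permutation, the Knuth (Fisher-Yates) shuffle mentioned in the note above can be sketched on the host as follows. This is an illustration only, not RAFT code: the helper names knuth_permutation and permute_rows, the flat row-major std::vector<float> storage, and the convention that output row r is input row perm[r] are all assumptions of this sketch. Like permute, it writes to a separate output buffer rather than permuting in place.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical host-side helper, not part of RAFT.
// Builds a uniformly random permutation of 0..n_rows-1 with a
// Knuth (Fisher-Yates) shuffle.
std::vector<std::size_t> knuth_permutation(std::size_t n_rows, std::mt19937_64& rng)
{
  std::vector<std::size_t> perm(n_rows);
  std::iota(perm.begin(), perm.end(), std::size_t{0});
  // Walk from the back: swap slot i-1 with a uniformly chosen slot in [0, i-1].
  for (std::size_t i = n_rows; i > 1; --i) {
    std::uniform_int_distribution<std::size_t> pick(0, i - 1);
    std::swap(perm[i - 1], perm[pick(rng)]);
  }
  return perm;
}

// Hypothetical out-of-place row gather for a row-major matrix.
// Assumed convention for this sketch: row r of the result is row perm[r] of in.
std::vector<float> permute_rows(const std::vector<float>& in,
                                std::size_t n_rows,
                                std::size_t n_cols,
                                const std::vector<std::size_t>& perm)
{
  assert(in.size() == n_rows * n_cols);
  assert(perm.size() == n_rows);
  std::vector<float> out(in.size());
  for (std::size_t r = 0; r < n_rows; ++r) {
    for (std::size_t c = 0; c < n_cols; ++c) {
      out[r * n_cols + c] = in[perm[r] * n_cols + c];
    }
  }
  return out;
}
```

Keeping the permutation indices separate from the gather step mirrors the permsOut and out split in the signature above: a caller can ask for the indices, the permuted rows, or both.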
-
template<typename InputOutputValueType, typename IdxType, typename Layout, typename PermsOutType, typename OutType>
void permute(raft::resources const &handle, raft::device_matrix_view<const InputOutputValueType, IdxType, Layout> in, PermsOutType &&permsOut, OutType &&out)
Overload of permute that compiles if users pass in std::nullopt for either or both of permsOut and out.
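The reason such an overload is needed can be shown with a small standalone sketch. When a template parameter appears only inside std::optional<...>, the compiler cannot deduce it from std::nullopt, so a forwarding overload that accepts any argument type has to step in. The functions count_indices and count_indices_any below are hypothetical stand-ins that use std::vector instead of RAFT's mdspan views, and the fixed choice of int as the element type is an assumption made for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the first overload: IntType appears only inside
// std::optional<...>, so count_indices(std::nullopt) would fail to compile
// because IntType cannot be deduced.
template <typename IntType>
std::size_t count_indices(std::optional<std::vector<IntType>> perms_out)
{
  return perms_out.has_value() ? perms_out->size() : 0;
}

// Hypothetical forwarding overload: accepts any argument type, including
// std::nullopt_t, converts it to a concrete optional, then dispatches.
template <typename PermsOutType>
std::size_t count_indices_any(PermsOutType&& perms_out)
{
  // Assumption for this sketch: the element type is fixed to int.
  std::optional<std::vector<int>> concrete = std::forward<PermsOutType>(perms_out);
  return count_indices<int>(std::move(concrete));
}
```

With these definitions, count_indices_any(std::nullopt) compiles and reports zero indices, while the same call on count_indices is rejected at compile time unless IntType is spelled out explicitly.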