Public Member Functions | Public Attributes | List of all members
ML::SVM::WorkingSet< math_t > Class Template Reference

#include <workingset.h>

Collaboration diagram for ML::SVM::WorkingSet< math_t >:
Collaboration graph

Public Member Functions

 WorkingSet (const raft::handle_t &handle, cudaStream_t stream, int n_rows=0, int n_ws=0, SvmType svmType=C_SVC)
 Manage a working set. More...
 
 ~WorkingSet ()
 
void SetSize (int n_train, int n_ws=0)
 Set the size of the working set and allocate buffers accordingly. More...
 
int GetSize ()
 
int * GetIndices ()
 Return a device pointer to the the working set indices. More...
 
void SimpleSelect (math_t *f, math_t *alpha, math_t *y, const math_t *C, int n_already_selected=0)
 Select new elements for a working set. More...
 
void Select (math_t *f, math_t *alpha, math_t *y, const math_t *C)
 Select working set indices. More...
 
int PrioritySelect (math_t *alpha, const math_t *C, int nc)
 Select elements from the previous working set based on their priority. More...
 

Public Attributes

bool FIFO_strategy = true
 

Workspace selection strategy, note that only FIFO is tested so far

More...
 

Detailed Description

template<typename math_t>
class ML::SVM::WorkingSet< math_t >

Working set selection for the SMO algorithm.

The working set is a subset of the training vectors, by default it has 1024 elements. At every outer iteration in SmoSolver::Solve, we select a different working set, and optimize the dual coefficients for the working set.

The vectors are selected based on the f values, which is the difference between the target label and the decision function value.

Constructor & Destructor Documentation

◆ WorkingSet()

template<typename math_t >
ML::SVM::WorkingSet< math_t >::WorkingSet ( const raft::handle_t &  handle,
cudaStream_t  stream,
int  n_rows = 0,
int  n_ws = 0,
SvmType  svmType = C_SVC 
)
inline

Manage a working set.

Parameters
handlecuml handle implementation
streamcuda stream for working set operations
n_rowsnumber of training vectors
n_wsnumber of elements in the working set (default 1024)
svmTypeclassification or regression

◆ ~WorkingSet()

template<typename math_t >
ML::SVM::WorkingSet< math_t >::~WorkingSet ( )
inline

Member Function Documentation

◆ GetIndices()

template<typename math_t >
int* ML::SVM::WorkingSet< math_t >::GetIndices ( )
inline

Return a device pointer to the the working set indices.

The returned array is owned by WorkingSet.

◆ GetSize()

template<typename math_t >
int ML::SVM::WorkingSet< math_t >::GetSize ( )
inline

Return the size of the working set.

◆ PrioritySelect()

template<typename math_t >
int ML::SVM::WorkingSet< math_t >::PrioritySelect ( math_t *  alpha,
const math_t *  C,
int  nc 
)

Select elements from the previous working set based on their priority.

We sort the old working set based on their priority in ascending order, and then select nc elements from free, and then lower/upper bound vectors. For details see [2].

See Issue #946.

References: [2] T Serafini, L Zanni: On the Working Set selection in grad. projection based decomposition techniques for Support Vector Machines DOI: 10.1080/10556780500140714

Parameters
[in]alphadevice vector of dual coefficients, size [n_train]
[in]Cpenalty parameter
[in]ncnumber of elements to select

◆ Select()

template<typename math_t >
void ML::SVM::WorkingSet< math_t >::Select ( math_t *  f,
math_t *  alpha,
math_t *  y,
const math_t *  C 
)
inline

Select working set indices.

To avoid training vectors oscillating in and out of the working set, we keep half of the previous working set, and fill new elements only to the other half.

We can have a FIFO retention policy, or we can consider the time (=ws_priority) a vector already spent in the ws. References: [1] Z. Wen et al. ThunderSVM: A Fast SVM Library on GPUs and CPUs, Journal of Machine Learning Research, 19, 1-5 (2018)

Parameters
foptimality indicator vector, size [n_train]
alphadual coefficients, size [n_train]
yclass labels, size [n_train]
Cpenalty parameter vector, size [n_train]

◆ SetSize()

template<typename math_t >
void ML::SVM::WorkingSet< math_t >::SetSize ( int  n_train,
int  n_ws = 0 
)
inline

Set the size of the working set and allocate buffers accordingly.

Parameters
n_trainnumber of training vectors
n_wsworking set size (default min(1024, n_train))

◆ SimpleSelect()

template<typename math_t >
void ML::SVM::WorkingSet< math_t >::SimpleSelect ( math_t *  f,
math_t *  alpha,
math_t *  y,
const math_t *  C,
int  n_already_selected = 0 
)

Select new elements for a working set.

Here we follow the working set selection strategy by Joachims [1], we select n training instances as:

  • select n/2 element of upper set, where f is largest
  • select n/2 from lower set, where f is smallest

The difference compared to Joachims' strategy is that we can already have some elements selected by a different strategy, therefore we select only n = n_ws - n_already_selected.

References: [1] Joachims, T. (1998). Making large-scale support vector machine learning practical. In B. Scholkopf, C. Burges, & A. Smola (Eds.), Advances in kernel methods: Support vector machines. Cambridge, MA: MIT Press

Parameters
foptimality indicator vector, size [n_train]
alphadual coefficients, size [n_train]
ytarget labels (+/- 1)
Cpenalty parameter vector size [n_train]
n_already_selected

Member Data Documentation

◆ FIFO_strategy

template<typename math_t >
bool ML::SVM::WorkingSet< math_t >::FIFO_strategy = true

Workspace selection strategy, note that only FIFO is tested so far


The documentation for this class was generated from the following file: