#include <workingset.h>

Collaboration diagram for ML::SVM::WorkingSet< math_t >:

Public Member Functions
	WorkingSet (const raft::handle_t &handle, cudaStream_t stream, int n_rows=0, int n_ws=0, SvmType svmType=C_SVC)
	Manage a working set. More...

	~WorkingSet ()

void	SetSize (int n_train, int n_ws=0)
	Set the size of the working set and allocate buffers accordingly. More...

int	GetSize ()

int *	GetIndices ()
	Return a device pointer to the the working set indices. More...

void	SimpleSelect (math_t f, math_t alpha, math_t y, const math_t C, int n_already_selected=0)
	Select new elements for a working set. More...

void	Select (math_t f, math_t alpha, math_t y, const math_t C)
	Select working set indices. More...

int	PrioritySelect (math_t alpha, const math_t C, int nc)
	Select elements from the previous working set based on their priority. More...

Public Attributes
bool	FIFO_strategy = true
	Workspace selection strategy, note that only FIFO is tested so far More...

Detailed Description

template<typename math_t>
class ML::SVM::WorkingSet< math_t >

Working set selection for the SMO algorithm.

The working set is a subset of the training vectors, by default it has 1024 elements. At every outer iteration in SmoSolver::Solve, we select a different working set, and optimize the dual coefficients for the working set.

The vectors are selected based on the f values, which is the difference between the target label and the decision function value.

Constructor & Destructor Documentation

◆ WorkingSet()

template<typename math_t >

ML::SVM::WorkingSet< math_t >::WorkingSet	(	const raft::handle_t &	handle,
		cudaStream_t	stream,
		int	n_rows = `0`,
		int	n_ws = `0`,
		SvmType	svmType = `C_SVC`
	)

inline

Manage a working set.

Parameters

handle	cuml handle implementation
stream	cuda stream for working set operations
n_rows	number of training vectors
n_ws	number of elements in the working set (default 1024)
svmType	classification or regression

◆ ~WorkingSet()

template<typename math_t >

ML::SVM::WorkingSet< math_t >::~WorkingSet ( )

inline

Member Function Documentation

◆ GetIndices()

template<typename math_t >

int* ML::SVM::WorkingSet< math_t >::GetIndices ( )

inline

Return a device pointer to the the working set indices.

The returned array is owned by WorkingSet.

◆ GetSize()

template<typename math_t >

int ML::SVM::WorkingSet< math_t >::GetSize ( )

inline

Return the size of the working set.

◆ PrioritySelect()

template<typename math_t >

int ML::SVM::WorkingSet< math_t >::PrioritySelect	(	math_t *	alpha,
		const math_t *	C,
		int	nc
	)

Select elements from the previous working set based on their priority.

We sort the old working set based on their priority in ascending order, and then select nc elements from free, and then lower/upper bound vectors. For details see [2].

See Issue #946.

References: [2] T Serafini, L Zanni: On the Working Set selection in grad. projection based decomposition techniques for Support Vector Machines DOI: 10.1080/10556780500140714

Parameters

[in]	alpha	device vector of dual coefficients, size [n_train]
[in]	C	penalty parameter
[in]	nc	number of elements to select

◆ Select()

template<typename math_t >

void ML::SVM::WorkingSet< math_t >::Select	(	math_t *	f,
		math_t *	alpha,
		math_t *	y,
		const math_t *	C
	)

inline

Select working set indices.

To avoid training vectors oscillating in and out of the working set, we keep half of the previous working set, and fill new elements only to the other half.

We can have a FIFO retention policy, or we can consider the time (=ws_priority) a vector already spent in the ws. References: [1] Z. Wen et al. ThunderSVM: A Fast SVM Library on GPUs and CPUs, Journal of Machine Learning Research, 19, 1-5 (2018)

Parameters

f	optimality indicator vector, size [n_train]
alpha	dual coefficients, size [n_train]
y	class labels, size [n_train]
C	penalty parameter vector, size [n_train]

◆ SetSize()

template<typename math_t >

void ML::SVM::WorkingSet< math_t >::SetSize	(	int	n_train,
		int	n_ws = `0`
	)

inline

Set the size of the working set and allocate buffers accordingly.

Parameters

n_train	number of training vectors
n_ws	working set size (default min(1024, n_train))

◆ SimpleSelect()

template<typename math_t >

void ML::SVM::WorkingSet< math_t >::SimpleSelect	(	math_t *	f,
		math_t *	alpha,
		math_t *	y,
		const math_t *	C,
		int	n_already_selected = `0`
	)

Select new elements for a working set.

Here we follow the working set selection strategy by Joachims [1], we select n training instances as:

select n/2 element of upper set, where f is largest
select n/2 from lower set, where f is smallest

The difference compared to Joachims' strategy is that we can already have some elements selected by a different strategy, therefore we select only n = n_ws - n_already_selected.

References: [1] Joachims, T. (1998). Making large-scale support vector machine learning practical. In B. Scholkopf, C. Burges, & A. Smola (Eds.), Advances in kernel methods: Support vector machines. Cambridge, MA: MIT Press

Parameters

f	optimality indicator vector, size [n_train]
alpha	dual coefficients, size [n_train]
y	target labels (+/- 1)
C	penalty parameter vector size [n_train]
n_already_selected

Member Data Documentation

◆ FIFO_strategy

template<typename math_t >

bool ML::SVM::WorkingSet< math_t >::FIFO_strategy = true

Workspace selection strategy, note that only FIFO is tested so far

The documentation for this class was generated from the following file:

src/svm/workingset.h

Public Member Functions

Public Attributes

Detailed Description

template<typename math_t> class ML::SVM::WorkingSet< math_t >

Constructor & Destructor Documentation

◆ WorkingSet()

◆ ~WorkingSet()

Member Function Documentation

◆ GetIndices()

◆ GetSize()

◆ PrioritySelect()

◆ Select()

◆ SetSize()

◆ SimpleSelect()

Member Data Documentation

◆ FIFO_strategy

template<typename math_t>
class ML::SVM::WorkingSet< math_t >