KFold

class cuml.model_selection.KFold(n_splits=5, *, shuffle=False, random_state=None)

K-Folds cross-validator.

Provides train/test indices to split data into train/test sets. Splits the dataset into k consecutive folds (without shuffling by default).

Each fold is then used once as a validation set while the k - 1 remaining folds form the training set.
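The fold rotation described above can be sketched in plain Python. This is an illustration only, not cuML's implementation: the helper name `consecutive_folds` is hypothetical, and the rule that the first `n_samples % n_splits` folds receive one extra sample follows the scikit-learn convention, which is assumed (not confirmed here) to match cuML.

```python
def consecutive_folds(n_samples, n_splits):
    """Yield (train, test) index lists for k consecutive, unshuffled folds."""
    if n_splits < 2:
        raise ValueError("n_splits must be at least 2")
    if n_splits > n_samples:
        raise ValueError("n_splits cannot exceed n_samples")
    indices = list(range(n_samples))
    # Assumption: the first n_samples % n_splits folds get one extra sample.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        test = indices[start:stop]                   # this fold is held out
        train = indices[:start] + indices[stop:]     # the other k - 1 folds
        yield train, test
        start = stop


folds = list(consecutive_folds(n_samples=10, n_splits=3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Every sample index appears in exactly one test fold, so each sample is used for validation once and for training k - 1 times.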

Parameters:
n_splits : int, default=5

Number of folds. Must be at least 2.

shuffle : bool, default=False

Whether to shuffle the samples before splitting. Note that the samples within each split will not be shuffled.

random_state : int, CuPy RandomState, NumPy RandomState, or None, default=None

When shuffle is True, random_state affects the ordering of the indices, which controls the randomness of each fold. Otherwise, this parameter has no effect. Pass an int for reproducible output across multiple function calls.
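The effect of an integer seed on reproducibility can be sketched as follows. This is a stand-in, not cuML code: it uses Python's standard-library `random.Random` in place of a CuPy or NumPy RandomState, and the helper `shuffled_indices` is hypothetical. The actual permutation cuML produces for a given seed will differ.

```python
import random


def shuffled_indices(n_samples, seed=None):
    """Return a permutation of range(n_samples) driven by a seeded generator."""
    rng = random.Random(seed)  # stand-in for a CuPy/NumPy RandomState
    indices = list(range(n_samples))
    rng.shuffle(indices)       # shuffle once, before the folds are cut
    return indices


first = shuffled_indices(8, seed=0)
second = shuffled_indices(8, seed=0)
print(first == second)  # the same seed gives the same ordering
```

Because the shuffle happens once before splitting, the seed determines which samples land in each fold, while the indices inside a fold are not reshuffled afterwards.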

Methods

get_n_splits([X, y])

Returns the number of splitting iterations in the cross-validator.

split(X[, y])

Generate indices to split data into training and test sets.

get_n_splits(X=None, y=None)

Returns the number of splitting iterations in the cross-validator.

Parameters:
X : object

Always ignored, exists for compatibility.

y : object

Always ignored, exists for compatibility.

Returns:
n_splits : int

Returns the number of splitting iterations in the cross-validator.

split(X, y=None)

Generate indices to split data into training and test sets.

Parameters:
X : array-like of shape (n_samples, n_features)

Training data, where n_samples is the number of samples and n_features is the number of features.

y : array-like of shape (n_samples,), default=None

The target variable for supervised learning problems.

Yields:
train : CuPy ndarray

The training set indices for that split.

test : CuPy ndarray

The testing set indices for that split.
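A typical way to consume the yielded (train, test) pairs is a per-fold fit-and-score loop. The sketch below runs without a GPU by using a hypothetical stand-in generator, `toy_split`, that yields plain lists. With cuML the pairs would instead come from `KFold(...).split(X)` as CuPy arrays. The "model" is a trivial mean predictor chosen only to keep the example short.

```python
def toy_split(n_samples, n_splits):
    # Hypothetical stand-in for split(): consecutive folds as index lists.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
fold_errors = []
for train, test in toy_split(len(y), n_splits=3):
    # "Fit" on the training indices: predict the mean training target.
    prediction = sum(y[i] for i in train) / len(train)
    # Score on the held-out indices with mean absolute error.
    error = sum(abs(y[i] - prediction) for i in test) / len(test)
    fold_errors.append(error)

mean_error = sum(fold_errors) / len(fold_errors)
print(fold_errors, mean_error)
```

Averaging the per-fold errors gives a single cross-validated estimate, which is the usual reason for iterating over every split rather than using one.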