SGD

class cuml.solvers.SGD(*, loss='squared_loss', penalty=None, alpha=0.0001, l1_ratio=0.15, fit_intercept=True, epochs=1000, tol=0.001, shuffle=True, learning_rate='constant', eta0=0.001, power_t=0.5, batch_size=32, n_iter_no_change=5, output_type=None, verbose=False)

Stochastic Gradient Descent (SGD) is a widely used machine learning algorithm that minimizes a cost function by taking repeated gradient steps on small batches of the data. This makes SGD attractive for large problems where an exact solution is expensive or impossible to compute.
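The idea can be sketched in plain Python. This is a minimal illustration of mini-batch SGD for a linear model with squared loss, not cuML's implementation: the function names `sgd_linear_regression` and `mean_squared_error` are hypothetical, the learning rate is constant, no penalty is applied, and the loss is assumed to be 0.5 * error**2 averaged over each batch.

```python
import random


def sgd_linear_regression(X, y, eta0=0.005, epochs=2000, batch_size=2, seed=0):
    """Mini-batch SGD on the squared error of a linear model (illustrative)."""
    rng = random.Random(seed)
    n_samples, n_features = len(X), len(X[0])
    coef = [0.0] * n_features
    intercept = 0.0
    indices = list(range(n_samples))

    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new order each epoch
        for start in range(0, n_samples, batch_size):
            batch = indices[start:start + batch_size]
            grad_coef = [0.0] * n_features
            grad_intercept = 0.0
            for i in batch:
                # Residual of the current prediction for sample i.
                error = sum(w * x for w, x in zip(coef, X[i])) + intercept - y[i]
                for j in range(n_features):
                    grad_coef[j] += error * X[i][j] / len(batch)
                grad_intercept += error / len(batch)
            # One gradient step on the batch-averaged loss.
            coef = [w - eta0 * g for w, g in zip(coef, grad_coef)]
            intercept -= eta0 * grad_intercept
    return coef, intercept


def mean_squared_error(X, y, coef, intercept):
    preds = [sum(w * x for w, x in zip(coef, row)) + intercept for row in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


X = [[1.0, 1.0], [1.0, 2.0], [2.0, 2.0], [2.0, 3.0]]
y = [1.0, 1.0, 2.0, 2.0]
coef, intercept = sgd_linear_regression(X, y)
print("error before training:", mean_squared_error(X, y, [0.0, 0.0], 0.0))
print("error after training: ", mean_squared_error(X, y, coef, intercept))
```

Each update touches only `batch_size` samples, which is why the cost per step stays small even when the dataset is large.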

cuML’s SGD algorithm accepts a NumPy array or a cuDF DataFrame as the input dataset. It currently supports linear regression, ridge regression, logistic regression and linear SVM models.

Parameters:

loss : {‘hinge’, ‘log’, ‘squared_loss’} (default = ‘squared_loss’)

‘hinge’: uses linear SVM
‘log’: uses logistic regression
‘squared_loss’: uses linear regression
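For orientation, the three losses can be written per sample as below. These are the textbook definitions, given as a sketch rather than cuML's internal code: the function names are hypothetical, the -1/+1 label encoding for the two classification losses and the factor 0.5 in the squared loss are assumptions, and `score` stands for the linear model output `w · x + b`.

```python
import math


def hinge_loss(y_true, score):
    """Linear SVM loss; y_true is assumed to be -1 or +1."""
    return max(0.0, 1.0 - y_true * score)


def log_loss(y_true, score):
    """Logistic regression loss; y_true is assumed to be -1 or +1."""
    return math.log1p(math.exp(-y_true * score))


def squared_loss(y_true, score):
    """Linear regression loss (the 0.5 factor is a common convention)."""
    return 0.5 * (score - y_true) ** 2


print(hinge_loss(1, 2.0))      # correct side of the margin: no loss
print(hinge_loss(-1, 0.5))     # wrong side: loss grows linearly
print(log_loss(1, 0.0))        # undecided score
print(squared_loss(1.0, 3.0))  # regression residual of 2
```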

penalty : {‘l1’, ‘l2’, ‘elasticnet’, None} (default = None)

The penalty (aka regularization term) to apply.

‘l1’: L1 norm (Lasso) regularization
‘l2’: L2 norm (Ridge) regularization
‘elasticnet’: Elastic Net regularization, a weighted average of L1 and L2
None: no penalty is added (the default)

alpha : float (default = 0.0001)

Constant that controls the strength of the regularization.
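As a rough illustration of how `penalty` and `alpha` combine, the sketch below computes the term added to the loss for a coefficient vector. The helper `penalty_term` is hypothetical, and the exact scaling is an assumption: the 0.5 factor on the L2 term and the use of `l1_ratio` (which appears in the class signature) as the L1 weight in the elastic net follow a common convention and may differ from what cuML does internally.

```python
def penalty_term(coef, penalty=None, alpha=0.0001, l1_ratio=0.15):
    """Regularization term added to the loss (illustrative conventions)."""
    l1 = sum(abs(w) for w in coef)   # L1 norm
    l2 = sum(w * w for w in coef)    # squared L2 norm
    if penalty is None:
        return 0.0
    if penalty == "l1":
        return alpha * l1
    if penalty == "l2":
        return alpha * 0.5 * l2
    if penalty == "elasticnet":
        # Weighted average of the L1 and L2 terms.
        return alpha * (l1_ratio * l1 + (1.0 - l1_ratio) * 0.5 * l2)
    raise ValueError(f"unknown penalty: {penalty!r}")


coef = [1.0, -2.0]
for name in (None, "l1", "l2", "elasticnet"):
    print(name, penalty_term(coef, penalty=name, alpha=0.1, l1_ratio=0.5))
```

A larger `alpha` increases the penalty for the same coefficients and therefore pulls the fitted coefficients more strongly toward zero.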

fit_intercept : boolean (default = True)

If True, the model tries to correct for the global mean of y. If False, the model expects that you have centered the data.
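If you plan to pass fit_intercept=False, centering can be done ahead of time. The following is a minimal plain-Python sketch with a hypothetical helper `center`; it subtracts the column means from X and the mean from y. For a linear model fitted on the centered data, the intercept on the original scale can be recovered as `y_mean - sum(w * m for w, m in zip(coef, col_means))`.

```python
def center(X, y):
    """Subtract per-column means from X and the mean from y."""
    n = len(X)
    col_means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    y_mean = sum(y) / n
    X_centered = [[v - m for v, m in zip(row, col_means)] for row in X]
    y_centered = [t - y_mean for t in y]
    return X_centered, y_centered, col_means, y_mean


X = [[1.0, 1.0], [1.0, 2.0], [2.0, 2.0], [2.0, 3.0]]
y = [1.0, 1.0, 2.0, 2.0]
X_c, y_c, col_means, y_mean = center(X, y)
print(col_means, y_mean)
print(X_c[0], y_c[0])
```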

epochs : int (default = 1000)

The number of times the model iterates through the entire dataset during training.

tol : float (default = 1e-3)

The training process will stop if current_loss > previous_loss - tol.
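The stopping rule above can be written as a one-line check. The helper name `should_stop` is hypothetical, and the sketch covers only the comparison stated here; how cuML combines it with n_iter_no_change is not shown.

```python
def should_stop(current_loss, previous_loss, tol=1e-3):
    """True when the loss failed to improve by at least tol."""
    return current_loss > previous_loss - tol


print(should_stop(0.9995, 1.0))        # improved by less than tol
print(should_stop(0.9, 1.0))           # improved by more than tol
print(should_stop(0.9, 1.0, tol=0.0))  # with tol=0.0 any decrease continues
```

With tol=0.0, as in the example further down, training only stops on this rule when the loss increases.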

shuffle : boolean (default = True)

If True, the training data is shuffled after each epoch. If False, the training data is not shuffled.

eta0 : float (default = 0.001)

Initial learning rate

power_t : float (default = 0.5)

The exponent used for calculating the invscaling learning rate

batch_size : int (default = 32)

The number of samples to use for each batch.
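To make the interaction between batch_size and shuffle concrete, here is a small sketch that splits sample indices into batches for one epoch. The helper `minibatch_indices` is hypothetical; in particular, letting the last batch be smaller when n_samples is not a multiple of batch_size is an assumption, not documented cuML behavior.

```python
import random


def minibatch_indices(n_samples, batch_size=32, shuffle=True, seed=None):
    """Return the index batches for one pass over the data."""
    indices = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(indices)
    return [indices[i:i + batch_size] for i in range(0, n_samples, batch_size)]


ordered = minibatch_indices(10, batch_size=4, shuffle=False)
shuffled = minibatch_indices(10, batch_size=4, shuffle=True, seed=0)
print(ordered)
print(shuffled)
```

Every sample is visited once per epoch in both cases; shuffling only changes which samples end up together in a batch.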

learning_rate : {‘constant’, ‘invscaling’, ‘adaptive’} (default = ‘constant’)

‘constant’: keeps the learning rate constant
‘invscaling’: scales the learning rate using the exponent power_t
‘adaptive’: changes the learning rate if the training loss or the validation accuracy does not improve for n_iter_no_change epochs. The old learning rate is generally divided by 5
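The schedules can be sketched as follows. All three helpers are hypothetical. The ‘invscaling’ formula `eta0 / t**power_t` is an assumption borrowed from the convention used by scikit-learn, and the ‘adaptive’ sketch tracks only the training loss, ignoring validation accuracy and tol.

```python
def constant_rate(eta0, t):
    """Learning rate at step t for the 'constant' schedule."""
    return eta0


def invscaling_rate(eta0, t, power_t=0.5):
    """Assumed 'invscaling' form: eta0 / t**power_t, with t >= 1."""
    return eta0 / (t ** power_t)


def adaptive_rates(losses, eta0, n_iter_no_change=5, factor=5.0):
    """Learning rate used in each epoch, given the per-epoch training losses."""
    eta = eta0
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    rates = []
    for loss in losses:
        rates.append(eta)
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= n_iter_no_change:
                eta /= factor  # shrink the learning rate and start counting again
                stale = 0
    return rates


print(constant_rate(0.001, 100))
print(invscaling_rate(0.1, 4))
print(adaptive_rates([1.0, 1.0, 1.0, 0.5], eta0=0.5, n_iter_no_change=2))
```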

n_iter_no_change : int (default = 5)

The number of epochs to train without any improvement in the model

output_type : {‘input’, ‘array’, ‘dataframe’, ‘series’, ‘df_obj’, ‘numba’, ‘cupy’, ‘numpy’, ‘cudf’, ‘pandas’}, default=None

Return results and set estimator attributes to the indicated output type. If None, the output type set at the module level (cuml.global_settings.output_type) will be used. See Output Data Type Configuration for more info.

verbose : int or boolean, default=False

Sets logging level. It must be one of cuml.common.logger.level_*. See Verbosity Levels for more info.

Attributes:

coef_

Methods

`fit`(self, X, y, *[, convert_dtype])	Fit the model with X and y.
`predict`(self, X, *[, convert_dtype])	Predicts the y for X.

Examples

>>> import numpy as np
>>> import cudf
>>> from cuml.solvers import SGD as cumlSGD
>>> X = cudf.DataFrame()
>>> X['col1'] = np.array([1,1,2,2], dtype=np.float32)
>>> X['col2'] = np.array([1,2,2,3], dtype=np.float32)
>>> y = cudf.Series(np.array([1, 1, 2, 2], dtype=np.float32))
>>> pred_data = cudf.DataFrame()
>>> pred_data['col1'] = np.asarray([3, 2], dtype=np.float32)
>>> pred_data['col2'] = np.asarray([5, 5], dtype=np.float32)
>>> cu_sgd = cumlSGD(learning_rate='constant', eta0=0.005, epochs=2000,
...                  fit_intercept=True, batch_size=2,
...                  tol=0.0, penalty=None, loss='squared_loss')
>>> cu_sgd.fit(X, y)
SGD()
>>> cu_pred = cu_sgd.predict(pred_data).to_numpy()
>>> print(" cuML intercept : ", cu_sgd.intercept_)
cuML intercept :  0.00418...
>>> print(" cuML coef : ", cu_sgd.coef_)
cuML coef :  0      0.9841...
1      0.0097...
dtype: float32
>>> print("cuML predictions : ", cu_pred)
cuML predictions :  [3.0055...  2.0214...]

fit(self, X, y, *, convert_dtype=True) → 'SGD'

Fit the model with X and y.

Parameters:

X : array-like (device or host), shape = (n_samples, n_features)

Dense matrix. If the data type is not float or double, the data is converted to float, which increases memory utilization. Set convert_dtype to False to avoid this conversion; the method will then raise an error instead. Acceptable formats: CUDA array interface compliant objects like CuPy, cuDF DataFrame/Series, NumPy ndarray and Pandas DataFrame/Series.

y : array-like (device or host), shape = (n_samples, 1)

Dense matrix. If the data type is not float or double, the data is converted to float, which increases memory utilization. Set convert_dtype to False to avoid this conversion; the method will then raise an error instead. Acceptable formats: CUDA array interface compliant objects like CuPy, cuDF DataFrame/Series, NumPy ndarray and Pandas DataFrame/Series.

convert_dtype : bool, optional (default = True)

When set to True, the fit method will, when necessary, convert y to the same data type as X if they differ. This increases the memory used by the method.

predict(self, X, *, convert_dtype=True) → CumlArray

Predicts the y for X.

Parameters:

X : array-like (device or host), shape = (n_samples, n_features)

Dense matrix. If the data type is not float or double, the data is converted to float, which increases memory utilization. Set convert_dtype to False to avoid this conversion; the method will then raise an error instead. Acceptable formats: CUDA array interface compliant objects like CuPy, cuDF DataFrame/Series, NumPy ndarray and Pandas DataFrame/Series.

convert_dtype : bool, optional (default = True)

When set to True, the predict method will, when necessary, convert the input to the data type that was used to train the model. This increases the memory used by the method.

Returns:

preds : cuDF, CuPy or NumPy object depending on cuML’s output type configuration, shape = (n_samples,)

Predicted values

For more information on how to configure cuML’s output type, refer to: Output Data Type Configuration.