Ridge

class cuml.dask.linear_model.Ridge(*, client=None, verbose=False, **kwargs)

Ridge extends LinearRegression by applying L2 regularization to the coefficients when predicting the response y with a linear combination of the predictors in X. It can reduce the variance of the coefficient estimates and improve the conditioning of the problem.

cuML’s Dask Ridge (multi-node multi-GPU) expects a Dask cuDF DataFrame or a CuPy-backed Dask Array and provides an eigendecomposition-based algorithm (Eig) to fit a linear model. The Eig algorithm is usually preferred when X is a tall and skinny matrix (many more rows than features). As the number of features in X increases, the accuracy of the Eig algorithm may decrease.
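For reference, the eigendecomposition approach can be written out as below. This is the standard textbook formulation of ridge regression, not a description of cuML's internal code, which may differ in detail. The symbols \hat{\beta}, V, \Lambda, and I are introduced here for illustration, and X and y are assumed to be centered when an intercept is fitted.

```latex
\begin{aligned}
\hat{\beta} &= \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2 + \alpha \lVert \beta \rVert_2^2
             = (X^\top X + \alpha I)^{-1} X^\top y, \\
X^\top X &= V \Lambda V^\top, \qquad V^\top V = I,\quad
            \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_p), \\
\hat{\beta} &= V\,(\Lambda + \alpha I)^{-1}\, V^\top X^\top y .
\end{aligned}
```

Since X^T X has shape (n_features, n_features), the decomposition stays small when X is tall and skinny. The condition number of X^T X is the square of that of X, which is one plausible reason accuracy can degrade as features are added; adding alpha to every eigenvalue \lambda_i is what improves the conditioning.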

Parameters:
alpha : float (default = 1.0)

Regularization strength - must be a positive float. Larger values specify stronger regularization.

solver : {‘eig’}

Eig uses an eigendecomposition of the covariance matrix.

fit_intercept : boolean (default = True)

If True, Ridge adds an additional term c to correct for the global mean of y, modeling the response as “x * beta + c”. If False, the model expects that you have centered the data.
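The effect of alpha and fit_intercept can be seen on a one-feature problem in plain Python. This is a minimal sketch of the closed-form ridge solution, assuming the usual convention that the intercept is handled by centering and is not penalized; `ridge_1d` is a hypothetical helper written for this illustration and is not part of cuML.

```python
def ridge_1d(x, y, alpha=1.0, fit_intercept=True):
    """Closed-form ridge fit for one feature (illustrative helper, not cuML API).

    Returns (beta, c) for the model y ~ x * beta + c.
    """
    n = len(x)
    # With an intercept, center x and y so that the intercept is not penalized.
    x_mean = sum(x) / n if fit_intercept else 0.0
    y_mean = sum(y) / n if fit_intercept else 0.0
    xc = [xi - x_mean for xi in x]
    yc = [yi - y_mean for yi in y]
    # One-feature version of (X^T X + alpha * I)^-1 X^T y.
    beta = sum(a * b for a, b in zip(xc, yc)) / (sum(a * a for a in xc) + alpha)
    # Recover the intercept from the means (stays 0.0 when fit_intercept=False).
    c = y_mean - beta * x_mean
    return beta, c


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 * xi + 5.0 for xi in x]  # exact line with slope 2 and intercept 5

# Larger alpha shrinks the slope further toward zero.
for alpha in (0.1, 1.0, 10.0):
    beta, c = ridge_1d(x, y, alpha=alpha)
    print(f"alpha={alpha:5.1f}  beta={beta:.4f}  c={c:.4f}")

# Without an intercept term, uncentered data distorts the slope.
beta_raw, c_raw = ridge_1d(x, y, alpha=1.0, fit_intercept=False)
print(f"fit_intercept=False  beta={beta_raw:.4f}  c={c_raw:.4f}")
```

The slope moves away from the true value of 2 as alpha grows, and with fit_intercept=False the offset of 5 in y is absorbed into the slope, which is why centered data is expected in that case.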

Attributes:
coef_ : array, shape (n_features)

The estimated coefficients for the linear regression model.

intercept_ : array

The independent term. It is 0 if fit_intercept is False.

Methods

fit(X, y)

Fit the model with X and y.

predict(X[, delayed])

Make predictions for X and return a Dask collection.

fit(X, y)

Fit the model with X and y.

Parameters:
X : Dask cuDF DataFrame or CuPy-backed Dask Array (n_rows, n_features)

Features for regression

y : Dask cuDF DataFrame or CuPy-backed Dask Array (n_rows, 1)

Labels (outcome values)

predict(X, delayed=True)

Make predictions for X and return a Dask collection.

Parameters:
X : Dask cuDF DataFrame or CuPy-backed Dask Array (n_rows, n_features)

Distributed dense matrix (floats or doubles) of shape (n_rows, n_features).

delayed : bool (default = True)

Whether to do a lazy prediction (and return Delayed objects) or an eagerly executed one.

Returns:
y : Dask cuDF DataFrame or CuPy-backed Dask Array (n_rows, 1)
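Putting fit and predict together, an end-to-end run might look like the sketch below. It assumes a machine with at least one NVIDIA GPU and the dask_cuda, dask_cudf, cudf, and cuml packages installed. The function name `run_ridge_example`, the synthetic data, and the partition count are made up for illustration, and passing y as a Dask cuDF Series (rather than a one-column DataFrame) is an assumption based on common cuML usage.

```python
def run_ridge_example():
    """Fit and predict with Dask Ridge on a local GPU cluster (requires NVIDIA GPUs)."""
    # GPU libraries are imported here so the function can be defined on any machine.
    import numpy as np
    import cudf
    import dask_cudf
    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster
    from cuml.dask.linear_model import Ridge

    cluster = LocalCUDACluster()
    client = Client(cluster)
    try:
        # Synthetic data following y = 2 * x0 - x1 + 3 (illustrative only).
        rng = np.random.default_rng(0)
        n_rows = 1000
        x0 = rng.random(n_rows).astype(np.float32)
        x1 = rng.random(n_rows).astype(np.float32)
        X_local = cudf.DataFrame({"x0": x0, "x1": x1})
        y_local = cudf.Series(2.0 * x0 - x1 + 3.0)

        # Distribute the GPU data across partitions.
        X = dask_cudf.from_cudf(X_local, npartitions=2)
        y = dask_cudf.from_cudf(y_local, npartitions=2)

        model = Ridge(alpha=1.0, fit_intercept=True, client=client)
        model.fit(X, y)

        # delayed=True (the default) returns a lazy Dask collection;
        # compute() materializes the predictions.
        preds = model.predict(X)
        return model.coef_, model.intercept_, preds.compute()
    finally:
        client.close()
        cluster.close()
```

Calling `run_ridge_example()` on a GPU machine returns the fitted coefficients, the intercept, and the computed predictions; passing delayed=False to predict requests eager execution instead.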