LinearRegression
- class cuml.dask.linear_model.LinearRegression(*, client=None, verbose=False, **kwargs)
LinearRegression is a simple machine learning model where the response y is modelled by a linear combination of the predictors in X.
cuML’s Dask Linear Regression (multi-node, multi-GPU) expects a Dask cuDF DataFrame or a CuPy backed Dask Array as input and provides an eigendecomposition-based algorithm (Eig) to fit a linear model. The Eig algorithm is usually preferred when X is a tall and skinny matrix, that is, when it has many more rows than features. As the number of features in X increases, the accuracy of the Eig algorithm may decrease.
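The idea behind an eigendecomposition-based fit can be sketched on the CPU with NumPy. This is only an illustration under stated assumptions, not cuML's implementation: the helper name `fit_eig` is hypothetical, and the sketch uses the uncentered Gram matrix `X.T @ X` in place of the covariance matrix mentioned in this page.

```python
import numpy as np

def fit_eig(X, y):
    """Least-squares fit via an eigendecomposition of X^T X (illustrative only)."""
    gram = X.T @ X   # (n_features, n_features): small when X is tall and skinny
    rhs = X.T @ y    # (n_features,)
    # The Gram matrix is symmetric, so eigh returns real eigenpairs.
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Apply the inverse through the eigenpairs: V diag(1 / lambda) V^T rhs
    return eigvecs @ ((eigvecs.T @ rhs) / eigvals)

rng = np.random.default_rng(0)
X = rng.standard_normal((10_000, 3))    # many rows, few features
true_beta = np.array([2.0, -1.0, 0.5])
y = X @ true_beta                       # noiseless response
beta = fit_eig(X, y)
```

The matrix that gets decomposed has shape (n_features, n_features) regardless of the number of rows, which is one reason this approach suits tall and skinny inputs.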
- Parameters:
- algorithm : ‘eig’
Solver used to fit the model. Eig uses an eigendecomposition of the covariance matrix and is much faster than an SVD-based solver. SVD is slower, but guaranteed to be stable.
- fit_intercept : boolean (default = True)
If True, LinearRegression adds an additional term c to correct for the global mean of y, modeling the response as “x * beta + c”. If False, no intercept is fitted and the model expects that you have centered the data.
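The speed versus stability trade-off described for the algorithm parameter can be illustrated with condition numbers. The following is a small NumPy sketch on synthetic, nearly collinear data (an assumption made for illustration, unrelated to cuML's internals): an approach that works on X.T @ X faces roughly the square of the condition number of X itself.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 4))
# Make the last two columns nearly collinear to worsen the conditioning.
X[:, 3] = X[:, 2] + 1e-4 * rng.standard_normal(500)

cond_X = np.linalg.cond(X)            # conditioning seen when working on X directly
cond_gram = np.linalg.cond(X.T @ X)   # conditioning seen when working on X^T X
print(f"cond(X) = {cond_X:.3e}, cond(X^T X) = {cond_gram:.3e}")
```

Squaring the condition number is why forming X.T @ X can lose accuracy on ill-conditioned inputs, while a decomposition applied to X directly does not pay that penalty.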
- Attributes:
- coef_ : cuDF Series, shape (n_features,)
The estimated coefficients for the linear regression model.
- intercept_ : array
The independent term. If fit_intercept is False, will be 0.
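The relationship between fit_intercept, centered data, and intercept_ can be checked with plain NumPy. This sketch uses `numpy.linalg.lstsq` on synthetic data as a stand-in solver (an assumption for illustration; it does not call cuML): fitting with an intercept on raw data gives the same slopes as fitting without one on centered data, and the intercept can then be recovered from the means.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2)) + 5.0   # deliberately not centered
true_beta = np.array([1.5, -2.0])
y = X @ true_beta + 3.0                   # true intercept c = 3.0

# With an intercept: append a column of ones so c is estimated with beta.
X1 = np.column_stack([X, np.ones(len(X))])
sol = np.linalg.lstsq(X1, y, rcond=None)[0]
beta, c = sol[:2], sol[2]

# Without an intercept, on centered data: same slopes.
Xc, yc = X - X.mean(axis=0), y - y.mean()
beta_centered = np.linalg.lstsq(Xc, yc, rcond=None)[0]
c_from_means = y.mean() - X.mean(axis=0) @ beta_centered

# Without an intercept, on uncentered data: the slopes absorb the offset.
beta_naive = np.linalg.lstsq(X, y, rcond=None)[0]
```

The last fit shows why centering matters when fit_intercept is False: the missing constant term is pushed into the coefficients.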
Methods
fit(X, y): Fit the model with X and y.
predict(X[, delayed]): Make predictions for X and return a Dask collection.
- fit(X, y)
Fit the model with X and y.
- Parameters:
- X : Dask cuDF DataFrame or CuPy backed Dask Array (n_rows, n_features)
Features for regression.
- y : Dask cuDF DataFrame or CuPy backed Dask Array (n_rows, 1)
Labels (outcome values).
- predict(X, delayed=True)
Make predictions for X and return a Dask collection.
- Parameters:
- X : Dask cuDF DataFrame or CuPy backed Dask Array (n_rows, n_features)
Distributed dense matrix (floats or doubles) of shape (n_rows, n_features).
- delayed : bool (default = True)
Whether to do a lazy prediction (and return Delayed objects) or an eagerly executed one.
- Returns:
- y : Dask cuDF DataFrame or CuPy backed Dask Array (n_rows, 1)
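A typical fit and predict round trip might look like the sketch below. It rests on several assumptions that are not stated in this page: a machine with NVIDIA GPUs, the `dask_cuda` package for `LocalCUDACluster`, and `cuml.dask.datasets.make_regression` (with its `n_parts` and `client` arguments) to generate CuPy backed Dask Arrays. The function name `run_dask_linear_regression` is hypothetical.

```python
def run_dask_linear_regression(n_samples=100_000, n_features=8):
    """Fit and predict on a local multi-GPU Dask cluster (requires NVIDIA GPUs)."""
    # Imports live inside the function so that defining it needs no GPU stack.
    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster          # assumed to be installed
    from cuml.dask.datasets import make_regression  # assumed synthetic data helper
    from cuml.dask.linear_model import LinearRegression

    cluster = LocalCUDACluster()   # by default, one worker per visible GPU
    client = Client(cluster)
    try:
        # Synthetic data as CuPy backed Dask Arrays, split into two partitions.
        X, y = make_regression(n_samples=n_samples, n_features=n_features,
                               n_informative=n_features, n_parts=2,
                               random_state=0, client=client)

        model = LinearRegression(client=client, fit_intercept=True)
        model.fit(X, y)

        lazy_preds = model.predict(X, delayed=True)  # lazy Dask collection
        preds = lazy_preds.compute()                 # materialize the predictions
        return model.coef_, model.intercept_, preds
    finally:
        client.close()
        cluster.close()
```

With delayed=True nothing is computed until `.compute()` (or `.persist()`) is called on the returned collection, so the prediction can be composed with further Dask operations first.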