cuml.dask
Multi-node, multi-GPU algorithms using Dask.
Cluster
Decomposition
PCA: PCA (Principal Component Analysis) is a fundamental dimensionality reduction technique that combines the features of X into linear combinations such that each new component captures the most remaining variance in the data.
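The distributed PCA class runs on a Dask-CUDA cluster; as a CPU-only sketch of the underlying computation (plain NumPy, not the cuml.dask API), PCA centers the data and projects it onto the top right singular vectors, so each component captures the most remaining variance:

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D data: the second feature is mostly a scaled copy of the first.
x = rng.normal(size=(200, 1))
X = np.hstack([x, 3 * x + 0.1 * rng.normal(size=(200, 1))])

Xc = X - X.mean(axis=0)              # center each feature
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt                      # rows are the principal directions
explained_var = S**2 / (len(X) - 1)  # variance captured per component

Z = Xc @ components[:1].T            # project onto the first component
```

Because the two features are almost perfectly correlated, nearly all of the variance lands in the first component.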
Ensemble
RandomForestClassifier: Multi-GPU Random Forest classifier model that fits multiple decision tree classifiers in an ensemble.
RandomForestRegressor: Multi-GPU Random Forest regressor model that fits multiple decision tree regressors in an ensemble.
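The core idea behind both classes is bagging: each tree is trained on a bootstrap resample and predictions are combined by voting (or averaging, for the regressor). A minimal stdlib sketch using decision stumps in place of full trees, not the cuml.dask implementation:

```python
import random

random.seed(0)

# Toy 1-D dataset: the label is 1 when the feature exceeds 0.5.
data = [(x / 100, int(x / 100 > 0.5)) for x in range(100)]

def fit_stump(sample):
    """Pick the threshold that best separates a bootstrap sample."""
    best_t, best_acc = 0.0, -1.0
    for t in [i / 20 for i in range(21)]:
        acc = sum((x > t) == bool(y) for x, y in sample) / len(sample)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Bagging: each "tree" (here, a stump) sees a different bootstrap resample.
stumps = [fit_stump(random.choices(data, k=len(data))) for _ in range(25)]

def predict(x):
    votes = sum(x > t for t in stumps)   # majority vote across the ensemble
    return int(votes > len(stumps) / 2)
```

In the distributed setting the same idea applies, with trees trained in parallel across workers.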
Linear Models
LinearRegression: A simple machine learning model in which the response y is modelled as a linear combination of the predictors in X.
Ridge: Extends LinearRegression by providing L2 regularization on the coefficients when predicting response y with a linear combination of the predictors in X.
Lasso: Extends LinearRegression by providing L1 regularization on the coefficients when predicting response y with a linear combination of the predictors in X.
ElasticNet: Extends LinearRegression with combined L1 and L2 regularization on the coefficients when predicting response y with a linear combination of the predictors in X.
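To illustrate what the L2 penalty changes, here is a minimal NumPy sketch (not the distributed cuml.dask API) comparing ordinary least squares with the ridge closed form; the penalty shrinks the coefficient vector toward zero:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)

def fit(X, y, l2=0.0):
    """Least squares with an optional L2 penalty (ridge closed form)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + l2 * np.eye(d), X.T @ y)

w_ols = fit(X, y)             # plain LinearRegression
w_ridge = fit(X, y, l2=10.0)  # Ridge: the L2 penalty shrinks the coefficients
```

Lasso and ElasticNet replace (or combine) the L2 term with an L1 term, which has no closed form and is typically solved by coordinate descent (see Solvers below).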
Manifold
UMAP: Uniform Manifold Approximation and Projection.
Naive Bayes
MultinomialNB: Distributed Naive Bayes classifier for multinomial models.
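A multinomial Naive Bayes classifier scores each class by its log prior plus the sum of smoothed log likelihoods of the observed word counts. A self-contained, single-process sketch of that scoring rule (the toy corpus and helper names are illustrative, not part of cuml.dask):

```python
import math
from collections import Counter

# Tiny corpus: (tokens, class) pairs; per-word counts are the features.
docs = [("spam win money win".split(), "spam"),
        ("win cash money".split(), "spam"),
        ("meeting notes project".split(), "ham"),
        ("project meeting today".split(), "ham")]

vocab = sorted({w for tokens, _ in docs for w in tokens})
classes = sorted({c for _, c in docs})

# Per-class word counts and class priors.
counts = {c: Counter() for c in classes}
priors = Counter(c for _, c in docs)
for tokens, c in docs:
    counts[c].update(tokens)

def predict(tokens, alpha=1.0):
    """Pick the class maximizing log P(c) + sum log P(word|c), Laplace-smoothed."""
    def score(c):
        total = sum(counts[c].values())
        s = math.log(priors[c] / len(docs))
        for w in tokens:
            s += math.log((counts[c][w] + alpha) / (total + alpha * len(vocab)))
        return s
    return max(classes, key=score)
```

The distributed version computes the per-class count statistics across Dask partitions and then combines them, since the model reduces to those sufficient statistics.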
Neighbors
NearestNeighbors: Multi-node, multi-GPU NearestNeighbors model.
KNeighborsClassifier: Multi-node, multi-GPU K-Nearest Neighbors classifier model.
KNeighborsRegressor: Multi-node, multi-GPU K-Nearest Neighbors regressor model.
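All three classes are built on the same primitive: find the k closest training points to each query, then either return them (NearestNeighbors), take a majority label vote (classifier), or average the targets (regressor). A brute-force NumPy sketch of the first two, not the cuml.dask API:

```python
import numpy as np

rng = np.random.default_rng(1)
# Two well-separated clusters of labeled points.
X = np.vstack([rng.normal(0, 0.3, size=(20, 2)),
               rng.normal(3, 0.3, size=(20, 2))])
y = np.array([0] * 20 + [1] * 20)

def kneighbors(q, k=3):
    """Indices of the k nearest training points (brute-force L2 distance)."""
    d = np.linalg.norm(X - q, axis=1)
    return np.argsort(d)[:k]

def predict(q, k=3):
    # Classifier: majority label among the k neighbors.
    idx = kneighbors(q, k)
    return int(np.bincount(y[idx]).argmax())
```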
Preprocessing
LabelBinarizer: A distributed version of LabelBinarizer for one-hot encoding a collection of labels.
OneHotEncoder: Encodes categorical features as a one-hot numeric array.
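One-hot encoding maps each category to a vector with a single 1 in that category's position. A minimal stdlib sketch of the per-partition logic (the distributed versions apply the same mapping across Dask partitions after agreeing on the category set):

```python
# Minimal one-hot encoding sketch; toy labels are illustrative.
labels = ["cat", "dog", "cat", "bird"]
categories = sorted(set(labels))             # ['bird', 'cat', 'dog']

# Each row has exactly one 1, in the column of its category.
one_hot = [[int(c == lab) for c in categories] for lab in labels]
```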
Feature Extraction
TfidfTransformer: Distributed TF-IDF transformer.
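TF-IDF reweights raw term counts so that terms appearing in many documents are down-weighted. A small stdlib sketch; the smoothed idf formula below (ln((1+n)/(1+df)) + 1, as used by scikit-learn) is an assumption about the exact variant, and the toy counts are illustrative:

```python
import math

# Toy document-term counts (rows: documents, columns follow vocab order).
vocab = ["gpu", "dask", "cuda"]
tf = [[3, 0, 1],
      [0, 2, 0],
      [1, 1, 0]]

n_docs = len(tf)
# Document frequency: in how many documents each term appears.
df = [sum(1 for row in tf if row[j] > 0) for j in range(len(vocab))]
# Smoothed inverse document frequency.
idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]

# tf-idf = term frequency * inverse document frequency.
tfidf = [[row[j] * idf[j] for j in range(len(vocab))] for row in tf]
```

The distributed transformer computes document frequencies with a reduction across partitions, then scales each partition's counts locally.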
Datasets
make_blobs: Makes labeled Dask-CuPy arrays containing blobs for a randomly generated set of centroids.
make_classification: Generates a random n-class classification problem.
make_regression: Generates a random regression problem.
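The blob generator's logic is simple: draw random centroids, assign each sample to one, and add Gaussian noise around it. A single-process NumPy sketch (the cuml.dask version produces distributed Dask-CuPy arrays instead; the helper below is illustrative, not the library function):

```python
import numpy as np

rng = np.random.default_rng(7)

def make_blobs(n_samples=90, centers=3, n_features=2, cluster_std=0.5):
    """Gaussian blobs around random centroids (single-process sketch)."""
    centroids = rng.uniform(-10, 10, size=(centers, n_features))
    labels = rng.integers(0, centers, size=n_samples)
    X = centroids[labels] + rng.normal(0, cluster_std,
                                       size=(n_samples, n_features))
    return X, labels

X, y = make_blobs()
```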
Solvers
CD: Multi-node, multi-GPU coordinate descent solver.
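Coordinate descent minimizes an objective one coefficient at a time; for L1-penalized least squares each update is a soft-thresholding step. A minimal NumPy sketch of cyclic coordinate descent for the lasso objective (1/2n)||y - Xw||^2 + alpha*||w||_1, not the distributed cuml.dask solver:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(120, 4))
w_true = np.array([1.5, 0.0, -2.0, 0.0])   # sparse ground truth
y = X @ w_true + 0.05 * rng.normal(size=120)

def soft_threshold(z, t):
    """Shrink z toward zero by t; exactly zero inside [-t, t]."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, alpha=0.1, n_iter=100):
    """Cyclic coordinate descent for (1/2n)||y - Xw||^2 + alpha*||w||_1."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        for j in range(d):
            r = y - X @ w + X[:, j] * w[j]   # residual excluding feature j
            rho = X[:, j] @ r / n
            w[j] = soft_threshold(rho, alpha) / (X[:, j] @ X[:, j] / n)
    return w

w = lasso_cd(X, y)
```

The soft-thresholding step is what drives the coefficients of irrelevant features exactly to zero, which is why this solver underpins the Lasso and ElasticNet models above.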