Limitations#

The cuml.accel zero code change accelerator is currently a beta feature. As such, it has a number of known limitations and bugs. The team is working to address these, and expects the number of limitations to decrease with every release.

These limitations fall into a few categories:

  • Estimators that are fully unaccelerated. For example, while we currently provide GPU acceleration for models like sklearn.linear_model.Ridge, we don’t accelerate other models like sklearn.linear_model.BayesianRidge. Unaccelerated estimators won’t result in bugs or failures, but also won’t run any faster than they would under sklearn. If you don’t see an estimator listed on this page, we do not provide acceleration for it.

  • Estimators that are only partially accelerated. cuml.accel will fall back to using the CPU implementations for some algorithms in the presence of certain hyperparameters or input types. These cases are documented below in estimator-specific sections. See Logging and Profiling for how to enable logging to gain insight into when cuml.accel needs to fall back to CPU.

  • Missing fitted attributes. cuml.accel does not currently generate the full set of fitted attributes that sklearn does. In most cases this is not a problem: the missing attributes are usually minor things like n_iter_ that are useful for inspecting a model fit but not necessary for inference. Like unsupported parameters, missing fitted attributes are documented in the algorithm-specific sections below.

  • Differences between fit models. The algorithms and implementations used in cuml naturally differ from those used in sklearn, so some differences between fit models are to be expected. To compare results between models fit with cuml.accel and those fit without, compare the model quality (using e.g. model.score) rather than the numeric values of the fitted coefficients, as sketched below.
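For example, a minimal sketch of such a comparison (the estimator and synthetic dataset here are placeholders; the same script would be run once under cuml.accel and once without it, and the printed scores compared):

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Placeholder data; substitute your own dataset
X, y = make_regression(n_samples=10_000, n_features=50, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = Ridge().fit(X_train, y_train)
# Compare this score (rather than model.coef_) between the two runs
print(model.score(X_test, y_test))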

None of the above should result in bugs (exceptions, failures, poor model quality, …). That said, as a beta feature, some bugs are likely. If you find a case that errors or results in a model with measurably worse quality when run under cuml.accel, please open an issue.

A few additional general notes:

  • Performance improvements will be most apparent when running on larger data. On very small datasets you might see only a small speedup (or even potentially a slowdown).

  • For most algorithms, y must already be converted to numeric values; arrays of strings are not supported. Pre-encode string labels into numeric or categorical form (e.g. using scikit-learn’s LabelEncoder, as sketched after this list) prior to processing.

  • The accelerator is compatible with scikit-learn version 1.4 or higher. This compatibility ensures that cuML’s implementation of scikit-learn compatible APIs works as expected.

  • Error and warning messages and formats may differ from scikit-learn. Some errors might present as C++ stack traces instead of Python errors.
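As a small sketch of the label-encoding note above:

from sklearn.preprocessing import LabelEncoder

# String labels must be encoded to numeric values before fitting
y_raw = ["cat", "dog", "cat", "bird"]
y = LabelEncoder().fit_transform(y_raw)  # array([1, 2, 1, 0])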

For notes on each algorithm, please refer to its specific section below.

hdbscan#

HDBSCAN will fall back to CPU in the following cases:

  • If metric is not "l2" or "euclidean".

  • If a caching location is configured via the memory parameter.

  • If match_reference_implementation=True.

  • If branch_detection_data=True.

Additionally, the following fitted attributes are currently not computed:

  • exemplars_

  • outlier_scores_

  • relative_validity_

Additional notes:

  • The HDBSCAN implementation in cuml uses a parallel MST construction, which means the results are not deterministic when there are duplicates in the mutual reachability graph.

sklearn.cluster#

The algorithms used in cuml differ from those in sklearn. As such, you shouldn’t expect the fitted attributes (e.g. labels_) to numerically match an estimator fitted without cuml.accel.

To compare results between estimators, we recommend comparing scores like sklearn.metrics.adjusted_rand_score or sklearn.metrics.adjusted_mutual_info_score. For low dimensional data you can also visually inspect the resulting cluster assignments.
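For example, a minimal sketch (both fits here are plain scikit-learn on synthetic data; in practice one set of labels would come from a run under cuml.accel and the other from a run without it):

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

X, _ = make_blobs(n_samples=5_000, centers=5, random_state=0)
# Stand-ins for labels fitted with and without cuml.accel
labels_a = KMeans(n_clusters=5, random_state=0).fit_predict(X)
labels_b = KMeans(n_clusters=5, random_state=1).fit_predict(X)
# A score close to 1.0 indicates the two clusterings largely agree
print(adjusted_rand_score(labels_a, labels_b))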

KMeans#

KMeans will fall back to CPU in the following cases:

  • If a callable init is provided.

  • If X is sparse.

DBSCAN#

DBSCAN will fall back to CPU in the following cases:

  • If algorithm isn’t "auto" or "brute".

  • If metric isn’t one of the supported metrics ("l2", "euclidean", "cosine", "precomputed").

  • If X is sparse.

sklearn.decomposition#

The sklearn.decomposition implementations used by cuml.accel use different SVD solvers than the ones in scikit-learn, which may result in numeric differences in the components_ and explained_variance_ values. These differences should be small for most algorithms, but may be larger for less numerically stable solvers like "randomized" or "covariance_eigh".

Likewise, note that the implementation in cuml.accel may currently result in some of the vectors in components_ having inverted signs. This result is not incorrect, but it can make direct numeric comparisons harder without first normalizing the signs. One common way of handling this is to normalize the first non-zero value in each vector to be positive. You might find the following numpy function useful for this.

import numpy as np

def normalize(components):
    """Normalize the sign of components for easier numeric comparison"""
    # Locate the first non-zero entry in each row (use column 0 for all-zero rows)
    nonzero = components != 0
    inds = np.where(nonzero.any(axis=1), nonzero.argmax(axis=1), 0)[:, None]
    first_nonzero = np.take_along_axis(components, inds, 1)
    # Flip each row so that its first non-zero entry is positive
    return np.sign(first_nonzero) * components
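
For example, using the helper above (here the second set of components is just a sign-flipped copy standing in for a fit produced under cuml.accel):

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
a = PCA(n_components=2).fit(X).components_
b = -a  # stand-in for components whose signs came out inverted
# After sign normalization the two sets of components match elementwise
print(np.allclose(normalize(a), normalize(b)))  # True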

PCA#

PCA will fall back to CPU in the following cases:

  • If n_components="mle".

Additional notes:

  • Parameters for the "randomized" solver, such as random_state, n_oversamples, and power_iteration_normalizer, are ignored.

TruncatedSVD#

TruncatedSVD will fall back to CPU in the following cases:

  • If X is sparse.

Additional notes:

  • Parameters for the "randomized" solver, such as random_state, n_oversamples, and power_iteration_normalizer, are ignored.

sklearn.ensemble#

The random forest implementation used by cuml.accel algorithmically differs from the one in sklearn. As such, you shouldn’t expect the fitted attributes (e.g. estimators_) to numerically match an estimator fitted without cuml.accel.

To compare results between estimators, we recommend comparing scores like sklearn.metrics.root_mean_squared_error (for regression) or sklearn.metrics.log_loss (for classification).
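For example, a minimal classification sketch (synthetic data as a placeholder; the same script would be run with and without cuml.accel and the printed log loss compared):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
# Compare this value (rather than the fitted trees) between the two runs
print(log_loss(y_test, clf.predict_proba(X_test)))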

RandomForestClassifier#

RandomForestClassifier will fall back to CPU in the following cases:

  • If criterion is "log_loss".

  • If oob_score=True.

  • If warm_start=True.

  • If monotonic_cst is not None.

  • If max_samples is an integer.

  • If min_weight_fraction_leaf is not 0.

  • If ccp_alpha is not 0.

  • If class_weight is not None.

  • If sample_weight is passed to fit or score.

  • If X is sparse.

Additionally, the following fitted attributes are currently not computed:

  • feature_importances_

  • estimators_samples_

RandomForestRegressor#

RandomForestRegressor will fall back to CPU in the following cases:

  • If criterion is "absolute_error" or "friedman_mse".

  • If oob_score=True.

  • If warm_start=True.

  • If monotonic_cst is not None.

  • If max_samples is an integer.

  • If min_weight_fraction_leaf is not 0.

  • If ccp_alpha is not 0.

  • If sample_weight is passed to fit or score.

  • If X is sparse.

Additionally, the following fitted attributes are currently not computed:

  • feature_importances_

  • estimators_samples_

sklearn.kernel_ridge#

KernelRidge#

KernelRidge will fall back to CPU in the following cases:

  • If X is sparse.

KernelRidge results should be almost identical to those of scikit-learn when running with cuml.accel enabled. In particular, the fitted dual_coef_ values should be close enough that they can be compared via np.allclose.
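For example, a minimal sketch that records dual_coef_ so the same script can be run once with cuml.accel and once without, and the two saved arrays then compared with np.allclose (the synthetic data and file name are placeholders):

import numpy as np
from sklearn.datasets import make_regression
from sklearn.kernel_ridge import KernelRidge

X, y = make_regression(n_samples=2_000, n_features=10, random_state=0)
dual_coef = KernelRidge(kernel="rbf", alpha=1.0).fit(X, y).dual_coef_
# Save the coefficients so the two runs can be compared afterwards
np.save("dual_coef.npy", dual_coef)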

sklearn.linear_model#

The linear model solvers used by cuml.accel differ from those used in sklearn. As such, you shouldn’t expect the fitted attributes (e.g. coef_) to numerically match an estimator fitted without cuml.accel. For some estimators (e.g. LinearRegression) you might get a close match, but for others there may be larger numeric differences.

To compare results between estimators, we recommend comparing model quality scores like sklearn.metrics.r2_score (for regression) or sklearn.metrics.accuracy_score (for classification).

LinearRegression#

LinearRegression will fall back to CPU in the following cases:

  • If positive=True.

  • If X is sparse.

Additionally, the following fitted attributes are currently not computed:

  • rank_

  • singular_

LogisticRegression#

LogisticRegression will fall back to CPU in the following cases:

  • If warm_start=True.

  • If intercept_scaling is not 1.

  • If the deprecated multi_class parameter is used.

ElasticNet#

ElasticNet will fall back to CPU in the following cases:

  • If positive=True.

  • If warm_start=True.

  • If precompute is not False.

  • If X is sparse.

Additionally, the following fitted attributes are currently not computed:

  • dual_gap_

  • n_iter_

Ridge#

Ridge will fall back to CPU in the following cases:

  • If positive=True.

  • If solver="lbfgs".

  • If X is sparse.

  • If X has more columns than rows.

  • If y is multioutput.

Additionally, the following fitted attributes are currently not computed:

  • n_iter_

Lasso#

Lasso will fall back to CPU in the following cases:

  • If positive=True.

  • If warm_start=True.

  • If precompute is not False.

  • If X is sparse.

Additionally, the following fitted attributes are currently not computed:

  • dual_gap_

  • n_iter_

sklearn.manifold#

TSNE#

TSNE will fall back to CPU in the following cases:

  • If n_components is not 2.

  • If init is an array.

  • If metric isn’t one of the supported metrics ("l2", "euclidean", "sqeuclidean", "cityblock", "l1", "manhattan", "minkowski", "chebyshev", "cosine", "correlation").

Additionally, the following fitted attributes are currently not computed:

  • n_iter_

Additional notes:

  • Even with a random_state, the TSNE implementation used by cuml.accel isn’t completely deterministic.

While the exact numerical output for TSNE may differ from that obtained without cuml.accel, we expect the quality of results to be approximately as good in most cases. Beyond comparing the visual representation, you may find it useful to compare the trustworthiness score (computed via sklearn.manifold.trustworthiness) or the kl_divergence_ fitted attribute.
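For example, a minimal sketch of scoring an embedding by trustworthiness (the digits dataset here is just a placeholder):

from sklearn.datasets import load_digits
from sklearn.manifold import TSNE, trustworthiness

X = load_digits().data
embedding = TSNE(n_components=2, random_state=0).fit_transform(X)
# Values closer to 1.0 indicate the local structure is better preserved
print(trustworthiness(X, embedding, n_neighbors=5))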

sklearn.neighbors#

NearestNeighbors#

NearestNeighbors will fall back to CPU in the following cases:

  • If metric is not one of the supported metrics ("l2", "euclidean", "l1", "cityblock", "manhattan", "taxicab", "canberra", "minkowski", "lp", "chebyshev", "linf", "jensenshannon", "cosine", "correlation", "inner_product", "sqeuclidean", "haversine").

Additional notes:

  • The algorithm parameter is ignored; the GPU-accelerated "brute" implementation in cuml is always used.

  • The radius_neighbors method isn’t implemented in cuml and will always fall back to CPU.

KNeighborsClassifier#

KNeighborsClassifier will fall back to CPU in the following cases:

  • If metric is not one of the supported metrics ("l2", "euclidean", "l1", "cityblock", "manhattan", "taxicab", "canberra", "minkowski", "lp", "chebyshev", "linf", "jensenshannon", "cosine", "correlation", "inner_product", "sqeuclidean", "haversine").

  • If weights is not "uniform".

Additional notes:

  • The algorithm parameter is ignored; the GPU-accelerated "brute" implementation in cuml is always used.

KNeighborsRegressor#

KNeighborsRegressor will fall back to CPU in the following cases:

  • If metric is not one of the supported metrics ("l2", "euclidean", "l1", "cityblock", "manhattan", "taxicab", "canberra", "minkowski", "lp", "chebyshev", "linf", "jensenshannon", "cosine", "correlation", "inner_product", "sqeuclidean", "haversine").

  • If weights is not "uniform".

Additional notes:

  • The algorithm parameter is ignored; the GPU-accelerated "brute" implementation in cuml is always used.

sklearn.svm#

The SVM implementations used by cuml.accel differ from those used in sklearn. As such, you shouldn’t expect the fitted attributes (e.g. coef_ or support_vectors_) to numerically match an estimator fitted without cuml.accel.

To compare results between estimators, we recommend comparing model quality scores like sklearn.metrics.r2_score (for regression) or sklearn.metrics.accuracy_score (for classification).

SVC#

SVC will fall back to CPU in the following cases:

  • If kernel="precomputed" or is a callable.

  • If X is sparse.

  • If y is multiclass.

Additionally, the following fitted attributes are currently not computed:

  • class_weight_

  • n_iter_

SVR#

SVR will fall back to CPU in the following cases:

  • If kernel="precomputed" or is a callable.

  • If X is sparse.

Additionally, the following fitted attributes are currently not computed:

  • n_iter_

LinearSVC#

LinearSVC will fall back to CPU in the following cases:

  • If X is sparse.

  • If intercept_scaling is not 1.

  • If multi_class is not "ovr".

The following fitted attributes are currently not computed:

  • n_iter_

Additional notes:

  • Use of sample weights may not produce exactly equivalent results compared to replicating data according to the weights.

  • Models may not always be picklable; in particular, multi-class models may have coefficient shape differences that cause pickling or unpickling to fail.

LinearSVR#

LinearSVR will fall back to CPU in the following cases:

  • If X is sparse.

  • If intercept_scaling is not 1.

The following fitted attributes are currently not computed:

  • n_iter_

Additional notes:

  • Use of sample weights may not produce exactly equivalent results when compared to replicating data according to weights.

  • Models may not be picklable under certain conditions; pickling or unpickling may fail.

umap#

UMAP will fall back to CPU in the following cases:

  • If init is not "random" or "spectral".

  • If metric is not one of the supported metrics ("l1", "cityblock", "taxicab", "manhattan", "euclidean", "l2", "sqeuclidean", "canberra", "minkowski", "chebyshev", "linf", "cosine", "correlation", "hellinger", "hamming", "jaccard").

  • If target_metric is not one of the supported metrics ("categorical", "l2", "euclidean").

  • If unique=True.

  • If densmap=True.

Additional notes:

  • Reproducibility through use of a seed (the random_state parameter) comes at the expense of some performance.

  • Parallelism during the optimization stage introduces numerical imprecision, which can lead to differences between CPU and GPU results in general.

While the exact numerical output for UMAP may differ from that obtained without cuml.accel, we expect the quality of results to be approximately as good in most cases. Beyond comparing the visual representation, you may find it useful to compare the trustworthiness score (computed via sklearn.manifold.trustworthiness).