Limitations#
The cuml.accel zero code change accelerator is currently a beta feature. As
such, it has a number of known limitations and bugs. The team is working to
address these, and expects the number of limitations to decrease with every
release.
These limitations fall into a few categories:
- Estimators that are fully unaccelerated. For example, while we currently provide GPU acceleration for models like sklearn.linear_model.Ridge, we don't accelerate other models like sklearn.linear_model.BayesianRidge. Unaccelerated estimators won't result in bugs or failures, but also won't run any faster than they would under sklearn. If you don't see an estimator listed on this page, we do not provide acceleration for it.
- Estimators that are only partially accelerated. cuml.accel will fall back to the CPU implementations for some algorithms in the presence of certain hyperparameters or input types. These cases are documented below in estimator-specific sections. See Logging and Profiling for how to enable logging to gain insight into when cuml.accel needs to fall back to CPU.
- Missing fitted attributes. cuml.accel does not currently generate the full set of fitted attributes that sklearn does. In most cases this is not a problem; the missing attributes are usually minor things like n_iter_ that are useful for inspecting a model fit but not necessary for inference. Like unsupported parameters, missing fitted attributes are documented in algorithm-specific sections below.
- Differences between fit models. The algorithms and implementations used in cuml naturally differ from those used in sklearn, which may result in differences between fit models. This is to be expected. To compare results between models fit with cuml.accel and those fit without, compare the model quality (using e.g. model.score) rather than the numeric values of the fitted coefficients.
None of the above should result in bugs (exceptions, failures, poor model
quality, …). That said, as this is a beta feature, some bugs are likely. If
you find a case that errors or results in a model with measurably worse
quality when run under cuml.accel, please open an issue.
A few additional general notes:
- Performance improvements will be most apparent when running on larger data. On very small datasets you might see only a small speedup (or even potentially a slowdown).
- For most algorithms, y must already be converted to numeric values; arrays of strings are not supported. Pre-encode string labels into numerical or categorical formats (e.g. using scikit-learn's LabelEncoder, as sketched after this list) prior to processing.
- The accelerator is compatible with scikit-learn version 1.4 or higher. This compatibility ensures that cuML's implementation of scikit-learn compatible APIs works as expected.
- Error and warning messages and formats may differ from scikit-learn. Some errors might present as C++ stack traces instead of Python errors.
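A minimal sketch of pre-encoding string labels, assuming a toy random dataset and LogisticRegression purely for illustration:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder

X = np.random.rand(100, 4)
y_str = np.random.choice(["cat", "dog"], size=100)  # string labels

# Encode the string labels to integers before fitting
le = LabelEncoder()
y = le.fit_transform(y_str)

model = LogisticRegression().fit(X, y)
# Decode predictions back to the original string labels
labels = le.inverse_transform(model.predict(X))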
For notes on each algorithm, please refer to its specific section on this page.
hdbscan#
HDBSCAN will fall back to CPU in the following cases:
- If metric is not "l2" or "euclidean".
- If a memory location is configured.
- If match_reference_implementation=True.
- If branch_detection_data=True.
Additionally, the following fitted attributes are currently not computed:
- exemplars_
- outlier_scores_
- relative_validity_
Additional notes:
- The HDBSCAN in cuml uses a parallel MST implementation, which means the results are not deterministic when there are duplicates in the mutual reachability graph.
sklearn.cluster#
The algorithms used in cuml differ from those in sklearn. As such, you
shouldn’t expect the fitted attributes (e.g. labels_) to numerically match
an estimator fitted without cuml.accel.
To compare results between estimators, we recommend comparing scores like
sklearn.metrics.adjusted_rand_score or
sklearn.metrics.adjusted_mutual_info_score. For low dimensional data you
can also visually inspect the resulting cluster assignments.
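For example, a minimal sketch of such a comparison, assuming a synthetic dataset (in practice, one of the two fits would run under cuml.accel):

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

X, _ = make_blobs(n_samples=500, centers=3, random_state=0)

a = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
b = KMeans(n_clusters=3, n_init=10, random_state=1).fit(X)

# 1.0 means the two partitions agree exactly (up to label permutation)
print(adjusted_rand_score(a.labels_, b.labels_))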
KMeans#
KMeans will fall back to CPU in the following cases:
- If a callable init is provided.
- If X is sparse.
DBSCAN#
DBSCAN will fall back to CPU in the following cases:
- If algorithm isn't "auto" or "brute".
- If metric isn't one of the supported metrics ("l2", "euclidean", "cosine", "precomputed").
- If X is sparse.
sklearn.decomposition#
The sklearn.decomposition implementations used by cuml.accel use
different SVD solvers than the ones in Scikit-Learn, which may result in
numeric differences in the components_ and explained_variance_ values.
These differences should be small for most algorithms, but may be larger for
randomized or less numerically stable solvers like "randomized" or
"covariance_eigh".
Likewise, note that the implementation in cuml.accel currently may result
in some of the vectors in components_ having inverted signs. This result is
not incorrect, but can make it harder to do direct numeric comparisons without
first normalizing the signs. One common way of handling this is by normalizing
the first non-zero values in each vector to be positive. You might find the
following numpy function useful for this.
import numpy as np

def normalize(components):
    """Normalize the sign of components for easier numeric comparison"""
    # Boolean mask of the nonzero entries in each component vector
    nonzero = components != 0
    # Column index of the first nonzero entry in each row (0 for all-zero rows)
    inds = np.where(nonzero.any(axis=1), nonzero.argmax(axis=1), 0)[:, None]
    first_nonzero = np.take_along_axis(components, inds, 1)
    # Flip each row so that its first nonzero entry is positive
    return np.sign(first_nonzero) * components
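For example, a sketch comparing the components of two PCA fits after sign normalization (two CPU fits with different solvers stand in here for fits with and without cuml.accel):

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca_a = PCA(n_components=2).fit(X)
pca_b = PCA(n_components=2, svd_solver="full").fit(X)

# Compare after normalizing signs with the function above
print(np.allclose(normalize(pca_a.components_), normalize(pca_b.components_), atol=1e-4))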
PCA#
PCA will fall back to CPU in the following cases:
- If n_components="mle".
Additional notes:
- Parameters for the "randomized" solver like random_state, n_oversamples, and power_iteration_normalizer are ignored.
TruncatedSVD#
TruncatedSVD will fall back to CPU in the following cases:
- If X is sparse.
Additional notes:
- Parameters for the "randomized" solver like random_state, n_oversamples, and power_iteration_normalizer are ignored.
sklearn.ensemble#
The random forest implementation used by cuml.accel algorithmically
differs from the one in sklearn. As such, you
shouldn’t expect the fitted attributes (e.g. estimators_) to numerically match
an estimator fitted without cuml.accel.
To compare results between estimators, we recommend comparing scores like
sklearn.metrics.root_mean_squared_error (for regression) or
sklearn.metrics.log_loss (for classification).
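For example, a minimal sketch of scoring a classifier on held-out data, assuming a synthetic dataset (compare this score across fits with and without cuml.accel):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
# Lower log-loss is better; compare this value between the two fits
print(log_loss(y_test, model.predict_proba(X_test)))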
RandomForestClassifier#
RandomForestClassifier will fall back to CPU in the following cases:
- If criterion is "log_loss".
- If oob_score=True.
- If warm_start=True.
- If monotonic_cst is not None.
- If max_samples is an integer.
- If min_weight_fraction_leaf is not 0.
- If ccp_alpha is not 0.
- If class_weight is not None.
- If sample_weight is passed to fit or score.
- If X is sparse.
RandomForestRegressor#
RandomForestRegressor will fall back to CPU in the following cases:
- If criterion is "absolute_error" or "friedman_mse".
- If oob_score=True.
- If warm_start=True.
- If monotonic_cst is not None.
- If max_samples is an integer.
- If min_weight_fraction_leaf is not 0.
- If ccp_alpha is not 0.
- If sample_weight is passed to fit or score.
- If X is sparse.
sklearn.kernel_ridge#
KernelRidge#
KernelRidge will fall back to CPU in the following cases:
- If X is sparse.
KernelRidge results should be almost identical to those of Scikit-Learn
when running with cuml.accel enabled. In particular, the fitted
dual_coef_ should be close enough that they may be compared via
np.allclose.
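For example, a sketch of such a comparison, assuming a synthetic dataset (in practice, one of the two fits would run under cuml.accel):

import numpy as np
from sklearn.datasets import make_regression
from sklearn.kernel_ridge import KernelRidge

X, y = make_regression(n_samples=200, n_features=5, random_state=0)

a = KernelRidge(kernel="rbf").fit(X, y)
b = KernelRidge(kernel="rbf").fit(X, y)

# The fitted dual coefficients should agree closely
print(np.allclose(a.dual_coef_, b.dual_coef_))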
sklearn.linear_model#
The linear model solvers used by cuml.accel differ from those used in
sklearn. As such, you shouldn't expect the fitted attributes (e.g.
coef_) to numerically match an estimator fitted without cuml.accel. For
some estimators (e.g. LinearRegression) you might get a close match, but
for others there may be larger numeric differences.
To compare results between estimators, we recommend comparing model quality
scores like sklearn.metrics.r2_score (for regression) or
sklearn.metrics.accuracy_score (for classification).
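For example, a minimal sketch of scoring a regressor on held-out data, assuming a synthetic dataset (compare this score across fits with and without cuml.accel):

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=10, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = Ridge().fit(X_train, y_train)
# Higher R^2 is better; compare this value between the two fits
print(r2_score(y_test, model.predict(X_test)))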
LinearRegression#
LinearRegression will fall back to CPU in the following cases:
- If positive=True.
- If X is sparse.
Additionally, the following fitted attributes are currently not computed:
- rank_
- singular_
LogisticRegression#
LogisticRegression will fall back to CPU in the following cases:
- If warm_start=True.
- If intercept_scaling is not 1.
- If the deprecated multi_class parameter is used.
ElasticNet#
ElasticNet will fall back to CPU in the following cases:
- If positive=True.
- If warm_start=True.
- If precompute is not False.
- If X is sparse.
Additionally, the following fitted attributes are currently not computed:
- dual_gap_
- n_iter_
Ridge#
Ridge will fall back to CPU in the following cases:
- If positive=True.
- If solver="lbfgs".
- If X is sparse.
- If X has more columns than rows.
- If y is multioutput.
Additionally, the following fitted attributes are currently not computed:
- n_iter_
Lasso#
Lasso will fall back to CPU in the following cases:
- If positive=True.
- If warm_start=True.
- If precompute is not False.
- If X is sparse.
Additionally, the following fitted attributes are currently not computed:
- dual_gap_
- n_iter_
sklearn.manifold#
TSNE#
TSNE will fall back to CPU in the following cases:
- If n_components is not 2.
- If init is an array.
- If metric isn't one of the supported metrics ("l2", "euclidean", "sqeuclidean", "cityblock", "l1", "manhattan", "minkowski", "chebyshev", "cosine", "correlation").
Additional notes:
- Even with a random_state, the TSNE implementation used by cuml.accel isn't completely deterministic.
While the exact numerical output for TSNE may differ from that obtained without
cuml.accel, we expect the quality of results will be approximately as
good in most cases. Beyond comparing the visual representation, you may find
comparing the trustworthiness score (computed via
sklearn.manifold.trustworthiness) or the kl_divergence_ fitted
attribute useful.
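For example, a minimal sketch of scoring a TSNE embedding, assuming the digits dataset purely for illustration:

from sklearn.datasets import load_digits
from sklearn.manifold import TSNE, trustworthiness

X, _ = load_digits(return_X_y=True)

tsne = TSNE(n_components=2, random_state=0)
emb = tsne.fit_transform(X)

# Higher trustworthiness (max 1.0) means local structure is better preserved
print(trustworthiness(X, emb, n_neighbors=5))
print(tsne.kl_divergence_)  # lower KL divergence is better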
SpectralEmbedding#
SpectralEmbedding will fall back to CPU in the following cases:
- If affinity is not "nearest_neighbors" or "precomputed".
- If X is sparse.
- If X has only 1 feature.
The following fitted attributes are currently not computed:
- affinity_matrix_
sklearn.neighbors#
NearestNeighbors#
NearestNeighbors will fall back to CPU in the following cases:
- If metric is not one of the supported metrics ("l2", "euclidean", "l1", "cityblock", "manhattan", "taxicab", "canberra", "minkowski", "lp", "chebyshev", "linf", "jensenshannon", "cosine", "correlation", "inner_product", "sqeuclidean", "haversine").
Additional notes:
- The algorithm parameter is ignored; the GPU accelerated "brute" implementation in cuml will always be used.
- The radius_neighbors method isn't implemented in cuml and will always fall back to CPU.
KNeighborsClassifier#
KNeighborsClassifier will fall back to CPU in the following cases:
- If metric is not one of the supported metrics ("l2", "euclidean", "l1", "cityblock", "manhattan", "taxicab", "canberra", "minkowski", "lp", "chebyshev", "linf", "jensenshannon", "cosine", "correlation", "inner_product", "sqeuclidean", "haversine").
- If weights is not "uniform".
Additional notes:
- The algorithm parameter is ignored; the GPU accelerated "brute" implementation in cuml will always be used.
KNeighborsRegressor#
KNeighborsRegressor will fall back to CPU in the following cases:
- If metric is not one of the supported metrics ("l2", "euclidean", "l1", "cityblock", "manhattan", "taxicab", "canberra", "minkowski", "lp", "chebyshev", "linf", "jensenshannon", "cosine", "correlation", "inner_product", "sqeuclidean", "haversine").
- If weights is not "uniform".
Additional notes:
- The algorithm parameter is ignored; the GPU accelerated "brute" implementation in cuml will always be used.
KernelDensity#
KernelDensity will fall back to CPU in the following cases:
- If metric is not one of the supported metrics ("cityblock", "cosine", "euclidean", "l1", "l2", "manhattan", "sqeuclidean", "canberra", "chebyshev", "minkowski", "hellinger", "correlation", "jensenshannon", "hamming", "kldivergence", "russellrao", "nan_euclidean").
Additional notes:
- The algorithm, atol, rtol, breadth_first, and leaf_size parameters are ignored. The GPU accelerated pairwise brute-force implementation in cuml will always be used.
sklearn.svm#
The SVM implementations used by cuml.accel differ from those used in
sklearn. As such, you shouldn't expect the fitted attributes (e.g. coef_ or
support_vectors_) to numerically match an estimator fitted without
cuml.accel.
To compare results between estimators, we recommend comparing model quality
scores like sklearn.metrics.r2_score (for regression) or
sklearn.metrics.accuracy_score (for classification).
SVC#
SVC will fall back to CPU in the following cases:
- If kernel="precomputed" or kernel is a callable.
- If X is sparse.
- If y is multiclass.
Additionally, the following fitted attributes are currently not computed:
- class_weight_
- n_iter_
SVR#
SVR will fall back to CPU in the following cases:
- If kernel="precomputed" or kernel is a callable.
- If X is sparse.
Additionally, the following fitted attributes are currently not computed:
- n_iter_
LinearSVC#
LinearSVC will fall back to CPU in the following cases:
- If X is sparse.
- If intercept_scaling is not 1.
- If multi_class is not "ovr".
The following fitted attributes are currently not computed:
- n_iter_
Additional notes:
- Use of sample weights may not produce exactly equivalent results when compared to replicating data according to weights.
LinearSVR#
LinearSVR will fall back to CPU in the following cases:
- If X is sparse.
- If intercept_scaling is not 1.
The following fitted attributes are currently not computed:
- n_iter_
Additional notes:
- Use of sample weights may not produce exactly equivalent results when compared to replicating data according to weights.
umap#
UMAP will fall back to CPU in the following cases:
- If init is not "random" or "spectral".
- If metric is not one of the supported metrics ("l1", "cityblock", "taxicab", "manhattan", "euclidean", "l2", "sqeuclidean", "canberra", "minkowski", "chebyshev", "linf", "cosine", "correlation", "hellinger", "hamming", "jaccard").
- If target_metric is not one of the supported metrics ("categorical", "l2", "euclidean").
- If unique=True.
- If densmap=True.
Additional notes:
- Reproducibility with a seed (the random_state parameter) comes at some cost to performance.
- Parallelism during the optimization stage introduces numerical imprecision, which can lead to differences between CPU and GPU results in general.
While the exact numerical output for UMAP may differ from that obtained without
cuml.accel, we expect the quality of results will be approximately as
good in most cases. Beyond comparing the visual representation, you may find
comparing the trustworthiness score (computed via
sklearn.manifold.trustworthiness) useful.