Limitations#
The cuml.accel zero-code-change accelerator is currently a beta feature. As such, it has a number of known limitations and bugs. The team is working to address these and expects the number of limitations to shrink with every release.
These limitations fall into a few categories:
- Estimators that are fully unaccelerated. For example, while we currently provide GPU acceleration for models like sklearn.linear_model.Ridge, we don't accelerate other models like sklearn.linear_model.BayesianRidge. Unaccelerated estimators won't result in bugs or failures, but they also won't run any faster than they would under sklearn. If you don't see an estimator listed on this page, we do not provide acceleration for it.
- Estimators that are only partially accelerated. cuml.accel will fall back to using the CPU implementations for some algorithms in the presence of certain hyperparameters or input types. These cases are documented below in estimator-specific sections. See Logging and Profiling for how to enable logging to gain insight into when cuml.accel needs to fall back to CPU.
- Missing fitted attributes. cuml.accel does not currently generate the full set of fitted attributes that sklearn does. In most cases this is not a problem; the missing attributes are usually minor things like n_iter_ that are useful for inspecting a model fit but not necessary for inference. Like unsupported parameters, missing fitted attributes are documented in algorithm-specific sections below.
- Differences between fit models. The algorithms and implementations used in cuml naturally differ from those used in sklearn; this may result in differences between fit models. This is to be expected. To compare results between models fit with cuml.accel and those fit without, compare the model quality (using e.g. model.score) rather than the numeric values of the fitted coefficients (a short sketch of this comparison appears below).
None of the above should result in bugs (exceptions, failures, poor model quality, and so on). That said, as this is a beta feature, some bugs are likely. If you find a case that errors or produces a model with measurably worse quality when run under cuml.accel, please open an issue.
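As an illustration of the recommended comparison, here is a minimal sketch (using a synthetic dataset purely for demonstration). Run it once with cuml.accel enabled and once without, and compare the printed scores rather than the fitted coefficients:

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic data stands in for a real dataset
X, y = make_regression(n_samples=10_000, n_features=50, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = Ridge().fit(X_train, y_train)
# Compare this R^2 value between the accelerated and unaccelerated runs;
# coef_ may differ even when the scores agree closely.
print(model.score(X_test, y_test))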
A few additional general notes:
- Performance improvements will be most apparent when running on larger data. On very small datasets you might see only a small speedup (or even potentially a slowdown).
- For most algorithms, y must already be converted to numeric values; arrays of strings are not supported. Pre-encode string labels into numeric or categorical form (e.g., using scikit-learn's LabelEncoder) prior to processing; see the sketch after these notes.
- The accelerator is compatible with scikit-learn version 1.4 or higher. This compatibility ensures that cuML's implementation of scikit-learn compatible APIs works as expected.
- Error and warning messages and formats may differ from scikit-learn. Some errors might present as C++ stack traces instead of Python errors.
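For example, a minimal sketch of pre-encoding string labels with scikit-learn's LabelEncoder before fitting:

from sklearn.preprocessing import LabelEncoder

y_raw = ["cat", "dog", "dog", "bird", "cat"]  # string labels are not supported directly
y = LabelEncoder().fit_transform(y_raw)       # array([1, 2, 2, 0, 1]), suitable to pass to fit()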
For notes on each algorithm, please refer to its specific section below.
hdbscan#
HDBSCAN will fall back to CPU in the following cases:
- If metric is not "l2" or "euclidean".
- If a memory location is configured.
- If match_reference_implementation=True.
- If branch_detection_data=True.
Additionally, the following fitted attributes are currently not computed:
exemplars_
outlier_scores_
relative_validity_
Additional notes:
- The HDBSCAN implementation in cuml uses a parallel MST implementation, which means results are not deterministic when there are duplicates in the mutual reachability graph.
sklearn.cluster#
The algorithms used in cuml differ from those in sklearn. As such, you shouldn't expect the fitted attributes (e.g. labels_) to numerically match an estimator fitted without cuml.accel.
To compare results between estimators, we recommend comparing scores like sklearn.metrics.adjusted_rand_score or sklearn.metrics.adjusted_mutual_info_score. For low dimensional data you can also visually inspect the resulting cluster assignments.
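As a minimal sketch (with synthetic data for demonstration), run the following once with cuml.accel enabled and once without, and compare the adjusted Rand scores rather than the raw labels_ values:

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

X, y_true = make_blobs(n_samples=5_000, centers=5, random_state=0)
labels = KMeans(n_clusters=5, random_state=0).fit_predict(X)
# Scores near 1.0 in both runs indicate comparable clustering quality,
# even if the label values themselves differ.
print(adjusted_rand_score(y_true, labels))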
KMeans#
KMeans will fall back to CPU in the following cases:
- If a callable init is provided.
- If X is sparse.
DBSCAN#
DBSCAN will fall back to CPU in the following cases:
- If algorithm isn't "auto" or "brute".
- If metric isn't one of the supported metrics ("l2", "euclidean", "cosine", "precomputed").
- If X is sparse.
sklearn.decomposition#
The sklearn.decomposition implementations used by cuml.accel use different SVD solvers than the ones in scikit-learn, which may result in numeric differences in the components_ and explained_variance_ values. These differences should be small for most algorithms, but may be larger for randomized or less numerically stable solvers like "randomized" or "covariance_eigh".
Likewise, note that the implementation in cuml.accel may currently result in some of the vectors in components_ having inverted signs. This result is not incorrect, but it can make direct numeric comparisons harder without first normalizing the signs. One common way of handling this is to normalize the first non-zero value in each vector to be positive. You might find the following numpy function useful for this:
import numpy as np

def normalize(components):
    """Normalize the sign of components for easier numeric comparison."""
    nonzero = components != 0
    inds = np.where(nonzero.any(axis=1), nonzero.argmax(axis=1), 0)[:, None]
    first_nonzero = np.take_along_axis(components, inds, 1)
    return np.sign(first_nonzero) * components
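For example, assuming pca_accel and pca_cpu are hypothetical PCA estimators fitted on the same data in runs with and without cuml.accel respectively, the sign-normalized components can then be compared directly (using the normalize function and numpy import above):

# pca_accel and pca_cpu are hypothetical fitted estimators from runs
# with and without cuml.accel, respectively
close = np.allclose(
    normalize(pca_accel.components_),
    normalize(pca_cpu.components_),
    atol=1e-4,  # tolerate small numeric differences between solvers
)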
PCA#
PCA will fall back to CPU in the following cases:
- If n_components="mle".
Additional notes:
- Parameters for the "randomized" solver like random_state, n_oversamples, and power_iteration_normalizer are ignored.
TruncatedSVD#
TruncatedSVD will fall back to CPU in the following cases:
- If X is sparse.
Additional notes:
- Parameters for the "randomized" solver like random_state, n_oversamples, and power_iteration_normalizer are ignored.
sklearn.ensemble#
The random forest implementation used by cuml.accel algorithmically differs from the one in sklearn. As such, you shouldn't expect the fitted attributes (e.g. estimators_) to numerically match an estimator fitted without cuml.accel.
To compare results between estimators, we recommend comparing scores like sklearn.metrics.root_mean_squared_error (for regression) or sklearn.metrics.log_loss (for classification).
RandomForestClassifier#
RandomForestClassifier will fall back to CPU in the following cases:
- If criterion is "log_loss".
- If oob_score=True.
- If warm_start=True.
- If monotonic_cst is not None.
- If max_samples is an integer.
- If min_weight_fraction_leaf is not 0.
- If ccp_alpha is not 0.
- If class_weight is not None.
- If sample_weight is passed to fit or score.
- If X is sparse.
Additionally, the following fitted attributes are currently not computed:
feature_importances_
estimators_samples_
RandomForestRegressor#
RandomForestRegressor will fall back to CPU in the following cases:
- If criterion is "absolute_error" or "friedman_mse".
- If oob_score=True.
- If warm_start=True.
- If monotonic_cst is not None.
- If max_samples is an integer.
- If min_weight_fraction_leaf is not 0.
- If ccp_alpha is not 0.
- If sample_weight is passed to fit or score.
- If X is sparse.
Additionally, the following fitted attributes are currently not computed:
feature_importances_
estimators_samples_
sklearn.kernel_ridge#
KernelRidge#
KernelRidge will fall back to CPU in the following cases:
- If X is sparse.
KernelRidge results should be almost identical to those of scikit-learn when running with cuml.accel enabled. In particular, the fitted dual_coef_ values should be close enough that they may be compared via np.allclose.
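For example, a minimal sketch assuming kr_accel and kr_cpu are hypothetical KernelRidge estimators fitted on the same data with and without cuml.accel:

import numpy as np

# kr_accel and kr_cpu are hypothetical fitted estimators from runs
# with and without cuml.accel, respectively
np.allclose(kr_accel.dual_coef_, kr_cpu.dual_coef_, rtol=1e-4, atol=1e-6)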
sklearn.linear_model#
The linear model solvers used by cuml.accel differ from those used in sklearn. As such, you shouldn't expect the fitted attributes (e.g. coef_) to numerically match an estimator fitted without cuml.accel. For some estimators (e.g. LinearRegression) you might get a close match, but for others there may be larger numeric differences.
To compare results between estimators, we recommend comparing model quality scores like sklearn.metrics.r2_score (for regression) or sklearn.metrics.accuracy_score (for classification).
LinearRegression#
LinearRegression will fall back to CPU in the following cases:
- If positive=True.
- If X is sparse.
Additionally, the following fitted attributes are currently not computed:
rank_
singular_
LogisticRegression#
LogisticRegression will fall back to CPU in the following cases:
- If warm_start=True.
- If intercept_scaling is not 1.
- If the deprecated multi_class parameter is used.
ElasticNet#
ElasticNet will fall back to CPU in the following cases:
- If positive=True.
- If warm_start=True.
- If precompute is not False.
- If X is sparse.
Additionally, the following fitted attributes are currently not computed:
dual_gap_
n_iter_
Ridge#
Ridge will fall back to CPU in the following cases:
- If positive=True.
- If solver="lbfgs".
- If X is sparse.
- If X has more columns than rows.
- If y is multioutput.
Additionally, the following fitted attributes are currently not computed:
n_iter_
Lasso#
Lasso will fall back to CPU in the following cases:
- If positive=True.
- If warm_start=True.
- If precompute is not False.
- If X is sparse.
Additionally, the following fitted attributes are currently not computed:
dual_gap_
n_iter_
sklearn.manifold#
TSNE#
TSNE will fall back to CPU in the following cases:
- If n_components is not 2.
- If init is an array.
- If metric isn't one of the supported metrics ("l2", "euclidean", "sqeuclidean", "cityblock", "l1", "manhattan", "minkowski", "chebyshev", "cosine", "correlation").
Additionally, the following fitted attributes are currently not computed:
n_iter_
Additional notes:
- Even with a random_state, the TSNE implementation used by cuml.accel isn't completely deterministic.
While the exact numerical output for TSNE may differ from that obtained without cuml.accel, we expect the quality of results to be approximately as good in most cases. Beyond comparing the visual representation, you may find it useful to compare the trustworthiness score (computed via sklearn.manifold.trustworthiness) or the kl_divergence_ fitted attribute.
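As a minimal sketch (using the scikit-learn digits dataset for demonstration), run the following with and without cuml.accel and compare the trustworthiness values:

from sklearn.datasets import load_digits
from sklearn.manifold import TSNE, trustworthiness

X, _ = load_digits(return_X_y=True)
embedding = TSNE(n_components=2, random_state=0).fit_transform(X)
# Similar trustworthiness values between runs indicate comparable
# embedding quality, even if the coordinates differ.
print(trustworthiness(X, embedding, n_neighbors=5))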
sklearn.neighbors#
NearestNeighbors#
NearestNeighbors will fall back to CPU in the following cases:
- If metric is not one of the supported metrics ("l2", "euclidean", "l1", "cityblock", "manhattan", "taxicab", "canberra", "minkowski", "lp", "chebyshev", "linf", "jensenshannon", "cosine", "correlation", "inner_product", "sqeuclidean", "haversine").
Additional notes:
- The algorithm parameter is ignored; the GPU-accelerated "brute" implementation in cuml will always be used.
- The radius_neighbors method isn't implemented in cuml and will always fall back to CPU.
KNeighborsClassifier#
KNeighborsClassifier will fall back to CPU in the following cases:
- If metric is not one of the supported metrics ("l2", "euclidean", "l1", "cityblock", "manhattan", "taxicab", "canberra", "minkowski", "lp", "chebyshev", "linf", "jensenshannon", "cosine", "correlation", "inner_product", "sqeuclidean", "haversine").
- If weights is not "uniform".
Additional notes:
- The algorithm parameter is ignored; the GPU-accelerated "brute" implementation in cuml will always be used.
KNeighborsRegressor#
KNeighborsRegressor will fall back to CPU in the following cases:
- If metric is not one of the supported metrics ("l2", "euclidean", "l1", "cityblock", "manhattan", "taxicab", "canberra", "minkowski", "lp", "chebyshev", "linf", "jensenshannon", "cosine", "correlation", "inner_product", "sqeuclidean", "haversine").
- If weights is not "uniform".
Additional notes:
- The algorithm parameter is ignored; the GPU-accelerated "brute" implementation in cuml will always be used.
sklearn.svm#
The SVM implementations used by cuml.accel differ from those used in sklearn. As such, you shouldn't expect the fitted attributes (e.g. coef_ or support_vectors_) to numerically match an estimator fitted without cuml.accel.
To compare results between estimators, we recommend comparing model quality scores like sklearn.metrics.r2_score (for regression) or sklearn.metrics.accuracy_score (for classification).
SVC#
SVC will fall back to CPU in the following cases:
- If kernel="precomputed" or kernel is a callable.
- If X is sparse.
- If y is multiclass.
Additionally, the following fitted attributes are currently not computed:
class_weight_
n_iter_
SVR#
SVR will fall back to CPU in the following cases:
- If kernel="precomputed" or kernel is a callable.
- If X is sparse.
Additionally, the following fitted attributes are currently not computed:
n_iter_
LinearSVC#
LinearSVC will fall back to CPU in the following cases:
- If X is sparse.
- If intercept_scaling is not 1.
- If multi_class is not "ovr".
The following fitted attributes are currently not computed:
n_iter_
Additional notes:
- Use of sample weights may not produce exactly equivalent results when compared to replicating data according to weights.
- Models may not always be picklable; in particular, multi-class models may have coefficient shape differences that cause pickling or unpickling to fail.
LinearSVR#
LinearSVR will fall back to CPU in the following cases:
- If X is sparse.
- If intercept_scaling is not 1.
The following fitted attributes are currently not computed:
n_iter_
Additional notes:
- Use of sample weights may not produce exactly equivalent results when compared to replicating data according to weights.
- Models may not be picklable under certain conditions; pickling or unpickling may fail.
umap#
UMAP will fall back to CPU in the following cases:
- If init is not "random" or "spectral".
- If metric is not one of the supported metrics ("l1", "cityblock", "taxicab", "manhattan", "euclidean", "l2", "sqeuclidean", "canberra", "minkowski", "chebyshev", "linf", "cosine", "correlation", "hellinger", "hamming", "jaccard").
- If target_metric is not one of the supported metrics ("categorical", "l2", "euclidean").
- If unique=True.
- If densmap=True.
Additional notes:
- Reproducibility with the use of a seed (the random_state parameter) comes at the relative expense of performance.
- Parallelism during the optimization stage introduces numerical imprecision, which can lead to differences between CPU and GPU results in general.
While the exact numerical output for UMAP may differ from that obtained without cuml.accel, we expect the quality of results to be approximately as good in most cases. Beyond comparing the visual representation, you may find it useful to compare the trustworthiness score (computed via sklearn.manifold.trustworthiness).