set_global_output_type

cuml.set_global_output_type(output_type)

Set the global output type.

This output type will be used by functions and estimator methods.

Note that instead of being set globally, an output type may be set for a limited scope with the using_output_type() context manager, or on the estimator itself with the output_type parameter.
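The two alternatives mentioned above can be sketched as follows. This is a minimal illustration, assuming a working GPU installation of cuml and cupy. The function name scoped_and_estimator_output_types is hypothetical, and the imports live inside it so that merely defining it does not require a GPU.

```python
def scoped_and_estimator_output_types():
    """Return the types of labels_ obtained in two non-global ways.

    Hypothetical helper for illustration only; calling it requires a GPU
    installation of cuml and cupy.
    """
    import cuml
    import cupy as cp

    X = cp.array([[1.0, 4.0, 4.0], [2.0, 2.0, 2.0], [5.0, 1.0, 1.0]])

    # Contextual: the output type applies only inside the with-block.
    model = cuml.DBSCAN(eps=1.0, min_samples=1).fit(X)
    with cuml.using_output_type("numpy"):
        scoped = type(model.labels_)

    # Per estimator: the output type is chosen when the estimator is created.
    numpy_model = cuml.DBSCAN(eps=1.0, min_samples=1, output_type="numpy").fit(X)
    per_estimator = type(numpy_model.labels_)

    return scoped, per_estimator
```

With no global output type configured, both returned types are expected to be numpy.ndarray, while the global setting itself stays untouched.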

Parameters:
output_type : {'input', 'cupy', 'numpy', 'cudf', 'pandas', None}

Desired output type of results and attributes of the estimators.

  • None: No globally configured output type. This is the same as 'input', except in cases where an estimator explicitly sets an output_type.

  • 'input': returns arrays of the same type as the inputs to the function or method. Fitted attributes will be of the same array type as X.

  • 'cupy': returns cupy arrays.

  • 'numpy': returns numpy arrays.

  • 'cudf': returns cudf.Series for single dimensional results and cudf.DataFrame otherwise.

  • 'pandas': returns pandas.Series for single dimensional results and pandas.DataFrame otherwise.
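The options listed above can be restated as a small lookup. The sketch below is not part of cuml: expected_container is a hypothetical helper that only returns the documented container name as a string, and for None it assumes that no estimator-level output_type has been set.

```python
def expected_container(output_type, ndim):
    """Name the container documented for a result with `ndim` dimensions.

    Hypothetical helper; it restates the documented rules as strings and
    does not call into cuml.
    """
    if output_type in (None, "input"):
        # Mirrors the input type. For None this assumes the estimator did
        # not explicitly set its own output_type.
        return "same type as the input"
    if output_type in ("cupy", "numpy"):
        return f"{output_type}.ndarray"
    if output_type in ("cudf", "pandas"):
        # Series for single dimensional results, DataFrame otherwise.
        kind = "Series" if ndim == 1 else "DataFrame"
        return f"{output_type}.{kind}"
    raise ValueError(f"unknown output_type: {output_type!r}")
```

For example, expected_container("cudf", 1) gives "cudf.Series", while expected_container("pandas", 2) gives "pandas.DataFrame".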

Notes

cupy is the most efficient output type, as it supports flexible memory layouts and doesn’t require device <-> host transfers.

cudf has slightly more overhead for single dimensional outputs. For two dimensional outputs additional copies may be needed due to memory layout requirements of cudf.DataFrame.

numpy and pandas have more significant overhead, as they require device <-> host transfers. Whether that overhead matters is application-specific.
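One way to act on these notes is to pick the cheapest output type that downstream code can still consume. The sketch below is hypothetical and not part of cuml: OVERHEAD_RANK and cheapest_output_type are illustrative names, and the ranks only encode the ordering described above, not measured costs.

```python
# Relative overhead tiers taken from the notes above (lower is cheaper).
# These are ranks only, not measurements.
OVERHEAD_RANK = {"cupy": 0, "cudf": 1, "numpy": 2, "pandas": 2}


def cheapest_output_type(acceptable):
    """Pick the lowest-overhead type among those the caller can consume."""
    acceptable = list(acceptable)
    candidates = [t for t in acceptable if t in OVERHEAD_RANK]
    if not candidates:
        raise ValueError(f"no known output type among: {acceptable!r}")
    # min() returns the first minimum, so ties keep the caller's order.
    return min(candidates, key=OVERHEAD_RANK.__getitem__)
```

The result could then be passed to cuml.set_global_output_type. Since numpy and pandas share a tier here, the caller's ordering decides between them.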

Examples

>>> import cuml
>>> import cupy as cp
>>> import cudf
>>> original_output_type = cuml.global_settings.output_type

Fit a model with a cupy array. By default the fitted attributes will be cupy arrays.

>>> X = cp.array([[1.0, 4.0, 4.0], [2.0, 2.0, 2.0], [5.0, 1.0, 1.0]])
>>> model = cuml.DBSCAN(eps=1.0, min_samples=1).fit(X)
>>> isinstance(model.labels_, cp.ndarray)
True

Once a global output type is set, however, the fitted attributes will match the configured output type.

>>> cuml.set_global_output_type("cudf")
>>> isinstance(model.labels_, cudf.Series)
True

Reset the output type back to its original value.

>>> cuml.set_global_output_type(original_output_type)
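If code between setting and resetting the global output type can raise, the reset step is easy to miss. A minimal sketch of a more defensive save-and-restore pattern follows. The helper run_with_global_output_type and its cuml_module parameter are hypothetical; the parameter exists only so the pattern can be exercised with a stand-in object on a machine without a GPU.

```python
def run_with_global_output_type(output_type, func, cuml_module=None):
    """Call func() under a temporary global output type, then restore it.

    Hypothetical helper; `cuml_module` defaults to the real cuml package.
    """
    if cuml_module is None:
        import cuml as cuml_module

    # Remember the current setting, as in the example above.
    original = cuml_module.global_settings.output_type
    cuml_module.set_global_output_type(output_type)
    try:
        return func()
    finally:
        # Runs even if func() raises, so the previous value always returns.
        cuml_module.set_global_output_type(original)
```

For scoped changes like this, the using_output_type() context manager is the documented alternative and avoids touching the global setting by hand.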