Introduction
============

cuML accelerates machine learning on GPUs. The library follows a few key
principles, and understanding these will help you take full advantage of cuML.

1. Where possible, match the scikit-learn API
---------------------------------------------

cuML estimators look and feel just like `scikit-learn estimators`_. You
initialize them with key parameters, fit them with a ``fit`` method, then call
``predict`` or ``transform`` for inference.

.. code-block:: python

    import cuml

    # Initialize and fit the model
    model = cuml.LinearRegression()
    model.fit(X_train, y)

    # Predict on new data
    y_prediction = model.predict(X_test)

You can find many more complete examples in the `Introductory Notebook`_ and in
the cuML API documentation.

2. Accept flexible input types, return predictable output types
----------------------------------------------------------------

cuML estimators can accept NumPy arrays, cuDF dataframes, cuPy arrays, 2D
PyTorch tensors, and practically any other standards-based Python array input
you can throw at them. This relies on the ``__array__`` and
``__cuda_array_interface__`` standards, which are widely used throughout the
PyData community.

By default, outputs mirror the data type you provided. So, if you fit a model
with a NumPy array, the ``model.coef_`` property containing fitted coefficients
will also be a NumPy array. If you fit a model using cuDF's GPU-based DataFrame
and Series objects, the model's output properties will be cuDF objects. A short
sketch of this behavior appears at the end of this page.

You can always override this behavior and select a default datatype with the
`memory_utils.set_global_output_type`_ function. The `RAPIDS Configurable Input
and Output Types`_ blog post explains this approach in much more detail.

3. Be fast!
-----------

cuML's estimators rely on highly optimized CUDA primitives and algorithms
within ``libcuml``. On a modern GPU, these can exceed the performance of
CPU-based equivalents by a factor of anything from 4x (for a medium-sized
linear regression) to over 1000x (for large-scale tSNE dimensionality
reduction). The `cuml.benchmark`_ module provides an easy interface to
benchmark your own hardware; a minimal timing sketch at the end of this page
shows a quick way to get a rough first impression.

To maximize performance, keep in mind that a modern GPU can have over 5,000
cores, so make sure you're providing enough data to keep it busy! In many
cases, the performance advantage grows as the dataset does.

Learn more
----------

To get started learning cuML, walk through the `Introductory Notebook`_. Then
try out some of the other notebook examples in the ``notebooks`` directory of
the repository. Finally, do a deeper dive with the `cuML blogs`_.
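Before diving into those resources, two short sketches illustrate principles 2
and 3 in code. The first is a minimal, illustrative sketch of the output-type
mirroring described above. It assumes ``set_global_output_type`` is available
from the top-level ``cuml`` namespace (as well as from ``memory_utils``), and
the exact output classes may vary between cuML releases.

.. code-block:: python

    import numpy as np

    import cuml

    # Small synthetic dataset; float32 is the most GPU-friendly dtype.
    X = np.random.rand(1000, 2).astype(np.float32)
    y = np.random.rand(1000).astype(np.float32)

    # NumPy in, NumPy out: fitted attributes mirror the input type.
    model = cuml.LinearRegression()
    model.fit(X, y)
    print(type(model.coef_))  # expected: a NumPy array

    # Override the default so estimators return cuDF objects instead,
    # regardless of the input type.
    cuml.set_global_output_type("cudf")

    model = cuml.LinearRegression()
    model.fit(X, y)
    print(type(model.coef_))  # expected: a cuDF object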
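The second sketch gives a rough feel for principle 3 by timing the same fit in
cuML and scikit-learn. This is only a hand-rolled comparison, not a substitute
for the ``cuml.benchmark`` module mentioned above; the dataset shape is
arbitrary, and the speedup you observe will depend heavily on your GPU, your
CPU, and the size of the data.

.. code-block:: python

    import time

    import numpy as np
    from sklearn.linear_model import LinearRegression as skLinearRegression

    import cuml

    # A reasonably large dataset; GPUs need enough work to stay busy.
    X = np.random.rand(1_000_000, 25).astype(np.float32)
    y = np.random.rand(1_000_000).astype(np.float32)

    # Warm up the GPU (context creation, kernel loading) before timing.
    cuml.LinearRegression().fit(X[:1000], y[:1000])

    # Time the GPU-based fit.
    start = time.perf_counter()
    cuml.LinearRegression().fit(X, y)
    gpu_seconds = time.perf_counter() - start

    # Time the equivalent CPU-based fit.
    start = time.perf_counter()
    skLinearRegression().fit(X, y)
    cpu_seconds = time.perf_counter() - start

    print(f"cuML: {gpu_seconds:.2f}s, scikit-learn: {cpu_seconds:.2f}s")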