MultinomialNB

class cuml.dask.naive_bayes.MultinomialNB(*, client=None, verbose=False, **kwargs)

Distributed Naive Bayes classifier for multinomial models

Methods

fit(X, y[, classes])

Fit distributed Naive Bayes classifier model

predict(X)

Use distributed Naive Bayes model to predict the classes for a given set of data samples.

score(X, y)

Compute accuracy score

Examples

Load the 20 newsgroups dataset from scikit-learn and train a distributed Naive Bayes classifier on a local CUDA cluster.

>>> import cupy as cp

>>> from sklearn.datasets import fetch_20newsgroups
>>> from sklearn.feature_extraction.text import CountVectorizer

>>> from dask_cuda import LocalCUDACluster
>>> from dask.distributed import Client
>>> import dask.array
>>> from cuml.dask.common import to_sparse_dask_array
>>> from cuml.dask.naive_bayes import MultinomialNB

>>> # Create a local CUDA cluster
>>> cluster = LocalCUDACluster()
>>> client = Client(cluster)

>>> # Load corpus
>>> twenty_train = fetch_20newsgroups(subset='train',
...                           shuffle=True, random_state=42)

>>> cv = CountVectorizer()
>>> xformed = cv.fit_transform(twenty_train.data).astype(cp.float32)
>>> X = to_sparse_dask_array(xformed, client)
>>> y = dask.array.from_array(twenty_train.target, asarray=False,
...                       fancy=False).astype(cp.int32)

>>> # Train model
>>> model = MultinomialNB()
>>> model.fit(X, y)
<cuml.dask.naive_bayes.naive_bayes.MultinomialNB object at 0x...>

>>> # Compute accuracy on training set
>>> model.score(X, y)
array(0.924...)
>>> client.close()
>>> cluster.close()

fit(X, y, classes=None)

Fit distributed Naive Bayes classifier model

Parameters:
X : dask.Array with blocks containing dense or sparse cupy arrays
y : dask.Array with blocks containing cupy.ndarray
classes : array-like containing unique class labels
Returns:
cuml.dask.naive_bayes.MultinomialNB, the current model instance
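To make the idea of a distributed fit concrete, here is a minimal pure-Python sketch of multinomial Naive Bayes training over two partitions. It relies on the fact that per-class sample counts and feature counts are additive, so each block can be counted independently and the results summed. This is an illustration under stated assumptions, not cuML's implementation: the helper names (partial_counts, merge_counts, finalize), the smoothing value alpha=1.0, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def partial_counts(X_block, y_block, n_features):
    """Count samples and feature occurrences per class for one partition."""
    class_count = Counter()
    feature_count = {}
    for row, label in zip(X_block, y_block):
        class_count[label] += 1
        totals = feature_count.setdefault(label, [0] * n_features)
        for j, value in enumerate(row):
            totals[j] += value
    return class_count, feature_count


def merge_counts(parts, n_features):
    """Sum the per-partition counts; counts are additive across blocks."""
    class_count = Counter()
    feature_count = {}
    for part_classes, part_features in parts:
        class_count.update(part_classes)
        for label, totals in part_features.items():
            merged = feature_count.setdefault(label, [0] * n_features)
            for j, value in enumerate(totals):
                merged[j] += value
    return class_count, feature_count


def finalize(class_count, feature_count, n_features, alpha=1.0):
    """Turn counts into log priors and additively smoothed log likelihoods."""
    n_samples = sum(class_count.values())
    log_prior = {c: math.log(n / n_samples) for c, n in class_count.items()}
    log_likelihood = {}
    for c, totals in feature_count.items():
        denom = sum(totals) + alpha * n_features
        log_likelihood[c] = [math.log((t + alpha) / denom) for t in totals]
    return log_prior, log_likelihood


# Two hypothetical partitions of a tiny word-count matrix with 3 features.
block_a = ([[2, 1, 0], [0, 0, 3]], [0, 1])
block_b = ([[1, 2, 0], [0, 1, 2]], [0, 1])

parts = [partial_counts(X, y, 3) for X, y in (block_a, block_b)]
class_count, feature_count = merge_counts(parts, 3)
log_prior, log_likelihood = finalize(class_count, feature_count, 3)
```

Because only counts cross partition boundaries, the merged model is the same one a single-process fit on the concatenated data would produce.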

predict(X)

Use distributed Naive Bayes model to predict the classes for a given set of data samples.

Parameters:
X : dask.Array with blocks containing dense or sparse cupy arrays
Returns:
dask.Array containing predicted classes
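As a rough illustration of what prediction means for a multinomial model, the sketch below scores each sample against every class and returns the class with the highest joint log probability (log prior plus the count-weighted sum of per-feature log likelihoods). It is a single-process, pure-Python sketch and not cuML's code; the parameter values and the function names predict_one and predict are hypothetical.

```python
import math

# Hypothetical fitted parameters for 2 classes and 3 features.
log_prior = {0: math.log(0.5), 1: math.log(0.5)}
log_likelihood = {
    0: [math.log(4 / 9), math.log(4 / 9), math.log(1 / 9)],
    1: [math.log(1 / 9), math.log(2 / 9), math.log(6 / 9)],
}


def predict_one(row):
    """Return the class with the highest joint log probability for one sample."""
    scores = {
        c: log_prior[c] + sum(x * w for x, w in zip(row, log_likelihood[c]))
        for c in log_prior
    }
    return max(scores, key=scores.get)


def predict(X):
    """Predict a class label for every row of feature counts in X."""
    return [predict_one(row) for row in X]


predictions = predict([[3, 1, 0], [0, 0, 4]])
```

Each row is scored independently of the others, which is why prediction can be applied block by block to a partitioned array.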

score(X, y)

Compute accuracy score

Parameters:
X : dask.Array

Features to predict. Note: the chunk sizes and shape of X are assumed to be known. For a fully delayed array, they can be computed by calling X.compute_chunk_sizes().

y : dask.Array

Labels to use for computing accuracy. Note: the chunk sizes and shape of y are assumed to be known. For a fully delayed array, they can be computed by calling y.compute_chunk_sizes().

Returns:
score : float, the resulting accuracy score
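For reference, accuracy is the fraction of samples whose predicted label equals the true label. The following pure-Python sketch shows that calculation on plain lists; the function name accuracy and the sample labels are hypothetical, and this is not how cuML computes the score on distributed arrays.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty set of labels")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Three of the four predictions match the true labels.
result = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
```

The length check mirrors the requirement above that the shapes of X and y are known before scoring.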