MultinomialNB
- class cuml.dask.naive_bayes.MultinomialNB(*, client=None, verbose=False, **kwargs)
Distributed Naive Bayes classifier for multinomial models
Methods
fit(X, y[, classes]): Fit distributed Naive Bayes classifier model
predict(X): Use distributed Naive Bayes model to predict the classes for a given set of data samples.
score(X, y): Compute accuracy score
Examples
Load the 20 newsgroups dataset from Scikit-learn and train a Naive Bayes classifier.
>>> import cupy as cp
>>> from sklearn.datasets import fetch_20newsgroups
>>> from sklearn.feature_extraction.text import CountVectorizer
>>> from dask_cuda import LocalCUDACluster
>>> from dask.distributed import Client
>>> import dask
>>> from cuml.dask.common import to_sparse_dask_array
>>> from cuml.dask.naive_bayes import MultinomialNB
>>> # Create a local CUDA cluster
>>> cluster = LocalCUDACluster()
>>> client = Client(cluster)
>>> # Load corpus
>>> twenty_train = fetch_20newsgroups(subset='train',
...                                   shuffle=True, random_state=42)
>>> cv = CountVectorizer()
>>> xformed = cv.fit_transform(twenty_train.data).astype(cp.float32)
>>> X = to_sparse_dask_array(xformed, client)
>>> y = dask.array.from_array(twenty_train.target, asarray=False,
...                           fancy=False).astype(cp.int32)
>>> # Train model
>>> model = MultinomialNB()
>>> model.fit(X, y)
<cuml.dask.naive_bayes.naive_bayes.MultinomialNB object at 0x...>
>>> # Compute accuracy on training set
>>> model.score(X, y)
array(0.924...)
>>> client.close()
>>> cluster.close()
- fit(X, y, classes=None)
Fit distributed Naive Bayes classifier model
- Parameters:
- X : dask.Array with blocks containing dense or sparse cupy arrays
- y : dask.Array with blocks containing cupy.ndarray
- classes : array-like containing unique class labels
- Returns:
- cuml.dask.naive_bayes.MultinomialNB: the current model instance
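To illustrate why fitting a multinomial model lends itself to block-wise data, here is a minimal pure-Python sketch in which each block of rows yields per-class feature counts and class counts that are then summed. It is not cuML's implementation; the helper names `partial_counts` and `merge_counts` and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def partial_counts(X_block, y_block, n_features):
    """Per-class feature counts and class counts for one block of rows."""
    feature_counts = defaultdict(lambda: [0.0] * n_features)
    class_counts = defaultdict(int)
    for row, label in zip(X_block, y_block):
        class_counts[label] += 1
        for j, value in enumerate(row):
            feature_counts[label][j] += value
    return feature_counts, class_counts


def merge_counts(partials, n_features):
    """Sum the per-block counts into totals for the whole dataset."""
    total_features = defaultdict(lambda: [0.0] * n_features)
    total_classes = defaultdict(int)
    for feature_counts, class_counts in partials:
        for label, counts in feature_counts.items():
            for j, value in enumerate(counts):
                total_features[label][j] += value
        for label, n in class_counts.items():
            total_classes[label] += n
    return total_features, total_classes


# Two toy blocks of (term-count rows, labels), standing in for array chunks.
blocks = [
    ([[2, 0, 1], [0, 3, 0]], [0, 1]),
    ([[1, 1, 0], [0, 2, 2]], [0, 1]),
]
partials = [partial_counts(X, y, 3) for X, y in blocks]
feature_counts, class_counts = merge_counts(partials, 3)
```

Because the counts are plain sums, the blocks can be processed in any order or in parallel and merged afterwards.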
- predict(X)
Use distributed Naive Bayes model to predict the classes for a given set of data samples.
- Parameters:
- X : dask.Array with blocks containing dense or sparse cupy arrays
- Returns:
- dask.Array containing predicted classes
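As a rough illustration of what prediction with a multinomial Naive Bayes model involves, the sketch below scores one row against each class using the log prior plus smoothed log feature probabilities, then picks the highest-scoring class. It is a pure-Python approximation, not cuML's code; the function `predict_one`, the smoothing parameter `alpha` (Laplace smoothing, assumed to be 1.0), and the counts are all hypothetical.

```python
import math


def predict_one(row, feature_counts, class_counts, alpha=1.0):
    """Return the class label with the highest joint log-likelihood."""
    n_features = len(row)
    n_samples = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        counts = feature_counts[label]
        # Smoothed denominator: total count for the class plus alpha per feature.
        total = sum(counts) + alpha * n_features
        # Log prior from the class frequency.
        score = math.log(class_counts[label] / n_samples)
        for j, value in enumerate(row):
            score += value * math.log((counts[j] + alpha) / total)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical fitted counts: per-class feature totals and class frequencies.
feature_counts = {0: [3.0, 1.0, 1.0], 1: [0.0, 5.0, 2.0]}
class_counts = {0: 2, 1: 2}

rows = [[2, 0, 0], [0, 2, 1]]
predictions = [predict_one(r, feature_counts, class_counts) for r in rows]
```

Each row is scored independently of the others, so the same function could be applied block by block.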
- score(X, y)
Compute accuracy score
- Parameters:
- X : dask.Array
Features to predict. Note: the chunk sizes and shape of X are assumed to be known. For a fully delayed array, compute them by calling X.compute_chunk_sizes().
- y : dask.Array
Labels to use for computing accuracy. Note: the chunk sizes and shape of X are assumed to be known. For a fully delayed array, compute them by calling X.compute_chunk_sizes().
- Returns:
- score : float, the resulting accuracy score
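The accuracy score is the fraction of samples whose predicted label matches the true label. The sketch below computes it chunk by chunk in pure Python, which may help explain the note about known chunk sizes: the total number of samples is needed as the denominator. The function `chunked_accuracy` and its data are hypothetical and do not reflect how cuML computes the score internally.

```python
def chunked_accuracy(chunks):
    """Accuracy over an iterable of (y_true, y_pred) chunks of equal length."""
    correct = 0
    total = 0
    for y_true, y_pred in chunks:
        if len(y_true) != len(y_pred):
            raise ValueError("y_true and y_pred chunks must have the same length")
        correct += sum(1 for t, p in zip(y_true, y_pred) if t == p)
        total += len(y_true)
    if total == 0:
        raise ValueError("cannot compute accuracy on zero samples")
    return correct / total


# Two toy chunks: 2 of 2 correct, then 1 of 2 correct.
chunks = [
    ([0, 1], [0, 1]),
    ([1, 0], [0, 0]),
]
score = chunked_accuracy(chunks)
```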