GaussianNB

class cuml.naive_bayes.GaussianNB(*, priors=None, var_smoothing=1e-09, output_type=None, verbose=False)

Gaussian Naive Bayes (GaussianNB)

Model parameters can be updated online via partial_fit(). For details on the algorithm used to update feature means and variances online, see Stanford CS tech report STAN-CS-79-773 by Chan, Golub, and LeVeque:

http://i.stanford.edu/pub/cstr/reports/cs/tr/79/773/CS-TR-79-773.pdf
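The pairwise combination formula from that report can be sketched for a single feature in plain Python. This is only an illustration of the technique: the function names `merge_moments` and `batch_moments` are hypothetical, and whether cuML applies the formula in exactly this form is an assumption.

```python
def batch_moments(values):
    """Return (count, mean, population variance) of one batch."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return n, mean, var


def merge_moments(n_a, mean_a, var_a, n_b, mean_b, var_b):
    """Combine the moments of two batches without revisiting the data."""
    if n_a == 0:
        return n_b, mean_b, var_b
    if n_b == 0:
        return n_a, mean_a, var_a
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    # Sums of squared deviations of both batches, plus a correction
    # term that accounts for the gap between the two batch means.
    m2 = var_a * n_a + var_b * n_b + delta * delta * n_a * n_b / n
    return n, mean, m2 / n


first = [1.0, 2.0, 4.0]
second = [7.0, 11.0]

# Merging the per-batch moments should match a single pass over all data.
merged = merge_moments(*batch_moments(first), *batch_moments(second))
direct = batch_moments(first + second)
print(merged, direct)
```

Because only the count, mean, and variance are carried between batches, each new chunk can be folded in without keeping earlier chunks in memory.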

Parameters:

priors : array-like of shape (n_classes,): Prior probabilities of the classes. If specified, the priors are not adjusted according to the data.
var_smoothing : float, default=1e-9: Portion of the largest variance of all features that is added to variances for calculation stability.
output_type : {‘input’, ‘array’, ‘dataframe’, ‘series’, ‘df_obj’, ‘numba’, ‘cupy’, ‘numpy’, ‘cudf’, ‘pandas’}, default=None: Return results and set estimator attributes to the indicated output type. If None, the output type set at the module level (cuml.global_settings.output_type) will be used. See Output Data Type Configuration for more info.
verbose : int or boolean, default=False: Sets logging level. It must be one of cuml.common.logger.level_*. See Verbosity Levels for more info.
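As a rough illustration of the var_smoothing description above, the following pure-Python sketch adds a fraction of the largest per-feature variance to every variance. The helper name `smooth_variances` is hypothetical, and exactly which variances cuML applies the term to (for example, per-class variances) is not stated here, so treat that detail as an assumption.

```python
def smooth_variances(variances, var_smoothing=1e-9):
    """Add a fraction of the largest variance to every variance."""
    epsilon = var_smoothing * max(variances)
    return [v + epsilon for v in variances]


# A feature with zero variance would otherwise cause a division by zero
# when the Gaussian likelihood is evaluated.
raw = [4.0, 0.0, 0.25]
smoothed = smooth_variances(raw)
print(smoothed)
```

Scaling the added term by the largest variance keeps the adjustment proportional to the magnitude of the data instead of using a fixed absolute constant.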

Attributes:

class_prior_

Methods

`fit`(X, y[, sample_weight])	Fit Gaussian Naive Bayes classifier according to X, y.
`partial_fit`(X, y[, classes, sample_weight])	Incremental fit on a batch of samples.

Examples

>>> import cupy as cp
>>> from cuml.naive_bayes import GaussianNB
>>> X = cp.array(
...     [[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]],
...     dtype=cp.float32
... )
>>> y = cp.array([1, 1, 1, 2, 2, 2])
>>> clf = GaussianNB().fit(X, y)
>>> print(clf.predict(cp.array([[-0.8, -1]], cp.float32)))
[1]

fit(X, y, sample_weight=None) → GaussianNB

Fit Gaussian Naive Bayes classifier according to X, y.

Parameters:

X : {array-like, cupy sparse matrix} of shape (n_samples, n_features): Training vectors, where n_samples is the number of samples and n_features is the number of features.
y : array-like of shape (n_samples): Target values.
sample_weight : array-like of shape (n_samples): Weights applied to individual samples.

partial_fit(X, y, classes=None, sample_weight=None) → GaussianNB

Incremental fit on a batch of samples.

This method is expected to be called several times consecutively on different chunks of a dataset so as to implement out-of-core or online learning.

This is especially useful when the whole dataset is too big to fit in memory at once.

Each call carries some performance overhead, so it is better to call partial_fit on chunks of data that are as large as possible (while still fitting within the memory budget) to amortize that overhead.

Parameters:

X : {array-like, cupy sparse matrix} of shape (n_samples, n_features): Training vectors, where n_samples is the number of samples and n_features is the number of features. A sparse matrix in COO format is preferred; other formats will go through a conversion to COO.
y : array-like of shape (n_samples): Target values.
classes : array-like of shape (n_classes): List of all the classes that can possibly appear in the y vector. Must be provided at the first call to partial_fit; can be omitted in subsequent calls.
sample_weight : array-like of shape (n_samples): Weights applied to individual samples.

Returns:

self : object
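The chunked calling pattern described for partial_fit can be sketched with a small helper. The name `fit_in_chunks` is hypothetical and not part of cuML; the helper only assumes that the model exposes partial_fit(X, y, classes=...) as documented above and that X and y support len() and slicing.

```python
def fit_in_chunks(model, X, y, classes, chunk_size):
    """Feed (X, y) to model.partial_fit one slice at a time."""
    n_samples = len(y)
    for start in range(0, n_samples, chunk_size):
        stop = start + chunk_size
        if start == 0:
            # The first call must declare every class that can appear in y.
            model.partial_fit(X[start:stop], y[start:stop], classes=classes)
        else:
            # Later calls may omit the class list.
            model.partial_fit(X[start:stop], y[start:stop])
    return model
```

With cuML, a call might look like `fit_in_chunks(GaussianNB(), X, y, classes=cp.unique(y), chunk_size=100_000)`, where X and y are CuPy arrays and the chunk size is an arbitrary illustrative value. In a genuinely out-of-core setting, each chunk would instead be loaded from storage just before its partial_fit call.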