BernoulliNB#

class cuml.naive_bayes.BernoulliNB(*, alpha=1.0, binarize=0.0, fit_prior=True, class_prior=None, output_type=None, verbose=False)[source]#

Naive Bayes classifier for multivariate Bernoulli models.

Like MultinomialNB, this classifier is suitable for discrete data. The difference is that while MultinomialNB works with occurrence counts, BernoulliNB is designed for binary/boolean features.

Parameters:
alphafloat, default=1.0

Additive (Laplace/Lidstone) smoothing parameter (0 for no smoothing).

binarizefloat or None, default=0.0

Threshold for binarizing (mapping to booleans) of sample features. If None, input is presumed to already consist of binary vectors.

fit_priorbool, default=True

Whether to learn class prior probabilities or not. If false, a uniform prior will be used.

class_priorarray-like of shape (n_classes,), default=None

Prior probabilities of the classes. If specified the priors are not adjusted according to the data.

output_type{‘input’, ‘array’, ‘dataframe’, ‘series’, ‘df_obj’, ‘numba’, ‘cupy’, ‘numpy’, ‘cudf’, ‘pandas’}, default=None

Return results and set estimator attributes to the indicated output type. If None, the output type set at the module level (cuml.global_settings.output_type) will be used. See Output Data Type Configuration for more info.

verboseint or boolean, default=False

Sets logging level. It must be one of cuml.common.logger.level_*. See Verbosity Levels for more info.

Attributes:
class_count_ndarray of shape (n_classes)

Number of samples encountered for each class during fitting.

class_log_prior_ndarray of shape (n_classes)

Log probability of each class (smoothed).

classes_ndarray of shape (n_classes,)

Class labels known to the classifier

feature_count_ndarray of shape (n_classes, n_features)

Number of samples encountered for each (class, feature) during fitting.

feature_log_prob_ndarray of shape (n_classes, n_features)

Empirical log probability of features given a class, P(x_i|y).

n_features_int

Number of features of each sample.

References

C.D. Manning, P. Raghavan and H. Schuetze (2008). Introduction to Information Retrieval. Cambridge University Press, pp. 234-265. https://nlp.stanford.edu/IR-book/html/htmledition/the-bernoulli-model-1.html A. McCallum and K. Nigam (1998). A comparison of event models for naive Bayes text classification. Proc. AAAI/ICML-98 Workshop on Learning for Text Categorization, pp. 41-48. V. Metsis, I. Androutsopoulos and G. Paliouras (2006). Spam filtering with naive Bayes – Which naive Bayes? 3rd Conf. on Email and Anti-Spam (CEAS).

Examples

>>> import cupy as cp
>>> from cuml.naive_bayes import BernoulliNB
>>> rng = cp.random.RandomState(1)
>>> X = rng.randint(5, size=(6, 100), dtype=cp.int32)
>>> y = cp.array([1, 2, 3, 4, 4, 5])
>>> clf = BernoulliNB()
>>> clf.fit(X, y)
BernoulliNB()
>>> print(clf.predict(X[2:3]))
[3]