CategoricalNB
- class cuml.naive_bayes.CategoricalNB(*, alpha=1.0, fit_prior=True, class_prior=None, output_type=None, verbose=False)
Naive Bayes classifier for categorical features.
The categorical Naive Bayes classifier is suitable for classification with discrete features that are categorically distributed. The categories of each feature are drawn from a categorical distribution.
- Parameters:
- alpha : float, default=1.0
Additive (Laplace/Lidstone) smoothing parameter (0 for no smoothing).
- fit_prior : bool, default=True
Whether to learn class prior probabilities from the data. If False, a uniform prior is used.
- class_prior : array-like of shape (n_classes,), default=None
Prior probabilities of the classes. If specified, the priors are not adjusted according to the data.
- output_type : {‘input’, ‘array’, ‘dataframe’, ‘series’, ‘df_obj’, ‘numba’, ‘cupy’, ‘numpy’, ‘cudf’, ‘pandas’}, default=None
Return results and set estimator attributes to the indicated output type. If None, the output type set at the module level (cuml.global_settings.output_type) will be used. See Output Data Type Configuration for more info.
- verbose : int or boolean, default=False
Sets logging level. It must be one of cuml.common.logger.level_*. See Verbosity Levels for more info.
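To make the roles of alpha, fit_prior and class_prior concrete, the following is a minimal pure-Python sketch, not cuML's implementation. It assumes the usual additive (Lidstone) estimate, (count + alpha) / (total + alpha * n_categories), and assumes that an explicit class_prior takes precedence over fit_prior. The helper names smoothed_log_probs and class_log_prior are hypothetical and exist only for illustration.

```python
import math


def smoothed_log_probs(counts, alpha=1.0):
    """Log P(category | class) for one feature and one class.

    counts[t] is the number of training samples of the class whose feature
    value is category t. Assumes the standard additive-smoothing estimate;
    with alpha=0 and a zero count, math.log would raise ValueError.
    """
    n_categories = len(counts)
    total = sum(counts) + alpha * n_categories
    return [math.log((c + alpha) / total) for c in counts]


def class_log_prior(class_counts, fit_prior=True, class_prior=None):
    """Log prior per class, following the parameter descriptions above."""
    if class_prior is not None:
        # Assumption: user-supplied priors are used as given.
        return [math.log(p) for p in class_prior]
    if fit_prior:
        # Empirical prior: fraction of training samples in each class.
        n = sum(class_counts)
        return [math.log(c / n) for c in class_counts]
    # Uniform prior over the classes.
    k = len(class_counts)
    return [math.log(1.0 / k)] * k


# One feature with three categories, observed 3, 0 and 1 times for a class.
probs = [math.exp(v) for v in smoothed_log_probs([3, 0, 1], alpha=1.0)]
```

With alpha=1.0 the unseen middle category still receives a non-zero probability, which is the purpose of the smoothing term.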
- Attributes:
- category_count_ : ndarray of shape (n_features, n_classes, n_categories)
Number of samples encountered for each feature, class, and category of that feature, where n_categories is the highest category of all the features.
- class_count_ : ndarray of shape (n_classes,)
Number of samples encountered for each class during fitting.
- class_log_prior_ : ndarray of shape (n_classes,)
Smoothed empirical log probability for each class.
- classes_ : ndarray of shape (n_classes,)
Class labels known to the classifier.
- feature_log_prob_ : ndarray of shape (n_features, n_classes, n_categories)
Empirical log probability of categories given the respective feature and class, P(x_i|y), where n_categories is the highest category of all the features. Each slice of shape (n_classes, n_categories) holds the values for one feature. This attribute is not available when the model has been trained with sparse data.
- n_features_ : int
Number of features of each sample.
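The attributes above are enough to describe how a naive Bayes prediction is formed: the class log prior plus, for each feature, the log probability of the observed category. The sketch below illustrates this with plain Python lists laid out like class_log_prior_ and feature_log_prob_. It is an assumption-labeled illustration of the standard decision rule, not cuML's GPU code path, and predict_one is a hypothetical helper.

```python
import math


def predict_one(x, classes, class_log_prior, feature_log_prob):
    """Return the label with the highest joint log-likelihood for sample x.

    feature_log_prob[i][c][t] = log P(x_i = t | y = classes[c]), mirroring
    the (n_features, n_classes, n_categories) layout described above.
    """
    best_label, best_score = None, float("-inf")
    for c, label in enumerate(classes):
        score = class_log_prior[c]
        for i, t in enumerate(x):
            score += feature_log_prob[i][c][t]
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy model: 2 features, 2 classes, 2 categories per feature.
classes = ["a", "b"]
prior = [math.log(0.5), math.log(0.5)]
flp = [
    [[math.log(0.9), math.log(0.1)], [math.log(0.2), math.log(0.8)]],
    [[math.log(0.6), math.log(0.4)], [math.log(0.3), math.log(0.7)]],
]
label_00 = predict_one([0, 0], classes, prior, flp)
label_11 = predict_one([1, 1], classes, prior, flp)
```

Working in log space turns the product of per-feature probabilities into a sum, which avoids numerical underflow when n_features is large.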
Examples
>>> import cupy as cp
>>> from cuml.naive_bayes import CategoricalNB
>>> rng = cp.random.RandomState(1)
>>> X = rng.randint(5, size=(6, 100), dtype=cp.int32)
>>> y = cp.array([1, 2, 3, 4, 5, 6])
>>> clf = CategoricalNB()
>>> clf.fit(X, y)
CategoricalNB()
>>> print(clf.predict(X[2:3]))
[3]