# Naive Bayes Classifier

Say our data has d attributes/features, and that these attributes are binary. Also, we have a finite set of classes C.

$$C\_{MAP} = argmax\_{c\in C} , P(X|C)P(C)$$

The main issue here is that we do not know P(X|C) in advance, and it is difficult, if not impossible, to model the joint distribution of all the d attributes. Therefore, we use the simplifying assumption that the attributes are **conditionally independent** i.e. the attributes are independent, *given the class*. This assumption is also called **class-conditional independence**.

So, $$P(\[x\_1, x\_2,..., x\_d]|C) = P(x\_1|C) \* P(x\_2|C) \* ... \* P(x\_d|C)$$

Therefore, we calculate $$C\_{MAP}$$ as:

$$C\_{MAP} = argmax\_C P(x\_1|C)*P(x\_2|C)*...\*P(x\_d|C)\*P(C)$$

Since we need to compute a product of probabilities, we must prevent any of the individual probabilities from being 0. This would nullify the effect of all the other probabilities.

To do so, we use **smoothing**.

If N is the total number of examples having a certain class C, and t is the number of times $$x\_i = v$$, then the non-smoothed estimate is given by:

$$P(x\_i=v|C) = \frac{t}{N}$$

If t=0, this probability becomes 0!

So, we perform **add-m smoothing**:

$$P(x\_i=v|C) = \frac{t+m}{N+ms}$$ (where s is the number of possible values for the attribute $$x\_i$$). Sometimes, we use m=1. This is called **add-1 smoothing**. In most cases, we limit m to the range (0,1].


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://vikram-bajaj.gitbook.io/cs-gy-6923-machine-learning/main-4/types-of-machine-learning/supervised-learning/probabilistic-modeling/naive-bayes-classifier.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
