Naive Bayes is a supervised classification algorithm based on probability, and one of the simplest machine-learning algorithms. Generative models, such as GANs, can also be used as classifiers, although they can do much more than categorisation. Logistic regression is another classification algorithm; it models the posterior probability directly by learning an input-to-output mapping, which makes it a discriminative model.

This post, part of AILabPage’s tutorial series, explains the Naive Bayes classifier algorithm in simple terms.

  • What is the Naive Bayes Algorithm?
  • Why is it called Naive?
  • How does it work?
  • Mathematics Behind Naive Bayes Algorithm
  • When to use it?

Let’s Unfold – Naive Bayes

AILabPage defines machine learning as “A focal point where business, data, experience meets emerging technology and decides to work together”. If you have not unfolded machine learning jargon already then please take a look at our machine learning post-series library.

This probabilistic algorithm (probability being the science of uncertainty and data) is used to make predictions in the form of classifications once the predictive model has been fully tested. The Naive Bayes classifier is an important milestone for anyone beginning the machine learning journey.

The Naive Bayes algorithm is a method built on a set of probabilities. For each attribute of each class, it uses probability to make predictions. This algorithm falls under the supervised machine-learning approach. The data model that comes out of this effort is called a “predictive model”, with probability at its foundation.

It belongs to a family of probabilistic algorithms that take advantage of probability theory and Bayes’ theorem. As per Wikipedia, “Bayes’ theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event.” For example, when the temperature is going to be good, there is a good probability that my son will play tennis.

Why Is It Naive?

It is called “naive” because of the assumptions it makes. By default, this algorithm assumes that all attributes are independent of each other given the class. This makes it one of the simplest algorithms to understand and implement. How does it work? The answer lies in Bayes’ theorem of probability.

The naive Bayes classifier works well, especially for text data. It is one of the simplest algorithms that can be applied to text data, relying on strong independence assumptions between the features. It also tends to do better with categorical input variables than with numerical ones. For instance, if our goal is to identify a fruit based on its colour, shape, and skin, then a fruit with a spherical shape, orange colour and thick skin would most likely be an orange.

An Intuitive Explanation or More

While walking on a football ground, the first white, round, moving object we see we label as a football, but on a cricket ground the label changes to a cricket ball. As a matter of fact the object could be anything, so why not something else? This is where probability comes into play.

The human mind can easily and quickly classify objects with labels by their features. Feature mapping is so tightly coupled in the brain that mistakes are unlikely. Because the human brain extracts features at rapid speed and applies probability swiftly, the object gets labelled as a ball.

In short, what the example above describes is a probabilistic classifier. The naive Bayes algorithm learns the probability that an object with certain features belongs to a particular class.

Mathematics Behind Naive Bayes Algorithm

It is simply written as P ( A | B ) = [ P ( B | A ) ⋅ P ( A ) ] / P ( B )

Here “A” represents the class, e.g. ball or anything else, and “B” represents the features, each evaluated individually. Bayes’ theorem provides a simple way of calculating the posterior probability P ( A | B ) from P ( A ), P ( B ), and P ( B | A ).

  • P ( A | B ) is the posterior probability of class A given the predictor (features).
  • P ( A ) is the prior probability of the class.
  • P ( B | A ) is the likelihood, which is the probability of the predictor given the class.
  • P ( B ) is the prior probability of the predictor.
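
As a quick numerical illustration of these four quantities, here is a minimal Python sketch. The function name bayes_posterior and the probabilities used are made up for illustration and do not come from any data set.

```python
def bayes_posterior(prior, likelihood, evidence):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if evidence <= 0:
        raise ValueError("P(B) must be greater than zero")
    return likelihood * prior / evidence


# Hypothetical numbers, chosen only for illustration:
#   P(A)   = 0.3  -> 30% of all objects are balls (prior)
#   P(B|A) = 0.8  -> 80% of balls are round (likelihood)
#   P(B)   = 0.5  -> 50% of all objects are round (evidence)
posterior = bayes_posterior(prior=0.3, likelihood=0.8, evidence=0.5)
print(f"P(ball | round) = {posterior:.2f}")
```

With these assumed inputs the posterior works out to 0.48: seeing that an object is round raises the probability that it is a ball from 0.3 to 0.48.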

The strong assumption made by the Naive Bayes classifier is that the effect of the value of a predictor ( B ) on a given class ( A ) is independent of the values of the other predictors. This assumption is called class conditional independence.

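With this assumption in mind, the function can be broken down per feature. For n features B1, B2, …, Bn it can be rewritten as P ( A | B1, B2, …, Bn ) = [ P ( B1 | A ) ⋅ P ( B2 | A ) ⋅ … ⋅ P ( Bn | A ) ⋅ P ( A ) ] / P ( B1, B2, …, Bn )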

Let’s work through a real example.

Imagine we have 3 sets of playing balls, described by three features: diameter in cm, whether the ball is red, and whether it feels hard. Tabulated as sample training data, we have the following.

To train the classifier, we count up various subsets of points and use them to compute the prior and conditional probabilities. A Naive Bayes classifier is thus largely a matter of counting the frequency of each attribute against each class, in other words, how many times each attribute co-occurs with each class.

From our training data set, let’s try to predict the class of another ball. Let’s assume the ball in our prediction problem has the following attributes:

  • Diameter-22cm – Yes or 1
  • Red – Yes or 1
  • Hard – Yes or 1

Plugging the above attributes into the formula makes it simple to calculate the target label and make a prediction. The probability is calculated for each class, i.e. each class is tested against all 3 attributes with the formula above, and the class with the highest probability wins.
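
The counting procedure can be sketched in plain Python. The training rows, the three class names and the add-one (Laplace) smoothing below are all assumptions made for illustration; the original training table is not reproduced here, so these numbers are invented and are not the author's data.

```python
from collections import Counter, defaultdict

# Hypothetical training data: ((diameter_22cm, red, hard), label), 1 = yes, 0 = no.
TRAINING_DATA = [
    ((1, 0, 0), "football"),
    ((1, 1, 0), "football"),
    ((1, 0, 1), "football"),
    ((0, 1, 1), "cricket_ball"),
    ((0, 1, 1), "cricket_ball"),
    ((0, 0, 1), "cricket_ball"),
    ((0, 0, 0), "tennis_ball"),
    ((0, 1, 0), "tennis_ball"),
    ((0, 0, 0), "tennis_ball"),
]


def train(rows):
    """Count how often each class occurs and how often each feature is 1 per class."""
    class_counts = Counter(label for _, label in rows)
    n_features = len(rows[0][0])
    feature_counts = defaultdict(lambda: [0] * n_features)
    for features, label in rows:
        for i, value in enumerate(features):
            feature_counts[label][i] += value
    return class_counts, feature_counts


def classify(features, class_counts, feature_counts):
    """Return the normalised posterior probability of every class."""
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(A)
        for i, value in enumerate(features):
            # Add-one smoothing (an assumption) keeps unseen values from zeroing the score.
            p_yes = (feature_counts[label][i] + 1) / (n + 2)
            score *= p_yes if value == 1 else 1 - p_yes  # likelihood P(Bi | A)
        scores[label] = score
    evidence = sum(scores.values())  # P(B), the same for every class
    return {label: score / evidence for label, score in scores.items()}


class_counts, feature_counts = train(TRAINING_DATA)
# New ball: diameter 22cm = yes, red = yes, hard = yes.
posteriors = classify((1, 1, 1), class_counts, feature_counts)
best = max(posteriors, key=posteriors.get)
print(best, posteriors)
```

On this invented data the winning class is football. Different training counts would of course give a different winner, which is why the quality of the training data matters so much.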

To implement and build a Naive Bayes algorithm in Python, the first step is to import all the necessary libraries.

  • import numpy as np
  • import pandas as pd
  • from sklearn.naive_bayes import GaussianNB
  • from sklearn.preprocessing import LabelEncoder
  • from sklearn.model_selection import train_test_split
  • from sklearn.metrics import accuracy_score
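
With those libraries imported, a minimal end-to-end run might look like the sketch below. The synthetic data, the column names (diameter_cm, weight_g) and the two class labels are hypothetical and chosen only so that the example is self-contained; a real project would load its own file instead.

```python
import numpy as np
import pandas as pd
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Hypothetical synthetic data: two ball types with clearly different sizes and weights.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "diameter_cm": np.concatenate([rng.normal(22.0, 0.5, 50), rng.normal(7.0, 0.3, 50)]),
    "weight_g": np.concatenate([rng.normal(430.0, 10.0, 50), rng.normal(160.0, 5.0, 50)]),
    "label": ["football"] * 50 + ["cricket_ball"] * 50,
})

# Encode the string labels as integers and pull out the feature matrix.
encoder = LabelEncoder()
y = encoder.fit_transform(data["label"])
X = data[["diameter_cm", "weight_g"]].to_numpy()

# Hold out 30% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Fit a Gaussian Naive Bayes model and evaluate it on the held-out rows.
model = GaussianNB()
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))

# Classify one new, made-up ball and map the integer back to its label.
predicted_label = encoder.inverse_transform(model.predict([[22.0, 425.0]]))[0]
print(f"accuracy={accuracy:.2f}, predicted={predicted_label}")
```

GaussianNB suits continuous features such as these; for count-based text features scikit-learn also offers MultinomialNB.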

Steps to Implement a Naive Bayes Algorithm

The main aim of the Naive Bayes algorithm is to calculate the conditional probability that an object with a feature vector f1, f2, …, fn belongs to a particular class. Below are some steps to implement it.

  1. Data Loading:
    1. Load the data from the source file.
    2. Format it as an “xls” or “csv” file.
    3. Split it into training and test data sets.
  2. Preparing Data:
    1. Summarise the properties of the training dataset to calculate the probabilities needed to make predictions.
  3. Generate a forecast:
    1. Take the training dataset from step 2, cleaned and organised.
    2. Generate predictions on the test dataset.
  4. Performance evaluation:
    1. Evaluate the model.
    2. Iterate steps 2, 3 and 4 to improve the accuracy of predictions.
    3. Find the least error margin.
  5. Clean-up and Finalisation:
    1. Clean, organise and polish the predictive model code.
    2. Use the predictive model with all its elements to present complete, quality code.
    3. Implement and keep the code for your algorithm.
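
Steps 1 to 4 can be sketched without any library, assuming numeric features that are modelled with a Gaussian distribution per class. The function names, the synthetic data and the Gaussian assumption below are illustrative choices, not a prescribed implementation.

```python
import math
import random
from collections import defaultdict


def split_dataset(rows, test_ratio=0.3, seed=1):
    """Step 1: shuffle the rows and split them into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


def summarise_by_class(rows):
    """Step 2: per class, store the prior and the (mean, stdev) of each feature."""
    grouped = defaultdict(list)
    for *features, label in rows:
        grouped[label].append(features)
    summaries = {}
    for label, samples in grouped.items():
        stats = []
        for column in zip(*samples):
            mean = sum(column) / len(column)
            variance = sum((x - mean) ** 2 for x in column) / (len(column) - 1)
            stats.append((mean, math.sqrt(variance)))
        summaries[label] = (len(samples) / len(rows), stats)
    return summaries


def gaussian_log_pdf(x, mean, stdev):
    """Log of the normal density, used as the likelihood of one feature value."""
    return -math.log(stdev * math.sqrt(2 * math.pi)) - (x - mean) ** 2 / (2 * stdev ** 2)


def predict(summaries, features):
    """Step 3: return the class with the highest log prior plus log likelihoods."""
    scores = {}
    for label, (prior, stats) in summaries.items():
        score = math.log(prior)
        for x, (mean, stdev) in zip(features, stats):
            score += gaussian_log_pdf(x, mean, stdev)
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(summaries, test_rows):
    """Step 4: fraction of test rows whose predicted label matches the true one."""
    correct = sum(predict(summaries, row[:-1]) == row[-1] for row in test_rows)
    return correct / len(test_rows)


# Hypothetical synthetic rows: [diameter_cm, weight_g, label].
rng = random.Random(0)
rows = [[rng.gauss(22.0, 0.5), rng.gauss(430.0, 10.0), "football"] for _ in range(40)]
rows += [[rng.gauss(7.0, 0.3), rng.gauss(160.0, 5.0), "cricket_ball"] for _ in range(40)]

train_rows, test_rows = split_dataset(rows)
model = summarise_by_class(train_rows)
test_accuracy = accuracy(model, test_rows)
print(f"test accuracy: {test_accuracy:.2f}")
```

Working with log probabilities instead of multiplying raw probabilities is a common design choice here, because the product of many small likelihoods can underflow to zero.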

If we have a machine learning classification problem in hand, the Naive Bayes algorithm is often a good first choice. It is simple and copes easily with multiple features and classes.

Advantages for Naive Bayes

  • The strongest advantage of Naive Bayes is its ability to make real-time and multiclass predictions.
  • Naive Bayes is a classifier that learns quickly.
  • It is a relatively easy algorithm to build and understand, and it can be trained using a small data set.



Points to Note:

All credit, if any, remains with the original contributors. We have kept our discussion limited to the Naive Bayes algorithm; for more details on machine learning, you can browse the blog.


Conclusion – In this post, we have learnt that Naive Bayes is a supervised machine learning algorithm. We have also learnt about its usage, which is mainly on text-based data sets. It is called “Naive” because of the strong assumption it makes: that the occurrence of a certain feature is independent of the occurrence of other features.

In the above example it is clear that once the algorithm is trained, it can give a label without being told. It can then predict the class based on what it was taught. As we have seen, the training data is vital to the success of the model.

Posted by V Sharma


