
# Machine Learning and Its Algorithms to Know – MLAlgos

The main idea of this post is to describe and picture MLAlgos and Machine Learning. I will also attempt to answer a few basic questions. These questions have been answered many times in the past, and the answers are widely available. Still, answering them again from my own experience on the ground, rather than from the perspective of PhD or scholarly book material, may make a difference.

• Machine Learning at a Glance
• Types of Machine Learning
• Algorithms in Machine Learning – MLAlgos


#### Types of Machine Learning

Before we get into MLAlgos, let's understand some basics. The approach to developing ML involves learning from data inputs based on “what has happened”. The focus remains on evaluating and optimising the results of different models. Today, Machine Learning is widely used in data analytics as a method for developing algorithms that make predictions on data. It is related to probability, statistics, and linear algebra.

At a high level, Machine Learning is classified into four categories, depending on the nature of the learning and the learning system. Somehow, I find it difficult to accept semi-supervised learning as a major type.

#### 3 Major + 1 Non-Major Types

1. Supervised learning: The algorithm gets labelled inputs and their desired outputs. The goal is to learn a general rule that maps inputs to outputs.
2. Unsupervised learning: The machine gets inputs without desired outputs; the goal is to find structure in the inputs.
3. Reinforcement learning: The algorithm interacts with a dynamic environment in which it must achieve a certain goal without a guide or teacher.
4. Semi-supervised learning: Semi-supervised algorithms are the best candidates for model building when labels are missing for some of the data. So if the data is a mix of labelled and unlabelled examples, this can be the answer. Typically, a small amount of labelled data is used together with a large amount of unlabelled data.
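
As a rough illustration of the difference between the first two types, here is a minimal Python sketch. The data, the labels and the function names (`fit_centroids`, `predict`, `two_means`) are made up for this example; the nearest-centroid rule and the one-dimensional two-means clustering are just simple stand-ins for "learning with labels" and "finding structure without labels", not the only way to do either.

```python
from statistics import mean

def fit_centroids(points, labels):
    """Supervised: use the given labels to compute one centroid per class."""
    groups = {}
    for x, y in zip(points, labels):
        groups.setdefault(y, []).append(x)
    return {label: mean(xs) for label, xs in groups.items()}

def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def two_means(points, iterations=10):
    """Unsupervised: split unlabelled 1-D points into two groups (k-means, k=2)."""
    low, high = min(points), max(points)  # simple initial guesses for the two centres
    for _ in range(iterations):
        left = [x for x in points if abs(x - low) <= abs(x - high)]
        right = [x for x in points if abs(x - low) > abs(x - high)]
        low, high = mean(left), mean(right)
    return sorted(left), sorted(right)

# Hypothetical toy data: six numbers forming two obvious groups.
points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
labels = ["small", "small", "small", "large", "large", "large"]

# Supervised: inputs plus desired outputs, then predict an unseen input.
centroids = fit_centroids(points, labels)
prediction = predict(centroids, 7.0)

# Unsupervised: the same inputs without labels; the structure is discovered.
clusters = two_means(points)

print(prediction)
print(clusters)
```

In the supervised half the labels tell the algorithm what the groups are called; in the unsupervised half it only recovers that two groups exist, without any names for them.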

#### Some of the popular Machine Learning Algorithms (MLAlgos)

• Linear Regression – In simple linear regression there is only one independent variable. Multiple linear regression defines a relationship between several independent variables and a dependent variable.
• Logistic Regression – A super simple form of regression analysis in which the outcome variable is binary or dichotomous. It helps to estimate prevalence rates adjusted for potential confounders (sociodemographic or clinical characteristics).
• Linear Discriminant Analysis – A generalization of Fisher’s linear discriminant, a method used in statistics, pattern recognition and machine learning to find a linear combination of features that characterizes or separates two or more classes of objects or events.
• Classification and Regression Trees – Decision trees are an important type of algorithm for predictive modelling in machine learning. They are built by a greedy algorithm based on the divide-and-conquer rule: split the records based on an attribute test that optimizes a certain criterion. The real value is in determining how to split the records.
• Naive Bayes – Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes’ theorem with strong (naive) independence assumptions between the features.
• K-Nearest Neighbors – A lazy and very simple algorithm that stores all available cases and predicts the numerical target based on a similarity measure. KNN was already in use as a non-parametric technique in statistical estimation and pattern recognition at the beginning of the 1970s.
• Learning Vector Quantization – Aims to represent large amounts of data by a few prototype vectors, found by identifying similar data and grouping it into clusters.
• Support Vector Machines – A Support Vector Machine (SVM) is a supervised machine learning algorithm. It can be employed for both classification and regression purposes, but SVMs are more commonly used in classification problems.
• Bagging and Random Forest – Bagging is a portmanteau of bootstrap and aggregation. Take a number of bootstrapped samples from the original dataset, fit a model to each sample, and combine the predictions of all the models; this is bootstrap aggregation, i.e. bagging.
• “The fundamental difference between bagging and random forest is that in random forests, only a subset of features is selected at random out of the total, and the best split feature from the subset is used to split each node in a tree, unlike in bagging where all features are considered for splitting a node.” Does that mean that bagging is the same as random forest if only one explanatory variable (predictor) is used as input?
• Boosting and AdaBoost – Boosting is a machine learning ensemble meta-algorithm, primarily for reducing bias and also variance in supervised learning, and a family of machine learning algorithms that convert weak learners into strong ones. Algorithms that achieve hypothesis boosting quickly became known simply as “boosting”.
• Q-learning – A model-free reinforcement learning technique. It is able to compare the expected utility of the available actions (for a given state) without requiring a model of the environment.
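
To make one entry from the list above concrete, here is a minimal Python sketch of K-Nearest Neighbors used to predict a numerical target. It assumes Euclidean distance as the similarity measure and a plain average of the neighbours' targets; the function name `knn_predict` and the stored cases are invented for this example.

```python
import math

def knn_predict(cases, query, k=3):
    """Predict a numerical target for `query` as the average target
    of the k stored cases closest to it (Euclidean distance)."""
    # "Lazy" learning: there is no training step, the stored cases are the model.
    by_distance = sorted(cases, key=lambda case: math.dist(case[0], query))
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Hypothetical stored cases: (feature vector, numerical target).
cases = [
    ((1.0, 1.0), 10.0),
    ((2.0, 1.0), 12.0),
    ((1.0, 2.0), 14.0),
    ((8.0, 8.0), 50.0),
    ((9.0, 8.0), 54.0),
]

estimate = knn_predict(cases, (1.5, 1.5), k=3)
print(estimate)
```

All the work happens at prediction time, which is why the algorithm is called lazy. The choice of `k` and of the distance function are the main things to tune; other distance measures or a distance-weighted average are equally valid variants.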

#### Learning Process Machine & Human

The learning process for a human child or a new machine algorithm is the same, regardless of whether the learner is made of flesh and bones or wires and metal. It can be divided into four interrelated components:

• Data storage
• Information abstraction from stored data
• Generalization of information abstracted from the data.
• Evaluation of each piece of information and making use of it.
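
One way to picture these four components is with simple linear regression. The sketch below is only my reading of how the steps could map onto code: the data is made up, and the names `fit_line`, `predict` and `mean_absolute_error` are chosen for this example.

```python
from statistics import mean

# 1. Data storage: keep the observed (input, output) pairs.
train = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
held_out = [(5.0, 11.0), (6.0, 13.0)]  # pairs not used for fitting

def fit_line(pairs):
    """2. Abstraction: summarise the stored pairs as a slope and an
    intercept using ordinary least squares."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in pairs)
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar

def predict(model, x):
    """3. Generalization: apply the abstracted rule to an input not seen before."""
    slope, intercept = model
    return slope * x + intercept

def mean_absolute_error(model, pairs):
    """4. Evaluation: measure how far the predictions are from the known outputs."""
    return mean(abs(predict(model, x) - y) for x, y in pairs)

model = fit_line(train)
error = mean_absolute_error(model, held_out)
print(model, error)
```

The toy data lies exactly on a straight line, so the fitted rule generalizes perfectly here; with real data the evaluation step is what tells you whether the abstraction is good enough to use.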

I will talk about each type of machine learning and its respective algorithms in detail in my upcoming “Machine Learning Series” posts, starting next week.
Disclaimer – All credit, if any, remains with the original contributors only.

Conclusion – The point to note here is that AI is much more than ML. I particularly think that getting to know the types of MLAlgos actually helps to see a somewhat clearer picture of AI. The answer to the question “What machine learning algorithm should I use?” is always “It depends.” It depends on the size, quality, and nature of the data. It depends on what you want to do with the answer.

It depends on how the math of the algorithm was translated into instructions for the computer you are using. And it depends on how much time you have.

### 26 replies »

1. John Kirkilis says:

In the third graphic, the light-blue rounded rect titled “Bagging and Random Forest” ends with “Random Forest has two parameters:” and then doesn’t list them.

• V Sharma says:

Noted, will update the graphics, thank you very much for highlighting the missing part.

• V Sharma says:

Good One

2. That’s a quite comprehensive post on Machine Learning and AI along with supporting infographics. Good and informative for the ones wishing to add on their knowledge on ML Algos and AI.
What machine learning algorithm should we use? This depends on the size, quality and nature of the data. So, true!
