# Machine Learning – Introduction to Its Algorithms – MLAlgos

MLAlgos (Machine Learning Algorithms) – The main idea of this post is to describe and picture machine learning algorithms. We will also attempt to answer a few basic questions. These questions have been answered many times before, and the answers are widely available on the open internet. Still, answering them again from my own hands-on experience, rather than from a textbook perspective, may make a difference.

### Points Covered in this Post:

This post is limited to the items listed below. The points are covered at a high level, from a layperson’s perspective and in simple English. Anyone looking for detailed explanations or code should get in touch with me. The post is divided into three parts:

• Machine Learning at a Glance
• Types of Machine Learning
• Algorithms in Machine Learning – MLAlgos

As a matter of policy, we never share code unless it is open source or linked to published work. Some of you might wonder why we need to know all of these algorithms, or what exactly each term means. We have gathered all of that information in one place: this blog post.

### Machine Learning at a Glance

Before we get into MLAlgos, let’s cover some basics. Developing an ML model means learning from data inputs that record “what has happened”; evaluating and optimising the results of different models remains the focus. Today, machine learning is widely used in data analytics as a way to develop algorithms that make predictions from data. It draws on probability, statistics, and linear algebra.

#### Types of Machine Learning

At a high level, machine learning is classified into four categories, depending on the nature of the learning and of the learning system. Semi-supervised learning is arguably the most interesting of them all:

1. Supervised learning: The algorithm receives labelled inputs together with their desired outputs. The goal is to learn a general rule that maps inputs to outputs.
2. Unsupervised learning: The machine receives inputs without desired outputs; the goal is to find structure in the inputs.
3. Reinforcement learning: The algorithm interacts with a dynamic environment and must achieve a certain goal without a guide or teacher.
4. Semi-supervised learning: Semi-supervised algorithms are good candidates for model building when labels are missing for some of the data, that is, when the data is a mix of labelled and unlabelled examples. Typically a small amount of labelled data is used together with a large amount of unlabelled data.
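To make the difference between the first two categories concrete, here is a minimal sketch in plain Python. It is only an illustration: the data values and the function names `predict_supervised` and `group_unsupervised` are made up for this example, and the methods (nearest labelled example, split at the largest gap) are deliberately the simplest ones I could pick, not what a production library would do.

```python
# Supervised: every input comes with a desired output (a label).
labelled = [(1.0, "small"), (1.2, "small"), (8.9, "large"), (9.3, "large")]

def predict_supervised(labelled_pairs, x):
    """Return the label of the labelled example whose input is closest to x."""
    _, nearest_label = min(labelled_pairs, key=lambda pair: abs(pair[0] - x))
    return nearest_label

# Unsupervised: only inputs are available, so we look for structure instead.
unlabelled = [9.3, 1.0, 8.9, 1.2]

def group_unsupervised(inputs):
    """Split the inputs into two groups at the largest gap between sorted values.

    Assumes at least two input values.
    """
    ordered = sorted(inputs)
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    split = gaps.index(max(gaps)) + 1
    return ordered[:split], ordered[split:]

if __name__ == "__main__":
    print(predict_supervised(labelled, 2.0))
    print(group_unsupervised(unlabelled))
```

The supervised function can name its answer ("small" or "large") because the labels were given; the unsupervised function can only say that the inputs fall into two groups.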

ML also has a very close relationship to statistics, which, from a data-representation point of view, can be thought of as a graphical branch of mathematics. ML instructs an algorithm to learn for itself by analyzing data. In general, the more data it processes, the better the algorithm tends to perform.

### Some of the popular Machine Learning Algorithms (MLAlgos)

• Linear Regression – Models the relationship between a dependent variable and one or more independent variables. Simple Linear Regression has only one independent variable; Multiple Linear Regression has two or more.
• Logistic Regression – A simple form of regression analysis in which the outcome variable is binary (dichotomous). Among other uses, it helps to estimate prevalence rates adjusted for potential confounders (sociodemographic or clinical characteristics).
• Linear Discriminant Analysis – A generalization of Fisher’s linear discriminant, a method used in statistics, pattern recognition and machine learning to find a linear combination of features that characterizes or separates two or more classes of objects or events.
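As an illustration of the first item, simple linear regression with one independent variable can be sketched with the usual least-squares formulas. This is a minimal sketch in plain Python, assuming a small made-up dataset; the function name `fit_simple_linear_regression` is hypothetical and not taken from any library.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one independent variable."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

if __name__ == "__main__":
    # Made-up data that lies exactly on the line y = 2x + 1.
    slope, intercept = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
    print(slope, intercept)
```

Multiple linear regression follows the same idea with several independent variables, but is usually solved with matrix routines rather than by hand.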

• Classification and Regression Trees – Decision trees are an important type of algorithm for predictive modelling. They are built by a greedy, divide-and-conquer algorithm that splits the records on an attribute test optimizing a certain criterion. The real value lies in determining how to split the records.
• Naive Bayes – Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes’ theorem with strong (naive) independence assumptions between the features.
• K-Nearest Neighbors – Often called a lazy learner, KNN is a very simple algorithm that stores all available cases and predicts the target for a new case (a class or a numerical value) based on a similarity measure. KNN was already in use as a non-parametric technique in statistical estimation and pattern recognition at the beginning of the 1970s.
• Learning Vector Quantization – Aims to represent large amounts of data by a few prototype vectors, each standing for a group of similar data points; in LVQ the prototypes are positioned using class labels.
• Support Vector Machines – A Support Vector Machine (SVM) is a supervised machine learning algorithm that can be used for both classification and regression, although it is more commonly used for classification problems.
• Bagging and Random Forest – Bagging is a portmanteau of “bootstrap” and “aggregating”. The idea is to draw a number of bootstrapped samples from the original dataset, fit a model to each sample, and then combine the predictions of all the models; this is bootstrap aggregation, i.e. bagging.
• The fundamental difference between bagging and random forest is that in a random forest only a random subset of the features is considered at each node, and the best split feature from that subset is used to split the node, whereas in bagging all features are considered for splitting a node. Does that mean that bagging is the same as random forest if only one explanatory variable (predictor) is used as input? With a single predictor the random subset can only contain that predictor, so in that case the two approaches should behave essentially the same.
• Boosting and AdaBoost – Boosting is an ensemble meta-algorithm used primarily to reduce bias, and also variance, in supervised learning. It is a family of machine learning algorithms that convert weak learners into strong ones. Algorithms that achieve hypothesis boosting quickly became known simply as “boosting”.
• Q-learning – A model-free reinforcement learning technique. It can compare the expected utility of the available actions (for a given state) without requiring a model of the environment.
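To show what “model-free” means for the last item, here is a minimal sketch of the standard tabular Q-learning update in plain Python. The state names (`"s0"`, `"s1"`), the action names, the reward, and the values of `alpha` (learning rate) and `gamma` (discount factor) are all made up for this illustration, and `q_update` and `greedy_action` are hypothetical helper names.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update for an observed (state, action, reward, next_state).

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    Only observed transitions are used; no model of the environment is needed.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]

def greedy_action(q, state, actions):
    """Compare the estimated utility of each action in a state and pick the best."""
    return max(actions, key=lambda a: q[(state, a)])

if __name__ == "__main__":
    actions = ["left", "right"]
    q = defaultdict(float)  # every unseen (state, action) pair starts at 0.0
    q_update(q, "s0", "right", 1.0, "s1", actions)
    print(greedy_action(q, "s0", actions))
```

The table `q` is the only thing the learner keeps. It is updated from experience alone, which is why no description of how the environment works is required.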

### Learning Process Machine & Human

The learning process for a human child and for a new machine algorithm is essentially the same, regardless of whether the learner is made of flesh and bones or of wires and metal. It can be divided into four interrelated components:

• Data storage
• Information abstraction from stored data
• Generalization of information abstracted from the data
• Evaluation of each piece of information and making use of it
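The four components above can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the dataset (hours studied versus pass or fail) is invented, the “abstraction” is a single threshold halfway between the two class averages, and the function names are hypothetical.

```python
# 1. Data storage: keep the observed (hours_studied, passed) examples.
training = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
held_out = [(2.5, 0), (6.5, 1), (4.0, 0), (5.5, 1)]

def abstract_threshold(examples):
    """2. Abstraction: summarise the stored data as one number, a threshold
    halfway between the average input of each class."""
    zeros = [x for x, y in examples if y == 0]
    ones = [x for x, y in examples if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def generalize(threshold, x):
    """3. Generalization: apply the abstraction to an input not seen before."""
    return 1 if x > threshold else 0

def evaluate(threshold, examples):
    """4. Evaluation: fraction of held-out examples predicted correctly."""
    correct = sum(1 for x, y in examples if generalize(threshold, x) == y)
    return correct / len(examples)

if __name__ == "__main__":
    threshold = abstract_threshold(training)
    print(threshold, evaluate(threshold, held_out))
```

Evaluating on examples that were kept out of training is what tells us whether the abstraction generalizes or merely memorises the stored data.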

I will talk about each type of machine learning and its respective algorithms in detail in my upcoming “Machine Learning Series” posts, starting next week.

#### Points to Note:

All credit, if any, remains with the original contributors. We have covered a few machine learning algorithms at a high level in this post. Our last posts on Supervised Machine Learning and Unsupervised Machine Learning got some decent feedback. Our next post will talk about Reinforcement Learning: Markov Decision Processes.

#### Feedback & Further Question


Conclusion – The point to note here is that AI is much more than ML. I think that getting to know the types of MLAlgos helps to form a clearer picture of AI. The answer to the question “What machine learning algorithm should I use?” is always “It depends.” It depends on the size, quality, and nature of the data. It depends on what you want to do with the answer.

It depends on how the math of the algorithm was translated into instructions for the computer you are using. And it depends on how much time you have. Machine learning has only recently come into the spotlight and gained the industry’s attention. Now everyone talks about machine learning use cases such as face recognition, image captioning, voice and text processing, and self-driving cars.

### 29 replies »

1. John Kirkilis says:

In the third graphic, the light-blue rounded rect titled “Bagging and Random Forest” ends with “Random Forest has two parameters:” and then doesn’t list them.

• V Sharma says:

Noted, will update the graphics, thank you very much for highlighting the missing part.

• V Sharma says:

Good One

2. That’s a quite comprehensive post on Machine Learning and AI along with supporting infographics. Good and informative for the ones wishing to add on their knowledge on ML Algos and AI.
What machine learning algorithm should we use? This depends on the size, quality and nature of the data. So, true!
