Statistics and Machine Learning – The overlap between statistical analysis and machine learning methods has raised many questions about how the two fields relate.


Specifically, some wonder whether machine learning is merely an embellished or glorified rendition of statistics. I have plenty of open questions on this myself, and many of my AILabPage colleagues now ask me about it. People across many domains run into this same confusion in their day-to-day professional work.

Is Machine Learning a Computerised Version of Statistics?

Is machine learning a computerized or glorified version of statistics? In my personal opinion, the answer is simply “no”. To my understanding, the two complement each other and work as partners. They are like two friends from school days who cross paths on occasion.

Leo Breiman’s 2001 paper “Statistical Modeling: The Two Cultures” is relevant here. It argued that statisticians rely too heavily on data modeling, and that machine learning techniques are making progress by relying instead on the predictive accuracy of models.

Are the terms Statistics and Machine Learning synonyms?

Is there any difference between Statistics and Machine Learning jargon?

Let us leave data mining aside here. The focus is only on these two “unfriended friends”, and in my personal opinion, data mining (DM) is not relevant to this comparison. As a reminder, some time ago people also called blockchain a glorified and polished version of swarm intelligence, which is not the case in any way.

Statistics and Machine Learning

Linear regression (LR) is a prime example from statistics. In academia, it is typically used to understand how numerical input and output variables are related. Machine learning has since borrowed the technique, so the same algorithm now serves as both a statistical method and a machine learning approach.
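To make the “same algorithm, two uses” point concrete, here is a minimal sketch using NumPy only. The synthetic data, the 150/50 split, and all variable names are assumptions made for illustration. One least-squares fit is read in two ways: a statistician looks at the coefficients and their standard errors, while a machine learning practitioner looks at the error on held-out data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): y = 2 + 3x + noise
n = 200
x = rng.uniform(0, 10, size=n)
y = 2.0 + 3.0 * x + rng.normal(0, 1.0, size=n)

# Hold out the last 50 points for evaluation
x_train, x_test = x[:150], x[150:]
y_train, y_test = y[:150], y[150:]

# Design matrices with an intercept column
X_train = np.column_stack([np.ones_like(x_train), x_train])
X_test = np.column_stack([np.ones_like(x_test), x_test])

# One ordinary least squares fit, shared by both views
beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Statistical view: how uncertain are the estimated coefficients?
residuals = y_train - X_train @ beta
dof = X_train.shape[0] - X_train.shape[1]  # observations minus parameters
sigma2 = residuals @ residuals / dof       # estimated noise variance
cov_beta = sigma2 * np.linalg.inv(X_train.T @ X_train)
std_err = np.sqrt(np.diag(cov_beta))

# Machine learning view: how well does the fit predict unseen data?
predictions = X_test @ beta
test_mse = np.mean((y_test - predictions) ** 2)

print("intercept, slope:", beta)
print("standard errors: ", std_err)
print("held-out MSE:    ", test_mse)
```

The fitted line is identical in both readings; only the question asked of it changes.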

Though there are some methodological differences between machine learning and statistics, they really don’t divorce these two friends at all. Machine learning prioritizes efficiency and predictive performance, whereas statistics places its emphasis on representative samples, larger populations, and hypotheses. The primary focus of machine learning is to generate predictions, even if the explanation behind those predictions remains unclear.

How to Lie with Statistics is a book written by Darrell Huff in 1954 that presents an introduction to statistics for the general reader. He was not a statistician but a journalist who wrote many “how-to” articles as a freelancer.

Roles – Machine Learning, Statistics and Computer Science

  • Machine Learning: optimizes a performance criterion using example data or past experience.
  • Statistics: draws inferences from sample data, with a stress on hypotheses.
  • Computer Science: provides efficient algorithms to solve the optimization problem and to represent and evaluate the model for inference.
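As a small illustration of the statistics role in the list above (inference from sample data, with a stress on hypotheses), here is a sketch of a two-sample permutation test in NumPy. The function name `permutation_test`, the synthetic groups, and the number of permutations are assumptions chosen for the example; it is one of several ways to test the hypothesis that two samples share the same mean.

```python
import numpy as np

def permutation_test(a, b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns an estimated p-value for the null hypothesis that the
    two samples come from the same distribution.
    """
    rng = np.random.default_rng(seed)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are exchangeable,
        # so shuffle the pooled data and split it again
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:a.size].mean() - shuffled[a.size:].mean())
        if diff >= observed:
            count += 1
    # Add-one correction keeps the estimate strictly above zero
    return (count + 1) / (n_permutations + 1)

# Hypothetical samples: two groups whose true means differ by 1.5
rng = np.random.default_rng(1)
group_a = rng.normal(0.0, 1.0, size=30)
group_b = rng.normal(1.5, 1.0, size=30)

p_value = permutation_test(group_a, group_b)
print("p-value:", p_value)
```

A small p-value is evidence against the hypothesis that both groups share a mean. Note that the output is a statement about the population, not a prediction for a new data point.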

To be a better Data Scientist, I should have started as a Statistician

Statistics

Though it’s not entirely true, statistics can to some extent be called a graphical branch of mathematics (in my personal opinion). As per Wikipedia, statistics is “the practice or science of collecting and analyzing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample.”

It is the science of data, dealing with the collection, analysis, interpretation, and presentation of masses of numerical data. In general, statistics speaks in more human terms. As a statistician would say, a prediction for variable Y is true only if what variables a, b, c, and I are saying is true. Machine learning, by contrast, generates predictions even if the explanation behind them remains unclear.

Machine Learning

Machine learning is the younger of the two disciplines. It is built on the foundation of statistics and has absorbed much of its philosophy and many of its techniques over the years. It focuses on data too complex for humans to find the meaningful regularities in unaided. Random samples are not good enough for the task here.

AILabPage defines machine learning as “a focal point where business, data, and experience meet emerging technology and decide to work together”. Machine learning is also a subset of artificial intelligence.

Statistics vs Machine Learning

Although machine learning and statistics differ in approach, they remain closely related. Machine learning emphasizes efficiency and optimal outcomes, while statistics centers on sample size, population, and hypothesis development. The main goal of machine learning is to generate precise predictions, even if the reasoning behind them is not readily understandable.

When the model is presented in the form of a simulation, statistics and computation integrate remarkably well. In brief, statistics itself makes clear that, in the current global context, it would be foolish to disregard the vast quantities of data available for examination. Yet statistics alone is not a panacea.

“Dealing with the vast quantity of data requires either the intervention of machine learning or significant computing capability.” When dealing with large amounts of data, the primary concerns are typically storage and computation, particularly distributed computation.

Approximate Bayesian Computation – ABC

In statistics, Approximate Bayesian Computation (“ABC”) refers to inferring a model’s parameters by simulating data from the model and comparing it with what was actually observed. As in weather forecasting, the variables involved have simple and clear semantic meaning, and it is the statistician’s job to perform inference on them. Doing this well, however, takes a complex set of skills and careful thought about how best to distribute and use computing resources.

The same inference task in machine learning is known as “probabilistic programming”. In addition, that field develops specialized programming languages (e.g., based on graphical models) in which to express these models.
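For readers who have not seen ABC before, the simplest variant (rejection sampling) can be sketched in a few lines of NumPy. Everything specific here is an assumption made for illustration: the function name `abc_rejection`, the normal model with known unit variance, the uniform prior on the mean, the sample mean as summary statistic, and the tolerance. The idea is to draw candidate parameters from the prior, simulate a data set for each, and keep only the candidates whose simulated data looks like the observed data.

```python
import numpy as np

def abc_rejection(observed, n_draws=20000, tolerance=0.1, seed=0):
    """Approximate posterior samples for the mean of a normal model.

    Assumed model: data ~ Normal(mu, 1), prior mu ~ Uniform(-5, 5).
    Summary statistic: the sample mean.
    """
    rng = np.random.default_rng(seed)
    observed_summary = observed.mean()

    # 1. Draw candidate parameters from the prior
    candidates = rng.uniform(-5.0, 5.0, size=n_draws)

    # 2. Simulate one data set per candidate (one row each)
    simulated = rng.normal(candidates[:, None], 1.0,
                           size=(n_draws, observed.size))

    # 3. Keep candidates whose simulated summary is close to the observed one
    distance = np.abs(simulated.mean(axis=1) - observed_summary)
    return candidates[distance < tolerance]

# Hypothetical observed data with true mean 1.5
rng = np.random.default_rng(42)
observed = rng.normal(1.5, 1.0, size=50)

posterior = abc_rejection(observed)
print("accepted draws:   ", posterior.size)
print("posterior mean:   ", posterior.mean())
print("posterior std dev:", posterior.std())
```

No likelihood is ever evaluated, which is why the approach suits simulation-based models. The cost is computational: most draws are rejected, which is where the concern about computing resources comes in.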


A small project for learning!

  • Machine learning is more passionate about prediction than causality.

Within a small group of AILabPage members and lab fellows, we ran some projects for testing and learning purposes. Our goal was to perform the same task on an available data set using traditional “statistics techniques” and “machine learning techniques”, and then compare the final results.

  • Predictions and online services: We built a data model for a seamless transition from training to prediction, targeting both online and batch prediction services. Integration with Google global load balancing was the plan, but for various reasons, including a shortage of time, we did not use it. Still, we managed to achieve the following:
    • A predictive analytics model that is scalable.
    • A good demonstration of the promise of leveraging statistical breakthroughs.
    • An artificial neural network architecture designed to make the model transparent and easy to debug (without an RNN, of course). It took longer than usual, but we ended up with a fully working model.

By using TensorFlow via a cloud-based machine learning engine, we were able to create a new machine learning model, which attained a prediction accuracy of 73%. I must admit that we employed neural networks too. Do you have suggestions for alternative approaches that could help us automatically scale our machine learning application and achieve our desired outcome? If so, please share them in the comment section.




Conclusion – With the rise of interest in machine learning, there are a couple of different perspectives on how similar it is to statistics. One reasons from the general to the specific and the other the reverse, but as a matter of fact, the two disciplines can’t be divorced. They are better seen as two sides of the same coin. They represent two key aspects of data science that should be integrated in the long run, so statistical machine learning may come up soon.

Statistics departments cannot run without people with programming skills, so it seems reasonable to include computer science classes in a statistics curriculum. The two subjects are taught the same way, using the same books and the same mathematics. Whether to choose an inductive or a deductive research methodology depends on the data and the research objective.


Posted by V Sharma

A Technology Specialist boasting 22+ years of exposure to Fintech, Insuretech, and Investtech with proficiency in Data Science, Advanced Analytics, AI (Machine Learning, Neural Networks, Deep Learning), and Blockchain (Trust Assessment, Tokenization, Digital Assets). Demonstrated effectiveness in Mobile Financial Services (Cross Border Remittances, Mobile Money, Mobile Banking, Payments), IT Service Management, Software Engineering, and Mobile Telecom (Mobile Data, Billing, Prepaid Charging Services). Proven success in launching start-ups and new business units - domestically and internationally - with hands-on exposure to engineering and business strategy. "A fervent Physics enthusiast with a self-proclaimed avocation for photography" in my spare time.
