
# Deep Learning – Backpropagation Algorithm Basics

The backpropagation algorithm is an important mathematical tool for making better, more accurate predictions in machine learning. It is used in supervised learning to train artificial neural networks. The whole idea of training multi-layer perceptrons is to compute the derivatives of the error function with respect to the weights; backpropagation computes those derivatives, and gradient descent then uses them to update the weights. The algorithm is built on linear algebra and the chain rule of calculus, with the goal of minimising the error function through repeated weight updates.

In this post, we will look at backpropagation and the basic details around it at a high level, in simple English.

### What is the Backpropagation Algorithm

As mentioned above, “backpropagation” is an algorithm used in supervised learning to compute the gradient of the error with respect to the weights. Gradient descent then uses that gradient to update the weights (the delta rule is the single-layer special case). As per Wikipedia: “Backpropagation is a method used in artificial neural networks to calculate a gradient that is needed in the calculation of the weights to be used in the network.”

This algorithm is used to find a minimum of the error function of the neural network during the training stage. The core idea of backpropagation is to find out what impact a change in each weight would have on the overall cost of the network.

The weights are adjusted to minimise the error function, and the point where the error reaches its minimum is considered the solution to our learning problem. An example makes this easier to understand.

Let’s take the table below to demonstrate why the weights matter:

| Input Value | Desired Output |
|---|---|
| 0 | 0 |
| 3 | 6 |
| 9 | 18 |
| 27 | 54 |

Now if we start playing with the weight, we will see the real game. With a weight of 4, we get the output below. Note the difference between the actual and the desired output in the table:

| Input Value | Desired Output | Actual Output (w = 4) | Error | Squared Error |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 3 | 6 | 12 | 6 | 36 |
| 9 | 18 | 36 | 18 | 324 |
| 27 | 54 | 108 | 54 | 2916 |

Now let’s compare the two tables above. The weighted output is off by 6, 18 and 54 for three of the input values, and only one value is correct. Squaring the errors makes the gap even larger. Let’s change the weight from 4 to 3: the errors drop to 0, 3, 9 and 27, which is better but still not optimal. One thing is clear: we are moving in the correct direction, i.e. reducing the weight is the correct decision here. Let’s decrease it further to 2. With a weight of 2, the actual output matches the desired output exactly, with zero error.

What was done here:

• Starting from an initial random value for the weight (W), we used forward propagation. This is the first step in any neural network. Forward propagation produces the output, which is compared with the desired output to get the error.
• We got error values of 0, 6, 18 and 54, which were far too large. To reduce the error, we propagated backwards, i.e. we used the error to decide how to change W, and reduced its value.
• After the reduction there was still an error (0, 3, 9 and 27). It had decreased, but it was not yet our desired result. What we learned was that reducing the value of W is the correct direction, and that increasing it from here would only increase the error.
• We propagated backwards again and reduced the value of W from 3 to 2.

The whole idea of forward and backward propagation, and of playing with the weights, is to minimise the error. After a few iterations, the network learns in which direction along the number scale it needs to move the weight until the error is minimised. There is a kind of turning point beyond which any further update to the weight increases the error. That is the signal to stop and take the current value as the final weight.
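The toy example above can be reproduced in a few lines of Python. This is only a sketch: the function names and the learning rate are assumptions made for illustration. Instead of trying the weights 4, 3 and 2 by hand, the second half lets the derivative of the squared error choose the direction and size of each step.

```python
# Toy data from the tables above: the desired output is always 2 * input.
inputs = [0, 3, 9, 27]
desired = [0, 6, 18, 54]


def errors(w):
    """Forward pass with a single weight: absolute error for each input."""
    return [abs(w * x - d) for x, d in zip(inputs, desired)]


def sum_squared_error(w):
    """Total squared error over the whole table."""
    return sum((w * x - d) ** 2 for x, d in zip(inputs, desired))


def gradient(w):
    """Derivative of the total squared error with respect to w."""
    return sum(2 * (w * x - d) * x for x, d in zip(inputs, desired))


# The three weights tried by hand in the text.
for w in (4, 3, 2):
    print(w, errors(w), sum_squared_error(w))

# Gradient descent: a positive gradient means "w is too large, reduce it".
w = 4.0
learning_rate = 0.0005  # assumed value, small enough for this data
for _ in range(20):
    w -= learning_rate * gradient(w)
print(round(w, 4))
```

At w = 4 the gradient is positive, which is the numerical version of the observation that reducing the weight is the correct direction.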

### Backpropagation algorithm Step by Step

In neural networks, learning is about making each neuron “intelligent” about its activation, i.e. when to activate and when to stay silent. As far as is known, the human brain does not work on backpropagation principles. In an artificial neural network, backpropagation is used to calculate the derivatives much faster than naive methods can. The basic steps are:

• Set inputs and desired outputs – Choose the inputs and set the desired outputs.
• Set random weights – The initial weights determine the first output values and are adjusted during training.
• Calculate the error – The error tells us how far the actual output of the model is from the desired output, i.e. how good or bad the model currently is.
• Minimise the error – Check whether the error has been minimised.
• Update the parameters – If the error is still large, update the parameters, i.e. the weights and biases, to reduce it. Repeat this check-and-update process until the error is minimised.
• Model readiness for prediction – Once the error is minimised, we can test the model with some test inputs.
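The steps above can be sketched as a small training loop. The data, learning rate, tolerance and variable names below are all assumptions chosen for illustration, and the "network" is a single linear neuron with one weight and one bias, so this shows the shape of the loop rather than a full multi-layer implementation.

```python
import random

# Step 1: inputs and desired outputs (hypothetical data following y = 2x + 1).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Step 2: random starting weight and bias.
random.seed(0)
w = random.uniform(-1.0, 1.0)
b = random.uniform(-1.0, 1.0)

learning_rate = 0.05  # assumed value
tolerance = 1e-6      # assumed stopping threshold for the mean squared error

for epoch in range(10000):
    # Step 3: calculate the error between actual and desired outputs.
    preds = [w * x + b for x in xs]
    mse = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)

    # Step 4: check whether the error is small enough to stop.
    if mse < tolerance:
        break

    # Step 5: update the weight and bias against the gradient of the error.
    grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / len(xs)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b


# Step 6: the trained model can now be tried on an input it has not seen.
def predict(x):
    return w * x + b


print(predict(10.0))
```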

The human brain is often described as a deep and complex recurrent neural network. Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. In very simple words, we can define the two terms as below.

• Feedforward propagation – A type of neural network architecture in which the connections are “fed forward” only, i.e. the values flow from input to hidden to output.
• Backpropagation (a supervised learning algorithm) – A training algorithm with two steps:
    • Feed the values forward.
    • Calculate the error and propagate it back to the layer before.

Propagating forward shows how the neural network behaves, i.e. how well it performs. We observe the error, and then backpropagation comes in to reduce it by updating the weights and biases in a gradient descent manner. In short, forward propagation is part of the backpropagation algorithm and comes before the backward pass.
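To make the two steps concrete, here is a minimal sketch of one forward and one backward pass through the smallest possible chain: one input, one hidden neuron and one output neuron. The sigmoid activation, the squared-error loss, the starting weights and the learning rate are assumptions for illustration; biases are left out to keep it short.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, w2):
    """Step 1: feed the value forward through hidden and output neurons."""
    h = sigmoid(w1 * x)  # hidden neuron output
    y = sigmoid(w2 * h)  # output neuron output
    return h, y


def loss(x, target, w1, w2):
    """Squared error between the network output and the target."""
    _, y = forward(x, w1, w2)
    return 0.5 * (y - target) ** 2


def backward(x, target, w1, w2):
    """Step 2: calculate the error and propagate it back layer by layer."""
    h, y = forward(x, w1, w2)
    delta_out = (y - target) * y * (1 - y)       # error signal at the output
    grad_w2 = delta_out * h
    delta_hidden = delta_out * w2 * h * (1 - h)  # error passed to the layer before
    grad_w1 = delta_hidden * x
    return grad_w1, grad_w2


x, target = 1.0, 0.9   # hypothetical training pair
w1, w2 = 0.5, -0.3     # hypothetical starting weights
learning_rate = 0.5    # assumed value

loss_before = loss(x, target, w1, w2)
for _ in range(1000):
    g1, g2 = backward(x, target, w1, w2)
    w1 -= learning_rate * g1
    w2 -= learning_rate * g2
loss_after = loss(x, target, w1, w2)
print(loss_before, loss_after)
```

Note how `delta_hidden` is computed from `delta_out`: the hidden layer never sees the target directly, only the error handed back from the layer after it.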

### The Need for Backpropagation

Backpropagation, or backward propagation, is a handy and important mathematical tool for improving the accuracy of predictions in machine learning. As mentioned above, it is used in neural networks as the learning algorithm that computes the gradient which gradient descent needs in order to adjust the weights.

Backpropagation is a very efficient learning algorithm for multi-layer neural networks compared with weight perturbation, a form of reinforcement learning. In perturbation, we randomly perturb one weight at a time, measure the change in performance, and keep the change only if an improvement is seen, which is quite inefficient.

Backpropagation computes the error derivatives efficiently, and for all hidden units at the same time. In this regard it is far better, as you don’t need to randomly change one weight and then repeat the whole forward pass. It is a supervised machine learning algorithm because it requires a known, desired output for each input value; this is what allows it to calculate the gradient of the loss function. This algorithm is emerging as an important machine learning tool for predictive analytics.
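The efficiency argument can be illustrated with a small comparison. The sketch below is an assumption-laden simplification: it uses a single linear neuron with three weights, and the perturbation version nudges each weight by a fixed small amount instead of a random one. It still shows the cost difference: perturbation needs one extra forward pass per weight, while the chain-rule gradient gets every derivative from a single forward pass.

```python
def predict(weights, x):
    """Forward pass of a single linear neuron."""
    return sum(w * xi for w, xi in zip(weights, x))


def loss(weights, x, target):
    return 0.5 * (predict(weights, x) - target) ** 2


def perturbation_gradient(weights, x, target, eps=1e-6):
    """Estimate each derivative by nudging one weight and re-running the forward pass."""
    base = loss(weights, x, target)
    grads = []
    for i in range(len(weights)):
        nudged = list(weights)
        nudged[i] += eps
        grads.append((loss(nudged, x, target) - base) / eps)
    return grads


def analytic_gradient(weights, x, target):
    """All derivatives at once via the chain rule, from one forward pass."""
    error = predict(weights, x) - target
    return [error * xi for xi in x]


weights = [0.2, -0.4, 0.1]  # hypothetical values
x = [1.0, 2.0, 3.0]
target = 1.0

print(perturbation_gradient(weights, x, target))
print(analytic_gradient(weights, x, target))
```

Both functions return nearly the same numbers, but in a network with millions of weights only the second approach is practical.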

### How the backpropagation algorithm works

The backpropagation algorithm works like a recipe for changing the weights Wij in any feed-forward network. The idea is for the network to learn a training set of input-output pairs, here the inputs (a1, a2) and the desired output b. This section describes how a multi-layer neural network that employs the backpropagation algorithm works. Our example is a three-layer neural network with two inputs and one output.

In this example, each neuron is made up of two units.

• First unit – This unit adds up the products of the weight coefficients and the input signals.
• Second unit – This unit applies a nonlinear function, called the neuron transfer (activation) function.

Here the signal ‘r’ is the adder output, and b = f(r) is the output signal of the nonlinear element. A neural network needs a training data set to learn from. Our training data set consists of input signals (a1 and a2) paired with the desired output ‘b’. Since training a neural network is an iterative process, in each iteration the weight coefficients of the nodes are modified using new data from the training data set.
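The two-unit neuron described above can be sketched as follows. The names `adder`, `activation`, `neuron` and `network` are hypothetical, the sigmoid is only one possible choice for f, and the hidden layer size (two neurons) and weight values are assumptions, since the text only fixes two inputs and one output.

```python
import math


def adder(weights, signals):
    """First unit: sum of the products of weight coefficients and input signals."""
    return sum(w * s for w, s in zip(weights, signals))


def activation(r):
    """Second unit: nonlinear transfer function f (a sigmoid is assumed here)."""
    return 1.0 / (1.0 + math.exp(-r))


def neuron(weights, signals):
    r = adder(weights, signals)  # adder output r
    return activation(r)         # neuron output b = f(r)


def network(a1, a2, hidden_weights, output_weights):
    """Forward pass: two inputs -> hidden neurons -> one output neuron."""
    hidden = [neuron(w, [a1, a2]) for w in hidden_weights]
    return neuron(output_weights, hidden)


hidden_weights = [[0.1, 0.4], [-0.3, 0.2]]  # hypothetical: one weight list per hidden neuron
output_weights = [0.5, -0.6]                # hypothetical

b = network(1.0, 0.0, hidden_weights, output_weights)
print(b)
```

During training, the output `b` computed here would be compared with the desired output for the pair (a1, a2), and the difference would drive the weight updates.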

#### Books and Other Material Referred To

• Open Internet reading and research work
• AILabPage (group of self-taught engineers) members hands-on lab work.

#### Points to Note:

When to use artificial neural networks as opposed to traditional machine learning algorithms is a complex question to answer. It depends entirely on the problem at hand. One needs to be patient and experienced enough to find the correct answer. All credit, if any, remains with the original contributors. The next post will talk about using neural nets to recognise handwritten digits.

#### Feedback & Further Question

Do you have any questions about AI, machine learning, data science or big data analytics? Leave a question in the comment section or ask via email, and we will try our best to answer it.

Conclusion – In the post above we saw that:

• The whole idea of backpropagation (a generalisation of the Widrow-Hoff learning rule to multiple-layer networks) is to optimise the weights on the connections between neurons and the bias of each hidden layer.
• Backpropagation is used in neural networks as the learning algorithm that computes the gradient which gradient descent needs in order to adjust the weights.

Backpropagation is needed to get correct and accurate results from a neural network. For the sake of best performance, training a neural network can’t be divorced from backpropagation: it is used to train the network until it gives the best approximation of the target function.
