Deep Learning

Deep Learning – Introduction to Convolutional Neural Networks

Convolutional neural networks (CNNs) are inspired by the structure of the brain, but our focus here will not be on neuroscience, as we have no expertise or academic knowledge in any biological aspect. We are going artificial in this post. CNNs are a class of neural networks that have proven very effective in areas such as image recognition, processing and classification. In this article, we will explore and discuss our intuitive explanation of convolutional neural networks at a high level and in simple language.

Convolutional Neural Networks are a special kind of multi-layer neural network.

https://vinodsblog.com/2018/10/25/deep-learning-introduction-to-artificial-neural-networks/

 

 

What are Convolutional Neural Networks

Convolutional neural networks (CNNs) might look like magic to many, but in reality they are just science and mathematics. CNNs are a class of neural networks that have proven very effective at image recognition, so in most cases they are applied to image processing. They are a variation of the multilayer perceptron designed for processing and classification. A CNN is a deep learning algorithm that takes an image as input, assigns learnable weights and biases to the objects and features in it, and is finally able to differentiate one image from another.

Everything You Need to Know About Convolutional Neural Networks.

 

As per Wiki – In machine learning, a convolutional neural network (CNN, or ConvNet) is a class of deep, feed-forward artificial neural networks, most commonly applied to analysing visual imagery.

AI solutions built on CNNs are transforming how businesses and developers create user experiences and solve real-world problems. CNNs are sometimes described as an application of neuroscience to machine learning. They employ a mathematical operation known as “convolution”, which is a specialised kind of linear operation.

Applications of Convolutional Neural Networks include high-calibre AI systems such as AI-based robots, virtual assistants, and self-driving cars. Other common applications include:

  • Image Processing
    • Recognition
    • Classification
    • Video labelling
    • Text analysis
  • Speech Recognition
    • Natural language processing
    • Text classification processing

 


 

Data Processing – Convolutional Neural Network

CNNs process data that has a grid-like topology. The data is called grid-like because neighbouring data points are spatially correlated, and the processing exploits that correlation.

  • 1D Grid – Time series data – Takes samples at regular time intervals
  • 2D Grid – Image data – Grid of pixels

These neural networks use convolution in place of general matrix multiplication in at least one of their layers. Convolution leverages three ideas:

  • Equivariant Representations – If the input is shifted, the output shifts in the same way.
  • Sparse Interactions – The filter (kernel) is much smaller than the input, so each output value depends on only a few neighbouring inputs. This allows the network to efficiently describe complicated interactions between many variables from simple building blocks.
  • Parameter sharing – Using the same parameter for more than one function in a model; the same filter weights are applied at every position of the input.
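
The three properties listed above can be seen in a very small 1D example. What follows is only a sketch in plain Python: the function name `conv1d`, the sample signal and the three-weight filter are assumptions made for illustration, not something taken from a particular library.

```python
def conv1d(signal, kernel):
    """'Valid' 1D convolution as used in CNNs (cross-correlation, no kernel flip)."""
    k = len(kernel)
    # Sparse interactions: each output only looks at k neighbouring inputs.
    # Parameter sharing: the same k weights are reused at every position.
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


signal = [0, 0, 1, 2, 1, 0, 0, 0]   # hypothetical 1D grid (e.g. a time series)
kernel = [1, 0, -1]                 # hypothetical filter with 3 shared weights

out = conv1d(signal, kernel)

# Equivariance: shifting the input one step to the right
# shifts the output one step to the right as well.
shifted = [0] + signal[:-1]
out_shifted = conv1d(shifted, kernel)

print(out)          # [-1, -2, 0, 2, 1, 0]
print(out_shifted)  # [0, -1, -2, 0, 2, 1]
```

Note that the filter has only three weights no matter how long the signal is, which is why convolution needs far fewer parameters than a general matrix multiplication over the same input.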

These three ideas together help improve a machine learning system. Convolution also allows for working with inputs of variable size.

A significant limitation of these neural networks is how constrained they are at the API level. The input (e.g. an image) and the output (e.g. class probabilities) are both fixed-size vectors. Even the computation is performed as a mapping through a fixed number of layers.

 

Some history around – Convolutional Neural Networks (CNN’s)

ConvNets have been successful in identifying faces, objects and traffic signs, apart from powering vision in robots and self-driving cars. Yann LeCun's pioneering network was named LeNet5, after many previous successful iterations since the year 1988. LeNet was one of the very first convolutional neural networks, and it helped push Deep Learning forward.


In the 1990s, the LeNet architecture was used mainly for character recognition tasks such as reading zip codes and digits. In 2012, Alex Krizhevsky used a convolutional neural network to win the ImageNet competition, and ever since then all the big companies have been racing to adopt them. CNNs are among the most influential innovations in the computer vision field.

 

Image Processing – Human vs Computers

For humans, recognising objects is one of the first skills we learn, right from birth. A newborn baby starts recognising faces such as Papa and Mumma. By the time we turn into adults, recognition has become an effortless and almost automatic process.

Humans process images very differently from machines. Just by looking around, we immediately characterise the scene and give each object a label, without even consciously noticing.

For a computer, recognising objects is more complex: it sees everything as numeric input, and its output is a class or a set of classes. This is known as image processing, which we will discuss in detail in the next section. In computers, CNNs do image recognition and image classification. They are very useful in object detection and face recognition, and have had success in various text classification tasks that use word embeddings.


So in simple words, computer vision is the ability to automatically understand any image or video based on visual elements and patterns.

 

Inputs and Outputs – How it works

Our focus in this post will be on image processing only.

CNN models need to be trained and tested. Each input image passes through a series of convolution layers with filters (kernels), pooling layers and fully connected (FC) layers, and then a softmax function is applied to classify an object with probabilistic values between 0 and 1. Softmax is a generalisation of the logistic function that “squashes” a K-dimensional vector of arbitrary real values into a K-dimensional vector of real values between 0 and 1 that add up to 1. To be processed this way, every image in a CNN is represented as a matrix of pixel values.
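
To make the softmax step concrete, here is a minimal sketch in plain Python. The raw scores are made-up values for four classes, and subtracting the maximum score before exponentiating is a common numerical-stability habit that we are assuming here rather than something the description above requires.

```python
import math


def softmax(scores):
    """Squash K arbitrary real scores into K values in (0, 1) that sum to 1."""
    m = max(scores)                             # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.0, 2.0, 5.0, 0.5]   # hypothetical raw class scores from the last layer
probs = softmax(scores)

print(probs)        # four probabilities, the third one is the largest
print(sum(probs))   # approximately 1.0
```

The class with the highest raw score always ends up with the highest probability, so softmax does not change the prediction; it only makes the scores readable as probabilities.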

The Convolutional Neural Network classifies an input image into categories, e.g. dog, cat, deer, lion or bird.

Everything You Need to Know About Convolutional Neural Networks

The Convolution + Pooling layers act as feature extractors from the input image, while the fully connected layer acts as a classifier. In the figure above, on receiving a deer image as input, the network correctly assigns the highest probability to it (0.94) among all four categories. The sum of all probabilities in the output layer should be one. There are four main operations in the ConvNet shown in the image above:

  1. Convolution
  2. Non Linearity (ReLU)
  3. Pooling or Sub Sampling
  4. Classification (Fully Connected Layer)

These operations are the basic building blocks of every Convolutional Neural Network, so understanding how they work is an important step towards developing a sound understanding of ConvNets.

 

Convolutional Neural Networks – Architecture

The layers used to build Convolutional Neural Networks are those shown in the picture above. A simple ConvNet is a sequence of layers, and every layer of a ConvNet transforms one volume of activations into another through a differentiable function. We use four main types of layers to build the ConvNet architecture above:

Convolutional Layer, ReLU, Pooling, and Fully Connected Layer.

 

Initialisation 

All filters and parameters/weights are initialised with random values.
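
As a rough illustration of this step, the sketch below fills a bank of filters with small random numbers using only the standard library. The helper name `init_filters`, the 5x5 filter size, the Gaussian distribution and the 0.01 scale are all assumptions chosen for the example; real frameworks offer several different initialisation schemes.

```python
import random


def init_filters(num_filters, channels, size, scale=0.01, seed=None):
    """Return num_filters filters, each channels x size x size, with small random weights."""
    rng = random.Random(seed)
    return [
        [[[rng.gauss(0.0, scale) for _ in range(size)]
          for _ in range(size)]
         for _ in range(channels)]
        for _ in range(num_filters)
    ]


# Hypothetical setup: 16 filters of 5x5 over a 3-channel (R, G, B) image.
filters = init_filters(num_filters=16, channels=3, size=5, seed=42)
biases = [0.0] * 16            # one bias per filter, started at zero (an assumption)

num_weights = 16 * 3 * 5 * 5   # 1200 trainable weights in this layer
print(num_weights)
```

Starting from random rather than identical values matters because filters that begin identical would receive identical updates and never learn different features.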

 

Convolution Layer 

This layer takes the raw pixel values of the training image as input. In the example above, an image (deer) of width 32, height 32 and three colour channels (R, G, B) is used. The layer preserves the spatial relationship between pixels by learning image features using small squares of input data. The image then goes through the forward propagation step, and the network finds the output probabilities for each class.

  • The layer computes the output of neurons that are connected to local regions in the input. This may result in a volume such as [32x32x16] for 16 filters.
  • The size of the feature map is controlled by three parameters.
    • Depth – Number of filters used for the convolution operation.
    • Stride – Number of pixels by which the filter matrix slides over the input matrix.
    • Padding – Padding the input matrix with zeros around the border, so that the filter can also be applied to the bordering elements of the matrix.
  • Let's assume the output probabilities for the image above are [0.2, 0.1, 0.3, 0.4].
  • The total error at the output layer is calculated by summing over all 4 classes.
    • Total Error = ∑ ½ (target probability – output probability)²
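
Two pieces of arithmetic from the list above can be checked with a short sketch. The output-size formula is the standard one for a convolution with a given filter size, stride and zero-padding. The 5x5 filter, the padding of 2 and the one-hot target that marks the fourth class as correct are assumptions made purely for illustration.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Width (or height) of the feature map produced by one convolution."""
    return (input_size + 2 * padding - filter_size) // stride + 1


def total_error(target, output):
    """Total Error = sum over classes of 1/2 * (target - output)^2."""
    return sum(0.5 * (t - o) ** 2 for t, o in zip(target, output))


# 32x32 input, hypothetical 5x5 filters, stride 1, zero-padding of 2.
size = conv_output_size(32, 5, stride=1, padding=2)
depth = 16                          # number of filters
print((size, size, depth))          # (32, 32, 16)

# Without padding the feature map shrinks.
print(conv_output_size(32, 5))      # 28

output = [0.2, 0.1, 0.3, 0.4]       # probabilities from the text
target = [0, 0, 0, 1]               # assumed: the fourth class is the correct one
err = total_error(target, output)
print(round(err, 4))                # 0.25
```

During training, this error is what backpropagation tries to reduce by adjusting the filter weights.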

 

Rectified Linear Unit (ReLU) Layer

ReLU is a non-linear operation. This layer applies an element-wise activation function and is used after every convolution operation. It is applied per pixel and replaces all negative pixel values in the feature map with zero. This leaves the size of the volume unchanged ([32x32x16]).
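
A minimal sketch of ReLU on a single feature map, written in plain Python; the 3x3 values are invented for the example.

```python
def relu(feature_map):
    """Replace every negative value with zero; the shape stays the same."""
    return [[max(0, v) for v in row] for row in feature_map]


feature_map = [
    [-3, 5, -1],
    [2, -8, 0],
    [4, -2, 7],
]   # hypothetical output of a convolution

rectified = relu(feature_map)
print(rectified)   # [[0, 5, 0], [2, 0, 0], [4, 0, 7]]
```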

 

Pooling Layer

Also called subsampling or downsampling. The pooling layer performs a downsampling operation along the spatial dimensions (width, height), resulting in a volume such as [16x16x16], i.e. it reduces the dimensionality of each feature map but retains the most important information.

  • Max Pooling operation on a Rectified Feature map.
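
Max pooling can be sketched in a few lines of plain Python. The 2x2 window with stride 2 is a common choice that we are assuming here, and the 4x4 rectified feature map is made up for the example.

```python
def max_pool(feature_map, size=2, stride=2):
    """Keep only the largest value in each size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, cols - size + 1, stride)
        ]
        for r in range(0, rows - size + 1, stride)
    ]


rectified = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]   # hypothetical rectified feature map

pooled = max_pool(rectified)
print(pooled)   # [[6, 8], [3, 4]]
```

With a 2x2 window and stride 2, width and height are halved, which matches the step from [32x32x16] to [16x16x16] described above when applied to each of the 16 feature maps.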


 

Fully Connected Layer

In a Fully Connected layer, each node is connected to every node in the adjacent layer. The FC layer computes the class scores with a traditional multilayer perceptron that uses a softmax activation function in the output layer. It results in a volume of size [1x1x10], where each of the 10 numbers corresponds to a class score, such as among the 10 categories of CIFAR-10.


The main job of this layer is to take the input volume that comes out of the preceding conv, ReLU or pool layer and arrange it into an N-dimensional vector, where N is the number of classes that the program has to choose from.
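
The flow into the fully connected layer can be sketched as two small steps: flatten the incoming volume into one long vector, then compute one weighted sum per class. The tiny 2-channel volume, the weights and the biases below are hypothetical values picked so the arithmetic is easy to follow; in a trained network they would be learned.

```python
def flatten(volume):
    """Turn a channels x height x width volume into a flat list."""
    return [v for channel in volume for row in channel for v in row]


def fully_connected(inputs, weights, biases):
    """One score per class: every input is connected to every output node."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


volume = [
    [[1, 2], [3, 4]],   # channel 1 of a hypothetical pooled volume
    [[0, 1], [1, 0]],   # channel 2
]
x = flatten(volume)     # [1, 2, 3, 4, 0, 1, 1, 0]

# N = 3 classes, so 3 rows of 8 weights each (made-up numbers).
weights = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
biases = [0, 1, -2]

scores = fully_connected(x, weights, biases)
print(scores)   # [1, 5, 10]
```

These raw scores would then be passed through softmax to obtain the class probabilities.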

 

Convolutional Neural Networks – Real life Business Use Cases

Many modern companies use CNNs as the backbone of their business, e.g. Pinterest uses them for home feed personalisation and Instagram for its search infrastructure. Three of the biggest use cases are listed below.


  • Automatic Tagging Algorithms – Tagging, or social bookmarking, refers to the action of associating a relevant keyword or phrase with an entity (e.g. document, image, or video). Our experiment showed us that an effective time-frequency representation is important for automatic tagging, and that more complex models benefit from more training data.
  • Photo Search – Finding images that are similar to an image the user supplies, or that match a text query, is available in Google search results. It works well in the Chrome app. Google's algorithms rely on more than 200 unique signals or “clues” that make it possible to guess what you are searching for. Signals include the websites themselves, the age of the content, the IP-address-based region and PageRank. Sadly, this is highly biased based on skin colour. You can give it a try though.
  • Product Recommendations – Large-scale recommender systems are in use in almost every e-commerce, retail, video-on-demand or music streaming business. Algorithms in recommender systems are typically classified into two categories, content-based and collaborative filtering methods, although modern recommenders combine both approaches.

 

Books Referred & Other material referred

  • Open Internet & AILabPage members hands-on lab work.
  • LeNet5  documentation.

 

Points to Note:

All credit, if any, remains with the original contributors. This post covered convolutional neural networks. The last post was on supervised machine learning, and the next post will talk about reinforcement machine learning.

 

Feedback & Further Question

Do you have any questions about Deep Learning or Machine Learning? Leave a comment or ask your question via email. I will try my best to answer it.

 

Conclusion – This post was an attempt to explain the main concepts behind Convolutional Neural Networks in simple terms. A CNN is a neural network with some convolutional layers and some other layers. A convolutional layer has a number of filters that perform the convolution operation. The process of building a CNN always involves four major steps, i.e. Convolution, Pooling, Flattening and Full connection, which were covered in detail. Choosing the parameters, applying filters with strides (and padding if required), performing convolution on the image and applying ReLU activation to the resulting matrix is the core process in a CNN, and if you get this wrong the whole joy is over then and there.

 


11 replies »

  1. Hi,
    This is the latest booming technology to learn more about it your post help me.
    Wonderful illustrated information. I thank you for that. No doubt it will be very useful for my future projects. Would like to see some other posts on the same subject!

    Thank you for sharing…

  2. This is very high level info not much of details to learn.
    Do we loose any information when using a feature detector at Convolution + Pooling layers which act as feature extractors?

  3. I am student and worker at same time and I loved your narrative Convolutional Neural Networks are very similar to ordinary Neural Networks from the previous. Please help to answer in details how the flow to FCL happens, pls let me know bit by bit

  4. Great post! Thanks. so much for the work for people like me really appreciate. I have few questions though if you can answer please

    1 – What makes convolutional filters in the first convolutional layer “unique”?
    2 – Are all 5×5 filters have same behaviour.
    3 – Are they just being passed through different non-linear functions or something?
    4 – Why don’t they produce the same representations?
    5 – What informs such decisions? makes
