A brief guide to CNN: Convolutional Neural Networks

In this article, I will explain the concept of convolutional neural networks (CNNs) with the help of several illustrated examples, and make the case for using CNNs over regular multilayer neural networks for processing images. Let’s dive in and discuss CNNs in detail.

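Now, you might be wondering why CNNs are mostly applied to images and not to other kinds of data. A CNN slides (strides) small filters across the image and picks up local patterns among neighboring pixels. Data without that kind of grid structure, such as a typical table of records, gives the filters nothing meaningful to stride over.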

What is Image analysis?

Image analysis (also known as “computer vision” or “image recognition”) is the ability of computers to recognize attributes within an image. Do you use Google Photos or Apple’s Photos app on your smartphone?

They both use some basic image analysis features to recognize faces and categorize them in your photos so you can look at all of your photos of a particular person. Type “dog” into the search function within either app to quickly locate your collection of puppy photos or type “beach” to find your tropical vacation pics.

What are Artificial Neural Networks?

Artificial neural networks are an attempt to replicate the network of neurons that makes up a human brain, so that a machine can recognize things and make decisions in a human-like manner.

Figure: animated illustration of an artificial neural network (GIF source: Google).

Different Layers of CNN

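A gray-scale image contains only shades of gray and no color. It has a single layer (channel), where each value from 0 to 255 describes the intensity of the pixel at that point. A color image, by contrast, typically has three layers (red, green and blue). A convolutional layer works on such an image as follows: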

Step 1: The height and width of a filter are smaller than those of the input volume.

Step 2: A Conv layer contains a set of filters/kernels whose parameters need to be learned.

Step 3: A convolutional layer produces several output feature maps, each of them connected to some or all of the feature maps of the previous layer.

GRAY-SCALE (0-255)

Now, let’s discuss how convolutional neural networks work.

The size of each feature map depends on the dimensions of the filters, and the number of feature maps depends on the number of filters being used.
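To make the relationship concrete, here is a small sketch. It assumes the commonly used output-size formula (input size minus filter size plus twice the padding, divided by the stride, plus one); the function name `feature_map_shape` and the 28x28 input are made up for illustration.

```python
def feature_map_shape(input_hw, filter_hw, num_filters, stride=1, padding=0):
    """Return (height, width, depth) of a convolutional layer's output.

    Assumes out = (in - filter + 2 * padding) // stride + 1 per dimension.
    The depth equals the number of filters: one feature map per filter.
    """
    in_h, in_w = input_hw
    f_h, f_w = filter_hw
    out_h = (in_h - f_h + 2 * padding) // stride + 1
    out_w = (in_w - f_w + 2 * padding) // stride + 1
    return out_h, out_w, num_filters


# Larger filters give smaller feature maps; the filter count sets the depth.
print(feature_map_shape((28, 28), (3, 3), num_filters=9))  # (26, 26, 9)
print(feature_map_shape((28, 28), (5, 5), num_filters=9))  # (24, 24, 9)
```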

Let’s understand what convolutional filters do. There are 9 convolutional filters applied to the input layer in this example. An image can be seen as a matrix I, where I(x, y) is the brightness of the pixel located at coordinates (x, y).

A convolved image is computed between the matrix I and a kernel matrix K, which represents the type of filter.
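Written out, with I for the image matrix and K for the kernel, one common way to express this is shown below. The output name S, the kernel size m x n and the indices i, j are notation introduced here for illustration, and the formula assumes no padding and a stride of 1. It is the form usually computed by CNN layers, in which the kernel is not flipped.

```latex
S(x, y) = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} I(x + i,\, y + j)\, K(i, j)
```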

Let’s go through each part of feature extraction in a convolutional layer one by one.

CONVOLUTIONAL LAYER

Input layer: The input layer of a neural network is composed of artificial input neurons and brings the initial data into the system for further processing by subsequent layers of artificial neurons. It is the very beginning of the workflow for the artificial network.

Every image is a matrix of pixel values. The range of values that can be encoded in each pixel depends upon its bit depth; an 8-bit pixel, for example, can hold values from 0 to 255.
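As a quick illustration of both points, the sketch below stores a tiny gray-scale image as a NumPy matrix and computes the value range for a given bit size. The helper `pixel_range` and the pixel values are invented for the example.

```python
import numpy as np


def pixel_range(bits):
    """Smallest and largest unsigned value a pixel of the given bit size can hold."""
    return 0, 2 ** bits - 1


print(pixel_range(8))  # (0, 255)

# A tiny 2x3 gray-scale image: one 8-bit intensity value per pixel.
image = np.array([[0, 128, 255],
                  [64, 192, 32]], dtype=np.uint8)
print(image.shape)               # (2, 3)
print(image.min(), image.max())  # 0 255
```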

Filters/kernels: Convolution uses a kernel to extract certain features from an input image. In other words, a kernel is a small matrix that is slid across the image and multiplied element-wise with the part of the input it covers, and the products are summed, so that the output emphasizes a certain desirable feature.

Output/Feature maps: A feature map is the output of one filter applied to the previous layer. A given filter is slid across the entire previous layer, moved one pixel at a time. Each position results in the activation of one neuron, and the outputs are gathered in the feature map.
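The sliding described above can be sketched in a few lines of NumPy. This is a minimal illustration, not an optimized implementation: the function name `convolve2d`, the 5x5 test image and the vertical-edge kernel are all assumptions chosen for the example, and no padding is applied.

```python
import numpy as np


def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` and return the resulting feature map."""
    k_h, k_w = kernel.shape
    out_h = (image.shape[0] - k_h) // stride + 1
    out_w = (image.shape[1] - k_w) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            top, left = y * stride, x * stride
            patch = image[top:top + k_h, left:left + k_w]
            # Element-wise product of patch and kernel, summed to one value.
            feature_map[y, x] = np.sum(patch * kernel)
    return feature_map


# Dark on the left, bright on the right: one vertical edge.
image = np.array([[0, 0, 0, 255, 255]] * 5, dtype=float)

# A simple kernel that responds to left-to-right increases in brightness.
vertical_edge = np.array([[-1, 0, 1],
                          [-1, 0, 1],
                          [-1, 0, 1]], dtype=float)

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # every row is [0, 765, 765]
```

The flat dark region produces 0, while the positions that overlap the edge produce a large response, which is the sense in which the kernel "extracts" a feature.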

ACTIVATION FUNCTIONS

ReLU: ReLU stands for Rectified Linear Unit, and it is commonly used as the activation function in deep learning models. The function returns 0 if it receives any negative input, and for any positive value x it returns that value back, so it can be written as: f(x) = max(0, x)

Tanh: An activation that usually works better than the sigmoid function is the tanh function, also known as the hyperbolic tangent function. It is actually a mathematically shifted and scaled version of the sigmoid function, so the two are similar and can be derived from each other.

Equation: tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)) = 2 * sigmoid(2x) - 1

Sigmoid: Sigmoid functions are used extensively in neural networks. The sigmoid is an activation function of the kind known as a squashing function, because it squashes any input into the range between 0 and 1.
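The three activation functions can be written in NumPy as follows. This is a minimal sketch using the standard definitions; the sample input values are arbitrary.

```python
import numpy as np


def relu(x):
    """Return 0 for negative inputs and the input itself otherwise."""
    return np.maximum(0, x)


def sigmoid(x):
    """Squash any real input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def tanh(x):
    """Squash any real input into the range (-1, 1)."""
    return np.tanh(x)


x = np.array([-2.0, 0.0, 2.0])
print(relu(x))     # [0. 0. 2.]
print(sigmoid(x))  # values strictly between 0 and 1
print(tanh(x))     # values strictly between -1 and 1

# tanh is a shifted and scaled sigmoid: tanh(x) = 2 * sigmoid(2x) - 1
print(np.allclose(tanh(x), 2 * sigmoid(2 * x) - 1))  # True
```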

POOLING LAYER

Max Pooling: It calculates the maximum value for each patch of the feature map.

Average Pooling: It calculates the average value for each patch of the feature map.
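Both kinds of pooling can be sketched with one small function. The name `pool2d`, the 2x2 patch size and the sample feature map are assumptions for illustration; patches do not overlap, and any rows or columns that do not fill a whole patch are dropped.

```python
import numpy as np


def pool2d(feature_map, size=2, mode="max"):
    """Pool non-overlapping size x size patches of a 2-d feature map."""
    h, w = feature_map.shape
    out_h, out_w = h // size, w // size
    # Crop to a multiple of `size`, then split into (out_h, size, out_w, size).
    cropped = feature_map[:out_h * size, :out_w * size]
    patches = cropped.reshape(out_h, size, out_w, size)
    if mode == "max":
        return patches.max(axis=(1, 3))
    if mode == "average":
        return patches.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'average'")


feature_map = np.array([[1, 3, 2, 4],
                        [5, 7, 6, 8],
                        [9, 2, 1, 0],
                        [3, 4, 5, 6]], dtype=float)

print(pool2d(feature_map, mode="max"))      # rows: [7, 8] and [9, 6]
print(pool2d(feature_map, mode="average"))  # rows: [4, 5] and [4.5, 3]
```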

FLATTEN LAYERS

Once the pooled feature map is obtained, the next step is to flatten it. Flattening converts an n-dimensional array into a 1-d array.
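In NumPy, flattening is a single call. The shape used below (a 2x2 pooled map with 3 channels) is an arbitrary assumption for the example.

```python
import numpy as np

# Pretend this is the pooled output: 2x2 spatial positions, 3 feature maps.
pooled = np.arange(2 * 2 * 3).reshape(2, 2, 3)

flat = pooled.flatten()  # copy of the data as a 1-d array

print(pooled.shape)  # (2, 2, 3)
print(flat.shape)    # (12,)
```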

The generic arrangement of layers can thus be summarized as follows: input, convolution, activation, pooling, flattening, and finally classification.

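Let’s now look at the classification phase that follows the convolutional layers.

In the classification phase, the flattened output of the pooling layers is passed through fully connected layers, whose weights and biases are the trainable parameters. Passing an input through the network to obtain a prediction is called the “Feed Forward” pass, because the data flows only from input to output. The weights and biases themselves are updated during training by “back-propagation”, which sends the prediction error backward through the network.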
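A feed-forward pass through one fully connected layer can be sketched as below. Everything specific here is an assumption for illustration: the helper names `dense_forward` and `softmax`, the 12-value flattened input, the 3 output classes, and the random weights, which stand in for values that training would normally learn.

```python
import numpy as np


def dense_forward(x, weights, bias):
    """Fully connected layer: weighted sum of the inputs plus a bias."""
    return weights @ x + bias


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()


rng = np.random.default_rng(0)
flat = rng.random(12)                # stand-in for a flattened feature map
weights = rng.normal(size=(3, 12))   # untrained weights, 3 hypothetical classes
bias = np.zeros(3)

probs = softmax(dense_forward(flat, weights, bias))
print(probs.shape)  # (3,)
print(probs)        # one probability per class
```

Only the forward direction is shown; updating `weights` and `bias` from the prediction error is the job of back-propagation and is not part of this sketch.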

Bottom Line

In a nutshell, this was a tutorial on convolutional neural networks. Convolutional neural networks play a significant role in AI. They are a fundamental example of deep learning, where models push the evolution of “Artificial Intelligence” by building systems that simulate different types of biological human brain activity. Today, transfer learning has brought higher-end AI to edge computing. One can reuse the feature extraction weights of a pre-trained network in a customized classification model. In this way, one can create object detection and image classification models using transfer learning and even train them on CPUs.

Hence, convolutional neural networks play a vital role in different computer vision applications such as real-time face recognition, object detection, and the detection of human trafficking.


Blog Paradise Techsoft