Introduction to Neural Network Tutorial
Introduction: What are Neural Networks?
Neural networks are a form of computing whose beginnings date back to the 1940s, when people first started thinking of the human brain as a computer. One of them was Alan Turing, one of the most famous computer scientists and mathematicians of all time, who, in one of his essays, proposed an architecture based on this very idea, in which very simple units combine to perform calculations through the interactions between them. Neural networks, or rather Artificial Neural Networks (ANNs), are, as Wikipedia explains, a family of machine learning models inspired by the “original” neural networks present in the nervous systems of living beings; hence the “artificial”. They are generally used to approximate functions that depend on a large number of inputs.
The concept is not new, but it has picked up pace in recent times with advances in technology such as cheap parallel processing (essential for the calculations neural networks perform), adoption by most cutting-edge technologies, and the boom of big data.
What happens in the neural networks of our nervous system?
Neural networks in the nervous system, or “biological” neural networks, are series of interconnected “neurons” (pictured below). The human brain consists of roughly 10^11 neurons, each of them connected to around 10^4 other neurons. Neighbouring neurons interact with one another through their axon terminals and dendrites. When the sum of input signals into a particular neuron crosses a certain threshold, the neuron ‘fires’, transmitting an electric signal along its axon in the form of pulses. This particular transmission property is the inspiration behind creating neural networks artificially.
Image courtesy: Wikimedia Commons
How do ANNs resemble biological neurons?
- Neural networks acquire knowledge through a learning process. The neural network in the brain learns on behalf of the body throughout its lifespan; ANNs learn to perform better during the modelling (training) process.
- The acquired knowledge is stored in the interconnections in the form of weights. These weights keep changing as the network is trained, so the “updated weights” are the “acquired knowledge”.
In the same context, ANNs have layers of neurons which pass messages between neighbouring layers. The connections between layers are assigned numeric weights, which are tuned so the network performs better as it learns and adapts to its inputs. These weights and the transformation (activation) function are defined by the network designer. Usually the weights start out random and are then trained to perform better, as happens in other machine learning models. In the case of picture recognition, for example, the model first needs to be trained with sample pictures; training tunes the ‘adaptive’ weights between the layers, which in turn improves prediction accuracy. Artificial neural networks have spurred remarkable recent progress in image classification and speech recognition.
Interestingly, with the growing popularity of neural networks, people are also running them in the reverse direction: if a network knows everything about what makes up a picture, it should be able to create new pictures. These are known as generative neural networks. They are also used when you have a piece of text and want the network to extend it for you; music is another example.
Model Representation & Feed Forward Propagation
As discussed above, artificial neural networks are composed of layers of neurons. Let's take a 3-layer network as an example. The first layer, which takes the input, is known as the input layer, and the one which produces the output is the output layer. All the layers in between are generally known as hidden layers. In the biological analogy, information is transmitted across a ‘synapse’, the junction between two nerve cells, consisting of a minute gap across which impulses pass by diffusion of a neurotransmitter. In an ANN, the ‘weights’ play the role of synapses: they manipulate the data and are tuned during training.
Typically, in an artificial neural network model:
- Neurons in adjacent layers must be interconnected.
- There must be a process for updating the weights as the model learns.
- There must be an ‘activation function’, which determines a neuron’s output from its weighted inputs.
Let’s understand this mathematically using a diagram.
The first image shows the basic logical unit of an ANN. x1, x2, x3 are inputs to the neuron, which is represented as a yellow circle and outputs hθ(x): the activation function applied to the inputs and the corresponding weights θ.
The second image shows what a neural ‘network’ looks like: layers of such units connected in a networked fashion. Each a_i^(j) is the output of the activation function applied to the previous layer’s outputs and the corresponding weights. So the neuron in Layer 3 takes as inputs the outputs of Layer 2, which were in turn computed from Layer 1.
Here, Θ^(j) is the matrix of weights governing the connections from layer j to layer j+1, so the activations of the next layer are a^(j+1) = g(Θ^(j) a^(j)), where g is the activation function.
A network whose graph is a directed acyclic graph is known as a feed-forward network. Essentially, in a feed-forward neural network the information moves in only one direction, forward from the input neurons through all the hidden ones in between to the output neurons, making no cycles in the network. The other prominent class is the recurrent neural network, whose graph does contain cycles; backpropagation, often mentioned in the same breath, is a training algorithm rather than a network type.
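As a rough sketch of feed-forward propagation, the snippet below (hypothetical layer sizes and random weights, sigmoid activation, with a bias unit prepended at each layer, all illustrative choices) pushes an input vector through each layer in turn:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(x, weights):
    """Propagate input x through each layer's weight matrix in turn."""
    a = x
    for theta in weights:
        a = np.append(1.0, a)   # prepend the bias unit
        a = sigmoid(theta @ a)  # weighted sum, then activation
    return a

# hypothetical 3-layer network: 3 inputs -> 2 hidden units -> 1 output
rng = np.random.default_rng(0)
weights = [rng.standard_normal((2, 4)),   # Theta(1): 2 x (3 inputs + bias)
           rng.standard_normal((1, 3))]   # Theta(2): 1 x (2 hidden + bias)
print(feed_forward(np.array([1.0, 0.5, -0.2]), weights))
```

With a sigmoid output, the final activation always lies between 0 and 1, which is why the same structure doubles as a probability-style classifier.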
Let’s look at some examples:
- Single Layer perceptron
Perceptrons are networks which consist of just one unit. The perceptron is the simplest kind of neural network: a single layer in which inputs and weights are fed in to produce the output. The sum of the products of the inputs and their corresponding weights is calculated at the node and checked against a threshold; if the sum crosses the threshold, the output takes the activated value, and the deactivated value otherwise. A perceptron can be trained with the perceptron learning rule, which, much like gradient descent, calculates the error in the output and adjusts the weights accordingly.
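As an illustration, here is a minimal single-layer perceptron trained with the perceptron learning rule on the AND function, which is linearly separable (the learning rate, epoch count, and 0/1 encoding are illustrative choices, not prescribed above):

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=0.1):
    """Perceptron learning rule: nudge weights by the prediction error."""
    w = np.zeros(X.shape[1] + 1)  # weights plus a bias term in w[0]
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if w[0] + w[1:] @ xi > 0 else 0  # threshold activation
            w[1:] += lr * (target - pred) * xi        # adjust weights
            w[0] += lr * (target - pred)              # adjust bias
    return w

# logical AND: output fires only when both inputs are 1
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w = train_perceptron(X, y)
preds = [1 if w[0] + w[1:] @ xi > 0 else 0 for xi in X]
print(preds)  # → [0, 0, 0, 1]
```

Because AND is linearly separable, the rule is guaranteed to converge; on a non-separable problem like XOR a single unit can never get all four cases right, which is exactly what motivates the multi-layer networks below.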
- Multi-layer perceptron(MLP)
A multilayer perceptron is a network of multiple layers of neurons connected in a feed-forward fashion. The backpropagation algorithm is the workhorse for training MLPs. Each neuron in a layer is connected to every neuron in the subsequent layer. An MLP has an input layer and an output layer with one or more hidden layers in between. The neurons in the hidden layers are not directly accessible, hence they are called hidden. Often the neurons in these networks are referred to as nodes.
Image Reference : Feedforward Neural Networks: An Introduction, by Wiley.
Information enters at the inputs and passes through the network, layer by layer, until it reaches the output layer. During normal operation, that is, when the network acts as a classifier, there is no feedback between layers; this is why they are called feedforward neural networks. The sigmoid function is one commonly used activation function in this case.
- Sigmoid function
One of the reasons for this particular function’s popularity is that its derivative is easily calculated: for σ(x) = 1 / (1 + e^(−x)), the derivative is σ'(x) = σ(x)(1 − σ(x)), so the value computed in the forward pass can be reused during training.
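A short sketch of that property: since the derivative is expressible in terms of the sigmoid itself, computing it costs only one extra multiply once the forward value is known.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    # sigma'(z) = sigma(z) * (1 - sigma(z)): reuses the forward-pass value
    s = sigmoid(z)
    return s * (1.0 - s)

print(sigmoid(0.0))             # → 0.5
print(sigmoid_derivative(0.0))  # → 0.25
```

The derivative peaks at 0.25 when z = 0 and shrinks toward zero for large |z|, which is why deep sigmoid networks can suffer from vanishing gradients.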