“Sharing will enrich everyone with more knowledge.”
As the name suggests, Transfer Learning means transferring knowledge from one task to another. It is a process in which a model is trained for one task and then reused for another similar or related task. The more closely the tasks are related, the easier it is to transfer the knowledge.
Reasons to use Transfer Learning:
We know that training a Deep Neural Network in Deep Learning needs:
· an extremely large amount of data
· large clusters of computing servers
· a lot of time
· a lot of energy (electricity), because training runs on GPUs and TPUs
All of the above factors increase financial costs. They also increase environmental costs, because the energy consumed by clusters of processors during the long training runs of deep neural network models leads to CO2 emissions. For these reasons, we should optimize our models so that these factors are reduced significantly. Doing so saves cost and protects the environment, both of which are major considerations for any model. This is where Transfer Learning comes into the picture. It helps when there is not enough data to train a model from scratch. It also reduces the time needed to train the model and the resulting carbon dioxide emissions, so it saves both energy and cost.
Let’s see some examples of Transfer Learning:
In deep learning, a neural network model is first trained on one problem that is related to the problem we actually want to solve. We then freeze one or more layers of this pre-trained model and reuse them in another model.
First, we will see some simple examples of transfer learning in everyday life:
1) If someone knows how to cook noodles, it will be easier for them to cook other Chinese food too.
2) If we know Math concepts very well, it will be easier to learn Machine Learning, as it uses many Math concepts.
There are many examples where we can see this transfer of knowledge in our daily life. Often we don’t need to start from scratch; we just transfer our knowledge to solve new, related problems.
Now, consider an example with a CNN. Let’s look at a simple classification problem, Cat and Dog images, from the Transfer Learning point of view. In other words, can we train a CNN on Cat images and then use that training to recognize Dog images? To get the answer, let’s go deeper. We know that Cats and Dogs differ in shape and size, but the features identified in a Cat image can be useful in identifying a Dog image, and vice versa: both have 2 eyes, 4 legs, one tail, whiskers, fur and so on. These are generic features that both animals share, so they can be reused to identify Dogs in a new model. We can freeze the feature layers of the pre-trained model in the new model, fine-tune it for the features specific to Dogs, and then classify Dog images with the new model.
How to apply Transfer Learning:
We can apply Transfer Learning to models in the following steps:
· Feature extraction
· Fine tuning the pre-trained model
Feature extraction:
A CNN has many layers that capture different features at different levels of the network. We can take advantage of this property and use these layers as feature extractors.
Let us call the CNN model that we train on Cat images Model-1.
First, the input image is given to the input layer, which is the first layer of Model-1. Suppose the input is a 3D image of a Cat (for example, height, width and color channels). The machine stores this image as a pixel array. This input array is then passed to the next stage, the feature extraction stage, which in a CNN consists of many hidden layers. Convolution is performed in these internal layers. Pooling is also done here; it reduces the size of the intermediate feature maps so that less computation time is needed. The outputs of the feature extraction stage are the features (like 2 eyes, 4 legs, one tail, whiskers, fur, etc.) that represent the Cat image in this model. They are then used as the input to the next layers, which are fully connected layers. In a fully connected layer, each neuron is connected to every neuron of the previous layer. These layers provide the classification, which tells us whether the input image is a Cat or not. A simple deep learning model like Model-1 is shown in the pictures below for our case.
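To make the convolution and pooling steps described above more concrete, here is a minimal sketch in plain Python, without any deep learning library. The 5x5 image, the 2x2 kernel and the function names conv2d and max_pool2d are made up for illustration; a real CNN learns its kernels during training and uses many of them per layer.

```python
def conv2d(image, kernel):
    """Slide the kernel over a 2D image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 5x5 grayscale image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-picked kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d(image, kernel)   # 4x4 map, large where the edge is
pooled = max_pool2d(feature_map)      # 2x2 map, smaller but the edge survives

print(feature_map)
print(pooled)
```

The pooled output is a quarter of the size of the feature map, yet it still records where the edge was found. This is the sense in which pooling reduces computation while keeping the useful features.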
Fine tuning the pre-trained model:
Now our Model-1, trained on Cat images, is ready, and we will use it in Model-2. As we know, a CNN is a highly configurable architecture with various hyperparameters. We can freeze certain layers during retraining and fine-tune the rest of them to fit our needs. This helps us achieve better performance with less training time.
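The idea of freezing some layers and leaving the rest open for fine-tuning can be sketched as a simple bookkeeping exercise. The Layer class, the layer names, the weight values and the build_model_2 helper below are all hypothetical; they only illustrate which parts of Model-1 are reused and which are allowed to change, not how any particular library does it.

```python
import copy
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    weights: list
    trainable: bool = True


def build_model_2(model_1, num_frozen):
    """Reuse Model-1's feature layers, freeze the first num_frozen, add a new classifier."""
    # Copy every layer except the old classifier so Model-1 itself stays untouched.
    layers = copy.deepcopy(model_1[:-1])
    for index, layer in enumerate(layers):
        # Early layers stay frozen; later ones remain open for fine-tuning.
        layer.trainable = index >= num_frozen
    # Fresh, untrained classifier for the new task (Dog or not).
    layers.append(Layer("dog_classifier", weights=[0.0, 0.0], trainable=True))
    return layers


# Made-up Model-1 with three feature layers and a Cat classifier.
model_1 = [
    Layer("conv_1", [0.2, -0.1]),
    Layer("conv_2", [0.7, 0.3]),
    Layer("conv_3", [-0.4, 0.9]),
    Layer("cat_classifier", [1.5, -2.0]),
]

model_2 = build_model_2(model_1, num_frozen=2)
for layer in model_2:
    print(layer.name, "trainable" if layer.trainable else "frozen")
```

How many layers to freeze is a judgment call: the more similar the two tasks are, the more layers can usually stay frozen.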
Now we have a clear picture of how to use Transfer Learning in our models. The idea behind feature transfer is to take the generic features from the extraction layers that were trained in Model-1 on a given data set (in our case, Cat images) and freeze them in the new Model-2, so that Model-2 builds on these generic features. Then, in the next layers, we fine-tune the weights inherited from Model-1 for features specific to Dogs, like the shape of the nose, body structure, etc. Finally, we feed this output to the classification layers of Model-2 to identify Dog images: if the input image is a Dog, the model answers Yes, otherwise No. This method is ideal when the two problem domains are similar or related, which is true in our case.
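As a rough illustration of this workflow, the sketch below keeps a small feature extractor frozen and trains only a new Yes/No classifier on top of it. It is a toy in plain Python under strong assumptions: the frozen weights, the two-number "images" and the function names are invented for this example, and the intermediate fine-tuning step is left out to keep it short.

```python
import math

# Hypothetical frozen weights standing in for Model-1's trained feature layers.
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def extract_features(x):
    """Frozen feature extractor: ReLU(W x). These weights are never updated."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_WEIGHTS]


def predict_probability(x, head_w, head_b):
    """New classification head: logistic regression on the frozen features."""
    features = extract_features(x)
    z = sum(w * f for w, f in zip(head_w, features)) + head_b
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=500, lr=0.1):
    """Train only the head with per-sample gradient descent on the log loss."""
    head_w, head_b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            features = extract_features(x)
            error = predict_probability(x, head_w, head_b) - y
            head_w = [w - lr * error * f for w, f in zip(head_w, features)]
            head_b -= lr * error
    return head_w, head_b


# Made-up two-number "images": label 1 = Dog (Yes), label 0 = not a Dog (No).
samples = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
labels = [1, 1, 0, 0]

head_w, head_b = train_head(samples, labels)
predictions = [1 if predict_probability(x, head_w, head_b) >= 0.5 else 0 for x in samples]
print(predictions)
```

Only head_w and head_b change during training, so far fewer numbers have to be learned than when training the whole network from scratch. That is where the savings in data and time come from.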
Therefore, we can say that by using Transfer Learning, we can train a CNN on Cat images and then use that training to recognize Dog images.
*CNN – Convolutional Neural Network
I hope you now have a basic idea of Transfer Learning in CNNs after reading this blog. I have tried my best to explain this concept with very simple examples. Thanks and Happy Reading!!