In general, padding means a cushioning material. In a CNN it refers to the extra pixels added around the border of an image before it is processed, which allows the image to be analyzed more accurately. The added border gives the kernel room to cover the pixels at the edges of the image as fully as those in the middle. This is especially helpful when features near the borders of the image need to be detected.
When an image undergoes convolution, the kernel is moved across it according to the stride. As it moves, the kernel scans each pixel, but it scans some pixels many times and others (those on the borders) only a few times. In general, pixels in the middle are used more often than pixels on the corners and edges, so information at the borders is under-represented in the output, which may cause poor detection of features there. We can overcome this problem using padding.
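The uneven coverage described above can be counted directly. The following is a minimal sketch using NumPy; the helper name coverage_counts is hypothetical, and the 6x6 image with a 3x3 kernel and stride 1 is chosen only for illustration.

```python
import numpy as np

def coverage_counts(n, f, stride=1):
    """Count how many kernel positions cover each pixel of an n x n image."""
    counts = np.zeros((n, n), dtype=int)
    for top in range(0, n - f + 1, stride):
        for left in range(0, n - f + 1, stride):
            # Every pixel under the kernel at this position is used once more.
            counts[top:top + f, left:left + f] += 1
    return counts

# Without padding: a 6x6 image scanned by a 3x3 kernel.
unpadded = coverage_counts(n=6, f=3)
print(unpadded)

# With one layer of padding the scanned image is 8x8; slicing off the
# padding shows how often the ORIGINAL 6x6 pixels are now covered.
padded = coverage_counts(n=8, f=3)[1:-1, 1:-1]
print(padded)
```

In the first grid a corner pixel is covered once while a central pixel is covered nine times. In the second grid the corner pixels of the original image are covered four times, so the border contributes more to the output.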
For a grayscale (n x n) image and an (f x f) filter/kernel, the image resulting from a convolution operation with stride 1 has dimensions (n - f + 1) x (n - f + 1). For example, if we use an 8x8 image and a 3x3 filter, the output is 6x6 after convolution. This means the image shrinks after every convolution, which limits how deep a network can be built, but we can overcome this with padding.
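The shrinking effect is easy to check with a few lines of arithmetic. This is a small sketch, not taken from any library: the function name conv_output_size is hypothetical, and the padding p and stride s arguments generalize the (n - f + 1) formula above using the standard output-size expression.

```python
def conv_output_size(n, f, p=0, s=1):
    """Side length of the output for an n x n input and an f x f kernel.

    p is the padding added on each side and s is the stride.
    With p=0 and s=1 this reduces to n - f + 1.
    """
    return (n + 2 * p - f) // s + 1

# Apply three unpadded 3x3 convolutions in a row to an 8x8 image.
size = 8
sizes = [size]
for _ in range(3):
    size = conv_output_size(size, 3)
    sizes.append(size)

print(sizes)  # [8, 6, 4, 2]
```

After only three layers the 8x8 image is down to 2x2, which is too small for another 3x3 filter.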
Types of padding
There are a few types of padding, such as Valid, Same, Causal, Constant, Reflection and Replication. Of these, the most popular are Valid padding and Same padding. Let us look at them more closely.
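To get a feel for how the value-based types differ, here is a rough illustration on a single row of pixels. It uses NumPy's np.pad, taking its "constant", "reflect" and "edge" modes as stand-ins for constant, reflection and replication padding; that mapping is an assumption made for this sketch, and a deep learning framework may name or implement these modes differently.

```python
import numpy as np

row = np.array([1, 2, 3, 4])

# Constant padding: fill with a fixed value (zero here, i.e. zero padding).
constant = np.pad(row, 2, mode="constant", constant_values=0)

# Reflection padding: mirror the values around the edge pixel.
reflection = np.pad(row, 2, mode="reflect")

# Replication padding: repeat the edge pixel.
replication = np.pad(row, 2, mode="edge")

# Causal padding (1D sequences): pad only at the start, so an output
# never depends on later inputs. Two zeros suit a kernel of width 3.
causal = np.pad(row, (2, 0), mode="constant", constant_values=0)

print(constant)     # [0 0 1 2 3 4 0 0]
print(reflection)   # [3 2 1 2 3 4 3 2]
print(replication)  # [1 1 1 2 3 4 4 4]
print(causal)       # [0 0 1 2 3 4]
```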
Valid padding (or no padding): Valid padding is simply no padding. This is what Keras uses by default if the padding is not specified. When an (n x n) image is used with an (f x f) filter and valid padding, the output image size is (n - f + 1) x (n - f + 1).
[(n x n) image] * [(f x f) filter] -> [(n - f + 1) x (n - f + 1) image]
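The valid case can be reproduced with a naive convolution written in NumPy. This is a sketch for illustration, not how Keras implements it: the function name conv2d_valid is hypothetical, stride 1 is assumed, and the kernel is applied without flipping (cross-correlation), as CNN layers usually do.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive stride-1 convolution with valid (no) padding."""
    n_rows, n_cols = image.shape
    f_rows, f_cols = kernel.shape
    out = np.zeros((n_rows - f_rows + 1, n_cols - f_cols + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Multiply the window under the kernel element-wise and sum.
            out[i, j] = np.sum(image[i:i + f_rows, j:j + f_cols] * kernel)
    return out

image = np.arange(64, dtype=float).reshape(8, 8)  # 8x8 test image
kernel = np.ones((3, 3)) / 9.0                    # 3x3 averaging filter

result = conv2d_valid(image, kernel)
print(result.shape)  # (6, 6)
```

The 8x8 input comes out as 6x6, matching (n - f + 1) = 8 - 3 + 1 = 6.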
Same padding: Same padding is used when we need an output of the same shape as the input. It calculates and adds the padding required around the input image so that the shape is the same before and after the convolution (with stride 1). If the values used for the padding are zeroes, it is called zero padding: every pixel in the padding has a value of zero. For example, when the zero padding is set to 1, a 1-pixel border of zeros is added around the image. When we use an (n x n) image and an (f x f) filter and add padding (p) to the image, the output image size is (n x n). That means padding restores the size of the image. The following equation represents the sizes of the input and output with same padding.
[(n + 2p) x (n + 2p) image] * [(f x f) filter] -> [(n x n) image]
The value of p is (f - 1)/2, since (n + 2p - f + 1) = n. This gives a whole number only when f is odd.
We can use the above formula to calculate how many layers of padding must be added to get an output of the same size as the original image. For example, if we use a 6x6 image and a 3x3 filter, we need 1 layer of padding [p = (3 - 1)/2 = 1] to get a 6x6 output image. This example is represented in the following diagram.
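The same 6x6 example can also be checked numerically. The sketch below assumes a square kernel with an odd size, stride 1 and zero padding; the function name conv2d_same is hypothetical and the code is only an illustration of the formula, not the Keras implementation.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Naive stride-1 convolution with zero padding p = (f - 1) / 2."""
    n_rows, n_cols = image.shape
    f = kernel.shape[0]
    if kernel.shape[0] != kernel.shape[1] or f % 2 == 0:
        raise ValueError("this sketch expects a square kernel with odd size")
    p = (f - 1) // 2
    # Add p layers of zeros on every side: (n x n) becomes (n + 2p) x (n + 2p).
    padded = np.pad(image, p, mode="constant", constant_values=0)
    out = np.zeros((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            out[i, j] = np.sum(padded[i:i + f, j:j + f] * kernel)
    return out

image = np.ones((6, 6))   # 6x6 image of ones
kernel = np.ones((3, 3))  # 3x3 filter of ones, so p = 1

result = conv2d_same(image, kernel)
print(result.shape)   # (6, 6)
print(result[0, :3])  # [4. 6. 6.]
```

The output keeps the 6x6 shape of the input. The smaller values along the border (4 at the corner, 6 along the edge, 9 in the interior) show where the kernel overlapped the zero padding.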
It is important to understand the concept of padding because it helps us preserve the border information of the input data. Without padding, the borders of the original image cannot be inspected properly, because a border pixel can never sit at the center of the kernel. Hence the need for padding for better accuracy. Padding also helps retain the size of the input. In Keras, the padding parameter of a convolution layer can be set to valid or same.