MAX POOLING (2024)

The pooling operation involves sliding a two-dimensional filter over each channel of the feature map and summarising the features lying within the region covered by the filter.
For a feature map having dimensions nh x nw x nc, the dimensions of the output obtained after a pooling layer are

floor((nh - f) / s + 1) x floor((nw - f) / s + 1) x nc

where

nh - height of the feature map
nw - width of the feature map
nc - number of channels in the feature map
f - size of the filter
s - stride length
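
As a quick sanity check of this formula, a few lines of plain Python (the function name here is illustrative, not from any library):

import math

def pooled_dims(nh, nw, nc, f, s):
    # floor((n - f) / s) + 1 along each spatial axis; channels unchanged
    out_h = math.floor((nh - f) / s) + 1
    out_w = math.floor((nw - f) / s) + 1
    return (out_h, out_w, nc)

# a 4x4 single-channel map with a 2x2 filter and stride 2 -> 2x2x1
print(pooled_dims(4, 4, 1, 2, 2))  # (2, 2, 1)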

A common CNN model architecture is to have a number of convolution and pooling layers stacked one after the other.

Why Pooling Layers?

  • Pooling layers are used to reduce the dimensions of the feature maps. Thus, it reduces the number of parameters to learn and the amount of computation performed in the network.
  • The pooling layer summarises the features present in a region of the feature map generated by a convolution layer. So, further operations are performed on summarised features instead of precisely positioned features generated by the convolution layer. This makes the model more robust to variations in the position of the features in the input image.

Types of Pooling:

  1. MaxPooling
  2. Average Pooling
  3. Global Pooling

Max Pooling

Max pooling is a pooling operation that selects the maximum element from the region of the feature map covered by the filter. Thus, the output of a max pooling layer is a feature map containing the most prominent features of the previous feature map.

Average Pooling

Average pooling computes the average of the elements present in the region of the feature map covered by the filter. Thus, while max pooling returns the most prominent feature in a particular patch of the feature map, average pooling returns the average of the features present in that patch.

Global Pooling

Global pooling reduces each channel in the feature map to a single value. Thus, an nh x nw x nc feature map is reduced to a 1 x 1 x nc feature map. This is equivalent to using a filter of dimensions nh x nw, i.e. the dimensions of the feature map itself.
Further, it can be either global max pooling or global average pooling.
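
Both global variants are easy to sketch in NumPy (a minimal sketch, assuming a channels-last nh x nw x nc array):

import numpy as np

x = np.random.rand(4, 4, 3)   # an nh x nw x nc feature map

gap = x.mean(axis=(0, 1))     # global average pooling -> shape (3,)
gmp = x.max(axis=(0, 1))      # global max pooling -> shape (3,)
print(gap.shape, gmp.shape)   # (3,) (3,)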

MaxPooling is a down-sampling operation often used in Convolutional Neural Networks (CNNs) to reduce the spatial dimensions of the input volume. It is a form of pooling layer, and it helps in retaining the most important information while discarding less important details. MaxPooling is typically applied after convolutional layers in a CNN.

The basic idea behind MaxPooling is to divide the input image into non-overlapping rectangular regions and, for each region, output the maximum value. This operation is performed independently for each channel in the input.

Here’s a simple explanation of how MaxPooling works:

Input Region:

  • The input image is divided into small regions (usually 2x2 or 3x3).
  • For each region, the maximum value is computed.

Output Feature Map:

  • The maximum value for each region is taken and forms the output of that region.
  • The result is a down-sampled version of the input, with reduced spatial dimensions.

Mathematically, if we denote the input as X and the output as Y, the MaxPooling operation (here with a 2x2 window and stride 2) can be defined as:

Y[i,j,k]=max(X[2i:2i+2,2j:2j+2,k])

where i and j iterate over the height and width dimensions of the input, and k iterates over the channels.

Common choices for the size of the pooling window are 2x2 or 3x3, and the stride (the step size when moving the pooling window) is often set to be equal to the size of the window for non-overlapping pooling.
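
Here is one way to implement this in plain NumPy, as a sketch assuming a channels-last input whose height and width are divisible by the window size:

import numpy as np

def max_pool_2x2(x):
    # x: (H, W, C) feature map; H and W assumed even
    h, w, c = x.shape
    # split each spatial axis into (blocks, 2) and take the max inside each 2x2 block
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

x = np.arange(16).reshape(4, 4, 1)
print(max_pool_2x2(x)[:, :, 0])
# [[ 5  7]
#  [13 15]]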

Here is a minimal Keras example: a single max pooling layer (2x2 window, stride 2) applied to a 4x4, single-channel input.
import numpy as np
from keras.models import Sequential
from keras.layers import MaxPooling2D

# define input image
image = np.array([[2, 2, 7, 3],
                  [9, 4, 6, 1],
                  [8, 5, 2, 4],
                  [3, 1, 2, 6]])
image = image.reshape(1, 4, 4, 1)

# define model containing just a single max pooling layer
model = Sequential([MaxPooling2D(pool_size=2, strides=2)])

# generate pooled output
output = model.predict(image)

# print output image
output = np.squeeze(output)
print(output)

[[9. 7.]
 [8. 6.]]

Let’s go through a simple example of MaxPooling with a 2x2 pooling window. Consider a small 4x4 input matrix:

X = [[ 1,  3,  2,  4],
     [ 5,  6,  7,  8],
     [ 9, 10, 11, 12],
     [13, 14, 15, 16]]

Now, let’s apply 2x2 MaxPooling to this input matrix. The pooling operation involves moving a 2x2 window across the input and, for each window, taking the maximum value. The output matrix, Y, will have reduced spatial dimensions.

Y[i,j]=max(X[2i:2i+2,2j:2j+2])

Let’s calculate Y step by step:

  1. For i=0 and j=0:

Y[0,0] = max(X[0:2, 0:2]) = max([[1, 3],
                                 [5, 6]]) = 6

  2. For i=0 and j=1:

Y[0,1] = max(X[0:2, 2:4]) = max([[2, 4],
                                 [7, 8]]) = 8

  3. For i=1 and j=0:

Y[1,0] = max(X[2:4, 0:2]) = max([[9, 10],
                                 [13, 14]]) = 14

  4. For i=1 and j=1:

Y[1,1] = max(X[2:4, 2:4]) = max([[11, 12],
                                 [15, 16]]) = 16

The resulting output matrix Y is:

Y = [[ 6,  8],
     [14, 16]]

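The same result can be checked in NumPy with the reshape trick sketched earlier (one of several ways to do it):

import numpy as np

X = np.array([[ 1,  3,  2,  4],
              [ 5,  6,  7,  8],
              [ 9, 10, 11, 12],
              [13, 14, 15, 16]])
Y = X.reshape(2, 2, 2, 2).max(axis=(1, 3))  # 2x2 windows, stride 2
print(Y)
# [[ 6  8]
#  [14 16]]
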
Max pooling offers several benefits in the context of CNNs:

  • Feature Invariance: Max pooling helps the model become invariant to small shifts in the position of features. As discussed below, this means the network can recognize a feature even when its exact location in the image varies.
  • Dimensionality Reduction: By downsampling the input, max pooling significantly reduces the number of parameters and computations in the network, thus speeding up the learning process and reducing the risk of overfitting.
  • Noise Suppression: Max pooling helps to suppress noise in the input data. By taking the maximum value within the window, it emphasizes the presence of strong features and diminishes the weaker ones.

In practice, max pooling layers are placed after convolutional layers in a CNN. After a convolutional layer extracts features from the input image, the max pooling layer reduces the spatial size of the convolved feature map, keeping only the most salient information. This process is repeated for multiple convolutional and pooling layers, allowing the network to learn a hierarchy of features at various levels of abstraction.

Max pooling is a simple yet effective technique that has been instrumental in the success of CNNs in various applications, particularly in image and video recognition tasks. Its ability to reduce the computational burden while maintaining the essential features has made it a staple component in deep learning architectures.

Despite its benefits, max pooling is not without its challenges. One criticism is that it can sometimes be too aggressive, discarding potentially useful information that could be important for the classification task. Moreover, max pooling is a fixed operation and does not learn from the data, unlike convolutional layers that have learnable parameters.

As a result, some modern CNN architectures have started to move away from traditional max pooling layers, using alternatives like strided convolutions for downsampling or incorporating learnable pooling operations that can adapt to the data.
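
For example, a convolution with strides=2 downsamples by a factor of two while learning its own weights; a minimal Keras sketch (the layer parameters here are illustrative):

from tensorflow import keras
from tensorflow.keras import layers

# downsample with a strided convolution instead of a pooling layer
downsampler = keras.Sequential([
    layers.Conv2D(filters=64, kernel_size=3, strides=2,
                  padding="same", activation="relu"),
    # More layers follow
])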

Returning to standard max pooling, here is how a max pooling layer is typically added to a Keras model:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(filters=64, kernel_size=3),  # activation is None
    layers.MaxPool2D(pool_size=2),
    # More layers follow
])

A MaxPool2D layer is much like a Conv2D layer, except that it uses a simple maximum function instead of a kernel, with the pool_size parameter analogous to kernel_size. However, a MaxPool2D layer has no trainable weights, unlike a convolutional layer, which carries them in its kernel.
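
One quick way to see this, assuming for illustration a 64x64 grayscale input:

# build the Conv2D + MaxPool2D model above on a concrete input shape
# and inspect the parameter counts per layer
model.build(input_shape=(None, 64, 64, 1))
model.summary()  # Conv2D reports 640 trainable parameters; MaxPool2D reports 0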

Let’s take another look at the extraction figure from the last lesson. Remember that MaxPool2D is the Condense step.

[Figure: the feature extraction steps, with MaxPool2D performing the Condense step]

Notice that after applying the ReLU function (Detect) the feature map ends up with a lot of “dead space,” that is, large areas containing only 0’s (the black areas in the image). Having to carry these 0 activations through the entire network would increase the size of the model without adding much useful information. Instead, we would like to condense the feature map to retain only the most useful part — the feature itself.

This in fact is what maximum pooling does. Max pooling takes a patch of activations in the original feature map and replaces them with the maximum activation in that patch.

[Figure: max pooling replaces each patch of activations with the maximum activation in that patch]

When applied after the ReLU activation, it has the effect of “intensifying” features. The pooling step increases the proportion of active pixels to zero pixels.
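
A small NumPy illustration of this effect (the feature map values are made up for the sketch):

import numpy as np

fm = np.array([[0., 0., 0., 0.],
               [0., 9., 0., 0.],
               [0., 0., 0., 3.],
               [0., 0., 0., 0.]])
pooled = fm.reshape(2, 2, 2, 2).max(axis=(1, 3))  # 2x2 max pooling, stride 2
print(pooled)               # [[9. 0.] [0. 3.]]
print((fm > 0).mean())      # 0.125 -- fraction of active pixels before pooling
print((pooled > 0).mean())  # 0.5   -- fraction of active pixels after pooling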

Translation Invariance

We called the zero-pixels “unimportant”. Does this mean they carry no information at all? In fact, the zero-pixels carry positional information. The blank space still positions the feature within the image. When MaxPool2D removes some of these pixels, it removes some of the positional information in the feature map. This gives a convnet a property called translation invariance. This means that a convnet with maximum pooling will tend not to distinguish features by their location in the image. ("Translation" is the mathematical word for changing the position of something without rotating it or changing its shape or size.)

Watch what happens when we repeatedly apply maximum pooling to the following feature map.

[Figure: maximum pooling applied repeatedly to a feature map containing two dots]

The two dots in the original image became indistinguishable after repeated pooling. In other words, pooling destroyed some of their positional information. Since the network can no longer distinguish between them in the feature maps, it can’t distinguish them in the original image either: it has become invariant to that difference in position.

In fact, pooling creates translation invariance only over small distances, as with the two dots in the image. Features that begin far apart will remain distinct after pooling; only some of the positional information is lost, not all of it.
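
This can be checked directly in NumPy (a toy sketch using single-pixel "dots"):

import numpy as np

def pool(x):
    # 2x2 max pooling with stride 2
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

a = np.zeros((8, 8)); a[3, 2] = 1.0   # a dot
b = np.zeros((8, 8)); b[2, 3] = 1.0   # the same dot, shifted slightly
c = np.zeros((8, 8)); c[3, 7] = 1.0   # a dot much farther away

print(np.array_equal(pool(pool(a)), pool(pool(b))))  # True: the small shift is lost
print(np.array_equal(pool(pool(a)), pool(pool(c))))  # False: the large shift survives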

[Figure: two features that begin far apart remain distinct after pooling]

This invariance to small differences in the positions of features is a nice property for an image classifier to have. Just because of differences in perspective or framing, the same kind of feature might be positioned in various parts of the original image, but we would still like for the classifier to recognize that they are the same.

Other Pooling Layers

The same toy input from before can be run through an average pooling layer instead:
import numpy as np
from keras.models import Sequential
from keras.layers import AveragePooling2D

# define input image
image = np.array([[2, 2, 7, 3],
                  [9, 4, 6, 1],
                  [8, 5, 2, 4],
                  [3, 1, 2, 6]])
image = image.reshape(1, 4, 4, 1)

# define model containing just a single average pooling layer
model = Sequential([AveragePooling2D(pool_size=2, strides=2)])

# generate pooled output
output = model.predict(image)

# print output image
output = np.squeeze(output)
print(output)

[[4.25 4.25]
 [4.25 3.5 ]]
Finally, global max pooling and global average pooling reduce each channel of the same input to a single value:

import numpy as np
from keras.models import Sequential
from keras.layers import GlobalMaxPooling2D
from keras.layers import GlobalAveragePooling2D

# define input image
image = np.array([[2, 2, 7, 3],
                  [9, 4, 6, 1],
                  [8, 5, 2, 4],
                  [3, 1, 2, 6]])
image = image.reshape(1, 4, 4, 1)

# define gm_model containing just a single global-max pooling layer
gm_model = Sequential([GlobalMaxPooling2D()])

# define ga_model containing just a single global-average pooling layer
ga_model = Sequential([GlobalAveragePooling2D()])

# generate pooled outputs
gm_output = gm_model.predict(image)
ga_output = ga_model.predict(image)

# print pooled outputs
gm_output = np.squeeze(gm_output)
ga_output = np.squeeze(ga_output)
print("gm_output: ", gm_output)
print("ga_output: ", ga_output)

This concludes the basic overview of the max pooling layer in CNN architectures.
