CS231n Convolutional Neural Networks for Visual Recognition
Convolution extracts local features using a kernel/filter of shape (h, w, d), where the depth d is determined by the number of input channels (the depth of the input image).
Stride: the number of pixels the filter moves at each step; a larger stride shrinks the spatial size of the output and reduces computation (it is not primarily a tool against overfitting).
e.g. stride = 1 with a 3×3×3 convolution kernel (a cube, since the kernel depth matches a 3-channel input)
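The notes above can be sketched directly: a minimal NumPy implementation (my own illustration, not from the course) of a valid-mode convolution with a kernel whose depth matches the input's channel count, showing how the stride controls the output's spatial size via `(H - h) // stride + 1`. Like most deep-learning libraries, it actually computes cross-correlation.

```python
import numpy as np

def conv2d(x, k, stride=1):
    """Naive valid convolution (cross-correlation, as in most DL libraries).
    x: (H, W, D) input; k: (h, w, D) kernel whose depth equals the input depth."""
    H, W, D = x.shape
    h, w, d = k.shape
    assert d == D, "kernel depth must equal input depth"
    out_h = (H - h) // stride + 1   # spatial output size shrinks with stride
    out_w = (W - w) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i*stride:i*stride+h, j*stride:j*stride+w, :]
            out[i, j] = np.sum(patch * k)   # one dot product per output pixel
    return out

x = np.random.randn(5, 5, 3)   # toy 5x5 RGB-like input
k = np.random.randn(3, 3, 3)   # 3x3x3 kernel cube
print(conv2d(x, k, stride=1).shape)  # (3, 3)
print(conv2d(x, k, stride=2).shape)  # (2, 2)
```

Note that one kernel produces a single-channel activation map; a real conv layer stacks many such kernels to produce a multi-channel output.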
In image processing and speech recognition, a fully connected neural network requires so many parameters that training is usually difficult to converge, resulting in poor performance.
Convolutional Neural Networks are very similar to ordinary Neural Networks from the previous chapter: they are made up of neurons that have learnable weights and biases. Each neuron receives some inputs, performs a dot product and optionally follows it with a non-linearity. The whole network still expresses a single differentiable score function: from the raw image pixels on one end to class scores at the other. And they still have a loss function (e.g. SVM/Softmax) on the last (fully-connected) layer and all the tips/tricks we developed for learning regular Neural Networks still apply.
So what changes? ConvNet architectures make the explicit assumption that the inputs are images, which allows us to encode certain properties into the architecture. These then make the forward function more efficient to implement and vastly reduce the amount of parameters in the network.
Local Features Are the Key
assumption: if one feature is useful to compute at some spatial position (x, y), then it should also be useful to compute at a different position, due to the translationally-invariant structure of images.
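This weight-sharing assumption can be demonstrated in a few lines (my own toy example, not from the course): a single edge-detecting filter, slid over a 1-D signal, fires wherever the edge appears, and shifting the input simply shifts the response.

```python
import numpy as np

def correlate1d(x, k):
    """Slide kernel k over signal x (valid mode), one shared set of weights."""
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i+len(k)], k) for i in range(n)])

k = np.array([-1.0, 1.0])                     # detects a rising step edge

a = np.array([0, 0, 1, 1, 1, 0, 0], float)    # edge near the left
b = np.array([0, 0, 0, 0, 1, 1, 1], float)    # same edge, shifted right
print(correlate1d(a, k))   # [ 0.  1.  0.  0. -1.  0.]
print(correlate1d(b, k))   # [ 0.  0.  0.  1.  0.  0.]
```

The same weights detect the same feature at both positions; only the location of the response changes. This is exactly why one kernel can be reused across the whole image.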