Deep Neural Network
A Deep Neural Network is a neural network that has multiple hidden layers.
Say we have L layers in the deep network. The vectorized implementation of forward propagation is as follows:
for l = 1 to L:

$$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$$
$$A^{[l]} = g^{[l]}(Z^{[l]})$$

where $g^{[l]}$ is the activation function for layer l (different layers can have different activation functions).
The required probability will be $A^{[L]}$, i.e. the last value output by the loop. (Note that the input layer X is denoted as $A^{[0]}$.)
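The loop above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `forward_propagation`, the list-of-tuples layout for the parameters, and the choice of ReLU for the hidden layer and sigmoid for the output layer are all assumptions made for the example.

```python
import numpy as np

def relu(Z):
    # Element-wise ReLU activation.
    return np.maximum(0, Z)

def sigmoid(Z):
    # Element-wise sigmoid activation.
    return 1.0 / (1.0 + np.exp(-Z))

def forward_propagation(X, parameters, activations):
    """Vectorized forward pass over all m examples at once.

    parameters  -- list of (W, b) tuples for layers 1..L (hypothetical layout)
    activations -- list of callables g for layers 1..L
    """
    A = X  # A[0] = X
    for (W, b), g in zip(parameters, activations):
        Z = W @ A + b  # b has shape (n[l], 1) and broadcasts across the m columns
        A = g(Z)
    return A  # A[L], the output of the last layer

# Usage: a network with 3 inputs, 4 hidden units, 1 output, and m = 5 examples.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 5))
parameters = [
    (rng.standard_normal((4, 3)) * 0.01, np.zeros((4, 1))),
    (rng.standard_normal((1, 4)) * 0.01, np.zeros((1, 1))),
]
AL = forward_propagation(X, parameters, [relu, sigmoid])
print(AL.shape)  # (1, 5)
```

Passing the activations as a list keeps the loop body identical for every layer, which mirrors the point that each layer may use a different activation function.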
Mismatched dimensions are one of the most common sources of bugs in our code. It's a good idea to keep the dimensions listed at the end of this page in mind.
The main reason for using deep neural networks is that shallow networks may not be able to capture/learn complex features. In deep networks, every layer learns more complex features than the layer before it.
Forward and Backward Propagation for deep networks are pretty much the same as they were for shallow networks.
Forward Propagation:

$$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$$
$$A^{[l]} = g^{[l]}(Z^{[l]})$$
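Backward Propagation:

$$dZ^{[l]} = dA^{[l]} * g^{[l]\prime}(Z^{[l]})$$
$$dW^{[l]} = \frac{1}{m} \, dZ^{[l]} A^{[l-1]T}$$
$$db^{[l]} = \frac{1}{m} \sum_{i=1}^{m} dZ^{[l](i)}$$
$$dA^{[l-1]} = W^{[l]T} dZ^{[l]}$$

where $*$ is element-wise multiplication, $dA^{[l]}$ is the gradient of the cost with respect to $A^{[l]}$, and the sum in $db^{[l]}$ runs over the m training examples (the columns of $dZ^{[l]}$).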
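As a rough sketch, the backward step for a single layer l might look like the code below. It assumes the common convention that `dA` holds the gradient of the cost with respect to that layer's activations; the names `layer_backward` and `relu_grad` are hypothetical and chosen only for this example.

```python
import numpy as np

def relu_grad(Z):
    # Derivative of ReLU: 1 where Z > 0, else 0.
    return (Z > 0).astype(float)

def layer_backward(dA, Z, A_prev, W, g_prime):
    """Backward step for one layer.

    dA      -- gradient of the cost w.r.t. A[l], shape (n[l], m)
    Z       -- pre-activation Z[l] saved from the forward pass, shape (n[l], m)
    A_prev  -- A[l-1] saved from the forward pass, shape (n[l-1], m)
    W       -- W[l], shape (n[l], n[l-1])
    g_prime -- derivative of the layer's activation function
    """
    m = A_prev.shape[1]
    dZ = dA * g_prime(Z)                           # element-wise product
    dW = (dZ @ A_prev.T) / m                       # same shape as W
    db = np.sum(dZ, axis=1, keepdims=True) / m     # same shape as b: (n[l], 1)
    dA_prev = W.T @ dZ                             # passed on to layer l-1
    return dA_prev, dW, db
```

Note that each gradient has the same shape as the quantity it corresponds to, which is a useful sanity check when debugging.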
Parameters of a neural network include weights (W's) and biases (b's).
Other factors such as the learning rate, number of hidden layers, number of hidden units, number of iterations, choice of activation function etc. are hyperparameters.
Hyperparameters are chosen before training and control the values that the parameters end up taking.
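To make the distinction concrete, here is a small sketch of a parameter update. It assumes plain gradient descent as the training rule (the notes do not name one) and a hypothetical list-of-tuples layout for parameters and gradients. The weights and biases are what gets updated; `learning_rate` is a hyperparameter that we pick and that decides how far each update moves them.

```python
import numpy as np

def update_parameters(parameters, grads, learning_rate):
    """One gradient-descent step (assumed update rule, for illustration).

    parameters -- list of (W, b) tuples, one per layer
    grads      -- list of (dW, db) tuples with matching shapes
    """
    return [
        (W - learning_rate * dW, b - learning_rate * db)
        for (W, b), (dW, db) in zip(parameters, grads)
    ]

# Usage: same gradients, two different learning rates, two different results.
parameters = [(np.array([[1.0]]), np.array([[0.0]]))]
grads = [(np.array([[0.5]]), np.array([[1.0]]))]
small_step = update_parameters(parameters, grads, learning_rate=0.1)
large_step = update_parameters(parameters, grads, learning_rate=1.0)
```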
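| Vector/Matrix | Dimensions |
| --- | --- |
| $X$ | $(n^{[0]}, m)$ |
| $W^{[l]}$ | $(n^{[l]}, n^{[l-1]})$ |
| $b^{[l]}$ | $(n^{[l]}, 1)$ |
| $Z^{[l]}$ | $(n^{[l]}, m)$ |
| $A^{[l]}$ | $(n^{[l]}, m)$ |

where $n^{[l]}$ denotes the number of nodes/neurons in layer l and m is the number of training examples.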
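One way to catch dimension bugs early is to assert these shapes in code. The sketch below is an illustration under assumptions: the helper names `initialize_parameters` and `forward_shapes` are hypothetical, and the small random initialization (scaled by 0.01) is a common choice rather than something these notes prescribe.

```python
import numpy as np

def initialize_parameters(layer_dims, seed=0):
    """Create (W, b) for layers 1..L, where layer_dims = [n0, n1, ..., nL]."""
    rng = np.random.default_rng(seed)
    parameters = []
    for l in range(1, len(layer_dims)):
        W = rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * 0.01
        b = np.zeros((layer_dims[l], 1))
        parameters.append((W, b))
    return parameters

def forward_shapes(X, parameters):
    """Run the linear part of each layer and check every shape on the way."""
    m = X.shape[1]
    A = X
    shapes = []
    for l, (W, b) in enumerate(parameters, start=1):
        n_l, n_prev = W.shape
        assert n_prev == A.shape[0], f"layer {l}: W expects {n_prev} rows, got {A.shape[0]}"
        assert b.shape == (n_l, 1), f"layer {l}: b has shape {b.shape}"
        Z = W @ A + b
        assert Z.shape == (n_l, m), f"layer {l}: Z has shape {Z.shape}"
        shapes.append(Z.shape)
        A = Z  # an element-wise activation would leave the shape unchanged
    return shapes

# Usage: 5 input features, hidden layers of 4 and 3 units, 1 output, m = 7.
parameters = initialize_parameters([5, 4, 3, 1])
X = np.zeros((5, 7))
print(forward_shapes(X, parameters))  # [(4, 7), (3, 7), (1, 7)]
```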