Deep Learning Specialization - Coursera
Course 1: Neural Networks and Deep Learning

Deep Neural Network

A Deep Neural Network is a neural network that has multiple hidden layers.

Forward Propagation in a Deep Network

Say we have L layers in the deep network. The vectorized implementation of forward propagation is as follows:

for l = 1 to L:

Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}

A^{[l]} = g^{[l]}(Z^{[l]})

where g^{[l]} is the activation function for layer l (different layers can have different activation functions).

The required probability is A^{[L]}, i.e. the last value output by the loop. (Note that the input X is denoted as A^{[0]}.)
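
To make the loop concrete, here is a minimal NumPy sketch of this vectorized forward pass. The dictionary layout of parameters and the choice of ReLU hidden activations with a sigmoid output layer are illustrative assumptions, not something fixed by the notes above.

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

def relu(Z):
    return np.maximum(0, Z)

def forward_propagation(X, parameters, L):
    """Vectorized forward pass through L layers; X has shape (n[0], m)."""
    A = X  # A[0] is the input
    for l in range(1, L + 1):
        W = parameters["W" + str(l)]  # shape (n[l], n[l-1])
        b = parameters["b" + str(l)]  # shape (n[l], 1)
        Z = W @ A + b                 # Z[l] = W[l] A[l-1] + b[l]
        # Assumed activations: ReLU for hidden layers, sigmoid for the output.
        A = sigmoid(Z) if l == L else relu(Z)
    return A  # A[L], the network's output
```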

Getting Your Dimensions Right

Mismatched dimensions are one of the most common sources of bugs in an implementation. It's a good idea to keep the following dimensions in mind:

| Vector/Matrix | Dimensions |
| --- | --- |
| X | (n^{[0]}, m) |
| W^{[l]} | (n^{[l]}, n^{[l-1]}) |
| b^{[l]} | (n^{[l]}, 1) |
| Z^{[l]} | (n^{[l]}, m) |
| A^{[l]} | (n^{[l]}, m) |

where n^{[l]} denotes the number of nodes/neurons in layer l and m is the number of training examples.
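
One practical way to enforce these shapes is to initialize the parameters from a list of layer sizes and assert each dimension. A minimal sketch, assuming random Gaussian initialization scaled by 0.01 (the layer_dims argument and helper name are made up for illustration):

```python
import numpy as np

def initialize_parameters(layer_dims):
    """layer_dims = [n[0], n[1], ..., n[L]]; returns W[l] and b[l] for each layer."""
    rng = np.random.default_rng(0)
    parameters = {}
    for l in range(1, len(layer_dims)):
        # W[l] has shape (n[l], n[l-1]); b[l] has shape (n[l], 1).
        parameters["W" + str(l)] = rng.standard_normal(
            (layer_dims[l], layer_dims[l - 1])) * 0.01
        parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))
        assert parameters["W" + str(l)].shape == (layer_dims[l], layer_dims[l - 1])
        assert parameters["b" + str(l)].shape == (layer_dims[l], 1)
    return parameters
```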

Why Deep Representations?

The main reason for using deep neural networks is that shallow networks may not be able to capture/learn complex features. In a deep network, each layer learns more complex features than the previous one; in an image network, for example, early layers might detect edges, middle layers parts of objects, and later layers whole objects.

Forward and Backward Propagation

Forward and backward propagation for deep networks are essentially the same as for shallow networks, repeated once per layer.

Forward propagation for layer l takes A^{[l-1]} as input and outputs A^{[l]}, caching Z^{[l]} for the backward pass:

Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}

A^{[l]} = g^{[l]}(Z^{[l]})

Backward propagation for layer l takes dA^{[l]} as input and outputs dA^{[l-1]}, dW^{[l]}, and db^{[l]}:

dZ^{[l]} = dA^{[l]} * g^{[l]'}(Z^{[l]}) (element-wise product)

dW^{[l]} = (1/m) dZ^{[l]} A^{[l-1]T}

db^{[l]} = (1/m) Σ dZ^{[l]} (sum over the m training examples)

dA^{[l-1]} = W^{[l]T} dZ^{[l]}
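
The following NumPy sketch shows how these formulas map to code for a single layer, assuming Z^{[l]} and A^{[l-1]} were cached during the forward pass and, for illustration, a ReLU activation:

```python
import numpy as np

def relu_backward(dA, Z):
    """dZ = dA * g'(Z) for ReLU, whose derivative is 1 where Z > 0, else 0."""
    return dA * (Z > 0)

def linear_activation_backward(dA, Z, A_prev, W):
    """One backward step for layer l: consumes dA[l], returns dA[l-1], dW[l], db[l]."""
    m = A_prev.shape[1]                         # number of training examples
    dZ = relu_backward(dA, Z)                   # dZ[l] = dA[l] * g[l]'(Z[l])
    dW = (dZ @ A_prev.T) / m                    # dW[l] = (1/m) dZ[l] A[l-1]^T
    db = np.sum(dZ, axis=1, keepdims=True) / m  # db[l] = (1/m) Σ dZ[l]
    dA_prev = W.T @ dZ                          # dA[l-1] = W[l]^T dZ[l]
    return dA_prev, dW, db
```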

Parameters vs. Hyperparameters

Parameters of a neural network include weights (W's) and biases (b's).

Other factors, such as the learning rate, the number of hidden layers, the number of hidden units, the number of iterations, and the choice of activation function, are hyperparameters.

Hyperparameters are not learned from the data directly; instead, they control how the parameters are learned, and therefore the values the parameters end up with.
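
For example, in plain gradient descent the learning rate (a hyperparameter) directly scales how the parameters W and b are updated from their gradients. A minimal sketch (function and argument names are illustrative):

```python
def update_parameters(parameters, grads, learning_rate, L):
    """One gradient descent step: the learning_rate hyperparameter controls
    how far each parameter W[l], b[l] moves along its gradient."""
    for l in range(1, L + 1):
        parameters["W" + str(l)] -= learning_rate * grads["dW" + str(l)]
        parameters["b" + str(l)] -= learning_rate * grads["db" + str(l)]
    return parameters
```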
