Neural Style Transfer
Neural Style Transfer is the process of using a convolutional neural network (CNN) to transfer the style of one image (the style image S) onto another image (the content image C), generating a new image G.
It is used by several popular apps such as Prisma.
Neurons in the layers of a CNN learn to detect different patterns. Neurons in deeper layers learn to identify more sophisticated patterns than those in shallower layers.
Neural style transfer minimizes the following cost function:

$$J(G) = \alpha \, J_{\text{content}}(C, G) + \beta \, J_{\text{style}}(S, G)$$

The first term (the content cost) measures how similar the content of G is to that of C, and the second term (the style cost) measures how similar the style of G is to that of S. The hyperparameters $\alpha$ and $\beta$ weight the two terms.
To generate G, we first initialize its pixels with random values. We then run gradient descent on $J(G)$, updating the pixel values of G until we obtain the required styled image.
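As a minimal sketch of this procedure, the toy example below runs gradient descent directly on the pixels of G, using raw pixels as "features" and only a content-like term. Real style transfer instead backpropagates $J(G)$ through a pretrained CNN and includes the style term; this only illustrates the two steps (random initialization, then pixel updates).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the content image C (illustration only; real NST
# compares CNN activations, not raw pixels).
C = rng.random((8, 8))

# Step 1: initialize G with random pixel values.
G = rng.random((8, 8))

alpha, lr = 1.0, 0.1
for _ in range(500):
    # Toy cost: J(G) = alpha * ||G - C||^2 (content term only).
    grad = 2 * alpha * (G - C)  # analytic gradient of the toy cost
    G -= lr * grad              # Step 2: gradient descent on G's pixels

print(np.abs(G - C).max())  # G has converged toward C
```

With both terms present, each gradient step pulls G's pixels toward the content of C and the channel statistics of S simultaneously.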
Say we use a hidden layer l (of a pre-trained CNN) to compute the content cost. (l is usually one of the middle hidden layers; neither too shallow nor too deep).
If $a^{[l](C)}$ and $a^{[l](G)}$ denote the activations of layer l for the images C and G respectively, then the images C and G have similar content if $a^{[l](C)}$ and $a^{[l](G)}$ are similar.
The content cost function is given by:

$$J_{\text{content}}(C, G) = \frac{1}{2} \left\lVert a^{[l](C)} - a^{[l](G)} \right\rVert^2$$
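This cost is straightforward to compute from the two activation volumes. A short numpy sketch, with randomly generated arrays standing in for the layer-l activations:

```python
import numpy as np

def content_cost(a_C, a_G):
    """J_content = 1/2 * sum of squared differences between the
    layer-l activations of C and G (shape: n_H x n_W x n_C)."""
    return 0.5 * np.sum((a_C - a_G) ** 2)

rng = np.random.default_rng(0)
a_C = rng.random((4, 4, 3))  # stand-in layer-l activations for C
a_G = rng.random((4, 4, 3))  # stand-in layer-l activations for G

print(content_cost(a_C, a_G))   # positive for differing content
print(content_cost(a_C, a_C))   # identical content -> cost 0.0
```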
Say we use hidden layer l's activation to measure style. But what exactly is style?
We define style as the correlation between activations across channels.
Let $a^{[l]}_{i,j,k}$ be the activation at position (i, j) in channel k of layer l. For each image (S and G) we compute a style matrix $G^{[l]}$ whose elements denote the correlations of activations across channels:

$$G^{[l]}_{kk'} = \sum_{i=1}^{n_H^{[l]}} \sum_{j=1}^{n_W^{[l]}} a^{[l]}_{i,j,k} \, a^{[l]}_{i,j,k'}, \qquad k, k' = 1, 2, \ldots, n_C^{[l]}$$
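This double sum is just a Gram matrix: flatten the spatial dimensions and multiply the result by its own transpose. A sketch with a stand-in activation volume:

```python
import numpy as np

def style_matrix(a):
    """Gram matrix of an activation volume: entry (k, k') is the sum
    over all spatial positions (i, j) of a[i, j, k] * a[i, j, k']."""
    n_H, n_W, n_C = a.shape
    flat = a.reshape(n_H * n_W, n_C)  # one row per spatial position
    return flat.T @ flat              # shape (n_C, n_C)

rng = np.random.default_rng(0)
a = rng.random((4, 4, 3))  # stand-in layer-l activations
G_mat = style_matrix(a)

# Spot-check one entry against the summation definition:
assert np.isclose(G_mat[0, 2], np.sum(a[:, :, 0] * a[:, :, 2]))
```

Channels that tend to activate together at the same spatial positions get large entries, which is exactly the "co-occurrence of patterns" notion of style.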
Now, we will have two style matrices, $G^{[l](S)}$ and $G^{[l](G)}$. The style cost function for layer l is given by:

$$J^{[l]}_{\text{style}}(S, G) = \frac{1}{\left(2\, n_H^{[l]} n_W^{[l]} n_C^{[l]}\right)^2} \sum_{k} \sum_{k'} \left( G^{[l](S)}_{kk'} - G^{[l](G)}_{kk'} \right)^2$$
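Putting the Gram matrices and the normalization constant together, the per-layer style cost can be sketched as follows (again with random arrays standing in for real CNN activations):

```python
import numpy as np

def layer_style_cost(a_S, a_G):
    """J_style^[l]: squared Frobenius distance between the two Gram
    matrices, scaled by 1 / (2 * n_H * n_W * n_C)^2."""
    n_H, n_W, n_C = a_S.shape
    gram = lambda a: a.reshape(-1, n_C).T @ a.reshape(-1, n_C)
    norm = (2 * n_H * n_W * n_C) ** 2
    return np.sum((gram(a_S) - gram(a_G)) ** 2) / norm

rng = np.random.default_rng(1)
a_S = rng.random((4, 4, 3))  # stand-in layer-l activations for S
a_G = rng.random((4, 4, 3))  # stand-in layer-l activations for G

print(layer_style_cost(a_S, a_G))   # positive for differing styles
print(layer_style_cost(a_S, a_S))   # identical style -> cost 0.0
```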
The overall style cost function sums the per-layer costs:

$$J_{\text{style}}(S, G) = \sum_{l} \lambda^{[l]} \, J^{[l]}_{\text{style}}(S, G)$$

where $\lambda^{[l]}$ is a hyperparameter weighting layer l.
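The weighted sum itself is trivial; the numbers below are made-up per-layer costs and weights $\lambda^{[l]}$, purely for illustration:

```python
def total_style_cost(layer_costs, lambdas):
    """J_style = sum over layers l of lambda^[l] * J_style^[l]."""
    return sum(lam * c for lam, c in zip(lambdas, layer_costs))

# Hypothetical per-layer style costs and weights lambda^[l]:
costs = [0.8, 0.5, 0.2]
lambdas = [0.5, 0.3, 0.2]
print(total_style_cost(costs, lambdas))  # ≈ 0.59
```

Spreading the weights over several layers lets the transferred style reflect both low-level texture (shallow layers) and larger-scale patterns (deeper layers).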