Neural Style Transfer (NST) renders the content of one image in the artistic style of another using Convolutional Neural Networks. The core idea proposed by Gatys et al. became very popular, and later work by Johnson et al. overcame a significant limitation, achieving style transfer in real time.
NST combines the style of a Style Image with the content of a Content Image, as shown below:
We propose Assorted NST, which combines the styles of three Style Images with the content of a Content Image. Below are a few examples:
We not only combine the three styles but also control how much weight each style receives. The output above was generated with weights [0.3, 0.3, 0.4]. The weights [0.1, 0.1, 0.8] (giving style 3 more influence) produce the following output:
The approach is straightforward. Instead of giving the model a single style image as input, we take a weighted combination of the three style images and feed that to the model. Before taking the weighted combination, we resize the style images to the same dimensions.
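The blending step can be sketched as follows. This is a minimal NumPy sketch, not the original implementation; `blend_styles` is a hypothetical helper name, and it assumes the style images have already been resized to a common shape:

```python
import numpy as np

def blend_styles(style_images, weights):
    """Pixel-wise weighted combination of style images.

    Each image must already be resized to the same (H, W, 3) shape.
    Weights are normalized to sum to 1, mirroring e.g. [0.3, 0.3, 0.4].
    """
    weights = np.asarray(weights, dtype=np.float32)
    weights = weights / weights.sum()
    # Stack into shape (n_styles, H, W, 3), then contract over the style axis.
    stack = np.stack([np.asarray(img, dtype=np.float32) for img in style_images])
    return np.tensordot(weights, stack, axes=1)
```

The blended array can then be passed to the style-transfer model in place of a single style image.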
In this way, the model extracts the style of the blended image and uses it to generate the final output.
- For each set of content and style images, the weight values must be fine-tuned for the output to look good; no fixed set of weights works for all images. If the weights are not chosen well, the outcome can be invalid, as below:
- In the manual implementation, generating the output takes about 8 seconds per iteration, and at least ten iterations are needed for a valid output. This can be reduced further by using an end-to-end CNN model built explicitly for NST, as introduced by Johnson et al. (and used in the TFHub implementation).