Adversarially Robust Neural Style Transfer



Reiichiro Nakano


Aug. 6, 2019



This article is part of a discussion of the Ilyas et al. paper “Adversarial examples are not bugs, they are features”. You can learn more in the main discussion article.


A figure in Ilyas et al. that struck me as particularly interesting was the following graph, which shows a correlation between adversarial transferability across architectures and their tendency to learn similar non-robust features.

Adversarial transferability vs. test accuracy of different architectures trained on ResNet-50’s non-robust features.

One way to interpret this graph is that it shows how well a particular architecture is able to capture non-robust features in an image. Since the non-robust features here are defined as those that ResNet-50 captures, $NRF_{resnet}$, what this graph really shows is how well an architecture captures $NRF_{resnet}$.

Notice how far back VGG is compared to the other models.

In the unrelated field of neural style transfer, VGG-based neural networks are also quite special, since non-VGG architectures are known not to work very well without some sort of parameterization trick (this phenomenon is discussed at length in this Reddit thread). The above interpretation of the graph provides an alternative explanation for this phenomenon. Since VGG is unable to capture non-robust features as well as other architectures, the outputs for style transfer actually look more correct to humans! To follow this argument, note that the perceptual losses used in neural style transfer depend on matching features learned by a separately trained image classifier. If these learned features don’t make sense to humans (non-robust features), the outputs for neural style transfer won’t make sense either.
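To make the dependence on the classifier's features concrete, here is a minimal NumPy sketch of the two perceptual losses in the style of Gatys et al. The random arrays stand in for activations of a pretrained classifier, and the normalization of the Gram matrix is one common convention, not necessarily the one used in the experiments here.

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel correlations of a (C, H, W) feature map."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)  # normalization is an assumed convention

def style_loss(feats_out, feats_style):
    """Mean squared difference between Gram matrices, summed over layers."""
    return sum(np.mean((gram_matrix(a) - gram_matrix(b)) ** 2)
               for a, b in zip(feats_out, feats_style))

def content_loss(feat_out, feat_content):
    """Mean squared difference between raw activations of one layer."""
    return np.mean((feat_out - feat_content) ** 2)

# Stand-in activations; a real run would read these from a pretrained network.
rng = np.random.default_rng(0)
style_feats = [rng.normal(size=(8, 16, 16)), rng.normal(size=(16, 8, 8))]
output_feats = [rng.normal(size=(8, 16, 16)), rng.normal(size=(16, 8, 8))]

print(style_loss(output_feats, style_feats))  # positive: feature statistics differ
print(style_loss(style_feats, style_feats))   # 0.0: identical features
```

Both losses are functions of the classifier's activations only, so whatever features the classifier has learned, robust or not, are exactly what the optimization tries to match.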

Before proceeding, let’s quickly discuss the results obtained by Mordvintsev et al. in Differentiable Image Parameterizations, where they show that non-VGG architectures can work for style transfer by using a simple technique previously established in feature visualization. In their experiment, instead of optimizing the output image in RGB space, they optimize it in Fourier space and run the image through a series of transformations (e.g., jitter, rotation, scaling) before passing it through the neural network.

Can we reconcile this result with our hypothesis linking neural style transfer and non-robust features?

One possible theory is that all of these image transformations weaken or even destroy non-robust features. Since the optimization can no longer reliably manipulate non-robust features to bring down the loss, it is forced to use robust features instead, which are presumably more resistant to the applied image transformations (a rotated and jittered flappy ear still looks like a flappy ear).
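As a rough illustration of the parameterization discussed above, the sketch below stores the image as Fourier coefficients and applies a random jitter before the image would be handed to a network. It is a simplified NumPy stand-in, not the implementation of Mordvintsev et al.: the image size, the shift range, and the function names are assumptions, and it omits details such as frequency scaling, color decorrelation, rotation, and rescaling.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W = 32, 32  # assumed image size

# The optimized parameters are Fourier coefficients, not pixels.
shape = (3, H, W // 2 + 1)
spectrum = rng.normal(size=shape) + 1j * rng.normal(size=shape)

def to_image(spectrum):
    """Decode Fourier coefficients into a real-valued (3, H, W) image."""
    return np.fft.irfft2(spectrum, s=(H, W))

def random_jitter(image, max_shift=4):
    """Shift the image by a random offset, wrapping around at the borders."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(image, shift=(dy, dx), axis=(1, 2))

image = to_image(spectrum)
network_input = random_jitter(image)  # a fresh transform is sampled at every step
print(image.shape, network_input.shape)
```

Because a different shift is drawn at every optimization step, a perturbation that only lowers the loss at one exact pixel alignment stops being useful, which is the intuition behind the theory above.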

A quick experiment

Testing our hypothesis is fairly straightforward: Use an adversarially robust classifier for neural style transfer and see what happens.

I compared a regularly trained (non-robust) ResNet-50 with a robustly trained ResNet-50 from Engstrom et al. on their performance on neural style transfer. For comparison, I ran the same algorithm with a regular VGG-19.

To ensure a fair comparison despite the different networks having different optimal hyperparameters, I performed a small grid search for each image and manually picked the best output per network. Further details can be observed in the accompanying Colaboratory notebook. In brief, L-BFGS was used for optimization, as it converged faster than Adam. For ResNet-50, the style layers used were the ReLU outputs after each of the 4 residual blocks, $[relu2\_x, relu3\_x, relu4\_x, relu5\_x]$, while the content layer used was $relu4\_x$. For VGG-19, style layers $[relu1\_1, relu2\_1, relu3\_1, relu4\_1, relu5\_1]$ were used with a content layer $relu4\_2$. In VGG-19, max pooling layers were replaced with avg pooling layers, as stated in Gatys et al.
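The pooling swap mentioned above is small but easy to get wrong. The following toy NumPy function (a hypothetical helper, not code from the notebook) shows the difference between the two pooling modes on a single 2x4 feature map.

```python
import numpy as np

def pool2x2(x, mode="avg"):
    """2x2 pooling with stride 2 over a (C, H, W) array with even H and W."""
    c, h, w = x.shape
    blocks = x.reshape(c, h // 2, 2, w // 2, 2)
    if mode == "avg":
        return blocks.mean(axis=(2, 4))
    return blocks.max(axis=(2, 4))

x = np.array([[[1.0, 2.0, 0.0, 0.0],
               [3.0, 4.0, 0.0, 8.0]]])
print(pool2x2(x, "max"))  # keeps only the largest value per block: 4 and 8
print(pool2x2(x, "avg"))  # blends all four values per block: 2.5 and 2
```

With average pooling every pixel in a block influences the output, so gradients reach all of them during image optimization rather than only the maximal one.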

The results of this experiment can be explored in the diagram below.

Interactive diagram: content image, style image, and the resulting style transfer outputs.

Success! The robust ResNet shows drastic improvement over the regular ResNet. Remember, all we did was switch the ResNet’s weights; the rest of the code for performing style transfer is exactly the same!

A more interesting comparison can be done between VGG-19 and the robust ResNet. At first glance, the robust ResNet’s outputs seem on par with VGG-19. Looking closer, however, the ResNet’s outputs seem slightly noisier and exhibit some artifacts (this is more obvious when the output image is initialized not with the content image, but with Gaussian noise).

Texture synthesized with VGG: mild artifacts.
Texture synthesized with robust ResNet: severe artifacts.
A comparison of artifacts between textures synthesized by VGG and ResNet. This diagram was repurposed from Deconvolution and Checkerboard Artifacts by Odena et al.

It is currently unclear exactly what causes these artifacts. One theory is that they are checkerboard artifacts caused by a kernel size that is not divisible by the stride in the convolution layers. They could also be artifacts caused by the presence of max pooling layers in ResNet. An interesting implication is that these artifacts, while problematic, seem orthogonal to the problem that adversarial robustness solves in neural style transfer.
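The first theory can be illustrated with a one-dimensional counting argument along the lines of Odena et al. The sketch below is a toy with assumed sizes, not an analysis of ResNet itself: it counts how many convolution windows touch each input position, which is also how many terms contribute to the gradient at that position.

```python
import numpy as np

def coverage(length, kernel, stride):
    """How many convolution windows touch each input position (1-D, no padding)."""
    counts = np.zeros(length, dtype=int)
    for start in range(0, length - kernel + 1, stride):
        counts[start:start + kernel] += 1
    return counts

print(coverage(11, kernel=3, stride=2))  # [1 1 2 1 2 1 2 1 2 1 1]
print(coverage(12, kernel=4, stride=2))  # [1 1 2 2 2 2 2 2 2 2 1 1]
```

When the kernel size is not a multiple of the stride, the interior alternates between one and two contributions, a periodic pattern that can show up as a checkerboard in an image optimized through such a layer. When the kernel size is divisible by the stride, the interior coverage is uniform.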

VGG remains a mystery

Although this experiment started because of an observation about a special characteristic of VGG nets, it did not provide an explanation for this phenomenon. Indeed, if we are to accept the theory that adversarial robustness is the reason VGG works out of the box with neural style transfer, surely we’d find some indication in existing literature that VGG is naturally more robust than other architectures.

A few papers indeed show that VGG architectures are slightly more robust than ResNet. However, they also show that AlexNet, which is not known to work well for neural style transfer (as shown by Dávid Komorowicz in this blog post), is above VGG in terms of this “natural robustness”.

Perhaps adversarial robustness just happens to incidentally fix or cover up the true reason non-VGG architectures fail at style transfer (or other similar algorithms), i.e., adversarial robustness is a sufficient but unnecessary condition for good style transfer. In fact, neural style transfer is not the only pretrained classifier-based iterative image optimization technique that magically works better with adversarial robustness. Engstrom et al. show that feature visualization via activation maximization works on robust classifiers without enforcing any of the priors or regularization (e.g., image transformations and decorrelated parameterization) used by previous work. In a recent chat, Chris Olah pointed out that the aforementioned feature visualization techniques actually work well on VGG without these priors, just like style transfer! Whatever the reason, I believe that further examination of VGG is a very interesting direction for future work.

To cite Ilyas et al.’s response, please cite their collection of responses.

Response Summary: Very interesting results, highlighting the effect of non-robust features and the utility of robust models for downstream tasks. We’re excited to see what kind of impact robustly trained models will have in neural network art! We were also really intrigued by the mysteriousness of VGG in the context of style transfer. As such, we took a deeper dive, which found some interesting links between robustness and style transfer that suggest that perhaps robustness does indeed play a role here.

Response: These experiments are really cool! It is interesting that preventing the reliance of a model on non-robust features improves performance on style transfer, even without an explicit task-related objective (i.e. we didn’t train the networks to be better for style transfer).

We also found the discussion of VGG as a “mysterious network” really interesting — it would be valuable to understand what factors drive style transfer performance more generally. Though not a complete answer, we made a couple of observations while investigating further:

Style transfer does work with AlexNet: One wrinkle in the idea that robustness is the “secret ingredient” to style transfer could be that VGG is not the most naturally robust network — AlexNet is. However, based on our own testing, style transfer does seem to work with AlexNet out-of-the-box, as long as we use a few early layers in the network (in a similar manner to VGG):

Style transfer using AlexNet, using conv_1 through conv_4.

Observe that even though style transfer still works, there are checkerboard patterns emerging — this seems to be a similar phenomenon to the one noticed in the comment in the context of robust models. This might be another indication that these two phenomena (checkerboard patterns and style transfer working) are not as intertwined as previously thought.

From prediction robustness to layer robustness: Another potential wrinkle here is that both AlexNet and VGG are not that much more robust than ResNets (for which style transfer completely fails), and yet seem to have dramatically better performance. To try to explain this, recall that style transfer is implemented as a minimization of a combined objective consisting of a style loss and a content loss. We found, however, that the network we use to compute the style loss is far more important than the one for the content loss. The following demo illustrates this — we can actually use a non-robust ResNet for the content loss and everything works just fine:

Style transfer seems to be rather invariant to the choice of content network used, and very sensitive to the style network used.

Therefore, from now on, we use a fixed ResNet-50 for the content loss as a control, and only worry about the style loss.
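The separation of the two loss terms can be sketched as follows. This is an illustrative NumPy toy, not the code behind the demo: the "networks" are hypothetical random 1x1 convolutions, and the loss weights are arbitrary placeholder values.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_extractor(channels):
    """Hypothetical stand-in for a network layer: random 1x1 convolution + ReLU."""
    weights = rng.normal(size=(channels, 3))
    return lambda img: np.maximum(np.einsum("oc,chw->ohw", weights, img), 0.0)

def gram(feat):
    """Gram matrix of a (C, H, W) feature map."""
    flat = feat.reshape(feat.shape[0], -1)
    return flat @ flat.T / flat.shape[1]

def total_loss(img, content_img, style_img, content_net, style_net,
               content_weight=1.0, style_weight=1e3):  # placeholder weights
    """Weighted sum of a content term and a style term from separate networks."""
    content = np.mean((content_net(img) - content_net(content_img)) ** 2)
    style = np.mean((gram(style_net(img)) - gram(style_net(style_img))) ** 2)
    return content_weight * content + style_weight * style

content_img, style_img = rng.random((3, 8, 8)), rng.random((3, 8, 8))
content_net, style_net = make_extractor(4), make_extractor(6)  # two different "networks"

# Starting from the content image, only the style term is non-zero.
print(total_loss(content_img, content_img, style_img, content_net, style_net))
```

Since the two terms only interact through the shared image being optimized, the content network can be held fixed while the style network is swapped, which is the control used in the rest of the response.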

Now, note that the way that style loss works is by using the first few layers of the relevant network. Thus, perhaps it is not about the robustness of VGG’s predictions, but instead about the robustness of the layers that we actually use for style transfer?

To test this hypothesis, we measure the robustness of a layer $f$ as:

$$R(f) = \frac{\mathbb{E}_{x_1 \sim D}\left[\max_{x'} \|f(x') - f(x_1)\|_2\right]}{\mathbb{E}_{x_1, x_2 \sim D}\left[\|f(x_1) - f(x_2)\|_2\right]}$$

Essentially, this quantity tells us how much we can change the output of that layer $f(x)$ within a small ball, normalized by how far apart representations are between images in general. We’ve plotted this value for the first few layers in a couple of different networks below:

The robustness $R(f)$ of the first four layers of VGG16, AlexNet, and robust/standard ResNet-50 trained on ImageNet.

Here, it becomes clear that the first few layers of VGG and AlexNet are actually almost as robust as the first few layers of the robust ResNet! This is perhaps a more convincing indication that robustness might have something to do with VGG’s success in style transfer after all.
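To make the definition of $R(f)$ concrete, consider a toy case where the layer is linear, so the inner maximization has a closed form. Everything in this sketch is an assumption for illustration: the weight matrix `W`, the Gaussian stand-in data, and the ball radius `eps`. For a real network layer the maximum would have to be approximated by an iterative search, and the procedure used for the plot is not specified here.

```python
import numpy as np

rng = np.random.default_rng(0)

W = rng.normal(size=(16, 32))      # hypothetical linear "layer" f(x) = W x
data = rng.normal(size=(200, 32))  # stand-in samples from the data distribution D
eps = 0.5                          # assumed L2 radius of the "small ball"

def f(x):
    """Apply the toy layer to a batch of inputs with shape (N, 32)."""
    return x @ W.T

# Numerator: for a linear f, the largest output change within the ball is
# eps times the largest singular value of W, independent of x_1.
numerator = eps * np.linalg.norm(W, ord=2)

# Denominator: average feature distance between random pairs of inputs.
x1, x2 = data[:100], data[100:]
denominator = np.mean(np.linalg.norm(f(x1) - f(x2), axis=1))

R = numerator / denominator
print(f"R(f) = {R:.3f}")
```

Under this definition, a smaller value means the layer's output is harder to move with a small input perturbation, relative to the typical distance between the representations of two different images.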

Finally, suppose we restrict style transfer to only use a single layer of the network when computing the style loss. (Usually style transfer uses several layers in the loss function to get the most visually appealing results; here we’re only interested in whether or not style transfer works, i.e., actually confers some style onto the image.) Again, the more robust layers seem to indeed work better for style transfer! Since all of the layers in the robust ResNet are robust, style transfer yields non-trivial results even using the last layer alone. Conversely, VGG and AlexNet seem to excel in the earlier layers (where they are non-trivially robust) but fail when using exclusively later (non-robust) layers:

Style transfer using a single layer. The names of the layers and their robustness $R(f)$ are printed below each style transfer result. We find that for both networks, the robust layers seem to work (for the robust ResNet, every layer is robust).

Of course, there is much more work to be done here, but we are excited to see further work into understanding the role of both robustness and VGG in network-based image manipulation.

You can find more responses in the main discussion article.


The experiment in this article was built on top of Engstrom et al.’s open-sourced code and model weights. Chris Olah pointed out that feature visualization works well on VGG without priors or regularization. Andrew Ilyas pointed out literature that showed VGG networks were slightly more robust than ResNet.

The diagram comparing artifacts was repurposed from Odena et al.’s Deconvolution and Checkerboard Artifacts.

All experiments were performed on Google Colaboratory.


  1. Adversarial examples are not bugs, they are features
    Ilyas, A., Santurkar, S., Tsipras, D., Engstrom, L., Tran, B. and Madry, A., 2019. arXiv preprint arXiv:1905.02175.
  2. Very deep convolutional networks for large-scale image recognition
    Simonyan, K. and Zisserman, A., 2014. arXiv preprint arXiv:1409.1556.
  3. A Neural Algorithm of Artistic Style
    Gatys, L.A., Ecker, A.S. and Bethge, M., 2015. arXiv preprint arXiv:1508.06576.
  4. Differentiable Image Parameterizations
    Mordvintsev, A., Pezzotti, N., Schubert, L. and Olah, C., 2018. Distill. DOI: 10.23915/distill.00012
  5. Feature Visualization
    Olah, C., Mordvintsev, A. and Schubert, L., 2017. Distill. DOI: 10.23915/distill.00007
  6. Learning Perceptually-Aligned Representations via Adversarial Robustness
    Engstrom, L., Ilyas, A., Santurkar, S., Tsipras, D., Tran, B. and Madry, A., 2019. arXiv preprint arXiv:1906.00945.
  7. On the limited memory BFGS method for large scale optimization
    Liu, D.C. and Nocedal, J., 1989. Mathematical Programming, Vol 45(1-3), pp. 503-528. Springer.
  8. Deconvolution and checkerboard artifacts
    Odena, A., Dumoulin, V. and Olah, C., 2016. Distill, Vol 1(10), pp. e3. DOI: 10.23915/distill.00003
  9. Geodesics of learned representations
    Henaff, O.J. and Simoncelli, E.P., 2016. arXiv preprint arXiv:1511.06394.
  10. Batch Normalization is a Cause of Adversarial Vulnerability
    Galloway, A., Golubeva, A., Tanay, T., Moussa, M. and Taylor, G.W., 2019. arXiv preprint arXiv:1905.02161.
  11. Benchmarking Neural Network Robustness to Common Corruptions and Perturbations
    Hendrycks, D. and Dietterich, T.G., 2019. arXiv preprint arXiv:1903.12261.
  12. Is Robustness the Cost of Accuracy? - A Comprehensive Study on the Robustness of 18 Deep Image Classification Models
    Su, D., Zhang, H., Chen, H., Yi, J., Chen, P. and Gao, Y., 2018. arXiv preprint arXiv:1808.01688.
  13. ImageNet Classification with Deep Convolutional Neural Networks
    Krizhevsky, A., Sutskever, I. and Hinton, G.E., 2012. Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1, pp. 1097-1105.
  14. Neural Style transfer with Deep Learning
    Komorowicz, D., 2016.
  15. The Building Blocks of Interpretability
    Olah, C., Satyanarayan, A., Johnson, I., Carter, S., Schubert, L., Ye, K. and Mordvintsev, A., 2018. Distill. DOI: 10.23915/distill.00010

Updates and Corrections

If you see mistakes or want to suggest changes, please create an issue on GitHub.


Diagrams and text are licensed under Creative Commons Attribution CC-BY 4.0 with the source available on GitHub, unless noted otherwise. The figures that have been reused from other sources don’t fall under this license and can be recognized by a note in their caption: “Figure from …”.


For attribution in academic contexts, please cite this work as

Nakano, "A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Adversarially Robust Neural Style Transfer", Distill, 2019.

BibTeX citation

  author = {Nakano, Reiichiro},
  title = {A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Adversarially Robust Neural Style Transfer},
  journal = {Distill},
  year = {2019},
  note = {},
  doi = {10.23915/distill.00019.4}