ROUNDTABLE DISCUSSION: Swapping Autoencoder for Deep Image Manipulation
A fundamental challenge of image manipulation problems is separating “texture” from “structure”. We explore the purely unsupervised setting, where an unlabeled image collection is given, and structure and texture must both be discovered. While generative models have become increasingly effective at producing realistic images from such a collection, adapting such models for controllable manipulation of existing images remains challenging. We propose the Swapping Autoencoder, which is designed specifically for image manipulation, rather than random sampling. The key idea is to encode an image with two independent components and enforce that any swapped combination maps to a realistic image. In particular, we enforce one component to encode co-occurrent patch statistics across different parts of an image, corresponding to its “texture”. Such a disentangled representation allows us to flexibly manipulate real images in various ways, including texture swapping, local and global editing, and latent code vector arithmetic.
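The key idea above — encode an image into a spatial "structure" code and a spatially-invariant "texture" code, then decode any swapped combination — can be illustrated with a toy NumPy sketch. Everything here (the 8×8 patch size, mean/std as the "texture" statistics, the upsampling decoder) is an illustrative assumption, not the paper's actual learned architecture:

```python
import numpy as np

def encode(image):
    """Toy encoder splitting an image into two independent codes.
    Assumption: structure = coarse spatial layout (block means),
    texture = patch statistics pooled over all positions, so the
    texture code is the same regardless of where patches occur."""
    h, w = image.shape
    blocks = image.reshape(h // 8, 8, w // 8, 8)
    structure = blocks.mean(axis=(1, 3))          # 8x8 spatial map
    texture = np.array([blocks.mean(), blocks.std()])  # global stats
    return structure, texture

def decode(structure, texture):
    """Toy decoder: upsample the structure map, then re-impose the
    texture statistics (mean/std) on the result."""
    up = np.kron(structure, np.ones((8, 8)))      # back to 64x64
    up = (up - up.mean()) / (up.std() + 1e-8)     # whiten layout
    return up * texture[1] + texture[0]           # apply texture stats

# Swapping: structure of image A combined with texture of image B.
rng = np.random.default_rng(0)
img_a = rng.normal(0.0, 1.0, (64, 64))
img_b = rng.normal(5.0, 3.0, (64, 64))
s_a, _ = encode(img_a)
_, t_b = encode(img_b)
hybrid = decode(s_a, t_b)
# The hybrid keeps A's spatial layout but carries B's statistics.
```

In the actual method both codes are learned by deep networks and a discriminator (rather than a fixed mean/std transfer) enforces that every swapped combination looks realistic; this sketch only shows the factorize-then-swap structure of the representation.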
Richard Zhang is a Research Scientist at Adobe Research, with interests in computer vision, deep learning, machine learning, and graphics. He obtained his PhD in EECS, advised by Professor Alexei A. Efros, at UC Berkeley in 2018. He graduated summa cum laude with BS and MEng degrees from Cornell University in ECE. He is a recipient of the 2017 Adobe Research Fellowship. More information can be found on his webpage: http://richzhang.github.io/.