
A tool for Collaborating over GAN’s latent space

source link: https://towardsdatascience.com/a-tool-for-collaborating-over-gans-latent-space-b7ea92ad63d8?gi=55773b75d164

In January 2020 we finalized the development phase of Marrow. Shirin Anlen and I are sharing lessons learned during this process, and our post about optimizing and augmenting a small dataset was recently published on Towards Data Science. This post looks at how custom web-based tools can inspire a collaborative artistic workflow when working with machine learning models.

Apr 29 · 8 min read


Shadow animation from GAN’s latent space using the web explorer tool

Myself and Marrow

Marrow is a hands-on research project and an interactive theater experience by Shirin Anlen that explores the possibilities of mental disorders in machine learning. I have previously worked with Shirin on a number of projects, most notably the VR documentary Tzina: Symphony of Longing. In 2018 I joined Shirin to preview Marrow as an installation at IDFA Doclab 2018. The prototype was a success, and one year later we entered, as collaborators, an intensive development phase co-produced by the National Film Board of Canada and Atlas V.

About GAN and its latent space

The Generative Adversarial Network, or GAN, was the first machine learning model we decided to research. It focuses on generative visual imagery and exhibits a very clear dissonance if you attempt to train it on complex concepts using banal stock images. In a previous post we described how we created a dataset of ‘Perfect family dinner’ images and used it to train StyleGAN V1. This particular dataset was constructed to serve the story of the experience: that of a dysfunctional family that sees itself only through the distorted data it was trained on. Because of this, we aimed for results that are imperfect and represent the glitches that emerge when the model tries to go deep into social narratives.

Our dataset was a bundle of around 6,500 images containing figures of four family members, stripped away from their family dinner setting. Once StyleGAN finished the training process, we ended up with a vast space of possibilities for newly generated images containing four distorted familial figures. This infinite, continuous space of possible output images is called the latent space. It is “latent” because the output image generated by GAN is determined by a seemingly hidden process of mathematical transformations, starting from a series of numbers and ending with a bitmap image. When you change any of the initial numbers in the series, the resulting image is slightly different. The transformation network is so deep that it is hard to predict what will change in the image.
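To make the "series of numbers in, bitmap out" idea concrete, here is a minimal sketch in Python with NumPy. It is not StyleGAN and not the code used for Marrow: the toy generator, its random weights, the 16x16 resolution and the choice of which number to nudge are all assumptions made purely for illustration. Only the 512-number latent vector matches StyleGAN's default size.

```python
import numpy as np

LATENT_DIM = 512  # StyleGAN's default latent size
IMAGE_SIZE = 16   # toy resolution, far smaller than a real GAN output


def make_toy_generator(seed=0):
    """Build a fixed stack of random layers standing in for a trained generator."""
    rng = np.random.default_rng(seed)
    w1 = rng.standard_normal((LATENT_DIM, 256)) / np.sqrt(LATENT_DIM)
    w2 = rng.standard_normal((256, 256)) / np.sqrt(256)
    w3 = rng.standard_normal((256, IMAGE_SIZE * IMAGE_SIZE)) / np.sqrt(256)

    def generate(z):
        # A chain of transformations: latent numbers -> hidden layers -> pixels.
        h = np.tanh(z @ w1)
        h = np.tanh(h @ w2)
        pixels = 1.0 / (1.0 + np.exp(-(h @ w3)))  # squash into [0, 1]
        return pixels.reshape(IMAGE_SIZE, IMAGE_SIZE)

    return generate


generate = make_toy_generator()
rng = np.random.default_rng(42)

z = rng.standard_normal(LATENT_DIM)  # one point in the latent space
image_a = generate(z)

z_nudged = z.copy()
z_nudged[7] += 0.5                   # change a single number in the series
image_b = generate(z_nudged)

changed = np.abs(image_a - image_b)
print("mean pixel change:", changed.mean())
print("pixels that moved:", int((changed > 0).sum()), "of", changed.size)
```

Even in this toy version, nudging one number shifts pixels all over the image by a small amount, which hints at why it is hard to say in advance what a given number controls.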


An animation of latent space transitions
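A transition like this can be approximated by walking in a straight line between two points in the latent space and rendering one image per step. The sketch below only builds the path of latent vectors. The function name, the frame count and the use of plain linear interpolation are assumptions for illustration; the article does not state which interpolation was used, and spherical interpolation is a common alternative.

```python
import numpy as np


def interpolate_latents(z_start, z_end, num_frames):
    """Return num_frames latent vectors on a straight line from z_start to z_end."""
    steps = np.linspace(0.0, 1.0, num_frames)
    return np.array([(1.0 - t) * z_start + t * z_end for t in steps])


rng = np.random.default_rng(0)
z_start = rng.standard_normal(512)  # hypothetical starting image
z_end = rng.standard_normal(512)    # hypothetical target image

frames = interpolate_latents(z_start, z_end, num_frames=30)
# Each row would be fed to the generator to render one animation frame.
print(frames.shape)  # (30, 512)
```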

If you have a good enough dataset and algorithm, you might be able to reach disentanglement: that is when one of the input numbers controls one meaningful element in the resulting image. For example, one number would change the age of one generated person, while another changes their hair color. Needless to say, we were not able to achieve disentanglement with our small dataset. A change in a single number from the initial series could induce various changes in multiple family members. The same number could simultaneously control one family member’s pose, another member’s smile, and the appearance of a Christmas hat on a third figure (a repeating motif in stock images, it seems). The family members were, in fact, entangled.
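One rough way to see entanglement is to move a single latent number and measure how much each part of the image changes. The sketch below uses a toy random generator, not the trained model, and splits the image into four vertical strips as hypothetical stand-ins for the four family members. The strip layout, the latent size and the helper name `region_changes` are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
LATENT_DIM, SIZE = 64, 16

# Toy generator: one random layer standing in for a trained network.
weights = rng.standard_normal((LATENT_DIM, SIZE * SIZE)) / np.sqrt(LATENT_DIM)


def generate(z):
    return np.tanh(z @ weights).reshape(SIZE, SIZE)


def region_changes(z, dim, delta=1.0):
    """Mean absolute pixel change in each of four vertical strips
    when the latent number at index `dim` is moved by `delta`."""
    nudged = z.copy()
    nudged[dim] += delta
    diff = np.abs(generate(nudged) - generate(z))
    strips = np.split(diff, 4, axis=1)  # four strips of 4 columns each
    return np.array([strip.mean() for strip in strips])


z = rng.standard_normal(LATENT_DIM)
changes = region_changes(z, dim=3)
print(changes)
```

In a disentangled model, one would hope to see a large value in a single strip and values near zero in the others. Here every strip moves, which is the toy equivalent of one number touching several family members at once.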

The Shadow Allegory

Marrow tracks the ‘thinking’ process of each of its models and questions what could go wrong. In GAN, the latent space gives us information about how input data is broken down and then reconstructed into something new. But as intriguing as visualizing the latent space is, we were looking for ways to integrate storytelling into the experience. We wanted to materialize GAN’s distorted image of the world.

While watching the ongoing training process of GAN, we started noticing things other than human figures emerging from the source dataset. It was like staring at Rorschach tests: flat images that appear different depending on who is watching. We realized that we were learning more about GAN not by seeing the result we expected, but by seeing its in-between spaces. Plato’s Allegory of the Cave speaks about finding meaning in the simple and flattened representation of things. The people in the allegory are stuck in a cave with a fire burning outside. The fire projects the shadows of passing objects on the cave’s walls, and that is all they can see of reality. They are so used to those shadows that once a prisoner breaks free, their eyes are burned by the glaring sun. When the prisoner’s eyes are finally accustomed to reality, they come back to the cave to tell the others, but now they are unable to see anything in the darkness. The other prisoners assume that something evil lies outside.

Interestingly, Plato’s Allegory of the Cave corresponds quite well with the structure and training process of GAN. GAN is in constant conflict between reality, representations of reality, and fantasy. When the algorithm generates images that are too close to the original dataset, it finds itself stuck in a simple and flat representation of the world, unable to escape to pathways of creativity. When GAN’s generations are too fantastical, they are inevitably deemed fake and wrong. GAN is in a constant struggle to find the balance between the real and the imaginary. Therefore, we decided to visualize GAN’s struggle by using a shadow representation of the distorted family outputs.


Transitions in full color vs. in shadow mode
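The article does not describe how the shadow mode was produced. One simple way to get a comparable effect, assuming figures on a light, uniform background, is to threshold the brightness of each pixel and paint everything darker than the threshold black. The function name `to_shadow`, the threshold value and the tiny test image below are all hypothetical.

```python
import numpy as np


def to_shadow(rgb, threshold=0.9):
    """Turn an RGB image (floats in [0, 1], light background assumed) into a
    silhouette: pixels darker than `threshold` become 0.0 (shadow), the rest 1.0."""
    # Standard luma weights for converting RGB to perceived brightness.
    luminance = rgb @ np.array([0.299, 0.587, 0.114])
    return np.where(luminance < threshold, 0.0, 1.0)


# Hypothetical 8x8 image: white background with a small colored "figure".
image = np.ones((8, 8, 3))
image[2:6, 3:5] = [0.6, 0.3, 0.2]

shadow = to_shadow(image)
print(shadow)
```

Applied frame by frame to a latent-space transition, a filter like this keeps the outline and movement of the figures while discarding color and texture.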

Animating over the latent space

Marrow is an interactive theater piece where the participants play the role of machine learning models in a family dinner setting. In the experience, a participant who represents GAN tells their story about the difficulties they face in discerning memory from imagination. Both of those perceptions are in fact distorted in GAN, so at this phase we decided to explore an additional, fantastical animated layer over the world of shadows that would represent the character’s struggle between the real and the fake. We worked with the talented Paloma Dawkins, a master of hand-drawn animations and alternate dimensions. Now we had to ask ourselves: how do we orchestrate a workflow that starts in the mathematical depths of GAN, but ends with hand-drawn animations that perfectly match GAN’s latent movements across the image space? The answer came in the form of our custom-developed tool: Marrow GAN Explorer.

