Mid (user) journey: the UX of AI-generated art

 1 year ago
source link: https://uxdesign.cc/mid-user-journey-the-ux-of-ai-generated-art-367b13c34f11

The first time user experience of Midjourney from a UX Researcher’s personal perspective

Midjourney is an AI program that generates art based on prompts users write. It’s a godsend for folks who are creative but have no talent for drawing. Some people in life opt to be UX designers, others UX researchers… For me, I never saw myself as a designer, mostly because of my stick-figure-like pictures, where I draw the sun in the corner of the page since it’s easier than making a perfect circle (the real ones know what I’m talking about).

However, in 2022, if I am inspired to make art of a night sky, instead of haphazardly painting on a canvas, or opening Illustrator and embarrassing myself, I can give the prompt “Magical gorgeous starry night, glowing water” to a bot in Discord, and it’ll make this:

High resolution picture/art piece of a starry night with glowing water and trees.

Prompt: Magical gorgeous starry night, glowing water (with version 4 and upscale beta)

In this case, it took seconds instead of hours to make art, and it doesn’t look like a kindergartner painted it! In addition to handling landscapes, with the 4th version of Midjourney, it has been able to create portraits with some good success:

A female mage with lightning coming out of the left side of her head. She has a blue mark on her forehead.

Doesn’t have the eyes symmetrically perfect, but it’s better than what I can do.

As I was going through Midjourney for the first time, being completely blown away, I decided to document my experience, which I’m going to share here. Instead of doing a mapping exercise, this user journey will be formatted as an article, and completely written out.

My Expectations

When my brother introduced me to Midjourney via a text message, I didn’t really know what to expect. He’s pretty technical, so I wasn’t sure of the learning curve. From my perspective, hearing “AI-generated art” sounded daunting. Despite this, he exclaimed it was really easy to get started, and that there is a free trial. He asked me to come up with a piece of art that was random in order to prove its merit. I told him that I wanted a “Giant medieval battle with orangutans versus flying walruses in Disneyland with an asteroid in the backdrop”:

A giant creature that almost resembles an orangutan in Disney World. There appears to be a crowd of walruses, although it’s hard to discern.

Almost sort of looks like a giant orangutan… hmmm not what I was thinking, but nice try!

The art he showed me seemed cooler than anything I could draw, and although it didn’t look close to what I imagined, it was intriguing.

My Goals

The goals for my first time on the platform were roughly like the following, in this order:

  1. Figure out where to navigate to in order to use Midjourney.
  2. Once I’m on the platform, determine which actions I need to take in order to generate art.
  3. “Pilot” some test art to see what the limitations are of the program.

To sum it up, I wanted to create some basic art, but wasn’t sure where to get started, or what the system’s constraints were.

The Two Tasks — Pain points — Points of Ease

Task 1 [Accept Discord Invite and Join Channel]: Luckily for me, I already had a Discord account. For all the people reading this that don’t know, Discord is a voice chat (VOIP) and messaging platform that also allows users to create channels. Discord has a lot of integrations and bots which can help facilitate a multitude of functions, or in this case, AI art generation. I went ahead and accepted the invite where I was brought to the Midjourney Discord server.

I had no idea where to begin, so I messaged a more experienced user, and he told me to join one of the “newbie” channels. Once I joined, I was amazed at what I saw: again and again, art was being rapidly made by myriad people in this message channel. The ideas varied from portraits of humanoids (we’re talking orcs, elves, etc.) to landscapes, and everything in between. You could tell some people’s political views based on the satirical art they would generate… spoiler alert: users were not the biggest fans of Trump.

Task 1 Pain Points: As soon as I accepted the invite, I was brought to a welcome page with a series of URLs/links that completely overwhelmed me. It seemed like a rabbit hole that I didn’t want to go down, and I was excited by the art that my brother had shown me, so I just wanted to get started.

A Discord channel with instructions as well as many different URLs to click on.

Five links to investigate, which I wasn’t patient enough to click through or read. Looking at it now in more detail, I can see there’s a visual guide, but at the time I opted to get assistance from someone with more expertise.

  • Without the help of a more experienced user, it might have taken me a bit to realize I needed to join the newbie channel since there are a lot of links to click through, and lines of text to parse.

Task 1 Points of Ease:

  • The onboarding process was very convenient in that I didn’t need to create a new user name and password on a separate site to get started, since I already had a Discord account.
  • Having the AI art generator as just another Discord server on my list of other servers actually minimized context switching and maximized consolidation. It’s always preferable to have a single platform that has similar navigational/interaction/experiential patterns, compared to multiple disparate platforms. There’s even research to show that there is a “toggle tax”, which results from drastic context switching.
  • The terms of service (TOS) are very simple and generous. Regular individuals and small companies are allowed to use Midjourney as long as it’s not for inflammatory purposes and the art doesn’t contain gore or nudity. Furthermore, a user can utilize it to make money (even in the trial version) unless they are a business that makes over 1 million dollars a year (in which case they need an enterprise license).

Task 2 [Create First Art]: I went ahead and created my first prompt by typing: “/imagine prompt: Epic surreal fantasy land meadow with orcs elves and wizard.” From there, a picture with 4 options was generated (like the above Disney World image with walruses). I could choose to make a variation of any of the 4 pictures, or upscale a picture to a higher resolution and add more detail. I went ahead and picked U3 to upscale the 3rd choice. In seconds, this was what got generated:

Picture of a giant meadow with a red castle in the background and a crowd of people.

Very cool looking, however not exactly what I had in mind. I don’t see any orcs, elves, wizards etc.
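
To make the interaction above concrete, here is a minimal sketch in Python. It is not Midjourney’s or Discord’s API; the function names are hypothetical, and the only things taken from this article are the “/imagine prompt:” format and the meaning of the U and V buttons (U = upscale, V = variation, numbered 1 to 4).

```python
import re


def imagine_command(prompt: str) -> str:
    """Format a prompt the way it is typed into Discord in this article."""
    return f"/imagine prompt: {prompt.strip()}"


def describe_button(label: str) -> str:
    """Explain a grid button such as 'U3' or 'V1' in plain words.

    Hypothetical helper: it only encodes the meaning described in the
    article (U = upscale, V = variation, images numbered 1 to 4).
    """
    match = re.fullmatch(r"([UV])([1-4])", label.strip().upper())
    if match is None:
        raise ValueError(f"Unknown button label: {label!r}")
    action = "upscale" if match.group(1) == "U" else "make a variation of"
    return f"{action} image {match.group(2)}"


if __name__ == "__main__":
    print(imagine_command("Epic surreal fantasy land meadow with orcs elves and wizard"))
    print("U3:", describe_button("U3"))
```

A plain-language hint like the one returned by describe_button is the kind of clue a first-time user would otherwise have to find in the documentation.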

Task 2 Pain Points:

  • The regular base version of Midjourney (as of 11/14/2022) can be completely off. As with my medieval orcs, elves, and wizard picture, it can simply miss the right details (granted, version 4 is better, but a new user wouldn’t know to specify that, and I didn’t my first time).
  • I wasn’t entirely sure what U1-U4 or V1-V4 meant at first, along with the other clickable options, some of which were more experimental/beta. I learned through trial and error as well as by consulting a more experienced user, but someone without another person to ask would have to read the documentation, rather than getting clues from the UI on what each option does:

U1-U4 = upscale image 1-4. V1-V4 = make a new variation of image 1-4.

  • It’s unclear when a prompt should be a full-on sentence with grammatical clauses vs. a series of words and commas. For example, “A cat running into an open meadowland with flowers and rainbows” vs. “A cat running, open meadowland, flowers, rainbows.” Furthermore, I have seen users do both in one prompt, writing a full-on sentence with clauses followed by words separated by commas.
  • Portraits of faces have a lot of details wrong, or are asymmetrical. I didn’t know at the time to use version 4, or the upscale beta process that handles faces, body parts, and finer details much more precisely. This is an example of the weaker base version for portraits:
Picture of a mage with fire where her arms should be. Her eyes are different shapes/colors from each other, and her nose has a blemish.

Missing arms, eyes and nose are off. Definitely cool but the normal version (not v4), doesn’t handle portraits well. You wouldn’t know about v4 unless you were a bit more seasoned or read the documentation/announcements.

  • It’s easy to lose track of where my art was when there were so many people in one channel generating art (think of a group message where people are spamming). It’s easy enough to scroll up, but it’s still a pain point. Luckily, since I bought the 10-dollar basic version after this FTUE (first-time user experience), I now have my own direct message chat with the Midjourney bot, so other people’s art won’t be in the chat.
  • As I was going through the trial, it would have been helpful to be notified at certain milestones of my usage (25%, 50%, 75%, etc.). There is a way to manually type “/info”, which could have given me that information (if I had remembered), but during my first time using the platform, I never knew the command.
  • Even with the latest version, it won’t get specific characters 100% correct. For example, when I made an image for “body builder Waluigi” (as a request to my friend), it got the face incorrect, as well as the mustache:
4 pictures of what is supposed to be a body builder version of Waluigi. The pictures don’t really look that much like him though.

Is definitely not Waluigi, but it definitely got the purple aspect correct, as well as some aspects of the nose and he’s definitely a body builder.
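
The usage-milestone idea from the pain points above can be sketched in a few lines of Python. This is only an illustration of the suggestion, not how Midjourney tracks usage: the function name is hypothetical, and the quota of 20 generations in the usage example is a made-up number.

```python
def crossed_milestones(used_before, used_after, quota,
                       milestones=(0.25, 0.5, 0.75, 1.0)):
    """Return the usage milestones passed by the latest generation(s).

    used_before / used_after: generations used before and after the action.
    quota: total generations in the trial (an assumed value, not a real one).
    """
    if quota <= 0:
        raise ValueError("quota must be positive")
    before = used_before / quota
    after = used_after / quota
    # A milestone is "crossed" if usage was below it and is now at or above it.
    return [m for m in milestones if before < m <= after]


if __name__ == "__main__":
    # Hypothetical trial of 20 generations: the 5th one hits the 25% mark.
    for milestone in crossed_milestones(4, 5, quota=20):
        print(f"You have used {milestone:.0%} of your trial.")
```

A bot could post a message like this automatically, so a new user would not need to know about “/info” to see how much of the trial is left.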

Task 2 Points of Ease: From a pure usability and learning curve perspective (learnability), it’s absolutely mind-blowing how much easier this is to use to create art than actually drawing. Granted, there’s some technical ability involved in understanding the limitations of the prompts, but it’s still simpler than making art the old-fashioned way. There are also the following good points:

  • Images are easy to download, and the pictures’ file names are the prompt that I gave it, so I can determine which prompt generated a particular piece of art.
  • Midjourney can make interpretations of abstract concepts. I tried a prompt involving elation and happiness, and this was generated:
A crowd of people in a colorful scene with balloons on top. Every one seems to be raising their hands in joy or elation.

This is the image I got for the prompt: elation, euphoria, happiness, glee

  • Although the group channel message feed can sometimes be distracting, and bury messages, it’s useful to learn while also gaining inspiration from other people much like the picture below (animal people I guess):
Midjourney’s Discord newbie channel. There are two images, one of a cat with a pinecone and the other of an astronaut lion, created by different trial users.

Is there a correlation between liking felines/cats and using Midjourney? Someone needs to scrape all the prompts and see what words people are using the most to make art.
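
As a rough idea of what that analysis could look like once the prompts were collected, here is a minimal Python sketch that counts the most common words. The scraping step is left out entirely, the function name and stop-word list are hypothetical, and two of the sample prompts are invented from the image descriptions above.

```python
import re
from collections import Counter

# Small, hand-picked stop-word list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "the", "of", "and", "with", "in", "on"}


def top_prompt_words(prompts, n=3):
    """Count how often each non-trivial word appears across prompts."""
    counts = Counter()
    for prompt in prompts:
        words = re.findall(r"[a-z']+", prompt.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)


if __name__ == "__main__":
    sample_prompts = [
        "cat with a pinecone",                                # invented sample
        "astronaut lion",                                     # invented sample
        "A cat running, open meadowland, flowers, rainbows",  # from this article
    ]
    print(top_prompt_words(sample_prompts, n=1))
```

With real data, a spike in words like “cat” or “lion” would be a first hint at whether the feline theory holds.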

Overall Thoughts

Midjourney and AI are the future when it comes to making art. It’s hard to imagine a world in 10 years where AI isn’t generating most art and then more granular changes are made by artists or even more specialized AI art software. Will AI fully replace artists? Maybe the technical side will be automated away; where drawing/painting/designing is meticulous, coming up with prompts and generating art is way faster. Creativity, however, will never be automated. The program still needs some sort of directive or prompt. Although I’m sure there’s an option to randomly generate pictures, the fun is melding two weird concepts together and seeing what gets created. For example, I wanted something that was beautiful, but also hellish. Scary but beautiful at the same time…

Prompt: hell and fire, horror, beautiful, gorgeous… I think Version 4 delivered

Here are some additional use cases that I think Midjourney can help fulfill today:

  1. Creating Dungeons and Dragons artifacts including landscapes, other environments for dungeon masters, and character portraits.
  2. Creating desktop wallpapers.
  3. Creating posters and other physical art with size specifications in the prompt.

I could also imagine AI-generated virtual worlds as another future use case if the technology expands into that realm. What is AI going to be capable of in the future? 10 years ago we didn’t have anything like this. It’ll be amazing to see the progress of the AI art journey.

