A.I. software called DALL-E turns your words into pictures

Key Points

The DALL-E Mini software from a group of open-source developers isn't perfect, but sometimes it does effectively come up with pictures that match people's text descriptions.

In scrolling through your social media feeds of late, there's a good chance you've noticed illustrations accompanied by captions. They're popular now.

The pictures you're seeing are likely made possible by a text-to-image program called DALL-E. Before posting the illustrations, people type in words, which artificial intelligence models then convert into images.

For example, a Twitter user posted a tweet with the text, "To be or not to be, rabbi holding avocado, marble sculpture." The attached picture, which is quite elegant, shows a marble statue of a bearded man in a robe and a bowler hat, grasping an avocado.

The AI models come from Google's Imagen software as well as OpenAI, a start-up backed by Microsoft that developed DALL-E 2. On its website, OpenAI calls DALL-E 2 "a new AI system that can create realistic images and art from a description in natural language."

But most of what's happening in this area is coming from a relatively small group of people sharing their pictures and, in some cases, generating high engagement. That's because Google and OpenAI have not made the technology broadly available to the public.

Many of OpenAI's early users are friends and relatives of employees. If you're seeking access, you have to join a waiting list and indicate if you're a professional artist, developer, academic researcher, journalist or online creator.

"We're working hard to accelerate access, but it's likely to take some time until we get to everyone; as of June 15 we have invited 10,217 people to try DALL-E," OpenAI's Joanne Jang wrote on a help page on the company's website.

One system that is publicly available is DALL-E Mini. It draws on open-source code from a loosely organized team of developers and is often overloaded with demand. Attempts to use it can be greeted with a dialog box that says "Too much traffic, please try again."

It's a bit reminiscent of Google's Gmail service, which lured people with unlimited email storage space in 2004. Early adopters could get in by invitation only at first, leaving millions to wait. Now Gmail is one of the most popular email services in the world.

Creating images out of text may never be as ubiquitous as email. But the technology is certainly having a moment, and part of its appeal is in the exclusivity.

Private research lab Midjourney requires people to fill out a form if they wish to experiment with its image-generation bot from a channel on the Discord chat app. Only a select group of people are using Imagen and posting pictures from it.

The text-to-picture services are sophisticated, identifying the most important parts of a user's prompts and then guessing the best way to illustrate those terms. Google trained its Imagen model with hundreds of its in-house AI chips on 460 million internal image-text pairs, in addition to outside data.

The interfaces are simple. There's generally a text box, a button to start the generation process and an area below to display images. To indicate the source, Google and OpenAI add watermarks in the bottom right corner of images from DALL-E 2 and Imagen.

The companies and groups building the software are justifiably concerned about having everyone storming the gates at once. Handling web requests to execute queries with these AI models can get expensive. More importantly, the models aren't perfect and don't always produce results that accurately represent the world.

Engineers trained the models on extensive collections of words and pictures from the web, including photos people posted on Flickr.
