source link: https://venturebeat.com/ai/apple-researchers-unveil-keyframer-an-ai-tool-that-animates-still-images-using-llms/

Apple researchers unveil ‘Keyframer’: An AI tool that animates still images using LLMs

Credit: VentureBeat made with Midjourney

Apple researchers have unveiled a new AI tool called “Keyframer,” which harnesses the power of large language models (LLMs) to animate static images through natural language prompts.

The tool, detailed in a new research paper published on arxiv.org, marks a notable step in integrating artificial intelligence into the creative process, and it may also hint at what’s to come in newer generations of Apple products such as the iPad Pro and Vision Pro.

The research paper, titled “Keyframer: Empowering Animation Design using Large Language Models,” explores uncharted territory in the application of LLMs to the animation industry, presenting unique challenges such as how to effectively describe motion in natural language.

Imagine this: You’re an animator with an idea that you want to explore. You’ve got static images and a story to tell, but the thought of countless hours bending over an iPad to breathe life into your creations is, well, exhausting. Enter Keyframer. With just a few sentences, those images can begin to dance across the screen, as if they’ve read your mind. Or rather, as if a large language model (LLM) has.


Credit: arxiv.org

How ‘Keyframer’ enhances the animation process through user feedback

Keyframer is powered by a large language model (the study uses GPT-4) that generates CSS animation code from a static SVG image and a text prompt. “Large language models have the potential to impact a wide range of creative domains, but the application of LLMs to animation is under-explored and presents novel challenges such as how users might effectively describe motion in natural language,” the researchers explain.

To create an animation, a user simply uploads an SVG image, types a text prompt like “Make the clouds drift slowly to the left,” and Keyframer will generate the code to make that animation happen. Users can then refine the animation by editing the CSS code directly or by adding new prompts in natural language. 
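To make the SVG-plus-CSS idea concrete, here is a minimal Python sketch of what the result of such a prompt could look like once the CSS is attached to the image. Everything in it is an assumption for illustration: the sample SVG, the `clouds` id, the `drift` keyframes and the `embed_css` helper are hypothetical and are not taken from the paper or from actual Keyframer output.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Serialize SVG elements without a namespace prefix.
ET.register_namespace("", SVG_NS)

# Hypothetical input: a static SVG whose cloud group has id="clouds".
STATIC_SVG = (
    f'<svg xmlns="{SVG_NS}" viewBox="0 0 200 100">'
    '<g id="clouds"><ellipse cx="150" cy="30" rx="30" ry="12" fill="#ddd"/></g>'
    "</svg>"
)

# Illustrative CSS of the kind an LLM might return for the prompt
# "Make the clouds drift slowly to the left" (not real Keyframer output).
GENERATED_CSS = """
#clouds { animation: drift 20s linear infinite; }
@keyframes drift {
  from { transform: translateX(0); }
  to   { transform: translateX(-120px); }
}
"""


def embed_css(svg_text: str, css: str) -> str:
    """Return the SVG with the CSS inserted as its first <style> child."""
    root = ET.fromstring(svg_text)
    style = ET.Element(f"{{{SVG_NS}}}style")
    style.text = css
    root.insert(0, style)
    return ET.tostring(root, encoding="unicode")


animated_svg = embed_css(STATIC_SVG, GENERATED_CSS)
```

Saving `animated_svg` to a `.svg` file and opening it in a browser should show the cloud sliding left. Because the animation lives in ordinary CSS, a user can also adjust the duration or distance by hand, which is the direct-editing path the article describes.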

According to the paper, “Keyframer supports exploration and refinement of animations through the combination of prompting and direct editing of generated output.” This user-centered approach was informed by several interviews with professional animation designers and engineers who provided feedback on the research tool, all of whom emphasized iterative design and creativity.

“I think this was much faster than a lot of things I’ve done… I think doing something like this before would have just taken hours to do,” said one study participant interviewed for the paper.

Expanding the horizons of large language models

The researchers found that most users took an iterative, “decomposed” approach to prompting designs, adding new prompts to animate individual elements one by one. This allowed them to adapt their goals gradually in response to the AI’s output. 

“Keyframer enabled users to iteratively refine their designs through sequential prompting, rather than having to consider their entire design upfront,” the researchers explain in the paper. Direct code editing features also enabled granular creative control.
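The decomposed workflow can be pictured as a small loop in which each prompt builds on the CSS produced so far, with hand edits allowed at any point. The Python sketch below is only an illustration of that pattern: `AnimationSession`, the `Generator` signature and `fake_generate` are hypothetical names, and the stand-in generator simply appends a comment instead of calling a real model.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical signature for a model call: (svg, current_css, prompt) -> new css.
Generator = Callable[[str, str, str], str]


@dataclass
class AnimationSession:
    svg: str
    generate: Generator
    css: str = ""
    history: List[str] = field(default_factory=list)

    def prompt(self, text: str) -> str:
        """Apply one natural-language request on top of the current CSS."""
        self.css = self.generate(self.svg, self.css, text)
        self.history.append(text)
        return self.css

    def edit(self, new_css: str) -> None:
        """Direct editing: the user replaces the generated CSS by hand."""
        self.css = new_css


def fake_generate(svg: str, css: str, prompt: str) -> str:
    """Stand-in for an LLM call: appends the prompt as a CSS comment."""
    return css + f"/* {prompt} */\n"


session = AnimationSession(svg="<svg/>", generate=fake_generate)
session.prompt("Make the clouds drift slowly to the left")
session.prompt("Make the rocket shake before liftoff")  # illustrative prompt
```

Passing the current CSS back into each call is one plausible way to let later prompts refine earlier results; the paper may structure its requests differently.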

While AI animation tools have the potential to democratize design, researchers acknowledge concerns around losing creative control and satisfaction. But by combining prompting with editing, Keyframer aims to provide accessible prototyping while maintaining user agency.

“Through this work, we hope to inspire future animation design tools that combine the powerful generative capabilities of LLMs to expedite design prototyping with dynamic editors that enable creators to maintain creative control,” the researchers conclude.

The broader impact of ‘Keyframer’ in creative industries

Keyframer could make animation more accessible to a broad spectrum of creators. It offers non-experts a way to bring stories to life through animation, a task that once required considerable technical skill and resources. It also reflects AI’s growing role as a collaborator in the creative process and suggests a shift in how the technology is used across sectors.

The implications of Keyframer extend to an anticipated cultural shift, where AI becomes a more intuitive and integral part of the human creative experience. It is not merely a technological leap, but a potential catalyst for reimagining the very fabric of our interaction with the digital realm. Apple’s move with Keyframer could well be a precursor to a new era where the boundaries between creator and creation become increasingly fluid, guided by the invisible hand of artificial intelligence.
