OpenAI's Sora joins text-to-video AI content generation race

OpenAI today announced Sora, a new text-to-video model that can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt.

Text-to-video is arguably the next big thing in artificial intelligence, and OpenAI isn’t the first to the party. Meta Platforms Inc., Google LLC and Runway AI Inc., among others, offer similar services. The challenge with all of them has been quality: Though the videos some existing services make are highly impressive, the Holy Grail is making realistic videos, and not all get that close.

Sora is a diffusion model, a generative machine learning model that creates data such as images or videos by gradually refining random noise into structured patterns based on learned data distributions. Sora can generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model also understands not only what the user has asked for in the prompt but also how those things exist in the physical world.
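For readers who want a feel for what "gradually refining random noise" means, here is a toy sketch in Python. It is not Sora's code and makes no claim about how OpenAI built the model: the names `alpha_bar`, `oracle_denoiser` and `sample` are hypothetical, the "data" is a four-number list rather than video, and the denoiser is an oracle that already knows the clean answer, standing in for the trained neural network a real diffusion model would use. The update rule is a deterministic, DDIM-style step under an assumed cosine noise schedule.

```python
import math
import random


def alpha_bar(t, steps):
    # Share of the clean signal left at step t: 1.0 at t=0, about 0 at t=steps.
    # A cosine-style schedule, chosen here purely for illustration.
    return math.cos(0.5 * math.pi * t / steps) ** 2


def oracle_denoiser(x, t, target):
    # Stand-in (assumption) for a trained network that predicts the clean data
    # from a noisy input. This oracle simply "knows" the answer.
    return list(target)


def sample(target, steps=50, seed=0):
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure noise
    errors = [math.dist(x, target)]            # distance from the clean data
    for t in range(steps, 0, -1):
        ab_t = alpha_bar(t, steps)
        ab_prev = alpha_bar(t - 1, steps)
        # 1) Ask the denoiser for its guess at the clean data.
        x0_hat = oracle_denoiser(x, t, target)
        # 2) Infer the noise that would explain the current sample.
        eps_hat = [
            (xi - math.sqrt(ab_t) * x0) / math.sqrt(1.0 - ab_t)
            for xi, x0 in zip(x, x0_hat)
        ]
        # 3) Re-mix guess and noise at the next, less noisy level.
        x = [
            math.sqrt(ab_prev) * x0 + math.sqrt(1.0 - ab_prev) * e
            for x0, e in zip(x0_hat, eps_hat)
        ]
        errors.append(math.dist(x, target))
    return x, errors


if __name__ == "__main__":
    result, errors = sample([0.2, 0.8, 0.5, 0.1])
    print("start error:", errors[0])
    print("final error:", errors[-1])
```

Each pass through the loop keeps a little more of the predicted signal and a little less of the noise, so the sample ends at the clean data. In a real system the oracle is replaced by a network trained on large amounts of images or video, and the text prompt steers its predictions.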

According to OpenAI, the model has a deep understanding of language, enabling it to interpret prompts accurately and generate “compelling characters that express vibrant emotions.” The service can also create multiple shots within a single generated video that accurately portray characters and visual style.

To its credit, OpenAI has been open about the model’s flaws as well. Sora, at least as it stands in testing, has weaknesses: It may struggle to accurately simulate the physics of a complex scene and may not understand specific instances of cause and effect. The model may also confuse spatial details of a prompt, for example mixing up left and right, and may struggle with precise descriptions of events that take place over time, such as following a specific camera trajectory.

Those flaws are an issue, but the model is young and some of the first demonstrations are stunning.

The video above was made using the prompt, “A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.”

Although Sora looks great, ChatGPT users will have to wait to get their hands on it. As of today, Sora is being made available only to “red teamers,” who will assess critical areas for harms or risks. OpenAI is also granting access to a number of visual artists, designers and filmmakers to gather feedback on how to make the model most helpful for creative professionals.

“We’re sharing our research progress early to start working with and getting feedback from people outside of OpenAI and to give the public a sense of what AI capabilities are on the horizon,” OpenAI said.

Image: OpenAI

