A video generated by OpenAI's new AI text-to-video disruptor tool Sora.

OpenAI, the creator of ChatGPT, has unveiled a new form of artificial intelligence that creates realistic video based on text prompts, prompting stunned reactions online.

The text-to-video model, named Sora, has “a deep understanding of language” and can generate “compelling characters that express vibrant emotions,” OpenAI said in a blog post on Thursday.

SORA is just out of this world.
Story continues below Advertisement
Remove Ad
OpenAI’s new text-to-video model just dropped and it’s insane.
More examples below ⬇️ pic.twitter.com/qbMy5Rz5Mc

— Linus (●ᴗ●) (@LinusEkenstam) February 15, 2024

“The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.” It feels like the model has an advanced understanding of the 3D space in the scene along side subject, object and their correlation. It goes beyond creating moving pictures beyond 2D image. It maintains visual quality, character coherence, and stays loyal to the user's prompt. Multi-model is the future of AI, OpenAI CEO Sam Altman had said earlier.

Altman, on X (formerly Twitter), invited users to suggest prompts for Sora before posting results that included realistic videos of two golden retrievers podcasting on top of a mountain, a grandmother making gnocchi, and marine animals taking part in a bicycle race on top of the ocean.

Introducing Sora, our text-to-video model.
Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. https://t.co/7j2JN27M3W
Prompt: “Beautiful, snowy… pic.twitter.com/ruTEWn87vf

— OpenAI (@OpenAI) February 15, 2024

OpenAI said in its blog post that it would be taking several important safety steps before releasing Sora to the general public.

“We are working with red teamers – domain experts in areas like misinformation, hateful content, and bias - who will be adversarially testing the model,” the company said.

“We’re also building tools to help detect misleading content such as a detection classifier that can tell when a video was generated by Sora.”

While Sora seems to have OpenAI also acknowledged that Sora has weaknesses, including difficulty with continuity, following a specific camera trajectory and distinguishing left from right. Sora, sometimes, creates implausible conditions.

“For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark,” the San Francisco-based startup said.
OpenAI rivals Meta and Google have also demonstrated text-to-video AI technology, but their models have not produced results as realistic as Sora’s.

Despite these challenges, Sora promises immense potential. Five or 10 years from now, there can be a complete democratisation of filmmaking and self-expression, where anyone can create their own cinematic universe. Video games will become incredibly realistic with these advance models and upscalers. Although issues of safety and deep fakes will remain critical issues.

OpenAI drops 'Sora', new groundbreaking text-to-video tool, can make films in the future

Microsoft-backed startup & ChatGPT maker OpenAI's new disruptor Artificial Intelligence tool, Sora, stuns social media with hyper-realistic 60-second videos created using text prompts. The potential of Sora is massive.

A video generated by OpenAI's new AI text-to-video disruptor tool Sora.

Related stories