22 February 2024

Sora AI: OpenAI’s new text-to-video and video-to-video tool is causing massive rifts

Sora is an AI model that can create realistic and imaginative scenes from text instructions. This could be the future of media content, though it still has problems.

Sora can create videos up to a minute long while maintaining visual quality and following the user's prompts. Sora can create complex scenes with multiple characters, specific types of motion, and precise details of the subject and background. The model understands not only what the user requested in the prompt, but also how those items exist in the real world. The model has a thorough understanding of language, allowing it to correctly interpret prompts and create compelling characters who express strong emotions. Sora can also create multiple shots within a single generated video while keeping the characters and visual style consistent across them.

The current model has flaws. It may struggle to accurately simulate the physics of a complex scene and may not comprehend specific instances of cause and effect. For example, a person may take a bite out of a cookie, but the cookie may not show any bite marks afterwards.

The model may also misinterpret spatial details of a prompt, such as left and right, and struggle with precise descriptions of events that occur over time, such as following a specific camera trajectory. OpenAI plans to include C2PA metadata in the future if it deploys the model in an OpenAI product.

Update: Microsoft confirms Sora will come to Copilot!

Videos Generated by Sora (and their prompts)

Watch them in full screen for the best experience

Prompt: A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.

Prompt: Several giant wooly mammoths approach treading through a snowy meadow, their long wooly fur lightly blows in the wind as they walk, snow covered trees and dramatic snow capped mountains in the distance, mid afternoon light with wispy clouds and a sun high in the distance creates a warm glow, the low camera view is stunning capturing the large furry mammal with beautiful photography, depth of field.

Prompt: A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.

Prompt: Historical footage of California during the gold rush.

Prompt: The camera follows behind a white vintage SUV with a black roof rack as it speeds up a steep dirt road surrounded by pine trees on a steep mountain slope, dust kicks up from it’s tires, the sunlight shines on the SUV as it speeds along the dirt road, casting a warm glow over the scene. The dirt road curves gently into the distance, with no other cars or vehicles in sight. The trees on either side of the road are redwoods, with patches of greenery scattered throughout. The car is seen from the rear following the curve with ease, making it seem as if it is on a rugged drive through the rugged terrain. The dirt road itself is surrounded by steep hills and mountains, with a clear blue sky above with wispy clouds.

Prompt: An extreme close-up of an gray-haired man with a beard in his 60s, he is deep in thought pondering the history of the universe as he sits at a cafe in Paris, his eyes focus on people offscreen as they walk as he sits mostly motionless, he is dressed in a wool coat suit coat with a button-down shirt , he wears a brown beret and glasses and has a very professorial appearance, and the end he offers a subtle closed-mouth smile as if he found the answer to the mystery of life, the lighting is very cinematic with the golden light and the Parisian streets and city in the background, depth of field, cinematic 35mm film.

Here are some interesting tweets about Sora's content generation abilities.

