At the Bling Zoo, a tiger wears a giant gold medallion, a monkey sports a jeweled crown, and a turtle chews on a diamond bowl.
Unfortunately, this fantastic destination does not exist. It’s the brainchild of Sora, the new text-to-video AI program from ChatGPT maker OpenAI.
‘Bling Zoo’ was just one of a series of videos Sora created on Thursday, when CEO Sam Altman asked his followers on X (formerly Twitter) to send prompts for it to turn into videos.
The results were so realistic that one observer commented: “This one convinced me that the future is here and everything will be fine.”
One user requested that Sora create “A cooking instructional session for homemade gnocchi hosted by a social media influencer grandmother in a rustic Tuscan kitchen with cinematic lighting.”
This prompt led to the most realistic video containing a human that Altman posted on Thursday. Users marveled at how realistic the woman’s hands were, a notoriously difficult subject for AI image generators to recreate.
Altman started the stunt with a tweet: “We’d like to show you what Sora can do, please reply with captions for the videos you’d like to see and we’ll start making some.”
‘Don’t limit yourself to details or difficulty!’ he added in a follow-up post.
The prompts began arriving quickly:
“A wizard wearing a pointed hat and a blue robe with white stars casting a spell that shoots lightning from his hand and holding an old tome in the other,” wrote one respondent.
“A half-duck, half-dragon flies across a beautiful sunset with a hamster dressed in adventure gear on its back,” wrote another.
Altman delivered the results, publishing some of Sora’s creations, compiled in the following video:
One observer compared it to Interdimensional Cable, an episode of the science fiction television show Rick and Morty where a special cable box allowed viewers to glimpse television in alternate realities: a world where we are all made of corn, for example.
‘Bling Zoo’ and other similar videos were not far from that.
In response to the prompt ‘A bicycle race in the ocean with different animals as athletes riding bicycles with drone camera view’, Sora provided a video that led one commenter to speculate about Sora’s superiority over DALL-E, one of the existing generative AI art programs:
“I have a feeling that any frame taken of Sora is better than Dalle’s,” they wrote.
Sora created short videos of user prompts submitted through X, leading some to compare the results to a science fiction vision of an alternate universe.
The results were eerily realistic.
“A bike race in the ocean with different animals as athletes riding bikes with drone camera view,” asked one follower.
And Sora complied.
The same happened when someone asked Altman to have Sora show “Two golden retrievers podcasting on a mountaintop.”
For another video, a commenter requested a gnocchi cooking lesson hosted by a grandmother in a rustic Tuscan kitchen.
The resulting video showed just that, and the AI-generated woman even waved her hand to show that she had normal fingers, which are notoriously difficult for AI to render. Often, AI-generated people end up with too many or too few fingers.
‘Great flex waving those fingers in slow motion!’ wrote one commenter. ‘And there are only 10!’ answered another.
Sora will initially be released to select creators, Altman wrote on X. He and OpenAI have not announced when it will be released to the general public.
One particularly impressed X user concluded that “AGI is here.”
The abbreviation stands for artificial general intelligence, an artificial intelligence system that can function on its own without human control, understand itself, and learn new skills.
Such a system might be able to solve complex mathematical or scientific problems that would take humans years to unravel, some scientists hope.
However, AGI is both a goal and a fear among computer scientists working with AI, as some worry that such a system could view humans as a threat that must be eliminated.
In the case of Sora, the computing power is being used not to destroy humanity but to create 10-second videos based on people’s playful instructions.