ChatGPT, the revolutionary chatbot powered by artificial intelligence (AI), will soon be able to send much more than human-like text messages.
A Microsoft executive has revealed that the next version – to be released this week – will be able to turn text prompts into unique videos.
The tech giant has invested heavily in ChatGPT and has already unveiled a host of new products that integrate it as an AI assistant, such as search engine Bing.
But this updated version, dubbed GPT-4 and expected to launch Thursday, will have “multimodal models,” according to Andreas Braun, CTO of Microsoft Germany.
This means that it can generate content in multiple formats, such as audio clips, images, and video clips, from a text prompt.
ChatGPT is a large language model trained on a huge amount of text data, which allows it to generate human-like text responses to a given prompt.
The current version, released in November by start-up OpenAI, is known as GPT-3.5 and appears to have a huge range of capabilities.
For example, it has been used to take exams, deliver a sermon, write software, and provide relationship advice.
It was limited to providing answers as text, but Mr Braun revealed at the ‘AI in Focus – Digital Kickoff’ event last Thursday that this is about to change.
According to heise, he said: ‘Next week we will introduce GPT-4, where we have multimodal models that offer completely different possibilities – for example videos.’
This isn’t a completely groundbreaking concept — in September, rival tech giant Meta unveiled its own AI system that generates videos from text prompts.
‘Make-A-Video’ is trained on captioned images to learn about the world and how it is described, and unlabeled videos to determine how the world moves.
While the resulting clips are impressive, they are often blurry and lack sound.
Make-A-Video has yet to be made available to the public, but the release of GPT-4 may change that.
Experts have said that the success of ChatGPT and OpenAI’s partnership with Microsoft prompted Google to release its own AI chatbot, Bard.
Speculation started when Bard got a question wrong in a promotional video – wiping £100bn off the company’s value.
While GPT-4 will be OpenAI’s first foray into video generation, it has already developed a text-to-image AI, DALL-E.
In 2020, the company also announced Jukebox, a tool that creates music from a prompt and can mimic the style of different artists.
While not specifically mentioning these tools, Mr. Braun said the new ChatGPT “will make the models comprehensive.”
During the “AI in Focus” event, which was broadcast to Microsoft partners and potential customers, Mr. Braun did not disclose whether GPT-4 would be released on its own or as part of a product.
The tech company does have an event scheduled for Thursday to showcase “the future of AI,” which may yield more information.
Rumors about what this update will look like have been circulating since 2021, with Wired speculating that it will use 100 trillion parameters.
These would give it many more options for the ‘next word’ or ‘next sentence’ in a given context than it currently has, making its output more human-like.
However, this has been dismissed by Sam Altman, CEO of OpenAI, who told StrictlyVC the figure was ‘total bulls**t’.
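The ‘next word’ prediction idea behind these models can be illustrated with a toy sketch. This is not how GPT actually works – a real large language model uses billions of learned neural-network parameters, not simple word counts – but it shows the basic principle of choosing the most likely continuation from context:

```python
from collections import Counter, defaultdict

# Toy 'next word' predictor: count which word follows which in a tiny
# corpus, then pick the most frequent follower. Purely illustrative --
# a real LLM learns these statistics with a neural network instead.
corpus = "the cat sat on the mat the cat ate the fish".split()

followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    return followers[word].most_common(1)[0][0]

print(predict_next("the"))  # 'cat' follows 'the' most often here -> cat
```

More parameters mean the model can track far richer context than a single preceding word, which is why output quality tends to scale with model size.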
Others have said that GPT-4 will be better at generating computer code, handling longer text prompts, and outputting text, images, sounds, and videos.
Mr. Altman told the “AI for the Next Era” podcast: “I think we’re going to get more multimodal models before long, and that will open up new things.”
While comprehensive multimodal AI is a new concept, the impact of AI video generation has been debated for years, particularly with regard to “deepfakes.”
These are forms of AI that use “deep learning” to manipulate audio, images or video, creating hyper-realistic, but fake media content.
The term was coined in 2017 when a Reddit user posted manipulated porn videos on the forum.
The videos swapped the faces of celebrities like Gal Gadot, Taylor Swift, and Scarlett Johansson onto the bodies of porn performers.
Another notorious example of a deepfake or “cheapfake” was a crude impersonation of Volodymyr Zelensky appearing to surrender to Russia in a video that was widely circulated on Russian social media last year.
The clip shows the Ukrainian president speaking from his lectern as he calls on his troops to lay down their arms and yield to Putin’s invading forces.
Astute internet users immediately noticed the mismatch between the color of Zelensky’s neck and face, the strange accent, and the pixelation around his head.
Despite the entertainment value of deepfakes, some experts have warned of the dangers they can pose.
Dr. Tim Stevens, director of the Cyber Security Research Group at King’s College London, said deepfake AI has the potential to undermine democratic institutions and national security.
He said the widespread availability of these tools could be exploited by states like Russia to “troll” target groups in an effort to achieve foreign policy objectives and “undermine” countries’ national security.
He added: “The potential is there for AIs and deepfakes to compromise national security.
“Not at the high level of defense and interstate warfare, but in the general undermining of trust in democratic institutions and the media.
“They could be exploited by autocracies like Russia to diminish trust in those institutions and organizations.”
In fact, it has been predicted that 90 percent of online content will be generated using artificial intelligence by 2025.