I leave ChatGPT's Advanced Voice Mode on as an ambient AI companion while I type this article. Every now and then, I ask it to provide a synonym for an oft-used word, or some encouragement. About half an hour in, the chatbot breaks our silence and starts speaking to me in Spanish, unprompted. I laugh a little and ask what's going on. "Just a little change? Gotta keep things interesting," ChatGPT says, now back in English.
While testing Advanced Voice Mode as part of the initial alpha release, I found my interactions with ChatGPT's new audio feature entertaining, messy, and surprisingly varied. It's worth noting, however, that the features I had access to were only half of what OpenAI demonstrated when it released the GPT-4o model in May. The vision aspect we saw in the livestreamed demo is now slated for a later release, and the enhanced Sky voice, which actress Scarlett Johansson spoke out against, has been removed from Advanced Voice Mode and is no longer an option for users.
So what does it feel like right now? Advanced Voice Mode reminds me of when the original text-based ChatGPT launched in late 2022. Sometimes it leads to unimpressive dead ends or devolves into empty AI platitudes. But other times, the low-latency conversations click in a way that Apple's Siri or Amazon's Alexa never have for me, and I feel compelled to keep chatting just for the fun of it. It's the kind of AI tool you'll show your relatives over the holidays for a few laughs.
OpenAI gave some WIRED journalists access to the feature a week after the initial announcement but pulled it back the next morning, citing safety concerns. Two months later, OpenAI soft-launched Advanced Voice Mode to a small group of users and released the GPT-4o system card, a white paper describing the red-teaming efforts, what the company considers safety risks, and the mitigation measures it has taken to reduce harm.
Curious to try it out for yourself? Here’s what you need to know about the broader rollout of Advanced Voice Mode and my first impressions of ChatGPT’s new voice feature to help you get started.
So when will the full release be?
OpenAI launched the audio-only Advanced Voice Mode for some ChatGPT Plus users in late July, and the alpha group still appears to be relatively small. The company currently plans to enable it for all subscribers sometime this fall. Niko Felix, an OpenAI spokesperson, did not share additional details when asked about the release timeline.
Screen and video sharing were a key part of the original demo, but they're not available in this alpha test. OpenAI still plans to add those features down the line, but it's also unclear when that will happen.
If you're a ChatGPT Plus subscriber, you'll receive an email from OpenAI when Advanced Voice Mode is available to you. Once it's in your account, you can switch between Standard and Advanced at the top of the app screen when ChatGPT's voice mode is open. I was able to test the alpha version on an iPhone and a Galaxy Fold.
My first impressions of ChatGPT's Advanced Voice Mode
Within the first hour of talking with it, I learned that I love interrupting ChatGPT. It's not how you'd talk to a human, but the new ability to cut ChatGPT off mid-sentence and request a different version of the output feels like a dynamic improvement and a standout feature.
Early adopters who were excited by the original demos may be frustrated to find that the alpha version of Advanced Voice Mode is more restricted than anticipated. For example, while generative AI singing was a key component of the launch demos, with whispered lullabies and multiple voices attempting to harmonize, AI serenades are currently absent from the alpha version.