OpenAI is giving a small group of ChatGPT Plus users access to the advanced version of its voice mode.
The feature, which is being released in alpha, “offers more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions,” OpenAI said Tuesday. The company will continue inviting users on a rolling basis over the next few weeks and plans to give all Plus users access in the fall. OpenAI added that video and screen-sharing capabilities are coming in the future.
The voice capabilities of GPT-4o, OpenAI’s latest model, which debuted in May, were tested with more than 100 external red teamers across 45 languages. GPT-4o was trained to speak in only four preset voices to protect the privacy of voice actors, and it is built to block outputs that use voices outside that set. As a result, ChatGPT cannot be used to impersonate individuals or public figures, the company said. OpenAI also added guardrails to block requests for copyrighted audio, including music, and for violent or harmful content.
“Learnings from this alpha will help us make the Advanced Voice experience safer and more enjoyable for everyone,” OpenAI said, adding that it plans to share more details on GPT-4o in early August.
OpenAI launched voice capabilities for ChatGPT in September, offering five distinct voices named Breeze, Cove, Ember, Juniper, and Sky. Earlier this year, however, users began drawing comparisons between the Sky voice and actress Scarlett Johansson.
Johansson responded by releasing a letter saying she was “shocked, angered, and in disbelief” that the company would use a voice “eerily similar” to hers for the chatbot after she declined to work with OpenAI. The company later paused the Sky voice, and said it was not meant to be an imitation of Johansson.