OpenAI is adding even more features to its ChatGPT chatbot. Today, the company announced that it has begun rolling out new voice features to its mobile apps, along with ways to upload images that can be analyzed using ChatGPT.
In a blog post, OpenAI announced that ChatGPT users will soon be able to talk to the chatbot. Once the feature is available in the iOS and Android apps, users can go to the settings menu and tap the new features section. There, they can opt in to voice conversations. Finally, they can tap the headphone icon and choose one of five voice options.
According to OpenAI, the new voice capability is powered by a new text-to-speech model that can produce human-like audio from just text and a few seconds of sample speech. The company says it collaborated with professional voice actors to create each of the voices. It also uses Whisper, its open source speech recognition system, to transcribe users' spoken words into text.
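The voice flow described above (speech recognition, then the chatbot, then text-to-speech) can be pictured as a three-stage pipeline. The Python sketch below is purely illustrative: every function name, the `VoiceTurn` type, and the `"voice-1"` label are hypothetical stand-ins, and nothing here reflects OpenAI's actual implementation or API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """One round trip of a voice conversation (hypothetical type)."""
    transcript: str     # what the user said, as text
    reply_text: str     # the chatbot's textual answer
    reply_audio: bytes  # synthesized speech for that answer


def run_voice_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],       # speech recognition stage
    generate_reply: Callable[[str], str],     # chatbot stage
    synthesize: Callable[[str, str], bytes],  # text-to-speech stage
    voice: str,
) -> VoiceTurn:
    """Chain the three stages: speech -> text -> reply -> speech."""
    transcript = transcribe(audio_in)
    reply_text = generate_reply(transcript)
    reply_audio = synthesize(reply_text, voice)
    return VoiceTurn(transcript, reply_text, reply_audio)


# Stand-in stages so the sketch runs without any model or network access.
def fake_transcribe(audio: bytes) -> str:
    # Assumption: pretend the "audio" bytes are already UTF-8 text.
    return audio.decode("utf-8")


def fake_reply(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str, voice: str) -> bytes:
    # Assumption: tag the text with the chosen voice instead of making audio.
    return f"[{voice}] {text}".encode("utf-8")


turn = run_voice_turn(
    b"hello", fake_transcribe, fake_reply, fake_synthesize, voice="voice-1"
)
print(turn.reply_text)
```

In a real system, each stand-in would be replaced by a call to an actual speech recognition model, a chat model, and a speech synthesizer. Passing the stages in as functions keeps the overall flow visible without tying the sketch to any particular service.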
Users of the ChatGPT mobile apps will also soon be able to tap the photo button to take a picture or select an existing one. After that, ChatGPT can inspect the image and perform a number of different tasks, such as analyzing a graph for work or troubleshooting a device that is not working.
OpenAI says image understanding is powered by multimodal versions of GPT-3.5 and GPT-4. These models apply their language reasoning skills to a wide variety of images, such as photographs, screenshots, and documents that contain both text and images.
The new features are rolling out over the next few weeks and will be available first for ChatGPT Plus and Enterprise users. These features will be extended to developers and other ChatGPT users in the near future.
Last week, OpenAI announced DALL-E 3, the next version of its AI image generator, which will offer integration with ChatGPT. It will officially launch in October.