Voice and image-based ChatGPT prompts are here

Neville Lahiru

In its latest update, OpenAI announced that it’s rolling out voice and images to ChatGPT. Users will now be able to use voice and image-based prompts alongside text on the popular AI bot. The new features will be available within the next couple of weeks to Plus and Enterprise users of the service.

Similar to Siri or Google Assistant, you can prompt ChatGPT with your voice. Here, the bot will convert your speech to text, process it through its large language model, and form an audio response. The feature is part of OpenAI’s efforts to build more audio capabilities for its tools and services. The company states that its voice tech can create “realistic synthetic voices from just a few seconds of real speech” and that limiting the technology to the voice chat feature is a means of mitigating the potential risks.
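The three stages described above (speech to text, language model, text to speech) can be sketched as a simple pipeline. This is only an illustrative outline, not OpenAI’s implementation: the `VoicePipeline` class and the `fake_*` stand-in functions are hypothetical names invented for this example, and a real system would swap in actual speech recognition, language model, and speech synthesis components.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Hypothetical three-stage voice chat pipeline (illustrative only)."""
    transcribe: Callable[[bytes], str]   # stage 1: audio in -> text prompt
    generate: Callable[[str], str]       # stage 2: text prompt -> text reply
    synthesize: Callable[[str], bytes]   # stage 3: text reply -> audio out

    def respond(self, audio_in: bytes) -> bytes:
        prompt = self.transcribe(audio_in)
        reply = self.generate(prompt)
        return self.synthesize(reply)


# Stand-in stages so the sketch runs without any external service.
# Here "audio" is just UTF-8 bytes; these are assumptions, not real models.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_generate(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_transcribe, fake_generate, fake_synthesize)
audio_out = pipeline.respond(b"what is the weather?")
```

Keeping each stage as a separate callable mirrors the description in the article: any one stage could be replaced without touching the other two.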

Spotify’s new voice translation feature uses OpenAI’s voice tech to help podcasters translate conversations into other languages in their own voices.

On the image front, the new additions will let users upload one or more images, and ChatGPT will respond based on the related query. The feature is powered by the GPT-3.5 and GPT-4 models, and users can also direct the bot to a specific part of an image using the mobile app’s drawing tool. As with voice, image input will also be restricted to minimize certain risks. OpenAI mentions that ChatGPT’s “ability to analyze and make direct statements about people” is limited owing to privacy and accuracy concerns.

OpenAI recently unveiled the third iteration of DALL-E, with researchers claiming that the AI image generator understands context much better. The updated version comes with more safety options and, more importantly, it’s now integrated with ChatGPT itself. This means that users won’t need to come up with their own detailed prompts to generate a specific image on DALL-E and can rely on ChatGPT instead. However, it’s worth noting that, just like the voice and image-based prompts, the new version will only be available to Plus and Enterprise users in October, followed by research labs and the API service.

But OpenAI isn’t the only one to make nifty upgrades to its AI tools. Google recently updated its own AI chatbot, Google Bard, to integrate with Google tools like Gmail, Docs, Flights, and YouTube. The new feature aims to make the chatbot more useful by letting users pull specific information from emails, summarize documents, and get contextualized itineraries directly in the chat.

As of now, it remains to be seen how the new features will pan out in the long run. With more integrations and additional input forms coming into the picture, it will be interesting to see how the likes of OpenAI and Google will ensure the necessary protections are in place, as the technology continues to grow its foothold in the digital space.
