ChatGPT Can Now See, Hear & Speak

OpenAI has begun rolling out new voice and image capabilities for its ChatGPT chatbot.

Voice functionality

You can now use voice to engage in a back-and-forth conversation with your assistant. Speak with it on the go, request a bedtime story for your family, or settle a dinner table debate.

To get started with voice, head to Settings → New Features on the mobile app and opt into voice conversations. Then tap the headphone button in the top-right corner of the home screen and choose your preferred voice from the five available.

The new voice capability is powered by a new text-to-speech model, capable of generating human-like audio from just text and a few seconds of sample speech.

Image functionality

You can now show ChatGPT one or more images. Troubleshoot why your grill won’t start, explore the contents of your fridge to plan a meal, or analyze a complex graph for work-related data. To focus on a specific part of the image, you can use the drawing tool in the mobile app.

To get started, tap the photo button to capture or choose an image. If you’re on iOS or Android, tap the plus button first. You can also discuss multiple images or use the drawing tool to guide your assistant.

Image understanding is powered by multimodal GPT-3.5 and GPT-4. These models apply their language reasoning skills to a wide range of images, such as photographs, screenshots, and documents containing both text and images.

To see videos of how these features work, please visit this link.

This article first appeared on the OpenAI blog.
