OpenAI announces new multimodal desktop GPT with new voice and vision capabilities.

Authors :: Mearian, Lucas
Source :: Computerworld (Online Only). 5/13/2024, p1-6. 6p.
Publication Year :: 2024
Abstract :: OpenAI has announced the release of GPT-4o, a new language model that can interact with users through text, voice, and visual prompts. The model can recognize and respond to screenshots, photos, documents, and charts, as well as facial expressions and handwritten information. It has improved response times, matching human conversation speeds, and offers better performance in vision and audio understanding. While some analysts believe OpenAI is catching up to competitors, GPT-4o demonstrates impressive conversational capabilities and multilingual proficiency. The model will be rolled out iteratively, with extended access for developers and new features for paying users. [Extracted from the article]

Subjects :: *NATURAL language processing
*LANGUAGE models
*GENERATIVE artificial intelligence
*GEMINI (Chatbot)
*CHATBOTS
*CHATGPT
