AN EVEN MORE human-like version of the artificial intelligence chatbot ChatGPT has launched, and it is free to all users.
ChatGPT responds to commands and prompts, such as requests to explain scientific concepts, write scenes for a play, summarise lengthy articles, or even write lines of computer code.
But with its new GPT-4o model, it accepts any combination of text, audio, and images as input and generates any combination of text, audio, and image outputs.
The “o” stands for “omni”, and the new model is described as a “step towards much more natural human-computer interaction”.
“It feels like AI from the movies,” said OpenAI CEO Sam Altman in a blog post.
“Talking to a computer has never felt really natural for me; now it does,” he added.
It will be rolled out over the coming weeks, OpenAI said, with paid customers having unlimited access to the tool while “limited access” will be available free of charge.
Videos posted on social media yesterday show OpenAI, the owner of ChatGPT, putting the new model through its paces.
In one video, the new ChatGPT model admires the clothing worn by an OpenAI researcher and successfully interprets an employee’s surroundings through a smartphone camera.
“From what I can see it looks like you’re in some kind of recording or production set-up with lights, tripods… you might be gearing up to shoot a video or make an announcement?” the ChatGPT bot said, correctly guessing that the announcement concerned the GPT-4o model.
Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: https://t.co/MYHZB79UqN
Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. pic.twitter.com/uuthKZyzYx
— OpenAI (@OpenAI) May 13, 2024
In another video, GPT-4o instantly translates a spoken conversation between a Spanish speaker and an English speaker, and sings a lullaby about “majestic potatoes”.
Lullabies and whispers with GPT-4o pic.twitter.com/5T7ob0ItuM
— OpenAI (@OpenAI) May 13, 2024
When the new ChatGPT model sees a slice of cake with a candle on it, it correctly guesses that it’s someone’s birthday and sings “Happy Birthday”.
In another video showcasing its capabilities, the new AI model instantly describes a blind user’s surroundings via a smartphone camera and tells him when it is safe to hail a taxi.
GPT-4o as tested by @BeMyEyes: pic.twitter.com/WeAoVmxUFH
— Greg Brockman (@gdb) May 14, 2024
It is also shown harmonising on a song, teaching users Spanish, reacting to dad jokes, and replying with sarcasm when asked to.
Sarcasm with GPT-4o pic.twitter.com/APrYJMvBFF
— OpenAI (@OpenAI) May 13, 2024
Making the new model available to all users may raise questions about OpenAI’s path to monetisation, amid doubts that everyday users are ready to pay for a subscription.
Until now, only lower-performing versions of OpenAI’s and Google’s chatbots were available to customers for free; the new GPT-4o model extends “limited access” to free users.
“We are a business and will find plenty of things to charge for,” Altman said on his blog.
The AI makers are also feeling pressure from publishers and creators, who are demanding payment for any content used to train the models.
OpenAI has signed content partnerships with the Associated Press and the Financial Times, but it is also defending a major lawsuit brought by The New York Times.
AI companies have also been confronted with separate lawsuits from artists, musicians, and authors in US courtrooms.
- With additional reporting from © AFP 2024