
GPT-4o / ChatGPT-4o
About
GPT-4o or ChatGPT-4o refers to an artificial intelligence assistant and interface by OpenAI that was introduced in May 2024. The "o" in ChatGPT-4o stands for "omni." Compared to earlier versions of ChatGPT, it responds more fluidly to voice prompts and can use a phone camera to scan a room and talk about what it "sees." It also has a more realistic-sounding voice and appears to have been trained to act more like an AI companion, leading to many memes and jokes about the relationships users might form with the artificial intelligence.
History
OpenAI established itself as a leader in consumer-facing AI in 2022 and 2023 with the releases of ChatGPT and GPT-4. Subsequent releases and announcements, including Sora earlier in 2024, further heightened the company's profile.
In April and May 2024, OpenAI teased an upcoming demo day, inviting speculation and anticipation from users of its products. The demo was released on May 13th, 2024, as a livestream video on YouTube (seen below), which received over 2.5 million views and 80,000 upvotes in a day, with comments notably disabled.[1]
OpenAI also released a series of other videos showing demos of specific features of ChatGPT-4o, in particular its conversational capabilities (seen below, left and right).
Features
On its site,[2] OpenAI described the capabilities of GPT-4o as:
GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction--it accepts as input any combination of text, audio, and image and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models.
ChatGPT-4o is not just an AI generator, but possibly a new form of computer interface. Since it has almost no lag time in answering users in natural language, can "see" through a camera, and is able to perform tasks at a high level, the AI (or a more advanced version of something like it) could possibly replace screens and cursors as a method of using computers. Sam Altman said that 4o was "the best computer interface I've ever used" in a blog post published on May 13th, 2024.[4]
Altman also posted the word "her" to X on May 13th, around the time of the demo. This was likely a reference to the 2013 dystopian film titled Her about a man who enters into a doomed romance with an artificial intelligence voiced by Scarlett Johansson.[3] The post (seen below) earned over 37,000 likes in a day.

Scarlett Johansson Letter
On May 20th, 2024, news outlets published a letter by Scarlett Johansson in which she said Sam Altman had approached her to voice ChatGPT and that she had refused (letter shown below). She then accused the company of imitating her voice without her consent through "Sky," one of ChatGPT's voice options. As a result of the threatened legal action and backlash, OpenAI shelved the "Sky" voice.

Online Reactions
"Dating a Model"
In mid-May 2024, many users made jokes about the release of ChatGPT-4o, particularly about the prospect of users developing infatuations with it due to the model's purportedly "flirty" tone. For example, on May 13th, 2024, while the livestream was still ongoing, X user @arithmoquine posted a joke about chatting with the new AI in a flirtatious manner. The post received over 22,000 likes in a day.[5]

Others joked about specific images and moments from the demos. On May 13th, X user @ericwdolan received over 6,300 likes in a day for posting a meme (seen below) that captioned an OpenAI employee's speech in a demo with the We Saw You From Across The Bar And Really Dig Your Vibe phrasal template.[6]
