GPT-4o demo depicting Sam Altman from OpenAI and a tweet.

GPT-4o / ChatGPT-4o

Part of a series on ChatGPT.

Updated May 21, 2024 at 11:00AM EDT by Adam.

Added May 14, 2024 at 05:03PM EDT by Aidan Walker.



About

GPT-4o or ChatGPT-4o refers to an artificial intelligence assistant and interface by OpenAI that was introduced in May 2024. The "o" in GPT-4o stands for "omni." This version of ChatGPT responds more fluidly to voice prompts and can use a phone camera to scan a room and talk about what it "sees." It also has a more realistic-sounding voice and appears to have been trained to act more like an AI companion, which led to many memes and jokes about the relationships users might form with the artificial intelligence.

History

OpenAI established itself as a leader in consumer-facing AI in 2022 and 2023 with the release of ChatGPT and GPT-4. Subsequent releases and announcements, including Sora earlier in 2024, further heightened the company's profile.

In April and May 2024, OpenAI teased an upcoming demo day, inviting speculation and anticipation from users of its products. The demo was streamed live on YouTube on May 13th, 2024 (seen below), and received over 2.5 million views and 80,000 likes in a day, with comments notably disabled.[1]



OpenAI also released a series of other videos showing demos of specific features of ChatGPT-4o, in particular its conversational capabilities (seen below, left and right).



Features

On its site,[2] OpenAI described the capabilities of GPT-4o as:

GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction--it accepts as input any combination of text, audio, and image and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models.

ChatGPT-4o is not just a generative AI model, but possibly a new form of computer interface. Because it answers users in natural language with almost no lag, can "see" through a camera, and is able to perform tasks at a high level, the AI (or a more advanced version of something like it) could possibly replace screens and cursors as a method of using computers. In a blog post published on May 13th, 2024, Sam Altman said that 4o was "the best computer interface I've ever used."[4]

Altman also posted the word "her" to X on May 13th, around the time of the demo. This was likely a reference to the 2013 dystopian film titled Her about a man who enters into a doomed romance with an artificial intelligence voiced by Scarlett Johansson.[3] The post (seen below) earned over 37,000 likes in a day.


Sam Altman (@sama), 1:45 PM, May 13, 2024: "her" (37K likes, 5.3M views)

Scarlett Johansson Letter

On May 20th, 2024, news outlets published a letter by Scarlett Johansson in which she said Sam Altman had approached her to voice the ChatGPT system and that she had declined (letter shown below). She then accused the company of imitating her voice without her consent in its "Sky" voice. As a result of the threatened legal action and backlash, OpenAI shelved the "Sky" voice.


"Last September, I received an offer from Sam Altman, who wanted to hire me to voice the current ChatGPT 4.0 system. He told me that he felt that by my voicing the system, I could bridge the gap between tech companies and creatives and help consumers to feel comfortable with the seismic shift concerning humans and AI. He said he felt that my voice would be comforting to people. After much consideration and for personal reasons, I declined the offer. Nine months later, my friends, family and the general public all noted how much the newest system named "Sky" sounded like me. When I heard the released demo, I was shocked, angered and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine that my closest friends and news outlets could not tell the difference. Mr. Altman even insinuated that the similarity was intentional, tweeting a single word "her" a reference to the film in which I voiced a chat system, Samantha, who forms an intimate relationship with a human. Two days before the ChatGPT 4.0 demo was released, Mr. Altman contacted my agent, asking me to reconsider. Before we could connect, the system was out there. As a result of their actions, I was forced to hire legal counsel, who wrote two letters to Mr. Altman and OpenAI, setting out what they had done and asking them to detail the exact process by which they created the "Sky" voice. Consequently, OpenAI reluctantly agreed to take down the "Sky" voice. In a time when we are all grappling with deepfakes and the protection of our own likeness, our own work, our own identities, I believe these are questions that deserve absolute clarity. I look forward to resolution in the form of transparency and the passage of appropriate legislation to help ensure that individual rights are protected."

Online Reactions

"Dating a Model"

In mid-May 2024, many users made jokes about the release of ChatGPT-4o, particularly about the prospect of users developing infatuations with it due to the model's purportedly "flirty" tone. For example, while the livestream was still ongoing on May 13th, 2024, X user @arithmoquine posted a joke about chatting with the new AI in a flirtatious manner, receiving over 22,000 likes in a day.[5]


henry (@arithmoquine), May 13: "oh god..." [screenshot of a gpt-4o chat: USER "Hello!" / ASSISTANT "heyyyyyyy"] (22K likes, 919K views)

Others joked about specific images and moments from the demos. X user @ericwdolan received over 6,300 likes in a day on May 13th for posting the meme (seen below) captioning the speech of an OpenAI employee in a demo, quoting the We Saw You From Across The Bar And Really Dig Your Vibe phrasal template.[6]


dolan (@ericwdolan): "my ai gf and i dug your vibe from across the bar" [image from an OpenAI demo] (6.2K likes, 259K views)

Various Examples


Frantastic-e/acc (@Frantastic_7): "i highly recommend it" [meme image: "if you like.... the gpt-4o voice you should try talking to actual women"] (4.5K likes, 139K views)

vittorio (@IterIntellectus): "oh yeah, my girlfriend is a model" [image from an OpenAI demo] (3.2K likes, 194K views)

sophie (@netcapgirl), May 13: "shinzo abe in heaven watching birth rates plummet because gpt-4o has the scarlett johansson voice" (19K likes, 1.2M views)

alli (@sonofalli), May 13: "i'm dating a model" (9.8K likes, 622K views)

Cassie Evans (@cassiecodes): "This is giving such “female character as written by men” vibes. Why is she so obsequious and flirty? Mega ick." Quoting OpenAI (@OpenAI), May 13: "Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: openai.com/index/hello-gp... Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks." (2.7K likes, 706K views)

Flowers from the future (@futuristflower): "The laughter is so magnificent and seems so genuine. Many people will find it hard not to fall in love with them." (2.2K likes, 615K views)



External References

[1] YouTube – OpenAI

[2] OpenAI – Hello GPT-4o

[3] X – @sama

[4] Sam Altman – Blog

[5] X – @arithmoquine

[6] X – @ericwdolan
