Just like that, we’re facing the imminent normalization of conversing with AI chatbots by talking out loud.
OpenAI today demoed GPT-4o, a new AI model, along with a voice assistant that sounds human, reads your facial expressions and even gracefully handles interruptions (it stops talking, then responds earnestly to your interruption — already superior to most humans).
The tech press — taking its cue from OpenAI CEO Sam Altman, who tweeted the monosyllabic “her” — is implying that we all get to fall in love with AI Scarlett Johansson and that ChatGPT is now like that Joaquin Phoenix movie. Yeah, no, not really.
Speaking of repressed, lonely nerds, Google engineers tomorrow will introduce (among other AI things) their new multimodal voice assistant, which can converse about a live video taken by the phone and the contents therein. They teased the announcement today in a video posted on Twitter.
All this casual chit-chat with AI reminds me of the Pi chatbot, which used to lead the natural-sounding audible conversation pack. I told you about Pi in this space and also said: This is the future of AI glasses — a prediction that this week's announcements by the major players bear out.
By the way, the Ray-Ban Meta glasses product was the Flavor of the Month ten minutes ago. Suddenly, chatting with Meta AI through the sunglasses seems slow, stilted and dated.
Three points stand out:
The OpenAI, Pi and Google chatbot voices are intensely Californian. Silicon Valley now needs to work on voices that don't sound like Scarlett Johansson, or valley girls, or vocal-fry sorority girls. 99% of the world's population doesn't talk like that.
These demos are happening on smartphones, but the tech clearly belongs on glasses.
And, finally, it’s obvious that talking to a natural-sounding AI is going to eat the world, as an emergent human behavior.
Don’t believe me? Consider:
In the 90s, people felt self-conscious and embarrassed about talking on cell phones in public. It felt weird for a while. Until it didn’t. Talking on a phone anywhere, anytime became perfectly normal.
Once that became accepted, Bluetooth headsets emerged. While holding an unsightly Nokia 5110 against your head signaled “I’m talking on the phone now,” using a Bluetooth headset made wireless conversationalists indistinguishable from insane homeless drug addicts muttering to the voices in their heads. Over time, using a Bluetooth device in public to talk on your phone became normalized. So did insane homeless drug addicts.
In the early 2000s, camera phones got better, and for years people felt uncomfortable taking pictures with their phones — until everyone got used to it.
Then, after the iPhone came out (the iPhone shipped in 2007), people started taking a lot of selfies. The selfie-takers were mocked, criticized and ostracized, but everyone got used to that behavior, too.
Eventually, some brave pioneers started taking pictures in public with tablets. It was wrong and unacceptable then, and it will always be wrong and unacceptable.
In the past ten years, extreme posing, duck-faces, public twerking, emotional performances (crying for the camera and other stupid bullshit) and all manner of shameless pandering to an unseen audience and to-hell-with-the-people-around-me-physically became commonplace and normalized, though still mocked to hilarious effect by the Influencers in the Wild guy.
Point is: People initially think some new mobile device-enabled behavior is weird and bizarre, and then it becomes normal and accepted. It's hard to believe now, but when the iPhone came out, the majority consensus was that nobody would use a phone without physical keys like the BlackBerry had. Everybody hated on-screen keyboards. Now they're the only acceptable option. "Normal" changes.
Which brings us back to the idea of people talking to AI in private, in public and in any circumstance. This is not acceptable at present. In a few months, it will become normal and stay that way until the end of time.
A conversant AI chatbot you talk to with your voice and listen to with your ears will become a daily fact of life. The percentage of people walking around and talking who are talking to AI, and not a human, will grow and grow. And, of course — as I’ve harped on mercilessly in this column — glasses, not smartphones, will enable the practice of conversing with machines for billions of people.
We’re talking to AI now. Get used to it.
Shameless Self-Promotion
A glimpse at the powerful future of information
Why you’ll soon have a digital clone of your own
Researchers develop malicious AI ‘worm’ targeting generative AI systems
Read ELGAN.COM for more!
Love food and travel? Check out our Gastronomad blog!
My Location: Sitges, Spain
(Why Mike is always traveling.)