Multimodal AI video glasses get closer
Meta rolled out Llama 3.1 today. Here's why that means multimodal AI video glasses are just around the corner.
At its “Spring Update” event on May 13, OpenAI dazzled attendees with GPT-4o and its ability to accept multimodal input, including video. (Instead of just typing a prompt, as with the original ChatGPT, input to GPT-4o can include text, audio, pictures and streaming video.)
The next day at “Google I/O 2024,” Google freaked ever…