Astra is Google’s answer to the new ChatGPT


Pulkit Agrawal, an assistant professor at MIT who works on artificial intelligence and robotics, says the latest demonstrations from Google and OpenAI are impressive and show how quickly multimodal AI models have advanced. OpenAI launched GPT-4V, a system capable of analyzing images, in September 2023. He was impressed that Gemini could understand live video; for example, correctly interpreting changes made to a diagram on a whiteboard in real time. OpenAI’s new version of ChatGPT appears capable of the same.

Agrawal says the assistants demonstrated by Google and OpenAI could provide new training data for the companies as users interact with the models in the real world. “But they have to be useful,” he adds. “The big question is what people will use them for; it’s not very clear.”

Google says Astra will be available through a new interface called Gemini Live later this year. Hassabis said the company is still testing several prototypes of smart glasses and has not yet made a decision on whether to launch any of them.

Astra’s capabilities could give Google the opportunity to reboot a version of its ill-fated Glass smart glasses, although efforts to build hardware suited to generative AI have so far failed. Despite the impressive demonstrations from OpenAI and Google, multimodal models cannot fully understand the physical world and the objects it contains, which places limits on what they will be able to do.

“Being able to build a mental model of the physical world around you is absolutely essential to developing more human intelligence,” says Brenden Lake, an associate professor at New York University who uses AI to explore human intelligence.

Lake notes that today’s best AI models are still very language-focused, because most of their learning comes from text extracted from books and the web. This is fundamentally different from how humans acquire language, learning it while interacting with the physical world. “It’s a step back from child development,” he says of the multimodal modeling process.

Hassabis believes that giving AI models a deeper understanding of the physical world will be key to further advancing AI and making systems like Astra more robust. Other frontiers of AI, including Google DeepMind’s work on AI programs for games, could help, he says. Hassabis and others hope this work could be revolutionary for robotics, an area Google is also investing in.

“A multimodal universal agent assistant is on the path to artificial general intelligence,” Hassabis said, referring to an expected but largely undefined future point at which machines can do anything a human mind can do. “This isn’t AGI or anything like that, but it’s the start of something.”
