Google Launches Gemini AI Real-Time Video & Screen Features

Image Source: ChatGPT-4o
Google is beginning to roll out powerful new features to Gemini Live, allowing the AI assistant to “see” and interpret a smartphone’s screen or camera feed in real time. These capabilities are now available to some subscribers of the Google One AI Premium plan, Google spokesperson Alex Joseph confirmed to The Verge.
The rollout builds on Google's earlier demonstrations of its Project Astra technology—first showcased nearly a year ago—which underpins these new features.
How It Works
One of the key new tools is screen recognition. As shown by a Reddit user who spotted the feature on their Xiaomi phone, Gemini can now analyze the contents of a user's screen and answer questions about what’s displayed. Google announced this screen-reading functionality in March, confirming it would gradually roll out to Gemini Advanced subscribers on the Google One AI Premium plan.
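Google has not published an API for the consumer screen-sharing feature itself, but the underlying ask-a-question-about-an-image pattern can be approximated with the publicly documented Gemini developer SDK. Below is a minimal sketch in Python, assuming the `google-generativeai` package, a placeholder API key, and the `gemini-1.5-flash` model; it is an illustration of the capability, not Google's in-product implementation.

```python
# Hypothetical sketch: asking Gemini about a screenshot via the public
# Gemini developer SDK. The consumer Gemini Live feature is not exposed
# this way; this only approximates the underlying capability.
import google.generativeai as genai
import PIL.Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model choice

screenshot = PIL.Image.open("screenshot.png")  # stand-in for the live screen
response = model.generate_content(
    [screenshot, "What is shown on this screen, and what should I do next?"]
)
print(response.text)
```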
The second major feature is live video analysis. Using a smartphone’s camera feed, Gemini can interpret real-world visuals in real time. In a recent demonstration video, Google showed a user pointing their camera at a piece of pottery and asking Gemini to help choose a paint color—illustrating how the AI assistant can provide immediate feedback based on visual inputs.
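Gemini Live processes the camera feed continuously; a rough approximation with the same developer SDK is to sample frames from the camera and query the model one frame at a time. The capture loop, sampling interval, and pottery prompt below are illustrative assumptions rather than Google's actual streaming pipeline.

```python
# Hypothetical sketch: approximating live video analysis by sampling
# camera frames and sending each one to the Gemini developer SDK.
import time
import cv2  # pip install opencv-python
import PIL.Image
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model choice

cap = cv2.VideoCapture(0)  # default device camera
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # OpenCV returns BGR channel order; convert to RGB for PIL.
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        image = PIL.Image.fromarray(rgb)
        reply = model.generate_content(
            [image, "What paint color would complement this pottery?"]
        )
        print(reply.text)
        time.sleep(5)  # sample every few seconds; not a true stream
finally:
    cap.release()
```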
Competitive Advantage in the AI Assistant Race
Google’s rollout of Gemini’s real-time screen and video capabilities positions it ahead of some competitors in the AI assistant space. While Amazon is preparing to debut its upgraded Alexa Plus and Apple has delayed enhancements to Siri, Google is moving quickly to integrate advanced, multimodal features.
ChatGPT already offers similar vision capabilities through its Advanced Voice Mode, allowing users to show objects or scenes to the assistant and receive real-time feedback. Gemini’s rollout mirrors this functionality by embedding screen reading and live video analysis into its assistant experience, giving users continuous, real-time visual interactions directly on their devices.
Samsung, whose phones still ship with Bixby, has also leaned into Gemini by making it the default assistant on its devices, further extending Google's reach in the AI assistant market.
Looking Ahead
Google’s introduction of real-time visual recognition underscores the rapid evolution of AI assistants toward truly multimodal interaction. With OpenAI already shipping comparable vision features, the competition is clearly intensifying as leading platforms race to blend natural language understanding with real-world visual context.
As these assistants become more deeply integrated into smartphones and everyday workflows, questions remain about how users will adopt and trust these capabilities—particularly around privacy, transparency, and real-time data processing. Whether it’s Google’s Gemini, ChatGPT, or emerging alternatives, the next phase of AI assistants will likely be defined by how well they balance seamless functionality with user confidence and control.
Editor’s Note: This article was created by Alicia Shapiro, CMO of AiNews.com, with writing, image, and idea-generation support from ChatGPT, an AI assistant. However, the final perspective and editorial choices are solely Alicia Shapiro’s. Special thanks to ChatGPT for assistance with research and editorial support in crafting this article.