Google Launches Gemini AI Real-Time Video & Screen Features

Image Source: ChatGPT-4o
Google is beginning to roll out powerful new features to Gemini Live, allowing the AI assistant to “see” and interpret a smartphone’s screen or camera feed in real time. These capabilities are now available to some subscribers of the Google One AI Premium plan, Google spokesperson Alex Joseph confirmed to The Verge.
The rollout builds on Google's earlier demonstrations of its Project Astra technology—first showcased nearly a year ago—which underpins these new features.
How It Works
One of the key new tools is screen recognition. As shown by a Reddit user who spotted the feature on their Xiaomi phone, Gemini can now analyze the contents of a user's screen and answer questions about what’s displayed. Google announced this screen-reading functionality in March, confirming it would gradually roll out to Gemini Advanced subscribers on the Google One AI Premium plan.
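Google has not published an API for the consumer screen-sharing feature itself, but the underlying ask-a-question-about-an-image pattern can be approximated with the publicly documented Gemini developer SDK. Below is a minimal sketch in Python, assuming the `google-generativeai` package, a placeholder API key, and the `gemini-1.5-flash` model; it is an illustration of the capability, not Google's in-product implementation.

```python
# Hypothetical sketch: asking Gemini about a screenshot via the public
# Gemini developer SDK. The consumer Gemini Live feature is not exposed
# this way; this only approximates the underlying capability.
import google.generativeai as genai
import PIL.Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model choice

screenshot = PIL.Image.open("screenshot.png")  # stand-in for the live screen
response = model.generate_content(
    [screenshot, "What is shown on this screen, and what should I do next?"]
)
print(response.text)
```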
The second major feature is live video analysis. Using a smartphone’s camera feed, Gemini can interpret real-world visuals in real time. In a recent demonstration video, Google showed a user pointing their camera at a piece of pottery and asking Gemini to help choose a paint color—illustrating how the AI assistant can provide immediate feedback based on visual inputs.
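Gemini Live processes the camera feed continuously; a rough approximation with the same developer SDK is to sample frames from the camera and query the model one frame at a time. The capture loop, sampling interval, and pottery prompt below are illustrative assumptions rather than Google's actual streaming pipeline.

```python
# Hypothetical sketch: approximating live video analysis by sampling
# camera frames and sending each one to the Gemini developer SDK.
import time
import cv2  # pip install opencv-python
import PIL.Image
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model choice

cap = cv2.VideoCapture(0)  # default device camera
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # OpenCV returns BGR channel order; convert to RGB for PIL.
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        image = PIL.Image.fromarray(rgb)
        reply = model.generate_content(
            [image, "What paint color would complement this pottery?"]
        )
        print(reply.text)
        time.sleep(5)  # sample every few seconds; not a true stream
finally:
    cap.release()
```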
Competitive Advantage in the AI Assistant Race
Google’s rollout of Gemini’s real-time screen and video capabilities positions it ahead of some competitors in the AI assistant space. While Amazon is preparing to debut its upgraded Alexa Plus and Apple has delayed enhancements to Siri, Google is moving quickly to integrate advanced, multimodal features.
ChatGPT already offers similar vision capabilities through its Advanced Voice Mode, allowing users to show objects or scenes to the assistant and receive real-time feedback. Gemini’s rollout mirrors this functionality by embedding screen reading and live video analysis into its assistant experience, giving users continuous, real-time visual interactions directly on their devices.
Samsung, whose phones still ship with Bixby, has also leaned into Gemini by making it the default assistant on its devices, further extending Google's reach in the AI assistant market.
Looking Ahead
Google’s introduction of real-time visual recognition underscores the rapid evolution of AI assistants toward truly multimodal interaction. With OpenAI already shipping comparable vision features, the competition is clearly intensifying as leading platforms race to blend natural language understanding with real-world visual context.
As these assistants become more deeply integrated into smartphones and everyday workflows, questions remain about how users will adopt and trust these capabilities—particularly around privacy, transparency, and real-time data processing. Whether it’s Google’s Gemini, ChatGPT, or emerging alternatives, the next phase of AI assistants will likely be defined by how well they balance seamless functionality with user confidence and control.
Editor’s Note: This article was created by Alicia Shapiro, CMO of AiNews.com, with writing, image, and idea-generation support from ChatGPT, an AI assistant. However, the final perspective and editorial choices are solely Alicia Shapiro’s. Special thanks to ChatGPT for assistance with research and editorial support in crafting this article.