Gemini 3.5 Flash can now view and control your screen, and Google wants businesses to trust it.
TL;DR Google has made computer use a built-in feature of Gemini 3.5 Flash, replacing the standalone Gemini 2.5 model and adding enterprise security measures. The capability lets AI agents operate a screen by clicking, typing, and scrolling across different devices and browsers, something that previously required a dedicated model. Developers can now enable computer use as one of several tools in Flash, alongside code execution and search.
The original standalone model was released in October 2025, achieving about 70 percent accuracy on the Online-Mind2Web benchmark. The new integration simplifies the process by merging what were once two separate workflows into one. Google emphasizes that this tool can automate tasks beyond simple chatbot interactions, enabling software testing and assisting knowledge workers with complex browser tasks.
Google has added safety measures, including targeted adversarial training against prompt injection attacks, in which malicious on-screen content tries to mislead an AI agent. The company also offers two optional enterprise safeguards: one requires user confirmation before sensitive actions, and the other halts the agent when it detects a potential prompt injection attempt. Neither safeguard is enabled by default, and Google advises a layered approach to security.
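To make the two safeguards concrete, here is a minimal Python sketch of how an agent loop might gate each proposed action. It is an illustration only, not Google's API: the `Action` type, the `gate` function, the `SENSITIVE_KINDS` set, and the `injection_suspected` flag are all hypothetical names invented for this example. Both options default to off, mirroring the article's note that neither safeguard is enabled by default.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical set of action kinds treated as sensitive (an assumption,
# not a list taken from Google's documentation).
SENSITIVE_KINDS = {"submit_payment", "delete_file", "send_email"}


@dataclass
class Action:
    kind: str                          # e.g. "click", "type", "scroll"
    target: str                        # e.g. a UI element description
    injection_suspected: bool = False  # set by a hypothetical upstream detector


def gate(
    action: Action,
    confirm: Callable[[Action], bool],
    require_confirmation: bool = False,  # safeguard 1, off by default
    halt_on_injection: bool = False,     # safeguard 2, off by default
) -> str:
    """Return 'execute', 'skip', or 'halt' for a proposed action."""
    # Safeguard 2: stop the whole run if an injection attempt is suspected.
    if halt_on_injection and action.injection_suspected:
        return "halt"
    # Safeguard 1: ask a human before running a sensitive action.
    if require_confirmation and action.kind in SENSITIVE_KINDS:
        return "execute" if confirm(action) else "skip"
    return "execute"
```

With both flags left at their defaults, every action is executed unchecked, which is why turning the safeguards on and layering them with other controls matters in practice.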
The competitive landscape has shifted. Anthropic's Claude Computer Use offers cross-platform versatility and enhanced desktop capabilities, and Google's Flash seeks to extend similar functions beyond Chrome. OpenAI has also entered the market, and competition now centers on safe execution within regulated environments.
Google has not published updated benchmark scores for the integrated tool, nor has it shared enterprise adoption figures or case studies. The Gemini Enterprise Agent Platform offers pay-as-you-go pricing, which could make extensive automation more affordable than it was with previous models. Whether that cost advantage holds depends on how complex the agent workflows are and how often the safety measures require user confirmation.
Computer use by AI models remains in its infancy. Models can navigate familiar interfaces, but they often struggle with unexpected pop-ups and complex layouts. Google's decision to make the capability a built-in feature signals a belief that it is ready for general use, yet the optional safety measures acknowledge that it is not fully prepared for unsupervised operation.
