I tried out the world-understanding avatar of Gemini Live, and it was surprising.

      It feels a bit unsettling to hear an AI, speaking in a strangely friendly manner, prompt me to organize the clutter on my desk. I take some pride in that clutter, but perhaps it's time to stack the randomly placed gadgets and sort out the tangled wires.

      My sister would likely concur. However, the more significant aspect is how quickly I felt prompted to act after an AI "saw" my workspace, recognized the chaos, and offered domestic advice. Google’s Gemini AI chatbot is capable of that now, along with much more.

      The catalyst for this is a recent feature update named Project Astra. This has been in development for several years and finally launched earlier this month. The main aim is to deliver an all-seeing, all-hearing, and remarkably intelligent AI to your smartphone.

      Google markets these capabilities under a rather lackluster title: Gemini Live with camera and screen sharing. Developed by the company's DeepMind division, it started as a concept for a “universal AI assistant.” It's unfortunate that the final name lacks an aspirational quality.

      Let’s discuss availability. This feature is currently available for Pixel 9 and Galaxy S25 users. If you own another Android phone, you can use the new toolkit with a Gemini Advanced subscription.

      This subscription costs $20 per month, by the way. I tested it on both of those devices and now have it available on my OnePlus 13 as well. The best part? There are no complex steps to access it.

      You only need a power/volume button combination or a corner swipe on the screen to summon Gemini, regardless of the app you’re using. You can access the new camera and screen-sharing features as an overlay throughout the OS.

      Understanding Your Environment

      I began by pointing the camera at a painting and asking about it. Gemini Live accurately identified it as a Madhubani-style painting, elaborating on its vibrant colors and animal representations.

      It then provided a concise history lesson on the art form and its various evolutions. The information was spot-on, even at a detailed level. Thankfully, you can also engage in a text-based conversation with Gemini if you're in a setting where speaking aloud could be awkward.

      What I appreciate most about Gemini Live’s new camera and screen-sharing feature is its thoughtful brevity. It allows interruptions at any moment, enhancing the "natural" feel of the exchanges.

      I experimented with Gemini in multiple situations, and I was unprepared for its capabilities.

      Typically, the responses are brief, as if it encourages you to ask follow-up questions rather than overwhelming you with lengthy answers. It excels across various topics and visual scenarios, though there are some limitations.

      It currently lacks the ability to utilize Google Lens, meaning Gemini cannot compare images it sees on your screen with web matches. Additionally, it cannot access real-time information if you ask for updates on recent developments related to a topic or person.

      I asked about plant species, restaurant listings, and information on notice boards, and had it decipher a medical prescription I received after a recent bout of flu. Gemini performed quite well, showcasing improvements compared to previous experiences with the AI chatbot.

      Accessing a Wealth of Knowledge

      Next, I challenged Gemini to help me comprehend complex academic material. I placed a book on Machine Learning in front of the camera. Gemini Live not only recognized it but also gave a summary of the book’s topics and key subjects.

      Interestingly, as I flipped through the pages and reached the chapter list, the AI recognized this progress, paused, and asked if I wanted to know more about any specific chapter.

      I was pleasantly surprised at that moment.

      When I requested explanations of a few intricate topics, the AI did a commendable job, even sourcing additional information from its vast knowledge database that went beyond the content on the page.

      For instance, when I asked about the introductory page of Bhisham Sahni’s notable novel, Tamas, the AI accurately noted the mention of the Sahitya Akademi Award. It then provided details not found on the page, including the year the novel received the award and a summary of the book.

      Conversely, the Hindi readout produced by Gemini Live was subpar. It wasn't just a poor accent; the AI often resorted to nonsensical or incorrect phrases. While reading Urdu, Persian, and Arabic, it performed notably better, yet frequently mixed up words from various lines.

      During my first attempt with Urdu poetry, it recognized the text and provided an accurate summary of the poem. The main challenge, once again, was the narration—listening to an anglicized rendition of Urdu was quite unpleasant.

      Surprising Strengths

      AI serves as a remarkable problem-solving tool, and numerous benchmarks support this. I tested it with thermodynamics problems, electrochemical equations, and statistics questions from a handwritten notebook. Gemini Live excelled in these tasks.

      It also performed well with creative requests. My sister, a fashion designer, presented one of her sketches for feedback and suggestions. Gemini Live started by complimenting the design
