Gemini, launched in December, will soon be woven into nearly all Google products. Google CEO Sundar Pichai underscored AI's significance by mentioning it 121 times during his keynote speech, which showcased how the technology will enhance user experiences.
One notable change is Gemini's deeper interaction with applications. Users will soon be able to drag and drop AI-generated images into messages, a demonstration of the model’s versatility. YouTube will also get an “Ask this video” feature, which lets users extract specific information from videos through AI queries.
Gmail users can look forward to significant AI integration. With Gemini, users will be able to search, summarize, and draft emails with ease. The AI assistant can also handle complex chores such as processing e-commerce returns, searching the inbox for receipts and filling out the online forms on the user's behalf.
A new experience, Gemini Live, will enable users to have “in-depth” voice chats with AI on their smartphones. This feature allows for real-time interaction, adapting to speech patterns and providing context-aware responses. Additionally, Gemini can interpret and respond to photos and videos taken on the device, making interactions more dynamic.
Google is also developing intelligent agents capable of reasoning, planning, and executing multi-step tasks. Because Gemini is multimodal, it can handle text, image, audio, and video inputs, which broadens what these agents can do. Early use cases include automating shopping returns and exploring new cities.
Upcoming updates include replacing Google Assistant on Android with Gemini, integrating AI fully into the mobile operating system. A new “Ask Photos” feature will let users search their photo libraries with natural-language queries; it understands context and recognizes objects and people. Google Maps will also gain AI-generated summaries of places and areas, drawing on mapping data to provide detailed information.
With these enhancements, Google aims to make AI an indispensable part of everyday technology, significantly improving user interaction and productivity.