SANS
GERARD
Google Developer Expert
Developer Evangelist

International Speaker
Spoken 208 times in 43 countries


Family







1h
video
11h
audio
30K
LOC
800
pages
1 Million tokens

New Multimodal Architecture

Opening the world

Inspiration: NotebookML
Responsible AI

Reduce Biases
Safe
Accountable to people
Designed with Privacy
Scientific Excellence
Follow all Principles
Socially Beneficial
Not for Surveillance
Not Weaponised
Not Unlawful
Not Harmful
Google AI Learning Path
Vertex AI
Complexity
Features



1
2


Experimental
Gemini Live in Action



Talk
Show
Ask




Talk
8 Natural Voices
+45 Languages
2-15min
Show
Video
Attachments
Ask
Code Execution
Function Calling
Web Search
Imagine Create
Google Veo 2
Photographic portrait of a real life dragon resting peacefully in a zoo, curled up next to its pet sheep. Cinematic movie still, high quality DSLR photo.
Imagen 3






Stochastic parrot or AGI?
Common Myths for LLMs today
Antropomorphic
A program not a human.
Brain analogy is a myth. Very limited reasoning and planning.
Deterministic
Uncertain by design. Highly sensitive to inputs and learned patterns. Answers may change using an extra punctuation.
Prompt-driven
Output is dependent on training data. Prompts can't fix data issues or uneven distributions.
Neutral
Biased by training data. This is a blind spot. Difficult to find small mistakes.
Fact-retrieval
Not a database. Can't store or retrieve facts. Data cut-off. Not a search engine.
The Human-like Persona Illusion

This man doesn't exist.

The shadows are approximations.

This painting never dried.

This studio setup never happened.
Features in AI generated content are from the data not AI
Algorithmic Bias
Pick a number between 1 and 10 without saying it out loud.
Rise your hand if you picked the number
7
10
Seven is up to 10 times higher!

x10
7
10
Case 1: The impossible full glass

Case 2: Stuck in time Analog Clock

Context Contamination

Responsible AI: confirmation bias

Perceived accuracy: 100%
Perceived accuracy: 85%
Real accuracy: 20%
Can be dangerous in high-stakes contexts. Eg: health or finance.
Can be used when errors have little to no consequences.
Eg: summarising or rephrasing.
Safe to use in general. Perfect accuracy.


Global
Generative AI for Developers
Vertex AI
Complexity
Features



Your sandbox for prompts

Multimodal: exotic plant care

Great for transcripts and key events

Experiment with Gemini Live


Scan to access link.
Useful Links and Materials
What is an AI Agent?
Access to Tools and APIs
Python Sandbox
Google Search
Calculator

AI Agent: fundamentals
Python Sandbox
Google Search
Calculator
Chat History
Chat UI
AI Agent
LLM
Tool
Gemini API

AI Agent: fundamentals
Calculator
Google Search
Python Sandbox
Chat UI
AI Agent
LLM
Tool
Gemini API

AI Agent: fundamentals
Google Search
Python Sandbox
Chat UI
AI Agent
LLM
Tool
Gemini API

AI Agent: fundamentals
Python Sandbox
Chat UI
AI Agent
LLM
Tool
Gemini API

Google Knowledge Graph
Python Sandbox
Session Memory
Google Search
Image Generation


Pattern: Augmented LLM
Multi-Agent Systems
Context Agent
Priorisation
Agent
Execution Agent
User
Goal/Rules
Task Queue
Task
Task creation Agent
1. Provide objective
4. Complete task
6. Update tasks
2. Add new tasks
3. Query context
Memory
5. Store task/result
Tools




Gemini Advanced: Deep Research

Gemini models landscape

Access to Gemini

Create an API key in AI Studio

Demo: Web Console - Gemini Live
Voice-first use cases




Google DeepMind: Project Astra



24/7 Bookings using AI Voice Assistants

Cleaning Company
Robot Cafe





Building the next generation of AI Agents
By Gerard Sans
Building the next generation of AI Agents
During this session, we will introduce you to the latest Gemini AI models, including Gemini 2.5 Pro for advanced reasoning and decision-making, Gemini 2.0 Flash for image generation, and Gemini 2.0 Flash Live for real-time, voice-driven AI agents. These models highlight the power of multimodal AI capable of understanding and generating text, audio, images, and video, taking advantage of massive context for maximum performance. You’ll also see how to integrate tools like Google Search, Retrieval-Augmented Generation (RAG), function calling, and external APIs to build powerful, context-aware applications.
- 33