SPEAKER NOTES — SLIDE 1
Mastering AI Agents in 60 min

From Zorro Cheng  ·  2026

Simon Sinek — Law of Diffusion of Innovation TED Talk · 2:26 — watch before we start
Plans Can't Keep Up with Change
Section 1
Introduction

Context, velocity, and why 2026 is different

Introduction

What will happen to AI Agents in 2026?

Time it took for platforms to reach 1 million users

Hours to reach 1M users (log scale: 1 hr, 10 hr, 100 hr, 1,000 hr, 10,000 hr)

  • Netflix (1999): 3.5 years
  • Twitter (2006): 2 years
  • Facebook (2004): 10 months
  • Spotify (2008): 5 months
  • Instagram (2010): 2.5 months

AI era ↓

  • ChatGPT (2022): 5 days ⚡
  • Threads (2023): ~1 hour ⚡⚡ (on the chart this bar is only ~18px wide; a log scale only goes so far)
Introduction

AI Agent is here now!

💻

No/Low code programming is fading

Traditional drag-and-drop coding approaches are becoming obsolete. Agents write and deploy code on your behalf — no GUI builder required.

🤖

RPA is dying

Coding Agents and Auto Browsers are transforming automation. They reason, adapt, and handle exceptions — not just replay fixed scripts.

Introduction — Everett Rogers

Diffusion of Innovation

  • Innovators: 2.5%
  • Early Adopters: 13.5%
  (THE CHASM sits here, between Early Adopters and Early Majority)
  • Early Majority: 34%
  • Late Majority: 34%
  • Laggards: 16%
Section 2
AI Ecosystem

Picking the right AI for the job

AI Ecosystem — The Analogy

Sport Shoes Shopping

Brands → Products → Stores (many-to-many connections)

Brands: Nike, Adidas, New Balance
Products: Air Jordan, Air Force, Ultraboost, Yeezy, 574
Stores: Nike Store, Footlocker, Gagasport, Adidas Store
AI Ecosystem

AI Shopping 2026

Same structure — Vendors → Models → Surfaces

Vendors: Google, Anthropic, OpenAI
Models: Gemini Flash, Gemini Pro, Claude Sonnet, Claude Opus, GPT-5
Surfaces: gemini.google.com, poe.com, perplexity.ai, claude.ai, chatgpt.com
AI Ecosystem — Extended

Sport Shoes 2026 — 4 Tiers

Adding the fourth tier: in-store experience zones

Brands: Nike, Adidas, New Balance
Products: Air Jordan, Air Force, Ultraboost, Yeezy, 574
Stores: Nike Store, Footlocker, Gagasport, Adidas Store
Zones: Running Hub, Basketball Zone, Training Center, Sneaker Culture, Outdoor Sports, Fitness Zone
AI Ecosystem

AI Shopping 2026 — 4 Tiers

The orange row = products/agents we'll come back to

Vendors: Google, Anthropic, OpenAI
Models: Gemini Flash, Gemini Pro, Claude Sonnet, Claude Opus, GPT-5
Surfaces: gemini.google.com, claude.ai, perplexity.ai, chatgpt.com
Products/agents (orange row): NotebookLM, Gemini Enterprise, Claude Desktop, Claude Code, Codex, ChatGPT Plus

Notice the orange row — these are where agents live. We'll come back to them in Half 2.

AI Ecosystem

One Model CAN'T Fit All

Build your AI toolbox — each model excels in unique ways

Gemini Pro

Multimodal champion with exceptional visual understanding and native Google Workspace integration

OpenAI GPT-5

Thoughtful problem-solver with strong analytical and reasoning capabilities across domains

Claude Opus

Go-to for complex coding, long documents, and nuanced instruction-following

DeepSeek

Specialized in Chinese-language tasks and cultural contexts; strong coding and reasoning

AI Ecosystem — Fundamentals

Understanding AI Model Fundamentals

When evaluating models, check these 2 critical attributes

🧠

Smartness

Accuracy without hallucination. How often does the model produce correct, reliable answers? Does it make up citations, invent facts, or confidently say wrong things?

💾

Memory Capacity

Context Window — how much text the model can process at once. Bigger = can handle longer documents, longer conversations, more context.

AI Ecosystem — Tokens

Understanding Tokens

🪙

Example

1 token ≈ 4 characters  ·  ¾ of an English word

💰

Cost

APIs charge per token consumed — both input (prompt) and output (response)

💬

Includes

Everything in your conversation: your messages, AI replies, documents pasted in

🔭

Context Window

The size of the AI model's working memory — all it can "see" right now
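
The rule of thumb on this slide can be turned into a quick estimator. This is only a rough sketch: it assumes the 4-characters-per-token heuristic, which real tokenizers do not follow exactly (the ratio shifts by model and by language). `estimate_tokens` is a hypothetical helper, not a library function.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb from the slide; real tokenizers differ


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


message = "Can you summarize this report for me?"
print(estimate_tokens(message))
```

For billing or hard limits, use the vendor's own token counter rather than this estimate.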

AI Ecosystem — How You Pay

Subscription vs API

🍱 Subscription — "All You Can Eat"

  • Flat ~US$20/month (ChatGPT Plus, Claude Pro, Gemini Advanced)
  • Usage quota — resets on a rolling window
  • Predictable cost, hits a wall when quota exceeded
  • Best for: individuals, daily chat, exploration

🔌 API — Pay As You Go (per token)

  • Claude Sonnet: ~$3/1M input tokens, ~$15/1M output
  • Sample Q&A: 10K input = $0.03 + 2K output = $0.03 → $0.06/req
  • No cap — cost scales with usage linearly
  • Best for: apps, automation, agents, variable workloads

When you move from chat to agents, you almost always move to API — agents burn tokens fast. This is the cost shift to expect in Half 2.
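
The sample Q&A arithmetic can be checked with a few lines of code. A minimal sketch, assuming the approximate Claude Sonnet prices quoted on this slide; `request_cost` is a hypothetical helper, and actual prices should be confirmed against the vendor's current price list.

```python
# Approximate prices from the slide, in US dollars per 1 million tokens.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one API request; input and output are billed separately."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# The slide's sample Q&A: 10K tokens in, 2K tokens out.
per_request = request_cost(input_tokens=10_000, output_tokens=2_000)
print(f"${per_request:.2f} per request")
print(f"${per_request * 1_000:.2f} per 1,000 requests")
```

Because cost is linear in usage, an agent that makes many calls per task multiplies this per-request figure directly.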

AI Ecosystem — Memory

AI Memory = Context Window

  • No long-term memory between sessions — each new chat starts fresh
  • Within a session, the context window IS its working memory
  • When the window fills up, the AI starts forgetting the earliest parts
  • Window size varies by model (Gemini Pro 1M tokens, Claude 200K tokens)
  • Long chats need handover summaries — same as a temp worker
Context window (the oldest messages, at the top, are forgotten first):
  ↑ FORGETTING
  User: Hello, can you help me...
  AI: Of course! Let me...
  User: What about the project...
  AI: The project requires...
  User: Can you summarize...
  AI: Here is the summary...
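
One way to picture the forgetting is a sliding window that keeps only the newest messages that fit. This is an illustrative sketch, not how any particular product manages its context: `trim_to_window` and `rough_count` are hypothetical helpers, the 20-token limit is deliberately tiny, and the token count reuses the 4-characters-per-token assumption.

```python
from collections import deque


def trim_to_window(messages, max_tokens, count_tokens):
    """Keep the most recent messages that fit in max_tokens; drop the oldest."""
    kept = deque()
    used = 0
    for message in reversed(messages):   # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break                        # everything older is "forgotten"
        kept.appendleft(message)
        used += cost
    return list(kept)


def rough_count(text):
    # Assumption: about 4 characters per token
    return max(1, len(text) // 4)


chat = [
    "User: Hello, can you help me...",
    "AI: Of course! Let me...",
    "User: What about the project...",
    "AI: The project requires...",
    "User: Can you summarize...",
    "AI: Here is the summary...",
]
visible = trim_to_window(chat, max_tokens=20, count_tokens=rough_count)
print(visible)
```

With this small limit only the latest exchanges survive, which is why a handover summary is worth asking for before the early context drops out.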
AI Ecosystem — Memory
《忘記和記》
"Forget and Remember"
Your AI lives this loop every conversation — bounded by its context window.
AI Ecosystem — Model Comparison

AI Model Memory Comparison

Model                Context Window       Best Use Case
Gemini Pro           1,000,000 tokens     Multimodal tasks, long-context coding
LLaMA                10,000,000 tokens    Open-source applications, efficient processing
GPT-5                400,000 tokens       General purpose, multimodal processing
Claude Sonnet/Opus   200,000 tokens       Document analysis, research, complex reasoning
DeepSeek             128,000 tokens       Code generation, reasoning, large-context understanding

Larger context = more working memory per session, but usually higher cost per API call.
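
The comparison can also be used as a lookup: given a document size, which models could hold it in one session? A small sketch using the window sizes listed in the table above. `models_that_fit` and the 4,000-token reply reserve are assumptions for illustration, and the published limits change between model versions.

```python
# Context window sizes as listed in the comparison table (tokens).
CONTEXT_WINDOWS = {
    "Gemini Pro": 1_000_000,
    "LLaMA": 10_000_000,
    "GPT-5": 400_000,
    "Claude Sonnet/Opus": 200_000,
    "DeepSeek": 128_000,
}


def models_that_fit(document_tokens: int, reserve_for_reply: int = 4_000) -> list[str]:
    """Return models whose window holds the document plus room for a reply."""
    needed = document_tokens + reserve_for_reply
    return sorted(name for name, window in CONTEXT_WINDOWS.items() if window >= needed)


# A 150,000-token document is too large for the smallest window in the table.
print(models_that_fit(150_000))
```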

Section 3
How to Collaborate

Talking to AI effectively — the mindset shift

How to Collaborate

Think of AI as your
MIT Graduate Intern

Brilliant — but new on the job.

Needs clear direction
Learns fast from examples and feedback
How to Collaborate

Working with Your AI Intern

🧠

Smarter model, more capable intern

Advanced models handle complex tasks with less supervision. Upgrade your model when you need more reliability.

⚖️

Set appropriate expectations

Understand strengths and limitations. AI excels at drafting, analysis, and synthesis — but verify anything critical.

💬

Clear, specific instructions

Examples yield far better results than vague descriptions. The more specific you are, the less revision you'll need.

🏗️

Build productive workflows

Systematic processes that leverage AI strengths while keeping humans in the review and decision loop.

How to Collaborate

The Temporary Worker Model

👤

AI chats are like a temp worker

Limited memory, starts fresh each session, can only see the current conversation

🎯

Provide clear goals upfront

Well-defined objectives help AI understand purpose from the very start

📋

Define a clear job scope

Setting boundaries helps AI stay relevant and avoid scope creep

📜

Ask for a handover document

Request summaries to preserve important information before starting a new session

🔄

The context window determines memory

Everything the AI can reference is what you've put in this conversation window

PROMPTING
101
Prompting 101

CAST: The Prompt Framework

C — Character or Target Audience

Define who the AI should be, or who the output is for.
"Act as an experienced data scientist..."

A — Aim or Goal

Clearly state what you want to accomplish.
"Create a marketing plan for..."

S — Specific Detail or Context

Provide industry context, constraints, and background — the more the better.

T — Template or Format

Specify how you want the answer presented.
"As a bulleted list," "In a table," "Under 200 words"

Using this structured approach helps your AI "intern" deliver exactly what you need — first time.
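
CAST also works as a reusable template. The sketch below is one possible way to assemble the four parts into a prompt. The `CastPrompt` class, its field labels, and the running-shoe details are invented for illustration; only the quoted openers come from the slide.

```python
from dataclasses import dataclass


@dataclass
class CastPrompt:
    character: str   # C: who the AI should be, or who the output is for
    aim: str         # A: what you want to accomplish
    specifics: str   # S: context, constraints, background
    template: str    # T: how the answer should be presented

    def render(self) -> str:
        """Join the four CAST parts into one prompt string."""
        return "\n".join([
            f"Role: {self.character}",
            f"Goal: {self.aim}",
            f"Context: {self.specifics}",
            f"Format: {self.template}",
        ])


prompt = CastPrompt(
    character="Act as an experienced data scientist.",
    aim="Create a marketing plan for a new running shoe.",          # hypothetical goal
    specifics="Budget is limited; the audience is first-time runners.",  # hypothetical context
    template="As a bulleted list, under 200 words.",
)
print(prompt.render())
```

Saving a few of these filled-in templates is an easy start on a prompt library.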

Prompting 101

Learning Prompting from AI

💡

Let AI teach you

Ask AI to analyze your prompts and suggest specific improvements to structure and clarity

✍️

Ask AI to write a prompt for you

For complex requests, ask AI to formulate the optimal prompt structure before you run it

🖼️

Especially useful for image AI

Image models (Midjourney, DALL·E, Imagen) have very specific prompt templates — ask them first

Build a prompt library

Identify patterns in successful prompts and save them as reusable templates in your workflow

Section 4
From LLM Chat
to Agent

Everything before: AI you talk to.
Everything after: AI that does.

CHAT 👤: "What's 3+3?" → "The answer is 6."
  You ask · AI answers · You act

AGENT 🧑‍💼: Goal: Draft report → AI Agent → tools: 🔍 web · 📄 file · 📧 email · ⚙ code
  You set goal · AI acts · You review
From LLM Chat to Agent

Chat vs Agent

💬 Chat

You ask. AI answers. You act.
  • Single turn: one question, one answer
  • You copy-paste the result manually
  • You close the loop yourself

🤖 Agent

You set a goal. AI plans, uses tools, acts. You review.
  • Multi-step: AI loops until done
  • AI takes real actions (web, files, email)
  • You review outputs, not every step

An agent = LLM + tools + a loop, working toward a goal.

From LLM Chat to Agent

The Agent Recipe

🧠

LLM

The brain — decides what to do next at each step

🛠️

Tools

The hands — web, files, code, APIs, databases

🔄

Loop

The persistence — plan → act → observe → repeat

Plan → Act → Observe → Plan → ...
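
The recipe is small enough to sketch in code. This is a toy illustration, not any vendor's agent framework: `fake_llm` is a scripted stand-in for a real model call, and the two tools are hypothetical functions that only return strings.

```python
def fake_llm(goal, history):
    """Stand-in for the LLM: decides the next step from what has happened so far."""
    if not history:
        return {"tool": "search", "input": goal}
    if len(history) == 1:
        return {"tool": "write_file", "input": history[-1]["observation"]}
    return {"tool": "done", "input": None}


# Hypothetical tools: the "hands" of the agent.
TOOLS = {
    "search": lambda query: f"notes about {query}",
    "write_file": lambda text: f"saved report containing: {text}",
}


def run_agent(goal, max_steps=5):
    history = []
    for _ in range(max_steps):                             # the loop, with a safety cap
        step = fake_llm(goal, history)                     # plan
        if step["tool"] == "done":
            break
        observation = TOOLS[step["tool"]](step["input"])   # act
        history.append({"step": step, "observation": observation})  # observe
    return history


for entry in run_agent("Q3 sales"):
    print(entry["step"]["tool"], "->", entry["observation"])
```

The `max_steps` cap matters in practice: a loop with no limit is also a token bill with no limit.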
From LLM Chat to Agent

Agent Tools — Connectors

Give the AI the tools it needs to do your job. Example: Gemini Enterprise connectors (April 2026)

💬 Microsoft Teams

Chat and notify across enterprise channels

📁 Google Drive

Read, create, and organize documents in the cloud

🎫 Jira Cloud

File, update, and comment on tickets

📄 Confluence Cloud

Create and edit wiki pages and documentation

🧠 Notion

Manage databases, pages, and team knowledge bases

📧 Microsoft Outlook

Send email and schedule meetings on your behalf

These 6 are among the most-used Gemini Enterprise connectors. They connect agents to where your work actually lives.

From LLM Chat to Agent

Consumer Agents in 2026

Cursor

Coding agent built into your editor. Writes, reviews, and refactors code with full codebase context.

Claude Desktop

Anthropic's desktop agent for chat + tools. Can access files, run code, use MCP connectors.

Manus Desktop

Research and multi-step agent. Browses the web, synthesizes findings, produces documents.

Perplexity Computer

Browser-based agent that surfs and acts. Fills forms, extracts data, completes web-based tasks.

From LLM Chat to Agent

Enterprise Agents — Governance, Compliance, Scale

NotebookLM Enterprise

Secure document agents over your company's knowledge corpus

🛡 secure · 📋 audit

Gemini Enterprise

Google's agent platform with admin controls and full audit logging

🛡 secure · 🔒 compliant

Claude for Enterprise

Anthropic's enterprise tier with SSO, data isolation, and usage controls

🔒 isolated · 📋 audit

Microsoft Copilot Studio

Build, govern, and deploy custom agents across the Microsoft 365 stack

🛡 M365 · ⚙ custom

Same agent capabilities — plus governance, compliance, identity management, and audit trails.

Advanced Framework

Your AI Mastery Journey — 5 Levels

Now that you've met agents, here's how far this goes

  • Level 1: Chat (✓ you've done this)
  • Level 2: Tools + Add-ons (✓ you do this daily)
  • Level 3: Agent (← you are here)
  • Level 4: Agent Team (coming next ↑)
  • Level 5: Agent Corp (the frontier ↑)

I held this map back until now — until you'd met an agent, the higher levels would have been abstract.

Advanced Framework

How We Engineer with AI — 4 Years, 4 Eras

2022 · Prompt Engineering
Get good at asking: craft the perfect question
2024 · Vibe Coding
Describe what you want and let AI build it
2025 · Context Engineering
Curate what the AI sees: manage its memory and inputs
2026 · Harness Engineering
Build the rails, tools, skills, and guardrails agents run on

Each year, the lever moves higher up the stack. The frontier today is Harness — building agent infrastructure.

Level 4
Agent Team

Orchestrated AI workforce — specialization + parallel execution

Level 4 — Agent Team

Solo Agent vs Agent Team

Solo Agent 😓

One brain doing everything — slow, context-overloaded, generalist output

🤖 One Agent
research + write + review + publish

Agent Team 🚀

One orchestrator + many specialists — parallel, focused, production quality

🎯 Orchestrator
🔍 Research
✍️ Write
👁 Review

Same job, divided across roles — like turning a freelancer into a startup.
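
The orchestrator pattern can be sketched as a fan-out and gather. This is a simplified illustration under stated assumptions: the three specialist functions are hypothetical stubs (a real specialist would wrap its own LLM call), and they run fully in parallel here, whereas a real pipeline would often feed research into writing and writing into review.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical specialists, each handling one role.
def research(topic):
    return f"facts on {topic}"


def write(topic):
    return f"draft about {topic}"


def review(topic):
    return f"checklist for {topic}"


SPECIALISTS = {"research": research, "write": write, "review": review}


def orchestrate(topic):
    """Fan the job out to every specialist in parallel, then gather the results."""
    with ThreadPoolExecutor(max_workers=len(SPECIALISTS)) as pool:
        futures = {role: pool.submit(agent, topic) for role, agent in SPECIALISTS.items()}
        return {role: future.result() for role, future in futures.items()}


print(orchestrate("agent teams"))
```

Each specialist sees only its own slice of the job, which is what keeps its context small and focused.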

Level 4 — Agent Team

Skills — SOPs for Your AI

One worker, many SOPs. One agent, many Skills.
🛠️

Capability Uplift

Give the agent a skill it's currently weak at. Example: a Skill that makes any agent follow your company's frontend design system precisely.

📋

Encoded Preference

Lock in your workflow your way. Example: a Skill that ensures every report follows your exact structure, tone, and approval chain.

Agent autonomy is now powered by Skills.
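
To make the SOP analogy concrete, a Skill can be pictured as a named block of instructions attached to an agent. The sketch below is a mental model only, not the actual Skills file format of any product: the `Skill` and `Agent` classes and both example skills are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    instructions: str   # the SOP, in plain language


@dataclass
class Agent:
    role: str
    skills: list = field(default_factory=list)

    def system_prompt(self) -> str:
        """Combine the role with the instructions of every attached Skill."""
        lines = [f"You are the {self.role}."]
        for skill in self.skills:
            lines.append(f"[{skill.name}] {skill.instructions}")
        return "\n".join(lines)


# Capability uplift and encoded preference, one example of each.
design_system = Skill("design-system", "Use only components from the company frontend library.")
report_format = Skill("report-format", "Every report has a summary, findings, and next steps.")

writer = Agent("writer", skills=[design_system, report_format])
print(writer.system_prompt())
```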
Level 4 — Agent Team

Claude Agent-Team — Orchestration Built In

🎯 Orchestrator + Sub-agents

Main agent delegates tasks to specialist sub-agents, then synthesizes their outputs

🔧 Skills attached per role

Each sub-agent gets the right SOPs — the researcher gets a web-search skill, the writer gets your brand voice

✨ Ad hoc agent creation

Spawn new specialists on the fly when the task requires an unexpected capability

Orchestrator → Researcher · Writer · Reviewer · + ad hoc agent (spawned on demand)

Don't define teams up front — let them form for the task at hand.

Level 4 — Agent Team

Metaskills — Skills That Build Skills

The agent doesn't just use your SOPs — it can write new ones.
🏗️

Skill Builder

A metaskill that generates a new Skill from your plain-language description. You describe the SOP; the agent writes and tests it.

🎭

Team Composer

Analyzes a job you describe and spawns the optimal agent team with the right Skills already attached to each role.

Skill → creates → Skill → creates → Skill → ...
Level 4 — Agent Team

Paperclip AI — Level 4 in Production

  • One product, multiple specialist agents (research / draft / publish)
  • Orchestrator routes each user request to the right sub-agent team
  • Ad hoc agents spawned on the fly for unfamiliar or complex tasks
Built on Claude agent-team
Live in production today. This is what Level 4 looks like running in the wild.
Orchestrator → Research · Draft · Publish · + ad hoc agent
Level 5
Agent Corp

AI as an organization — multiple teams, multiple orchestrators, one mission

Level 5 — Agent Corp

From Team to Organization

Agent Team (Level 4)

One orchestrator, many sub-agents, one mission. Like a startup with a CEO and functional staff.

Agent Corp (Level 5)

Many orchestrators, many teams, running an entire organization. Each team is a department.

CEO Agent → department orchestrators: Sales · Engineering · Marketing · Operations · Finance

Each "team" is a department. Together they form a company.
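
The org chart can be read as a routing problem: the top agent decides which department handles each request. A deliberately naive sketch, assuming keyword matching as a stand-in for the judgment an LLM orchestrator would apply. The keywords, `ceo_route`, and the "Human review" fallback are all invented for illustration.

```python
# Hypothetical routing table: each department orchestrator owns some keywords.
DEPARTMENTS = {
    "Sales": ["quote", "lead"],
    "Engineering": ["bug", "deploy"],
    "Marketing": ["campaign", "launch"],
    "Operations": ["outage", "schedule"],
    "Finance": ["invoice", "budget"],
}


def ceo_route(request: str) -> str:
    """Return the first department whose keywords match, else escalate to a human."""
    text = request.lower()
    for department, keywords in DEPARTMENTS.items():
        if any(word in text for word in keywords):
            return department
    return "Human review"


print(ceo_route("Fix the deploy bug"))
print(ceo_route("Plan the spring campaign"))
print(ceo_route("Something nobody planned for"))
```

The fallback is the important design choice: anything the top agent cannot place goes to a person rather than to a guess.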

Level 5 — Agent Corp

ZorCorp — AI-First Operating Model

👤 Humans
  • Set goals
  • Review outputs
  • Intervene at decisions
  • Approve & ship
Marketing
  • Research Agent
  • Content Agent
  • Analytics Agent
Engineering
  • Code Agent
  • Test Agent
  • Deploy Agent
Operations
  • Ops Monitor
  • Report Agent
Research
  • Scout Agent
  • Synthesis Agent
  • Insight Agent

The "company" runs on agents. Humans steer.

Summary

You Just Climbed 5 Levels in 60 Minutes

  • Chat — you talked to AI
  • Tools — you saw the ecosystem & how to pay
  • Agent — you understand autonomy
  • Agent Team — you saw orchestration & Skills
  • Agent Corp — you see the destination
Chat → Tools → Agent → Team → Corp
The question isn't "Will AI take my job?"
It's "Which level am I building toward?"