Mastering AI Agents in 60 min

From Zorro Cheng · 2026

Simon Sinek — Law of Diffusion of Innovation TED Talk · 2:26 — watch before we start

Plans Can't Keep Up with Changes

計劃趕不上變化

Section 1
Introduction

Context, velocity, and why 2026 is different

Introduction

What will happen to AI Agents in 2026?

Time it took for platforms to reach 1 million users

Introduction

AI Agents Are Here Now!

💻

No-code/low-code programming is fading

Traditional drag-and-drop coding approaches are becoming obsolete. Agents write and deploy code on your behalf — no GUI builder required.

🤖

RPA is dying

Coding Agents and Auto Browsers are transforming automation. They reason, adapt, and handle exceptions — not just replay fixed scripts.

Introduction — Everett Rogers

Diffusion of Innovation

Section 2
AI Ecosystem

Picking the right AI for the job

AI Ecosystem — The Analogy

Sport Shoes Shopping

Brands → Products → Stores (many-to-many connections)

AI Ecosystem

AI Shopping 2026

Same structure — Vendors → Models → Surfaces

AI Ecosystem — Extended

Sport Shoes 2026 — 4 Tiers

Adding the fourth tier: in-store experience zones

AI Ecosystem

AI Shopping 2026 — 4 Tiers

Notice the orange row: this is where agents live. We'll come back to it in Half 2.

AI Ecosystem

One Model CAN'T Fit All

Build your AI toolbox — each model excels in unique ways

Gemini Pro

Multimodal champion with exceptional visual understanding and native Google Workspace integration

OpenAI GPT-5

Thoughtful problem-solver with strong analytical and reasoning capabilities across domains

Claude Opus

Go-to for complex coding, long documents, and nuanced instruction-following

DeepSeek

Specialized in Chinese-language tasks and cultural contexts; strong coding and reasoning

AI Ecosystem — Fundamentals

Understanding AI Model Fundamentals

When evaluating models, check these two critical attributes

🧠

Smartness

Accuracy without hallucination. How often does the model produce correct, reliable answers? Does it make up citations, invent facts, or confidently say wrong things?

💾

Memory Capacity

Context Window — how much text the model can process at once. Bigger = can handle longer documents, longer conversations, more context.

AI Ecosystem — Tokens

Understanding Tokens

🪙

Example

1 token ≈ 4 characters · ¾ of an English word

💰

Cost

APIs charge per token consumed — both input (prompt) and output (response)

💬

Includes

Everything in your conversation: your messages, AI replies, documents pasted in

🔭

Context Window

The size of the AI model's working memory — all it can "see" right now
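The 4-characters-per-token rule of thumb above can be turned into a quick estimator. This is only a rough sketch: real tokenizers split text differently per model and per language, and the names `estimate_tokens` and `fits_in_context` are made up for this example.

```python
import math

# Rule of thumb from the slide: 1 token is roughly 4 characters.
# Real tokenizers vary, so treat the result as an estimate only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about one token per four characters."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(text: str, context_window: int) -> bool:
    """True if the estimated token count is within the context window."""
    return estimate_tokens(text) <= context_window


if __name__ == "__main__":
    prompt = "Summarize the attached quarterly report in five bullet points."
    print(estimate_tokens(prompt), "tokens (estimated)")
    print(fits_in_context(prompt, context_window=200_000))
```

For billing or hard limits, a count from the vendor's own tokenizer is the safer number to rely on.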

AI Ecosystem — How You Pay

Subscription vs API

🍱 Subscription — "All You Can Eat"

Flat ~US$20/month (ChatGPT Plus, Claude Pro, Gemini Advanced)
Usage quota — resets on a rolling window
Predictable cost, hits a wall when quota exceeded
Best for: individuals, daily chat, exploration

🔌 API — Pay As You Go (per token)

Claude Sonnet: ~$3/1M input tokens, ~$15/1M output
Sample Q&A: 10K input ($0.03) + 2K output ($0.03) = $0.06 per request
No cap — cost scales with usage linearly
Best for: apps, automation, agents, variable workloads

When you move from chat to agents, you almost always move to API — agents burn tokens fast. This is the cost shift to expect in Half 2.
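The per-token arithmetic above can be checked with a few lines of code. This sketch uses the approximate Claude Sonnet prices quoted on the slide; the function names and the 500-requests-per-day workload are assumptions for illustration, and actual prices change over time.

```python
# Prices in US dollars per 1 million tokens, as quoted on the slide (approximate).
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one API request: input and output tokens are billed separately."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


def monthly_cost(requests_per_day: int, input_tokens: int,
                 output_tokens: int, days: int = 30) -> float:
    """Linear scaling: total cost grows in step with the number of requests."""
    return requests_per_day * days * request_cost(input_tokens, output_tokens)


if __name__ == "__main__":
    # The sample Q&A from the slide: 10K input tokens, 2K output tokens.
    print(f"${request_cost(10_000, 2_000):.2f} per request")
    # Hypothetical agent workload: 500 such requests per day.
    print(f"${monthly_cost(500, 10_000, 2_000):.2f} per month")
```

At that hypothetical volume the API bill is far above a flat US$20 subscription, which is the cost shift the slide warns about.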

AI Ecosystem — Memory

AI Memory = Context Window

No long-term memory between sessions — each new chat starts fresh
Within a session, the context window IS its working memory
When the window fills up, the AI starts forgetting the earliest parts
Window size varies by model (Gemini Pro 1M, Claude 200K)
Long chats need handover summaries — same as a temp worker
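One simple way to picture "forgetting the earliest parts" is a sliding window over the chat history. The sketch below is an assumption about how such trimming could work, not a description of any vendor's implementation (some products summarize or refuse instead of silently dropping text). `trim_to_window` and the 4-characters-per-token estimate are illustrative.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    # Rule of thumb: about 4 characters per token (real tokenizers differ).
    return max(1, len(text) // 4)


def trim_to_window(messages: List[str], max_tokens: int) -> List[str]:
    """Keep the most recent messages whose estimated total fits the budget."""
    kept: List[str] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break  # this message and everything older is "forgotten"
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    history = ["goal: plan the launch", "draft v1 ...", "feedback on v1 ...", "draft v2 ..."]
    print(trim_to_window(history, max_tokens=10))
```

Because the oldest messages go first, the original goal is often the first thing lost, which is why a handover summary placed at the start of a new session helps.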

AI Ecosystem — Memory

《忘記和記》

"Forget and Remember"

Your AI lives this loop every conversation — bounded by its context window.

AI Ecosystem — Model Comparison

AI Model Memory Comparison

Model	Context Window	Best Use Case
Gemini Pro	1,000,000 tokens	Multimodal tasks, long-context coding
LLaMA	10,000,000 tokens	Open-source applications, efficient processing
GPT-5	400,000 tokens	General purpose, multimodal processing
Claude Sonnet/Opus	200,000 tokens	Document analysis, research, complex reasoning
DeepSeek	128,000 tokens	Code generation, reasoning, large-context understanding

Larger context = more working memory per session, but usually higher cost per API call.
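The table can be read as a lookup: given the size of your input, which models can hold it at all? A minimal sketch, using the context windows listed above (these figures change with each model release) and a hypothetical helper name `models_that_fit`:

```python
from typing import List

# Context windows in tokens, copied from the comparison table above.
CONTEXT_WINDOWS = {
    "Gemini Pro": 1_000_000,
    "LLaMA": 10_000_000,
    "GPT-5": 400_000,
    "Claude Sonnet/Opus": 200_000,
    "DeepSeek": 128_000,
}


def models_that_fit(required_tokens: int) -> List[str]:
    """Models whose window can hold the input, smallest window first."""
    fitting = [
        (window, name)
        for name, window in CONTEXT_WINDOWS.items()
        if window >= required_tokens
    ]
    return [name for _window, name in sorted(fitting)]


if __name__ == "__main__":
    # A hypothetical 300K-token document set.
    print(models_that_fit(300_000))
```

Sorting by smallest sufficient window reflects the note above that larger context usually costs more per call; fit is only a first filter, not a quality ranking.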

Section 3
How to Collaborate

Talking to AI effectively — the mindset shift

How to Collaborate

Think of AI as your MIT Graduate Intern

Brilliant — but new on the job.

→ Needs clear direction

→ Learns fast from examples and feedback

How to Collaborate

Working with Your AI Intern

🧠

Smarter model, more capable intern

Advanced models handle complex tasks with less supervision. Upgrade your model when you need more reliability.

⚖️

Set appropriate expectations

Understand strengths and limitations. AI excels at drafting, analysis, and synthesis — but verify anything critical.

💬

Clear, specific instructions

Examples yield far better results than vague descriptions. The more specific you are, the less revision you'll need.

🏗️

Build productive workflows

Systematic processes that leverage AI strengths while keeping humans in the review and decision loop.

How to Collaborate

The Temporary Worker Model

👤

AI chats are like a temp worker

Limited memory, starts fresh each session, can only see the current conversation

🎯

Provide clear goals upfront

Well-defined objectives help AI understand purpose from the very start

📋

Define a clear job scope

Setting boundaries helps AI stay relevant and avoid scope creep

📜

Ask for a handover document

Request summaries to preserve important information before starting a new session

🔄

The context window determines memory

Everything the AI can reference is what you've put in this conversation window

PROMPTING

101

Prompting 101

CAST: The Prompt Framework

C — Character or Target Audience

Define who the AI should be, or who the output is for.
"Act as an experienced data scientist..."

A — Aim or Goal

Clearly state what you want to accomplish.
"Create a marketing plan for..."

S — Specific Detail or Context

Provide industry context, constraints, and background — the more the better.

T — Template or Format

Specify how you want the answer presented.
"As a bulleted list," "In a table," "Under 200 words"

Using this structured approach helps your AI "intern" deliver exactly what you need — first time.
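If you reuse CAST often, it can help to keep the four parts as named fields and render them into a prompt. The class below is a sketch only: `CastPrompt`, its field names, the output labels, and the sample values are all assumptions made for this example, not part of the framework itself.

```python
from dataclasses import dataclass


@dataclass
class CastPrompt:
    character: str  # C: who the AI should be, or who the output is for
    aim: str        # A: what you want to accomplish
    specifics: str  # S: context, constraints, background
    template: str   # T: how the answer should be presented

    def render(self) -> str:
        """Join the four CAST parts into one prompt, one labeled line each."""
        return "\n".join([
            f"Role: {self.character}",
            f"Goal: {self.aim}",
            f"Context: {self.specifics}",
            f"Format: {self.template}",
        ])


if __name__ == "__main__":
    # Hypothetical example values.
    prompt = CastPrompt(
        character="Act as an experienced data scientist.",
        aim="Explain why our weekly churn metric jumped.",
        specifics="B2B SaaS, 2,000 accounts, pricing changed last month.",
        template="A bulleted list, under 200 words.",
    )
    print(prompt.render())
```

Keeping the parts separate also makes it easy to spot which one is missing when a response comes back off target.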

Prompting 101

Learning Prompting from AI

💡

Let AI teach you

Ask AI to analyze your prompts and suggest specific improvements to structure and clarity

✍️

Ask AI to write a prompt for you

For complex requests, ask AI to formulate the optimal prompt structure before you run it

🖼️

Especially useful for image AI

Image models (Midjourney, DALL·E, Imagen) have very specific prompt templates — ask them first

✨

Build a prompt library

Identify patterns in successful prompts and save them as reusable templates in your workflow

Section 4
From LLM Chat to Agent

Everything before: AI you talk to.
Everything after: AI that does.

From LLM Chat to Agent

Chat → Agent

💬 Chat

You ask. AI answers. You act.

Single turn: one question, one answer
You copy-paste the result manually
You close the loop yourself

🤖 Agent

You set a goal. AI plans, uses tools, acts. You review.

Multi-step: AI loops until done
AI takes real actions (web, files, email)
You review outputs, not every step

An agent = LLM + tools + a loop, working toward a goal.

From LLM Chat to Agent

The Agent Recipe

🧠

LLM

The brain — decides what to do next at each step

🛠️

Tools

The hands — web, files, code, APIs, databases

🔄

Loop

The persistence — plan → act → observe → repeat

Plan → Act → Observe → Plan → ...
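The recipe above (LLM + tools + loop) fits in a few lines of code. In this sketch the LLM is replaced by a scripted stand-in so the example runs without any API; `run_agent`, `scripted_policy`, and `fake_search` are hypothetical names, and a real agent would call a model where the policy is called.

```python
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]
# A policy looks at the goal and the observations so far and returns
# (tool_name, tool_input), or ("finish", final_answer) when it is done.
Policy = Callable[[str, List[str]], Tuple[str, str]]


def run_agent(goal: str, policy: Policy, tools: Dict[str, Tool],
              max_steps: int = 5) -> str:
    observations: List[str] = []
    for _ in range(max_steps):
        action, argument = policy(goal, observations)  # plan
        if action == "finish":
            return argument
        if action not in tools:
            observations.append(f"error: unknown tool {action!r}")
            continue
        result = tools[action](argument)               # act
        observations.append(result)                    # observe
    return "stopped: step limit reached"


def scripted_policy(goal: str, observations: List[str]) -> Tuple[str, str]:
    """Stand-in for the LLM: search once, then answer from the result."""
    if not observations:
        return ("search", goal)
    return ("finish", f"Answer based on: {observations[-1]}")


def fake_search(query: str) -> str:
    """Stand-in tool; a real one would call a search API."""
    return f"top result for '{query}'"


if __name__ == "__main__":
    print(run_agent("context window sizes", scripted_policy, {"search": fake_search}))
```

The `max_steps` limit is a deliberate design choice: a loop that decides for itself when to stop needs a hard ceiling so a confused agent cannot burn tokens indefinitely.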

From LLM Chat to Agent

Agent Tools — Connectors

Give AI agents the tools they need to do your job (Gemini Enterprise, April 2026)

💬 Microsoft Teams

Chat and notify across enterprise channels

📁 Google Drive

Read, create, and organize documents in the cloud

🎫 Jira Cloud

File, update, and comment on tickets

📄 Confluence Cloud

Create and edit wiki pages and documentation

🧠 Notion

Manage databases, pages, and team knowledge bases

📧 Microsoft Outlook

Send email and schedule meetings on your behalf

These 6 are among the most-used Gemini Enterprise connectors. They connect agents to where your work actually lives.

From LLM Chat to Agent

Consumer Agents in 2026

Cursor

Coding agent built into your editor. Writes, reviews, and refactors code with full codebase context.

Claude Desktop

Anthropic's desktop agent for chat + tools. Can access files, run code, use MCP connectors.

Manus Desktop

Research and multi-step agent. Browses the web, synthesizes findings, produces documents.

Perplexity Computer

Browser-based agent that surfs and acts. Fills forms, extracts data, completes web-based tasks.

From LLM Chat to Agent

Enterprise Agents — Governance, Compliance, Scale

NotebookLM Enterprise

Secure document agents over your company's knowledge corpus

🛡 secure · 📋 audit

Gemini Enterprise

Google's agent platform with admin controls and full audit logging

🛡 secure · 🔒 compliant

Claude for Enterprise

Anthropic's enterprise tier with SSO, data isolation, and usage controls

🔒 isolated · 📋 audit

Microsoft Copilot Studio

Build, govern, and deploy custom agents across the Microsoft 365 stack

🛡 M365 · ⚙ custom

Same agent capabilities — plus governance, compliance, identity management, and audit trails.

Advanced Framework

Your AI Mastery Journey — 5 Levels

Now that you've met agents, here's how far this goes

I held this map back until now — until you'd met an agent, the higher levels would have been abstract.

Advanced Framework

How We Engineer with AI — 4 Years, 4 Eras

2022

Prompt Engineering

Get good at asking — craft the perfect question

2024

Vibe Coding

Describe what you want and let AI build it

2025

Context Engineering

Curate what the AI sees — manage its memory and inputs

2026

Harness Engineering

Build the rails, tools, skills, and guardrails agents run on

Each year, the lever moves higher up the stack. The frontier today is Harness — building agent infrastructure.

Level 4
Agent Team

Orchestrated AI workforce — specialization + parallel execution

Level 4 — Agent Team

Solo Agent → Agent Team

Solo Agent 😓

One brain doing everything — slow, context-overloaded, generalist output

🤖 One Agent
research + write + review + publish

Agent Team 🚀

One orchestrator + many specialists — parallel, focused, production quality

🎯 Orchestrator

🔍 Research

✍️ Write

👁 Review

Same job, divided across roles — like turning a freelancer into a startup.
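The orchestrator pattern can be sketched as a fan-out: one coordinator hands the same task to several specialists, runs them in parallel, and collects the results by role. Everything here is illustrative; the specialist functions are stubs standing in for separate agents, and real pipelines often have dependencies between roles (a writer usually needs the research first).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

Specialist = Callable[[str], str]


def research(task: str) -> str:
    return f"[research] notes on {task}"


def write(task: str) -> str:
    return f"[write] draft about {task}"


def review(task: str) -> str:
    return f"[review] checklist for {task}"


def orchestrate(task: str, specialists: Dict[str, Specialist]) -> Dict[str, str]:
    """Fan the task out to every specialist in parallel; collect results by role."""
    workers = max(1, len(specialists))  # ThreadPoolExecutor needs at least 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {role: pool.submit(fn, task) for role, fn in specialists.items()}
        return {role: future.result() for role, future in futures.items()}


if __name__ == "__main__":
    team = {"research": research, "write": write, "review": review}
    for role, output in orchestrate("Q3 report", team).items():
        print(role, "->", output)
```

Each specialist sees only its own task, which is the point of the split: smaller, focused contexts instead of one overloaded one.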

Level 4 — Agent Team

Skills — SOPs for Your AI

One worker, many SOPs. One agent, many Skills.

🛠️

Capability Uplift

Give the agent a skill it's currently weak at. Example: a Skill that makes any agent follow your company's frontend design system precisely.

📋

Encoded Preference

Lock in your workflow your way. Example: a Skill that ensures every report follows your exact structure, tone, and approval chain.

Agent autonomy is now powered by Skills.
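"One agent, many Skills" can be pictured as a small data structure: each Skill is a named SOP in plain language, and an agent's instructions are assembled from whichever Skills are attached. This is a conceptual sketch only; the `Skill` and `Agent` classes are hypothetical and do not reflect any vendor's actual Skill format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Skill:
    name: str
    instructions: str  # the SOP, written in plain language


@dataclass
class Agent:
    role: str
    skills: List[Skill] = field(default_factory=list)

    def system_prompt(self) -> str:
        """Build the agent's standing instructions from its attached Skills."""
        lines = [f"You are the {self.role}."]
        for skill in self.skills:
            lines.append(f"Skill '{skill.name}': {skill.instructions}")
        return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical Skills: one capability uplift, one encoded preference.
    design = Skill("design-system", "Use only components from the company UI kit.")
    report = Skill("report-format", "Summary first, then findings, then next steps.")
    print(Agent("frontend engineer", [design, report]).system_prompt())
```

The same `Skill` object can be attached to several agents, which is what makes an SOP reusable across a team.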

Level 4 — Agent Team

Claude Agent-Team — Orchestration Built In

🎯 Orchestrator + Sub-agents

Main agent delegates tasks to specialist sub-agents, then synthesizes their outputs

🔧 Skills attached per role

Each sub-agent gets the right SOPs — the researcher gets a web-search skill, the writer gets your brand voice

✨ Ad hoc agent creation

Spawn new specialists on the fly when the task requires an unexpected capability

Don't define teams up front — let them form for the task at hand.

Level 4 — Agent Team

Metaskills — Skills That Build Skills

The agent doesn't just use your SOPs — it can write new ones.

🏗️

Skill Builder

A metaskill that generates a new Skill from your plain-language description. You describe the SOP; the agent writes and tests it.

🎭

Team Composer

Analyzes a job you describe and spawns the optimal agent team with the right Skills already attached to each role.

Skill → creates → Skill → creates → Skill → ...

Level 4 — Agent Team

Paperclip AI — Level 4 in Production

One product, multiple specialist agents (research / draft / publish)
Orchestrator routes each user request to the right sub-agent team
Ad hoc agents spawned for unfamiliar or complex tasks on the fly

Built on Claude agent-team

Live in production today. This is what Level 4 looks like running in the wild.

Level 5
Agent Corp

AI as an organization — multiple teams, multiple orchestrators, one mission

Level 5 — Agent Corp

From Team to Organization

Agent Team (Level 4)

One orchestrator, many sub-agents, one mission. Like a startup with a CEO and functional staff.

Agent Corp (Level 5)

Many orchestrators, many teams, running an entire organization.

Each "team" is a department. Together they form a company.

Level 5 — Agent Corp

ZorCorp — AI-First Operating Model

👤 Humans

Set goals

→

Review outputs

→

Intervene at decisions

→

Approve & ship

Marketing

Research Agent

Content Agent

Analytics Agent

Engineering

Code Agent

Test Agent

Deploy Agent

Operations

Ops Monitor

Report Agent

Research

Scout Agent

Synthesis Agent

Insight Agent

The "company" runs on agents. Humans steer.

Summary

You Just Climbed 5 Levels in 60 Minutes

Chat — you talked to AI
Tools — you saw the ecosystem & how to pay
Agent — you understand autonomy
Agent Team — you saw orchestration & Skills
Agent Corp — you see the destination

The question isn't "Will AI take my job?"
It's "Which level am I building toward?"
