Tool, Skill, Or Subagent?

Decomposing an agent that outgrew its prompt.

14:00 - 14:45 | Karan Sampath / Anthropic | Workshop | IMG_7462, IMG_7469 to IMG_7474

Index

This page reconstructs the second half of the workshop from the photos that are available. IMG_7463 through IMG_7468 are not present in the source folder, so the page starts with the live Managed Agents demo at IMG_7462 and continues from IMG_7469 through IMG_7474. Original images are appended beneath every section.

Session Frame

Source: session page

The session asks when logic belongs in a tool, a skill, or a subagent. The workshop frame is to inherit a large inventory agent, decompose it live on Claude Managed Agents, and run evals after each change.

Original Code w/ Claude session page

Live Demo: Workshop Food Agent

Source: IMG_7462

The Console demo shows a session named workshop-food running in Claude Managed Agents. The visible user task begins:

Create a 5-slide presentation introducing th...

The debug view shows a Bash tool call and a session resource named workshop-pptx, indicating that the agent creates the presentation artifacts through tool calls and session resources, not through prompt text alone.

Three Ways To Give Your Agent A Forecaster

Source: IMG_7469

| | callable_agents | Custom client tool | Inline |
| --- | --- | --- | --- |
| What | Declare a forecaster up front; coordinator delegates to it in an isolated session thread sharing the same sandbox. | Define a spawn_subagent tool; your harness handles the call by creating a fresh session. | Main agent reads the history and computes via Bash. |
| Why | Native primitive, persistent threads, parallelizable. | Dynamic prompts; works on any CMA tier; closest to Agent-SDK Task. | Simplest; zero plumbing. |
| Watch out | One level of delegation only; 20-agent cap. | You own routing and lifecycle; most surface area. | 90-day history sits in main context. |
| Docs | .../managed-agents/multi-agent | .../managed-agents/tools and custom tooling docs. | - |
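The "Custom client tool" approach can be sketched roughly as follows. This is a minimal illustration, not the workshop's code. The tool definition uses the name / description / input_schema shape of client tools in the Anthropic Messages API, but the input fields (role, prompt), the handle_tool_call dispatcher, and the create_session callable are hypothetical stand-ins for whatever your harness uses to start a fresh session.

```python
from typing import Any, Callable

# Hypothetical tool definition; the input fields are assumptions for illustration.
SPAWN_SUBAGENT_TOOL: dict[str, Any] = {
    "name": "spawn_subagent",
    "description": "Run a task in a fresh session and return its final answer.",
    "input_schema": {
        "type": "object",
        "properties": {
            "role": {"type": "string", "description": "e.g. 'forecaster'"},
            "prompt": {"type": "string", "description": "Task for the subagent"},
        },
        "required": ["role", "prompt"],
    },
}

# A session runner takes (role, prompt) and returns the subagent's final text.
SessionRunner = Callable[[str, str], str]


def handle_tool_call(name: str, tool_input: dict[str, Any],
                     create_session: SessionRunner) -> dict[str, Any]:
    """Route one tool call from the main agent; the harness owns the lifecycle."""
    if name != "spawn_subagent":
        return {"is_error": True, "content": f"unknown tool: {name}"}
    missing = [k for k in ("role", "prompt") if k not in tool_input]
    if missing:
        return {"is_error": True, "content": f"missing fields: {missing}"}
    try:
        # Each call gets a fresh session, so no history leaks between subagents.
        answer = create_session(tool_input["role"], tool_input["prompt"])
    except Exception as exc:  # report failures to the main agent as tool errors
        return {"is_error": True, "content": f"subagent failed: {exc}"}
    return {"is_error": False, "content": answer}


def fake_session(role: str, prompt: str) -> str:
    """Stand-in for a real session; replace with your platform's session call."""
    return f"[{role}] handled: {prompt}"
```

The validation, error handling, and session creation in handle_tool_call are the "you own routing and lifecycle" cost noted for this approach.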

Workshop prompt: after finishing one approach, redeploy with another and diff the F2 session threads. If F2 is still red, inspect the structure the forecaster hands back to the orchestrator; it may matter more than which transport you pick.
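One way to make the forecaster's handoff inspectable is to give it an explicit shape and validate it before the orchestrator uses it. The sketch below is an assumption about what such a structure could look like. The Forecast class and its field names (sku, horizon_days, daily_units, method) are hypothetical and do not come from the workshop.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Forecast:
    """Hypothetical handoff structure from forecaster to orchestrator."""
    sku: str
    horizon_days: int
    daily_units: list[float]
    method: str


def parse_forecast(raw: str) -> Forecast:
    """Parse the forecaster's JSON reply and reject malformed handoffs early."""
    data = json.loads(raw)
    forecast = Forecast(
        sku=str(data["sku"]),
        horizon_days=int(data["horizon_days"]),
        daily_units=[float(x) for x in data["daily_units"]],
        method=str(data["method"]),
    )
    if len(forecast.daily_units) != forecast.horizon_days:
        raise ValueError("daily_units length must equal horizon_days")
    if any(x < 0 for x in forecast.daily_units):
        raise ValueError("forecast units must be non-negative")
    return forecast
```

A check like this works the same way for all three transports, which makes it a convenient fixed point when diffing session threads between approaches.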

Final Shape: One Agent, Five Skills

Source: IMG_7470

15 lines of system prompt
5 skills loaded on demand
0 hardcoded subagents
92% eval score (11/12)

The final agent is StockPilot on Claude Managed Agents, using Claude Sonnet 4.6 with a 15-line system prompt.

| Component | Contents |
| --- | --- |
| Skills | reorder-policy, supplier-selection, forecasting, notify-templates, weekly-report |
| Agent toolset | Bash, Read, Write |
| callable_agents | Forecaster, using CMA's multiagent primitive |

Summary line from the slide: 12 inline tools -> agent_toolset + 5 skills + callable_agents.

Fewer Tokens, Higher Score

Sources: IMG_7471, IMG_7472

| Metric | Before | After | Delta |
| --- | --- | --- | --- |
| Eval score | 71% (8.5/12) | 92% (11/12) | +21 pt |
| R2 | 521 s | 313 s | 1.7x faster |
| R3-R5 | ~$6.50 | ~$1.06 | -84% |
| R8 | 2 turns, 154 output tokens | 3 turns, 421 output tokens | +267 output tokens |
| R9 | 2 turns, 5,937 output tokens | 4 turns, 1,290 output tokens | -4,647 output tokens |
| F1 | 5 turns, 27,283 output tokens | 11 turns, 8,488 output tokens | -18,795 output tokens |
| F2 | 7 turns, 42,619 output tokens, 102 tool calls | 20 turns, 9,327 output tokens, 3 scripts | -33,292 output tokens |
| F3 | 4 turns, 7,076 output tokens | 16 turns, 10,079 output tokens | +3,003 output tokens |
| R6-R7 | 5 turns, 7,345 output tokens, slow | 4 turns, <5,000 output tokens | -2,345+ output tokens |

Note shown on slide: the "After" runs were on CMA, so their wall times include ~40 s of session overhead per task.
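The Delta column can be checked with a few lines of arithmetic. The sketch below recomputes several rows from the Before and After values above; the helper names are illustrative only.

```python
def point_change(before_pct: float, after_pct: float) -> int:
    """Difference between two percentages, in percentage points."""
    return round(after_pct - before_pct)


def speedup(before_s: float, after_s: float) -> float:
    """How many times faster the after run is, to one decimal place."""
    return round(before_s / after_s, 1)


def percent_change(before: float, after: float) -> int:
    """Relative change as a whole percent (negative means a reduction)."""
    return round((after - before) / before * 100)


def token_delta(before: int, after: int) -> int:
    """Absolute change in output tokens."""
    return after - before


# Values copied from the Before and After columns.
print(point_change(71, 92))         # eval score, in points
print(speedup(521, 313))            # R2 wall time
print(percent_change(6.50, 1.06))   # R3-R5 cost
print(token_delta(42_619, 9_327))   # F2 output tokens
print(token_delta(7_076, 10_079))   # F3 output tokens
```

Not every row improves on tokens: F3 and R8 use more output tokens after the change, even though the overall eval score rises.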

Take What You Learned Back Home

Source: IMG_7473

1. Simple composable agents scale with model intelligence

Architecting around foundational primitives such as Bash, skills, or subagents improves agent capability while enabling more efficient context management.

2. Load organizational procedures on demand

Offload context from your system prompt into skills so agents only load what they need for a given task.

3. Evals should evolve with your product vision

As model capabilities continue to evolve, so should your evals.

Docs path shown: platform.claude.com/docs -> managed-agents, callable_agents, skills.
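The second takeaway, loading procedures on demand, can be illustrated with a toy registry. This is a conceptual sketch only, not how Claude Managed Agents implements skills. The Skill class, the one-line descriptions, the placeholder bodies, and the build_context helper are all assumptions; only the five skill names come from the StockPilot slide. The idea is that the always-present context carries a short index, while the full procedure text enters context only when a task needs it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    description: str  # short line that is always visible to the agent
    body: str         # full procedure, loaded only when selected


# Skill names from the slide; descriptions and bodies are placeholders.
SKILLS = {
    s.name: s
    for s in [
        Skill("reorder-policy", "When and how much to reorder.", "<full policy text>"),
        Skill("supplier-selection", "How to choose a supplier.", "<full procedure>"),
        Skill("forecasting", "How to forecast demand.", "<full procedure>"),
        Skill("notify-templates", "Message templates.", "<full templates>"),
        Skill("weekly-report", "Weekly report format.", "<full format>"),
    ]
}


def index_text() -> str:
    """Compact index that rides along with every turn."""
    return "\n".join(f"- {s.name}: {s.description}" for s in SKILLS.values())


def load(names: list[str]) -> str:
    """Full text for just the skills a task needs."""
    return "\n\n".join(f"# {n}\n{SKILLS[n].body}" for n in names)


def build_context(system: str, needed: list[str]) -> str:
    """System prompt and index always; skill bodies only on demand."""
    parts = [system, "Available skills:\n" + index_text()]
    if needed:
        parts.append(load(needed))
    return "\n\n".join(parts)
```

For example, build_context("You are StockPilot.", ["forecasting"]) includes the forecasting body but leaves the other four procedures out of context.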

Your Whole Agent Is A Dict. Four Levers.

Source: IMG_7474

# my_agent.py
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

AGENT = dict(
    model="claude-sonnet-4-6",
    system="",        # the prompt
    skills=[],        # e.g. SKILL_MINING
    mcp_servers=[],   # e.g. MCP_MINECRAFT_WIKI
)

ALLOWED_TOOLS = None  # or a subset

client.beta.agents.create(**AGENT)

| Lever | Meaning |
| --- | --- |
| system | The prompt. Your main lever. Every sentence rides every turn. |
| model | Haiku, Sonnet, Opus. |
| skills | A markdown doc that rides every turn. Versioned, reusable across agents. |
| mcp_servers | Optional tools, such as wiki lookup. Free until called; the agent must choose. |

Footer note: Anthropic runs the loop. No while-loop, no schemas, no retries. Tools are auto-discovered over MCP.
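Because the agent is a plain dict, changing one lever at a time and re-running evals after each change, as the session frame describes, reduces to copying the dict with one key replaced. A minimal sketch, assuming a base dict with the four keys shown on the slide. The with_lever and changed_levers helpers and the example system prompt are hypothetical, and no API call is made here.

```python
from typing import Any

LEVERS = ("model", "system", "skills", "mcp_servers")

BASE_AGENT: dict[str, Any] = dict(
    model="claude-sonnet-4-6",  # model ID as shown on the slide
    system="",
    skills=[],
    mcp_servers=[],
)


def with_lever(agent: dict[str, Any], lever: str, value: Any) -> dict[str, Any]:
    """Return a copy of the agent config with exactly one lever changed."""
    if lever not in LEVERS:
        raise KeyError(f"unknown lever: {lever}")
    variant = dict(agent)  # shallow copy leaves the base config untouched
    variant[lever] = value
    return variant


def changed_levers(a: dict[str, Any], b: dict[str, Any]) -> list[str]:
    """List which levers differ between two configs, for labelling eval runs."""
    return [k for k in LEVERS if a.get(k) != b.get(k)]


# Hypothetical variant: same agent, with a one-sentence system prompt.
variant = with_lever(BASE_AGENT, "system", "Answer in one short paragraph.")
```

Labelling each eval run with changed_levers(BASE_AGENT, variant) keeps it clear which single lever a score change should be attributed to.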