Agents & MCP
The agent loop: how polybot's MCP tools compose
A worked example of an agent session: Claude analyses three strategies, proposes a change, gets approval, re-runs the evaluation. This is what 'agent core' looks like in practice.
Published Apr 11, 2026
It’s easy to say “MCP-native”. It’s useful to see a full session. This guide walks through a single evening’s agent session — the kind polybot enables daily — annotated to show what the platform is doing at each step.
Setup
- polybot running in shadow mode for all strategies.
- Claude Desktop configured with the polybot MCP server.
- Operator (you) sitting at the terminal with the polybot CLI for approvals.
The operator types:
Hey Claude, review how my strategies did over the last 14 days. Pick the worst one, suggest a single concrete improvement, and if it’s reasonable, open a PR draft for me to review.
Step 1 — Discovery
Claude calls list_strategies(). Returns names, modes, enabled flags.
[
  {"name": "arbitrage", "mode": "shadow", "enabled": true},
  {"name": "stat_arb", "mode": "shadow", "enabled": true},
  {"name": "ai_model", "mode": "shadow", "enabled": true},
  {"name": "spread_farm", "mode": "shadow", "enabled": false},
  ...
]
Step 2 — Batch assessment
Claude calls compare_strategies(names=["arbitrage","stat_arb","ai_model"], window_days=14). polybot’s assessment service queries DuckDB and returns:
{
  "arbitrage": {"pnl_usd": 142.30, "sharpe": 2.1, "trades": 87, "fill_rate": 0.94},
  "stat_arb": {"pnl_usd": -58.10, "sharpe": -0.4, "trades": 34, "fill_rate": 0.62},
  "ai_model": {"pnl_usd": 12.50, "sharpe": 0.3, "trades": 21, "fill_rate": 0.71}
}
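How a service might derive these numbers is straightforward to sketch. Below is a minimal, hypothetical version of the assessment step in Python; the metric definitions (per-trade Sharpe, fill rate over all submitted orders) are assumptions for illustration, not polybot's actual formulas:

```python
from dataclasses import dataclass
from statistics import mean, pstdev

@dataclass
class Trade:
    pnl_usd: float
    filled: bool

def assess(trades: list[Trade]) -> dict:
    """Summarise one strategy's shadow trades into the shape
    compare_strategies returns (illustrative metric definitions)."""
    fills = [t for t in trades if t.filled]
    pnls = [t.pnl_usd for t in fills]
    spread = pstdev(pnls) if len(pnls) > 1 else 0.0
    # Per-trade Sharpe: mean over stddev of trade P&L, no annualisation.
    sharpe = mean(pnls) / spread if spread > 0 else 0.0
    return {
        "pnl_usd": round(sum(pnls), 2),
        "sharpe": round(sharpe, 2),
        "trades": len(fills),
        "fill_rate": len(fills) / len(trades) if trades else 0.0,
    }
```

The point is not the exact formulas but that the agent receives structured numbers it can compare, not prose.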
Claude picks stat_arb as the worst performer.
Step 3 — Drill down
Claude calls analyze_strategy("stat_arb", window_days=14). Returns a richer report — per-pair P&L, entry-z distribution, hit rate by z-bucket, time-in-trade distribution.
{
  ...
  "pair_breakdown": [
    {"pair": "fed-cut-by-june / fed-cut-by-december", "pnl": -40, "trades": 8, "hit_rate": 0.25},
    {"pair": "btc-100k-by-q3 / btc-100k-by-q4", "pnl": 10, "trades": 5, "hit_rate": 0.40},
    ...
  ],
  "entry_z_distribution": [
    {"bucket": "[2.0, 2.5)", "trades": 14, "pnl": -30},
    {"bucket": "[2.5, 3.0)", "trades": 12, "pnl": 5},
    {"bucket": "[3.0, 3.5)", "trades": 6, "pnl": 20}
  ]
}
A clear pattern: low-z entries (barely crossing the threshold) are losing; high-z entries (strong divergence) are winning. The current threshold is too loose.
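The bucketing behind entry_z_distribution is easy to reproduce. A minimal sketch (function name and bucket edges are illustrative, not polybot's API):

```python
def z_bucket_pnl(trades, edges=(2.0, 2.5, 3.0, 3.5)):
    """Group (entry_z, pnl) pairs into half-open z buckets and sum P&L,
    mirroring the entry_z_distribution report shape (illustrative)."""
    buckets = {f"[{lo}, {hi})": {"trades": 0, "pnl": 0.0}
               for lo, hi in zip(edges, edges[1:])}
    for z, pnl in trades:
        for lo, hi in zip(edges, edges[1:]):
            if lo <= z < hi:
                b = buckets[f"[{lo}, {hi})"]
                b["trades"] += 1
                b["pnl"] += pnl
                break
    return buckets
```

With a breakdown like this in the tool output, the agent does not have to infer the pattern from raw fills; it reads it.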
Step 4 — Read the code
Claude calls get_strategy_code("stat_arb"). Gets the implementation. Identifies the entry_z = 2.0 parameter and the min_short_window_correlation = 0.4 filter. Confirms its hypothesis — the threshold is where its intuition said it’d be.
Step 5 — Propose a change
Claude calls suggest_strategy_improvements("stat_arb"). This itself is LLM-powered, but with the strategy’s code + performance report + entry-z distribution in its context, it’s grounded. Output:
Proposal: raise entry_z from 2.0 to 2.8.
Evidence:
- Trades with entry_z in [2.0, 2.5) produced -30 USD over 14 trades.
- Trades with entry_z >= 2.5 produced +25 USD over 18 trades.
- The cleaner threshold at 2.5 alone would have closed the current PnL gap.
- Proposed 2.8 for margin; revisit in 30 days.
Risk: you will miss marginal opportunities. Expect ~40% reduction in trade count.
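The risk estimate follows directly from the bucket counts above. A quick check, using only the reported distribution (note the report cannot resolve how much of the [2.5, 3.0) bucket a 2.8 threshold would also drop, so this is a lower bound, roughly consistent with the ~40% figure):

```python
# Trade counts per entry-z bucket, taken from the report above.
buckets = {"[2.0, 2.5)": 14, "[2.5, 3.0)": 12, "[3.0, 3.5)": 6}

total = sum(buckets.values())           # 32 trades in the window
kept = total - buckets["[2.0, 2.5)"]    # 18 trades survive a cut at z >= 2.5
reduction = 1 - kept / total
print(f"{reduction:.0%}")               # prints 44%
```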
Step 6 — Claude summarises for you
Claude outputs a summary in chat:
stat_arb is down $58 over the last two weeks. Looking at the entry-z breakdown, trades where the z-score barely crosses 2.0 are the losers; anything above 2.5 is profitable. I’d raise entry_z from 2.0 to 2.8 as a safer margin. Want me to draft the config change?
You say yes.
Step 7 — The gated action
Claude calls propose_config_change(strategy="stat_arb", changes={"entry_z": 2.8}). polybot writes a proposal file in ~/.polybot/proposals/, does not apply it, and returns a proposal ID.
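A proposal file might look something like the following. This is a hypothetical schema; the field names and layout are illustrative, not polybot's actual on-disk format:

```json
{
  "id": "p-0142",
  "strategy": "stat_arb",
  "author": "claude:mcp",
  "changes": {"entry_z": {"from": 2.0, "to": 2.8}},
  "reasoning": "Trades with entry_z in [2.0, 2.5) lost 30 USD over 14 trades; entries above 2.5 were profitable.",
  "status": "pending"
}
```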
On your terminal:
$ polybot proposal list
ID STRATEGY AUTHOR CHANGES STATUS
p-0142 stat_arb claude:mcp entry_z: 2.0 → 2.8 pending
You inspect:
$ polybot proposal show p-0142
The proposal includes Claude’s full reasoning. You approve:
$ polybot proposal apply p-0142
Applied. stat_arb restarted in shadow mode.
The config change is now live in shadow. Claude didn’t touch your config. The platform did, on your command.
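The propose/apply separation is the crux of the design: the agent-facing call only ever creates a pending record, and a separate operator-facing call mutates state. A minimal in-memory sketch with hypothetical names (polybot's real implementation persists proposals to disk and restarts the strategy):

```python
import itertools

class ProposalStore:
    """Separates the agent's write path (propose) from the
    operator's write path (apply). Illustrative only."""
    def __init__(self):
        self._seq = itertools.count(1)
        self._proposals = {}
        self.live_config = {"entry_z": 2.0}

    def propose(self, strategy: str, changes: dict, author: str) -> str:
        pid = f"p-{next(self._seq):04d}"
        self._proposals[pid] = {"strategy": strategy, "changes": changes,
                                "author": author, "status": "pending"}
        return pid  # nothing applied yet

    def apply(self, pid: str) -> None:
        p = self._proposals[pid]
        if p["status"] != "pending":
            raise ValueError(f"{pid} is {p['status']}, not pending")
        self.live_config.update(p["changes"])  # the only gated mutation
        p["status"] = "applied"
```

The agent can call propose all it likes; until an operator calls apply, the live config is untouched.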
Step 8 — Follow-up loop
Claude calls analyze_strategy("stat_arb", window_days=14) again. Reports no change yet (the change just applied). Notes the baseline for comparison in a week.
Session ends. You go to bed. In a week, the same loop re-runs and reports whether the change helped.
Why this shape matters
Three things about this loop are only possible because the tools are MCP-native:
- The agent composes freely. It called list_strategies → compare_strategies → analyze_strategy → get_strategy_code → suggest_strategy_improvements → propose_config_change in an order it decided. No hard-coded workflow.
- The agent never mutated state without approval. The only write call — propose_config_change — creates a proposal, not a change. The platform enforces the gate.
- The agent’s reasoning is auditable. Every tool call is logged with inputs, outputs, and timing. polybot mcp audit --since 14d --client claude gives you the session transcript.
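Audit logging of this kind usually falls out of a thin wrapper around tool dispatch. A minimal sketch (the decorator and log shape are illustrative, not polybot's internals):

```python
import functools
import time

AUDIT_LOG = []

def audited(fn):
    """Record every tool call with inputs, output, and wall-clock timing."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        result = fn(*args, **kwargs)
        AUDIT_LOG.append({
            "tool": fn.__name__,
            "inputs": {"args": args, "kwargs": kwargs},
            "output": result,
            "ms": round((time.monotonic() - start) * 1000, 1),
        })
        return result
    return wrapper

@audited
def list_strategies():
    return [{"name": "stat_arb", "mode": "shadow", "enabled": True}]
```

Because every tool goes through the same wrapper, the session transcript is a byproduct of dispatch, not a feature anyone has to remember to build.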
Compare this to “Claude wired into an n8n workflow”. In the workflow case, Claude is a step in a scripted sequence you designed. In polybot’s case, Claude is an operator, bounded by the platform’s invariants. The difference is exactly the difference between “integration” and “agent”.
What makes an agent-ready platform
From this example, four properties:
- Typed, composable tools. Not one mega-tool. Many small ones the agent can combine.
- Explicit separation of read and write. Writes go through proposals or approval gates.
- State accessible, not inferred. The agent reads the real performance, not a prose summary.
- Auditability for free. Every call logged, every session replayable.
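The first two properties (many small typed tools, and an explicit read/write split) can be sketched in a few lines. The class names and the writes flag here are illustrative assumptions, not polybot's code:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Tool:
    name: str
    fn: Callable
    writes: bool = False  # write tools may only *propose*, never mutate

class Registry:
    def __init__(self):
        self._tools = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def call(self, name: str, **kwargs):
        tool = self._tools[name]
        if tool.writes:
            # Route writes into a pending proposal instead of executing.
            return {"proposal": {"tool": name, "kwargs": kwargs,
                                 "status": "pending"}}
        return tool.fn(**kwargs)
```

Registering many small tools, each typed and flagged, is what lets the agent compose them freely while the platform keeps the write gate in one place.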
polybot built these for trading. The same properties apply to any domain where you’d want an AI agent to do real work. That’s the template worth stealing.
Need an agent system built like this?
Cryptuon builds production AI agents, MCP integrations, and trading systems. polybot is our open-source showcase.