MAF — User Manual
A practical walkthrough from "I've never used this" to "I just ran a real arena and read its verdict." Every described function below is linked to its source so you can confirm it exists and behaves as documented.
Table of contents
- What MAF does
- First five minutes
- The Dashboard tour
- Running an arena
- Configuring an arena (Setup tab)
- Smart triggers — auto-run on stream events
- Watching a symbol
- Reading decisions
- Triggering from the command line
- Common questions
What MAF does
MAF runs multi-agent deliberations over data streams. Each deliberation is an arena — a small graph of LLM-powered specialist agents that look at the same target through different lenses, then a synthesis pass that reconciles their signals into a verdict. The deliberation is logged end-to-end so you can audit why an arena said what it said.
Two arena families ship by default:
- Trading arenas (target = ticker symbol). Verdict is BUY / HOLD / SELL with confidence. Verdict + size are published to maf:actions:out for a downstream order router to consume.
- Discussion arenas (target = question, document, RFC, …). Verdict is free-form ("approve" / "needs_revision" / etc.). Published to maf:decisions:out.
Routing between the two streams happens automatically based on each
arena's target_key config — see
MAFApp._publish_arena_output.
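A minimal sketch of that routing rule, assuming the decision reduces to a single check on target_key. The helper name output_stream_for is hypothetical and exists only for illustration; the real logic lives in MAFApp._publish_arena_output and may take more into account.

```python
# Hypothetical helper illustrating the routing rule; not MAF's actual code.
ACTIONS_STREAM = "maf:actions:out"      # trading verdicts
DECISIONS_STREAM = "maf:decisions:out"  # every other verdict


def output_stream_for(target_key: str) -> str:
    """Pick the outbox stream from an arena's target_key."""
    # Assumption: only arenas keyed on "ticker" count as trading arenas.
    return ACTIONS_STREAM if target_key == "ticker" else DECISIONS_STREAM


print(output_stream_for("ticker"))
print(output_stream_for("question_id"))
```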
Arenas shipped today: trading_intelligence, market_pulse,
alpaca_live, report_to_action, equity_research, mastermind,
research_debate, crowd_simulation.
Borrowed agent prompts — Anthropic Financial Services
Several agents use prompts vendored from
anthropics/financial-services —
domain-tuned "senior research associate" personas for equity research,
investment banking, and earnings analysis. These live under
src/maf/prompts/anthropic_fs/
and are referenced from arena YAML via system_prompt_file:.
Currently in use:
| Arena | Agent | Vendored prompt |
|---|---|---|
| report_to_action | earnings_reviewer | earnings_reviewer.md: senior-equity-research view when the report is an earnings event |
| equity_research | sector_reader | sector_reader.md: extracts market-size + landscape facts with strict citations |
| equity_research | comps_spreader | comps_spreader.md: peer-set multiples with outlier flags |
| equity_research | note_writer | note_writer.md: synthesises upstream signals into a research-note-quality deliverable |
Tool references in upstream prompts (mcp__factset__*, mcp__capiq__*)
are translated to MAF source bindings per
MAF_TOOL_TRANSLATION.md.
The vendored agents have access to three data backbones:
- fomo2: fomo2_items, fomo2_knowledge, fomo2_request for enriched items, cached graph context, and on-demand deep extracts.
- trtools2: trtools2_bars, trtools2_news, trtools2_api for live OHLCV, ticker-tagged news, and HTTP queries against the trtools2 dashboard's QuestDB-backed history.
- EODHD (via MCP): eodhd adapter pinned to https://mcpv2.eodhd.dev/v1/mcp. 77 institutional-grade tools (get_fundamentals, get_earnings_calendar, get_news, get_eod_data, resolve_ticker, ...). Set EODHD_API_KEY in .env; degrades silently when missing.
For any other MCP server (FactSet, CapIQ, internal), bind the generic
mcp_remote adapter
with a url + tool config.
Caveat: prompts were tuned for claude-opus-4-7; they work on Ollama
Cloud's gpt-oss:120b but verdicts may drift — A/B test before
relying on numerical outputs.
First five minutes
The dashboard is the way in. From any project root:
python -m maf doctor # preflight: redis, ollama, configs
python -m maf --dashboard --port 8420 # web UI on :8420
Open http://localhost:8420/. The header shows green pills for each
dependency (Redis, Ollama Cloud, trtools2, fomo2, mirofish, kronos
refresher, mirofish refresher). The dashboard cards list every
configured arena. Click Run on any card.
The dialog adapts to what the arena needs. Trading arenas ask for a
ticker; the research_debate arena asks for a proposal text. Pick one,
choose a mode (Log only / Queue for review / Auto-execute), click
Dispatch. The result lands inline in seconds-to-minutes, and the
last-decision badge on the card updates.
If you only have five minutes, do that. The rest of this manual is about understanding what just happened.
The Dashboard tour
The top nav is flat — no dropdowns. Items are ordered by user journey:
| Tab | What it answers |
|---|---|
| Dashboard (/) | "What arenas exist, what did they say last time, how do I run one?" |
| Live (/live) | "What is MAF doing right now?" WebSocket tail of every lifecycle event from EventBus. |
| Channels (/channels) | "What's actually in the data streams?" Categorised list of every Redis Stream, schema inferred from real entries. Backed by discover_channels + preview_stream. |
| Data (/data) | "Are the streams healthy, and what data does each arena consume?" Stream length + age + per-arena source bindings with plain-English config labels. |
| Mastermind (/mastermind) | Tail of deliberation envelopes from maf:arena:mastermind:output. |
| Oracle (/oracle) | Tail of crowd-simulation envelopes from maf:arena:crowd_simulation:output. |
| Sources (/sources) | Every registered data adapter, grouped by module, with which arenas use it. |
| Modules (/modules) | Toggle the high-level data modules (fomo2 / trtools2 / kronos / mirofish / web). |
| LLM (/llm) | OpenRouter rankings × Ollama Cloud catalog. Shows what the picker would choose per profile. |
| Wizard (/wizard) | Scaffold a new arena from a free-text description. |
| Docs (/docs) | This manual + architecture + API reference + runbook, with source links. |
Every page has a status pill row up top. Two pills are easy to overlook but important:
- kronos_refresher — green when the service-mode worker is writing its heartbeat (3× cadence TTL). Red with "not running" if you only launched the dashboard.
- mirofish_refresher — same shape, fed by the mirofish heartbeat.
If those go red, your watched symbols won't get fresh forecasts /
crowd-sims. Start the worker (python -m maf — no --dashboard flag)
to bring them up.
Running an arena
The Run dialog
Clicking Run opens a modal whose body changes with the arena's
target_key. Three shapes today:
- ticker (default — trading arenas). Big ticker field with quick-pick chips. Date is hidden under Advanced ▾ because most live runs use the most-recent bars; date is only for backtest-style replays.
- question_id (research_debate). Free-text "What needs to be decided?" field; the stakeholder personas read this. Title is optional.
- question (mastermind-style). Single question textarea.
Mode picker — what the three options mean
After picking a target you choose a mode. These describe what MAF does with the verdict:
- Log only (safe default): MAF deliberates and writes the decision to the trail. No trade is placed, no human gets paged.
- Queue for review: MAF flags the decision for a human to approve before any execution. Use when you're sanity-checking the arena.
- Auto-execute: MAF hands the action to the RiskGate, which places the order if it passes the policy (size cap, exposure cap, kill switch).
The mode is recorded on the published action so you can audit later who asked for what.
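The way the three modes and the policy checks combine can be sketched as a small decision function. This is an illustration under assumptions, not the RiskGate's code: the Policy fields are hypothetical names for the size cap, exposure cap, and kill switch, and the mapping of the dialog labels onto the manual / semi / auto mode strings is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy shape; field names are illustrative."""
    max_order_size: float      # size cap per order
    max_exposure: float        # cap on total exposure after the order
    kill_switch: bool = False  # when True, nothing executes


def gate(mode: str, size: float, current_exposure: float, policy: Policy) -> str:
    """Return one of: log, queue, execute, reject."""
    if mode == "manual":  # assumed to correspond to "Log only"
        return "log"
    if mode == "semi":    # assumed to correspond to "Queue for review"
        return "queue"
    if mode != "auto":
        raise ValueError(f"unknown action mode: {mode!r}")
    # Auto-execute: every policy check must pass before an order is placed.
    if policy.kill_switch:
        return "reject"
    if size > policy.max_order_size:
        return "reject"
    if current_exposure + size > policy.max_exposure:
        return "reject"
    return "execute"


policy = Policy(max_order_size=100, max_exposure=500)
print(gate("auto", size=50, current_exposure=400, policy=policy))
```

Only the auto branch consults the policy in this sketch; the other two modes never reach an order, so there is nothing to cap.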
What happens under the hood
When you click Dispatch, the dashboard sends POST /api/arenas/{name}/run,
which calls MAFApp.run_arena. That call:
- Loads the arena's Arena instance, which builds a PhaseGraph on first use and reuses it after.
- Runs each phase in sequence. Each phase emits lifecycle events to maf:events via the EventBus; that's what the Live tab tails.
- After the synthesis phase, the ReplanAgent checks confidence + gap markers. If the verdict is shaky, it loops back to analysis with extra sources enabled (max_iterations cap on the arena config).
- The emit phase publishes the verdict via MAFApp._publish_arena_output: to the trading-actions outbox for ticker arenas, to the decisions outbox otherwise.
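The replan step in the list above can be pictured as a bounded retry loop. The sketch below is an assumption-laden illustration: Verdict, deliberate, run_once, and the 0.5 confidence threshold are all hypothetical, and max_iterations is read here as the total number of passes, which may differ from how the arena config counts it.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    """Hypothetical stand-in for a synthesis result."""
    label: str
    confidence: float
    gap_markers: List[str] = field(default_factory=list)


def deliberate(
    run_once: Callable[[bool], Verdict],
    max_iterations: int = 2,
    min_confidence: float = 0.5,  # assumed threshold, not MAF's
) -> Tuple[Verdict, int]:
    """Run analysis + synthesis, replanning while the verdict looks shaky."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    extra_sources = False
    for iteration in range(1, max_iterations + 1):
        verdict = run_once(extra_sources)
        shaky = verdict.confidence < min_confidence or bool(verdict.gap_markers)
        if not shaky:
            return verdict, iteration
        extra_sources = True  # the next pass runs with extra sources enabled
    # Cap reached: return the last verdict even though it is still shaky.
    return verdict, max_iterations
```

The cap is what keeps a persistently uncertain arena from looping forever; when it is hit, the last verdict is published as-is in this sketch.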
Configuring an arena (Setup tab)
Each arena card on the dashboard has a Configure button. It opens the arena's page on the Setup tab, a structured editor with the following sections:
Metadata
Description, schedule (cron expression), max replan iterations, target_key (dropdown). Most arenas only need a description tweak; the schedule is for arenas that should run on a cadence rather than event-triggered.
Data sources
Table where each row is name | adapter | parameters. Adapter is a
dropdown of every registered adapter on the server (filtered to
non-deprecated ones). Parameters are removable chips — + param adds a
new key/value.
Each row carries a freshness badge showing the underlying data
state right now: live · 12s ago · 80 entries / stale · 1.6h /
empty / external API. This auto-refreshes every 15 s and is backed
by arena_freshness.
Removing a source from this table also removes it from every agent's
sources list — no orphan references.
Agents
Cards per agent (collapsible). Each one shows name, role (8-value dropdown — analyst / specialist / synthesis / debater / judge / executor / watcher / replan), LLM tier (quick / deep), max ReAct steps, the source-picker chips (click to toggle which bound sources the agent can call as a tool), and the system prompt textarea.
Phases
Cards per phase. Pattern dropdown (parallel / sequential / debate),
max rounds (debate only), transition (name of the next phase or
END). Agents in this phase are shown as read-only chips — moving
agents between phases still needs raw YAML.
Smart triggers
See the next section.
Save behaviour
Three guard rails on PUT /api/arenas/{name}/config
(code):
- Pydantic validation: round-trips through ArenaConfig. Any invalid role, missing field, or wrong type is rejected with 422 before the file is touched.
- ETag / If-Match: GET /config returns a 16-char sha256 ETag header. If the dashboard's saved If-Match doesn't match the current on-disk ETag, you get 412 Precondition Failed ("YAML was modified by another writer"). Refresh and retry.
- Atomic write: config is written to a tempfile next to the target YAML, then renamed. Either the new content fully replaces the old or nothing changes; no half-written YAML.
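The ETag check and the atomic write can be sketched with the standard library alone. This is a sketch under assumptions, not the server's handler: config_etag assumes the ETag is the first 16 hex digits of the sha256 digest, PreconditionFailed stands in for the 412 response, and the Pydantic validation step is left out.

```python
import hashlib
import os
import tempfile
from pathlib import Path


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 response."""


def config_etag(content: bytes) -> str:
    # Assumption: the ETag is the first 16 hex digits of the sha256 digest.
    return hashlib.sha256(content).hexdigest()[:16]


def save_config(path: Path, new_content: bytes, if_match: str) -> str:
    """Replace path with new_content if if_match is current; return the new ETag."""
    # Reject the write if the file changed since the caller last read it.
    if config_etag(path.read_bytes()) != if_match:
        raise PreconditionFailed("YAML was modified by another writer")
    # Tempfile in the same directory, so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_content)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, path)  # atomic swap of the old file for the new
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)  # never leave a stray tempfile behind
        raise
    return config_etag(new_content)
```

A caller would keep the ETag from its last read, pass it as if_match, and store the returned ETag for the next save.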
The Raw YAML tab (last tab on the arena page) is still available as a power-user escape hatch.
Smart triggers — auto-run on stream events
Instead of running an arena manually, you can declare that it should
auto-fire when a Redis Stream event matches a rule. The trigger
dispatcher tails the configured streams, evaluates a when: predicate
with safe_eval, applies a
per-(arena, target) cooldown, and XADDs to maf:control:in — which the
control plane picks up and runs the arena.
The Setup tab has a Smart triggers section with:
- Trigger library picker: drop-down of 7 prebuilt templates loaded from config/trigger_templates.yaml. Hot-reloads on file change (no server restart). Pick one, click Apply template, and it lands in your trigger list as an editable row. Edit the YAML to add new templates; no Python change needed.
- Per-rule editor: name, on_stream, when expression, target template ({payload.field} interpolation), cooldown seconds, action mode (manual / semi / auto).
- Live validation: Test expression runs POST /api/triggers/validate (code), which runs your expression through the actual safe_eval parser against a stream-appropriate sample payload. The row turns green with the result or red with the safe_eval error.
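Two of the per-rule fields, the target template and the cooldown, are easy to picture in code. The sketch below is illustrative only: render_target and Cooldown are hypothetical names, and the behaviour on a missing payload field (a KeyError) is a choice made here, not documented MAF behaviour.

```python
import re
import time
from typing import Dict, Optional, Tuple

_FIELD = re.compile(r"\{payload\.(\w+)\}")


def render_target(template: str, payload: dict) -> str:
    """Fill {payload.field} placeholders from a stream entry's payload."""
    # Raises KeyError when the payload lacks a referenced field.
    return _FIELD.sub(lambda m: str(payload[m.group(1)]), template)


class Cooldown:
    """Suppress repeat fires for the same (arena, target) pair."""

    def __init__(self, seconds: float) -> None:
        self.seconds = seconds
        self._last_fired: Dict[Tuple[str, str], float] = {}

    def allow(self, arena: str, target: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        key = (arena, target)
        last = self._last_fired.get(key)
        if last is not None and now - last < self.seconds:
            return False  # still cooling down for this pair
        self._last_fired[key] = now
        return True


cooldown = Cooldown(seconds=300)
target = render_target("{payload.symbol}", {"symbol": "NVDA"})
print(target, cooldown.allow("market_pulse", target, now=0.0))
```

Keying the cooldown on the pair rather than on the arena alone means a burst of NVDA events does not block the same arena from firing for a different ticker.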
The 7 shipped templates cover the common scenarios — Kronos prob shift, Kronos 1h horizon flip, Kronos high-confidence directional call, MiroFish crowd-sim tickers, trtools2 strategy BUY/SELL, high-impact news sentiment, fomo2 report emitted.
Watching a symbol
The watch list is the single source of truth for "what is interesting right now". Anything expensive that runs in the background (Kronos forecasts, MiroFish crowd-sims) only fires for watched targets.
Add a ticker via the dashboard's /api/watch endpoint:
curl -X POST -H 'Content-Type: application/json' \
-d '{"target_id":"NVDA","kind":"symbol","ttl_seconds":21600}' \
http://localhost:8420/api/watch
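The same request can be built from Python with only the standard library. This mirrors the curl call above field for field; the helper name build_watch_request is hypothetical, and the shape of the response is not documented here, so the sketch stops short of parsing it.

```python
import json
import urllib.request


def build_watch_request(
    target_id: str,
    ttl_seconds: int = 21600,  # 6 hours, as in the curl example
    base_url: str = "http://localhost:8420",
) -> urllib.request.Request:
    """Build the POST /api/watch request without sending it."""
    body = json.dumps(
        {"target_id": target_id, "kind": "symbol", "ttl_seconds": ttl_seconds}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/watch",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_watch_request("NVDA")
print(req.get_method(), req.full_url)

# Sending it needs a running dashboard, so it is left commented out:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```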
Once added, and assuming the worker is running (python -m maf with
no --dashboard; check the kronos_refresher status pill),
KronosRefresher
starts producing forecasts every 60 s (1m horizon) and every 5 min (1h
horizon). Forecasts land at Redis keys
kronos:forecast:{symbol}:{timeframe}, and a compact event is emitted
to kronos:forecasts:emitted whenever direction flips or prob_up
moves by more than 0.05.
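The emission rule in that last sentence amounts to a comparison between consecutive forecasts. A minimal sketch, assuming a forecast carries a direction and a prob_up value; the Forecast class, the should_emit name, and the choice to always emit the first forecast are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Forecast:
    """Hypothetical forecast shape; only the two fields the rule needs."""
    direction: str  # e.g. "up" or "down"
    prob_up: float


def should_emit(prev: Optional[Forecast], curr: Forecast, min_shift: float = 0.05) -> bool:
    """True when direction flips or prob_up moves by more than min_shift."""
    if prev is None:
        return True  # assumption: the first forecast for a key is always emitted
    flipped = prev.direction != curr.direction
    shifted = abs(curr.prob_up - prev.prob_up) > min_shift
    return flipped or shifted


previous = Forecast("up", 0.55)
print(should_emit(previous, Forecast("up", 0.57)))
print(should_emit(previous, Forecast("down", 0.48)))
```

Filtering this way keeps kronos:forecasts:emitted quiet between meaningful changes, so triggers that tail it fire on shifts rather than on every refresh.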
Underlying class:
WatchList. Items decay automatically
after their TTL — no garbage builds up.
Reading decisions
Two streams carry decisions, depending on the arena type:
- maf:actions:out carries trading verdicts as TradingAction envelopes.
- maf:decisions:out carries generic verdicts as GenericDecision envelopes.
The Channels tab is the easiest way to see the latest. Click either stream to see the recent payloads + inferred schema.
For trading actions specifically, the downstream consumer is
ActionConsumer — it
reads the actions stream, applies the
RiskGate, and publishes its
decision (execute / queue / log / reject) to maf:executions:out. The
ExecutionHarvester then
correlates fills + closes back to the originating decision and updates
the DecisionMemory so
the next arena run's recall finds them.
Triggering from the command line
Three flavours.
Via the control plane (XADD to a Redis stream)
redis-cli XADD maf:control:in '*' data '{
"command": "run_arena",
"correlation_id": "manual-1",
"args": {
"arena": "market_pulse",
"target": {"ticker": "NVDA"},
"action_mode": "manual"
}
}'
ControlInbox picks it up, runs the
arena, acks on maf:control:out keyed by the correlation_id.
Via the Python client
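import asyncio
from maf.control.client import ControlClient

async def main() -> None:
    client = ControlClient()
    # await is only valid inside a coroutine, so the call is wrapped in main()
    ack = await client.send("run_arena", {
        "arena": "market_pulse",
        "target": {"ticker": "NVDA"},
        "action_mode": "manual",
    })
    print(ack["result"]["synthesis_verdict"])

asyncio.run(main())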
Wrapper: ControlClient.
Via the CLI
python -m maf trigger market_pulse --ticker NVDA --action-mode manual
python -m maf events --filter market_pulse
Common questions
Why did my arena run return verdict=HOLD with confidence 0.0? Either
the LLM specialists hit a rate limit (check the dashboard's status pill
for ollama) or all the source fetches failed. Check the Setup tab's
freshness badges — green pills next to each source mean data is live;
red / stale / empty pills tell you upstream is the issue. Inspect the
run on /arenas/{name}/<trail_id> to see per-agent reports.
The Setup tab shows everything stale. Check the kronos_refresher
/ mirofish_refresher status pills at the top. If they're red, the
service-mode worker isn't running — start it with python -m maf (no
--dashboard flag). Heartbeat keys land in Redis within ~60s and the
pills turn green.
Save returned 412. Someone else (another tab, another machine) saved this arena's config while you were editing. Reload the Setup tab to pull the latest ETag, re-apply your edits, save again.
Save returned 422. Your edits failed Pydantic validation. The
response body's detail field has the exact field path and reason. The
on-disk YAML is untouched — fix the offending field and retry.
What's "stale_kronos_forecast"? A specialist saw the cached Kronos
forecast was older than the freshness budget (5× refresh cadence by
default) and added a gap marker. The
ReplanAgent reads markers and
triggers a re-run with fresh sources.
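The freshness budget is simple arithmetic: with a 60 s refresh cadence and the default 5× multiplier, a forecast older than 300 s counts as stale. A minimal sketch of that check, where the function name kronos_gap_marker and its parameters are hypothetical:

```python
from typing import Optional


def kronos_gap_marker(
    age_seconds: float,
    cadence_seconds: float,
    budget_multiplier: float = 5.0,  # "5× refresh cadence by default"
) -> Optional[str]:
    """Return the gap marker when a cached forecast exceeds its freshness budget."""
    budget = budget_multiplier * cadence_seconds
    return "stale_kronos_forecast" if age_seconds > budget else None


# 1m-horizon forecasts refresh every 60 s, so the budget is 300 s.
print(kronos_gap_marker(age_seconds=120, cadence_seconds=60))
print(kronos_gap_marker(age_seconds=301, cadence_seconds=60))
```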
My arena card shows target: question_id. What do I put in? That's
a deliberation arena. The Run dialog gives you a textarea for the
proposal or RFC text — paste it in, optionally name it, click Dispatch.
The question_id is auto-generated; you'll see it on the resulting
decision envelope.
How do I add Alpaca data to a new arena? Don't bind the alpaca
adapter — it's deprecated. Bind trtools2_bars / trtools2_news (live
Redis streams, populated by trtools2's feed engine) or trtools2_api
(HTTP client for richer queries against trtools2's dashboard). See the
alpaca_live arena for a complete example.
Where do I see what an adapter returns? Channels tab: pick the stream the adapter reads from, expand a recent entry. Or hit Sample on a source binding in the Data tab.
How do I write a new arena? Drop a YAML in config/arenas/. The
loader scans on startup
(load_config). Use an existing arena as
a template — most copy from trading_intelligence.yaml
or research_debate.yaml.
Or use the Wizard tab to scaffold one from a description.
How do I confirm the docs match the code? Click any function link above — it opens the source viewer at the exact line. The doc-link checker test (code) runs in CI and fails if any link points at a missing file or out-of-range line, so this manual stays honest.