Mission Control API Design — REST + GraphQL Hybrid
This page explains the hybrid REST + GraphQL split chosen for Mission Control:
reads (dashboard, compliance matrix, recursive EvalResult tree, evaluator
registry) flow through GraphQL; binary streams (PDF reports, agent traces,
history JSONL) and the version endpoint stay on REST.
TL;DR
| Concern | Endpoint | Why |
|---|---|---|
| Reads (dashboard, compliance matrix, recursive EvalResult tree, evaluator registry) | POST /graphql (single endpoint) | Recursive trees, sparse selection, schema-driven UI |
| Ingest / writes | POST /api/v1/runs (REST, idempotent; Mode C, deferred) | Flat records, content-hash audit chain, retry semantics |
| Binary downloads | GET /api/v1/.../report.pdf, GET /api/v1/runs/{id}/trace | Streams |
| Live updates | SSE (Mode A/B) / SignalR (Mode C), deferred | Read-only push (Mode A/B) vs bidirectional (Mode C) |
Phase 1 ships the read surface only: GraphQL at /graphql, the REST binary and version surface listed below, and the SPA. Ingest (POST /api/v1/runs), live updates (SSE + SignalR), and auth land with Mode C in a later phase.
This isn't exotic — GitHub does it (REST v3 + GraphQL v4), Stripe does it (REST primary + GraphQL for dashboard), Shopify does it.
Three reasons the read path wants GraphQL
1. The recursive EvalResult tree is graph-shaped
EvalResult.SubResults is IReadOnlyList<EvalResult>? — composites contain composites contain atoms. REST has three bad answers:
- Send the whole tree always: expensive on large composites (a GDPR full-calibration tree is ~50 KB JSON × dozens of scenarios).
- Add ?depth=N: congratulations, you reinvented GraphQL's field selection, badly.
- Provide separate "leaf"/"branch" endpoints: over-engineered.
GraphQL handles this with one fragment:
fragment EvalResultRecursive on EvalResult {
  metric { key name }
  score { value passed }
  details {
    subResults { ...EvalResultRecursive }   # ← recursive!
  }
}
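One round-trip, exactly the shape the UI wants. One caveat: the GraphQL specification's validation rules forbid fragment spreads that form cycles, so the self-reference above is shorthand, and a query that passes validation has to nest the selection to a fixed depth.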
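A minimal sketch of how a client might unroll that selection to a fixed depth, as GraphQL validation requires (fragment spreads must not form cycles). The field names come from the fragment above; the helper name `buildEvalResultSelection` and its `depth` parameter are hypothetical and not part of Mission Control.

```typescript
// Hypothetical helper: builds a depth-bounded selection for EvalResult.
// Field names (metric, score, details, subResults) are taken from the
// fragment in the text; everything else is an illustrative assumption.
function buildEvalResultSelection(depth: number): string {
  const scalars = "metric { key name } score { value passed }";
  if (depth <= 0) {
    // Leaf level: stop descending into subResults.
    return scalars;
  }
  const nested = buildEvalResultSelection(depth - 1);
  return scalars + " details { subResults { " + nested + " } }";
}
```

Calling `buildEvalResultSelection(3)` yields a selection that descends three levels of `subResults`; the depth a real client picks has to stay within the server's execution-depth limit.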
2. The compliance matrix is genuinely 1+N+M
Subjects (N) × controls (M) × evidence per cell. A REST fat-endpoint works at v1 scale (~50×20=1,000 cells, ~200 KB), but at "100 subjects × 50 controls" becomes a 1 MB response where the UI uses 80% of it.
GraphQL's sparse selection means the matrix overview asks for {status, passRate} while the detail drill-down asks for {lastEvidence{...}} — same backend, different shape per view.
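The size figures above can be checked with a little arithmetic. This sketch assumes the payload scales linearly with the number of cells and that 1 KB = 1000 bytes; both are simplifying assumptions, and the function names are illustrative.

```typescript
// Rough payload arithmetic for the fat-endpoint figures quoted in the text.
// Assumes linear scaling with cell count (an assumption, not a measurement).
const BYTES_PER_KB = 1000;

// Average bytes per matrix cell, derived from a known total payload size.
function bytesPerCell(totalKB: number, subjects: number, controls: number): number {
  return (totalKB * BYTES_PER_KB) / (subjects * controls);
}

// Estimated full-matrix payload size in KB for a given matrix shape.
function estimateMatrixKB(subjects: number, controls: number, perCellBytes: number): number {
  return (subjects * controls * perCellBytes) / BYTES_PER_KB;
}

// v1 scale from the text: 50 x 20 = 1,000 cells at ~200 KB.
const perCell = bytesPerCell(200, 50, 20);
// Larger scale from the text: 100 x 50 = 5,000 cells.
const largeKB = estimateMatrixKB(100, 50, perCell);
```

Under these assumptions each cell averages about 200 bytes, and the 100 × 50 matrix comes out at roughly 1,000 KB, which matches the 1 MB figure.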
3. Schema-driven UI is GraphQL's exact selling point
The EvaluatorCard JSON files (one per evaluator) drive the registry view automatically. With GraphQL, the schema is the registry; codegen produces typed React hooks; introspection drives the registry view. Without GraphQL, we'd ship a separate EvaluatorCard registry endpoint AND bespoke ?fields= projection AND hand-rolled typed React hooks. We were already paying the cost of formalising metadata; GraphQL gives us the runtime for free.
Three reasons the write path stays REST
1. Ingest is not graph-shaped
POST /api/v1/runs is a flat record write with (solutionId, runId) idempotency, content-hash validation, and audit-chain enforcement. GraphQL mutations work but add zero value here. HTTP idempotency keys, retry semantics, and the standard 409 Conflict flow are REST-native.
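The idempotency rule can be sketched as an in-memory model. The (solutionId, runId) key, the content hash, and the 409 Conflict outcome come from the text; the class name, the 201/200 codes for first write and replay, and the in-memory storage are assumptions for illustration only, since ingest is deferred to Mode C.

```typescript
// Hypothetical in-memory model of idempotent ingest keyed on (solutionId, runId).
// Only the 409 outcome is named in the text; 201 and 200 are assumed here.
type IngestStatus = 201 | 200 | 409;

class IdempotentRunStore {
  // Maps a composite key to the content hash of the first accepted write.
  private readonly hashes: { [key: string]: string } = {};

  ingest(solutionId: string, runId: string, contentHash: string): IngestStatus {
    // JSON-encoding the pair avoids collisions such as ("a:b", "c") vs ("a", "b:c").
    const key = JSON.stringify([solutionId, runId]);
    const existing = this.hashes[key];
    if (existing === undefined) {
      this.hashes[key] = contentHash;
      return 201; // first write for this key
    }
    // Same key and same content is a safe retry; different content is a conflict.
    return existing === contentHash ? 200 : 409;
  }
}
```

A retried request with an identical body is then harmless, while a second, different payload for the same run is rejected instead of silently overwriting the audit chain.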
2. PDF / trace / large-blob streaming
GET /api/v1/.../report.pdf and GET /api/v1/runs/{id}/trace are byte streams with content-type negotiation. GraphQL doesn't do streams cleanly. Apollo's file-upload spec exists but is grafted on. REST is the right answer for binary endpoints.
3. HTTP caching
CDN / browser / proxy caching of GET requests is automatic and free. GraphQL queries are mostly POST + non-cacheable (Apollo persisted queries fix this but add complexity).
Operational costs of GraphQL (honest read)
- Self-hoster debugging is harder than curl /api/v1/recent. They need GraphiQL or to know the schema. Mitigation: ship Hot Chocolate's embedded Nitro UI at /graphql (off in production).
- Query-depth and -complexity limits are required to prevent "give me 1000-deep recursion" attacks. Hot Chocolate ships these (MaxAllowedExecutionDepth, complexity analyzer); Mission Control configures MaxAllowedExecutionDepth = 10 (raised from 8 in Wave 8a to accommodate the SPA's 3-level scenario-tree drill-down at depth 9; deeper attack queries are still rejected).
- N+1 resolver risk is real. Mitigation: GreenDonut (DataLoader port for .NET, ships with Hot Chocolate) handles batched per-request loading. Deferred to MC1.4.7; Query.evaluatorTimeline is the documented N+1 candidate, currently bounded by its own maxScan: 200 cap.
- Persisted queries become real work if we want HTTP caching of GraphQL responses. Deferred to v1.6+.
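To make the depth limit concrete, here is a rough client-side pre-check that counts selection-set nesting by matching braces. It is only an approximation of what a server-side validator does: it does not parse the query, so braces inside string literals, comments, or inline object arguments would be miscounted. The function names are hypothetical; the limit of 10 is the value quoted above.

```typescript
// Rough sketch: approximate the nesting depth of a GraphQL document by
// tracking brace depth. Not a parser; see the caveats in the lead-in.
function selectionDepth(query: string): number {
  let depth = 0;
  let max = 0;
  for (let i = 0; i < query.length; i++) {
    const ch = query.charAt(i);
    if (ch === "{") {
      depth += 1;
      max = Math.max(max, depth);
    } else if (ch === "}") {
      depth -= 1;
    }
  }
  return max;
}

// Limit value taken from the text (MaxAllowedExecutionDepth = 10).
const MAX_ALLOWED_EXECUTION_DEPTH = 10;

function withinDepthLimit(query: string, limit: number = MAX_ALLOWED_EXECUTION_DEPTH): boolean {
  return selectionDepth(query) <= limit;
}
```

A check like this is useful for catching an over-deep generated query during development; the server-side limit remains the actual defence.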
REST surface (/api/v1/*)
# Shipped today — server metadata + binary streams (Mode A)
GET /api/v1/version
GET /api/v1/runs/{runId}/trace
GET /api/v1/runs/{runId}/reports/{format} # markdown / html / junit / sarif
GET /api/v1/compliance/{regulation}/{subject}/{ts}/report.pdf
GET /api/v1/compliance/{regulation}/schema
GET /api/v1/subjects/{kind}/{name}/history # application/x-ndjson stream
# Deferred — Mode C ingest (Phase 2)
POST /api/v1/runs # idempotent on (solutionId, runId)
POST /api/v1/runs/bulk
POST /api/v1/runs/{runId}/trace # multipart, large
POST /api/v1/compliance/evidence
POST /api/v1/red-team/campaigns
# Deferred — Mode C auth (Phase 2)
POST /api/v1/auth/tokens
GET /api/v1/auth/tokens
DELETE /api/v1/auth/tokens/{id}
GET /api/v1/auth/me
# Deferred — Mode C admin (Phase 2)
GET /api/v1/admin/audit-log
DELETE /api/v1/solutions/{id}/data?subject={name} # right-to-be-forgotten
# Deferred — live updates (Mode A/B SSE + Mode C SignalR)
GET /api/v1/events?topics= # SSE; not implemented in Phase 1
WS /hub/notifications # SignalR; Mode C only
The shipped block above matches src/AgentEval.MissionControl/Rest/BinaryEndpoints.cs and McHost.cs exactly.
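A small client-side sketch for the shipped endpoints: typed path builders plus a parser for the NDJSON history body. The paths and the four report formats come from the listing above; the helper names are hypothetical, and percent-encoding the path segments is an assumption about what the server accepts.

```typescript
// Report formats listed for GET /api/v1/runs/{runId}/reports/{format}.
type ReportFormat = "markdown" | "html" | "junit" | "sarif";

// Hypothetical helpers; segments are percent-encoded as a precaution.
function tracePath(runId: string): string {
  return "/api/v1/runs/" + encodeURIComponent(runId) + "/trace";
}

function reportPath(runId: string, format: ReportFormat): string {
  return "/api/v1/runs/" + encodeURIComponent(runId) + "/reports/" + format;
}

function historyPath(kind: string, name: string): string {
  return "/api/v1/subjects/" + encodeURIComponent(kind) + "/" + encodeURIComponent(name) + "/history";
}

// Parses an application/x-ndjson body: one JSON value per non-empty line.
function parseNdjson(body: string): unknown[] {
  return body
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}
```

For a long history, a real client would more likely parse the stream line by line as chunks arrive rather than buffering the whole body first.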
GraphQL surface (POST /graphql)
Schema is auto-discovered from C# records (src/AgentEval.MissionControl/GraphQL/Query.cs). Top-level resolvers (Phase 1):
Query.solution SolutionInfo?
Query.workspace WorkspaceState! ← drives first-run landing
Query.subjects(kind?) [SubjectInfo!]!
Query.subject(kind, name) SubjectInfo?
Query.recentRuns(count = 50) [RunPointer!]! ← max 500
Query.run(runId) RunManifest?
Query.runSummary(runId) RunSummary?
Query.runCostBreakdown(runId) RunCostBreakdown? ← per-tier cost
Query.scenarios(runId) [ScenarioResult!]!
Query.scenario(runId, scenarioId) ScenarioResult?
Query.scenarioTree(runId, scenarioId) EvalResult? ← recursive!
Query.compliance [ComplianceRegulationSummary!]!
Query.complianceMatrix(regulation) ComplianceMatrix! ← killer feature
Query.complianceEvidence(reg, kind, name, ts) ComplianceEvidence?
Query.evaluators(category?, costTier?) [EvaluatorCard!]!
Query.evaluator(key) EvaluatorCard?
Query.evaluatorTimeline(key, count = 30) [EvaluatorTimelinePoint!]! ← drift / calibration surface
Query.ping / Query.agentEvalVersion smoke
In Phase 2 the schema extends with Query.workspaces, Query.search, Query.crossRegulationOverlap, Query.redTeamCampaigns, etc. — see plan-07 §8.1 for the full forward-looking shape.
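As an illustration of calling one of these resolvers, the sketch below builds the JSON body for a recentRuns query. The default of 50 and the maximum of 500 come from the listing above; the `runId` field on RunPointer, the `Int!` variable type, and the client-side clamping are assumptions, since the text does not say how the server treats out-of-range counts.

```typescript
// Shape of a standard GraphQL-over-HTTP request body.
interface RecentRunsRequest {
  query: string;
  variables: { count: number };
}

// Default and maximum taken from the resolver listing in the text.
const RECENT_RUNS_DEFAULT = 50;
const RECENT_RUNS_MAX = 500;

// Hypothetical helper: clamps count into [1, 500] before building the request.
function recentRunsRequest(count: number = RECENT_RUNS_DEFAULT): RecentRunsRequest {
  const clamped = Math.min(Math.max(1, Math.floor(count)), RECENT_RUNS_MAX);
  return {
    // `runId` is an assumed field name on RunPointer, used for illustration.
    query: "query RecentRuns($count: Int!) { recentRuns(count: $count) { runId } }",
    variables: { count: clamped },
  };
}
```

The returned object would be serialized with `JSON.stringify` and sent as the body of POST /graphql with a JSON content type.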
Versioning policy
- REST: /api/v1/. Major bumps (/api/v2/) only on breaking schema change. Additive changes don't bump.
- GraphQL: no version segment in the URL. Schema evolves via @deprecated directives on fields that are moving out; clients keep working until they migrate. Breaking-removal of a field is a coordinated release-note event.
Both surfaces share the data model; you cannot get different data shapes from REST vs GraphQL — they're views over the same IOutputStoreReader reads.
See also
- getting-started.md: sample queries to get going.
- charting.md: chart library choice + render hints.
- portal-ready-evaluators.md: how to make an evaluator render as a first-class card in the portal.