A model that hid its own weak points would be less trustworthy, not more. So this page keeps the known tensions and the not-yet-covered scope honestly visible. Treat every item as an open edge of the model, not a settled answer.
01 - Tensions to keep honest
These don't go away - they get managed, and watching them is part of running the model.
Bus-factor risk - one builder owning a whole product. Sub-agents mitigate it; they don't remove it.
Verification is the bottleneck - self-verification is the design's weakest point.
Governance capture - whoever defines "high-risk" controls the org's velocity.
Tragedy of the commons - on the shared platform, each builder is incentivised to duplicate rather than negotiate shared structure.
Cascade specificity wars - a deep scope hierarchy makes "which rule wins" hard to reason about. Keep precedence total and the resolved set inspectable.
The test-oracle problem - a wrong test confidently passes wrong code. That's why the human four-eyes check is never fully replaceable.
Verification is a staffing constraint - build speed is wasted if testers can't match delivery cadence. The human bottleneck must be funded, not assumed.
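The cascade item above asks for two properties: a total precedence order and an inspectable resolved set. Below is a minimal sketch of one way to get both. It is an illustration only: the scope names, the `Rule` fields, and the "narrower scope wins, then later declaration wins" policy are all assumptions, since this page does not specify the actual hierarchy or tie-break.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical scope levels, broadest first. The real hierarchy is not
# defined on this page; these names are placeholders.
SCOPE_ORDER = ["org", "platform", "product", "feature"]


@dataclass(frozen=True)
class Rule:
    key: str    # the setting this rule governs
    value: str  # the value it assigns
    scope: str  # one of SCOPE_ORDER
    seq: int    # declaration order; assumed unique, used as the tie-break


def precedence(rule: Rule) -> tuple[int, int]:
    """Sort key for an assumed policy: narrower scope wins, then later declaration.

    Because seq is unique, no two rules share a key, so the order is total.
    """
    return (SCOPE_ORDER.index(rule.scope), rule.seq)


def resolve(rules: list[Rule]) -> dict[str, dict]:
    """Return, per key, the winning rule and every rule it overrode."""
    resolved: dict[str, dict] = {}
    # Walk from lowest to highest precedence so each later rule replaces
    # the current winner and the loser is recorded rather than discarded.
    for rule in sorted(rules, key=precedence):
        entry = resolved.setdefault(rule.key, {"winner": None, "overridden": []})
        if entry["winner"] is not None:
            entry["overridden"].append(entry["winner"])
        entry["winner"] = rule
    return resolved


rules = [
    Rule("review", "optional", "org", 0),
    Rule("review", "required", "product", 1),
    Rule("deploy", "manual", "platform", 2),
]
for key, entry in resolve(rules).items():
    losers = [r.scope for r in entry["overridden"]]
    print(f"{key} -> {entry['winner'].value} (overrode: {losers})")
```

The design choice worth noting is that `resolve` keeps the overridden rules next to the winner. Anyone asking "which rule wins, and why" can read the answer from the resolved set instead of re-deriving it from the hierarchy.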
02 - Not yet covered
What the model doesn't yet address - some real conceptual holes, some placeholders for later.
The transition path - brownfield, legacy, incrementally migrating an existing org. The model is greenfield-shaped: a destination with no road to it.
The talent pipeline - the model concentrates seniority while entry-level work shrinks. Where do new builders come from? (Started in the proxy-maintenance page.)
Cost ownership (FinOps for AI) - compute is the dominant cost, yet no role owns the AI budget.
(Operations and the portfolio/leadership layer left this list; both are now full pages.)
Observability of the AI itself - drift, sub-agent performance, escalation rate, context cost.
Trust calibration / graduated autonomy - how an agent earns and loses autonomy over time.
Runtime security - agents that act and RAG that pulls external content widen the attack surface (e.g. prompt injection).
The non-functional quality bar - performance, accessibility, i18n, scalability, each often needing its own encoded specialists and budgets.
How documentation, the build/deploy pipeline, and testing work in practice - environments, data, infrastructure.
03 - The backlog, in order
The suggested order for closing the remaining gaps.
Brownfield, legacy, incremental migration of an existing org.
How product builders are grown.
Who owns and allocates the compute budget.
Author, certify, version, monitor, retire, transfer; unifies eval ownership with trust calibration and observability.
Observability, trust calibration, runtime security, the non-functional quality bar.
Documentation mechanics, pipeline internals, test infrastructure.
Everything above is provisional - meant to be edited, argued with, and extended.
This is the end of the current series.