Key Design Decisions
Full Validation Before Comparison
Each channel's plan must pass full validation before it is compared; the comparator works on the normalized, validated plan rather than on the channel's raw output.
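The ordering above can be sketched as follows. This is a minimal illustration, not the actual implementation: the `Plan` shape, the `validate_and_normalize` and `compare_channels` names, and the normalization rule (trim and lowercase task names) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Hypothetical normalized plan: an ordered tuple of task names."""
    tasks: tuple[str, ...]


class PlanValidationError(ValueError):
    """Raised when a channel's raw output is not a valid plan."""


def validate_and_normalize(raw: dict) -> Plan:
    # Assumed rules: 'tasks' must be a non-empty list of non-blank strings.
    tasks = raw.get("tasks")
    if not isinstance(tasks, list) or not tasks:
        raise PlanValidationError("plan must contain a non-empty 'tasks' list")
    normalized = []
    for task in tasks:
        if not isinstance(task, str) or not task.strip():
            raise PlanValidationError(f"invalid task: {task!r}")
        # Assumed normalization: trim whitespace and lowercase.
        normalized.append(task.strip().lower())
    return Plan(tasks=tuple(normalized))


def compare_channels(raw_a: dict, raw_b: dict) -> bool:
    # Validation runs first for both channels; the comparison only ever
    # sees normalized plans, never the raw channel output.
    plan_a = validate_and_normalize(raw_a)
    plan_b = validate_and_normalize(raw_b)
    return plan_a == plan_b
```

Because comparison happens after normalization, two channels that differ only in formatting are treated as agreeing, while an invalid plan fails before any comparison takes place.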
Registry-Driven Behavior
The capability registry is the single source of truth for domain resources, actions, and related definitions. Components such as prompt builders and semantic validators read from it, so registry updates propagate to them automatically.
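One way to picture this, assuming a registry that maps each resource to its allowed actions: both consumers read the same structure at call time, so neither holds its own copy of the rules. The registry contents and the `build_prompt_section` and `is_semantically_valid` names are hypothetical.

```python
# Hypothetical registry: resource name -> set of allowed actions.
CAPABILITY_REGISTRY: dict[str, set[str]] = {
    "invoice": {"create", "read"},
    "customer": {"read"},
}


def build_prompt_section(registry: dict[str, set[str]]) -> str:
    """Render the capabilities a prompt builder would list for the model."""
    lines = []
    for resource in sorted(registry):
        actions = ", ".join(sorted(registry[resource]))
        lines.append(f"- {resource}: {actions}")
    return "\n".join(lines)


def is_semantically_valid(
    registry: dict[str, set[str]], resource: str, action: str
) -> bool:
    """Accept a (resource, action) pair only if the registry declares it."""
    return action in registry.get(resource, set())
```

Adding an action to the registry changes both the rendered prompt and the validator's verdict on the next call, with no change to either function.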
Separation of Semantic and Infrastructure Decisions
"What to do" (tasks, sequence, constraints) is decided during the planning and validation phase; "how to do it" (endpoints, connections) is decided only after the plan is approved, which prevents runtime conditions from contaminating planning.
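A small sketch of that boundary, under the assumption that an approved plan carries only semantic fields and that endpoints are looked up in a separate step. The `Task`, `ApprovedPlan`, and `resolve_endpoints` names, and the endpoint map passed in, are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Semantic description only: no URLs, hosts, or connection details."""
    resource: str
    action: str


@dataclass(frozen=True)
class ApprovedPlan:
    """A plan that has already passed validation (assumed to be immutable)."""
    tasks: tuple[Task, ...]


def resolve_endpoints(
    plan: ApprovedPlan, endpoints: dict[str, str]
) -> list[tuple[Task, str]]:
    # Infrastructure binding happens here, after approval. The plan itself
    # is not modified; each task is paired with an endpoint for execution.
    # A resource with no configured endpoint raises KeyError.
    return [(task, endpoints[task.resource]) for task in plan.tasks]
```

Since the plan type has no infrastructure fields, swapping the endpoint map (for example between environments) cannot change what the plan says should be done.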
Observability as a Design Attribute
Each component emits structured events tagged with a trace_id, recording the details of corrections, failures, and discrepancies so that the history of a request can be reconstructed.
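As an illustration, events could be emitted as JSON lines and grouped by trace_id afterwards. Apart from trace_id, the field names (`component`, `kind`, `detail`, `ts`) and the helper names are assumptions for this sketch.

```python
import json
import time


def make_event(trace_id: str, component: str, kind: str, detail: str) -> str:
    """Serialize one structured event as a JSON line."""
    return json.dumps(
        {
            "trace_id": trace_id,
            "component": component,
            "kind": kind,  # e.g. "correction", "failure", "discrepancy"
            "detail": detail,
            "ts": time.time(),
        },
        sort_keys=True,
    )


def reconstruct(events: list[str], trace_id: str) -> list[dict]:
    """Return the events of one request, in the order they were emitted."""
    parsed = [json.loads(line) for line in events]
    return [event for event in parsed if event["trace_id"] == trace_id]
```

Filtering an interleaved log by trace_id yields the sequence of corrections, failures, and discrepancies for a single request.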
End-User Feedback as a First-Class Signal
Ratings and endorsements of the final response are the only signal that covers the entire validation chain. The semantic context cache uses this signal to improve resource selection, and approved entries become candidates for the golden dataset.
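The feedback loop might look roughly like the sketch below. Everything here is hypothetical: the cache keyed by (context, resource), the approvals-minus-rejections score, and the rule that an entry with at least one approval and no rejections qualifies as a golden-dataset candidate are assumptions chosen to make the idea concrete.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CacheEntry:
    approvals: int = 0
    rejections: int = 0

    @property
    def score(self) -> int:
        # Assumed scoring: net approvals.
        return self.approvals - self.rejections


@dataclass
class SemanticContextCache:
    # Keyed by (context, resource); both are plain strings in this sketch.
    entries: dict[tuple[str, str], CacheEntry] = field(default_factory=dict)

    def record_feedback(self, context: str, resource: str, approved: bool) -> None:
        entry = self.entries.setdefault((context, resource), CacheEntry())
        if approved:
            entry.approvals += 1
        else:
            entry.rejections += 1

    def best_resource(self, context: str) -> Optional[str]:
        """Pick the resource with the highest score for this context."""
        scores = {r: e.score for (c, r), e in self.entries.items() if c == context}
        if not scores:
            return None
        return max(scores, key=scores.get)

    def golden_candidates(self) -> list[tuple[str, str]]:
        # Assumed rule: approved at least once and never rejected.
        return [
            key
            for key, entry in self.entries.items()
            if entry.approvals > 0 and entry.rejections == 0
        ]
```

For example, after one approval for ("billing question", "invoice") and one rejection for ("billing question", "customer"), `best_resource("billing question")` returns "invoice", and only the invoice entry is listed as a golden-dataset candidate.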