A governance protocol for safer, stoppable, and accountable AI.
Guardrails for autonomous AI agents. ΣΛ defines what must never happen, what must be true before acting, and when execution must stop. Designed to be parsed by both humans and AI agents without ambiguity.
The Problem
As AI systems become more autonomous, persistent, and operationally integrated, the dominant risk is no longer model error — it is semantic drift.
Modern AI systems act over long horizons, cross domain boundaries, optimize goals aggressively, and execute irreversible actions. Most failures occur not because systems are unintelligent, but because human intent, safety boundaries, and operational rules are encoded implicitly — relying on inference, good faith, or documentation that does not survive automation.
Inferred Intent
Agents interpret ambiguous instructions by inferring what the operator "probably meant." Inference substitutes the agent's model of intent for the operator's actual intent.
Silent Scope Expansion
An agent authorized for Task A discovers that Task B is a logical prerequisite. Without explicit boundaries, the agent expands its footprint without authorization.
Optimization Past Boundaries
Agents are optimizers. If safety constraints exist only as soft guidelines, the agent treats them as preferences to weigh, not boundaries to respect.
Absent Halting Semantics
Most systems carry no formal specification of when an agent must stop. Without halting semantics, agents default to continuation — the wrong default for safety-critical systems.
ΣΛ addresses this by making boundaries explicit and enforceable.
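The continuation-by-default failure can be inverted with a default-deny evaluator: an action proceeds only when every constraint explicitly passes, and any failed or unevaluable check halts. A minimal sketch, assuming hypothetical names (`Constraint`, `evaluate`) that are illustrative, not part of ΣΛ:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Constraint:
    clause_id: str                    # e.g. "CL-003"
    check: Callable[[dict], bool]     # True means the constraint holds

def evaluate(action: dict, constraints: list[Constraint]) -> tuple[bool, str]:
    """Default-deny: continue only if every constraint passes.

    A failed check halts; a check that cannot be evaluated also
    halts, so stopping is the default rather than the exception.
    """
    for c in constraints:
        try:
            if not c.check(action):
                return False, f"halt: {c.clause_id} violated"
        except Exception:
            return False, f"halt: {c.clause_id} could not be evaluated"
    return True, "proceed"

# Example constraint mirroring [CL-003] from the policy below.
budget = Constraint("CL-003", lambda a: a["token_usage"] <= a["budget_limit"])
```

Note that missing evidence (a key absent from `action`) halts just like a violated clause; an agent that cannot prove compliance does not get the benefit of the doubt.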
Why It Matters
Prevents "helpful but dangerous" automation
AI systems that optimize beyond their mandate cause the most damage when they're technically correct but violate implicit boundaries.
Enables clean refusal and halting
Agents prove compliance or halt. Refusal is a first-class outcome, not an error state.
Makes AI systems auditable and defensible
Every action is traceable to a policy clause. Every decision is logged with evidence.
Stops scope expansion and silent reinterpretation
Explicit constraints prevent agents from drifting beyond their authority or reinterpreting intent.
Documentation
Core Principles
Policy
What must be true and what must never happen. Declarative constraints that define the boundaries of acceptable system behavior.
Execution
How systems act within defined constraints. Operational logic that respects policy boundaries and enables clean refusal.
Evidence
What actually occurred. Immutable trace records that provide auditability and support governance requirements.
Architecture
ΣΛ separates concerns into three distinct layers that communicate through well-defined interfaces. This separation enables independent verification, auditing, and evolution of each component.
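One way to realize this separation is to give each layer a narrow interface: the policy layer answers yes/no with no side effects, the evidence layer only appends, and the executor is the sole component that acts. A sketch under those assumptions (the interface names are illustrative, not prescribed by ΣΛ):

```python
from typing import Protocol

class Policy(Protocol):
    """Policy layer: declarative, side-effect free."""
    def permits(self, action: str, context: dict) -> bool: ...

class TraceSink(Protocol):
    """Evidence layer: append-only record of what occurred."""
    def emit(self, record: dict) -> None: ...

class Executor:
    """Execution layer: acts only within policy bounds and records
    every decision, including refusals, as first-class outcomes."""
    def __init__(self, policy: Policy, trace: TraceSink):
        self.policy, self.trace = policy, trace

    def run(self, action: str, context: dict) -> bool:
        allowed = self.policy.permits(action, context)
        self.trace.emit({"action": action, "allowed": allowed})
        return allowed
```

Because the executor never inspects policy internals and the trace sink never influences decisions, each layer can be verified, audited, and replaced independently.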
Change Control
Policy changes follow a controlled process that ensures traceability and prevents unauthorized modifications. All changes must be versioned, reviewed, and attested before activation.
Version Control
All policies are versioned using semantic versioning. Changes to invariants or prohibited actions require a major version increment.
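Under that rule the required bump can be computed mechanically from the kinds of clauses a change touches. A sketch, assuming a change is summarized as a set of touched clause kinds (the kind names are illustrative):

```python
def required_bump(changed_kinds: set[str]) -> str:
    """Map touched clause kinds to a semantic-versioning bump.

    Invariants and prohibited actions are load-bearing, so touching
    them forces a major version; any other clause change is minor;
    an empty change set (metadata only) is a patch.
    """
    if changed_kinds & {"invariant", "prohibited_action"}:
        return "major"
    if changed_kinds:
        return "minor"
    return "patch"

def bump(version: str, level: str) -> str:
    """Apply a semver bump, resetting the lower components."""
    major, minor, patch = map(int, version.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

For example, editing an invariant in a `1.0.0` policy yields `bump("1.0.0", required_bump({"invariant"}))`, i.e. `2.0.0`.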
Review Requirements
Policy modifications require attestation from designated reviewers before activation. The number of required approvals is configurable per environment.
Activation Trace
When a policy is activated, a trace record is emitted containing the policy version, activation timestamp, and attestation signatures.
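An activation record of that shape might be assembled as follows; the field names beyond the three listed above (`event`, `activated_at`) are assumptions for illustration:

```python
import json
from datetime import datetime, timezone

def activation_trace(policy_name: str, version: str,
                     attestations: list[dict]) -> str:
    """Build an activation trace record: policy version,
    activation timestamp, and attestation signatures."""
    record = {
        "event": "policy_activated",
        "policy": {"name": policy_name, "version": version},
        "activated_at": datetime.now(timezone.utc).isoformat(),
        "attestations": attestations,
    }
    return json.dumps(record, indent=2)
```

A caller would pass the reviewer signatures collected during review, e.g. `activation_trace("transaction-limits", "1.0.0", [{"signer": "reviewer-01", "signature": "sig_..."}])`.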
Trace Format
Traces provide immutable evidence of system decisions. Each trace record contains the decision made, the policy context, and sufficient evidence to reconstruct the reasoning.
{
  "trace_id": "tr_8f4b2a1c",
  "timestamp": "2026-01-31T14:32:08.421Z",
  "decision": {
    "type": "refusal",
    "action_requested": "transfer",
    "reason": "risk_score_exceeded"
  },
  "policy": {
    "name": "transaction-limits",
    "version": "1.0.0",
    "invariant_violated": "user.risk_score <= 0.8"
  },
  "evidence": {
    "user_risk_score": 0.92,
    "threshold": 0.8,
    "computed_at": "2026-01-31T14:32:08.418Z"
  },
  "attestation": {
    "signature": "sig_a7c3e9...",
    "signer": "executor-node-01"
  }
}
Examples
AI Agent Autonomy
An agent autonomy policy that defines boundaries for AI coding assistants operating on production systems.
[DEF-001] mutate(X) ≡ modify(X) ∨ overwrite(X) ∨ replace(X) ∨ delete(X)
[DEF-002] shell_cmd ≡ bash(X) ∨ powershell(X) ∨ exec(X)
[CL-001] shell_cmd ∧ ¬approved(operator) ⊢ ⊥
[CL-002] mutate(production_data) ⊢ ⊥
[CL-003] token_usage > budget_limit ⇒ halt(session)
[CL-004] action(X) ∧ ¬confirmed(intent, user) ⊢ ⊥
Production Deployment
A deployment policy that protects critical infrastructure during production releases.
[DEF-001] destroy(X) ≡ remove(X) ∨ drop(X) ∨ truncate(X)
[CL-001] delete(audit_logs) ⊢ ⊥
[CL-002] deploy(production) ⇒ exists(backup)
[CL-003] destroy(critical_db) ⊢ ⊥
[CL-004] modify(live_port) ⊢ ⊥
Trace Evidence
A trace record documenting a successful generation with full evidence chain.
[2026-02-01T00:15:32Z] SESSION agent-session-001
[2026-02-01T00:15:32Z] POLICY agent_autonomy.ΣΛ.md v1.0
[2026-02-01T00:15:33Z] CHECK [CL-001] shell_cmd ∧ ¬approved → PASS (no shell)
[2026-02-01T00:15:34Z] CHECK [CL-002] mutate(production_data) → PASS
[2026-02-01T00:15:34Z] CHECK [CL-003] token_usage (1,247) < budget (10,000) → PASS
[2026-02-01T00:15:35Z] CHECK [CL-004] intent confirmed → PASS
[2026-02-01T00:15:35Z] RESULT: COMPLIANT — session complete
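A log like the one above can be produced by walking the policy's clauses against the session state. The sketch below hard-codes the four agent-autonomy checks; the state keys and function name are illustrative assumptions, not a ΣΛ-defined API:

```python
from datetime import datetime, timezone

def check_session(state: dict) -> list[str]:
    """Evaluate the agent-autonomy clauses against a session state
    and emit trace lines; COMPLIANT only if every check passes."""
    checks = [
        ("CL-001", "shell_cmd ∧ ¬approved",
         not state["shell_used"] or state["operator_approved"]),
        ("CL-002", "mutate(production_data)",
         not state["mutated_production"]),
        ("CL-003",
         f"token_usage ({state['tokens']:,}) < budget ({state['budget']:,})",
         state["tokens"] < state["budget"]),
        ("CL-004", "intent confirmed", state["intent_confirmed"]),
    ]
    now = datetime.now(timezone.utc).strftime("[%Y-%m-%dT%H:%M:%SZ]")
    lines = [f"{now} CHECK [{cid}] {desc} → {'PASS' if ok else 'FAIL'}"
             for cid, desc, ok in checks]
    verdict = ("COMPLIANT — session complete"
               if all(ok for *_, ok in checks) else "VIOLATION — halt")
    lines.append(f"{now} RESULT: {verdict}")
    return lines
```

Note the result line is derived from the individual checks rather than asserted separately, so the verdict can never disagree with its own evidence.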