KLIS-7: Control Plane Requirements
Minimum guarantees for a KLIS authoritative server.
1. Purpose
The Control Plane is the authoritative server (or cluster) that maintains the Global Intent State. KLIS-7 defines the minimum API and state guarantees required to host a KLIS-compliant ecosystem.
2. Responsibilities
- State Persistence: Store active Leases and Agent Sessions.
- Conflict Evaluation: Execute the logic from KLIS-2 (in O(1)).
- Timekeeping: Be the source of truth for Lease Expiry (KLIS-3).
- Notification: Publish events (LeaseFreed, ContentionDetected).
3. Required API Concepts
The Control Plane MUST expose endpoints equivalent to:
- POST /v1/manifest: Submit a NIM. Returns Granted | Denied | Wait.
- POST /v1/leases/heartbeat: Bulk renew active leases.
- POST /v1/leases/release: Explicitly release resources.
- POST /v1/leases/reconcile: Resurrection API. Accepts an array of Lease IDs and an Agent ID. Returns a boolean map of validity.
- GET /v1/resource/{id}/state: Get current locks (who holds this?).
- GET /v1/contention: Get hotspot metrics.
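As an illustration of the reconcile endpoint, the following is a minimal sketch of the validity check behind it. The `Lease` record, the `reconcile` function, and the rule that a lease is valid only if it exists, belongs to the calling agent, and has not expired are all assumptions made for this example; KLIS-7 itself only specifies the input (Lease IDs plus an Agent ID) and the output (a boolean map).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Lease:
    # Hypothetical lease record; field names are illustrative only.
    lease_id: str
    agent_id: str
    expires_at: float  # seconds on the Control Plane's clock


def reconcile(
    leases: Dict[str, Lease],
    agent_id: str,
    lease_ids: List[str],
    now: float,
) -> Dict[str, bool]:
    """Map each requested lease ID to True if it is still valid for agent_id."""
    result: Dict[str, bool] = {}
    for lease_id in lease_ids:
        lease = leases.get(lease_id)
        # Assumed validity rule: known, owned by the caller, and unexpired.
        result[lease_id] = (
            lease is not None
            and lease.agent_id == agent_id
            and lease.expires_at > now
        )
    return result
```

An agent recovering from a crash would send the IDs it believes it holds and drop every lease that maps to False.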
4. State Guarantees
- Linearizability: Lease operations MUST be atomic. Multiple concurrent requests for the same resource MUST be serialized.
- Durability: Leases SHOULD be stored in memory for speed (Redis/Memcached) but critical session state SHOULD be persisted.
- Correction: Since Leases are ephemeral (TTL-bound), in-memory storage with a write-ahead log (WAL) is usually sufficient.
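The serialization requirement can be sketched with a single in-process lock around an in-memory lease table. This is only a stand-in under stated assumptions: `LeaseTable`, `acquire`, and `holder` are hypothetical names, every lease is treated as exclusive (the real conflict logic lives in KLIS-2), and the WAL is omitted. A clustered Control Plane would need a consensus or single-writer mechanism rather than a thread lock.

```python
import threading
import time
from typing import Callable, Dict, Optional, Tuple


class LeaseTable:
    """In-memory lease table; all operations are serialized by one lock."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._lock = threading.Lock()
        # resource_id -> (agent_id, expiry time on the server clock)
        self._leases: Dict[str, Tuple[str, float]] = {}
        self._clock = clock

    def acquire(self, resource_id: str, agent_id: str, ttl: float) -> bool:
        """Grant or renew the lease unless another agent holds a live one."""
        with self._lock:
            now = self._clock()
            entry = self._leases.get(resource_id)
            if entry is not None and entry[1] > now and entry[0] != agent_id:
                return False  # live lease held by someone else
            self._leases[resource_id] = (agent_id, now + ttl)
            return True

    def holder(self, resource_id: str) -> Optional[str]:
        """Return the current holder, or None if the lease is absent or expired."""
        with self._lock:
            entry = self._leases.get(resource_id)
            if entry is None or entry[1] <= self._clock():
                return None
            return entry[0]
```

Expiry is judged only against the table's own clock, which matches the requirement that the Control Plane be the source of truth for Lease Expiry.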
5. Multi-Tenant Considerations
- Namespaces: The Control Plane MUST support isolation between unrelated tenants. AppA's /config is not AppB's /config.
- Quotas: Rate-limiting on Acquire requests to prevent DDoS.
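One way to picture both points is a tenant-scoped resource key plus a per-tenant token bucket. The sketch below assumes this particular design; `scoped_key`, `TokenBucket`, and the capacity and refill parameters are hypothetical, and KLIS-7 does not mandate any specific rate-limiting algorithm.

```python
from dataclasses import dataclass
from typing import Tuple


def scoped_key(tenant: str, resource_path: str) -> Tuple[str, str]:
    """Qualify a resource path with its tenant so equal paths never collide."""
    return (tenant, resource_path)


@dataclass
class TokenBucket:
    """Simple token bucket; one instance per tenant (assumed policy)."""

    capacity: float        # maximum burst size, in requests
    refill_per_sec: float  # sustained request rate
    tokens: float          # tokens currently available
    last: float            # timestamp of the previous call to allow()

    def allow(self, now: float) -> bool:
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With this shape, an Acquire request would first be checked against the caller's bucket and then evaluated against keys built with `scoped_key`, so one tenant can neither see nor exhaust another's resources.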
6. Observability
The Control Plane MUST provide a "God View" of the system:
- Who is blocked?
- What are the hotspots?
- Who are the "Ghost Agents" (high timeout rate)?
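The "Ghost Agent" question can be answered from per-agent counters. The following sketch assumes the Control Plane keeps, for each agent, the number of leases granted and the number that expired without release; the function name `ghost_agents` and both thresholds (`min_rate`, `min_leases`) are illustrative values, not part of KLIS-7.

```python
from typing import Dict, List, Tuple


def ghost_agents(
    stats: Dict[str, Tuple[int, int]],
    min_rate: float = 0.5,
    min_leases: int = 10,
) -> List[str]:
    """Return agents whose lease timeout rate is at or above min_rate.

    stats maps agent_id -> (leases_granted, leases_timed_out).
    Agents with fewer than min_leases grants are skipped to avoid
    flagging an agent on a tiny sample.
    """
    ghosts = []
    for agent_id, (granted, timed_out) in stats.items():
        if granted >= min_leases and timed_out / granted >= min_rate:
            ghosts.append(agent_id)
    return sorted(ghosts)
```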
7. Non-Goals
- Data Storage: The Control Plane stores metadata (Intents), not the actual file contents.
- Execution: The Control Plane does not run the agents.
8. Telemetry of Death
The Control Plane MUST track "Death Counts" (how many times an agent has been killed via Wait-Die). If an agent exceeds a "Starvation Threshold" (e.g., 50 deaths), the Control Plane MUST issue a PAUSED verdict for manual intervention.
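A minimal sketch of the death counter follows. It takes the threshold of 50 from the example above and reads "exceeds" as strictly greater than; the `DeathTracker` class and the "RETRY" return value for agents still under the threshold are hypothetical, since this section names only the PAUSED verdict.

```python
from collections import defaultdict
from typing import DefaultDict

STARVATION_THRESHOLD = 50  # example value from the text


class DeathTracker:
    """Counts Wait-Die kills per agent and flags starvation."""

    def __init__(self, threshold: int = STARVATION_THRESHOLD) -> None:
        self.threshold = threshold
        self.deaths: DefaultDict[str, int] = defaultdict(int)

    def record_death(self, agent_id: str) -> str:
        """Record one Wait-Die kill and return the verdict for the agent."""
        self.deaths[agent_id] += 1
        if self.deaths[agent_id] > self.threshold:
            return "PAUSED"  # needs manual intervention
        return "RETRY"  # hypothetical verdict: the agent may resubmit
```

Whether the counter resets after a successful grant or after manual intervention is left open here; the text does not specify it.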