Composable Security: Where Proofs Break in Real Systems

Monthly research note. Theme: Deep Systems Notes.

TL;DR

Treat composable security as an engineering constraint: write down assumptions, make invariants executable, and design operational recovery as part of correctness.

Key insight

If the spec is implicit, the implementation becomes the spec—and you’ll learn it during incidents.

Key takeaways

Operational behavior is part of correctness: rollout, rollback, and evidence.
Contracts need enforcement: tests, assertions, and monitoring—not documentation.
Integration boundaries are where proofs evaporate; treat them as first-class.
Bind security decisions to evidence (audit, invariants, telemetry).
Treat retries, reordering, and partial failure as default conditions.

Why this matters

Security becomes optional through configuration drift unless enforced.
Most real failures happen at integration boundaries, not inside components.
Resilience requires making failure modes explicit and bounded.
Operational behavior is part of correctness (rollouts, rollbacks, drift).

Key questions

Which assumptions leak across boundaries (time, randomness, identity, ordering)?
Which proofs are worth maintaining vs replacing with tests and monitoring?
What are your compositional failure modes (partial deploys, mixed versions)?
How do you keep 'security properties' visible to operators and SREs?
How do you prevent 'optional security' from appearing via config drift?
Where does 'correctness' become an operational contract (SLOs, budgets, policy)?

Assumptions

Components are built by different teams with different threat models.
Integration happens under time pressure; defaults become de facto policy.
Adversaries exploit ambiguity between systems, not within them.
Observability is imperfect; you debug from partial evidence.

Non-goals

Relying on “tribal knowledge” to connect assumptions across layers.
Allowing config to silently weaken security properties.

Attack surface

Observability pipelines can be attacked (cardinality explosions, log injection). Protect them.
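
As a minimal sketch of the two attacks named above, the code below escapes control characters in a log field (so one value cannot forge extra log lines) and caps the number of distinct label values a metric accepts. The names `sanitize_log_field`, `BoundedLabelSet`, and the limits are hypothetical, chosen only for illustration.

```python
import re

# Matches ASCII control characters, including newline and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def sanitize_log_field(value: str, max_len: int = 200) -> str:
    """Escape control characters so one field cannot forge extra log lines."""
    escaped = _CONTROL_CHARS.sub(lambda m: f"\\x{ord(m.group()):02x}", value)
    return escaped[:max_len]


class BoundedLabelSet:
    """Caps the number of distinct label values a metric will accept."""

    def __init__(self, limit: int = 100) -> None:  # hypothetical default
        self.limit = limit
        self.seen = set()

    def normalize(self, value: str) -> str:
        if value in self.seen:
            return value
        if len(self.seen) < self.limit:
            self.seen.add(value)
            return value
        # Past the cap, collapse new values into a single overflow bucket.
        return "__other__"
```

Collapsing overflow values loses detail, so the cap is itself a signal worth alerting on when it is hit.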

Model & invariants

Interface contracts are predicates:

\text{caller obeys } P \Rightarrow \text{callee guarantees } Q.

Make assumptions executable: encode them as assertions, tests, and run-time checks.
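
One way to make the P ⇒ Q contract above executable is a wrapper that checks P on entry and Q on exit. This is a sketch under stated assumptions: `contract`, `ContractViolation`, and the `withdraw` example are hypothetical names, not an established API.

```python
from functools import wraps
from typing import Callable


class ContractViolation(AssertionError):
    """Raised when a caller breaks P or a callee breaks Q."""


def contract(pre: Callable[..., bool], post: Callable[..., bool]):
    """Wrap a function so P is checked on entry and Q on exit."""

    def decorate(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            if not pre(*args, **kwargs):
                raise ContractViolation(f"caller broke precondition of {fn.__name__}")
            result = fn(*args, **kwargs)
            if not post(result, *args, **kwargs):
                raise ContractViolation(f"{fn.__name__} broke its postcondition")
            return result

        return wrapper

    return decorate


# Hypothetical example: P is "0 < amount <= balance", Q is "result is the new, non-negative balance".
@contract(
    pre=lambda balance, amount: 0 < amount <= balance,
    post=lambda result, balance, amount: result == balance - amount and result >= 0,
)
def withdraw(balance: int, amount: int) -> int:
    return balance - amount
```

Separating the two checks matters for blame: a failed precondition points at the caller's side of the boundary, a failed postcondition at the callee's.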

Treat config as code: version it, review it, and monitor drift.
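
Drift monitoring can be sketched as comparing a fingerprint of the reviewed config against the running one, then naming the keys that differ. The config keys and function names here are hypothetical; a real system would likely compare against whatever its version control treats as reviewed.

```python
import hashlib
import json

_MISSING = object()  # distinguishes "key absent" from "key set to None"


def config_fingerprint(config: dict) -> str:
    """Hash a canonical JSON encoding so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drifted_keys(reviewed: dict, running: dict) -> list:
    """Return the top-level keys whose values differ from the reviewed config."""
    keys = set(reviewed) | set(running)
    return sorted(
        k for k in keys if reviewed.get(k, _MISSING) != running.get(k, _MISSING)
    )


# Hypothetical configs for illustration.
reviewed = {"tls_min_version": "1.3", "require_mtls": True}
running = {"require_mtls": False, "tls_min_version": "1.3"}

if config_fingerprint(reviewed) != config_fingerprint(running):
    print("config drift:", drifted_keys(reviewed, running))
```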

Invariant

Invariants must be checkable from evidence you actually have (state + logs + counters).
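
For illustration, here is an invariant stated purely over evidence: acknowledged IDs from a log, stored IDs from state, and an acknowledgement counter. The invariant itself ("every acknowledged write is stored, and the counter agrees with the log") and the function name are assumptions made for this sketch.

```python
def check_ack_invariant(acked_ids, stored_ids, ack_counter):
    """Return a list of violations; an empty list means the invariant holds."""
    violations = []
    acked = set(acked_ids)

    # Evidence source 1 vs 2: the audit log against the stored state.
    missing = sorted(acked - set(stored_ids))
    if missing:
        violations.append(f"acked but not stored: {missing}")

    # Evidence source 3: the counter must agree with the log.
    if ack_counter != len(acked):
        violations.append(
            f"ack counter {ack_counter} != {len(acked)} distinct acks in log"
        )
    return violations
```

Returning the violations rather than a boolean keeps the output usable as incident evidence.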

Security properties

Authenticity: actions are bound to identity and purpose.
Replay resistance: duplicated inputs do not change outcomes.
Integrity: invalid transitions are rejected (and detectable).
Downgrade resistance: negotiation can’t silently weaken security posture.
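
Downgrade resistance from the list above can be sketched as a negotiation with an explicit floor: if no mutually supported version meets the minimum, the connection fails instead of falling back. The version names, the `STRENGTH` ranking, and `negotiate` are hypothetical.

```python
# Hypothetical protocol versions, ranked weakest to strongest.
STRENGTH = {"v1": 1, "v2": 2, "v3": 3}


class DowngradeError(Exception):
    """Raised when negotiation would fall below the configured floor."""


def negotiate(local_supported, peer_offered, minimum):
    """Pick the strongest common version at or above the floor, else fail."""
    floor = STRENGTH[minimum]
    common = set(local_supported) & set(peer_offered)
    # Unknown versions rank 0, so they never pass the floor.
    allowed = [v for v in common if STRENGTH.get(v, 0) >= floor]
    if not allowed:
        raise DowngradeError(
            f"no common version at or above {minimum}; peer offered {sorted(peer_offered)}"
        )
    return max(allowed, key=lambda v: STRENGTH[v])
```

A floor only bounds how far a downgrade can go. Preventing an attacker from stripping stronger offers above the floor also requires authenticating the negotiation messages themselves, which this sketch does not attempt.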

Failure modes

Resource exhaustion (CPU/bandwidth/storage) turning into correctness failures.
Timeout ambiguity causing double-apply or partial state transitions.
Recovery paths that only work when nothing is broken.
Mixed-version behavior that violates assumptions silently.
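
Timeout ambiguity is commonly handled with idempotency keys: the client retries with the same key, and the server returns the recorded result instead of applying the change twice. The sketch below is in-memory only, and `Ledger` and its method names are hypothetical.

```python
class Ledger:
    """Applies each credit at most once per idempotency key."""

    def __init__(self):
        self.balance = 0
        self._results = {}  # idempotency key -> balance returned for that key

    def credit(self, key: str, amount: int) -> int:
        # A retry after an ambiguous timeout replays the recorded result.
        if key in self._results:
            return self._results[key]
        self.balance += amount
        self._results[key] = self.balance
        return self.balance


ledger = Ledger()
first = ledger.credit("req-1", 50)
retry = ledger.credit("req-1", 50)  # client timed out and retried
```

In a real system the key record and the state change would need to commit atomically; otherwise a crash between the two reintroduces the double-apply.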

Pitfall

Mixed-version deployments create states you never tested—plan for them explicitly.
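
One way to plan for mixed versions explicitly is to enumerate every writer/reader version pair and run the same check across the whole matrix. The two message formats and the `mfa_required` field below are hypothetical.

```python
import json
from itertools import product


def encode_v1(user):
    return json.dumps({"name": user})


def encode_v2(user):
    return json.dumps({"name": user, "schema": 2, "mfa_required": True})


def decode_v1(payload):
    # The v1 reader ignores fields it does not know about.
    data = json.loads(payload)
    return {"name": data["name"]}


def decode_v2(payload):
    data = json.loads(payload)
    # A v1 writer omits the field: default to the stricter setting.
    return {"name": data["name"], "mfa_required": data.get("mfa_required", True)}


ENCODERS = {"v1": encode_v1, "v2": encode_v2}
DECODERS = {"v1": decode_v1, "v2": decode_v2}


def mixed_version_matrix(user="alice"):
    """Decode every writer version with every reader version."""
    results = {}
    for (w, enc), (r, dec) in product(ENCODERS.items(), DECODERS.items()):
        results[(w, r)] = dec(enc(user))
    return results
```

The point of the matrix is the off-diagonal cells: old writer with new reader, and new writer with old reader, are the states a rolling deploy or a rollback actually produces.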

Design sketch

flowchart LR
  boundary["Boundary"] --> contract["Contract (P -> Q)"]
  contract --> test["Tests"]
  test --> monitor["Monitoring"]
  monitor --> incident["Incident"]
  incident --> contract

Implementation notes

Operational constraints are part of the design: deploy, rollback, and drift.

Rule of thumb

Acknowledge only after durability (or make “ack” explicitly best-effort).
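
A minimal sketch of the rule, assuming a local append-only file stands in for the durable store: the function returns its acknowledgement only after `os.fsync` has returned. The function name and record format are hypothetical.

```python
import os
import tempfile


def durable_append(path: str, record: bytes) -> str:
    """Append a record and acknowledge only after it has been synced."""
    with open(path, "ab") as f:
        f.write(record + b"\n")
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to push it to stable storage
    return "ack"


# Usage: write one record to a throwaway log file.
log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
status = durable_append(log_path, b"credit req-1 50")
```

This covers the file's contents only. Making a newly created file durable generally also involves syncing its parent directory, and replicated stores define durability differently, so the definition of "durable" still has to be written down per system.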

// Integration note: treat FFI/service boundaries as an API with invariants.
// Encode invariants as types where possible, assertions otherwise.
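
The note above might look like the following sketch: a wrapper type that can only be constructed from a valid payload, plus an assertion at the boundary. `BoundedPayload`, its size limit, and `send_across_boundary` are hypothetical. Python does not enforce type hints at run time, which is why the assertion is still there.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass(frozen=True)
class BoundedPayload:
    """A payload whose size invariant is checked once, at construction."""

    data: bytes
    MAX_LEN: ClassVar[int] = 4096  # hypothetical limit

    def __post_init__(self):
        if not isinstance(self.data, bytes):
            raise TypeError("payload must be bytes")
        if not 0 < len(self.data) <= self.MAX_LEN:
            raise ValueError(f"payload length must be in 1..{self.MAX_LEN}")


def send_across_boundary(payload: BoundedPayload) -> int:
    """Stand-in for an FFI or RPC call; returns the number of bytes sent."""
    # The type carries the invariant; the assertion catches untyped callers.
    assert isinstance(payload, BoundedPayload), "unvalidated payload at boundary"
    return len(payload.data)
```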

Verification strategy

End-to-end property tests for the smallest meaningful workflow.
Invariant monitoring tied to incident response playbooks.
Upgrade tests for mixed-version and rollback scenarios.
Contract tests at boundaries with adversarial inputs and skew.
Fault injection at seams (queues, caches, RPC), not only inside components.
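
As a sketch of a property test for a very small workflow, the code below delivers a batch of messages with random duplication and reordering and checks that the final state matches in-order, exactly-once delivery. It uses only the standard library with a fixed seed; the workflow (`apply_all`) and its message shape are hypothetical.

```python
import random


def apply_all(messages):
    """Apply (msg_id, amount) credits, ignoring duplicate message ids."""
    seen, total = set(), 0
    for msg_id, amount in messages:
        if msg_id in seen:
            continue
        seen.add(msg_id)
        total += amount
    return total


def check_delivery_property(trials=200, seed=0):
    """Property: duplication and reordering never change the final total."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(0, 8)
        original = [(i, rng.randint(1, 100)) for i in range(n)]
        expected = sum(amount for _, amount in original)

        # Inject faults at the seam: duplicate some messages, then shuffle.
        extra = rng.randint(0, 5)
        duplicates = [rng.choice(original) for _ in range(extra)] if original else []
        delivered = original + duplicates
        rng.shuffle(delivered)

        assert apply_all(delivered) == expected, delivered
    return trials
```

The fixed seed keeps failures reproducible; the failing delivery order is included in the assertion message so it can be turned into a regression test.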

Operational notes

Use canaries for protocol and crypto changes; define rollback triggers.
Maintain runbooks that reference invariants, not just symptoms.
Store evidence: audit logs, config diffs, and deployment metadata.
Treat config drift as an incident: detect, alert, and remediate.
Make security and correctness properties observable (metrics + alerts).

Operational note

Make degraded modes explicit: fail closed vs fail open is a policy choice.
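
One way to make that policy choice explicit is to require every caller to state it, so there is no implicit default when the check itself fails. `FailureMode` and `authorize` are hypothetical names for this sketch.

```python
from enum import Enum


class FailureMode(Enum):
    FAIL_CLOSED = "fail_closed"  # deny when the check cannot be completed
    FAIL_OPEN = "fail_open"      # allow when the check cannot be completed


def authorize(check, request, on_error: FailureMode) -> bool:
    """Run an authorization check with an explicit degraded-mode policy."""
    try:
        return bool(check(request))
    except Exception:
        # The check failed to run at all; the caller's stated policy decides.
        return on_error is FailureMode.FAIL_OPEN
```

Because `on_error` has no default value, the choice shows up at every call site and in code review rather than being inherited silently.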

What to monitor

Authz failures and policy denials (unexpected spikes).
Rollback events and the conditions that triggered them.
Error budget burn + tail latency under load.
Invariant violation rate (should be ~0).
Retry/timeout rates by endpoint and client cohort.

Rollback plan

Keep dual-write / dual-verify windows where appropriate.
Define an explicit rollback trigger (metrics + thresholds).
Prefer backward-compatible changes; avoid “flag day” upgrades.
Preserve evidence (configs, artifacts, audit logs) to reconstruct what changed.
Use canaries and staged rollout; stop early when signals degrade.
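
An explicit rollback trigger can be written as data plus one function, as in the sketch below. The metric names and thresholds are hypothetical, and a missing metric is treated as a reason to roll back, which is itself a policy assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackTrigger:
    max_error_rate: float
    max_p99_latency_ms: float
    max_invariant_violations: int = 0

    def reasons(self, metrics: dict) -> list:
        """Return every threshold breached by the canary's metrics."""
        limits = {
            "error_rate": self.max_error_rate,
            "p99_latency_ms": self.max_p99_latency_ms,
            "invariant_violations": self.max_invariant_violations,
        }
        found = []
        for name, limit in limits.items():
            value = metrics.get(name)
            if value is None:
                # No signal is not the same as a good signal.
                found.append(f"{name}: missing")
            elif value > limit:
                found.append(f"{name}: {value} > {limit}")
        return found


def should_roll_back(trigger: RollbackTrigger, metrics: dict) -> bool:
    return bool(trigger.reasons(metrics))
```

Keeping the reasons as strings means the same output can be stored with the deployment metadata as evidence of why a rollback fired.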

Evidence

Site Reliability Engineering (Google) (1) — Error budgets, incident response, and reliability as an engineering discipline.
- Evidence: Error budgets and incident response are correctness controls; tie monitoring and rollback triggers to SLO burn.
Jepsen (2) — Integration-focused fault testing and correctness thinking.
- Evidence: Turn faults into test cases; prioritize partition and clock-skew scenarios that violate user-visible guarantees.

Open questions

Where can config silently weaken security properties today?
Which assumptions do you currently enforce only through convention?
What boundary is most likely to be bypassed under incident pressure?
Which properties can be proven locally vs only tested end-to-end?

Checklist

Costs bounded (CPU/memory/bandwidth) under adversarial inputs.
Telemetry captures correctness signals.
Rollback plan rehearsed and automated.
Assumptions listed and reviewed.
Safety properties stated as invariants.
Failure modes enumerated with mitigations.
