Verifiable Computation as Infrastructure: Proof Systems at Scale

Monthly research note. Theme: Deep Systems Notes.

TL;DR

Treat verifiable computation as an engineering constraint, not only a cryptographic one: write down assumptions, make invariants executable, and design operational recovery as part of correctness.

Key insight

If the spec is implicit, the implementation becomes the spec—and you’ll learn it during incidents.

Key takeaways

Interfaces must carry assumptions: time, randomness, identity, and ordering.
Operational behavior is part of correctness: rollout, rollback, and evidence.
Integration boundaries are where proofs evaporate; treat them as first-class.
Define safety properties before performance goals.
Write assumptions down; treat them as interfaces.

Why this matters

Most real failures happen at integration boundaries, not inside components.
Operational behavior is part of correctness (rollouts, rollbacks, drift).
Mixed-version operation creates states you didn’t model.
Resilience requires making failure modes explicit and bounded.

Key questions

How do you keep 'security properties' visible to operators and SREs?
Where does 'correctness' become an operational contract (SLOs, budgets, policy)?
What are your compositional failure modes (partial deploys, mixed versions)?
What is the smallest integration test that can falsify your assumptions?
Which proofs are worth maintaining vs replacing with tests and monitoring?
How do you prevent 'optional security' from appearing via config drift?

Assumptions

Components are built by different teams with different threat models.
Upgrades are incremental; compatibility is a security boundary.
Integration happens under time pressure; defaults become de facto policy.
Observability is imperfect; you debug from partial evidence.

Non-goals

Allowing config to silently weaken security properties.
Relying on “tribal knowledge” to connect assumptions across layers.

Attack surface

Observability pipelines can be attacked (cardinality explosions, log injection). Protect them.

Model & invariants

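Composability is the promise that proofs survive integration. When a composition theorem applies, the adversary's advantage against the composed protocol is at most the sum of its advantages against the parts: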

\mathrm{Adv}_{\Pi_1\circ \Pi_2} \le \mathrm{Adv}_{\Pi_1} + \mathrm{Adv}_{\Pi_2}.

Make assumptions executable: encode them as assertions, tests, and run-time checks.

Choose what to prove and what to monitor. Both are necessary in practice.
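One way to make assumptions executable is to keep them in a named registry that runs at startup, in tests, and as a periodic check. The sketch below is illustrative only: the assumption names, the `env` keys, and the 1 MiB limit are hypothetical and not taken from any particular system.

```python
from typing import Callable, Dict, List

# An assumption is a named predicate over the running environment.
Assumption = Callable[[dict], bool]

# Hypothetical assumptions for a service that verifies proofs.
ASSUMPTIONS: Dict[str, Assumption] = {
    "proof_size_bounded": lambda env: env["max_proof_bytes"] <= 1 << 20,
    "verifier_required": lambda env: env["verify_proofs"] is True,
}


def violated(env: dict) -> List[str]:
    """Return the names of assumptions that do not hold in env, sorted."""
    return sorted(name for name, check in ASSUMPTIONS.items() if not check(env))


def assert_assumptions(env: dict) -> None:
    """Fail fast (for example at startup) if any assumption is violated."""
    broken = violated(env)
    if broken:
        raise RuntimeError(f"assumptions violated: {broken}")
```

The same `violated` function can feed a test suite and a monitoring gauge, so the written assumption and the enforced one cannot drift apart.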

Invariant

Monotonicity beats timestamps: counters and epochs survive clock skew.
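A minimal sketch of this invariant, assuming updates carry an `(epoch, counter)` pair instead of a wall-clock timestamp. The class and field names are hypothetical; the point is that acceptance depends only on the pair strictly advancing.

```python
from dataclasses import dataclass


class StaleUpdate(Exception):
    """Raised when an update does not advance the (epoch, counter) pair."""


@dataclass
class VersionedState:
    epoch: int = 0    # assumed to be bumped on leadership or config change
    counter: int = 0  # assumed to be bumped on every update within an epoch
    value: str = ""

    def apply(self, epoch: int, counter: int, value: str) -> None:
        # Tuple comparison is lexicographic: epoch first, then counter.
        if (epoch, counter) <= (self.epoch, self.counter):
            raise StaleUpdate(
                f"({epoch}, {counter}) does not advance "
                f"({self.epoch}, {self.counter})"
            )
        self.epoch, self.counter, self.value = epoch, counter, value
```

Replays and reordered deliveries are rejected regardless of what any node's clock says.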

Security properties

Integrity: invalid transitions are rejected (and detectable).
Authenticity: actions are bound to identity and purpose.
Evidence: critical actions emit verifiable audit events.
Downgrade resistance: negotiation can’t silently weaken security posture.
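Downgrade resistance can be sketched as a negotiation with a hard floor. The function below is an illustration under assumptions: integer protocol versions and a locally configured `min_version` are hypothetical, and a real protocol would also authenticate the negotiation transcript.

```python
from typing import Set


class DowngradeRefused(Exception):
    """Raised when no mutually supported version meets the local floor."""


def negotiate(local: Set[int], remote: Set[int], min_version: int) -> int:
    """Pick the highest common version, refusing anything below the floor."""
    acceptable = {v for v in local & remote if v >= min_version}
    if not acceptable:
        # Fail loudly instead of silently falling back to a weaker version.
        raise DowngradeRefused(
            f"no common version >= {min_version}; "
            f"local={sorted(local)} remote={sorted(remote)}"
        )
    return max(acceptable)
```

The refusal is a typed, observable failure, which keeps the weakened path out of reach of config defaults.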

Failure modes

Mixed-version behavior that violates assumptions silently.
Timeout ambiguity causing double-apply or partial state transitions.
Resource exhaustion (CPU/bandwidth/storage) turning into correctness failures.
Config drift that weakens security posture over time.
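Timeout ambiguity is commonly bounded with idempotency keys: a client that cannot tell whether a request was applied retries with the same key, and the server applies it at most once. A minimal in-memory sketch, with a hypothetical `Ledger` standing in for real durable state:

```python
from typing import Dict


class Ledger:
    def __init__(self) -> None:
        self.balance = 0
        # Idempotency key -> balance observed right after the first apply.
        self._seen: Dict[str, int] = {}

    def credit(self, key: str, amount: int) -> int:
        """Apply a credit at most once per key; retries return the first result."""
        if key in self._seen:
            return self._seen[key]
        self.balance += amount
        self._seen[key] = self.balance
        return self.balance
```

In a real system the key table and the state change would need to commit atomically; this sketch assumes a single process and no crashes.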

Pitfall

A recovery plan that isn’t exercised will fail when you need it.

Design sketch

flowchart LR
  boundary["Boundary"] --> contract["Contract (P -> Q)"]
  contract --> test["Tests"]
  test --> monitor["Monitoring"]
  monitor --> incident["Incident"]
  incident --> contract

Implementation notes

If it’s not enforced, it’s not a contract.

Rule of thumb

Acknowledge only after durability (or make “ack” explicitly best-effort).
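As a sketch of this rule, assuming a single append-only log file on a POSIX-like filesystem: the handler returns its acknowledgement only after the record has been flushed and `fsync`ed. The function names and the newline-delimited record format are hypothetical.

```python
import os


def durable_append(path: str, record: bytes) -> None:
    """Append one record and force it to stable storage before returning."""
    with open(path, "ab") as f:
        f.write(record + b"\n")
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to persist the file contents


def handle_write(path: str, record: bytes) -> str:
    durable_append(path, record)
    # Only reached if the write and fsync did not raise.
    return "ack"
```

On some filesystems a newly created file also needs an `fsync` on its parent directory before the entry itself is durable; that step is omitted here.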

Boundary contract template:
Preconditions (P):
- input validation, size limits, auth context
- monotonic versions / idempotency keys
Postconditions (Q):
- durable state transitions
- evidence emitted (audit/metrics)
Failure modes:
- explicit, typed, and observable
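The template above can be enforced in code rather than documented beside it. The decorator below is one possible sketch: `contract`, `ContractViolation`, `MAX_BYTES`, and the `submit` handler with its receipt fields are all hypothetical names chosen for illustration.

```python
import functools
from typing import Any, Callable


class ContractViolation(Exception):
    """Typed, observable failure for a broken pre- or postcondition."""


def contract(pre: Callable[..., bool], post: Callable[[Any], bool]):
    """Wrap a boundary function so P is checked on entry and Q on exit."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            if not pre(*args, **kwargs):
                raise ContractViolation(f"precondition failed: {fn.__name__}")
            result = fn(*args, **kwargs)
            if not post(result):
                raise ContractViolation(f"postcondition failed: {fn.__name__}")
            return result
        return inner
    return wrap


MAX_BYTES = 1024  # hypothetical size limit


@contract(
    pre=lambda payload: isinstance(payload, bytes) and 0 < len(payload) <= MAX_BYTES,
    post=lambda receipt: receipt["durable"] and receipt["audit_id"] is not None,
)
def submit(payload: bytes) -> dict:
    # Stand-in for a real durable write plus audit event.
    return {"durable": True, "audit_id": f"audit-{len(payload)}"}
```

Calling `submit(b"")` raises `ContractViolation` before any state is touched, which is the behavior the template asks for.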

Verification strategy

End-to-end property tests for the smallest meaningful workflow.
Upgrade tests for mixed-version and rollback scenarios.
Fault injection at seams (queues, caches, RPC) not only components.
Invariant monitoring tied to incident response playbooks.
Contract tests at boundaries with adversarial inputs and skew.
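A contract test with adversarial inputs can be small. The sketch below fuzzes a hypothetical length-prefixed frame decoder with random bytes and checks one property: every input is either rejected with a typed error or accepted within bounds and round-trips exactly. The frame format, `MAX_LEN`, and all names are assumptions for illustration; a property-testing library would serve the same purpose.

```python
import random
import struct

MAX_LEN = 64  # hypothetical payload limit


class RejectedInput(ValueError):
    """The only failure mode the boundary is allowed to expose."""


def decode_frame(data: bytes) -> bytes:
    """Frame = 4-byte big-endian length followed by exactly that many bytes."""
    if len(data) < 4:
        raise RejectedInput("short header")
    (n,) = struct.unpack(">I", data[:4])
    if n > MAX_LEN:
        raise RejectedInput("declared length exceeds limit")
    if len(data) - 4 != n:
        raise RejectedInput("length mismatch")
    return data[4:]


def fuzz(trials: int = 1000, seed: int = 0) -> int:
    """Return how many random inputs were accepted; raise if the property fails."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(trials):
        blob = bytes(rng.getrandbits(8) for _ in range(rng.randint(0, 80)))
        try:
            payload = decode_frame(blob)
        except RejectedInput:
            continue  # explicit, typed rejection is fine
        assert len(payload) <= MAX_LEN
        assert struct.pack(">I", len(payload)) + payload == blob
        accepted += 1
    return accepted
```

Any other exception escaping `decode_frame` fails the run, which is the falsification signal the test is looking for.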

Operational notes

Treat config drift as an incident: detect, alert, and remediate.
Maintain runbooks that reference invariants, not just symptoms.
Store evidence: audit logs, config diffs, and deployment metadata.
Make security and correctness properties observable (metrics + alerts).
Use canaries for protocol and crypto changes; define rollback triggers.
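Drift detection can start as a comparison between the approved configuration and what is actually running. A minimal sketch, assuming configs are flat JSON-serializable dicts; the function names are hypothetical.

```python
import hashlib
import json
from typing import List

_MISSING = object()  # distinguishes "key absent" from "value is None"


def fingerprint(config: dict) -> str:
    """Stable hash of a config: key order does not affect the result."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drifted_keys(approved: dict, running: dict) -> List[str]:
    """Keys whose value differs, or that exist on only one side."""
    keys = set(approved) | set(running)
    return sorted(
        k for k in keys
        if approved.get(k, _MISSING) != running.get(k, _MISSING)
    )
```

Comparing fingerprints is cheap enough to run continuously; `drifted_keys` gives the alert something actionable to say.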

Operational note

Keep audit and config history queryable during incidents—evidence beats intuition.
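One way to make stored evidence tamper-evident is a hash chain over audit events, so that editing or dropping an earlier entry breaks verification of everything after it. The sketch below is an assumption-laden illustration (in-memory list, SHA-256, JSON-serializable events), not a description of any specific audit system.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev + body).encode("utf-8")).hexdigest()


def append_event(log: List[dict], event: dict) -> dict:
    """Append an event linked to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev, "event": event, "hash": _digest(prev, event)}
    log.append(entry)
    return entry


def verify_chain(log: List[dict]) -> bool:
    """Recompute every link; any edit or removal inside the chain returns False."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True
```

Truncation at the tail is not detected by the chain alone; publishing or signing the latest hash out of band is the usual complement.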

What to monitor

Error budget burn + tail latency under load.
Rollback events and the conditions that triggered them.
Admission-control / rate-limit rejections (by reason).
Retry/timeout rates by endpoint and client cohort.
Authz failures and policy denials (unexpected spikes).

Rollback plan

Use canaries and staged rollout; stop early when signals degrade.
Define an explicit rollback trigger (metrics + thresholds).
Keep dual-write / dual-verify windows where appropriate.
Preserve evidence (configs, artifacts, audit logs) to reconstruct what changed.
Prefer backward-compatible changes; avoid “flag day” upgrades.
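An explicit rollback trigger can be expressed as data plus one function. In the sketch below the metric names and thresholds are hypothetical, and a missing metric is treated as a breach so that a broken telemetry pipeline fails closed.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Trigger:
    metric: str
    threshold: float  # roll back when the observed value exceeds this


# Hypothetical triggers for a canary that verifies proofs.
TRIGGERS = [
    Trigger("error_rate", 0.02),
    Trigger("p99_latency_ms", 800.0),
    Trigger("verify_failures_per_min", 0.0),
]


def should_rollback(canary: Dict[str, float]) -> List[str]:
    """Return the breached metrics; a non-empty list means roll back."""
    return [
        t.metric
        for t in TRIGGERS
        if canary.get(t.metric, float("inf")) > t.threshold
    ]
```

Recording the returned list alongside the rollback event preserves the conditions that triggered it.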

Evidence

Learn TLA+ (1) — Practical entry point for specification and model checking.
- Evidence: Model the smallest thing that can break; use model checking to validate invariants before optimizing.
Site Reliability Engineering (Google) (2) — Error budgets, incident response, and reliability as an engineering discipline.
- Evidence: Error budgets and incident response are correctness controls; tie monitoring and rollback triggers to SLO burn.

Open questions

What boundary is most likely to be bypassed under incident pressure?
Which assumptions do you currently enforce only through convention?
Which properties can be proven locally vs only tested end-to-end?
Where can config silently weaken security properties today?

Checklist

Safety properties stated as invariants.
Assumptions listed and reviewed.
Costs bounded (CPU/memory/bandwidth) under adversarial inputs.
Rollback plan rehearsed and automated.
Telemetry captures correctness signals.
Failure modes enumerated with mitigations.
