Designing APIs for Correctness: Types, Lifetimes, and Capabilities

Monthly research note. Theme: Formal Methods & Verification.

TL;DR

Treat API correctness (types, lifetimes, and capabilities) as an engineering constraint: write down assumptions, make invariants executable, and design operational recovery as part of correctness.

Key insight

If the spec is implicit, the implementation becomes the spec—and you’ll learn it during incidents.

Key takeaways

Keep models small enough to run in seconds or they will rot.
Refinement boundaries prevent spec drift between paper and code.
Write properties in plain language next to the formal statement.
Write assumptions down; treat them as interfaces.
Measure correctness signals, not only latency/throughput.

Why this matters

Formal models force you to name assumptions (time, ordering, failure).
Refinement boundaries prevent “spec drift” between paper and code.
Counterexamples are better than intuition—they are executable bug reports.
The goal is not a perfect proof—it’s reducing the space of unknown failure modes.

Key questions

How do you handle state explosion (symmetry, abstraction, bounds)?
How do you ensure proofs stay valid through refactors and upgrades?
What is the environment model (adversary actions, scheduling, failures)?
Which invariants must hold under every interleaving and crash point?
Which properties belong in the model vs in tests vs in monitoring?
What is the refinement boundary between spec and implementation?

Assumptions

Adversaries choose the worst schedule, not the average one.
Most systems have implicit assumptions about timeouts and ordering.
Concurrency introduces interleavings humans don’t reason about reliably.
Specifications omit details; implementations invent them. That gap is risk.

Non-goals

Assuming the spec and the code share the same definitions implicitly.
Writing models that can’t produce counterexamples quickly.

Attack surface

Parsing is an attacker-controlled interface—validate early and fail fast.

Model & invariants

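Refinement is established by a simulation relation between spec and implementation; it implies that every behavior of the implementation is also a behavior of the spec:

\mathrm{Impl} \sqsubseteq \mathrm{Spec} \quad\Rightarrow\quad \text{behaviors}(\mathrm{Impl}) \subseteq \text{behaviors}(\mathrm{Spec}).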

Keep the model small enough to run in seconds; large models rot.

Treat counterexamples as regression tests: reduce, encode, and replay.
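As a rough illustration of a model that runs in well under a second and emits a replayable counterexample, here is a minimal sketch of a breadth-first checker. The lease model, the state layout `(server_owner, a_believes, b_believes)`, and every name in it are hypothetical and chosen only for this example; a real workflow would more likely use a dedicated model checker.

```python
from collections import deque

def check(init, actions, invariant, max_depth=10):
    """Breadth-first search; return the shortest trace that violates `invariant`, or None."""
    frontier = deque([(init, [])])
    seen = {init}
    while frontier:
        state, trace = frontier.popleft()
        if not invariant(state):
            return trace
        if len(trace) >= max_depth:
            continue  # explicit bound keeps the run time predictable
        for name, step in actions.items():
            nxt = step(state)  # None means the action is not enabled
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, trace + [name]))
    return None

def make_actions(revoke_on_expire):
    """Hypothetical lease model: state is (server_owner, a_believes, b_believes)."""
    def acquire_a(s):
        owner, a, b = s
        return ("a", True, b) if owner is None else None

    def acquire_b(s):
        owner, a, b = s
        return ("b", a, True) if owner is None else None

    def server_expire(s):
        owner, a, b = s
        if owner is None:
            return None
        # The buggy variant frees the lease without telling the clients.
        return (None, False, False) if revoke_on_expire else (None, a, b)

    return {"acquire_a": acquire_a, "acquire_b": acquire_b, "server_expire": server_expire}

def at_most_one_holder(s):
    _, a, b = s
    return not (a and b)

INIT = (None, False, False)
counterexample = check(INIT, make_actions(revoke_on_expire=False), at_most_one_holder)
print(counterexample)  # -> ['acquire_a', 'server_expire', 'acquire_b']
```

The returned list of action names is the artifact worth keeping: stored next to the code, it can be replayed against both the model and the implementation after every change.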

Invariant

Monotonicity beats timestamps: counters and epochs survive clock skew.
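One way to read this invariant is as a fencing rule: a writer presents an epoch number, and the store rejects anything older than the highest epoch it has accepted. The sketch below assumes a hypothetical `FencedRegister` class; no wall-clock time is compared anywhere.

```python
from dataclasses import dataclass

@dataclass
class FencedRegister:
    """Accepts a write only if its epoch is not older than the highest epoch accepted so far."""
    value: str = ""
    epoch: int = 0

    def write(self, epoch: int, value: str) -> bool:
        if epoch < self.epoch:
            return False  # stale writer: rejected by counter comparison, not by timestamp
        self.epoch = epoch
        self.value = value
        return True

reg = FencedRegister()
reg.write(1, "from old leader")
reg.write(2, "from new leader")
accepted = reg.write(1, "late write from old leader")
print(accepted, reg.value)  # -> False from new leader
```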

Security properties

Authenticity: actions are bound to identity and purpose.
Integrity: invalid transitions are rejected (and detectable).
Replay resistance: duplicated inputs do not change outcomes.
Least authority: privileges are scoped by purpose and time.
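The least-authority property can be made concrete with a capability value that names one subject, one purpose, and an expiry. This is only a sketch: `Capability` and `authorize` are hypothetical names, the expiry is a logical counter passed in by the caller, and the token is not signed, so a real system would also need to make it unforgeable.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Capability:
    subject: str    # identity the capability is bound to
    purpose: str    # the single action it authorizes
    not_after: int  # logical expiry (for example an epoch counter)

def authorize(cap: Capability, subject: str, action: str, now: int) -> bool:
    """Grant only when identity, purpose, and validity window all match."""
    return cap.subject == subject and cap.purpose == action and now <= cap.not_after

cap = Capability(subject="svc-billing", purpose="read:invoices", not_after=10)
print(authorize(cap, "svc-billing", "read:invoices", now=5))   # -> True
print(authorize(cap, "svc-billing", "write:invoices", now=5))  # -> False
print(authorize(cap, "svc-billing", "read:invoices", now=11))  # -> False
```

Because the dataclass is frozen, a holder cannot widen the purpose or extend the expiry in place; a broader grant has to be issued as a new value.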

Failure modes

Config drift that weakens security posture over time.
Resource exhaustion (CPU/bandwidth/storage) turning into correctness failures.
Timeout ambiguity causing double-apply or partial state transitions.
Mixed-version behavior that violates assumptions silently.
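Timeout ambiguity is the failure mode that most directly motivates replay resistance: the client cannot tell whether a timed-out request was applied, so it retries. A minimal sketch, assuming a hypothetical in-memory `Ledger` keyed by client-supplied request IDs:

```python
class Ledger:
    """Applies each request ID at most once; a retry returns the original result."""

    def __init__(self):
        self.balance = 0
        self._applied = {}  # request_id -> balance returned by the first application

    def deposit(self, request_id: str, amount: int) -> int:
        if request_id in self._applied:
            return self._applied[request_id]  # duplicate: no state change
        self.balance += amount
        self._applied[request_id] = self.balance
        return self.balance

ledger = Ledger()
first = ledger.deposit("req-1", 10)
retry = ledger.deposit("req-1", 10)  # client retried after a timeout
print(first, retry, ledger.balance)  # -> 10 10 10
```

In a durable system the deduplication table and the balance would need to be updated atomically; this sketch sidesteps that by keeping both in one process.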

Pitfall

Sampling hides the rare schedule that breaks your invariants.
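For small models the alternative to sampling is to enumerate every schedule. The sketch below does this for two threads that each perform a non-atomic increment (a read followed by a write); the thread names and the `run` helper are illustrative assumptions.

```python
def interleavings(a, b):
    """Yield every merge of a and b that preserves the internal order of each."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def run(schedule):
    """Execute the schedule against a shared counter and return its final value."""
    shared = 0
    local = {}
    for thread, op in schedule:
        if op == "read":
            local[thread] = shared
        else:  # "write"
            shared = local[thread] + 1
    return shared

A = [("A", "read"), ("A", "write")]
B = [("B", "read"), ("B", "write")]
schedules = list(interleavings(A, B))
lost_updates = [s for s in schedules if run(s) != 2]
print(len(schedules), len(lost_updates))  # -> 6 4
```

Here the failing schedules are common, but the same enumeration finds a violation even when only one schedule in thousands breaks the invariant, which is exactly the case random sampling tends to miss.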

Design sketch

flowchart LR
  spec["Spec (TLA+/PlusCal)"] --> mc["Model Check"]
  mc --> refine["Refinement / Invariants"]
  refine --> impl["Implementation (Rust/Go)"]
  impl --> tests["Fuzz / PBT / Differential"]
  tests --> spec

Implementation notes

Make the model executable enough to generate counterexamples quickly.

Rule of thumb

Bound work per request: parse, validate, and cap cost before you allocate heavy resources.
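A minimal sketch of this rule for a JSON request body follows. The limits, the `items` field, and the `RejectedRequest` type are assumptions made for illustration; the point is the ordering: the cheapest check runs first, and nothing expensive happens until the input is known to be bounded.

```python
import json

MAX_BODY_BYTES = 4096  # hypothetical limits for illustration
MAX_ITEMS = 100

class RejectedRequest(ValueError):
    """Raised when a request fails validation before any heavy work starts."""

def parse_request(raw: bytes) -> list:
    # 1. Cap cost: the size check is O(1) and runs before parsing.
    if len(raw) > MAX_BODY_BYTES:
        raise RejectedRequest("body too large")
    # 2. Parse, treating malformed or pathologically nested input as a rejection.
    try:
        doc = json.loads(raw)
    except (ValueError, RecursionError) as exc:
        raise RejectedRequest("malformed JSON") from exc
    # 3. Validate the shape and bound the work the handler will do later.
    items = doc.get("items") if isinstance(doc, dict) else None
    if not isinstance(items, list) or len(items) > MAX_ITEMS:
        raise RejectedRequest("items must be a list of bounded length")
    if not all(isinstance(x, int) and not isinstance(x, bool) for x in items):
        raise RejectedRequest("items must be integers")
    return items

print(parse_request(b'{"items": [1, 2, 3]}'))  # -> [1, 2, 3]
```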

// Practical tip: make the model "executable" enough to emit traces you can replay.
// Then treat traces as regression inputs for your implementation.

Verification strategy

Refinement tests: compare model traces to implementation traces.
Property-based tests derived from invariants.
Model checking bounded versions of the core protocol.
Differential tests against other implementations/specs.
Runtime assertions for invariants that are cheap to check.
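Refinement tests and property-based tests can share one harness: generate random operation sequences, run them against both a simple model and the implementation, and report the first trace on which they disagree. The sketch below uses only the standard library with fixed seeds; `RingQueue` and `ModelQueue` are hypothetical stand-ins for an implementation and its model.

```python
import random

class RingQueue:
    """Implementation under test: fixed-capacity FIFO backed by a ring buffer."""
    def __init__(self, capacity):
        self._buf = [None] * capacity
        self._head = 0
        self._size = 0

    def push(self, x):
        if self._size == len(self._buf):
            return False
        self._buf[(self._head + self._size) % len(self._buf)] = x
        self._size += 1
        return True

    def pop(self):
        if self._size == 0:
            return None
        x = self._buf[self._head]
        self._head = (self._head + 1) % len(self._buf)
        self._size -= 1
        return x

class ModelQueue:
    """Reference model: the obvious list-based FIFO."""
    def __init__(self, capacity):
        self._cap, self._items = capacity, []

    def push(self, x):
        if len(self._items) == self._cap:
            return False
        self._items.append(x)
        return True

    def pop(self):
        return self._items.pop(0) if self._items else None

def find_divergence(seed, steps=200, capacity=3):
    """Return the operation trace up to the first disagreement, or None if there is none."""
    rng = random.Random(seed)
    impl, model = RingQueue(capacity), ModelQueue(capacity)
    trace = []
    for _ in range(steps):
        op = ("push", rng.randrange(10)) if rng.random() < 0.6 else ("pop",)
        trace.append(op)
        got = impl.push(op[1]) if op[0] == "push" else impl.pop()
        want = model.push(op[1]) if op[0] == "push" else model.pop()
        if got != want:
            return trace
    return None

divergences = [find_divergence(seed) for seed in range(50)]
print(all(d is None for d in divergences))
```

Any non-`None` result is a replayable trace, so it can go straight into the regression suite.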

Operational notes

Version properties and invariants like code; review changes carefully.
Use models to evaluate protocol upgrades before shipping.
Treat counterexamples as incidents: track, root-cause, regression-test.
Run the model checker in CI with explicit timeouts and bounds.
Keep a library of “known hard schedules” from past failures.

Operational note

Attach explicit rollout/rollback triggers to changes that touch security or correctness.

What to monitor

Retry/timeout rates by endpoint and client cohort.
Error budget burn + tail latency under load.
Invariant violation rate (should be ~0).
Authz failures and policy denials (unexpected spikes).
Admission-control / rate-limit rejections (by reason).
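The invariant violation rate can be collected with very little machinery. A minimal sketch, assuming a hypothetical `InvariantMonitor` that counts failures instead of crashing the process, so the signal can be exported like any other metric:

```python
from collections import Counter

class InvariantMonitor:
    """Counts invariant checks and violations by name."""

    def __init__(self):
        self.checks = 0
        self.violations = Counter()

    def check(self, name: str, holds: bool) -> bool:
        self.checks += 1
        if not holds:
            self.violations[name] += 1
        return holds

    def violation_rate(self) -> float:
        if self.checks == 0:
            return 0.0
        return sum(self.violations.values()) / self.checks

monitor = InvariantMonitor()
for balance in (10, 0, -5, 3):
    monitor.check("balance_non_negative", balance >= 0)
print(monitor.violation_rate())  # -> 0.25
```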

Rollback plan

Prefer backward-compatible changes; avoid “flag day” upgrades.
Preserve evidence (configs, artifacts, audit logs) to reconstruct what changed.
Keep dual-write / dual-verify windows where appropriate.
Define an explicit rollback trigger (metrics + thresholds).
Use canaries and staged rollout; stop early when signals degrade.
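An explicit rollback trigger can be as small as a table of metrics and thresholds evaluated against the canary. The metric names and limits below are hypothetical; the sketch treats a missing metric as a breach, on the assumption that absent telemetry should not be read as healthy.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Threshold:
    metric: str
    max_value: float

# Hypothetical thresholds for illustration only.
TRIGGERS = [
    Threshold("invariant_violation_rate", 0.0),
    Threshold("error_rate", 0.01),
    Threshold("p99_latency_ms", 500.0),
]

def breached_triggers(metrics: dict, triggers=TRIGGERS) -> list:
    """Return the names of breached metrics; an empty list means the rollout may continue."""
    breached = []
    for t in triggers:
        value = metrics.get(t.metric)
        if value is None or value > t.max_value:
            breached.append(t.metric)
    return breached

canary = {"invariant_violation_rate": 0.0, "error_rate": 0.03, "p99_latency_ms": 210.0}
print(breached_triggers(canary))  # -> ['error_rate']
```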

Evidence

Learn TLA+ (1) — Practical workflow and examples.
- Evidence: Model the smallest thing that can break; use model checking to validate invariants before optimizing.
Jepsen (2) — Fault injection and correctness testing for distributed systems.
- Evidence: Turn faults into test cases; prioritize partition and clock-skew scenarios that violate user-visible guarantees.

Open questions

Which properties are you currently assuming but not testing or proving?
Which invariants are cheap enough to monitor in production?
How will you keep models aligned during rapid iteration?
What is the smallest model that reproduces your worst incident class?

Checklist

Costs bounded (CPU/memory/bandwidth) under adversarial inputs.
Rollback plan rehearsed and automated.
Telemetry captures correctness signals.
Safety properties stated as invariants.
Assumptions listed and reviewed.
Failure modes enumerated with mitigations.
