Benchmarking PQC: What to Measure (and What Not To)

Monthly research note. Theme: Post-Quantum Cryptography & Migration.

TL;DR

Treat PQC benchmarking as an engineering constraint: write down assumptions, make invariants executable, and design operational recovery as part of correctness.

Key insight

If the spec is implicit, the implementation becomes the spec—and you’ll learn it during incidents.

Key takeaways

Interop is the migration plan—test matrices are more important than whitepapers.
Hybrid composition must be explicit and transcript-bound to resist downgrade.
Migration is mixed-version for years: compatibility and rollback are security features.
Measure correctness signals, not only latency/throughput.
Write assumptions down; treat them as interfaces.

Why this matters

Migration will be mixed-version for years; plan for it explicitly.
Constant-time implementation is harder to achieve and to audit when primitives are large.
Operationalization (monitoring, rollback) determines success more than crypto choice.
Interop is the real risk: multiple stacks, vendors, and versions.

Key questions

How do you handle failures: decryption failures, invalid ciphertexts, malformed keys?
What telemetry proves PQC is working (not just enabled)?
How do you bind hybrid secrets to prevent downgrade and mix-and-match attacks?
Which parts must be constant-time, and how will you validate that?
What does interoperability testing look like across vendors and stacks?
How do you rotate algorithms safely (crypto agility without chaos)?

Assumptions

Vendors vary: implementations and defaults differ.
Side channels exist: timing and cache behavior leak information.
Deployments are mixed; old clients must interoperate or fail safely.
Bandwidth is limited in some environments; larger handshakes matter.
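
To make the bandwidth assumption concrete, the sketch below adds up key-share bytes for one hybrid exchange. The sizes are the published parameters for X25519 (32-byte public key) and ML-KEM-768 (1184-byte encapsulation key, 1088-byte ciphertext). Choosing that pairing is an illustrative assumption, and the function name is hypothetical. Signatures and certificates are left out, so real handshakes will be larger.

```python
# Key-share sizes in bytes. X25519 + ML-KEM-768 is only an example pairing.
X25519_PUBLIC_KEY = 32
MLKEM768_ENCAPS_KEY = 1184
MLKEM768_CIPHERTEXT = 1088


def hybrid_keyshare_bytes() -> dict:
    """Return per-direction key-share bytes for a classical + PQC hybrid exchange."""
    initiator = X25519_PUBLIC_KEY + MLKEM768_ENCAPS_KEY  # classical_keyshare + pqc_pk
    responder = X25519_PUBLIC_KEY + MLKEM768_CIPHERTEXT  # classical_keyshare + pqc_ct
    classical_only = 2 * X25519_PUBLIC_KEY               # one X25519 share each way
    total = initiator + responder
    return {
        "initiator": initiator,
        "responder": responder,
        "total": total,
        "growth_factor": total / classical_only,
    }
```

Under these assumptions the key shares alone grow from 64 bytes to 2336 bytes, which is why constrained links belong in the benchmark plan.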

Non-goals

Treating migration as a single flag flip.
Ignoring DoS implications of large primitives.

Attack surface

Observability pipelines can be attacked (cardinality explosions, log injection). Protect them.

Model & invariants

A post-quantum KEM establishes a shared secret without relying on discrete-log assumptions:

(\mathrm{pk},\mathrm{sk})\leftarrow \mathrm{KeyGen}();\ (\mathrm{ct},\mathrm{ss})\leftarrow \mathrm{Enc}(\mathrm{pk});\ \mathrm{ss}\leftarrow \mathrm{Dec}(\mathrm{sk},\mathrm{ct}).
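
One way to benchmark against this interface is to record a correctness signal next to the timings. The harness below is a minimal sketch: `ToyKEM` is an insecure, hash-based stand-in invented here only to give the harness something with the KeyGen/Enc/Dec shape, and `bench_kem` is a hypothetical name. A real run would swap in an actual KEM binding with the same three methods.

```python
import hashlib
import os
import time


class ToyKEM:
    """INSECURE stand-in with the KeyGen/Enc/Dec shape; not a real PQC scheme."""

    def keygen(self):
        sk = os.urandom(32)
        pk = hashlib.sha256(sk).digest()
        return pk, sk

    def enc(self, pk):
        ct = os.urandom(32)
        ss = hashlib.sha256(pk + ct).digest()
        return ct, ss

    def dec(self, sk, ct):
        pk = hashlib.sha256(sk).digest()
        return hashlib.sha256(pk + ct).digest()


def bench_kem(kem, iterations=100):
    """Time each operation and count round trips where Dec disagrees with Enc."""
    totals = {"keygen": 0.0, "enc": 0.0, "dec": 0.0}
    mismatches = 0
    for _ in range(iterations):
        t0 = time.perf_counter()
        pk, sk = kem.keygen()
        t1 = time.perf_counter()
        ct, ss_sender = kem.enc(pk)
        t2 = time.perf_counter()
        ss_receiver = kem.dec(sk, ct)
        t3 = time.perf_counter()
        totals["keygen"] += t1 - t0
        totals["enc"] += t2 - t1
        totals["dec"] += t3 - t2
        if ss_sender != ss_receiver:
            mismatches += 1  # correctness signal, reported alongside latency
    mean_seconds = {op: total / iterations for op, total in totals.items()}
    return {"mean_seconds": mean_seconds, "mismatches": mismatches}
```

Reporting `mismatches` in the same result as the timings keeps a fast but wrong implementation from looking like a win.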

Binding is the whole game: make the transcript an input to the KDF.

Treat algorithm negotiation as adversarial: explicit downgrade resistance.
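
A minimal sketch of what "adversarial negotiation" implies for the transcript: the offered algorithm list and the selected algorithm are hashed together with the key shares, so stripping an offer changes the hash that later feeds the KDF. The function name, the field order, and the algorithm labels are assumptions for illustration, and names are assumed not to contain commas.

```python
import hashlib


def transcript_hash(offered, selected, initiator_share, responder_share):
    """Hash negotiation and key-share messages with length-prefixed fields."""
    h = hashlib.sha256()
    fields = [
        ",".join(offered).encode(),  # everything the initiator offered
        selected.encode(),           # what the responder picked
        initiator_share,
        responder_share,
    ]
    for field in fields:
        # The length prefix keeps field boundaries unambiguous.
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.digest()


# An attacker who strips "hybrid" from the offer changes the transcript.
honest = transcript_hash(["hybrid", "classical"], "hybrid", b"share-a", b"share-b")
stripped = transcript_hash(["classical"], "classical", b"share-a", b"share-b")
assert honest != stripped
```

If each side hashes the messages as it saw them, a tampered offer produces mismatched keys and the handshake fails instead of silently downgrading.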

Invariant

Make the “impossible state” observable: a metric or alert that fires when invariants drift.
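
As a sketch of an observable invariant, the check below counts the case where both peers support hybrid but something else was selected. The counter, the function name, and the `"hybrid"` label are hypothetical; in practice the counter would be exported through whatever metrics system is already in place.

```python
from collections import Counter

# In-process stand-in for a metrics counter.
invariant_violations = Counter()


def check_negotiation(initiator_offers, responder_supports, selected):
    """Count the 'impossible' case: both sides support hybrid, yet it was not selected."""
    both_support_hybrid = (
        "hybrid" in initiator_offers and "hybrid" in responder_supports
    )
    if both_support_hybrid and selected != "hybrid":
        invariant_violations["hybrid_skipped"] += 1
        return False
    return True
```

An alert on `hybrid_skipped` rising above zero turns a silent downgrade into a page.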

Security properties

Least authority: privileges are scoped by purpose and time.
Evidence: critical actions emit verifiable audit events.
Replay resistance: duplicated inputs do not change outcomes.
Downgrade resistance: negotiation can’t silently weaken security posture.

Failure modes

Observability gaps during incidents (missing evidence).
Recovery paths that only work when nothing is broken.
Config drift that weakens security posture over time.
Mixed-version behavior that violates assumptions silently.

Pitfall

A recovery plan that isn’t exercised will fail when you need it.

Design sketch

sequenceDiagram
  participant A as Initiator
  participant B as Responder
  A->>B: classical_keyshare + pqc_pk
  B-->>A: classical_keyshare + pqc_ct + sig
  A-->>B: sig
  Note over A,B: ss = HKDF(ss_classical || ss_pqc, transcript)

Implementation notes

Explicit binding prevents downgrade and mix-and-match. Don’t leave it implicit.

Rule of thumb

Acknowledge only after durability (or make “ack” explicitly best-effort).

// Hybrid binding sketch (pseudocode):
// ss = HKDF(ss_classical || ss_pqc, info=transcript_hash)
// Then derive traffic keys from ss.
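
The pseudocode above can be made runnable with the standard library alone. This is a sketch, not a vetted implementation: it follows the HKDF extract-then-expand construction from RFC 5869 with SHA-256, uses an empty salt, and assumes both input secrets are fixed-length so that plain concatenation is unambiguous. The function names are hypothetical.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand."""
    if length > 255 * 32:
        raise ValueError("length too large for HKDF-SHA256")
    # Extract: an empty salt defaults to a block of zero bytes.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm = b""
    block = b""
    counter = 1
    # Expand: chain HMAC blocks until enough output is available.
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def hybrid_secret(ss_classical: bytes, ss_pqc: bytes, transcript_hash: bytes) -> bytes:
    """ss = HKDF(ss_classical || ss_pqc, info=transcript_hash)."""
    # Assumption: both secrets are fixed-length, so concatenation is unambiguous.
    return hkdf_sha256(ss_classical + ss_pqc, salt=b"", info=transcript_hash)
```

Because the transcript hash is the `info` input, two peers that saw different negotiation messages derive different secrets even when both key exchanges succeeded.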

Verification strategy

Side-channel tests where tooling exists; constant-time audits.
Downgrade tests: active attacker manipulates negotiation.
Chaos deploys: mixed versions + rollback during partial outages.
DoS tests: measure CPU/bandwidth amplification and mitigation impact.
Interop matrices across vendors/versions and failure modes.
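
An interop matrix is easy to enumerate mechanically, which helps keep coverage honest. The sketch below builds the cross product of initiator stack, responder stack, algorithm, and injected fault. All stack names, algorithm labels, and fault labels are placeholders.

```python
from itertools import product


def interop_matrix(stacks, algorithms, faults):
    """Enumerate every initiator/responder/algorithm/fault combination to test."""
    return [
        {"initiator": i, "responder": r, "algorithm": alg, "fault": fault}
        for i, r, alg, fault in product(stacks, stacks, algorithms, faults)
    ]


# Placeholder values; substitute the stacks and versions actually deployed.
cases = interop_matrix(
    stacks=["vendorA-1.0", "vendorA-2.0", "vendorB-1.0"],
    algorithms=["classical", "hybrid"],
    faults=["none", "malformed_key", "invalid_ciphertext"],
)
```

Three stacks, two algorithms, and three fault modes already yield 54 cases, which is a reminder to automate the matrix early.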

Operational notes

Cap handshake cost per peer/IP; use stateless cookies when needed.
Document supported algorithm sets and deprecation timelines.
Inventory long-lived secrets and migrate the highest-risk first.
Add telemetry for negotiation outcomes, failures, and client cohorts.
Roll out with canaries and explicit rollback triggers.
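
Capping handshake cost per peer can be sketched as a token bucket, where each handshake spends one token and tokens refill at a fixed rate. The class name and parameters are hypothetical, and the injectable `clock` exists only so the behavior can be tested deterministically.

```python
import time


class HandshakeBudget:
    """Per-peer token bucket: each handshake spends one token; tokens refill over time."""

    def __init__(self, rate_per_second, burst, clock=time.monotonic):
        self.rate = rate_per_second
        self.burst = burst
        self.clock = clock
        self.state = {}  # peer -> (tokens, last_seen)

    def allow(self, peer):
        now = self.clock()
        tokens, last = self.state.get(peer, (self.burst, now))
        # Refill for the elapsed time, never exceeding the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1:
            self.state[peer] = (tokens, now)
            return False
        self.state[peer] = (tokens - 1, now)
        return True
```

The per-peer table here grows without bound, so a production version would need eviction or a fixed-size structure. Otherwise the limiter itself becomes a DoS target.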

Operational note

Keep audit and config history queryable during incidents—evidence beats intuition.

What to monitor

Error budget burn + tail latency under load.
Authz failures and policy denials (unexpected spikes).
Retry/timeout rates by endpoint and client cohort.
Invariant violation rate (should be ~0).
Rollback events and the conditions that triggered them.

Rollback plan

Preserve evidence (configs, artifacts, audit logs) to reconstruct what changed.
Prefer backward-compatible changes; avoid “flag day” upgrades.
Define an explicit rollback trigger (metrics + thresholds).
Use canaries and staged rollout; stop early when signals degrade.
Keep dual-write / dual-verify windows where appropriate.
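
An explicit rollback trigger can be as small as a function from metrics to a list of breached thresholds. The metric names and default values below are hypothetical placeholders, not recommendations. The point of the sketch is that the decision is written down and testable.

```python
from dataclasses import dataclass


@dataclass
class RollbackThresholds:
    # All defaults are hypothetical placeholders.
    max_handshake_failure_rate: float = 0.02
    max_p99_latency_ms: float = 250.0
    max_invariant_violations: int = 0


def rollback_reasons(metrics: dict, limits: RollbackThresholds) -> list:
    """Return the breached thresholds; a non-empty list means roll back."""
    reasons = []
    if metrics["handshake_failure_rate"] > limits.max_handshake_failure_rate:
        reasons.append("handshake_failure_rate")
    if metrics["p99_latency_ms"] > limits.max_p99_latency_ms:
        reasons.append("p99_latency_ms")
    if metrics["invariant_violations"] > limits.max_invariant_violations:
        reasons.append("invariant_violations")
    return reasons
```

Returning the reasons instead of a bare boolean preserves evidence of why a rollback fired.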

Evidence

Learn TLA+ (1) — Practical entry point for specification and model checking.
- Evidence: Model the smallest thing that can break; use model checking to validate invariants before optimizing.
NIST Post-Quantum Cryptography Project (2) — Standardization process and algorithm selections.
- Evidence: Treat PQ migration as a program (inventory, interop, rollback). Use NIST status to drive prioritization and timelines.

Open questions

What is the worst-case handshake cost under attack?
How do you rotate algorithms without introducing configuration chaos?
Which clients will fail first, and what is the safe fallback behavior?
Where would a downgrade be visible today, and how would you detect it?

Checklist

Telemetry captures correctness signals.
Costs bounded (CPU/memory/bandwidth) under adversarial inputs.
Rollback plan rehearsed and automated.
Assumptions listed and reviewed.
Failure modes enumerated with mitigations.
Safety properties stated as invariants.
