Paper-driven research note. Theme: PQC migration that fails at the certificate boundary.

TL;DR

If you read post-quantum TLS plans as “replace ECDSA with some PQ signature”, you will build the wrong system.

In TLS 1.3, the end-entity (leaf) certificate key is not just an identity anchor — it is the key that signs the live handshake transcript (CertificateVerify). That single design fact turns “which signature algorithm lives in the leaf” into a hot-path engineering decision with direct DoS consequences.

Delgado Jiménez (arXiv:2604.06100) runs a clean local experiment matrix on OpenSSL 3 + oqsprovider, varying where ML-DSA and SLH-DSA appear in the certificate hierarchy. The result is a discontinuity you cannot hand-wave away:

  • an all-ML-DSA chain (the baseline) measures ~0.809 ms mean handshake latency and ~0.562 ms server task-clock per handshake
  • moving SLH-DSA into the server leaf produces ~1402 ms mean latency (≈ 1733× the baseline) and ~1401 ms server task-clock per handshake (≈ 2494× the baseline)
  • bytes transferred grow only ~1.69×, so the collapse is not “just bigger certificates”; it is online signing cost. (1)

Key insight

In PQ TLS, “placement” is a performance and security boundary. The leaf algorithm determines online server signing cost; upper-layer algorithms mostly shift validation work to clients. This is cost concentration, not algorithm substitution.

Key takeaways

  • Leaf SLH-DSA is an online CPU collapse. In the paper’s matrix, mean handshake latency jumps from ~0.8 ms (all-ML) to ~1400 ms (leaf-SLH), a factor of ≈ 1733×. (1)
  • Upper-layer SLH-DSA is penalized but plausible. Root-SLH / leaf-ML increases latency to ~2.133 ms (≈ 2.64×) while server task-clock rises only ~1.19×. (1)
  • Transport size is a second-order effect in the heavy regime. Leaf-SLH reads ~27,015 bytes vs ~16,008 baseline (≈ 1.69×) while server CPU rises ≈ 2494×. (1)
  • Client/server work distribution changes by placement. Upper-layer SLH shifts active work toward client validation; leaf-SLH becomes overwhelmingly server-bound. (1)
  • PQC rollout must be evaluated as PKI+TLS design. Chain exposure, depth, caching, compression, and resumption interact with cryptographic cost in ways primitive benchmarks cannot predict. (2) (3) (4)

Introduction (pragmatic abstract: the infrastructure problem)

The real question your pager asks is not “is ML-DSA post-quantum secure?”

It is: “Can my TLS front-end authenticate at peak load without becoming a self-inflicted CPU DoS?”

In classical TLS deployments, RSA/ECDSA/Ed25519 signing costs are low enough that we tend to blame handshakes on network RTT, certificate chain size, or cache misses. Post-quantum signatures break that mental model because they are not a single family with smooth tradeoffs:

  • ML-DSA (FIPS 204) is lattice-based and engineered to be deployable in interactive authentication. (5)
  • SLH-DSA (FIPS 205) is stateless hash-based and conservative, but its performance profile is fundamentally different. (6)

The paper’s claim is not theoretical: it is deployment-shaped.

“Post-quantum migration in TLS 1.3 should not be understood as a flat substitution problem … [it] depends on where it appears in the certification hierarchy … and how cryptographic burden is distributed across client and server roles.” (1)

That is the right framing. TLS is not “a signature benchmark”; it is an authenticated key-establishment protocol with roles, state, and adversaries.

Assumptions

  • TLS 1.3 full handshakes with certificate-based server authentication. (2)
  • X.509 certification hierarchies (root → intermediate → leaf). (3)
  • Threat model includes adversarial handshakes (flooding, forced full handshakes, cache bypass). Availability is a security property.
  • I treat the paper’s lab measurements as a signal, not as a universal constant: implementation quality and hardware matter, but order-of-magnitude discontinuities are not noise.
  • Focus is on server-authentication (no mutual TLS), because that is where “internet scale” lives.

Non-goals

  • Re-proving TLS 1.3 security. This is about operational correctness under PQ parameter sets.
  • Modeling global internet pathologies (loss, reordering, congestion collapse). The paper’s lab is local; I’ll critique that explicitly.
  • Claiming “SLH-DSA is unusable”. The claim is narrower: SLH-DSA in the interactive leaf is operationally toxic for front-ends in the measured regimes.

Security properties

TLS security is not only confidentiality/authenticity. Under active adversaries, availability is a cryptographic boundary because authentication work is attacker-triggerable.

S1 — Authentication correctness

The server must prove possession of the private key corresponding to the presented leaf certificate during the handshake:

$$\mathrm{Verify}(\mathrm{cert\_chain}) \wedge \mathrm{VerifySig}_{A_{\mathrm{leaf}}}(\mathrm{transcript}, \sigma_{\mathrm{CV}})$$

where $A_{\mathrm{leaf}}$ is the leaf signature algorithm and $\sigma_{\mathrm{CV}}$ is the CertificateVerify signature. (2)

S2 — Bounded attacker-triggerable work (DoS-resilience invariant)

Let $C_{\mathrm{srv}}$ be the server CPU time spent in cryptographic operations per full handshake. A front-end that must survive adversarial connection rates needs:

$$\forall \text{ handshakes } h:\;\; C_{\mathrm{srv}}(h) \le C_{\max}$$

and the system-level stability constraint (multi-core queueing approximation):

$$\rho \equiv \frac{\lambda \cdot \mathbb{E}[C_{\mathrm{srv}}]}{k} < 1,$$

where $\lambda$ is the handshake arrival rate and $k$ is the effective parallelism (cores dedicated to handshake crypto).

This is the invariant leaf-SLH breaks: it moves $\mathbb{E}[C_{\mathrm{srv}}]$ from sub-millisecond to ~1.4 seconds in the paper’s measurements. (1)

Invariant

Hot-path crypto budget: the signature algorithm used for CertificateVerify must keep server per-handshake CPU under a fixed bound; otherwise availability collapses under adversarial handshakes.
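
As a minimal sketch of the S2 stability check, the snippet below plugs the paper’s mean server task-clock values into ρ = λ·E[C_srv]/k. The arrival rate (1,000 handshakes/s) and the pool size (32 cores) are illustrative assumptions, not figures from the paper, and the model ignores everything except mean CPU time.

```python
def utilization(arrival_rate_hz: float, mean_cpu_s: float, cores: int) -> float:
    """rho = lambda * E[C_srv] / k; the pool is stable only while rho < 1."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return arrival_rate_hz * mean_cpu_s / cores


# Mean server task-clock per handshake reported in the paper (seconds).
ML_LEAF_CPU_S = 0.562e-3
SLH_LEAF_CPU_S = 1401.169e-3

# Assumed load and pool size (hypothetical values for illustration).
ARRIVAL_RATE_HZ = 1000.0
CORES = 32

rho_ml = utilization(ARRIVAL_RATE_HZ, ML_LEAF_CPU_S, CORES)
rho_slh = utilization(ARRIVAL_RATE_HZ, SLH_LEAF_CPU_S, CORES)

print(f"ML leaf:  rho = {rho_ml:.4f}, stable = {rho_ml < 1}")
print(f"SLH leaf: rho = {rho_slh:.1f}, stable = {rho_slh < 1}")
```

Under these assumptions the ML leaf leaves the pool almost idle, while the SLH leaf would need more than 40 times the available CPU, so the queue grows without bound.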

S3 — Cryptographic agility without silent downgrade

PQC migration in TLS is long-lived and mixed-mode. The negotiation must prevent “compatibility” from becoming a downgrade vector:

  • explicit policy for acceptable signature algorithms,
  • telemetry that reveals negotiated algorithms,
  • rollback that preserves safety properties (no “enable PQ in prod” without escape hatch).
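
One way to make the first two bullets concrete is an explicit allowlist that is checked and logged for every handshake. The sketch below is hypothetical: the cohort names, the `check_negotiated` helper, and the scheme labels are placeholders for illustration, not the API of any TLS library.

```python
import logging

logger = logging.getLogger("tls.policy")

# Hypothetical per-cohort allowlists; the scheme strings are illustrative labels.
ALLOWED_SIG_ALGS = {
    "pq-capable": {"mldsa65"},
    "legacy": {"mldsa65", "ecdsa_secp256r1_sha256"},
}


def check_negotiated(cohort: str, sig_alg: str) -> bool:
    """Return True only if sig_alg is explicitly allowed for this cohort."""
    allowed = ALLOWED_SIG_ALGS.get(cohort, set())  # unknown cohort: allow nothing
    accepted = sig_alg in allowed
    # Log every decision so the negotiated-algorithm distribution stays observable.
    logger.info("cohort=%s sig_alg=%s accepted=%s", cohort, sig_alg, accepted)
    return accepted
```

The design choice here is default-deny: a cohort that is missing from the table accepts nothing, so a configuration gap fails closed instead of silently widening the policy.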

Failure modes

  • CPU collapse at the leaf: the server spends ~1400 ms of CPU per handshake, dominated by CertificateVerify signing; handshake rate collapses; queues grow; timeouts cascade. (1)
  • Client validation overload: upper-layer SLH increases client task-clock materially (validation-skewed regime); low-end clients and IIoT gateways regress first. (1)
  • Size-induced latency amplification: certificate chains grow; once the server’s flight exceeds the TCP initial congestion window, slow start, fragmentation, and retransmits add round trips (the paper’s local setup underestimates this). (4)
  • Cache illusions: resumption hides cost only for honest traffic. An attacker can force full handshakes by rotating SNI, never presenting tickets, or exploiting client diversity.
  • Mixed deployment drift: partial rollouts and heterogeneous client capabilities force policy forks; “support both” becomes “accept the weakest under pressure”.

Attack surface

Handshake authentication is attacker-triggerable compute. If you put an expensive signer in the leaf, you have built a CPU amplification primitive into your perimeter.

What to monitor

  • Per-handshake crypto time split by phase: chain validation vs CertificateVerify signing/verification.
  • Negotiated signature algorithm distribution (by SNI, region, client cohort).
  • Handshake latency (p50/p95/p99) and timeout/retry rates.
  • CPU saturation signatures: run-queue length, softirq pressure, context switch rate.
  • Handshake queue depth at the load balancer / accept queue.
  • Bytes per handshake and certificate chain lengths (especially with PQ chains).
  • Resumption ratio vs full handshake ratio; alert on drops.
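
Two of these signals are easy to sketch: latency percentiles and the share of full handshakes. The helper names and the alert threshold below are assumptions for illustration; a real deployment would pull these from its metrics pipeline instead of in-memory lists.

```python
from statistics import quantiles


def latency_percentiles(latencies_ms):
    """Return (p50, p95, p99) from the 100-quantile cut points."""
    cuts = quantiles(latencies_ms, n=100)  # 99 cut points; needs >= 2 samples
    return cuts[49], cuts[94], cuts[98]


def full_handshake_ratio(full: int, resumed: int) -> float:
    """Fraction of handshakes that paid the full authentication cost."""
    total = full + resumed
    return full / total if total else 0.0


def should_alert(full: int, resumed: int, max_full_ratio: float) -> bool:
    """Alert when full handshakes exceed the assumed acceptable share."""
    return full_handshake_ratio(full, resumed) > max_full_ratio


# Example: 80 full vs 20 resumed handshakes against a hypothetical 50% ceiling.
print(should_alert(full=80, resumed=20, max_full_ratio=0.5))
```

A sudden rise in the full-handshake ratio is worth alerting on by itself, because it is the signal that resumption has stopped shielding the signer.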

Rollback plan

  • Feature-flag placement policy: ability to move SLH-DSA out of the interactive leaf without redeploying the entire fleet.
  • Dual chain strategy (operationally plausible): keep ML-DSA in the leaf and place SLH-DSA in upper trust layers (root/intermediate), matching the “bounded penalty” regime observed in the paper. (1)
  • Client capability gating: enforce per-cohort policies; do not let “one legacy client” dictate global acceptance rules.
  • Emergency mode: prefer classical leaf fallback only as a last resort (explicitly logged and time-boxed), because “availability now” often becomes “downgrade forever”.

Operational note

Rollback has to be faster than the incident. If changing the leaf algorithm requires a CA ceremony and multi-day issuance, you do not have an operational rollback plan.

The Mathematical Anatomy of the Problem

The paper’s core point can be expressed as a simple decomposition: not all certificate signatures are equal in the TLS protocol.

Let a chain be root → intermediate → leaf.

During a TLS 1.3 full handshake, the client does:

  1. verify the certificate chain signatures (issuer algorithms),
  2. verify the live handshake signature CertificateVerify (leaf algorithm).

The server does:

  1. generate the live CertificateVerify signature (leaf algorithm).

Abstract the per-handshake costs:

  • Sign(A) = cost to sign using algorithm A
  • Verify(A) = cost to verify using algorithm A

Then, ignoring key exchange and symmetric crypto:

$$C_{\mathrm{srv}} \approx \mathrm{Sign}(A_{\mathrm{leaf}})$$

$$C_{\mathrm{cli}} \approx \mathrm{Verify}(A_{\mathrm{leaf}}) + \mathrm{Verify}(A_{\mathrm{int}}) + \mathrm{Verify}(A_{\mathrm{root}})$$

That is the placement lever. Putting SLH-DSA at the root or intermediate raises Verify(SLH) costs on the client side. Putting SLH-DSA at the leaf raises Sign(SLH) on the server side — and that hits your perimeter at scale.
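
The decomposition above can be sketched symbolically, without committing to any per-operation timings. The `Chain` type and both helpers are hypothetical names introduced for illustration; they only list which signature operations each role performs for a given placement.

```python
from typing import NamedTuple


class Chain(NamedTuple):
    root: str   # algorithm of the root key (signs the intermediate certificate)
    inter: str  # algorithm of the intermediate key (signs the leaf certificate)
    leaf: str   # algorithm of the leaf key (signs CertificateVerify)


def server_ops(chain: Chain) -> list[str]:
    """Online work on the server: one live signature with the leaf key."""
    return [f"Sign({chain.leaf})"]


def client_ops(chain: Chain) -> list[str]:
    """Client work: CertificateVerify plus the two certificate signatures."""
    return [
        f"Verify({chain.leaf})",
        f"Verify({chain.inter})",
        f"Verify({chain.root})",
    ]


for chain in (Chain("ML", "ML", "ML"), Chain("SLH", "ML", "ML"), Chain("ML", "ML", "SLH")):
    print(chain, server_ops(chain), client_ops(chain))
```

Only the third placement puts an SLH operation in `server_ops`, which is the case the paper measures as server-dominated.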

Evidence from the paper’s strategy matrix

Under a common hybrid key-establishment baseline (x25519 + ML-KEM-768), the paper reports (Campaign B): (1)

| Scenario | Placement (root/int/leaf) | Mean latency (ms) | Mean server task-clock (ms) | Bytes read |
|---|---|---|---|---|
| x25519mlkem768__ml_root__ml_int__ml_leaf | ML/ML/ML | 0.809 | 0.562 | 16,008 |
| x25519mlkem768__slh_root__ml_int__ml_leaf | SLH/ML/ML | 2.133 | 0.667 | 28,947 |
| x25519mlkem768__ml_root__ml_int__slh_leaf | ML/ML/SLH | 1402.486 | 1401.169 | 27,015 |
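
The ratios quoted throughout this note can be recomputed from the three rows above. This is a small sketch using only the reported values; because the table entries are rounded, the server CPU ratio comes out near 2493×, marginally below the ≈ 2494× quoted earlier, which is presumably a rounding effect.

```python
# (mean latency ms, mean server task-clock ms, bytes read), as reported in the table.
ROWS = {
    "ML/ML/ML": (0.809, 0.562, 16008),
    "SLH/ML/ML": (2.133, 0.667, 28947),
    "ML/ML/SLH": (1402.486, 1401.169, 27015),
}


def ratios(placement: str, baseline: str = "ML/ML/ML"):
    """Return (latency, server CPU, bytes) ratios relative to the baseline row."""
    return tuple(x / b for x, b in zip(ROWS[placement], ROWS[baseline]))


for placement in ("SLH/ML/ML", "ML/ML/SLH"):
    lat, cpu, size = ratios(placement)
    print(f"{placement}: latency x{lat:.2f}, server CPU x{cpu:.2f}, bytes x{size:.2f}")
```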

Two points matter operationally:

  1. Upper-layer SLH increases latency without collapsing server CPU. That is a validation-skewed regime.
  2. Leaf SLH is a server-dominated regime. The system is not “a bit slower”; it is in a different stability class.

Service capacity as an invariant

If the mean server crypto time per full handshake is $S$ seconds, a single core can sustain at most:

$$\mu_{\text{core}} \le \frac{1}{S}\;\text{handshakes/sec}.$$

With $S \approx 1.401$ seconds (leaf-SLH server task-clock), that is $\mu_{\text{core}} \approx 0.71$ handshakes/sec. Even with $k=32$ effective cores, the ceiling is roughly 23 handshakes per second, far below the baseline assumptions of modern TLS termination.

This is why the paper’s conclusion is correct: the collapse is not explained by chain size, but by where the expensive signer lives. (1)
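
For scale, the same bound can be worked for both leaf choices. This is a rough sketch that assumes perfect scaling across the $k = 32$ cores and uses the paper’s mean server task-clock as $S$; real front-ends will fall short of both ceilings.

```latex
\begin{align*}
\mu_{\text{fleet}} &\le \frac{k}{S} \\
\text{leaf-SLH:}\quad \mu_{\text{fleet}} &\le \frac{32}{1.401\ \text{s}} \approx 22.8\ \text{handshakes/s} \\
\text{all-ML:}\quad \mu_{\text{fleet}} &\le \frac{32}{0.000562\ \text{s}} \approx 5.7 \times 10^{4}\ \text{handshakes/s}
\end{align*}
```

The gap between the two ceilings is the same ≈ 2.5 × 10³ factor seen in the server task-clock ratio.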

From Measurements to Deployment: the engineering gap

The paper is experimental, but the deployment implication is structural:

  • CertificateVerify is a live signature over a transcript; you cannot precompute it.
  • resumption reduces exposure but does not eliminate attacker-triggerable full handshakes.
  • certificate compression reduces bytes, not signing cost. (4)

So the migration strategy must treat the certificate hierarchy as a design surface:

  • keep a conservative (hash-based) algorithm in long-lived trust anchors,
  • keep a performant algorithm in the interactive leaf,
  • and plan for key agility with short-lived leaf certificates.
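
A placement rule like this can be checked mechanically before a chain is deployed. The sketch below is an assumption-laden illustration: the two allowlists encode this note’s recommendation (not a standard), and `placement_violations` is a hypothetical helper.

```python
# Assumed policy: which algorithm families may sign in each context.
ONLINE_OK = {"ML-DSA"}              # allowed to sign CertificateVerify (leaf key)
OFFLINE_OK = {"ML-DSA", "SLH-DSA"}  # allowed to sign certificates at issuance


def placement_violations(root: str, intermediate: str, leaf: str) -> list[str]:
    """Return human-readable policy violations; an empty list means the chain passes."""
    problems = []
    if leaf not in ONLINE_OK:
        problems.append(f"leaf uses {leaf}, which is not allowed for online signing")
    for role, alg in (("root", root), ("intermediate", intermediate)):
        if alg not in OFFLINE_OK:
            problems.append(f"{role} uses {alg}, which is not allowed for issuance")
    return problems


print(placement_violations("SLH-DSA", "ML-DSA", "ML-DSA"))  # passes: []
print(placement_violations("ML-DSA", "ML-DSA", "SLH-DSA"))  # flags the leaf
```

Running a check of this kind in the issuance pipeline keeps a leaf-SLH chain from reaching production by accident.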

Critique (what the paper does not prove)

  • Local lab ≠ internet. The paper’s results isolate compute effects, but the real internet will make PQ chain size penalties worse via RTT amplification. This strengthens (not weakens) the “don’t do leaf-SLH” conclusion for front-ends.
  • Implementation quality matters. oqsprovider and OpenSSL integration are moving targets. But the observed 10^3× gap is too large to dismiss as mere optimization debt. (7)
  • Client heterogeneity is under-modeled. Many clients are constrained (mobile, embedded, IIoT gateways). Validation-skewed regimes can still be unacceptable in those populations.
  • Mutual TLS will magnify costs. If both sides sign, the placement problem becomes bilateral; you must reason about who signs online and under what rate limits.

Open questions

  • What is the cleanest “PQ root + fast leaf” strategy that preserves long-term trust while keeping hot-path CPU bounded?
  • Can we formalize a deployment constraint language: “these algorithms are allowed in offline issuance vs online authentication”?
  • How do we make downgrade resistance auditable at scale (per-cohort policy + telemetry + enforcement)?

Checklist

  • Leaf algorithm chosen with an explicit per-handshake CPU budget.
  • Chain design separates offline issuance from online authentication costs.
  • Certificate compression evaluated (bytes) but not used as a proxy for CPU. (4)
  • Rate limiting and handshake queuing modeled under adversarial load.
  • Negotiated algorithms logged and monitored (per cohort / SNI).
  • Rollback plan does not require a multi-day CA ceremony.

Further reading

1. Delgado Jiménez JL. Signature Placement in Post-Quantum TLS Certificate Hierarchies: An Experimental Study of ML-DSA and SLH-DSA in TLS 1.3 Authentication [Internet]. arXiv:2604.06100; 2026. Available from: https://arxiv.org/abs/2604.06100
2. Rescorla E. The Transport Layer Security (TLS) Protocol Version 1.3 [Internet]. RFC Editor; 2018. Report No.: 8446. Available from: https://www.rfc-editor.org/rfc/rfc8446
3. Cooper D, Santesson S, Farrell S, Boeyen S, Housley R, Polk T. Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile [Internet]. RFC Editor; 2008. Report No.: 5280. Available from: https://www.rfc-editor.org/rfc/rfc5280
4. Ghedini A, Muthukuru VK, Goessens R. TLS Certificate Compression [Internet]. RFC Editor; 2021. Report No.: 8879. Available from: https://www.rfc-editor.org/rfc/rfc8879
5. National Institute of Standards and Technology (NIST). FIPS 204: Module-Lattice-Based Digital Signature Standard (ML-DSA) [Internet]. Web; 2024. Available from: https://csrc.nist.gov/pubs/fips/204/final
6. National Institute of Standards and Technology (NIST). FIPS 205: Stateless Hash-Based Digital Signature Standard (SLH-DSA) [Internet]. Web; 2024. Available from: https://csrc.nist.gov/pubs/fips/205/final
7. Open Quantum Safe. Open Quantum Safe: oqs-provider [Internet]. Web. Available from: https://github.com/open-quantum-safe/oqs-provider
8. National Institute of Standards and Technology (NIST). Post-Quantum Cryptography [Internet]. Web. Available from: https://csrc.nist.gov/projects/post-quantum-cryptography