Systems note. Theme: hybrid crypto is a protocol design problem.
TL;DR
Hybrid key establishment is not “post-quantum TLS” in the broad sense; it is a very specific hedge against harvest-now-decrypt-later (HNDL).
In the dominant X25519MLKEM768 design for TLS 1.3, the client sends an X25519 ephemeral share plus an ML-KEM-768 encapsulation key in one key_share. The server replies with an X25519 share plus an ML-KEM ciphertext. The two component secrets are concatenated and fed into the existing TLS 1.3 key schedule via HKDF extraction. The intended confidentiality claim is conditional: the session secret remains secure as long as at least one component remains secure and the combiner / schedule is sound. Authentication, however, remains whatever certificate/signature system the deployment already uses; hybrid KEX upgrades key establishment first — it does not automatically provide post-quantum authentication. (1) (2) (3) (4) (5)
In Noise-style systems, the practical split is between (a) “minimal graft” hybrids (HFS-style extensions that add an additional forward-secrecy contribution) and (b) KEM-token hybrids (PQNoise-style designs) that are cleaner when post-quantum authentication becomes the objective. The engineering trade-off is straightforward: HFS is easier to bolt onto a classical design; KEM-token hybrids are harder to deploy but remove ambiguity about what is protected at each stage.
Protocol agility decides whether hybrid deployment is a manageable migration or a flag day. Robust designs expose suites and versions explicitly, authenticate the negotiated choice in the transcript, bind keys to suite identifiers and version identifiers, and treat legacy codepoints as distinct algorithms rather than “equivalent spellings.” TLS 1.3 already demonstrates the core pattern: offer a capability vector, select one suite, bind the selection into the transcript, and abort if the peer selects something outside policy. (1) (3) (6)
Dual-signature transition schemes need the same discipline. The safe semantics are logical AND, not OR: both component signatures cover the same canonical context, both must verify, and the signed context must bind protocol version, suite identifier, key identifier, and replay state. Anything weaker turns “hybrid” into a downgrade gadget.
Hybrid is a combiner + transcript story. If suite identity is not explicit and authenticated, you didn’t build “agility” — you built an algorithm-confusion surface.
Key takeaways
- Hybrid KEX protects confidentiality against HNDL; it does not automatically fix authentication.
- “Secure if at least one survives” only holds if the combiner is versioned and context-bound.
- Agility must be negotiated, transcript-bound, and fail-closed; aliases are downgrade oracles.
- Dual signatures must be AND semantics with a canonical signed context; anything else is compat glue.
- MTU/first-flight growth is an availability hazard; measure packetization, loss, and middlebox behavior.
Assumptions
The relevant adversary is not “Shor” in isolation. It is a composite attacker who can:
- record traffic today and attempt cryptanalysis later (HNDL),
- actively interfere with negotiation during migration (downgrade/algorithm confusion),
- exploit implementation failures (timing leakage, RNG failure, malformed-input oracles),
- and use operational behavior (retries, fallbacks, legacy code paths) as attack surface. (7) (6)
On the systems side, assume:
- partial deployments happen (mixed versions, mixed suites),
- middleboxes exist and still break things,
- and availability is a security property because handshakes are attacker-triggerable compute. (1)
Non-goals
- Proving the cryptographic security of TLS 1.3 itself (done elsewhere). (1)
- Replacing PKI / certificate models; the focus is KEX + protocol agility boundaries.
- Exhaustive benchmarking; the objective is correctness constraints and operational failure modes.
Security properties
Hybrid deployments only make sense if you state the security goal precisely enough that you can falsify it.
P1 — Hybrid confidentiality under HNDL (conditional)
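Let the two component shared secrets be $ss_1$ and $ss_2$ (e.g., the X25519 and ML-KEM outputs, in the order the suite fixes), and let the combiner be the TLS 1.3 extraction step.
In the simplest model:
$$ss = ss_1 \,\|\, ss_2, \qquad \mathit{HandshakeSecret} = \text{HKDF-Extract}(\mathit{salt},\ ss)$$
where $\mathit{salt}$ is the value the TLS 1.3 key schedule derives from the previous stage.
The intended claim is:
If at least one of $ss_1$ or $ss_2$ remains computationally indistinguishable from random, then $\mathit{HandshakeSecret}$ is indistinguishable from random (under the extractor’s assumptions) and session confidentiality holds. (2) (1) (3) (6)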
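As a rough illustration of P1, the concatenate-then-extract combiner can be sketched with the standard library alone. This is a minimal sketch, not the TLS implementation: the component secrets are random stand-ins (no real X25519 or ML-KEM is run), the all-zero salt is a placeholder for the value the key schedule would derive, and the names `hkdf_extract` and `combine` are hypothetical.

```python
import hashlib
import hmac
import os

HASH = hashlib.sha256  # assumption: a SHA-256 based cipher suite


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract from RFC 5869: PRK = HMAC-Hash(salt, IKM)."""
    return hmac.new(salt, ikm, HASH).digest()


def combine(ss_first: bytes, ss_second: bytes, salt: bytes) -> bytes:
    """Concatenate fixed-length component secrets in suite order, then extract."""
    if len(ss_first) != 32 or len(ss_second) != 32:
        raise ValueError("component secrets must be 32 bytes each")
    return hkdf_extract(salt, ss_first + ss_second)


# Stand-ins for the two component outputs (NOT real key exchange results).
ss_a = os.urandom(32)
ss_b = os.urandom(32)
salt = bytes(HASH().digest_size)  # placeholder for the schedule-derived salt

secret = combine(ss_a, ss_b, salt)
print(secret.hex())
```

Because both inputs have a fixed length, the concatenation is unambiguous; swapping the order yields a different secret, which is one reason the suite has to pin the component order.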
That claim is not “marketing”. It is a conditional reduction that depends on context binding: which schemes, which parameters, which order, which transcript.
P2 — Negotiation soundness (no silent downgrade)
Algorithm agility is only safe if the negotiation is part of the authenticated transcript. TLS 1.3’s shape is the reference:
- client offers `supported_groups` / `key_share`,
- server selects one,
- `Finished` authenticates the transcript (including selection),
- abort if selection violates policy or isn’t in the offered set. (1) (3)
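The offer/select/abort shape above can be sketched for a custom protocol as follows. This is a toy model under stated assumptions: the group identifiers are illustrative placeholders rather than registry values, and `check_selection`, `transcript_hash`, and `NegotiationError` are hypothetical names. A real TLS stack hashes the full handshake messages, not just these fields.

```python
import hashlib


class NegotiationError(Exception):
    """Raised when the handshake must abort (fail-closed)."""


def check_selection(offered: list[int], selected: int, policy: set[int]) -> int:
    """Accept the peer's choice only if it was offered AND local policy allows it."""
    if selected not in offered:
        raise NegotiationError(f"peer selected 0x{selected:04x}, which was never offered")
    if selected not in policy:
        raise NegotiationError(f"group 0x{selected:04x} violates local policy")
    return selected


def transcript_hash(offered: list[int], selected: int) -> bytes:
    """Toy transcript: hash the exact offer (order preserved) plus the selection."""
    h = hashlib.sha256()
    h.update(len(offered).to_bytes(2, "big"))
    for group in offered:
        h.update(group.to_bytes(2, "big"))
    h.update(selected.to_bytes(2, "big"))
    return h.digest()


HYBRID, CLASSICAL = 0x0A01, 0x0A02  # placeholder identifiers
offer = [HYBRID, CLASSICAL]
chosen = check_selection(offer, HYBRID, policy={HYBRID})

# A peer that tries to steer the session to the classical group is rejected.
try:
    check_selection(offer, CLASSICAL, policy={HYBRID})
    downgrade_rejected = False
except NegotiationError:
    downgrade_rejected = True
```

If an attacker strips the hybrid group from the offer in transit, the two sides compute different transcript hashes, so an authenticator over that hash fails instead of silently accepting the weaker choice.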
P3 — Authentication separation (don’t lie to yourself)
Hybrid KEX can preserve secrecy of the handshake secret even if a classical KEX breaks later — but if the certificate signature scheme is breakable, active impersonation remains possible.
This separation is operationally non-negotiable: a post-quantum key agreement does not imply post-quantum authentication. (1) (8) (9)
P4 — Dual-signature correctness (AND semantics)
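For a signed context $ctx$ and two signature algorithms $A_1, A_2$ with signatures $\sigma_1, \sigma_2$ under public keys $pk_1, pk_2$:
$$\mathrm{Accept}(ctx, \sigma_1, \sigma_2) = \mathrm{Verify}_{A_1}(pk_1, ctx, \sigma_1) \wedge \mathrm{Verify}_{A_2}(pk_2, ctx, \sigma_2)$$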
Anything weaker (e.g., OR semantics, or signatures over different contexts) is a downgrade gadget.
Suite-bound transcript: keys and signatures are derived/verified over a context that binds (version, suite_id, component order, parameter sets). If that binding is not explicit, “hybrid” becomes algorithm confusion.
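For a custom protocol, one way to make that binding explicit is to encode the suite context with length prefixes and pass it as the HKDF `info` input. The sketch below is an assumption-laden illustration, not what TLS 1.3 does (TLS binds choices through the transcript hash): the field layout, the placeholder suite identifier, and the helper names are all hypothetical.

```python
import hashlib
import hmac
import struct


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand from RFC 5869 with SHA-256."""
    if length > 255 * 32:
        raise ValueError("requested output too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def suite_context(version: int, suite_id: int, components: tuple[str, ...]) -> bytes:
    """Length-prefixed encoding of (version, suite_id, ordered component names)."""
    out = struct.pack("!HHB", version, suite_id, len(components))
    for name in components:
        raw = name.encode("ascii")
        out += struct.pack("!B", len(raw)) + raw
    return out


prk = bytes(32)  # placeholder pseudorandom key, for illustration only
ctx_a = suite_context(1, 0x0A01, ("ML-KEM-768", "X25519"))
ctx_b = suite_context(1, 0x0A01, ("X25519", "ML-KEM-768"))  # same parts, other order

key_a = hkdf_expand(prk, ctx_a, 32)
key_b = hkdf_expand(prk, ctx_b, 32)
```

The same input keying material yields unrelated keys under the two contexts, so a peer that disagrees about component order or version ends up with keys that do not interoperate rather than keys that are quietly shared across suites.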
Hybrid X25519 and ML-KEM in TLS 1.3
The modern standardized vocabulary is different from legacy deployment vocabulary. What many operators still call “Kyber” is standardized as ML-KEM (FIPS 203). Treat standardized and draft codepoints as distinct negotiated algorithms, not silent aliases. (5) (3) (4)
In the TLS 1.3 hybrid design, the construction is intentionally conservative. A hybrid construction is represented as a single NamedGroup. The client and server transmit component public values by concatenating them, in a fixed order set by the group definition, inside a normal KeyShareEntry.key_exchange, and the component shared secrets are likewise concatenated in a fixed order before being fed into the TLS 1.3 HKDF extraction flow. (1) (2) (3)
For X25519MLKEM768, the draft specifies:
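- client `key_exchange` size: 1216 bytes (1184-byte ML-KEM-768 encapsulation key followed by the 32-byte X25519 share)
- server `key_exchange` size: 1120 bytes (1088-byte ML-KEM-768 ciphertext followed by the 32-byte X25519 share) (3) (4)

Note that, despite the group name, the ML-KEM component comes first on the wire and in the concatenated shared secret for this group.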
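A strict length check is the first thing a parser for these shares needs. The sketch below assumes the component order given in the ECDHE-MLKEM draft for this group (ML-KEM part first, X25519 part second); confirm that against the draft version you implement. The function names are hypothetical, and the inputs are dummy bytes rather than real keys.

```python
MLKEM768_EK_LEN = 1184  # ML-KEM-768 encapsulation key
MLKEM768_CT_LEN = 1088  # ML-KEM-768 ciphertext
X25519_LEN = 32         # X25519 public share


def _split(blob: bytes, first_len: int, label: str) -> tuple[bytes, bytes]:
    """Split a hybrid share into (ML-KEM part, X25519 part); reject other lengths."""
    expected = first_len + X25519_LEN
    if len(blob) != expected:
        raise ValueError(f"{label}: expected {expected} bytes, got {len(blob)}")
    return blob[:first_len], blob[first_len:]


def split_client_share(key_exchange: bytes) -> tuple[bytes, bytes]:
    return _split(key_exchange, MLKEM768_EK_LEN, "client key_exchange")


def split_server_share(key_exchange: bytes) -> tuple[bytes, bytes]:
    return _split(key_exchange, MLKEM768_CT_LEN, "server key_exchange")


# Dummy payloads with the right shape (NOT real key material).
client_blob = bytes(MLKEM768_EK_LEN) + bytes([1]) * X25519_LEN
server_blob = bytes(MLKEM768_CT_LEN) + bytes([2]) * X25519_LEN

mlkem_ek, x25519_pub_c = split_client_share(client_blob)
mlkem_ct, x25519_pub_s = split_server_share(server_blob)

try:
    split_client_share(client_blob[:-1])  # truncated share
    truncated_rejected = False
except ValueError:
    truncated_rejected = True
```

Length validation is necessary but not sufficient: FIPS 203 also defines input checks on the encapsulation key itself, which a real implementation would run before calling encapsulation.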
```mermaid
sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: ClientHello
    Note over C,S: supported_groups = [X25519MLKEM768, X25519, ...]
    Note over C,S: key_share[X25519MLKEM768] = mlkem_ek_C || x25519_pub_C
    S->>S: ss_x = X25519(x25519_sk_S, x25519_pub_C)
    S->>S: (mlkem_ct_S, ss_m) = ML-KEM.Encaps(mlkem_ek_C)
    S->>C: ServerHello
    Note over C,S: selected_group = X25519MLKEM768
    Note over C,S: key_share = mlkem_ct_S || x25519_pub_S
    C->>C: ss_x = X25519(x25519_sk_C, x25519_pub_S)
    C->>C: ss_m = ML-KEM.Decaps(mlkem_dk_C, mlkem_ct_S)
    C->>C: ss = ss_m || ss_x
    S->>S: ss = ss_m || ss_x
    Note over C,S: HandshakeSecret = HKDF-Extract(DeriveSecret(...), ss)
    S->>C: EncryptedExtensions
    S->>C: Certificate
    S->>C: CertificateVerify(transcript)
    S->>C: Finished
    C->>S: Finished
```

The consequence is subtle but operationally critical:
- the confidentiality of session keys is intended to survive a break of either X25519 (RFC 7748) or ML-KEM (FIPS 203),
- but authentication remains the usual TLS 1.3 transcript authentication: `CertificateVerify` and `Finished` bind negotiated choices through the transcript hash. If your certificate signature is not post-quantum secure, active impersonation remains possible under that failure model. (1) (10) (5) (8) (9)
TLS 1.3 also provides the downgrade defenses custom protocols should emulate. Negotiation is safe only because the negotiation itself is bound into the authenticated transcript. (1) (3)
Protocol agility without hard forks
The objective is not “support many algorithms”. The objective is to make algorithm selection a versioned, authenticated protocol object.
Minimal wire requirements:
- explicit protocol version,
- offered capability set,
- unambiguous selected suite,
- transcript binding: authentication covers offer + selection,
- fail-closed on negotiation failure. (1) (6)
A sound rollover strategy follows concrete rules:
- Suite identifiers encode ordered composition, not unordered bags. `X25519MLKEM768` is not the same as `MLKEM768X25519` unless specified.
- The selected suite is one element of the offered capability vector, never inferred.
- Key identifiers and version identifiers are bound into transcript and KDF context.
- Negotiation failure is fail-closed.
- Fallback is explicit policy, never a silent parser trick. (1) (6)
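These rules can be made concrete with a small suite registry in which every codepoint is its own object and nothing is normalized. This is a sketch, not a reference table: the codepoints and component orders reflect my reading of the IANA TLS Supported Groups registry and the relevant drafts and should be confirmed before use, and the `Suite`, `REGISTRY`, and `resolve` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suite:
    codepoint: int
    name: str
    components: tuple[str, ...]  # ORDERED: wire and secret concatenation order
    status: str                  # "current" or "legacy-draft"


REGISTRY = {
    s.codepoint: s
    for s in (
        Suite(0x001D, "X25519", ("X25519",), "current"),
        Suite(0x11EC, "X25519MLKEM768", ("ML-KEM-768", "X25519"), "current"),
        # Pre-standard Kyber hybrid: a different algorithm, not an alias.
        Suite(0x6399, "X25519Kyber768Draft00", ("X25519", "Kyber768-draft"), "legacy-draft"),
    )
}


def resolve(codepoint: int) -> Suite:
    """Exact lookup only: unknown codepoints fail closed, nothing is normalized."""
    try:
        return REGISTRY[codepoint]
    except KeyError:
        raise ValueError(f"unknown suite 0x{codepoint:04x}") from None


standard = resolve(0x11EC)
legacy = resolve(0x6399)

try:
    resolve(0xFFFF)
    unknown_rejected = False
except ValueError:
    unknown_rejected = True
```

Keeping `status` on the record lets policy and telemetry treat the legacy draft suite separately without any code path that maps one codepoint onto the other.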
```mermaid
flowchart TD
    A["Client: version list + capability vector"] --> B["Server: select one suite and echo exact choice"]
    B --> C["Both: transcript hash over offer + selection"]
    C --> D["Authenticator binds (suite_id, key_id, version)"]
    D --> E{"Selection in offer and policy satisfied?"}
    E -- No --> F["Abort"]
    E -- Yes --> G["Derive traffic keys with suite-bound context"]
    G --> H["Emit telemetry: negotiated suite + fallback count"]
```

Two constraints that bite in production:
- Message bloat during capability advertisement. Over-advertising hybrid suites can duplicate large PQ material inside one offer. Prefer a short, policy-driven ordered list and use retry mechanisms (like `HelloRetryRequest`) when you must. (1) (3)
- Legacy draft compatibility. Treat draft codepoints as separate suites with separate policy and telemetry. Normalizing them internally creates downgrade oracles. (3) (4)
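The cost of over-advertising is easy to estimate before shipping a policy. The sketch below sums client `key_share` entry sizes for a given offer; the X25519MLKEM768 figure comes from the text above, while the SecP256r1MLKEM768 figure (65-byte uncompressed P-256 point plus 1184-byte encapsulation key) is an assumption taken from the ECDHE-MLKEM draft. Only the per-entry bytes are counted, not the surrounding extension framing, and the function name is hypothetical.

```python
# Client key_exchange sizes in bytes, per offered group.
CLIENT_SHARE_LEN = {
    "X25519": 32,
    "X25519MLKEM768": 1216,
    "SecP256r1MLKEM768": 1249,  # assumption: 65 + 1184, per the ECDHE-MLKEM draft
}

# Each KeyShareEntry carries a 2-byte group id and a 2-byte length prefix.
ENTRY_OVERHEAD = 4


def key_share_bytes(groups: list[str]) -> int:
    """Total bytes of KeyShareEntry structures for the offered groups."""
    return sum(CLIENT_SHARE_LEN[g] + ENTRY_OVERHEAD for g in groups)


lean_offer = ["X25519MLKEM768", "X25519"]
greedy_offer = ["X25519MLKEM768", "SecP256r1MLKEM768", "X25519"]

print("lean:", key_share_bytes(lean_offer))      # 1256
print("greedy:", key_share_bytes(greedy_offer))  # 2509
```

Adding the second hybrid share roughly doubles the key-share payload because the ML-KEM encapsulation key is sent twice, which is the duplication the first bullet warns about.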
Dual-signature packet and header design
Transition schemes that attach both classical and PQ signatures must preserve acceptance semantics.
The safe semantics are AND: both signatures verify over the same canonical context.
For transport and application protocols, the safest wire format is a must-understand TLV signature block and a canonical signed context that includes:
- protocol identifier,
- protocol version,
- suite identifier,
- sender key identifier,
- sequence number / epoch,
- replay state,
- a hash of the payload,
- and (when relevant) hashes of both verification credentials (to prevent replay across mixed bundles). (6) (2)
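A canonical signed context along these lines could be encoded as below, so that both signatures cover exactly the same bytes. This is a sketch under assumptions: the field widths, the field order, the use of SHA-256, and the `canonical_context` name are illustrative choices, not a specified format, and epoch plus sequence number stand in for the replay state.

```python
import hashlib
import struct


def canonical_context(protocol_id: bytes, version: tuple[int, int], suite_id: int,
                      key_id: int, epoch: int, seq_no: int, payload: bytes,
                      credential_hashes: tuple[bytes, ...] = ()) -> bytes:
    """Build the single byte string that every component signature must cover."""
    if len(protocol_id) > 255 or len(credential_hashes) > 255:
        raise ValueError("field too long for a one-byte length prefix")
    out = struct.pack("!B", len(protocol_id)) + protocol_id
    # Fixed-width integers: version major/minor, suite, key id, epoch, sequence.
    out += struct.pack("!BBHIQQ", version[0], version[1], suite_id, key_id, epoch, seq_no)
    out += hashlib.sha256(payload).digest()
    out += struct.pack("!B", len(credential_hashes))
    for digest in credential_hashes:
        if len(digest) != 32:
            raise ValueError("credential hashes must be SHA-256 digests")
        out += digest
    return out


ctx = canonical_context(b"example-proto", (1, 0), 0x0A01, key_id=7,
                        epoch=0, seq_no=42, payload=b"hello")
ctx_other_suite = canonical_context(b"example-proto", (1, 0), 0x0A02, key_id=7,
                                    epoch=0, seq_no=42, payload=b"hello")
```

Since every variable-length field carries a length prefix and every integer has a fixed width, two distinct field tuples cannot encode to the same bytes, which removes the "multiple encodings" ambiguity that makes cross-suite replay possible.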
```
struct SignedRecord {
    uint8  version_major;
    uint8  version_minor;
    uint16 suite_id;
    uint32 key_id;
    uint64 seq_no;
    uint16 flags;        // includes must_understand_dual_sig
    uint16 payload_len;
    opaque payload[payload_len];
    uint8  sig_count;    // 1 in shadow mode, 2 in enforced dual mode
    repeated SignatureTLV {
        uint16 alg_id;
        uint16 sig_len;
        opaque signature[sig_len];
    }
}
```

DoS management: verify the cheaper signature first as reject-fast, then verify the second signature and accept only if both succeed. The correctness rule never changes: success is AND.
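The reject-fast ordering with AND acceptance might look like the sketch below. The standard library has no Ed25519 or ML-DSA, so the verifiers here are HMAC stand-ins that only imitate a verify callback; they are not signatures. The algorithm identifiers, relative costs, and all names are hypothetical.

```python
import hashlib
import hmac
from typing import Callable, NamedTuple


class Component(NamedTuple):
    alg_id: int
    cost: int  # relative verification cost, used only to order the checks
    verify: Callable[[bytes, bytes], bool]


def verify_dual(context: bytes, sig_tlvs: list[tuple[int, bytes]],
                required: list[Component]) -> bool:
    """AND semantics: exactly the required algorithms, each valid over `context`."""
    by_alg = dict(sig_tlvs)
    if len(by_alg) != len(sig_tlvs):  # duplicate alg_id in the TLV list
        return False
    if set(by_alg) != {c.alg_id for c in required}:  # missing or unexpected algorithm
        return False
    for comp in sorted(required, key=lambda c: c.cost):  # cheapest first
        if not comp.verify(context, by_alg[comp.alg_id]):
            return False
    return True


def mac_stand_in(key: bytes):
    """Stand-in 'signature' built from HMAC, for demonstration only."""
    def sign(ctx: bytes) -> bytes:
        return hmac.new(key, ctx, hashlib.sha256).digest()

    def verify(ctx: bytes, sig: bytes) -> bool:
        return hmac.compare_digest(sign(ctx), sig)

    return sign, verify


sign_c, verify_c = mac_stand_in(b"classical-demo-key")
sign_p, verify_p = mac_stand_in(b"pq-demo-key")
REQUIRED = [Component(0x0001, 1, verify_c), Component(0x0002, 50, verify_p)]

ctx = b"canonical signed context"
good = [(0x0001, sign_c(ctx)), (0x0002, sign_p(ctx))]

accepted = verify_dual(ctx, good, REQUIRED)
classical_only = verify_dual(ctx, good[:1], REQUIRED)
forged_pq = verify_dual(ctx, [good[0], (0x0002, bytes(32))], REQUIRED)
```

The ordering only changes how quickly a bad record is rejected; a record carrying one valid signature and one missing or invalid signature is refused either way.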
Raw algorithm sizes matter for packetization. These are primitive sizes, not certificate-chain overhead: (11) (8) (9)
| Signature profile | Classical key+sig | PQ key+sig | Combined raw overhead | Migration assessment |
|---|---|---|---|---|
| Ed25519 only | 32 + 64 bytes | — | 96 bytes | Baseline classical profile |
| Ed25519 + ML-DSA-65 | 32 + 64 bytes | 1952 + 3309 bytes | 5357 bytes | Practical general-purpose dual-signature profile |
| Ed25519 + ML-DSA-44 | 32 + 64 bytes | 1312 + 2420 bytes | 3828 bytes | Smaller but lower security category than ML-DSA-65 |
| Ed25519 + SLH-DSA-128s | 32 + 64 bytes | 32 + 7856 bytes | 7984 bytes | Conservative hash-based hedge; significant MTU pressure |
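The combined-overhead column is plain arithmetic over the primitive sizes, and the same numbers give a rough packet count. In the sketch below the sizes are the ones from the table; the 1200-byte per-packet budget is a hypothetical figure chosen for illustration, not a property of any particular transport.

```python
import math

# (public key bytes, signature bytes), as listed in the table above.
SIZES = {
    "Ed25519": (32, 64),
    "ML-DSA-44": (1312, 2420),
    "ML-DSA-65": (1952, 3309),
    "SLH-DSA-128s": (32, 7856),
}


def raw_overhead(*algs: str) -> int:
    """Sum of public key and signature sizes for the given algorithms."""
    return sum(pk + sig for pk, sig in (SIZES[a] for a in algs))


def packets_needed(n_bytes: int, budget: int = 1200) -> int:
    """Packets required if each carries at most `budget` bytes (assumed budget)."""
    return math.ceil(n_bytes / budget)


for profile in (("Ed25519",), ("Ed25519", "ML-DSA-65"),
                ("Ed25519", "ML-DSA-44"), ("Ed25519", "SLH-DSA-128s")):
    total = raw_overhead(*profile)
    print(" + ".join(profile), total, packets_needed(total))
```

Under that assumed budget, the classical profile fits in one packet while the dual profiles need between four and seven, before any certificate-chain overhead is counted.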
Failure modes
- Algorithm confusion: treating draft and standardized codepoints as aliases. This creates downgrade gadgets in “compatibility” code.
- Unbound suite identity: the KDF transcript doesn’t bind suite id, version, or component order → cross-protocol key reuse risks.
- OR semantics for dual signatures: accepting either signature is not “hybrid”; it is a unilateral downgrade.
- Ambiguous canonicalization: “sign raw packet as received” + multiple encodings → replay across versions/suites.
- First-flight fragmentation: PQ material pushes handshake flights across packets; packet loss + retransmits amplify latency.
- Randomness reuse / secret retention: caching ephemeral KEM keys or keeping decapsulation keys alive beyond session lifetime.
- Silent fallback under incident pressure: “temporary compatibility” becomes the permanent weakest link.
Agility that isn’t measurable becomes folklore. If you can’t answer “which suites were negotiated, where, and why?” you can’t manage migration or incident response.
What to monitor
- Negotiated suite distribution (by endpoint/cohort/region).
- Fallback counts and reasons (policy vs interop vs parsing failure).
- Handshake flight sizes, fragmentation rate, retransmit rate.
- Verification and signing CPU time per handshake/record (cheap vs expensive signature order).
- Error budgets: p95/p99 handshake latency and timeout rates.
- Rate-limit effectiveness on handshake paths (DoS resilience).
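A minimal in-process version of the first two signals could be kept with counters, as sketched below. The class name, the cohort labels, and the fallback reason strings are hypothetical; a production system would export these to its metrics pipeline rather than hold them in memory.

```python
from collections import Counter
from typing import Optional


class HandshakeTelemetry:
    """Counts negotiated suites per cohort and fallback reasons."""

    def __init__(self) -> None:
        self.negotiated = Counter()  # (cohort, suite) -> count
        self.fallbacks = Counter()   # reason -> count
        self.total = 0

    def record(self, cohort: str, suite: str,
               fallback_reason: Optional[str] = None) -> None:
        self.total += 1
        self.negotiated[(cohort, suite)] += 1
        if fallback_reason is not None:
            self.fallbacks[fallback_reason] += 1

    def fallback_rate(self) -> float:
        if self.total == 0:
            return 0.0
        return sum(self.fallbacks.values()) / self.total


stats = HandshakeTelemetry()
for _ in range(3):
    stats.record("eu-canary", "X25519MLKEM768")
stats.record("eu-canary", "X25519", fallback_reason="interop")

print(stats.negotiated.most_common())
print(stats.fallback_rate())  # 0.25
```

Recording the reason alongside the fallback is what separates a policy decision from an interop failure or a parser rejection when the numbers are reviewed later.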
Rollback plan
- Feature-flag suite acceptance (per cohort), not just global toggles.
- Shadow mode for dual signatures: generate+verify both, enforce one, log mismatch, then flip to AND enforcement.
- Emergency compatibility policy is explicit, logged, and time-boxed (no silent downgrade).
- Remove legacy draft suites deliberately with telemetry-driven cutoff.
- Document “kill switches” that do not require multi-day PKI ceremonies.
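The shadow-then-enforce step for dual signatures can be expressed as a small decision function, sketched below. It assumes the enforced signature during shadow mode is the incumbent classical one, which the text does not specify; the `Mode` and `decide` names and the shape of the mismatch log are likewise hypothetical.

```python
from enum import Enum


class Mode(Enum):
    SHADOW = "shadow"            # verify both, enforce the classical result only
    ENFORCE_AND = "enforce_and"  # accept only if both verify


def decide(mode: Mode, classical_ok: bool, pq_ok: bool, mismatch_log: list) -> bool:
    """Return the accept decision and log any disagreement between components."""
    if classical_ok != pq_ok:
        mismatch_log.append({"mode": mode.value,
                             "classical_ok": classical_ok,
                             "pq_ok": pq_ok})
    if mode is Mode.SHADOW:
        return classical_ok  # assumption: the incumbent signature is enforced
    return classical_ok and pq_ok


log: list = []
shadow_result = decide(Mode.SHADOW, classical_ok=True, pq_ok=False, mismatch_log=log)
enforced_result = decide(Mode.ENFORCE_AND, classical_ok=True, pq_ok=False, mismatch_log=log)
```

A mismatch rate that stays at zero in shadow mode is the evidence that flipping a cohort to AND enforcement will not reject legitimate traffic, and the flip back is a mode change rather than a redeployment.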
Evidence
- Hybrid key exchange in TLS 1.3 (IETF draft) (3)
  - Evidence: `X25519MLKEM768` is represented as one `NamedGroup`; shares are concatenated; secrets are combined via the TLS key schedule.
- ML-KEM for TLS 1.3 (IETF draft) (4)
  - Evidence: standardized naming/codepoint discipline matters; draft vs final identifiers are distinct negotiated objects.
- TLS 1.3 (RFC 8446) (1)
  - Evidence: the negotiation is bound into the authenticated transcript via `Finished`; downgrade defenses are structural, not optional.
- HKDF (RFC 5869) (2)
  - Evidence: extraction gives a principled combiner; ad hoc concatenation without context binding is not defensible.
- FIPS 203 (ML-KEM) (5)
  - Evidence: ML-KEM is standardized; composite constructions require explicit design discipline.
- NIST SP 800-227 (KEM recommendations) (6)
  - Evidence: multi-algorithm KEM establishment needs explicit, context-bound combiners; implementation hygiene is not “nice to have”.
- FIPS 204/205 (ML-DSA / SLH-DSA) (8)
  - Evidence: signature size and compute profiles are not interchangeable; placement matters.
Checklist
- Inventory algorithm touchpoints (suite ids, version fields, validators, telemetry).
- Introduce a suite registry: composition is named, ordered, and versioned.
- Bind negotiation into authentication (offer + selection in transcript).
- Bind suite id + version into KDF / combiner context (domain separation).
- Deploy hybrid KEX explicitly as confidentiality hedge; document auth status honestly.
- Run dual signatures in shadow mode first; then enforce AND semantics.
- Treat MTU/first-flight growth as a release blocker; test middleboxes.
- Ban randomness reuse; enforce erasure of ephemeral material.
- Limit advertised combinations; prefer policy-driven short lists + retry.
- Sunset legacy draft suites deliberately, with telemetry and dates.
Further reading
- RFC 8446: TLS 1.3 (1)
- RFC 5869: HKDF (2)
- RFC 7748: X25519 (10)
- FIPS 203: ML-KEM (5)
- NIST SP 800-227: KEM Recommendations (6)
- IETF: Hybrid key exchange in TLS 1.3 (3)
- IETF: ML-KEM for TLS 1.3 (4)
- IETF: ECDHE-MLKEM for TLS 1.3 (12)
- IETF: X-Wing hybrid KEM (13)
- Open Quantum Safe: oqs-provider (14)