TL;DR — TLS 1.3 0‑RTT resumption can shave 30‑70 ms off the first request of a client‑server interaction, but it demands careful ticket lifecycle management, replay mitigation, and tight integration with your edge stack. This post walks through a production‑ready architecture, networking tricks, and security hardening steps you can copy into Envoy, NGINX, or any modern reverse proxy.
In today’s ultra‑low‑latency world, the extra round‑trip of a full TLS handshake is no longer acceptable for high‑frequency APIs, mobile back‑ends, or edge‑driven microservices. TLS 1.3 introduced zero round‑trip time (0‑RTT) resumption, allowing a client to send encrypted application data on the first flight after presenting a pre‑issued session ticket. While the performance gains are compelling, the feature also opens a narrow attack surface—replay attacks, ticket leakage, and clock‑drift issues. This article shows how to design a robust 0‑RTT pipeline, embed it into common production components, and enforce best‑in‑class security controls.
Why Zero Round‑Trip Resumption Matters
- Latency impact – A TCP connection followed by a full TLS 1.3 handshake costs ~2 × RTT before the first request can be sent: one round trip for TCP, one for ClientHello → ServerHello/Finished. 0-RTT removes the TLS round trip, so the saving is up to one RTT: roughly 30-70 ms on typical Internet paths, about 100 ms on a 100 ms link, and even more on satellite or high-latency mobile links.
- Throughput boost – Fewer handshake packets mean less congestion on the TLS‑handshake path, freeing bandwidth for actual payload. In micro‑service fabrics where each request traverses multiple TLS‑terminated hops, the cumulative savings become measurable.
- User experience – For mobile apps, the “cold start” latency often determines churn. A 30 ms improvement in TLS latency can translate into a 1‑2 % lift in conversion rates, as shown in several A/B tests published by large e‑commerce platforms.
Because the benefit is quantifiable, many enterprises now require 0‑RTT for any service classified as “latency‑critical.” The challenge is to achieve that without compromising the confidentiality guarantees TLS provides.
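The latency arithmetic can be sketched with a deliberately simple model. It assumes a fresh TCP connection, no packet loss, and negligible server processing time; the function name and parameters are illustrative, not taken from any library.

```python
def time_to_first_byte_ms(rtt_ms: float, tls_round_trips: int, server_ms: float = 0.0) -> float:
    """Estimate time until the first response byte on a fresh TCP connection."""
    tcp_handshake = rtt_ms                    # SYN / SYN-ACK before any TLS bytes flow
    tls_handshake = tls_round_trips * rtt_ms  # 1 for a full TLS 1.3 handshake, 0 with 0-RTT
    request = rtt_ms                          # request out, first response byte back
    return tcp_handshake + tls_handshake + request + server_ms


for rtt in (30, 70, 100):
    full = time_to_first_byte_ms(rtt, tls_round_trips=1)
    early = time_to_first_byte_ms(rtt, tls_round_trips=0)
    print(f"RTT {rtt} ms: full={full:.0f} ms, 0-RTT={early:.0f} ms, saved={full - early:.0f} ms")
```

Under this model the saving always equals one RTT, which is why the benefit grows with path latency.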
Architecture Overview
At a high level, 0‑RTT resumption involves three moving parts:
- Ticket Issuer – The TLS terminator (Envoy, NGINX, HAProxy, etc.) encrypts a session ticket that contains the resumption secret (the PSK in TLS 1.3), the selected cipher suite, and optional application-specific data (e.g., user ID, feature flags).
- Ticket Store – An external, highly‑available key‑value store (Redis, Consul, etc.) holds the ticket‑encryption keys and optionally the raw ticket metadata for audit.
- Replay Detector – A fast, in‑memory cache that tracks recent 0‑RTT nonces or ticket identifiers to reject duplicate uses within a configurable window.
Below is a diagram (conceptual, not rendered here) that illustrates the data flow:
Client --> [TLS Handshake] --> Envoy (Ticket Issuer) --> Redis (Key Store)
  ^                              |
  |                              v
  <-- 0-RTT Data (encrypted) <-- Replay Detector
The architecture is deliberately stateless from the client’s perspective: the client never contacts the ticket store directly; it simply presents the opaque ticket it received earlier. All state lives on the server side, allowing you to rotate keys without breaking active sessions.
Session Ticket Lifecycle
- Issuance – After a full handshake, Envoy encrypts the ticket using a ticket-encryption key (TEK) derived from a master secret. The TEK itself is rotated every N minutes (commonly every 12 h) and stored in Redis.
- Distribution – The encrypted ticket is sent to the client in a NewSessionTicket message once the handshake completes (in TLS 1.3 this is a post-handshake message, not an extension). The client stores it in its TLS session cache.
- Resumption – On the next connection, the client includes the ticket in the ClientHello under the pre_shared_key extension and signals 0-RTT with the early_data extension. The server decrypts the ticket, derives the symmetric keys, and immediately processes any 0-RTT data.
- Revocation / Expiration – Tickets have a short lifetime (e.g., 24 h). The server rejects a ticket when decryption fails or when the ticket's age is out of bounds, and then falls back to a full handshake.
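The age check in the last step can be sketched as follows. TLS 1.3 clients report an obfuscated ticket age (their view of the ticket's age plus a per-ticket ticket_age_add value, modulo 2^32), and the server compares it with its own clock. The TicketState class, the function names, and the 10-second tolerance are assumptions made for this illustration.

```python
from dataclasses import dataclass

MOD = 2**32  # ticket ages are 32-bit values in TLS 1.3


@dataclass(frozen=True)
class TicketState:
    issued_at_ms: int  # server clock when the ticket was issued
    lifetime_s: int    # ticket lifetime advertised to the client
    age_add: int       # per-ticket random value (ticket_age_add)


def obfuscate_age(client_age_ms: int, age_add: int) -> int:
    """What the client sends: its view of the ticket age, masked with age_add."""
    return (client_age_ms + age_add) % MOD


def early_data_allowed(ticket: TicketState, obfuscated_age: int, now_ms: int,
                       tolerance_ms: int = 10_000) -> bool:
    """Accept 0-RTT only if the ticket is unexpired and the client's age matches ours."""
    reported_age = (obfuscated_age - ticket.age_add) % MOD
    actual_age = now_ms - ticket.issued_at_ms
    if actual_age < 0 or actual_age > ticket.lifetime_s * 1000:
        return False  # expired, or issued "in the future" due to clock trouble
    # A large mismatch suggests clock drift or a ClientHello captured earlier and replayed.
    return abs(actual_age - reported_age) <= tolerance_ms


ticket = TicketState(issued_at_ms=0, lifetime_s=86_400, age_add=0xDEADBEEF)
hello_age = obfuscate_age(5_000, ticket.age_add)
print(early_data_allowed(ticket, hello_age, now_ms=5_200))   # fresh flight
print(early_data_allowed(ticket, hello_age, now_ms=60_000))  # same flight replayed later
```

When the check fails, a server would normally reject only the early data and continue with a regular 1-RTT handshake rather than abort the connection.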
Integration with Reverse Proxies (Envoy)
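Envoy terminates TLS through its TLS transport socket (DownstreamTlsContext). Below is a minimal envoy.yaml snippet that enables 0-RTT and loads the ticket keys from a file-based SDS (Secret Discovery Service) source that is kept in sync with Redis. Downstream early-data support depends on the Envoy release, so check the API reference for your version before relying on the early-data setting.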
static_resources:
  listeners:
  - name: listener_https
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 443
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: https_ingress
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: backend_cluster }
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
          common_tls_context:
            tls_certificates:
            - certificate_chain:
                filename: "/etc/envoy/certs/server.crt"
              private_key:
                filename: "/etc/envoy/certs/server.key"
          # Enable 0-RTT (field name as given in this post; confirm it exists in your Envoy version)
          enable_early_data: true
          # Ticket encryption keys delivered through file-based SDS
          session_ticket_keys_sds_secret_config:
            name: "tls_ticket_key"
            sds_config:
              path_config_source:
                path: "/etc/envoy/sds/ticket_keys.yaml"
The session ticket key configuration tells Envoy to load the TEK from an SDS (Secret Discovery Service) file that is periodically refreshed by a sidecar process. That sidecar pulls the latest key version from Redis:
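# ticket_key_updater.py
import os
import time

import redis
import yaml

r = redis.Redis(host="redis", port=6379, db=0)
KEY_PATH = "/etc/envoy/sds/ticket_keys.yaml"
SECRET_NAME = "tls_ticket_key"  # must match the name in the listener's SDS config
INTERVAL = 300  # seconds between refreshes

def fetch_key() -> str:
    # Stored as a base64-encoded string; Envoy expects 80 bytes of key material per key.
    raw = r.get("tls:ticket:key")
    if not raw:
        raise RuntimeError("Ticket key missing in Redis")
    return raw.decode()

def write_secret(key_b64: str) -> None:
    # Wrap the key in an SDS Secret resource so Envoy can parse the file.
    secret = {
        "@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret",
        "name": SECRET_NAME,
        "session_ticket_keys": {"keys": [{"inline_bytes": key_b64}]},
    }
    tmp_path = KEY_PATH + ".tmp"
    # Create the file with 0600 from the start, then rename so Envoy never reads a partial file.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        yaml.safe_dump({"resources": [secret]}, f)
    os.replace(tmp_path, KEY_PATH)

while True:
    write_secret(fetch_key())
    time.sleep(INTERVAL)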
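Running this script as a sidecar next to each Envoy instance keeps the whole fleet on the same TEK. Instances do not switch at the same instant: each one picks up a new key within one refresh interval (INTERVAL), so the previous key must stay valid for decryption during rollover.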
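For completeness, here is a small sketch of how the key material itself might be produced before it is published. Envoy's documentation specifies 80 bytes of cryptographically secure random data per session ticket key; adjust num_bytes if your terminator expects a different size. The function name and the publishing step shown in the comment are assumptions.

```python
import base64
import secrets

ENVOY_TICKET_KEY_BYTES = 80  # size Envoy documents for each session ticket key


def make_ticket_key(num_bytes: int = ENVOY_TICKET_KEY_BYTES) -> str:
    """Return fresh random key material, base64-encoded for storage as a Redis string."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


key_b64 = make_ticket_key()
# A rotation job (hypothetical) would then publish it, e.g.:
#   r.set("tls:ticket:key", key_b64)
print(len(base64.b64decode(key_b64)), "bytes of key material generated")
```

Using the secrets module rather than random matters here: ticket keys protect every resumption secret issued under them.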
State Store Choices (Redis, Consul, etc.)
| Store | Pros | Cons | Typical Use‑Case |
|---|---|---|---|
| Redis | Sub-millisecond latency, built-in TTL, easy clustering | In-memory cost, requires persistence tuning for durability | High-traffic edge where many terminators poll for ticket keys frequently |
| Consul KV | Strong consistency, service‑mesh integration | Higher latency (~1 ms), limited key size | Smaller deployments or when you already run Consul for service discovery |
| etcd | Strong consistency, audit logs | Write‑heavy workloads can cause contention | Environments already standardizing on etcd for config |
In practice, most large SaaS providers choose Redis because the ticket‑key read path is read‑heavy and latency‑sensitive. Use Redis’ EXPIRE feature to automatically retire keys after the rotation window.
Low‑Latency Networking Patterns
Zero‑RTT alone does not guarantee the lowest possible latency. You must align the TLS layer with the network stack and edge topology.
UDP‑Based QUIC vs TCP
QUIC (RFC 9000) has native 0-RTT support and folds the TLS 1.3 handshake into its transport handshake (RFC 9001), so connection setup and key exchange share a single round trip instead of running back to back. If your service can tolerate UDP (most HTTP/3 services can), consider moving to QUIC:
- Pros: One RTT to establish a new connection (zero with 0-RTT resumption), built-in connection migration, better loss recovery.
- Cons: Requires a CDN or load balancer that terminates QUIC (e.g., Cloudflare, or NGINX 1.25+ built with the HTTP/3 module).
When you cannot switch to QUIC (e.g., legacy databases that only speak TCP), keep the TLS 0‑RTT path but still apply the following optimizations.
Connection Coalescing
If you operate multiple logical services behind the same domain (e.g., api.example.com, auth.example.com), you can coalesce connections at the client side by reusing the same TLS session ticket across sub‑domains that share the same certificate. This reduces the number of tickets you need to manage and improves cache hit rates.
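Implementation tip: browsers manage their TLS session cache internally, and it cannot be steered with cookies or HTTP headers, so cross-host reuse depends on the browser's own policy. For native clients, share one session cache across the connections you control (e.g., a single tls.ClientSessionCache passed to every tls.Config in Go), and verify that the cache key your TLS library uses permits reuse across host names.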
Batching Ticket Encryption
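Ticket encryption adds CPU work to every handshake because each ticket is sealed with an AEAD cipher (AES-GCM or ChaCha20-Poly1305). In a high-QPS edge, you can amortize the setup cost by pooling initialized cipher objects instead of rebuilding them on every call: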
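// batch_encrypt.go
package ticket

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"sync"
)

// pool caches initialized AEAD instances so the AES key schedule is not recomputed per ticket.
// Pooled instances keep the key they were built with, so replace the pool when the key rotates.
var pool = sync.Pool{
	New: func() interface{} {
		key := loadCurrentTicketKey() // 32-byte AES-256 key; helper assumed to be defined elsewhere
		block, err := aes.NewCipher(key)
		if err != nil {
			return nil
		}
		aead, err := cipher.NewGCM(block)
		if err != nil {
			return nil
		}
		return aead
	},
}

// EncryptTicket seals plaintext with AES-256-GCM and returns nonce || ciphertext.
func EncryptTicket(plaintext []byte) ([]byte, error) {
	aead, ok := pool.Get().(cipher.AEAD)
	if !ok {
		return nil, errors.New("ticket: AEAD unavailable")
	}
	defer pool.Put(aead)

	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Prepend the nonce so the decrypting side can recover it.
	return aead.Seal(nonce, nonce, plaintext, nil), nil
}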
By reusing the cipher.AEAD object across goroutines, you avoid per‑request key schedule re‑derivation, shaving ~5‑10 µs per ticket.
Security Best Practices
Performance gains are meaningless if a replay attack compromises user data. TLS 1.3 0-RTT intentionally trades two guarantees for speed: early data is encrypted under keys derived solely from the ticket's PSK, so it is not forward secret with respect to the ticket key, and the protocol gives the server no built-in way to tell a fresh first flight from a replayed copy. Mitigation strategies are therefore mandatory.
Replay Mitigation
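- Single-Use Nonce – Embed a unique nonce inside the ticket payload, where it is protected by the ticket's AEAD encryption under the TEK. On receipt, the server checks the nonce against a short-lived Bloom filter stored in Redis. A nonce that is already present indicates a probable replay, and the server should reject the early data and fall back to a full handshake.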
- Idempotent Endpoints – Design APIs that can safely be replayed (e.g., GET, POST with idempotency keys). For non‑idempotent operations (e.g., financial transfers), reject 0‑RTT entirely.
- Application‑Level Checks – For sensitive actions, require a second factor (e.g., a short‑lived JWT) that is not sent in 0‑RTT data.
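The idempotency rule can be enforced at the application edge. RFC 8470 defines an Early-Data: 1 request header that a TLS terminator adds when it forwards a request received in 0-RTT data, and a 425 (Too Early) status that tells the client to retry after the handshake completes. The sketch below is framework-agnostic; the function name is hypothetical, and treating an Idempotency-Key header as sufficient is an assumption about your API design.

```python
from typing import Mapping, Optional

SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}
TOO_EARLY = 425  # HTTP status code defined by RFC 8470


def early_data_status(method: str, headers: Mapping[str, str]) -> Optional[int]:
    """Return 425 if the request arrived as 0-RTT data and is unsafe to replay, else None."""
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    if normalized.get("early-data") != "1":
        return None  # request was not forwarded from early data
    if method.upper() in SAFE_METHODS:
        return None  # safe methods may be replayed without side effects
    if "idempotency-key" in normalized:
        return None  # assumption: the backend deduplicates on this key
    return TOO_EARLY


print(early_data_status("POST", {"Early-Data": "1"}))
print(early_data_status("GET", {"Early-Data": "1"}))
```

A handler would call this before doing any work and, on 425, return immediately so the client retries once the handshake has finished.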
Example of a Bloom filter check in Python:
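# replay_filter.py
import hashlib

import redis

r = redis.Redis(host="redis", port=6379, db=0)

BLOOM_KEY = "replay:bloom"
NUM_BITS = 2**20   # 1 Mibit bit array (128 KiB)
NUM_HASHES = 5
WINDOW = 30        # seconds

def _hashes(nonce: bytes):
    # Double hashing: derive NUM_HASHES bit positions from two SHA-256 digests.
    h1 = int.from_bytes(hashlib.sha256(b"salt1" + nonce).digest(), "big")
    h2 = int.from_bytes(hashlib.sha256(b"salt2" + nonce).digest(), "big")
    for i in range(NUM_HASHES):
        yield (h1 + i * h2) % NUM_BITS

def is_replay(nonce: bytes) -> bool:
    # SETBIT returns the previous bit value, so a single MULTI/EXEC both tests and
    # records the nonce, avoiding the race in a separate check-then-set.
    pipe = r.pipeline(transaction=True)
    for b in _hashes(nonce):
        pipe.setbit(BLOOM_KEY, b, 1)
    # Start the window only when the key has no TTL, so steady traffic cannot extend it
    # forever (the NX option needs Redis 7.0+ and a recent redis-py).
    pipe.expire(BLOOM_KEY, WINDOW, nx=True)
    *old_bits, _ = pipe.execute()
    # All bits already set: the nonce was probably seen before (Bloom filters can false-positive).
    return all(old_bits)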
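Sizing matters for a filter like this: a false positive rejects a legitimate 0-RTT attempt (which is safe, but costs the latency win). The standard Bloom filter estimate p ≈ (1 − e^(−kn/m))^k gives a quick way to check whether 2^20 bits and 5 hash functions suit your traffic. The nonce counts below are illustrative, not measurements.

```python
import math


def bloom_false_positive_rate(num_bits: int, num_hashes: int, num_items: int) -> float:
    """Standard estimate for the false-positive rate of a Bloom filter."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


for nonces_per_window in (10_000, 50_000, 200_000):
    p = bloom_false_positive_rate(2**20, 5, nonces_per_window)
    print(f"{nonces_per_window} nonces per window -> false-positive rate ~ {p:.2e}")
```

If the estimate is too high for your peak rate, grow the bit array or shorten the window before adding more hash functions.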
Ticket Lifetime and Rotation
- Short Lifetime – Keep ticket TTL ≤ 24 h. Shorter lifetimes reduce the window for replay and limit the impact of a leaked key.
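- Key Rotation – Rotate the TEK every 6–12 h. Stop issuing tickets under the previous key immediately, but keep it available for decryption until the tickets it protected have expired. Dropping it after a short grace period (e.g., 30 min) is safe only if you accept that older tickets fall back to a full handshake.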
- Key Compromise Procedure – If a key leak is detected, invalidate all tickets by:
  - Deleting the key from Redis.
  - Flushing the replay Bloom filter.
  - Issuing a new TEK and pushing it to every terminator. Tickets encrypted under the deleted key then fail to decrypt, and the server falls back to a full handshake automatically; TLS tickets cannot be revoked through HTTP headers.
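The rotation and compromise rules above can be sketched as a small key ring. It models key metadata only (no cryptography), and the class names, field names, and default intervals are assumptions chosen to mirror the numbers in this section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class TicketKey:
    key_id: str
    created_at: float  # seconds since epoch


@dataclass
class KeyRing:
    rotate_every_s: float = 12 * 3600
    ticket_lifetime_s: float = 24 * 3600
    keys: List[TicketKey] = field(default_factory=list)  # newest first

    def rotate(self, new_key_id: str, now: float) -> None:
        """Make a new key the encryption key and drop keys no ticket can still use."""
        self.keys.insert(0, TicketKey(new_key_id, now))
        # A key issues tickets for rotate_every_s; its last ticket expires ticket_lifetime_s later.
        max_age = self.rotate_every_s + self.ticket_lifetime_s
        self.keys = [k for i, k in enumerate(self.keys)
                     if i == 0 or now - k.created_at <= max_age]

    def encryption_key(self) -> TicketKey:
        return self.keys[0]

    def decryption_keys(self) -> List[TicketKey]:
        return list(self.keys)

    def revoke_all(self, new_key_id: str, now: float) -> None:
        """Compromise procedure: forget every old key so all outstanding tickets fail."""
        self.keys = [TicketKey(new_key_id, now)]


ring = KeyRing()
for i in range(5):
    ring.rotate(f"k{i + 1}", now=i * 12 * 3600)
print([k.key_id for k in ring.decryption_keys()])
```

Only the newest key ever encrypts; older keys exist solely so that tickets issued before the rotation can still be decrypted.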
Auditing and Observability
- Metrics – Export Prometheus counters:
  - tls_0rtt_resumed_total
  - tls_0rtt_replay_detected_total
  - tls_ticket_key_rotation_seconds
- Logs – Include the ticket identifier (first 8 bytes of the encrypted ticket) in structured logs for every resumed connection.
- Alerting – Trigger an alert if tls_0rtt_replay_detected_total increases by more than 5 per minute, which often indicates a mass replay attempt.
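In production the alert would normally live in your monitoring system as a rate rule over the counter. The threshold logic itself is simple, though, and can be sketched in-process as a sliding window; the class name and defaults below are assumptions that mirror the "more than 5 per minute" rule.

```python
from collections import deque


class ReplayRateAlert:
    """Fire when more than `threshold` replays are seen within `window_s` seconds."""

    def __init__(self, threshold: int = 5, window_s: float = 60.0) -> None:
        self.threshold = threshold
        self.window_s = window_s
        self._events = deque()  # timestamps of recent replay detections

    def record(self, now: float) -> bool:
        """Record one detected replay; return True if the alert should fire."""
        self._events.append(now)
        # Drop detections that have aged out of the window.
        while self._events and now - self._events[0] > self.window_s:
            self._events.popleft()
        return len(self._events) > self.threshold


alert = ReplayRateAlert()
print([alert.record(float(t)) for t in range(6)])
```

Each call to record would sit next to the code path that increments tls_0rtt_replay_detected_total.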
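Envoy does not emit these counters on its own; they come from your replay detector and key-rotation sidecar. What Envoy provides out of the box are per-listener TLS statistics such as ssl.handshake and ssl.session_reused, which are useful for tracking the resumption rate. The TLS Inspector is a listener filter (not an HTTP filter) that detects TLS and extracts SNI and ALPN; it is configured like this:
listener_filters:
- name: envoy.filters.listener.tls_inspector
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector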
Key Takeaways
- TLS 1.3 0‑RTT can cut first‑request latency by 30‑70 ms, but you must enforce strict ticket lifetimes and replay detection.
- Centralize ticket‑encryption keys in a low‑latency store (Redis is the de‑facto choice) and rotate them every 6‑12 hours.
- Use a sidecar or SDS process to push the current key to Envoy, NGINX, or any TLS terminator that supports external key managers.
- Pair 0‑RTT with network‑level tricks—QUIC when possible, connection coalescing across sub‑domains, and batched AEAD encryptions—to maximize throughput.
- Harden security with nonce‑based Bloom filters, idempotent API design, and robust observability (Prometheus metrics + structured logs).
Further Reading
- TLS 1.3 RFC 8446 – The official specification describing 0‑RTT and its security considerations.
- Cloudflare Blog: “Understanding TLS 1.3 0‑RTT” – Real‑world performance numbers and mitigation strategies from a global CDN.
- Envoy Documentation: TLS Inspector and Early Data – How to configure 0‑RTT in Envoy and expose relevant metrics.