Conversation
Add PROXY protocol support to the gateway with two server-side config options instead of client-controlled SNI suffixes:

- `inbound_pp_enabled`: read PP headers from upstream load balancers
- `outbound_pp_enabled`: send PP headers to backend apps

The original PR#361 used a 'p' suffix in the SNI subdomain to toggle outbound PP per-connection. This is a security flaw: a client could connect to a PP-expecting port without sending PP headers, allowing source-address spoofing. Both flags are now server-side config only.
446483d to d5a137c
Replace the global `outbound_pp_enabled` switch with a per-(instance, port) lookup so different ports of the same backend can have different PP behaviour. PP is declared by the app and reported to the gateway through authenticated channels — never by client SNI.

Pipeline:

1. `dstack-types::AppCompose` grows a "ports" array. Each entry carries a port number and a "pp" flag. Because it's part of app-compose.json it is measured into compose_hash and attested.
2. `RegisterCvmRequest` grows an optional `PortAttrsList`. New CVMs include their port_attrs at WireGuard registration time. The optional wrapper lets the gateway distinguish "not reported" (legacy CVM) from "reported empty" (new CVM with no PP-enabled port).
3. The gateway stores port_attrs on `InstanceInfo` and persists/syncs it via WaveKV (`InstanceData`), keyed by instance_id (different instances of the same app may run different code).
4. `AddressInfo` now carries instance_id, and `connect_multiple_hosts` returns the winner's instance_id. The proxy looks up that instance's port_attrs to decide whether to send a PROXY header.
5. Backward compat: if an instance has no port_attrs (legacy CVM), the gateway lazily fetches them via the agent's `Info()` RPC, parses `tcb_info.app_compose`, and caches the result in WaveKV.

The PROXY protocol module is unchanged; only the *decision* of whether to send a header moves from a global config to a per-instance lookup.
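A minimal sketch of the wire-format change in step 2. Message layout and field numbers are assumptions for illustration; only the names `PortAttrs`, `PortAttrsList`, and `RegisterCvmRequest.port_attrs` come from the PR:

```proto
// Hypothetical sketch — field numbers are illustrative, not taken from the PR.
message PortAttrs {
  uint32 port = 1;  // protobuf has no u16; the gateway must range-check this
  bool pp = 2;      // app-declared: send a PROXY header to this port
}

message PortAttrsList {
  repeated PortAttrs attrs = 1;
}

message RegisterCvmRequest {
  // ...existing fields...
  // optional wrapper: unset = legacy dstack-util ("not reported"),
  // present-but-empty = new CVM with no PP-enabled port.
  optional PortAttrsList port_attrs = 15;
}
```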
A re-registration from a legacy CVM carries port_attrs=None, which previously wiped any value learned at an earlier registration or lazy fetch. Gateway restart + CVM re-register would then force a redundant Info() fetch. Keep cached attrs unless the caller actively reports new ones; same instance_id implies same compose_hash, so the cache cannot go stale.
Same instance_id with a different compose_hash means the app was upgraded in place (typical for KMS-provisioned CVMs that reuse their disk). Previously, a legacy-style re-registration (port_attrs=None) would preserve stale cached attrs across such upgrades because the gateway assumed instance_id ↔ compose_hash was stable. Track the compose_hash each cached port_attrs was learned against (taken directly from the attested AppInfo, not from client input). Mismatch clears the cache so the lazy Info() fetch runs again.
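The retention rule from the two commits above can be sketched as a small decision function. This is a hedged illustration, not the gateway's actual code; the names `on_register` and `CacheAction` are hypothetical:

```rust
// Sketch of the port_attrs cache-retention rule: keep cached attrs on a
// legacy re-registration unless the attested compose_hash changed, in which
// case clear the cache so the lazy fetch runs again.
#[derive(Debug, PartialEq)]
enum CacheAction {
    Replace,       // caller actively reported fresh port_attrs
    Keep,          // legacy re-registration, same compose_hash: cache still valid
    ClearAndFetch, // compose_hash changed (in-place upgrade) or nothing cached yet
}

fn on_register(
    reported_attrs: Option<&[(u16, bool)]>, // None = legacy dstack-util
    cached_hash: Option<&str>,              // compose_hash the cache was learned against
    attested_hash: &str,                    // from the attested AppInfo, not client input
) -> CacheAction {
    match (reported_attrs, cached_hash) {
        (Some(_), _) => CacheAction::Replace,
        (None, Some(h)) if h == attested_hash => CacheAction::Keep,
        (None, _) => CacheAction::ClearAndFetch,
    }
}

fn main() {
    // Legacy re-registration after a gateway restart: cache survives.
    assert_eq!(on_register(None, Some("h1"), "h1"), CacheAction::Keep);
    // Same instance_id, new compose_hash: in-place upgrade invalidates.
    assert_eq!(on_register(None, Some("h1"), "h2"), CacheAction::ClearAndFetch);
}
```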
End-to-end test on tdxlab

Deployed an nginx app with test endpoints.

Outbound PP (per-port, declared in app-compose.json):

| Port | pp | Backend sees |
|---|---|---|
| 8080 | true | `proxy_protocol_addr=107.131.79.101` (real client) |
| 8081 | false | `remote_addr=10.8.42.1` (gateway WG IP — client IP lost, as expected) |
Inbound PP (gateway behind a PP-aware LB)
Set inbound_pp_enabled = true, moved gateway listen to :13006, fronted with haproxy on :13004 using send-proxy-v2:
client (107.131.79.101) → haproxy:13004 → [PP v2] → gateway:13006 → [PP v2] → backend
Result on pp=true port: origin addr = 107.131.79.101 — the real client IP propagates through both hops.
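For reference, each PP-aware hop in the chain above prepends a header like the one below before any payload bytes. This sketch shows the human-readable v1 form per the haproxy PROXY protocol spec (the test used v2, which is binary; the addresses other than 107.131.79.101 are illustrative):

```rust
// PROXY protocol v1 line: "PROXY TCP4 <src> <dst> <srcport> <dstport>\r\n"
fn pp_v1_header(src: &str, dst: &str, src_port: u16, dst_port: u16) -> String {
    format!("PROXY TCP4 {src} {dst} {src_port} {dst_port}\r\n")
}

fn main() {
    let h = pp_v1_header("107.131.79.101", "10.8.42.2", 51234, 8080);
    assert_eq!(h, "PROXY TCP4 107.131.79.101 10.8.42.2 51234 8080\r\n");
}
```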
Pull request overview
Adds end-to-end PROXY protocol (v1/v2) support in dstack-gateway, with per-(instance_id, port) outbound decisioning sourced from attested app-compose metadata (and lazily fetched for legacy CVMs), plus optional inbound PP parsing when the gateway is behind a PP-aware LB.
Changes:

- Introduce `AppCompose.ports`/`PortAttrs` and propagate port attributes through CVM registration (protobuf + dstack-util).
- Add gateway-side PP header read/synthesis and conditional outbound PP header injection per selected backend instance/port.
- Persist per-instance port attributes in WaveKV with compose-hash invalidation and legacy lazy fetch via guest-agent `Info()`.
Reviewed changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| guest-agent/src/rpc_service.rs | Updates test fixture for new AppCompose.ports field. |
| gateway/src/proxy/tls_terminate.rs | Injects outbound PP header (when enabled) before bridging to backend. |
| gateway/src/proxy/tls_passthough.rs | Carries PP header through passthrough flow; returns winning instance_id from racing connect; injects PP header per port. |
| gateway/src/proxy/port_attrs.rs | New: per-instance/port lookup with legacy lazy fetch via agent Info(). |
| gateway/src/proxy.rs | Reads/synthesizes inbound PP header before SNI extraction; passes header through proxy paths. |
| gateway/src/pp.rs | New: inbound PROXY protocol v1/v2 parse + synthesized header creation + display helper. |
| gateway/src/models.rs | Extends InstanceInfo with port_attrs and port_attrs_hash. |
| gateway/src/main_service/tests.rs | Adjusts test calls for updated registration/new_client signatures. |
| gateway/src/main_service/snapshots/dstack_gateway__main_service__tests__config.snap | Snapshot update for new InstanceInfo fields. |
| gateway/src/main_service/snapshots/dstack_gateway__main_service__tests__config-2.snap | Snapshot update for new InstanceInfo fields. |
| gateway/src/main_service.rs | Wires registration to store port attrs + compose hash; persists/invalidates cache; exposes instance lookup/update helpers; threads instance_id through address selection. |
| gateway/src/main.rs | Registers new pp module. |
| gateway/src/kv/mod.rs | Persists port_attrs and port_attrs_hash in InstanceData; defines PortFlags. |
| gateway/src/debug_service.rs | Updates debug registration call signature. |
| gateway/src/config.rs | Adds agent_port, inbound_pp_enabled, and timeouts.pp_header. |
| gateway/rpc/proto/gateway_rpc.proto | Adds PortAttrs / PortAttrsList; extends RegisterCvmRequest with optional port_attrs. |
| gateway/gateway.toml | Adds inbound_pp_enabled and pp_header timeout configuration. |
| gateway/Cargo.toml | Adds proxy-protocol dependency. |
| dstack-util/src/system_setup.rs | Sends port_attrs during registration based on app-compose ports. |
| dstack-types/src/lib.rs | Adds AppCompose.ports and PortAttrs schema. |
| Cargo.toml | Adds workspace dependency pin for proxy-protocol. |
| Cargo.lock | Locks new dependency graph for proxy-protocol and transitive deps. |
Three fixes from review:
1. Treat the wire-format `port: uint32` as out-of-range when it can't fit
in u16 (instead of silently truncating to a different valid port). Use
`u16::try_from` and skip invalid entries.
2. Move the legacy `Info()` lazy fetch off the connection critical path:
- `should_send_pp` is now sync. On a cache hit it returns the declared
value; on a miss it enqueues the instance for the background worker
and returns `pp = false` immediately, so a slow/missing CVM agent
never blocks a proxied connection.
- A single background task (`spawn_fetcher`) drains the queue, dedupes
in-flight instance ids via a HashSet, applies a configurable
timeout (`timeouts.port_attrs_fetch`, default 10s), and writes the
result back to WaveKV.
3. Add unit tests in `pp.rs` for the inbound PROXY parser: v1/v2 IPv4
happy paths, no-prefix rejection, v1 missing terminator, v2
over-length cap, and the address synthesis/Display helpers.
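Fix 1 can be sketched in a few lines. The function name is illustrative; the point is that `u16::try_from` rejects out-of-range wire values where an `as u16` cast would silently wrap to a different, valid-looking port:

```rust
// Reject protobuf uint32 ports that don't fit in u16 instead of truncating.
fn parse_wire_port(wire: u32) -> Option<u16> {
    u16::try_from(wire).ok()
}

fn main() {
    assert_eq!(parse_wire_port(8080), Some(8080));
    assert_eq!(parse_wire_port(70_000), None);
    // The bug being fixed: a plain cast wraps modulo 65536.
    assert_eq!(70_000u32 as u16, 4464);
}
```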
When a CVM registers without port_attrs (legacy CVM, or compose_hash mismatch invalidated the cache), enqueue a background fetch right away instead of waiting for the first proxied connection to discover the miss. Reduces the window during which the fast path returns a wrong `pp = false` because the cache hasn't been populated yet. The fetcher dedupes in-flight ids, so this is safe to enqueue on every registration that ends up without cached attrs.
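A minimal sketch of the sync fast path plus dedupe queue described in fix 2 and the paragraph above. Structure and names are illustrative — the queue here is a plain `VecDeque` drained synchronously, whereas the real fetcher is a background task:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Cache miss never blocks a connection: enqueue for the background fetcher
// and conservatively return pp = false until the cache is populated.
#[derive(Default)]
struct PortAttrsCache {
    cache: HashMap<String, HashMap<u16, bool>>, // instance_id -> port -> pp
    queue: VecDeque<String>,                    // pending instance ids
    in_flight: HashSet<String>,                 // dedupe already-queued ids
}

impl PortAttrsCache {
    fn should_send_pp(&mut self, instance_id: &str, port: u16) -> bool {
        match self.cache.get(instance_id) {
            // Hit: use the app-declared flag (absent port => false).
            Some(ports) => ports.get(&port).copied().unwrap_or(false),
            // Miss: queue at most once per instance, answer immediately.
            None => {
                if self.in_flight.insert(instance_id.to_string()) {
                    self.queue.push_back(instance_id.to_string());
                }
                false
            }
        }
    }
}

fn main() {
    let mut c = PortAttrsCache::default();
    assert!(!c.should_send_pp("inst-a", 8080)); // miss: conservative false
    assert!(!c.should_send_pp("inst-a", 8080)); // deduped, not queued twice
    assert_eq!(c.queue.len(), 1);
    // Once the fetcher writes the result back, the declared flag wins.
    c.cache.insert("inst-a".into(), [(8080u16, true)].into_iter().collect());
    assert!(c.should_send_pp("inst-a", 8080));
}
```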
Right after registration, the WireGuard handshake hasn't completed yet and the agent's TCP port isn't reachable. The previous one-shot fetch would fail and leave the cache empty, falling back to pp=false until the next connection (which would itself eat one more failed fetch). Move the timeout/retry policy into a dedicated config block so it can be tuned per deployment:

```toml
[core.proxy.port_attrs_fetch]
timeout = "10s"         # per-attempt Info() RPC timeout
max_retries = 5         # extra attempts after the initial try
backoff_initial = "1s"  # doubles each retry up to backoff_max
backoff_max = "30s"
```

Worst-case 1+2+4+8+16+30 ≈ 1 min covers a reasonable WG warmup window. Bail out early when the instance is no longer in state (recycled while queued) — the unknown-instance error chain is the signal.
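The doubling-with-cap schedule behind that worst-case figure can be computed directly. A sketch with seconds as plain integers (the real config uses duration strings, and `backoff_schedule` is a hypothetical helper, not gateway code):

```rust
// Exponential backoff: start at `initial_s`, double each wait, cap at `max_s`.
fn backoff_schedule(waits: u32, initial_s: u64, max_s: u64) -> Vec<u64> {
    let mut delays = Vec::new();
    let mut d = initial_s;
    for _ in 0..waits {
        delays.push(d.min(max_s));
        d = d.saturating_mul(2);
    }
    delays
}

fn main() {
    // Six waits with a 30s cap reproduce the commit's worst case: ~1 minute.
    let s = backoff_schedule(6, 1, 30);
    assert_eq!(s, vec![1, 2, 4, 8, 16, 30]);
    assert_eq!(s.iter().sum::<u64>(), 61);
}
```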
Don't waste a 1-minute retry budget on errors that can't recover. Two classes:

- Transient → retry: TCP/RPC failure, Info() timeout. The CVM may just be warming up.
- Permanent → bail: instance was recycled (no longer in state), tcb_info isn't valid JSON, missing app_compose key, or app_compose itself fails to parse. Same input each retry, same failure.

`tcb_info` empty (public_tcbinfo=false) still goes through the success path with an empty map cached, as before — that's not a fetch failure.
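The transient/permanent split maps naturally onto an error enum plus a single predicate. Variant names below are hypothetical, not the gateway's actual error types:

```rust
// Sketch of the retry classification described above.
#[derive(Debug)]
enum FetchError {
    RpcFailure,        // TCP/RPC failure — CVM may still be warming up
    InfoTimeout,       // Info() deadline exceeded
    InstanceGone,      // recycled while queued: no longer in gateway state
    TcbInfoNotJson,    // same bytes every retry, same parse failure
    MissingAppCompose, // tcb_info lacks the app_compose key
    AppComposeInvalid, // app_compose itself fails to parse
}

fn is_transient(e: &FetchError) -> bool {
    matches!(e, FetchError::RpcFailure | FetchError::InfoTimeout)
}

fn main() {
    assert!(is_transient(&FetchError::InfoTimeout));
    assert!(!is_transient(&FetchError::InstanceGone));
    assert!(!is_transient(&FetchError::AppComposeInvalid));
}
```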
Thread the new gateway config knobs through the dstack-app deployment:

- .env / .app_env gains `INBOUND_PP_ENABLED` (default false). Set to true only when the gateway runs behind a PP-aware L4 LB; otherwise every connection would be rejected because the parser would try to read a PP header that isn't there.
- docker-compose.yaml forwards the new env vars plus the retry/backoff knobs for the background port_attrs fetcher and the pp_header read timeout.
- entrypoint.sh writes the corresponding fields into gateway.toml, including the new [core.proxy.port_attrs_fetch] section.

Defaults match the in-repo gateway.toml so existing deployments continue to work without any .env changes.
Pull request overview
Copilot reviewed 24 out of 25 changed files in this pull request and generated 4 comments.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
The pre-existing script had three latent issues that weren't checked because the file hadn't been touched. Modifying it for the PP rollout brings it into the prek diff, so fix them now:

- SC1091: `source .env` — explicitly mark dynamic include
- SC2002: replace `cat … | tr …` with `tr … < file` redirect
- SC2086: quote $WG_ADDR in the cut pipeline
Summary
Add PROXY protocol (v1/v2) support to dstack-gateway, with the decision of whether to send a PP header to a backend made per (instance, port) — not per-client, not per-gateway.

Background & security

An earlier revision of this PR encoded PP as a `p` suffix in the SNI subdomain (e.g. `app-8080p.domain.com`). That's client-controlled: a client could connect to a PP-expecting port without the suffix, the gateway would skip writing the PP header, and the backend would fall back to the raw TCP peer address — the gateway's WireGuard IP — as the source. Effectively a source-address spoof.

PP must therefore be declared by the app itself and delivered to the gateway through channels that clients cannot forge.
Design
1. `AppCompose.ports` (dstack-types)

Apps declare per-port attributes in app-compose.json:

```json
{ "ports": [{ "port": 8080, "pp": true }] }
```

Because it's part of app-compose, the declaration is measured into `compose_hash` and covered by attestation.

2. Reported at registration (`RegisterCvmRequest.port_attrs`)

New CVMs include their `port_attrs` in the existing WireGuard registration RPC. The field is wrapped in `optional PortAttrsList` so the gateway can distinguish "not reported" (old dstack-util) from "reported empty" (new CVM with no PP-enabled port).

3. Stored per-instance, synced across gateway nodes

`InstanceInfo`/`InstanceData` grow a `port_attrs` map and a `port_attrs_hash` (the `compose_hash` it was learned against). Both are persisted in the existing WaveKV `inst/{instance_id}` record, so per-instance decisions survive gateway restarts and propagate across the cluster without extra keys. Different instances of the same app may legitimately run different compose hashes (rolling upgrades), so caching is keyed by `instance_id`, not `app_id`.

4. Per-connection decision

`AddressInfo` carries the instance_id and `connect_multiple_hosts` returns the winner's id. `should_send_pp(state, instance_id, port)` consults the cached `port_attrs`: on a hit it uses the declared flag; on a miss it triggers the lazy `Info()` RPC (see next section), caches, and uses the result; with nothing known it falls back to `pp=false`.

5. Backward compatibility

Legacy CVMs that don't yet ship `port_attrs` at registration: the gateway lazily calls `Info()` at `http://{cvm_ip}:{agent_port}/prpc`, parses `tcb_info.app_compose`, extracts `ports`, and writes the result back to WaveKV.

Subtleties:

- A re-registration with `port_attrs=None` does not wipe previously cached attrs (avoids a redundant lazy fetch every 3 minutes when dstack-util is old).
- A re-registration with a different `compose_hash` (app upgraded in place — typical for KMS-provisioned CVMs that reuse their disk) does invalidate the cache so stale PP flags don't outlive the upgrade.

6. Inbound PP

`inbound_pp_enabled` (server config) tells the gateway to read a PP header from the inbound TCP stream — used when the gateway sits behind a PP-aware LB like Cloudflare. When disabled, the gateway synthesises a PP header from the real TCP peer. Either way, the resulting header is what gets forwarded to the backend (when enabled).

Config surface
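A hedged reconstruction of the `gateway.toml` surface from the knobs named elsewhere in this PR (`agent_port`, `inbound_pp_enabled`, `timeouts.pp_header`, and the `port_attrs_fetch` block); exact layout and the illustrative values are assumptions, not copied from the repo:

```toml
[core.proxy]
inbound_pp_enabled = false  # read a PP header from inbound streams (PP-aware LB in front)
agent_port = 8090           # illustrative value; guest-agent port for the lazy Info() fetch

[core.proxy.timeouts]
pp_header = "5s"            # illustrative value; read timeout for the inbound PP header

[core.proxy.port_attrs_fetch]
timeout = "10s"             # per-attempt Info() RPC timeout
max_retries = 5             # extra attempts after the initial try
backoff_initial = "1s"      # doubles each retry up to backoff_max
backoff_max = "30s"
```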
New knobs live in `gateway.toml` under `[core.proxy]`. There is no global `outbound_pp_enabled` — per-port control comes from `app-compose.json`.

Files changed
- `dstack-types/src/lib.rs` — `AppCompose.ports`, `PortAttrs`
- `gateway/rpc/proto/gateway_rpc.proto` — `PortAttrs`, `PortAttrsList`, `RegisterCvmRequest.port_attrs`
- `dstack-util/src/system_setup.rs` — CVM reports `port_attrs` during registration
- `gateway/src/pp.rs` (new) — PP v1/v2 header parse + synthesis
- `gateway/src/proxy/port_attrs.rs` (new) — `should_send_pp` + lazy fetch
- `gateway/src/{config,main_service,models,debug_service}.rs`, `gateway/src/kv/mod.rs`, `gateway/src/proxy/{proxy,tls_passthough,tls_terminate}.rs` — wiring

Test plan

- `cargo check --workspace`
- `cargo test -p dstack-gateway` (8 tests pass, snapshots updated)
- `cargo fmt --all`
- Deploy an app with `ports: [{port: 8080, pp: true}]`, confirm PP header is received at the backend
- Deploy an app without a `ports` field, confirm no PP header is sent
- Flip `pp: true` to `pp: false`, confirm the cache is invalidated on the first re-registration with the new `compose_hash`
- Set `inbound_pp_enabled = true`, confirm client IP is propagated end-to-end