STUN Server Deployment Strategies

A STUN server does one cheap thing: it reflects a client’s public IP and port back in a Binding Response. Where you place those servers, how you route clients to the nearest one, and how you detect a dead node determine whether ICE gathering completes in tens of milliseconds or stalls long enough that reflexive candidates are discarded before SDP exchange. This guide is part of the WebRTC Protocol Stack & Signaling Servers guide. The goal here is a concrete, multi-region STUN deployment: geographic placement that keeps reflexive lookups local, a routing layer that steers each client to its closest resolver, an anycast-versus-unicast decision grounded in UDP behaviour, and health checks that validate actual Binding cycles rather than TCP liveness.

STUN is stateless and almost free to run, which tempts teams to treat it as an afterthought and point everyone at stun:stun.l.google.com:19302. That works until you measure tail latency: a client in Singapore querying a US-East resolver adds roughly 200 ms of round-trip time to a step that should cost a few tens of milliseconds at most, and that delay lands directly on the critical path of ICE Candidate Gathering & Filtering. Multi-region placement can cut initial connect latency by 40–60% for such clients, because the server-reflexive (srflx) candidate is the one most ICE agents end up nominating on networks where direct host paths fail but the NAT is not symmetric.

Figure: one hostname fronts regional resolvers; clients reach the nearest node.

Step 1 — Geographic placement of resolvers

Place STUN resolvers in the same regions where your users — and your media path — already live. The reflexive lookup itself is one UDP round trip, so the only latency you control is propagation: a resolver within 30 ms of the client is the target. Three regions (a US point of presence, a European one, and an Asia-Pacific one) cover the majority of global traffic; add a fourth in South America or India only when your analytics show a concentrated population paying a 150 ms+ penalty.
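The 30 ms target and the 150 ms+ trigger for another region can be checked against your own analytics. The sketch below is illustrative only: `findPlacementGaps`, the `minShare` threshold for what counts as a "concentrated" population, and every number in `sample` are assumptions made up for the example, not measurements.

```javascript
// Classify client populations by round trip to their nearest STUN region.
// rttMs maps each deployed region to a measured median RTT in milliseconds.
function findPlacementGaps(populations, opts = {}) {
  const { targetMs = 30, penaltyMs = 150, minShare = 0.05 } = opts;
  return populations.map(({ name, share, rttMs }) => {
    // Nearest region is the one with the lowest measured round trip.
    const [nearest, best] = Object.entries(rttMs)
      .reduce((a, b) => (b[1] < a[1] ? b : a));
    let verdict = 'ok';
    // A new region is justified only for a large enough population
    // that pays the full penalty even to its nearest node.
    if (best >= penaltyMs && share >= minShare) verdict = 'add-region';
    else if (best > targetMs) verdict = 'watch';
    return { name, nearest, rttMs: best, verdict };
  });
}

// Hypothetical analytics rows (share = fraction of total sessions).
const sample = [
  { name: 'Frankfurt', share: 0.22, rttMs: { us: 95, eu: 8, apac: 240 } },
  { name: 'Mumbai', share: 0.12, rttMs: { us: 230, eu: 130, apac: 65 } },
  { name: 'Sao Paulo', share: 0.09, rttMs: { us: 155, eu: 205, apac: 320 } },
];

console.log(findPlacementGaps(sample));
```

With these made-up inputs the three rows come out as `ok`, `watch`, and `add-region` respectively. The point of the `minShare` guard is that a handful of distant users should not drive a new point of presence.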

Co-locate STUN with the rest of your edge where it makes sense. If you also run TURN relays — see TURN Server Configuration & Auth — putting the STUN listener in the same region keeps the fallback path coherent: a client that fails srflx and escalates to a relay does not suddenly cross an ocean. Bind each node to dual-stack IPv4/IPv6 interfaces so mobile clients on IPv6-only carriers still gather a usable reflexive candidate.

# coturn STUN-only listener, one node per region
listening-port=3478
# Advertise the node's public address, not the cloud-internal RFC 1918 IP
external-ip=203.0.113.21
# Answer STUN only and ignore TURN requests; without this, no-auth below
# would leave an open relay
stun-only
# Serve STUN over plain UDP only: drop the TLS, DTLS, and TCP listeners
# to shrink the attack surface
no-tls
no-dtls
no-tcp
# Plain STUN Binding (RFC 8489) is unauthenticated by design
no-auth
# Disable the telnet admin console in production
no-cli

Keep resolvers stateless. A STUN Binding exchange carries no session, so any node can answer any client — that property is what lets you scale horizontally behind a load balancer or anycast prefix without session affinity. The moment you add connection tracking you have broken the assumption that makes STUN cheap.

Size each node for request rate, not bandwidth. A STUN Binding is one small UDP datagram in and one out, so a modest instance answers tens of thousands of lookups per second; the constraint is packet-per-second handling and the kernel’s UDP socket buffers, not CPU or link capacity. This is the opposite of TURN, where each relayed session consumes sustained media bandwidth. Because the workload is so light, the right granularity is one small node (or a small autoscaling group) per region rather than a few large central boxes — placement near users buys more than vertical scale ever will.

Step 2 — Minimising latency: routing clients to the nearest node

Geographic placement only pays off if each client actually reaches its closest resolver. Two routing mechanisms achieve this, and they apply at different layers.

GeoDNS resolves a single hostname (stun.yourdomain.com) to the regional IP nearest the client’s resolver. It is simple and works with the standard iceServers array, but it inherits DNS caching: a client on a misconfigured resolver, or one using a public DNS service far from its physical location, can be steered to the wrong region. Keep the TTL low (30–60 s) so failover is timely without hammering your authoritative servers.

The alternative is anycast, covered in depth in Step 3, which advertises one IP from every region and lets BGP pick the closest. Either way, the client config stays trivial: list one or two STUN URLs and let the routing layer resolve them. Do not list five regional hostnames in iceServers: every extra endpoint adds DNS resolution time and srflx candidates that compete for nomination without improving connectivity, and browsers log a warning that long server lists slow down discovery.

// Client config points at ONE routed hostname, not a hardcoded region.
// The routing layer (anycast or GeoDNS) selects the nearest node.
const pc = new RTCPeerConnection({
  iceServers: [
    { urls: 'stun:stun.yourdomain.com:3478' },      // nearest node, resolved at runtime
    { urls: 'turn:turn.yourdomain.com:3478',         // co-located relay fallback
      username: creds.username, credential: creds.credential }
  ],
  iceCandidatePoolSize: 10  // pre-gather srflx candidates before the call starts
});

Pre-gathering with iceCandidatePoolSize warms the reflexive lookup before setLocalDescription(), the call that normally starts gathering, so the round trip to the nearest resolver overlaps with signalling rather than serialising after it. On a well-placed node this removes the STUN round trip from the perceived connect path entirely.

Step 3 — Anycast vs unicast topology

Anycast announces the same IP from multiple physical locations; the network routes each packet to the topologically nearest announcement. For STUN this is attractive because a single iceServers entry transparently resolves to a local node, and failover happens as fast as BGP reconverges once a region withdraws its route (typically seconds), with no DNS TTL to wait out.

The usual caveat with anycast is that consecutive packets from one client may land on different nodes, and for STUN that happens to be a non-issue. A STUN Binding is request/response: the client sends one packet, gets one back, and the exchange is complete. Even if BGP reconverges mid-flight and a retransmit lands on a different node, that node can answer it identically because no node holds session state. This is exactly why anycast pairs cleanly with STUN but is dangerous for TURN, where an allocation is a long-lived stateful flow that must stay pinned to one server.
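The exchange is small enough to write out by hand, which makes the statelessness concrete: everything a node needs to answer is in the one request datagram. The following is a minimal Node.js sketch, assuming IPv4 over UDP and ignoring retransmission; `buildBindingRequest`, `parseXorMappedAddress`, and `stunProbe` are names invented for this example, while the message layout and magic cookie follow RFC 8489.

```javascript
const crypto = require('node:crypto');
const dgram = require('node:dgram');

const MAGIC_COOKIE = 0x2112a442; // fixed value defined by RFC 8489

// 20-byte Binding Request: type 0x0001, empty body, cookie, 96-bit txn ID.
function buildBindingRequest(txId = crypto.randomBytes(12)) {
  const msg = Buffer.alloc(20);
  msg.writeUInt16BE(0x0001, 0);
  msg.writeUInt16BE(0, 2);
  msg.writeUInt32BE(MAGIC_COOKIE, 4);
  txId.copy(msg, 8);
  return msg;
}

// Pull the IPv4 XOR-MAPPED-ADDRESS (0x0020) out of a Binding Success
// Response (0x0101). Returns null on any mismatch or if it is absent.
function parseXorMappedAddress(msg, txId) {
  if (msg.length < 20 || msg.readUInt16BE(0) !== 0x0101) return null;
  if (msg.readUInt32BE(4) !== MAGIC_COOKIE) return null;
  if (txId && !msg.subarray(8, 20).equals(txId)) return null;
  const end = Math.min(msg.length, 20 + msg.readUInt16BE(2));
  let off = 20;
  while (off + 4 <= end) {
    const type = msg.readUInt16BE(off);
    const len = msg.readUInt16BE(off + 2);
    if (type === 0x0020 && len >= 8 && off + 12 <= end && msg[off + 5] === 0x01) {
      // Port is XORed with the top 16 bits of the cookie, address with all 32.
      const port = msg.readUInt16BE(off + 6) ^ (MAGIC_COOKIE >>> 16);
      const raw = (msg.readUInt32BE(off + 8) ^ MAGIC_COOKIE) >>> 0;
      const address = [24, 16, 8, 0].map((s) => (raw >>> s) & 0xff).join('.');
      return { address, port };
    }
    off += 4 + len + ((4 - (len % 4)) % 4); // attributes are padded to 32 bits
  }
  return null;
}

// One request, one response, no retransmit: resolves with { address, port }.
function stunProbe(host, port = 3478, timeoutMs = 1000) {
  return new Promise((resolve, reject) => {
    const sock = dgram.createSocket('udp4');
    const txId = crypto.randomBytes(12);
    const timer = setTimeout(() => {
      sock.close();
      reject(new Error('STUN timeout'));
    }, timeoutMs);
    sock.on('message', (msg) => {
      const mapped = parseXorMappedAddress(msg, txId);
      if (!mapped) return; // ignore unrelated or mismatched datagrams
      clearTimeout(timer);
      sock.close();
      resolve(mapped);
    });
    sock.send(buildBindingRequest(txId), port, host);
  });
}
```

A call such as `stunProbe('stun.yourdomain.com')` would work identically against any node behind an anycast address, since the response depends only on the request's source address and transaction ID.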

Unicast (distinct IPs per region, fronted by GeoDNS) is the pragmatic default for teams without their own anycast prefix and BGP relationships. It is operationally simpler, debuggable with a plain dig, and good enough when your TTLs are short.

Property	Anycast	Unicast + GeoDNS
Client config	One IP, network-routed	One hostname, DNS-routed
Failover speed	Seconds (BGP withdraw and reconvergence)	Bounded by DNS TTL (30–60 s)
STUN suitability	Excellent (stateless req/resp)	Excellent
TURN suitability	Poor (stateful allocations)	Acceptable with pinning
Operational cost	High (BGP, PI space)	Low (managed DNS)
Debuggability	Harder (path-dependent)	Easy (`dig`, traceroute)

Most deployments start unicast and graduate to anycast only when failover latency or per-region DNS skew becomes a measured problem.

A practical hybrid avoids choosing globally: run anycast for STUN where you already have the prefix and BGP relationships, and keep TURN on distinct unicast IPs with session pinning. Because STUN and TURN are answered by the same coturn binary, you can still co-locate them on one host per region — you simply announce the STUN listener into the anycast prefix and bind the TURN listener to a region-specific address that GeoDNS or explicit per-region hostnames resolve. The client then gathers reflexive candidates from the network-routed STUN IP and, only on symmetric-NAT fallback, allocates a relay on the pinned TURN address.

Step 4 — Health checks and verification

A STUN node can pass a TCP connect check and still be useless: the process may be alive while the UDP listener is wedged or returning a private address. Health checks must validate a real Binding Request/Response cycle and assert that the mapped address is public, then deregister failing nodes from the routing layer.

#!/bin/bash
# Validate each resolver behind the routed hostname with a real STUN exchange.
# Uses turnutils_stunclient, which ships with coturn. The "reflexive addr"
# output label is assumed here; confirm it against your coturn version.
# Run from outside the node's private network, and repeat for AAAA records.
for host in $(dig +short stun.yourdomain.com A); do
  out=$(timeout 5 turnutils_stunclient -p 3478 "$host" 2>&1)
  addr=$(echo "$out" | grep -i "reflexive addr" | grep -oE '[0-9]+(\.[0-9]+){3}' | head -n1)
  if [[ -z "$addr" ]]; then
    echo "UNHEALTHY (no response): $host"                   # deregister
  # Reject RFC 1918 leaks: a private mapped address means the probe was
  # source-NATed or never left the private network.
  elif [[ "$addr" =~ ^(10\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.) ]]; then
    echo "UNHEALTHY (private mapped addr $addr): $host"     # deregister
  else
    echo "HEALTHY ($addr): $host"
  fi
done
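The regex in the script covers only the three RFC 1918 blocks. If the prober is written in code rather than shell, a CIDR-based check is easier to extend. This is a sketch under assumptions: `isPublicIPv4` is a hypothetical helper, and the extra ranges (CGNAT shared space, loopback, link-local) are additions beyond what the script checks.

```javascript
// IPv4 ranges a mapped address reported to an external prober should never fall in.
const NON_PUBLIC_V4 = [
  ['10.0.0.0', 8], // RFC 1918
  ['172.16.0.0', 12], // RFC 1918
  ['192.168.0.0', 16], // RFC 1918
  ['100.64.0.0', 10], // RFC 6598 shared address space (CGNAT)
  ['127.0.0.0', 8], // loopback
  ['169.254.0.0', 16], // link-local
];

// Dotted quad -> unsigned 32-bit integer, or null if malformed.
function ipv4ToInt(ip) {
  const parts = String(ip).split('.');
  if (parts.length !== 4 || !parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255)) {
    return null;
  }
  const [a, b, c, d] = parts.map(Number);
  return ((a << 24) | (b << 16) | (c << 8) | d) >>> 0;
}

// True only for well-formed addresses outside every listed range.
function isPublicIPv4(ip) {
  const n = ipv4ToInt(ip);
  if (n === null) return false;
  return !NON_PUBLIC_V4.some(([base, bits]) => {
    const mask = (0xffffffff << (32 - bits)) >>> 0;
    return ((n & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
  });
}
```

A prober would mark the node unhealthy when `isPublicIPv4(mapped.address)` is false, mirroring the "private mapped addr" branch of the script.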

Drive this from your load balancer or an external prober on a 10–15 s interval, and export coturn’s Prometheus counters so you can alarm on request rate, dropped packets, and 4xx error responses. To verify end-to-end from a browser, open chrome://webrtc-internals, start a connection, and confirm srflx candidates appear with the expected public address and a gathering time well under your ICE timeout. A node returning candidates 200 ms+ late is effectively unhealthy even if it answers — those candidates risk being discarded before SDP exchange.
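The same check can be made programmatically instead of reading chrome://webrtc-internals by eye. The sketch below is browser-oriented and rests on assumptions: `candidateType` and `timeToFirstSrflx` are hypothetical helpers, and the 2000 ms budget is an arbitrary placeholder for your own ICE timeout.

```javascript
// Extract the candidate type from an ICE candidate line, whose grammar is
// "candidate:<foundation> <component> <transport> <priority> <addr> <port> typ <type> ...".
function candidateType(candidateLine) {
  const m = /\btyp (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return m ? m[1] : null;
}

// Resolves with the milliseconds until the first srflx candidate, or null
// if gathering ends (or the budget elapses) without one.
// Attach it BEFORE gathering starts (before setLocalDescription).
function timeToFirstSrflx(pc, budgetMs = 2000) {
  return new Promise((resolve) => {
    const start = performance.now();
    const timer = setTimeout(() => resolve(null), budgetMs);
    pc.addEventListener('icecandidate', (ev) => {
      if (!ev.candidate) {
        // A null candidate signals end of gathering.
        clearTimeout(timer);
        resolve(null);
        return;
      }
      if (candidateType(ev.candidate.candidate) === 'srflx') {
        clearTimeout(timer);
        resolve(performance.now() - start);
      }
    });
  });
}
```

Reporting the resolved value to your metrics backend per region gives a client-side view of the 200 ms threshold that server-side probes cannot see.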

Health and routing must be wired together or the check is cosmetic. With GeoDNS, a failed probe should remove the regional A record (respecting the 30–60 s TTL) so new clients resolve to a healthy neighbour. With anycast, deregistration means withdrawing the BGP announcement from the failing node so the network reroutes to the next-nearest region as soon as BGP reconverges. Either way, run the prober from multiple vantage points: a node can be reachable from your monitoring VPC but blackholed from a particular carrier, and a single-origin check will miss it. Capture a packet trace with tcpdump -i any -n 'udp port 3478' when a node flaps to confirm whether requests arrive and responses leave; that distinguishes a wedged listener from an upstream routing problem.
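It helps to put a number on how long clients can keep hitting a dead node. The arithmetic below is a rough sketch: `worstCaseFailoverSeconds` is a hypothetical helper, `failuresToEvict` (consecutive failed probes before deregistration) is an assumed policy the text does not specify, and the anycast propagation figure is a placeholder rather than a measurement.

```javascript
// Upper bound on seconds between a node failing and clients no longer
// being routed to it.
function worstCaseFailoverSeconds({ probeIntervalS, failuresToEvict, propagationS }) {
  // The node can fail just after a successful probe, so detection takes
  // up to `failuresToEvict` full probe intervals.
  const detectionS = probeIntervalS * failuresToEvict;
  // Propagation is the DNS TTL for GeoDNS, or BGP convergence for anycast.
  return detectionS + propagationS;
}

// GeoDNS: 15 s probes, evict after 2 failures, 60 s TTL.
const geoDns = worstCaseFailoverSeconds({ probeIntervalS: 15, failuresToEvict: 2, propagationS: 60 });

// Anycast: 10 s probes, evict after 2 failures, assumed 5 s convergence.
const anycast = worstCaseFailoverSeconds({ probeIntervalS: 10, failuresToEvict: 2, propagationS: 5 });

console.log({ geoDns, anycast }); // { geoDns: 90, anycast: 25 }
```

With these inputs, detection time is a large share of the total in both cases, so tightening the probe interval often buys as much as changing the routing mechanism.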

Edge Cases & Browser Quirks

Long server lists backfire. Listing many regional STUN hostnames inflates DNS resolution time and produces redundant srflx candidates rather than better connectivity, and browsers log warnings about long iceServers lists. One routed hostname is correct.
mDNS host obfuscation. Firefox and Chrome replace host candidate IPs with .local mDNS names by default, which makes the srflx candidate from STUN even more important: it is often the first globally routable candidate the remote peer can use.
Safari gathering timeout. Safari is less tolerant of slow STUN responses and may finish gathering before a distant or rate-limited resolver replies, silently dropping that srflx candidate. Nearest-node routing is what keeps Safari from skipping STUN entirely.
Symmetric NAT defeats STUN regardless of placement. On a symmetric NAT the port mapping changes per destination, so the reflexive address STUN learns is not the mapping the NAT will use toward the peer. No amount of geographic tuning fixes this; a TURN relay is the mandatory fallback, which is why traversing symmetric NAT with TURN is the companion path. Mobile and CGNAT mappings can also time out after less than 30 s of idle time, so a candidate gathered too early can be stale by the time it is used.
MTU on carrier networks. Keep STUN responses under 1280 bytes; some carrier paths fragment larger UDP datagrams unreliably, dropping the response and forcing a retransmit that costs another round trip.

Common Implementation Mistakes

Single-region or single-instance STUN. One node, or one region, is a single point of failure on the critical path of every connection and adds cross-continent latency for half your users. Deploy per-region with a routing layer in front.
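external-ip left at the cloud-internal address. Behind a cloud NAT gateway coturn sees only its RFC 1918 interface address, so the addresses it advertises about itself (TURN relayed addresses and RFC 5780 alternate addresses) come out private and useless. Set external-ip to the public address. The mapped address in a Binding Response is copied from the request’s source, so a private value there points instead to a source-NATing hop in front of the node or a prober inside the same private network; the health check above catches that case.
Routing STUN through a TCP load balancer. This deployment serves Binding requests over UDP, which a TCP-only balancer cannot forward. Use a UDP-aware L4 balancer that preserves the client’s source address (Direct Server Return is one way), because STUN reflects whatever source address the node sees and a source-NATing balancer would make every response report the balancer’s own address. Otherwise rely on anycast or GeoDNS.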
TCP-style health checks. A port-open check reports a wedged UDP listener as healthy. Probe with an actual Binding exchange and assert a public mapped address.
No TURN fallback. STUN cannot traverse symmetric NAT. Shipping STUN-only guarantees connection failures for users behind symmetric or carrier-grade NAT; always pair it with a relay.
No rate limiting. STUN responses are slightly larger than requests, making open resolvers a UDP amplification vector. Apply per-source request quotas even though plain STUN needs no authentication.
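A per-source quota is usually implemented as a token bucket. The sketch below shows the idea in isolation; `SourceRateLimiter` and its default rate and burst are assumptions for illustration, not coturn's built-in mechanism, and in production the same policy is often enforced in the firewall or the server's own quota settings instead.

```javascript
// Token bucket keyed by source IP. Because reflected traffic goes to the
// (possibly spoofed) source, capping per source caps what any victim receives.
class SourceRateLimiter {
  constructor({ ratePerSec = 20, burst = 40, now = () => Date.now() } = {}) {
    this.ratePerSec = ratePerSec;
    this.burst = burst;
    this.now = now; // injectable clock, in milliseconds
    this.buckets = new Map(); // source IP -> { tokens, last }
  }

  // Returns true if a request from sourceIp may be answered.
  allow(sourceIp) {
    const t = this.now();
    const b = this.buckets.get(sourceIp) ?? { tokens: this.burst, last: t };
    // Refill in proportion to elapsed time, capped at the burst size.
    const refill = ((t - b.last) / 1000) * this.ratePerSec;
    b.tokens = Math.min(this.burst, b.tokens + refill);
    b.last = t;
    const ok = b.tokens >= 1;
    if (ok) b.tokens -= 1;
    this.buckets.set(sourceIp, b);
    return ok;
  }
}
```

The map grows with the number of distinct sources, so a real deployment would also evict idle entries on a timer; that is left out here to keep the bucket logic visible.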

FAQ

How many STUN regions do I actually need? Three (North America, Europe, and Asia-Pacific) cover most global traffic and keep most users close to the 30 ms reflexive round-trip target. Add a fourth region only when analytics show a concentrated population paying a 150 ms+ penalty. Per client, expose one routed hostname rather than several regional ones.

Should I run my own STUN at all, or just use public servers? It depends on reliability and privacy requirements. Public resolvers like stun.l.google.com are free but offer no SLA, no latency guarantee near your users, and unannounced rate limits. The full trade-off — including a minimal coturn STUN-only config — is in self-hosting Coturn STUN vs public STUN servers.

Can I run STUN and TURN on the same coturn process? Yes — coturn answers STUN Binding requests on the same listener whether or not TURN relaying is enabled. But the deployment topologies diverge: STUN scales statelessly behind anycast, while TURN allocations are stateful and must pin to one node. For a relay-grade config see configuring Coturn for production TURN relay.

Why does anycast work for STUN but not for TURN? A STUN Binding is a single stateless request/response, so any node can answer any packet and BGP reconvergence is harmless. A TURN allocation is a long-lived stateful flow; if anycast reroutes mid-session to a node that holds no allocation state, the relay breaks. STUN is the stateless half of the same coturn binary.

Related: this guide sits under WebRTC Protocol Stack & Signaling Servers; pair it with self-hosting Coturn STUN vs public STUN servers, TURN Server Configuration & Auth, and ICE Candidate Gathering & Filtering for the full NAT-traversal path.
