STUN Server Deployment Strategies

A STUN server does one cheap thing: it reflects a client’s public IP and port back in a Binding Response. But where you place those servers, how you route clients to the nearest one, and how you detect a dead node determine whether ICE gathering completes in tens of milliseconds or stalls long enough to be discarded before SDP exchange. This guide is part of the WebRTC Protocol Stack & Signaling Servers guide. The goal here is a concrete, multi-region STUN deployment: geographic placement that keeps reflexive lookups local, a routing layer that steers each client to its closest resolver, an anycast-versus-unicast decision grounded in UDP behaviour, and health checks that validate actual Binding cycles rather than TCP liveness.

STUN is stateless and almost free to run, which tempts teams to treat it as an afterthought and point everyone at stun:stun.l.google.com:19302. That works until you measure tail latency: a client in Singapore querying a US-East resolver adds 180–220 ms of round-trip time to a step that should cost single-digit milliseconds, and that delay lands directly on the critical path of ICE Candidate Gathering & Filtering. Multi-region placement can cut initial connect latency by 40–60%, because the server-reflexive (srflx) candidate is the one most ICE agents end up nominating on networks where direct host paths fail but the NAT is not symmetric.

[Figure: Multi-region STUN topology with nearest-resolver selection. Three regional STUN resolvers (US-East, EU-West, AP-Southeast), each listening on 3478/udp, sit behind an anycast or GeoDNS routing layer. Clients in NYC, Paris, and Singapore resolve the single hostname stun.yourdomain.com and are steered to the geographically closest resolver, which returns the server-reflexive candidate.]
One hostname fronts regional resolvers; clients reach the nearest node.

Step 1: Geographic placement of resolvers

Place STUN resolvers in the same regions where your users, and your media path, already live. The reflexive lookup itself is one UDP round trip, so the only latency you control is propagation: a resolver within 30 ms of the client is the target. Three regions (a US point of presence, a European one, and an Asia-Pacific one) cover the majority of global traffic; add a fourth in South America or India only when your analytics show a concentrated population paying a 150 ms+ penalty.
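To see why propagation dominates, a lower bound on the round trip can be estimated from great-circle distance alone. The sketch below is illustrative only: the coordinates, the function names, and the assumed signal speed in fibre (about 200 km per millisecond, roughly two thirds of the speed of light) are assumptions, and real paths are longer than the great circle, so measured RTTs will be higher.

```javascript
// Rough lower bound on RTT from geography alone (illustrative assumptions).
const EARTH_RADIUS_KM = 6371;
const FIBER_KM_PER_MS = 200; // assumed: ~2/3 of c in optical fibre

// Haversine great-circle distance between two { lat, lon } points in degrees.
function greatCircleKm(a, b) {
  const rad = (deg) => (deg * Math.PI) / 180;
  const dLat = rad(b.lat - a.lat);
  const dLon = rad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(rad(a.lat)) * Math.cos(rad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Minimum round-trip time: out and back over the great-circle path.
function minRttMs(a, b) {
  return (2 * greatCircleKm(a, b)) / FIBER_KM_PER_MS;
}

// Hypothetical client and resolver locations.
const singapore = { lat: 1.35, lon: 103.82 };
const usEast = { lat: 39.04, lon: -77.49 };
console.log(minRttMs(singapore, usEast).toFixed(0), 'ms minimum RTT');
```

Under these assumptions the Singapore to US-East bound alone is on the order of 150 ms, which is consistent with the 180–220 ms observed once real routing is added. No amount of server tuning recovers that; only placement does.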

Co-locate STUN with the rest of your edge where it makes sense. If you also run TURN relays (see TURN Server Configuration & Auth), putting the STUN listener in the same region keeps the fallback path coherent: a client that fails srflx and escalates to a relay does not suddenly cross an ocean. Bind each node to dual-stack IPv4/IPv6 interfaces so mobile clients on IPv6-only carriers still gather a usable reflexive candidate.

# coturn STUN-only listener, one node per region
listening-port=3478
# Advertise the node's public address, not the cloud-internal RFC 1918 IP
external-ip=203.0.113.21
# Answer Binding requests only and ignore TURN requests, so no-auth cannot expose an open relay
stun-only
# Binding needs no TLS; drop the listener to shrink attack surface
no-tls
# Serve STUN over UDP only (STUN can also run over TCP, but this deployment does not need it)
no-tcp
# Binding requests used for srflx discovery are unauthenticated
no-auth
# Disable the telnet admin console in production
no-cli

Keep resolvers stateless. A STUN Binding exchange carries no session, so any node can answer any client. That property is what lets you scale horizontally behind a load balancer or anycast prefix without session affinity. The moment you add connection tracking you have broken the assumption that makes STUN cheap.
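The statelessness is visible in the wire format. The following sketch (not coturn's implementation; the function and constant names are hypothetical, and it handles IPv4 only) builds a Binding success response as a pure function of the request bytes and the packet's source address. Nothing is looked up or stored, which is why any node can produce the same answer.

```javascript
// Sketch of a stateless STUN Binding responder (IPv4 only, illustrative).
const MAGIC_COOKIE = 0x2112a442;

// request: Uint8Array holding the received datagram.
// srcIp / srcPort: the source address the datagram arrived from.
function buildBindingResponse(request, srcIp, srcPort) {
  if (request.byteLength < 20) return null;
  const req = new DataView(request.buffer, request.byteOffset, request.byteLength);
  // Only answer Binding Requests (type 0x0001) carrying the magic cookie.
  if (req.getUint16(0) !== 0x0001 || req.getUint32(4) !== MAGIC_COOKIE) return null;

  const o = srcIp.split('.').map(Number);
  const addr = ((o[0] << 24) | (o[1] << 16) | (o[2] << 8) | o[3]) >>> 0;

  const res = new Uint8Array(32);
  const view = new DataView(res.buffer);
  view.setUint16(0, 0x0101);             // Binding Success Response
  view.setUint16(2, 12);                 // attribute bytes after the 20-byte header
  res.set(request.subarray(4, 20), 4);   // echo magic cookie + transaction ID
  view.setUint16(20, 0x0020);            // XOR-MAPPED-ADDRESS
  view.setUint16(22, 8);                 // attribute value length
  view.setUint8(24, 0);                  // reserved
  view.setUint8(25, 0x01);               // family: IPv4
  view.setUint16(26, srcPort ^ (MAGIC_COOKIE >>> 16));
  view.setUint32(28, (addr ^ MAGIC_COOKIE) >>> 0);
  return res;
}
```

The only inputs are the datagram and where it came from, so there is nothing to replicate between nodes and nothing to lose when a node is replaced.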

Size each node for request rate, not bandwidth. A STUN Binding is one small UDP datagram in and one out, so a modest instance answers tens of thousands of lookups per second; the constraint is packet-per-second handling and the kernel’s UDP socket buffers, not CPU or link capacity. This is the opposite of TURN, where each relayed session consumes sustained media bandwidth. Because the workload is so light, the right granularity is one small node (or a small autoscaling group) per region rather than a few large central boxes. Placement near users buys more than vertical scale ever will.
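A quick back-of-the-envelope calculation makes the packet-rate point concrete. The message sizes below are assumed minimums (a 20-byte Binding Request with no attributes and a 32-byte response carrying only XOR-MAPPED-ADDRESS, plus 28 bytes of IPv4 and UDP headers); real servers usually add attributes such as SOFTWARE, so treat the result as a floor. The function name is hypothetical.

```javascript
// Estimate packet rate and bandwidth for a given STUN request rate.
const IPV4_UDP_OVERHEAD_BYTES = 28; // 20-byte IPv4 header + 8-byte UDP header

function stunLoad(requestsPerSecond, requestBytes = 20, responseBytes = 32) {
  const toMbps = (bytes) => (requestsPerSecond * bytes * 8) / 1e6;
  return {
    packetsPerSecond: 2 * requestsPerSecond, // one datagram in, one out
    inboundMbps: toMbps(requestBytes + IPV4_UDP_OVERHEAD_BYTES),
    outboundMbps: toMbps(responseBytes + IPV4_UDP_OVERHEAD_BYTES),
  };
}

console.log(stunLoad(50000));
```

At an assumed 50,000 lookups per second this comes to 100,000 packets per second but only a few tens of megabits in each direction, which is why the packet path, not the link, is what to watch.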

Step 2: Minimising latency by routing clients to the nearest node

Geographic placement only pays off if each client actually reaches its closest resolver. Two routing mechanisms achieve this, and they apply at different layers.

GeoDNS resolves a single hostname (stun.yourdomain.com) to the regional IP nearest the client’s resolver. It is simple and works with the standard iceServers array, but it inherits DNS caching: a client on a misconfigured resolver, or one using a public DNS service far from its physical location, can be steered to the wrong region. Keep the TTL low (30–60 s) so failover is timely without hammering your authoritative servers.

The alternative is anycast, covered in depth in Step 3: it advertises one IP from every region and lets BGP pick the closest. Either way, the client config stays trivial: list one or two STUN URLs and let the routing layer resolve them. Do not list five regional hostnames in iceServers; browsers cap the ICE candidate pool, and every extra endpoint adds DNS resolution time and srflx candidates that compete for nomination without improving connectivity.

// Client config points at ONE routed hostname, not a hardcoded region.
// The routing layer (anycast or GeoDNS) selects the nearest node.
const pc = new RTCPeerConnection({
  iceServers: [
    { urls: 'stun:stun.yourdomain.com:3478' },      // nearest node, resolved at runtime
    { urls: 'turn:turn.yourdomain.com:3478',         // co-located relay fallback
      username: creds.username, credential: creds.credential }
  ],
  iceCandidatePoolSize: 10  // pre-gather srflx candidates before the call starts
});

Pre-gathering with iceCandidatePoolSize warms the reflexive lookup ahead of createOffer(), so the round trip to the nearest resolver overlaps with signalling rather than serialising after it. On a well-placed node this removes the STUN round trip from the perceived connect path entirely.
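One way to check whether the resolver is actually close is to time how long the first srflx candidate takes to surface. The helper below is a sketch, not a standard API: `candidateType` and `timeToFirstSrflx` are hypothetical names, and the clock starts when the function is called, so call it immediately before `setLocalDescription()`.

```javascript
// Extract the candidate type from an ICE candidate attribute line.
function candidateType(candidateLine) {
  const m = / typ (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return m ? m[1] : null;
}

// Resolve with the milliseconds until the first srflx candidate arrives,
// or null if gathering ends or the timeout fires first.
function timeToFirstSrflx(pc, timeoutMs = 5000) {
  const start = Date.now();
  return new Promise((resolve) => {
    const timer = setTimeout(() => finish(null), timeoutMs);
    function finish(value) {
      clearTimeout(timer);
      pc.removeEventListener('icecandidate', onCandidate);
      resolve(value);
    }
    function onCandidate(event) {
      if (!event.candidate) return finish(null); // null candidate: gathering done
      if (candidateType(event.candidate.candidate) === 'srflx') {
        finish(Date.now() - start);
      }
    }
    pc.addEventListener('icecandidate', onCandidate);
  });
}
```

In a browser this would be used as `const pending = timeToFirstSrflx(pc); await pc.setLocalDescription(await pc.createOffer()); console.log(await pending);`. A value that stays high for one region suggests clients there are being routed to a distant node.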

Step 3: Anycast vs unicast topology

Anycast announces the same IP from multiple physical locations; the network routes each packet to the topologically nearest announcement. For STUN this is attractive because a single iceServers entry transparently resolves to a local node, with sub-second failover when a region withdraws its route β€” no DNS TTL to wait out.

The usual caveat with anycast is that a route change mid-flow can deliver later packets to a different node. For STUN this happens to be a non-issue. A STUN Binding is request/response: the client sends one packet, gets one back, and the exchange is complete. Even if BGP reconverges mid-flight and a retransmit lands on a different node, that node can answer it identically because no node holds session state. This is exactly why anycast pairs cleanly with STUN but is dangerous for TURN, where an allocation is a long-lived stateful flow that must stay pinned to one server.
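The retransmit behaviour can be made concrete with the default timer values from RFC 8489 (initial RTO of 500 ms, doubling per retransmit, 7 transmissions, final wait of 16 × RTO). The sketch below computes that schedule; the function name is hypothetical, and ICE agents in browsers generally use their own, shorter timers, so the numbers are illustrative of the mechanism rather than of any particular client.

```javascript
// Compute STUN-over-UDP request send times using RFC 8489 defaults.
// rtoMs: initial retransmission timeout; rc: max transmissions; rm: final-wait multiplier.
function stunRetransmitSchedule(rtoMs = 500, rc = 7, rm = 16) {
  const sendTimesMs = [];
  let t = 0;
  let interval = rtoMs;
  for (let i = 0; i < rc; i++) {
    sendTimesMs.push(t);
    t += interval;   // wait one interval before the next transmission
    interval *= 2;   // the interval doubles after every transmission
  }
  const lastSend = sendTimesMs[sendTimesMs.length - 1];
  return { sendTimesMs, timeoutMs: lastSend + rm * rtoMs };
}

console.log(stunRetransmitSchedule());
// sendTimesMs: [0, 500, 1500, 3500, 7500, 15500, 31500], timeoutMs: 39500
```

Each of those transmissions is an independent datagram carrying the same transaction ID, so it does not matter which anycast node receives which one: the first response that comes back completes the transaction.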

Unicast (distinct IPs per region, fronted by GeoDNS) is the pragmatic default for teams without their own anycast prefix and BGP relationships. It is operationally simpler, debuggable with a plain dig, and good enough when your TTLs are short.

| Property | Anycast | Unicast + GeoDNS |
|---|---|---|
| Client config | One IP, network-routed | One hostname, DNS-routed |
| Failover speed | Sub-second (BGP withdraw) | Bounded by DNS TTL (30–60 s) |
| STUN suitability | Excellent (stateless req/resp) | Excellent |
| TURN suitability | Poor (stateful allocations) | Acceptable with pinning |
| Operational cost | High (BGP, PI space) | Low (managed DNS) |
| Debuggability | Harder (path-dependent) | Easy (dig, traceroute) |

Most deployments start unicast and graduate to anycast only when failover latency or per-region DNS skew becomes a measured problem.

A practical hybrid avoids choosing globally: run anycast for STUN where you already have the prefix and BGP relationships, and keep TURN on distinct unicast IPs with session pinning. Because STUN and TURN are answered by the same coturn binary, you can still co-locate them on one host per region: you simply announce the STUN listener into the anycast prefix and bind the TURN listener to a region-specific address that GeoDNS or explicit per-region hostnames resolve. The client then gathers reflexive candidates from the network-routed STUN IP and, only on symmetric-NAT fallback, allocates a relay on the pinned TURN address.

Step 4: Health checks and verification

A STUN node can pass a TCP connect check and still be useless: the process may be alive while the UDP listener is wedged or returning a private address. Health checks must validate a real Binding Request/Response cycle and assert that the mapped address is public, then deregister failing nodes from the routing layer.

#!/bin/bash
# Validate each resolver behind the routed hostname with a real STUN exchange.
# Assumes the stuntman client (binary: stunclient; apt install stuntman-client),
# which prints a "Mapped address:" line on a successful Binding test.
status=0
for host in $(dig +short stun.yourdomain.com A); do
  # basic mode runs a single Binding test from an ephemeral source port
  out=$(stunclient --mode basic "$host" 3478 2>&1)
  addr=$(echo "$out" | grep "Mapped address" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
  if [[ -z "$addr" ]]; then
    echo "UNHEALTHY (no response): $host"                   # deregister
    status=1
  # Reject RFC 1918 leaks: a node returning a private IP is misconfigured.
  elif [[ "$addr" =~ ^(10\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.) ]]; then
    echo "UNHEALTHY (private mapped addr $addr): $host"   # deregister
    status=1
  else
    echo "HEALTHY ($addr): $host"
  fi
done
exit "$status"   # non-zero if any node failed, so the caller can act on it
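For a prober written in code rather than shell, the Binding cycle is small enough to encode by hand. The sketch below covers only the message handling for IPv4: building a Binding Request, extracting XOR-MAPPED-ADDRESS from the response, and the same RFC 1918 check as the script. All function names are hypothetical, and sending the datagram is left to whatever UDP socket API the prober uses.

```javascript
// Minimal STUN Binding encode/decode for a health prober (IPv4 only, illustrative).
const STUN_COOKIE = 0x2112a442;

// transactionId: Uint8Array of 12 random bytes chosen by the caller.
function buildBindingRequest(transactionId) {
  if (transactionId.length !== 12) throw new Error('transaction ID must be 12 bytes');
  const msg = new Uint8Array(20);
  const view = new DataView(msg.buffer);
  view.setUint16(0, 0x0001);      // Binding Request
  view.setUint16(2, 0);           // no attributes
  view.setUint32(4, STUN_COOKIE);
  msg.set(transactionId, 8);
  return msg;
}

// Returns { ip, port } from a Binding Success Response, or null if the
// datagram is not a valid answer to the given transaction.
function parseBindingResponse(msg, transactionId) {
  if (msg.byteLength < 20) return null;
  const view = new DataView(msg.buffer, msg.byteOffset, msg.byteLength);
  if (view.getUint16(0) !== 0x0101 || view.getUint32(4) !== STUN_COOKIE) return null;
  for (let i = 0; i < 12; i++) {
    if (msg[8 + i] !== transactionId[i]) return null; // answer to someone else
  }
  const end = Math.min(20 + view.getUint16(2), msg.byteLength);
  let offset = 20;
  while (offset + 4 <= end) {
    const type = view.getUint16(offset);
    const length = view.getUint16(offset + 2);
    const value = offset + 4;
    if (value + length > end) return null; // truncated attribute
    if (type === 0x0020 && length >= 8 && view.getUint8(value + 1) === 0x01) {
      const port = view.getUint16(value + 2) ^ (STUN_COOKIE >>> 16);
      const addr = (view.getUint32(value + 4) ^ STUN_COOKIE) >>> 0;
      const ip = [addr >>> 24, (addr >>> 16) & 255, (addr >>> 8) & 255, addr & 255].join('.');
      return { ip, port };
    }
    offset = value + length + ((4 - (length % 4)) % 4); // attributes are 4-byte aligned
  }
  return null;
}

// True for RFC 1918 addresses, which a public resolver should never report.
function isPrivateIPv4(ip) {
  const [a, b] = ip.split('.').map(Number);
  return a === 10 || (a === 192 && b === 168) || (a === 172 && b >= 16 && b <= 31);
}
```

A node is healthy under this check when `parseBindingResponse` returns an address within the probe timeout and `isPrivateIPv4` is false for it. Matching the transaction ID matters because a stale or stray datagram on the same socket would otherwise pass as a valid answer.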

Drive this from your load balancer or an external prober on a 10–15 s interval, and export coturn’s Prometheus counters so you can alarm on request rate, dropped packets, and 4xx error responses. To verify end-to-end from a browser, open chrome://webrtc-internals, start a connection, and confirm srflx candidates appear with the expected public address and a gathering time well under your ICE timeout. A node returning candidates 200 ms+ late is effectively unhealthy even if it answers, because those candidates risk being discarded before SDP exchange.

Health and routing must be wired together or the check is cosmetic. With GeoDNS, a failing prober should remove the regional A record (respecting the 30–60 s TTL) so new clients resolve to a healthy neighbour. With anycast, deregistration means withdrawing the BGP announcement from the failing node so the network reroutes to the next-nearest region within a second. Either way, run the prober from multiple vantage points: a node can be reachable from your monitoring VPC but blackholed from a particular carrier, and a single-origin check will miss it. Capture a packet trace with tcpdump -i any -n 'udp port 3478' when a node flaps to confirm whether requests arrive and responses leave; that distinguishes a wedged listener from an upstream routing problem.
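It helps to add up the worst-case time a client can keep hitting a dead node. A rough budget is detection time (probe interval times the number of consecutive failures required before deregistering) plus propagation time (DNS TTL for GeoDNS, route withdrawal for anycast). In this sketch the two-failure threshold and the function name are assumptions, not values from any particular prober.

```javascript
// Worst-case seconds between a node dying and clients being steered away.
function worstCaseFailoverSeconds({ probeIntervalS, failuresToTrip, propagationS }) {
  const detectionS = probeIntervalS * failuresToTrip;
  return detectionS + propagationS;
}

// GeoDNS: 15 s probes, 2 failures to trip (assumed), 60 s TTL.
const geoDns = worstCaseFailoverSeconds({ probeIntervalS: 15, failuresToTrip: 2, propagationS: 60 });
// Anycast: same probing, about 1 s for the BGP withdrawal to take effect.
const anycast = worstCaseFailoverSeconds({ probeIntervalS: 15, failuresToTrip: 2, propagationS: 1 });

console.log({ geoDns, anycast }); // { geoDns: 90, anycast: 31 }
```

With these inputs, detection rather than routing is the larger share of the anycast figure, so tightening the probe interval pays off more there than it does under GeoDNS.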

Edge Cases & Browser Quirks

Common Implementation Mistakes

FAQ

How many STUN regions do I actually need? Three (North America, Europe, and Asia-Pacific) cover most global traffic and keep most users close to the 30 ms reflexive round-trip target. Add a fourth region only when analytics show a concentrated population paying a 150 ms+ penalty. Per client, expose one routed hostname rather than several regional ones.

Should I run my own STUN at all, or just use public servers? It depends on reliability and privacy requirements. Public resolvers like stun.l.google.com are free but offer no SLA, no latency guarantee near your users, and unannounced rate limits. The full trade-off, including a minimal coturn STUN-only config, is in self-hosting Coturn STUN vs public STUN servers.

Can I run STUN and TURN on the same coturn process? Yes. coturn answers STUN Binding requests on the same listener whether or not TURN relaying is enabled. But the deployment topologies diverge: STUN scales statelessly behind anycast, while TURN allocations are stateful and must pin to one node. For a relay-grade config see configuring Coturn for production TURN relay.

Why does anycast work for STUN but not for TURN? A STUN Binding is a single stateless request/response, so any node can answer any packet and BGP reconvergence is harmless. A TURN allocation is a long-lived stateful flow; if anycast reroutes mid-session to a node that holds no allocation state, the relay breaks. STUN is the stateless half of the same coturn binary.

Related: this guide sits under WebRTC Protocol Stack & Signaling Servers; pair it with self-hosting Coturn STUN vs public STUN servers, TURN Server Configuration & Auth, and ICE Candidate Gathering & Filtering for the full NAT-traversal path.