Implementing Custom Signaling Protocols with gRPC-Web for WebRTC

Replacing JSON-over-WebSocket with type-safe, bidirectional gRPC-Web streams eliminates ad-hoc parsing and enforces strict framing for SDP and ICE payloads. This guide is part of the Signaling State Machine Patterns section. It covers when a schema-driven RPC transport earns its tooling cost over a WebSocket, and how to wire one to RTCPeerConnection without breaking trickle ICE ordering.

Context & Trade-offs

WebSocket signaling is the default for good reason: it delivers SDP and ICE candidates in under 10 ms over a single persistent socket with almost no setup. gRPC-Web does not beat that latency (both ride one HTTP/2 or WebSocket-tunnelled connection), so you do not adopt it for speed. You adopt it for a contract. A Protobuf schema makes Offer, Answer, Candidate, and Bye distinct, versioned message types, so a malformed payload fails at deserialisation instead of three frames later as an InvalidStateError. That matters most on teams shipping multiple client platforms against one signaling backend, where an untyped JSON envelope drifts silently between releases.

The cost is real. A browser cannot speak native gRPC over HTTP/2, so you need a translating proxy: Envoy with the grpc_web and cors filters, or grpc-gateway/improbable-eng middleware in a Go or Node backend. That proxy adds an operational hop, a keepalive to tune, and binary-framing config that, if wrong, breaks every stream. Generated stubs add a build step. Note also that the official grpc-web client supports only unary and server-streaming calls; the full-duplex stream used in this guide needs a client and proxy that tunnel over WebSocket, such as the improbable-eng transport. For a single-platform app with a stable message set, that overhead is not worth it, and a typed WebSocket envelope (e.g. zod-validated JSON) gets you most of the safety. Reserve gRPC-Web for multi-client, multiplexed deployments where the schema pays for itself.

Factor            JSON over WebSocket            gRPC-Web
Message delivery  sub-10 ms                      sub-10 ms (proxy hop adds <1 ms)
Type safety       runtime validation only        compile-time from .proto
Infra required    a WebSocket endpoint           Envoy/gateway proxy + keepalive
Best fit          single client, fast iteration  many clients, strict contract
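
For comparison, the "typed WebSocket envelope" alternative can be sketched without any library. The envelope shape below (kind, peerId, sdp, candidate) and the function name parseEnvelope are assumptions made for illustration, not part of this guide's schema; a validator library such as zod would express the same checks declaratively.

```javascript
// Hand-rolled validation for a hypothetical JSON signaling envelope.
// Field names (kind, peerId, sdp, candidate) are illustrative assumptions.
const KINDS = new Set(['offer', 'answer', 'candidate', 'bye']);

function parseEnvelope(raw) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: 'not valid JSON' };
  }
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    return { ok: false, error: 'envelope must be an object' };
  }
  if (!KINDS.has(data.kind)) {
    return { ok: false, error: `unknown kind: ${String(data.kind)}` };
  }
  if (typeof data.peerId !== 'string' || data.peerId === '') {
    return { ok: false, error: 'peerId must be a non-empty string' };
  }
  if (data.kind === 'offer' || data.kind === 'answer') {
    if (typeof data.sdp !== 'string') {
      return { ok: false, error: 'sdp must be a string' };
    }
    return { ok: true, value: { kind: data.kind, peerId: data.peerId, sdp: data.sdp } };
  }
  if (data.kind === 'candidate') {
    if (data.candidate === null || typeof data.candidate !== 'object') {
      return { ok: false, error: 'candidate must be an object' };
    }
    return { ok: true, value: { kind: 'candidate', peerId: data.peerId, candidate: data.candidate } };
  }
  return { ok: true, value: { kind: 'bye', peerId: data.peerId } };
}
```

The point of the comparison: this catches the same malformed payloads at the boundary, but only at runtime and only for as long as every client keeps its copy of the checks in sync.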

Minimal Runnable Implementation

Define the contract with a oneof so SDP, ICE, and error payloads cannot collide during rapid candidate generation, then attach the bidirectional stream to the peer connection. Map inbound sdp to setRemoteDescription, inbound candidate to addIceCandidate, and buffer candidates until the remote description resolves, the same buffering rule the FSM enforces.

// signaling.proto: one message type, payloads mutually exclusive via oneof
syntax = "proto3";
package webrtc.signaling;

message SignalingMessage {
  string peer_id = 1;
  oneof payload {
    string sdp       = 2;   // JSON-encoded RTCSessionDescriptionInit
    string candidate = 3;   // JSON-encoded RTCIceCandidateInit
    string error     = 4;
  }
}

service SignalingService {
  // Full-duplex stream: client and server both write SDP/ICE as discovered
  rpc ExchangeSignals (stream SignalingMessage) returns (stream SignalingMessage) {}
}

// Client: bind the gRPC-Web stream to the RTCPeerConnection lifecycle
import { SignalingServiceClient } from './generated/signaling_grpc_web_pb';
import { SignalingMessage } from './generated/signaling_pb';

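const client = new SignalingServiceClient('https://grpc-proxy.example.com');
const call = client.exchangeSignals();      // bidirectional stream handle
const pendingCandidates = [];               // hold until remote description set

// Assumed to come from the surrounding app; declared here so the snippet stands alone
const pc = new RTCPeerConnection();
const remotePeerId = 'peer-b';              // placeholder id of the other peer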

pc.onicecandidate = (e) => {
  if (!e.candidate) return;                 // null candidate = gathering done
  const msg = new SignalingMessage();
  msg.setPeerId(remotePeerId);
  msg.setCandidate(JSON.stringify(e.candidate.toJSON()));
  call.write(msg);                          // trickle each candidate immediately
};

pc.onnegotiationneeded = async () => {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);      // stable -> have-local-offer
  const msg = new SignalingMessage();
  msg.setPeerId(remotePeerId);
  msg.setSdp(JSON.stringify({ type: offer.type, sdp: offer.sdp }));
  call.write(msg);
};

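call.on('data', async (msg) => {
  if (msg.hasSdp()) {
    const init = JSON.parse(msg.getSdp());
    await pc.setRemoteDescription(init);     // now flush buffered candidates
    while (pendingCandidates.length) await pc.addIceCandidate(pendingCandidates.shift());
    if (init.type === 'offer') {             // callee side: reply with an answer
      const answer = await pc.createAnswer();
      await pc.setLocalDescription(answer);  // have-remote-offer -> stable
      const reply = new SignalingMessage();
      reply.setPeerId(remotePeerId);
      reply.setSdp(JSON.stringify({ type: answer.type, sdp: answer.sdp }));
      call.write(reply);
    }
  } else if (msg.hasCandidate()) {
    const init = JSON.parse(msg.getCandidate());
    if (pc.remoteDescription) await pc.addIceCandidate(init);
    else pendingCandidates.push(init);       // buffer: avoids InvalidStateError
  } else if (msg.hasError()) {
    console.error('signaling error:', msg.getError());
  }
});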

On the server, keep an in-memory registry mapping peer_id to its active stream and forward each message as a single write so payloads never interleave. Apply a bounded queue per stream to absorb the burst of a trickle-ICE phase (an unbounded queue can exhaust the heap under rapid candidate writes), and explicitly close the stream when the client's connectionState reaches closed to avoid zombie sessions.
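
A minimal sketch of such a registry follows, written against a generic stream object rather than any particular gRPC server library. It assumes the stream exposes write(msg), returning false under backpressure as Node streams do, and end(). The class name PeerRegistry, the onDrain hook, and the policy of rejecting messages once the queue is full are all illustrative choices, not something this guide prescribes.

```javascript
// Sketch only: maps peer ids to streams and caps how much each one may queue.
// `stream` is any object with write(msg) -> boolean and end().
class PeerRegistry {
  constructor(maxQueued = 64) {
    this.maxQueued = maxQueued;
    this.peers = new Map(); // peerId -> { stream, queue, blocked }
  }

  register(peerId, stream) {
    this.unregister(peerId); // a reconnect replaces the stale session
    this.peers.set(peerId, { stream, queue: [], blocked: false });
  }

  // Call when the stream ends or the client reports connectionState "closed".
  unregister(peerId) {
    const entry = this.peers.get(peerId);
    if (!entry) return false;
    this.peers.delete(peerId);
    entry.stream.end();
    return true;
  }

  // Each message is queued whole and written whole, so payloads never interleave.
  forward(toPeerId, msg) {
    const entry = this.peers.get(toPeerId);
    if (!entry) return 'unknown-peer';
    if (entry.queue.length >= this.maxQueued) return 'dropped';
    entry.queue.push(msg);
    this.pump(entry);
    return 'accepted';
  }

  // Wire this to the stream's "drain" event.
  onDrain(peerId) {
    const entry = this.peers.get(peerId);
    if (!entry) return;
    entry.blocked = false;
    this.pump(entry);
  }

  pump(entry) {
    while (!entry.blocked && entry.queue.length > 0) {
      if (entry.stream.write(entry.queue.shift()) === false) entry.blocked = true;
    }
  }
}
```

Whether a full queue should drop the message, as here, or close the stream outright depends on how the clients recover; dropping an offer or answer silently would stall negotiation, so a production version would likely treat SDP and candidates differently.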

Reproduction Steps & Debugging Log Patterns

  1. Launch Envoy with envoy.filters.http.grpc_web and envoy.filters.http.cors enabled and http2_protocol_options.connection_keepalive set; without keepalive, idle streams die mid-negotiation.
  2. Open two browser clients and trigger an offer from the initiator.
  3. Watch the browser console and Envoy access logs; confirm the offer reaches the second client and an answer returns.
  4. Kill the network on one client for 5 s to force a disconnected/ICE-restart cycle and confirm the stream survives or reconnects.
  5. Correlate any failure against the patterns below.
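
For step 4, one common way to make the stream reconnect is a capped exponential backoff around whatever function opens it. The sketch below is an assumption about how a client might do this, not something the guide specifies: openStream, the base delay of 250 ms, the 8 s cap, and the five-attempt limit are all placeholder choices.

```javascript
// Delay before retry number `attempt` (0-based): 250, 500, 1000, ... capped at maxMs.
function backoffDelay(attempt, baseMs = 250, maxMs = 8000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Retries a caller-supplied async `openStream` until it succeeds or attempts run out.
// `sleep` is injectable so the loop can be exercised without real timers.
async function reconnectWithBackoff(openStream, options = {}) {
  const maxAttempts = options.maxAttempts ?? 5;
  const sleep = options.sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await openStream(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt));
    }
  }
  throw lastError;
}
```

After a successful reconnect the client still has to re-register its peer id and, if the peer connection reported a failure, trigger an ICE restart; the backoff only covers the transport.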

Expected and diagnostic log lines:

# Healthy: offer out, answer back, ICE connected
[client-A] negotiationneeded -> setLocalDescription(offer) ok
[client-B] data: sdp -> setRemoteDescription ok -> flushed 4 candidates
[client-A] connectionState: connected

# Proxy timeout / missing keepalive
[gRPC] UNAVAILABLE: stream terminated by RST_STREAM  -> set connection_keepalive in Envoy

# State-machine violation: out-of-order SDP
DOMException: Failed to set remote answer sdp: Called in wrong state: stable
  -> verify candidate buffering and single-offer serialisation

# Framing/CORS misconfig
Failed to execute 'send' on 'XMLHttpRequest': the object's state is DONE
  -> enable grpc_web + cors HTTP filters and binary framing in the proxy
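
The "single-offer serialisation" check in the state-machine pattern above usually comes down to never letting two signaling operations run against the peer connection at once. A small promise chain is one way to get that; the helper name createSerialQueue is hypothetical, and this is a sketch of the idea rather than the guide's own implementation.

```javascript
// Returns an enqueue(task) function that runs async tasks strictly one at a time,
// in the order they were enqueued. A failed task does not block later ones.
function createSerialQueue() {
  let tail = Promise.resolve();
  return function enqueue(task) {
    const run = tail.then(() => task());
    tail = run.catch(() => {}); // swallow here only; callers still see the rejection on `run`
    return run;
  };
}

// Usage idea (browser side, not executed here):
//   const enqueue = createSerialQueue();
//   call.on('data', (msg) => enqueue(() => handleSignal(msg)));
//   pc.onnegotiationneeded = () => enqueue(() => sendOffer());
```

Routing both the inbound handler and onnegotiationneeded through the same queue means an answer can never be applied while an earlier description is still being set, which is the usual cause of the "Called in wrong state: stable" error.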

Common Implementation Mistakes

FAQ

Can gRPC-Web fully replace WebSocket for WebRTC signaling?

Yes, with caveats. It gives type-safe bidirectional streaming for SDP and ICE, but it requires a translating proxy and explicit stream lifecycle management. The Protobuf schema buys stronger guarantees than JSON at the cost of tooling, which is worth it for multi-platform backends and overkill for a single fast-iterating client.

How do I keep ICE candidate ordering correct over a stream?

Buffer inbound candidates in an array and flush them sequentially only after setRemoteDescription() resolves. This preserves trickle-ICE semantics and avoids InvalidStateError, identical to the buffering used over WebSocket.
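
The buffering rule can be isolated into a small helper that is independent of the transport. The class below is a sketch under the assumption that pc is any object with promise-returning setRemoteDescription and addIceCandidate methods, as RTCPeerConnection has; the name CandidateBuffer is made up for this example.

```javascript
// Holds inbound ICE candidates until the remote description is applied,
// then replays them in arrival order.
class CandidateBuffer {
  constructor(pc) {
    this.pc = pc;
    this.pending = [];
    this.remoteSet = false;
  }

  async setRemote(description) {
    await this.pc.setRemoteDescription(description);
    this.remoteSet = true;
    while (this.pending.length > 0) {
      await this.pc.addIceCandidate(this.pending.shift());
    }
  }

  async add(candidate) {
    // Also buffer while a flush is still running, so late arrivals keep their place.
    if (!this.remoteSet || this.pending.length > 0) {
      this.pending.push(candidate);
      return 'buffered';
    }
    await this.pc.addIceCandidate(candidate);
    return 'applied';
  }
}
```

Because the helper only touches the two peer-connection methods, the same instance works whether candidates arrive over a gRPC-Web stream or a WebSocket.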

Related: return to the Signaling State Machine Patterns guide, handle simultaneous offers over any transport in Recovering from Glare in Offer Collisions, and review the underlying transport in the WebSocket Signaling Implementation guide.