Dynamically Switching Video Codecs Based on Client Capabilities
Changing the active video codec mid-session is one of the few WebRTC operations the specification deliberately makes expensive: it always requires a full offer/answer exchange, never a quiet setParameters() tweak. This guide is part of the VP8 vs H.264 vs AV1 Codec Selection guide, and it solves one precise problem: how to move a live call from one codec to another (for example, dropping AV1 to VP8 when the encoder overloads or packet loss climbs) without dropping the connection or leaving the remote decoder stuck on a frozen frame.
Context & Trade-offs
The W3C WebRTC specification does not permit changing the codecs field returned by RTCRtpSender.getParameters(); it is read-only, and any attempt to mutate it is ignored or throws InvalidModificationError. The only sanctioned path to a different codec is RTCRtpTransceiver.setCodecPreferences() followed by a new offer. That means every switch costs a renegotiation round trip plus a keyframe, so the decision to switch must be deliberate, not reflexive.
The quantified cost: a renegotiation over a healthy WebSocket signaling channel completes in well under 100 ms of signaling time, but the visible media pause is dominated by the keyframe interval, typically 1 to 4 seconds before the remote decoder locks onto the new stream. That is why you switch on sustained signals (loss above 5% for more than 2 seconds, or per-frame encode time above the 33 ms budget for a 30 fps target), never on transient jitter. Switching too often produces back-to-back renegotiations that interrupt media more than the condition you were trying to fix.
The trade-off is therefore between efficiency and stability. AV1 saves roughly 30% bitrate over VP9, but if the device software-encodes it the encoder stalls; VP8's intra-refresh recovers cleanly from loss but spends more bytes. The switching logic encodes that judgement, and it should bias toward staying put unless a threshold is convincingly breached.
There are two distinct reasons to switch, and they call for different thresholds. The first is network-driven: sustained packet loss makes H.264's I/P frame chains fragile, so dropping to VP8 (or AV1) for its intra-refresh resilience improves perceived quality even though it spends more bytes. The signal is loss rate from inbound-rtp, and the threshold should be both high (above 5%) and durable (more than 2 seconds). The second is CPU-driven: a device that cannot software-encode AV1 or VP9 at the target frame rate drops frames and overheats, so stepping down to a cheaper codec restores smoothness. The signal here is per-frame encode time from outbound-rtp (the change in totalEncodeTime divided by the change in framesEncoded), compared against the frame budget: about 33 ms for a 30 fps target, about 16.7 ms for 60 fps. Conflating the two leads to bad decisions: a codec chosen for loss resilience does nothing for an overloaded encoder, and a cheaper codec does nothing for a lossy link. Read both reports before acting and attribute the symptom correctly.
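One way to keep the two signals apart is sketched below. The helper names (encodeTimePerFrameMs, frameBudgetMs, classifySymptom) are hypothetical and not part of any WebRTC API; the sketch assumes you keep the previous outbound-rtp stats object between polls and already have a loss rate for the same window.

```javascript
// Hypothetical helper: average encode time per frame (ms) between two polls.
// `prev` and `curr` are outbound-rtp stats objects from consecutive getStats() calls.
function encodeTimePerFrameMs(prev, curr) {
  const frames = (curr.framesEncoded ?? 0) - (prev.framesEncoded ?? 0);
  if (frames <= 0) return null; // nothing was encoded in this window
  // totalEncodeTime is cumulative and reported in seconds.
  const seconds = (curr.totalEncodeTime ?? 0) - (prev.totalEncodeTime ?? 0);
  return (seconds / frames) * 1000;
}

// Frame budget in ms for a target frame rate.
function frameBudgetMs(targetFps) {
  return 1000 / targetFps;
}

// Hypothetical classifier: attribute the symptom before choosing a response.
function classifySymptom({ lossRate, encodeMs, targetFps }) {
  const cpuBound = encodeMs !== null && encodeMs > frameBudgetMs(targetFps);
  const lossy = lossRate > 0.05; // the 5% threshold used in the text
  if (cpuBound && lossy) return 'both';
  if (cpuBound) return 'cpu';
  if (lossy) return 'network';
  return 'healthy';
}
```

With these definitions, a window averaging 40 ms per frame at a 30 fps target and 1% loss classifies as 'cpu', while 8% loss with a 10 ms encode time classifies as 'network'. The classification still has to persist across polls before it justifies a switch.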
A second subtlety is hysteresis. Because each switch costs a full renegotiation plus a keyframe, the system must resist oscillation. The cleanest pattern is asymmetric: step down to a more robust or cheaper codec quickly once a threshold is convincingly breached, but step back up to a more efficient codec only after a longer window of healthy stats (15 to 20 s is a reasonable floor). That asymmetry keeps the call stable through brief bad patches while still recovering efficiency once the link genuinely improves.
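The asymmetric pattern can be sketched as a small gate that is fed one sample per poll. createHysteresis and its option names are hypothetical; the default windows (2 s to step down, 15 s to step up) are assumptions taken from the figures in the text, not recommended constants.

```javascript
// Hypothetical asymmetric hysteresis gate.
// Step down after a short run of bad samples, step up only after a long healthy run.
function createHysteresis({ downAfterMs = 2000, upAfterMs = 15000 } = {}) {
  let badSince = null;  // timestamp of the first sample in the current bad run
  let goodSince = null; // timestamp of the first sample in the current healthy run

  // Feed one sample per poll; returns 'down', 'up', or 'hold'.
  return function decide(isBad, nowMs) {
    if (isBad) {
      goodSince = null; // any bad sample resets the healthy window
      badSince ??= nowMs;
      if (nowMs - badSince >= downAfterMs) {
        badSince = null;
        return 'down';
      }
    } else {
      badSince = null; // any healthy sample resets the bad window
      goodSince ??= nowMs;
      if (nowMs - goodSince >= upAfterMs) {
        goodSince = null;
        return 'up';
      }
    }
    return 'hold';
  };
}
```

The gate only reports that a window has elapsed. The caller still has to ignore a 'down' when it is already on the most robust codec and an 'up' when it is already on the most efficient one, so the gate never triggers a renegotiation on its own.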
Minimal Runnable Implementation
The function below performs a complete, safe mid-call switch: it confirms the target codec exists, re-orders preferences so the target leads, renegotiates, and (at the call site) requests a keyframe once the new codec is active. It relies on RTCRtpSender.getCapabilities('video') for enumeration, which is the same capability-detection entry point used throughout codec negotiation.
async function switchCodecMidCall(pc, transceiver, targetMimeType, signaling) {
  const caps = RTCRtpSender.getCapabilities('video');
  // Confirm the target is actually supported before touching the transceiver.
  const target = caps.codecs.filter(c => c.mimeType === targetMimeType);
  if (target.length === 0) throw new Error(`Codec ${targetMimeType} unsupported`);
  // Re-order: target codec first, everything else after (keeps a fallback tail).
  const rest = caps.codecs.filter(c => c.mimeType !== targetMimeType);
  transceiver.setCodecPreferences([...target, ...rest]); // must precede createOffer
  // Full renegotiation: the only way to change the active codec.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  signaling.send({ type: 'offer', sdp: offer.sdp });
  // Remote answer is applied via your normal signaling handler; once
  // the new codec is active, request a keyframe (PLI) to unfreeze the decoder.
}
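// Congestion-driven trigger: poll getStats and switch to VP8 on sustained loss.
// Assumes a single inbound video stream; `state` persists between polls.
async function checkAndFallback(pc, transceiver, signaling, state) {
  for (const r of (await pc.getStats()).values()) {
    if (r.type !== 'inbound-rtp' || r.kind !== 'video') continue;
    // packetsReceived and packetsLost are cumulative, so measure the change
    // since the previous poll rather than the lifetime average.
    const received = r.packetsReceived ?? 0;
    const lost = r.packetsLost ?? 0;
    const dReceived = received - (state.prevReceived ?? 0);
    const dLost = Math.max(0, lost - (state.prevLost ?? 0));
    state.prevReceived = received;
    state.prevLost = lost;
    const total = dReceived + dLost;
    const loss = total > 0 ? dLost / total : 0;
    // Require the condition to persist across polls to avoid flapping.
    state.highLossSamples = loss > 0.05 ? (state.highLossSamples ?? 0) + 1 : 0;
    if (state.highLossSamples >= 2 && state.codec !== 'video/VP8') {
      console.warn(`Loss ${(loss * 100).toFixed(1)}%, switching to VP8`);
      state.codec = 'video/VP8';
      await switchCodecMidCall(pc, transceiver, 'video/VP8', signaling);
    }
  }
}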
Poll checkAndFallback on a 2000 ms interval. The state object carries the consecutive-sample counter and the current codec so the function is idempotent: it will not re-trigger a switch to a codec that is already active. Building the initial preference order is covered in the parent guide's negotiation procedure; this page picks up after the call is already connected.
A subtle correctness point lives in the answer-handling half of the flow, which the snippet delegates to your normal signaling handler. When the remote answer arrives, apply it with setRemoteDescription() and only then request the keyframe. Sending a Picture Loss Indication before the new codec is negotiated has no effect, because there is no decoder yet bound to the new payload type. In an SFU topology the keyframe request must also propagate to the original sender, since the forwarding unit cannot synthesise an IDR on its own; rely on the SFU's PLI/FIR machinery rather than assuming the browser handles it end to end. Finally, guard the whole switch behind an in-flight flag: if signalingState is anything other than stable when your trigger fires, defer the switch until the current negotiation completes. Otherwise createOffer() or the setLocalDescription() call that follows it can reject with InvalidStateError, because an offer/answer exchange is already in progress.
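The deferral described above could look like the following sketch. switchWhenStable is a hypothetical helper, not a WebRTC API; it assumes pc exposes signalingState and the signalingstatechange event, as RTCPeerConnection does, and that doSwitch is a function such as a bound call to switchCodecMidCall.

```javascript
// Hypothetical guard: run `doSwitch` immediately if the connection is stable,
// otherwise wait for signalingState to return to 'stable' and run it then.
function switchWhenStable(pc, doSwitch) {
  return new Promise((resolve, reject) => {
    const attempt = () => {
      if (pc.signalingState !== 'stable') return false;
      pc.removeEventListener('signalingstatechange', attempt);
      try {
        // resolve() adopts the promise returned by doSwitch, if it returns one.
        resolve(doSwitch());
      } catch (err) {
        reject(err);
      }
      return true;
    };
    // Listen only when the first attempt could not run.
    if (!attempt()) pc.addEventListener('signalingstatechange', attempt);
  });
}
```

A call site might read switchWhenStable(pc, () => switchCodecMidCall(pc, transceiver, 'video/VP8', signaling)). The sketch handles one pending switch at a time; if several triggers can fire while a negotiation is in flight, an explicit queue or a single pending-target variable would be a safer design.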
Reproduction Steps & Debugging Log Patterns
- Establish a baseline. Connect with AV1 or VP9 preferred and confirm the negotiated m=video line via pc.localDescription.sdp. Log the active codec from getStats() (outbound-rtp → codecId → matching codec report).
- Inject loss. Apply 20% packet loss with tc qdisc add dev eth0 root netem loss 20% (or Chrome DevTools throttling). Expect the loss rate in inbound-rtp to climb above 5% within one or two polls.
- Observe the trigger. After two consecutive high-loss samples (about 4 s at a 2 s interval) the console prints Loss 20.0%, switching to VP8 and a renegotiation offer is sent.
- Confirm the switch. Re-read the negotiated SDP; the m=video line should now lead with the VP8 payload type. The remote decoder freezes for one keyframe interval (1 to 4 s), then resumes.
- Watch for failures. These are the log lines that signal a broken switch:
  - InvalidStateError: setCodecPreferences called on a stopped transceiver. The transceiver was closed; rebuild it before retrying.
  - InvalidModificationError on setParameters. Code is wrongly trying to mutate the read-only codecs field instead of renegotiating.
  - RTCError: Failed to set codec preferences: unsupported mime type. The target codec is absent from getCapabilities(); guard with the existence check shown above.
  - Prolonged remote freeze beyond 4 s. The keyframe request never fired; send a PLI explicitly after the answer applies.
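For the baseline step, the codec lookup through getStats() can be sketched as below. activeVideoCodec is a hypothetical helper name; it assumes the report behaves like a Map, which RTCStatsReport does, and that the outbound-rtp entry carries a codecId pointing at a codec entry.

```javascript
// Hypothetical helper: resolve the codec in use for outgoing video by
// following outbound-rtp.codecId to the matching codec report.
function activeVideoCodec(report) {
  for (const stat of report.values()) {
    if (stat.type === 'outbound-rtp' && stat.kind === 'video' && stat.codecId) {
      const codec = report.get(stat.codecId);
      if (codec) {
        return { mimeType: codec.mimeType, payloadType: codec.payloadType };
      }
    }
  }
  return null; // no outgoing video stream, or the codec report is not available yet
}
```

Calling activeVideoCodec(await pc.getStats()) before and after the switch gives a quick check that the active mimeType actually changed, independent of what the SDP advertises.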
Common Implementation Mistakes
- Mutating getParameters().codecs. It is read-only after initial negotiation; codec changes require a full renegotiation, full stop.
- Calling setCodecPreferences() after createOffer(). It has no effect once negotiation has begun, so re-apply it before every offer, including the switch offer.
- Hardcoding payload type numbers. PTs are assigned per session and differ between Chrome, Firefox, and Safari; select codecs by mimeType from getCapabilities(), which does not report payload types, and read the negotiated payload type from getParameters().codecs or the SDP.
- Skipping the keyframe. Without an IDR (H.264) or intra frame (VP8/VP9/AV1) the remote side shows artefacts or a stalled decoder long past the expected window.
- Switching on transient jitter. A single noisy sample should never trigger a renegotiation; require consecutive breaches before acting.
- Renegotiating while signalingState is not stable. Firing a second offer mid-exchange throws InvalidStateError; gate the switch behind an in-flight flag and defer until the current negotiation settles.
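To avoid hardcoded payload types when confirming a switch, the leading codec can be read from the SDP by name. The sketch below uses a hypothetical helper, leadingVideoCodec, and assumes a conventional SDP layout in which each payload type on the m=video line has an a=rtpmap entry inside the same media section.

```javascript
// Hypothetical helper: name of the codec whose payload type leads the m=video line.
function leadingVideoCodec(sdp) {
  const lines = sdp.split(/\r?\n/);
  const mIndex = lines.findIndex(l => l.startsWith('m=video '));
  if (mIndex === -1) return null;
  // Format: m=video <port> <proto> <pt> <pt> ...
  const firstPt = lines[mIndex].split(' ')[3];
  if (firstPt === undefined) return null;
  // Search only this media section for the matching rtpmap entry.
  for (let i = mIndex + 1; i < lines.length && !lines[i].startsWith('m='); i++) {
    const match = lines[i].match(/^a=rtpmap:(\d+) ([^/]+)\//);
    if (match && match[1] === firstPt) return match[2];
  }
  return null;
}
```

Applied to the negotiated description after the answer has been set, this returns the codec name (for example VP8) rather than a number, so the check keeps working when browsers assign different payload types.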
FAQ
Can I switch codecs without renegotiating the SDP?
No. A full offer/answer exchange is mandatory to change the active codec. There is no setParameters() path because the codecs field is read-only once negotiation completes.
Why does my video freeze right after the switch?
The remote decoder needs a fresh keyframe to start decoding the new codec. A brief freeze during the renegotiation window is expected and typically lasts one keyframe interval (1 to 4 s). If it lasts longer, your keyframe request did not fire.
How do I know a client supports AV1 hardware decoding?
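If video/AV1 appears in getCapabilities('video'), the browser supports AV1, but that says nothing about hardware acceleration, which is not exposed directly. Infer it from stats instead: for decoding, watch totalDecodeTime relative to framesDecoded in inbound-rtp; for encoding, watch totalEncodeTime relative to framesEncoded in outbound-rtp. A hardware codec keeps the per-frame time low even at high resolution. For the underlying selection trade-offs, see the codec comparison in the parent guide.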
Related: this deep-dive sits under VP8 vs H.264 vs AV1 Codec Selection; pair it with forcing H.264 hardware acceleration on Safari for the iOS path, and with Bandwidth Estimation & Congestion Control for the loss and bitrate signals that drive these switches.