Web Speech API Deprecated: What It Means for OBS Caption Tools

TL;DR

Chrome's Web Speech API (used by Caption.Ninja, Web Captioner, and many free OBS caption tools) is being deprecated in favor of on-device SODA models. During the migration, both the legacy and new services are unreliable. The free OBS caption category that depended on this API is collapsing. Reliable alternatives use paid speech recognition APIs like advanced AI or AssemblyAI server-side.

What was the Web Speech API?

The Web Speech API is a W3C-hosted specification (published as a Community Group draft rather than a formal W3C Recommendation) that exposes speech recognition and speech synthesis to web applications through JavaScript. Chrome shipped its implementation in 2013 and backed the recognition half with free server-side transcription running on Google's speech-to-text infrastructure.

That free backend subsidized an entire category of captioning tools. Any developer could write a few lines of JavaScript, call new SpeechRecognition() (exposed in Chrome under the prefixed name webkitSpeechRecognition), and get high-quality transcription without paying for an API. Web Captioner, Caption.Ninja, Zip Captions, and dozens of DIY captioning scripts were built on this.
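
As a rough illustration of how little code such a tool needed, here is a minimal sketch in TypeScript. The interfaces are declared locally because the TypeScript DOM typings may not include SpeechRecognition; startCaptions and findRecognizerCtor are hypothetical helper names, and the "en-US" language tag is an assumption. It is not the code of any tool named above.

```typescript
// Minimal structural types for the parts of the API this sketch touches.
// Declared locally (an assumption) so the example does not depend on DOM typings.
interface RecognitionAlternative { transcript: string }
interface RecognitionResult { isFinal: boolean; 0: RecognitionAlternative }
interface RecognitionEvent { resultIndex: number; results: ArrayLike<RecognitionResult> }
interface Recognizer {
  continuous: boolean;
  interimResults: boolean;
  lang: string;
  onresult: ((event: RecognitionEvent) => void) | null;
  start(): void;
}
type RecognizerCtor = new () => Recognizer;

// Chrome exposes the constructor under a webkit prefix; some browsers expose neither.
function findRecognizerCtor(scope: Record<string, unknown>): RecognizerCtor | undefined {
  return (scope["SpeechRecognition"] ?? scope["webkitSpeechRecognition"]) as RecognizerCtor | undefined;
}

// Hypothetical helper: starts a continuous session and reports finalized phrases.
function startCaptions(Ctor: RecognizerCtor, onFinal: (text: string) => void): Recognizer {
  const rec = new Ctor();
  rec.continuous = true;      // keep listening after the first phrase
  rec.interimResults = true;  // also receive partial hypotheses
  rec.lang = "en-US";         // assumed language for this sketch
  rec.onresult = (event) => {
    // Only results from resultIndex onward changed in this event.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      if (result && result.isFinal) onFinal(result[0].transcript.trim());
    }
  };
  rec.start();
  return rec;
}

// In a browser that ships the API this starts captioning; elsewhere it does nothing.
const BrowserRecognizer = findRecognizerCtor(globalThis as unknown as Record<string, unknown>);
if (BrowserRecognizer) startCaptions(BrowserRecognizer, (text) => console.log(text));
```

Passing the constructor in as a parameter keeps the sketch usable outside a browser, for example with a stub recognizer in a test.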

What changed

Google announced a transition from the legacy server-side Web Speech API to a new on-device SODA (Speech On-Device API) model. Reasons:

Privacy — keeping audio local instead of sending to Google servers
Cost — Google was eating the per-minute API cost for billions of free requests
Performance — faster transcription with on-device models

The transition is incomplete. Multiple issues are documented:

Chromium bug 40286514: Web Speech API SODA backend rollout issues
Chromium bug 40948113: Web Speech API recognition does not work properly
Brave bug 55414: On-device SpeechRecognition silently hangs in "downloading" state, no SODA component ever installs
CVE-2026-7935: security vulnerability in Chrome's Speech API allowing UI spoofing

The 60-second continuous-mode timeout

Even when working, the Web Speech API has a hard limit: Chrome stops a continuous recognition session after about 60 seconds of silence and fires onend without warning. Long-running dictation (live streams, multi-hour Twitch sessions) requires the page to restart the recognizer in onend to keep going.

Most free caption tools handle this with a reconnect loop. When the API is healthy, that works. While the API is degrading, as it is during the current migration, the restart itself can fail, so the loop cannot recover on its own.
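
A reconnect loop of this kind can be sketched as follows. This is an illustrative TypeScript sketch, not the code of any specific tool: keepAlive, its option names, and the thresholds are all hypothetical. It restarts the recognizer from onend, treats very short sessions as failures, backs off exponentially, and gives up after too many consecutive failures so the page can show an error instead of looping silently.

```typescript
// Only the members this sketch needs; the real recognizer object has more.
interface RestartableRecognizer {
  onend: (() => void) | null;
  start(): void;
}

// All option names are hypothetical and chosen for this example.
interface KeepAliveOptions {
  minHealthyMs: number;  // a session shorter than this counts as a failed restart
  maxFailures: number;   // consecutive short sessions tolerated before giving up
  baseDelayMs: number;   // first retry delay; doubles with each consecutive failure
  onGiveUp: () => void;  // called once the failure budget is exhausted
  now?: () => number;    // injectable clock (defaults to Date.now)
  schedule?: (fn: () => void, delayMs: number) => void; // injectable timer
}

function keepAlive(rec: RestartableRecognizer, opts: KeepAliveOptions): void {
  const now: () => number = opts.now ?? (() => Date.now());
  const schedule: (fn: () => void, delayMs: number) => void =
    opts.schedule ?? ((fn: () => void, delayMs: number) => { setTimeout(fn, delayMs); });

  let failures = 0;
  let startedAt = now();

  rec.onend = () => {
    const sessionMs = now() - startedAt;
    // A long session means the service was working, so the failure count resets.
    failures = sessionMs >= opts.minHealthyMs ? 0 : failures + 1;
    if (failures > opts.maxFailures) {
      opts.onGiveUp();
      return;
    }
    // Restart immediately after a healthy session, otherwise back off exponentially.
    const delay = failures === 0 ? 0 : opts.baseDelayMs * Math.pow(2, failures - 1);
    schedule(() => {
      startedAt = now();
      rec.start();
    }, delay);
  };

  rec.start();
}
```

The give-up branch is the part a naive loop lacks: without it, a recognizer that ends immediately on every start is restarted forever and the captions simply stop with no visible error.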

What this means for OBS caption tools

If your OBS caption tool depends on the browser's built-in SpeechRecognition object, you are affected:

Caption.Ninja — Uses Web Speech API. Currently degrading. Their docs recommend Edge but Edge has the same issues.
Web Captioner — Used Web Speech API. Shut down October 31, 2023 partly because of this trajectory.
Zip Captions — Uses Web Speech API. Same issues.
DIY browser-source scripts — Same.

The architectural alternatives

Tools that do NOT depend on the browser's built-in speech recognition keep working:

Server-side paid speech recognition — advanced AI, AssemblyAI Universal-2, OpenAI Whisper API, Google Cloud Speech-to-Text. These are paid and reliable. StreamTranslate uses advanced AI.
Local speech recognition — Whisper.cpp running on the user's GPU. Free but heavy. LocalVocal uses this approach.
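
One way to picture this architecture is an overlay that only consumes caption events and never touches the browser's SpeechRecognition object. The TypeScript sketch below assumes a hypothetical CaptionBackend interface that a cloud relay or a local Whisper process could sit behind; the interface, the class name, and the two-line display limit are assumptions made for illustration, not the design of any product mentioned here.

```typescript
// Hypothetical interface: audio goes in, caption events come out.
// A cloud relay or a local Whisper wrapper could both implement it.
interface CaptionBackend {
  readonly name: string;
  onCaption: ((text: string, isFinal: boolean) => void) | null;
  pushAudio(chunk: Float32Array): void; // fed by whatever captures the microphone
}

// The overlay depends only on the interface, so backends can be swapped freely.
class CaptionOverlay {
  private readonly maxLines: number;
  private lines: string[] = [];
  private interim = "";

  constructor(backend: CaptionBackend, maxLines: number = 2) {
    this.maxLines = maxLines;
    backend.onCaption = (text, isFinal) => {
      if (isFinal) {
        this.lines.push(text);
        // Keep only the most recent finalized lines on screen.
        this.lines = this.lines.slice(-this.maxLines);
        this.interim = "";
      } else {
        // Interim text replaces the previous partial hypothesis.
        this.interim = text;
      }
    };
  }

  // Text to draw in the OBS browser source: finalized lines, then any partial line.
  render(): string {
    return [...this.lines, this.interim].filter((s) => s.length > 0).join("\n");
  }
}
```

Because the overlay sees only caption events, moving from one speech provider to another, or from cloud to local recognition, would not require changes to the display code.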

For OBS streamers who need reliable real-time captions and translation today, the practical options are LocalVocal (free, local, GPU-heavy) or a managed paid service like StreamTranslate (cloud-based, no GPU, $9.99 once).

Frequently asked questions

Is the Web Speech API completely gone?

Not yet. The legacy server-side version is being phased out, and the new on-device SODA version is rolling out slowly. During the transition, both are unreliable. Eventually only the on-device version will exist.

Will the on-device version work for OBS caption tools when it stabilizes?

Possibly, but with caveats. On-device speech recognition runs on the streamer's own CPU/GPU during the stream, and streamers running a game alongside OBS already have tight resource budgets. Quality may also be lower than server-side cloud speech recognition.

What should developers building caption tools use?

Server-side paid APIs (advanced AI, AssemblyAI, OpenAI Whisper) are the most reliable today. Local Whisper is viable for tools that can manage the GPU requirement.

Is Safari/Firefox affected?

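Largely not, because neither relied on Chrome's free Google-backed service. Firefox has not shipped speech recognition enabled by default, and Safari's implementation, which arrived later, uses Apple's recognizer instead. In practice, Safari and Firefox users were locked out of Caption.Ninja and Web Captioner from day one.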

