Deepgram

COMMS
Velocity 6.3

Diarization v2 lands with a 3.3× human-eval edge — Deepgram's contact-center push gets sharper.

speech-to-text, diarization, voice agents, contact center, self-hosted releases, multilingual
Current state
Deepgram is shipping in two coordinated lanes: deeper transcription quality (Nova-3 multilingual numerals, Gujarati, profanity filtering across 50+ languages) and a maturing Voice Agent API (managed LLM swaps, third-party TTS controls). The new opt-in diarize_model=v2 setting introduces a new architecture that human evaluators preferred 3.3× over v1, with the biggest gains on contact-center audio. Self-hosted images and multi-language SDKs ship on a tight, predictable cadence.
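As a rough illustration of the opt-in, the sketch below builds a batch transcription request URL with diarization enabled. The diarize and diarize_model=v2 parameters come from the release note; the base URL and the helper name build_listen_url are assumptions made for this example, and the snippet only constructs the URL rather than calling the API.

```python
from urllib.parse import urlencode

# Assumption: the pre-recorded (batch) transcription endpoint.
BASE_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(use_v2_diarization: bool = True) -> str:
    """Build a batch transcription URL with speaker diarization enabled.

    Hypothetical helper for illustration; not part of any Deepgram SDK.
    """
    params = {"diarize": "true"}
    if use_v2_diarization:
        # Opt-in parameter named in the release note. Because v2 is opt-in,
        # omitting it presumably leaves the request on the v1 diarizer.
        params["diarize_model"] = "v2"
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_listen_url())
```

Keeping the flag behind a boolean makes it straightforward to run v1 and v2 side by side on the same audio before committing to the new model.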
Where it's heading
The arc is consolidating around enterprise contact-center workloads: better speaker separation, safer outputs via profanity redaction, and richer language coverage address exactly the gaps that block call-center adoption. Voice Agent is becoming a thin managed-LLM layer where customers pick the brain (OpenAI, with Llama Nemotron removed) while Deepgram owns the ears and mouth. Expect diarize_model=v2 to become the default once telemetry catches up.
Prediction
Likely next: v2 diarization promoted to default for diarize=true, and a streaming version of the same architecture to extend the contact-center story to live transcription. More managed-LLM additions in Voice Agent, plus continued language fill-in for Nova-3.

Recent moves

  1. 6d ago

    Numerals Support Now Available for 3 New Languages: Russian, Romanian, and Hebrew (Monolingual Models)

  2. 7d ago

    Self-hosted May 2026 release (260514)

    Routine self-hosted image bump (release 260514) tracking the latest API/engine versions. Useful for on-prem operators but no new capabilities behind it.

  3. 7d ago

    Profanity Filtering Now Available in 50+ Languages

    Profanity filtering now covers 50+ monolingual languages, extending the redaction guarantee Deepgram already offered in English to the rest of its language matrix. This enables more regulated-industry rollouts where unfiltered transcripts were a blocker.

  4. 8d ago

    Diarization v2: Improved Batch Speaker Diarization

    ⚡ SPARK

    Diarize v2 introduces a new architecture, opt-in via diarize_model. Human evaluators preferred v2 over v1 by 3.3×, and median CER on contact-center audio dropped roughly 80%. This is the quality jump that justifies pitching Deepgram into call-center workflows previously won by incumbents.

  5. 9d ago

    SDK releases

    Coordinated SDK drop: Flux multilingual lands in Rust, the JS Agent interface is restored, Python fixes a WebSocket query-param bug, and Java ships a breaking reconnect overhaul. Signals continued investment in keeping the four-SDK matrix at parity with the API surface.

  6. 9d ago

    Nova-3 Multilingual Model Update

    Nova-3 multilingual now formats spoken numbers as digits across eight languages — useful for downstream search, billing, and analytics that key off normalized numerics. Hindi and Japanese remain pending.