Together AI

AI-ASSISTANTS
Velocity: 5.5

Open-source AI cloud platform for training and inference of generative models

Together AI is pricing itself as the open-stack alternative to frontier coding-agent APIs.

inference-economics, coding-agents, open-models, deepseek, huggingface, day-zero-launches
Current state
Together is hammering on two things: (a) inference economics, with a benchmark claiming 76% lower cost than Claude Opus 4.6 on coding-agent workloads, and (b) breadth of model surface, evidenced by day-0 Nemotron 3 Nano Omni, DeepSeek-V4 Pro at 512K context, and Goose-driven 'deploy any HuggingFace model' tooling. Side outputs — a voice finder, the Violin video-translation tool, and a Pearl Research Labs crypto-inference partnership — broaden the developer surface without changing the core narrative.
Where it's heading
Together is positioning to be the default API for teams running coding agents on open models, with explicit price/perf comparisons against closed labs. The pattern of day-0 launches plus dedicated container offerings makes the strategy clear: any open frontier model should be one click away on Together. Crypto-adjacent and partnership work (Pearl, Adaption) reads as experimentation rather than core roadmap.
Prediction
Expect more cost-comparison content against named frontier APIs and a tighter coding-agent SKU (likely a benchmark-grounded preset for Cursor/Aider-style workloads). Day-0 launch cadence will continue as the differentiator versus AWS Bedrock and other neoclouds.

Recent moves

  1. 2d ago

    Benchmarking inference at scale: coding agents

    ⚡ SPARK

    Together publishes head-to-head inference benchmarks framing itself as 76% cheaper than Claude Opus 4.6 on coding agents, with throughput (TPS) and time-to-first-token (TTFT) advantages against TensorRT-LLM. The most explicit competitive positioning move in the window.

  2. 6d ago

    Together AI and Pearl Research Labs Team Up to Reduce the Cost of AI Inference

    Partnership with Pearl Research Labs to ship a discounted endpoint for Gemma-4-31B-it-pearl using Proof of Useful Work crypto emissions. Niche and experimental, but a notable crossover between crypto and inference economics.

  3. 7d ago

    Violin: An open-source video translation skill that breaks language barriers

    Violin is an open-source video translation pipeline combining ASR, LLM translation, and TTS. It broadens Together's reference-implementation library and showcases its TTS stack.

  4. 9d ago

    Introducing voice finder — a new tool to quickly find the right voice for your app from over 600+ voices

    Voice Finder lets developers search 600+ TTS voices using natural-language prompts or audio samples. A UX-layer addition that makes the voice catalog actually navigable.

  5. 10d ago

    Serving DeepSeek-V4: why million-token context is an inference systems problem

    In-depth engineering blog post on serving DeepSeek-V4 at million-token context on HGX B200. Useful proof of capability, but informational rather than a release.

  6. 13d ago

    Deploy and inference any model from HuggingFace

    Tutorial-as-product: deploy any Hugging Face model via Goose into Together's Dedicated Container Inference in one prompt. Lowers the barrier between an open model release and a production endpoint on Together.
