How Tavus Scaled Human-like AI Experiences with Cerebrium

From the moment we started working with you guys—super helpful, super responsive, and building things we needed right away. That was big.

Roey Priel

Lead engineer, Tavus

Use case

Generative AI, B2B, Video, Infrastructure, AI, Training

Location

San Francisco, United States

Customer since

2024

Features used

GPU, Asynchronous jobs, Observability, Logging

Highlights

Improved developer velocity, Application reliability, Lowered costs, Exceptional support, Met scale requirements


From Empathy to Engineering: Introducing Tavus

Tavus is pioneering the human layer of AI—building empathetic, human-like interactions that redefine how we engage with machines. “We’re focused on making AI-human interfaces feel more natural, more intuitive,” explains a backend ML engineer at Tavus. On the team, his role spans everything from deploying and optimizing models on GPUs to building scalable backend APIs.

The Challenge: Scaling Under Pressure

Before adopting Cerebrium, Tavus ran into major bottlenecks around GPU deployments, particularly for their CVI (Conversational Video Interface) workloads. Cold starts, inflexible auto-scaling, and unpredictable traffic surges meant that infrastructure latency was exposed to end users. “We don’t have a steady state; it’s big spikes,” they shared. “Supporting that without exposing cold starts to customers was a huge pain point.”

Their previous infrastructure setup struggled under scale. While trying to deploy their Phoenix 2 and 3 models—which required high-performance GPUs like L40s and H100s—the Tavus team found available capacity scarce and the control they needed to fine-tune deployments even scarcer.

Finding the Right Fit

Tavus evaluated several platforms, including Baseten and Modal, but ultimately chose Cerebrium. Why?

“From the moment we started working with you guys—super helpful, super responsive, and building things we needed right away. That was big.”

Onboarding was “shockingly fast.” In under a day, Tavus had their workloads up and running. That early win set the tone: “We didn’t want to spend more time trying to get things to work—it already worked.”

Impact: Faster Cycles, Better Product

Cerebrium drastically reduced Tavus's development cycle times. Deployments that took about 30 seconds meant faster feedback loops and faster iteration. On competing platforms, deploying a change could take 30 minutes to an hour, so the difference was a game-changer.

“The loop is a lot shorter than it used to be… and that helps you move really fast.”
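To put those deployment times in perspective, here is a rough back-of-the-envelope sketch. The 30-second and 30-to-60-minute figures come from the account above; the 8-hour working day and the function name `max_deploys_per_day` are assumptions made purely for illustration, not anything Tavus or Cerebrium reported.

```python
# Illustrative arithmetic only. Deploy durations are taken from the case study;
# the 8-hour working day is an assumption, and the helper name is hypothetical.

def max_deploys_per_day(deploy_seconds: float, workday_hours: float = 8.0) -> int:
    """Upper bound on back-to-back deploys that fit in one working day."""
    return int(workday_hours * 3600 // deploy_seconds)

fast = max_deploys_per_day(30)              # 30-second deploys
slow_best = max_deploys_per_day(30 * 60)    # 30-minute deploys
slow_worst = max_deploys_per_day(60 * 60)   # 60-minute deploys

print(fast, slow_best, slow_worst)  # prints: 960 16 8
```

Under these assumptions, a single deploy is 60 to 120 times faster, which is the difference between trying a handful of changes per day and iterating more or less continuously.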

This agility gave Tavus a competitive edge. They could respond to customers quicker, run experiments faster, and ship improvements without friction. “It lets us focus on building—on what really matters.”

Reliability, Scale, and Confidence

For a company with enterprise-grade SLAs, stability is non-negotiable. Cerebrium delivered.

“We’ve had GPU provider outages in the past. That’s incredibly frustrating—for us and for our customers. Since switching to Cerebrium, that hasn’t been a problem.”

During their Phoenix 3 launch, Tavus scaled to 150 containers seamlessly. The infrastructure held up, and the built-in observability—metrics, logs, dashboards—gave their team confidence in every deployment.

Cost and Support: Scaling Sustainably

While Tavus didn’t disclose exact cost savings, they noted that infrastructure spending had scaled favorably. “Obviously we’re spending more because we’re using more, but it seems like it’s been scaling pretty well in our favor.”

What stood out most, though, was the support.

“The support—especially from a four-person team—is kind of insane. We seem to have issues at the worst times—Friday evenings, Saturday mornings—and you’re always there on Slack. That’s a huge part of why we love working with you.”

Final Thoughts

The Tavus team sums it up best:

“Your shit works, which is, honestly, great.”

Cerebrium helped Tavus accelerate product delivery, scale seamlessly during peak traffic, and build with confidence. For a team building the human layer of AI, having solid infrastructure behind the scenes means they can focus on what they do best—bringing empathy to technology.
