How Tavus Scaled Human-like AI Experiences with Cerebrium
From the moment we started working with you guys—super helpful, super responsive, and building things we needed right away. That was big.
Roey Priel
Lead engineer, Tavus

Use case
Generative AI, B2B, Video, Infrastructure, AI, Training
Location
San Francisco, United States
Customer since
2024
Features used
GPU, Asynchronous jobs, Observability, Logging
Highlights
Improved developer velocity, Application reliability, Lowered costs, Exceptional support, Met scale requirements
From Empathy to Engineering: Introducing Tavus
Tavus is pioneering the human layer of AI—building empathetic, human-like interactions that redefine how we engage with machines. “We’re focused on making AI-human interfaces feel more natural, more intuitive,” explains a backend ML engineer at Tavus. On the team, his role spans everything from deploying and optimizing models on GPUs to building scalable backend APIs.
The Challenge: Scaling Under Pressure
Before adopting Cerebrium, Tavus ran into major bottlenecks around GPU deployments, particularly for their CVI (Conversational Video Interface) workloads. Cold starts, inflexible auto-scaling, and unpredictable traffic surges exposed end-users to latency. “We don’t have a steady state; it’s big spikes,” they shared. “Supporting that without exposing cold starts to customers was a huge pain point.”
Their previous infrastructure setup struggled under scale. While trying to deploy their Phoenix 2 and 3 models—which required high-performance GPUs like L40s and H100s—the Tavus team found available capacity scarce and the control they needed to fine-tune deployments even scarcer.
Finding the Right Fit
Tavus evaluated several platforms, including Baseten and Modal, but ultimately chose Cerebrium. Why?
“From the moment we started working with you guys—super helpful, super responsive, and building things we needed right away. That was big.”
Onboarding was “shockingly fast.” In under a day, Tavus had their workloads up and running. That early win set the tone: “We didn’t want to spend more time trying to get things to work—it already worked.”
Impact: Faster Cycles, Better Product
Cerebrium drastically reduced their development cycle times. With 30-second deployments, feedback loops shortened and iteration sped up. On competing platforms, deploying a change could take 30 minutes to an hour.
“The loop is a lot shorter than it used to be… and that helps you move really fast.”
This agility gave Tavus a competitive edge. They could respond to customers quicker, run experiments faster, and ship improvements without friction. “It lets us focus on building—on what really matters.”
Reliability, Scale, and Confidence
For a company with enterprise-grade SLAs, stability is non-negotiable. Cerebrium delivered.
“We’ve had GPU provider outages in the past. That’s incredibly frustrating—for us and for our customers. Since switching to Cerebrium, that hasn’t been a problem.”
During their Phoenix 3 launch, Tavus scaled to 150 containers seamlessly. The infrastructure held up, and the built-in observability—metrics, logs, dashboards—gave their team confidence in every deployment.
Cost and Support: Scaling Sustainably
While Tavus didn’t disclose exact cost savings, they noted that infrastructure spending had scaled favorably. “Obviously we’re spending more because we’re using more, but it seems like it’s been scaling pretty well in our favor.”
What stood out most, though, was the support.
“The support—especially from a four-person team—is kind of insane. We seem to have issues at the worst times—Friday evenings, Saturday mornings—and you’re always there on Slack. That’s a huge part of why we love working with you.”
Final Thoughts
The Tavus team sums it up best:
“Your shit works—which is honestly, great.”
Cerebrium helped Tavus accelerate product delivery, scale seamlessly during peak traffic, and build with confidence. For a team building the human layer of AI, having solid infrastructure behind the scenes means they can focus on what they do best—bringing empathy to technology.
© 2025 Cerebrium, Inc.