
Cartesia runs real-time voice AI on Together infrastructure

Together AI
@togethercompute

.@cartesia runs one of the hardest inference workloads: real-time voice. Their stack has to keep long-lived streams moving, serve millions of audio minutes a day, and hold model latency around 90ms. Together gives them the managed GPU infrastructure and low-level cluster

4 hours ago