Dia by Nari Labs: Open-Source Voice AI Rivaling ElevenLabs?

April 23, 2025 14:21 · 3 min read

Two undergrads. Zero funding. And they just built a voice AI that outperforms ElevenLabs. Meet **Dia**, an open-source text-to-speech model from Nari Labs that adds emotion, timing, and realism. It’s a wake-up call: you really can just build things now.

Two undergrads. Zero funding. And yet—Korean startup Nari Labs has just released Dia, an open-source text-to-speech (TTS) model that's already outperforming commercial giants like ElevenLabs and Sesame.


The Details:

  • Dia is a 1.6B parameter model that supports:
    – Emotional tones (happy, sad, angry, etc.)
    – Multiple speaker tags
    – Nonverbal cues like laughter, coughs, and even screams

  • Inspired by Google’s NotebookLM, the team trained the model on Google’s TPU Research Cloud, which provided the training compute at no cost.

  • In side-by-side comparisons, Dia beat out ElevenLabs Studio and Sesame’s CSM-1B in timing accuracy, expressiveness, and nonverbal script handling.

  • According to founder Toby Kim, Nari Labs plans to build a consumer-facing app for social content creation and remixing using the Dia model.
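
To make the speaker tags and nonverbal cues from the list above concrete, here is a minimal sketch of how a tagged dialogue script might be assembled before being handed to a TTS model. The `[S1]`/`[S2]` tag format and the parenthesized cue format are assumptions for illustration and are not confirmed by this article; `Line` and `render_script` are hypothetical helpers, not part of Dia.

```python
from dataclasses import dataclass


@dataclass
class Line:
    speaker: int   # 1-based speaker index (assumed tag format: [S1], [S2], ...)
    text: str      # the words to be spoken
    cue: str = ""  # optional nonverbal cue, e.g. "laughs" (assumed format: "(laughs)")


def render_script(lines):
    """Join dialogue lines into a single tagged script string."""
    parts = []
    for line in lines:
        segment = f"[S{line.speaker}] {line.text}"
        if line.cue:
            # Append the nonverbal cue after the spoken text.
            segment += f" ({line.cue})"
        parts.append(segment)
    return " ".join(parts)


script = render_script([
    Line(1, "Did you hear two students built this?"),
    Line(2, "No way.", cue="laughs"),
])
print(script)
```

The resulting string is the kind of input a dialogue-capable TTS model could consume; check the model's own documentation for the exact tag syntax it expects.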


Why It Matters:

Dia isn’t just a technical breakthrough—it’s a cultural moment.

It proves Sam Altman’s idea that “you can just do things” is more real than ever.
With zero VC backing and no formal research pedigree, two students built a TTS model that rivals industry leaders.

AI is democratizing innovation. If you've ever thought about building something... this is your sign.


