DubX: Next generation of TTS models
TTS turns written words into spoken language that sounds just like a person talking.
Jul 27 • Amartya Roy Chowdhury
January 2025
Notes on building in AI in 2024
Learnings on building and scaling our speech products
Jan 2 • Varshul
October 2024
[Beta] Foundational Speech Model for India
Building the Future of Multilingual AI: Foundational Speech Models for India
Oct 10, 2024 • Jaskaran Singh, Varshul, and Raghav Prabhakar
February 2024
Research to Production - NeoDub
scaling in-house tech to a million users!
Feb 16, 2024 • Jaskaran Singh
Pioneering Translation Benchmarking with LLMs
NMT for Indic LLMs?
Feb 9, 2024 • Tanay Rathore and Sambbhav Garg
October 2023
Evals are all we need
voice models are having their "stable diffusion moment"
Oct 26, 2023 • Tanay Rathore
A State-of-the-Art Survey of Text-to-Speech Technology 2023
Speaking Machines: Foundation Audio Generation Models
Oct 24, 2023 • Jaskaran Singh
Running RVC Models on the Easy GUI
How to run this on Colab after Google Banned Gradio UIs
Oct 13, 2023 • Tanay Rathore
September 2023
Converging to Multi-Modal Generative AI
Single Foundational Model to rule them all
Sep 7, 2023 • Jaskaran Singh
August 2023
Contextual Translations - Attempt 1
The case for an in-house LLM for translation
Aug 29, 2023 • Nitin Surya and Ruchir Kumbhare
Self Supervised Learning (SSL)
"Unlocking Powerful Representations: The Frontier of Self-Supervised Learning"
Aug 9, 2023 • Jaskaran Singh
July 2023
Whisper's Word-Level Timestamps are Out
Bid farewell to extensive waiting periods—finally, it has arrived!
Jul 26, 2023 • T. Pranav