# 06-technical-discussion
j
https://dev.to/jamesmurdza/five-apis-for-ai-text-to-speech-49mo I’ve been playing around with text narration a lot recently. So far I’ve found ElevenLabs to be fantastic but expensive. After that, OpenAI, GCP Neural, and AWS Long-form are all good but still not cheap, at around $1-2 per hour of audio. Tortoise (open source) also looks great, but I haven’t gotten it running. Does anyone have experience to add to this comparison?
j
https://replicate.com/afiaka87/tortoise-tts You can use Tortoise here. I just ran a sample through and it was okay: better than Polly, which is pretty old, and arguably better than Google's.
replicate-prediction-pji6jmjbehvtxgqcpkdq2ad6qq.mp3
j
Thanks, sounds alright! Any estimate on cost, or do I have to benchmark that myself?
It seems to be taking about 30s to generate a 1-2s sample?
That is quite slow.
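For a rough benchmark, the numbers above can be turned into a real-time factor (seconds of compute per second of audio) and then into a cost per hour of audio. This is only a minimal sketch: the per-second compute price is a made-up placeholder, not Replicate's actual rate, and the 30s figure may include queueing or cold-start time, so the result is likely pessimistic.

```python
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Seconds of compute needed per second of generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def cost_per_audio_hour(rtf: float, compute_price_per_second: float) -> float:
    """Compute cost (in the price's currency) for one hour of audio."""
    return rtf * 3600 * compute_price_per_second


# Figures from the message above: about 30 s to generate a 1-2 s sample.
rtf_low = real_time_factor(30, 2)   # sample was 2 s long
rtf_high = real_time_factor(30, 1)  # sample was 1 s long

# HYPOTHETICAL price, for illustration only; check the host's real billing.
ASSUMED_PRICE_PER_SECOND = 0.001

print(f"real-time factor: {rtf_low:.0f}x to {rtf_high:.0f}x")
print(f"cost per audio hour: "
      f"{cost_per_audio_hour(rtf_low, ASSUMED_PRICE_PER_SECOND):.2f} to "
      f"{cost_per_audio_hour(rtf_high, ASSUMED_PRICE_PER_SECOND):.2f}")
```

Swapping in a measured generation time for a longer sample and the real per-second price should give a figure that can be compared against the $1-2/hour range of the hosted APIs.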
v
Meta’s SeamlessM4T-v2 is solid too
p
Haven't found anything that beats ElevenLabs yet in terms of realism, if that's what you're going for. It's the closest to passing the "Turing test" of speech.
j
@philipglorenzo I’m looking for anything that’s decent and cheap; it doesn’t have to be ElevenLabs quality. OpenAI is half the cost of ElevenLabs, so I’ll use that for now!
k
@James Murdza curious, what did you land on?
j
I’m still using OpenAI. If you come across something else let me know.