Ask HN: Best multilingual text-to-speech system

1 point by powera, 3 months ago
I'm looking for a way to bulk-generate audio from text files. Ideally, it would be a system I can run locally (M3 Mac, 24GB RAM) that supports at least 10 languages natively.

I have tried a few systems (eSpeak, Piper, QWEN), and none of them gave satisfactory results. Hugging Face doesn't seem to have any text-to-speech models with particular acclaim, either. I have been using OpenAI's gpt-4o-mini model, but that seems to be approaching end-of-life.

Is there an LLM (or non-LLM) system that you would recommend?