Hacker News (Chinese edition)

I'm looking for a way to bulk-generate audio from text files. Ideally, it would be a system I can run locally (M3 Mac, 24GB RAM) that supports at least 10 languages natively.

I have tried a few systems (eSpeak, Piper, QWEN), and none of them gave satisfactory results. Hugging Face doesn't seem to have any particularly acclaimed text-to-speech models, either. I have been using OpenAI's gpt-4o-mini model, but that seems to be approaching end-of-life.

Is there an LLM (or non-LLM) system that you would recommend?

Ask HN: Best multilingual text-to-speech system