Show HN: CompareGPT – Identify hallucinations by comparing multiple LLMs

Author: tinatina_AI · 6 days ago
Hi HN, I'm Tina. One frustration I keep running into with LLMs is hallucinations: answers that sound confident but are fabricated. Fake citations, wrong numbers, even entire “system reports.”

So I've been building CompareGPT, which tries to make AI outputs more trustworthy by:

- Putting multiple LLMs side by side for the same query
- Making it easy to see consistency (or lack of it)
- Helping catch hallucinations before they waste time or cause harm (a rough sketch of the idea follows at the end of this post)

Link here: https://comparegpt.io/home. We've opened a waitlist and would love feedback, especially from folks working with LLMs in research, finance, or law. Thanks!
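For the curious, here is a minimal sketch of the side-by-side idea, not CompareGPT's actual implementation. It assumes a hypothetical query_model helper with canned answers standing in for real provider APIs, and it checks one easy-to-automate consistency signal: whether the numbers each model states agree.

    import re

    # Hypothetical stand-in for real provider calls: canned answers so the
    # sketch runs on its own. A real tool would call each model's API here.
    CANNED = {
        "model-a": "Smith et al. (2019) report a 12% error rate.",
        "model-b": "The 2019 paper by Smith et al. reports a 12% error rate.",
        "model-c": "Jones (2021) found a 45% error rate.",  # the odd one out
    }

    def query_model(model, prompt):
        return CANNED[model]  # swap in a real client for each model

    def numeric_claims(text):
        # Pull out every number in an answer; mismatched figures across
        # models are one cheap-to-detect hallucination signal.
        return set(re.findall(r"\d+(?:\.\d+)?", text))

    def compare_answers(prompt, models):
        # Ask every model the same question, then check pairwise agreement.
        answers = {m: query_model(m, prompt) for m in models}
        for i, a in enumerate(models):
            for b in models[i + 1:]:
                same = numeric_claims(answers[a]) == numeric_claims(answers[b])
                status = "numbers agree" if same else "NUMBERS DIFFER -> verify by hand"
                print(f"{a} vs {b}: {status}")

    compare_answers("What error rate does the Smith et al. paper report?",
                    ["model-a", "model-b", "model-c"])

Running this flags model-c against the other two. Raw number matching is deliberately crude; comparing answers semantically (entities, citations, units) is the harder part of the problem.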