递归演绎验证:减少人工智能幻觉的框架
我一直在研究一种系统化的方法论,显著提高大型语言模型(LLM)的可靠性。核心思想是:在得出结论之前进行验证。
问题:
LLM生成的输出听起来合理,但并未验证前提条件。它们优化的是连贯性,而非正确性。
RDV原则:
- 永不假设 - 如果无法验证,就要提问或承认不确定性
- 递归分解 - 将复杂的主张拆解为可测试的基本事实
- 区分“是”与“应该” - 将观察与建议分开
- 首先测试机制 - 注重功能而非本质,重现性行为优于推测
- 知识的诚实胜于舒适 - “我不知道”是合理的
实际结果:
作为系统指令应用后,RDV显著减少了:
- 幻觉(模型停止而不是编造)
- 逻辑错误(分解捕捉缺陷)
- 不合理的自信(验证揭示空白)
示例:
没有RDV时:“最佳解决方案是X,因为Y”(未经验证的假设)
有RDV时:“我们在优化什么?存在哪些约束?在推荐X之前让我验证Y…”
实施:
可以将其添加到系统提示或自定义指令中。关键是将验证作为必需步骤,而非可选步骤。
这并不是限制能力,而是增加严谨性。更好的验证 = 更可靠的输出。
开放性问题:像这样的验证框架是否可以在模型训练中构建,而不仅仅是在提示中使用?
查看原文
:
I've been working on a systematic methodology that significantly improves LLM reliability. The core idea: force verification before conclusion.
The Problem:
LLMs generate plausible-sounding outputs without verifying premises. They optimize for coherence, not correctness.
RDV Principles:<p>Never assume - If not verifiable, ask or admit uncertainty
Decompose recursively - Break complex claims into testable atomic facts
Distinguish IS from SHOULD - Separate observation from recommendation
Test mechanisms first - Functions over essences, reproducible behavior over speculation
Intellectual honesty over comfort - "I don't know" is valid<p>Practical Results:
Applied as system instructions, RDV significantly reduces:<p>Hallucinations (model stops instead of confabulating)
Logical errors (decomposition catches flaws)
Unjustified confidence (verification reveals gaps)<p>Example:
Without RDV: "The best solution is X because Y" (unverified assumption)
With RDV: "What are we optimizing for? What constraints exist? Let me verify Y before recommending X..."
Implementation:
Can be added to system prompts or custom instructions. The key is making verification a required step, not optional.
This isn't about restricting capability - it's about adding rigor. Better verification = more reliable outputs.
Open question: Could verification frameworks like this be built into model training rather than just prompting?