Ask HN: Are there any AI models with an introspective component?

Author: Jensson · 7 months ago
LLMs often say "I don't know" when they do know, or say A but do B, and so on. This is due to a lack of introspection: the model doesn't examine its own thoughts the way a human does.

What I am discussing here is live introspection in production, used to guide the model's answers.

An example of how a human and an LLM differ here:

* A human is asked a question. He has several thoughts in his head: he could answer "I don't know", but he also has a memory of the answer. He looks at those two, realizes "I don't know" is wrong since he does know, and gives the right answer.

* An LLM is asked a question. It has the same thoughts: it can answer "I don't know", but it also remembers the right answer, just like the human. The LLM then effectively rolls a die and picks what it will respond with (see the sketch at the end of this post).

This shows clearly why introspection is needed to solve these kinds of questions. The only way an LLM can stop making these dumb mistakes is with introspection, since as long as there is any chance at all that the LLM picks the wrong answer, it will sometimes do things it knows are wrong.

And if LLMs had introspection, they would no longer be black boxes, since we could just ask them about themselves and they would explain their own thoughts and how they work. I also think introspection is what would enable smart learning, the way humans learn from very little data, which is much more efficient than statistical learning for complex tasks.

So I was wondering: are there any examples where people have built models like this? It doesn't have to be an LLM; just a model with some form of self-awareness. Or is there any research into adding such a capability to models?
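To make the "dice roll" point concrete, here is a minimal sketch (Python, assuming the Hugging Face transformers library and the small gpt2 model; the prompt, the candidate answers, and the answer_logprob helper are purely illustrative, not a reference implementation of introspection). It scores two competing continuations under the model. Ordinary sampling draws from that distribution, so either answer can come out; a crude "look at both thoughts" step is to compare the scores explicitly and keep the better one.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Q: What is the capital of France?\nA:"
candidates = [" I don't know.", " Paris."]  # hypothetical competing answers

def answer_logprob(prompt: str, answer: str) -> float:
    """Total log-probability the model assigns to `answer` given `prompt`."""
    full = tokenizer(prompt + answer, return_tensors="pt")
    # Approximation: assumes the prompt tokenizes the same way inside the
    # concatenation, which holds for these simple strings.
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logits = model(**full).logits  # [1, seq_len, vocab]
    # Logits at position i predict the token at position i + 1.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    answer_ids = full.input_ids[0, prompt_len:]
    picked = log_probs[prompt_len - 1:].gather(1, answer_ids.unsqueeze(1))
    return picked.sum().item()

# The model holds both "thoughts" at once; sampling rolls the die between
# them, while an explicit comparison always keeps the higher-scoring one.
scores = {c: answer_logprob(prompt, c) for c in candidates}
for c, s in scores.items():
    print(f"{c!r}: total log-prob {s:.2f}")
print("picked by comparison:", max(scores, key=scores.get))
```

This is only a toy: it compares two hand-written candidates after the fact, whereas the post is asking about models that do this kind of self-examination internally and during generation.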