Hacker News (Chinese edition)

I ran an experiment to see whether a CLI really is the most intuitive format for tool calling (as claimed by an ex-Manus AI backend engineer). I gave my model random scenarios and a single tool, "run", and told it that the tool worked like a CLI. Then I told it to guess commands.

It guessed great commands, but it always formatted them with a colon up front, like :help, :browser, :search, :curl.

It was trained on how terminals look, not on what you actually type (you don't type the ":"). I have since updated my agent tool code to stop fighting this intuition.

LLMs learn what commands look like in documentation and artifacts, not what the human actually typed on the keyboard. Seems so obvious. This is why you have to test your LLM and see how it naturally works, so you don't have to fight it with your system prompt.

This is Kimi K2.5, by the way.
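For anyone who wants to try the same thing, here is a minimal sketch of what "stop fighting the colon" could look like on the tool side. This is not my actual agent code: the `normalize_command` helper and the shape of `run` are hypothetical, and the only assumption is that the model sends the whole command as one string.

```python
import shlex


def normalize_command(raw: str) -> str:
    """Accept both ':help' and 'help' by dropping one leading colon."""
    command = raw.strip()
    if command.startswith(":"):
        # The model writes commands the way they look in docs; tolerate that.
        command = command[1:].lstrip()
    return command


def run(raw: str) -> tuple[str, list[str]]:
    """Hypothetical 'run' tool entry point: returns (command name, arguments)."""
    parts = shlex.split(normalize_command(raw))
    if not parts:
        raise ValueError("empty command")
    return parts[0], parts[1:]
```

With this in place, `run(":search kimi k2.5")` and `run("search kimi k2.5")` are parsed the same way, so the system prompt doesn't need a "never prefix commands with a colon" rule.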

LLMs learn what programmers produce, not how programmers work.