Ask HN: How do you tell whether a tweak to your AI skill improved its performance?
Curious how people here evaluate changes when they tweak an AI skill / prompt / workflow.

A lot of the time, a tweak might feel better in one or two cases, but it’s hard to tell if it actually improved the skill overall or just changed its behavior in a way that looks better for a bit.

Do you mostly go by intuition, or do you have some lightweight way to check if a tweak really helped?