Ask HN: Is anyone building write guarantees for agents that work across tools?

1 point by ajaystream, about 1 month ago
We ran into a specific failure mode while building production agent workflows. Fields from contracts produced inaccurate subscription updates, with dates off by a day. Products were created at random when they should have been updated. Tax amounts were not written to the tax field but instantiated as entirely new products. Every failure was a plausible-looking write that succeeded technically and was wrong operationally.

HITL (human-in-the-loop) helped: processing one contract at a time with user confirmation at every step kept it accurate. But users eventually said "I have explained this to you 30 times, just get it done." The moment we reduced confirmation steps to let it run, it started failing again. No errors. No alerts. Just drift that showed up in reconciliation weeks later.

Prompting and mapping tables compensated at the margins but never held. The agent had no verified ground truth on how fields related across systems, so it inferred the mapping every time, and most times it inferred inconsistently.

Has anyone built write guarantees for this kind of setup? Any help is appreciated.
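To make the question concrete, here is a minimal sketch of the kind of pre-write check meant by "write guarantee": every write the agent proposes is compared against a human-verified field mapping before it reaches the target system. Everything in it is hypothetical and for illustration only. The field names, `VERIFIED_FIELD_MAP`, `ProposedWrite`, and `check_write` are assumptions, not part of any real billing API or of our actual system.

```python
from dataclasses import dataclass

# Hypothetical verified mapping, reviewed by a human once:
# source contract field -> (target object, target field).
VERIFIED_FIELD_MAP = {
    "contract.start_date": ("subscription", "start_date"),
    "contract.tax_amount": ("subscription", "tax_amount"),
    "contract.product_sku": ("product", "sku"),
}


@dataclass(frozen=True)
class ProposedWrite:
    """A write the agent wants to perform (hypothetical shape)."""
    source_field: str
    target_object: str
    target_field: str
    operation: str  # "create" or "update"
    value: object


class WriteRejected(Exception):
    """Raised when a proposed write contradicts the verified mapping."""


def check_write(write: ProposedWrite, record_exists: bool) -> ProposedWrite:
    """Return the write unchanged if it is allowed, otherwise raise WriteRejected."""
    expected = VERIFIED_FIELD_MAP.get(write.source_field)
    if expected is None:
        raise WriteRejected(f"no verified mapping for {write.source_field}")

    actual = (write.target_object, write.target_field)
    if actual != expected:
        raise WriteRejected(
            f"{write.source_field} must map to {expected}, agent proposed {actual}"
        )

    # Catch "created when it should have been updated" and the reverse.
    if write.operation == "create" and record_exists:
        raise WriteRejected("record already exists, expected an update")
    if write.operation == "update" and not record_exists:
        raise WriteRejected("record does not exist, expected a create")
    if write.operation not in ("create", "update"):
        raise WriteRejected(f"unknown operation {write.operation!r}")

    return write


if __name__ == "__main__":
    # The tax amount proposed as a brand-new product is rejected loudly
    # instead of succeeding quietly.
    bad = ProposedWrite("contract.tax_amount", "product", "name", "create", 12.5)
    try:
        check_write(bad, record_exists=False)
    except WriteRejected as err:
        print("rejected:", err)
```

The point of the sketch is only that a rejected write raises an error at write time rather than surfacing weeks later in reconciliation. It does not address the off-by-one dates, and it assumes someone has already verified the mapping, which is the part we have not solved.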