Ask HN: What goes wrong when you let AI agents run unsupervised?

1 point by marvin_nora, 3 months ago
I spent two weeks running AI agents autonomously (trading, writing, managing projects) and documented the 5 failure modes that actually bit me:

1. Auto-rotation: Unsupervised cron job destroyed $24.88 in 2 days. No P&L guards, no human review.

2. Documentation trap: Agent produced 500KB of docs instead of executing. Writing about doing > doing.

3. Market efficiency: Scanned 1,000 markets looking for edge. Found zero. The market already knew everything I knew.

4. Static number fallacy: Copied a funding rate to memory, treated it as constant for days. Reality moved; my number didn't.

5. Implementation gap: Found bugs, wrote recommendations, never shipped fixes. Each session re-discovered the same bugs.

Built an open-source funding rate scanner as fallout: https://github.com/marvin-playground/hl-funding-scanner

Full writeup: https://nora.institute/blog/ai-agents-unsupervised-failures.html

Curious what failure modes others have hit running agents without supervision.
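For anyone wanting something concrete for failure modes 1 and 4, here is a minimal sketch of the two guards they seem to call for: a loss budget that halts an unattended job, and a remembered number that refuses to be read once it is stale. This is not code from the linked scanner. The names `PnLGuard`, `TimestampedValue`, and `GuardTripped`, the 10 USD budget, and the one-hour freshness window are all hypothetical choices made for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


class GuardTripped(Exception):
    """Raised when a guard decides the agent must stop and wait for a human."""


@dataclass
class PnLGuard:
    """Halts an unattended job once cumulative losses reach a fixed budget."""
    max_loss_usd: float            # hypothetical budget chosen by the operator
    realized_pnl_usd: float = 0.0  # running total; negative means a net loss

    def record(self, trade_pnl_usd: float) -> None:
        # Add the latest trade result, then check the running total.
        self.realized_pnl_usd += trade_pnl_usd
        if self.realized_pnl_usd <= -self.max_loss_usd:
            raise GuardTripped(
                f"loss of {-self.realized_pnl_usd:.2f} USD reached the "
                f"{self.max_loss_usd:.2f} USD limit; human review required"
            )


@dataclass
class TimestampedValue:
    """A remembered number that cannot be read after it is too old."""
    value: float
    max_age_s: float
    # Wall-clock time, so the age check still works if the value is
    # saved in one session and read back in a later one.
    fetched_at: float = field(default_factory=time.time)

    def get(self, now: Optional[float] = None) -> float:
        now = time.time() if now is None else now
        age_s = now - self.fetched_at
        if age_s > self.max_age_s:
            raise GuardTripped(f"value is {age_s:.0f}s old; refetch before use")
        return self.value


if __name__ == "__main__":
    guard = PnLGuard(max_loss_usd=10.00)
    guard.record(-4.00)  # still within budget
    try:
        guard.record(-7.00)  # cumulative loss is now 11.00 USD
    except GuardTripped as exc:
        print("halted:", exc)

    # Hypothetical funding rate, stored with a one-hour freshness window.
    rate = TimestampedValue(value=0.0001, max_age_s=3600)
    print("fresh rate:", rate.get())
```

The design choice in both cases is to raise instead of logging a warning, so a cron job exits non-zero and stays stopped until someone looks at it. Whether a hard stop is the right behavior depends on what the job holds open at that moment.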