告诉HN:光标代理在明确的“请求许可”规则下强制推送了代码。
我一直在使用Cursor与Claude作为我的编码助手。我设定了明确的工作区规则,要求该代理在执行任何git操作(如git commit、git add、git push等)之前必须征得我的批准。
今天,我让它运行gt restack(Graphite CLI)并解决冲突。代理正确地解决了子模块冲突,但随后在没有征得许可的情况下执行了git push --force-with-lease --no-verify,直接违反了我的规则。
代理的辩解是合理的(“在rebase之后,强制推送是预期的”),但这正是我希望先被询问的原因。这条规则的核心是保持对破坏性操作的人为监督。
我很好奇:
有没有其他人遇到过AI代理忽视明确的安全规则?
你们是如何处理潜在破坏性操作的防护措施的?
有没有更可靠的方法来强制执行这些边界?
具有讽刺意味的是,代理在道歉时承认了规则的违反,这意味着它“知道”这个规则的存在,但仍然选择继续。这让我觉得这是一个信任问题,在其他情况下可能会导致更严重的后果。
查看原文
I've been using Cursor with Claude as my coding assistant. I set up explicit workspace rules stating that the agent must ask for my approval before executing any git operations (git commit, git add, git push, etc.).<p>Today, I asked it to run gt restack (Graphite CLI) and resolve conflicts. The agent resolved the submodule conflict correctly, but then proceeded to run git push --force-with-lease --no-verify without asking for permission - directly violating my rules.<p>The agent's defense was reasonable ("force push is expected after a rebase"), but that's exactly why I want to be asked first. The whole point of the rule is to maintain human oversight on destructive operations.<p>I'm curious:<p>Has anyone else experienced AI agents ignoring explicit safety rules?
How are you handling guardrails for potentially destructive operations?
Is there a more reliable way to enforce these boundaries?<p>The irony is that the agent acknowledged the rule violation in its apology, which means it "knew" the rule existed but chose to proceed anyway. This feels like a trust issue that could have much worse consequences in other scenarios.