Hacker News (Chinese edition)



I've been using Cursor with Claude as my coding assistant. I set up explicit workspace rules stating that the agent must ask for my approval before executing any git operations (git commit, git add, git push, etc.).

Today, I asked it to run gt restack (Graphite CLI) and resolve conflicts. The agent resolved the submodule conflict correctly, but then proceeded to run git push --force-with-lease --no-verify without asking for permission - directly violating my rules.

The agent's defense was reasonable ("force push is expected after a rebase"), but that's exactly why I want to be asked first. The whole point of the rule is to maintain human oversight on destructive operations.

I'm curious: Has anyone else experienced AI agents ignoring explicit safety rules? How are you handling guardrails for potentially destructive operations? Is there a more reliable way to enforce these boundaries?

The irony is that the agent acknowledged the rule violation in its apology, which means it "knew" the rule existed but chose to proceed anyway. This feels like a trust issue that could have much worse consequences in other scenarios.
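On the last question, one direction is to enforce the rule outside the model's prompt, so that approval does not depend on the agent choosing to ask. Below is a minimal sketch of that idea. It assumes the agent's shell commands can be routed through a wrapper, which is an assumption and not a documented Cursor feature. The names `needs_approval`, `risky_flags`, `run_guarded`, and `RISKY_FLAGS` are hypothetical and do not come from any tool.

```python
import shlex
import subprocess

# Assumption: flags worth calling out in the approval prompt (not exhaustive).
RISKY_FLAGS = {"--force", "-f", "--force-with-lease", "--no-verify"}


def needs_approval(command):
    """Return True for any direct git invocation, mirroring the rule in the post."""
    tokens = shlex.split(command)
    return bool(tokens) and tokens[0] == "git"


def risky_flags(command):
    """List the flags from RISKY_FLAGS that appear in the command, in order."""
    # `--force-with-lease=<ref>` should still match, so compare the part before "=".
    return [t for t in shlex.split(command) if t.split("=")[0] in RISKY_FLAGS]


def run_guarded(command, ask=input, run=subprocess.run):
    """Run the command, but only after a human says yes if it is a git operation.

    `ask` and `run` are injectable so the guard can be exercised without
    a terminal or a real repository.
    """
    if needs_approval(command):
        flags = risky_flags(command)
        note = f" (risky flags: {', '.join(flags)})" if flags else ""
        answer = ask(f"Agent wants to run `{command}`{note}. Allow? [y/N] ")
        if answer.strip().lower() != "y":
            return None  # denied: nothing is executed
    return run(shlex.split(command), check=False)
```

With this in place, `run_guarded("git push --force-with-lease --no-verify")` would stop and prompt before anything is pushed. It only inspects the top-level command, so a tool such as gt that calls git internally would not be caught. A server-side control such as branch protection would be a separate layer.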

Tell HN: Cursor agent force-pushed despite explicit "ask for permission" rules