Show HN: A deterministic security solution for AI agents – OpenClaw and two others

3 points by steadeepanda, 20 days ago
I wanted to share a solution I originally built for myself for OpenClaw. It helps control what your AI agents can reach when you let them act on their own, without limiting their capabilities. I hope it's useful to you.

Basically, it lets you experiment freely with your agent within safe boundaries.

It's deterministic on purpose (it doesn't include any AI layer), which means it follows clear, predefined rules to maximize safety, security, and predictability.

The rules are heavily tested on detecting prompt injection attempts and other security cases (explained in detail in the docs).

Everything is local and lives on your computer, including the docs site.

It gives you a control panel to monitor and control boundaries. When a boundary is about to be crossed, you receive an approval request that shows what your OpenClaw was trying to do.

It also (currently) supports Tailscale, so you can connect your Tailscale IP address and receive everything on your phone, where you can also chat normally and approve or deny requests. It lets you access the control panel from anywhere via your Tailscale IP address (a private one is recommended). Currently only the Telegram channel is supported.

It only supports Linux for now, with the Opencode, Claude Code, and OpenClaw runners.

What you need to get started is explained in the README, which also includes quick demo/showcase images so you can see how it looks.

I'd be happy to hear your feedback, especially from testing it against prompt injections to see how it handles them. Don't hesitate to open an issue on GitHub for any problem you find; I'll do my best to fix it.

Link here: https://github.com/steadeepanda/agent-ruler

Thank you for reading. I'll be happy to discuss it.
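
To illustrate what a deterministic, rule-based boundary check can look like in general, here is a minimal sketch. It is not Agent Ruler's actual code or rule format: the names `Rule`, `Decision`, and `evaluate`, the action names, and the example paths are all assumptions made for illustration. The idea is only that a fixed, ordered rule list maps each requested action to allow, deny, or "ask a human", with no model in the loop.

```python
import posixpath
from dataclasses import dataclass
from enum import Enum
from pathlib import PurePosixPath


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"  # escalate to a human approval request


@dataclass(frozen=True)
class Rule:
    action: str        # hypothetical action names: "read", "write", "exec"
    path_prefix: str   # directory boundary the rule applies to
    decision: Decision


# Hypothetical rule set. Order is fixed and the first matching rule wins.
RULES = [
    Rule("write", "/home/user/workspace", Decision.ALLOW),
    Rule("read", "/home/user/workspace", Decision.ALLOW),
    Rule("read", "/home/user/.ssh", Decision.DENY),
    Rule("write", "/home/user", Decision.ASK),
]


def evaluate(action: str, path: str, rules=RULES) -> Decision:
    """Return the decision of the first rule matching the action and path."""
    # Collapse ".." segments first so a path cannot escape a boundary lexically.
    target = PurePosixPath(posixpath.normpath(path))
    for rule in rules:
        prefix = PurePosixPath(rule.path_prefix)
        # Match on whole path components, not on raw string prefixes.
        inside = target == prefix or prefix in target.parents
        if rule.action == action and inside:
            return rule.decision
    return Decision.DENY  # default-deny when no rule matches
```

Because the function depends only on its inputs and a fixed rule list, the same request always produces the same decision, which is what makes this style of check predictable and easy to test. A real tool would also need to handle symlinks, network access, and command execution, which this sketch does not attempt.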