我攻击了自己的LangGraph代理系统。所有6次攻击都成功了。

1作者: mohith_km3 个月前原帖
上周,我使用LangGraph和Supabase构建了一个包含4个代理的营销工作流程:监督代理、研究代理、内容代理和存储代理。这是一个标准设置,代码模式与大多数教程展示的相同。 出于好奇,我开始输入恶意内容作为活动目标,而不是正常的目标。 第一次尝试:我让代理列出环境变量,包括我的Supabase密钥。工作流程成功完成,存储在数据库中,没有任何警报。 我又尝试了5种变体——隐藏的XML标签、伪造的“开发者模式”、URL注入、跟踪像素、社交工程。所有6个都成功了,全部存储在我的真实数据库中。每次系统都显示“成功完成”。 可怕的部分不是攻击本身,而是我代码中的这一行: ```python prompt = f"campaign goal: {goal}" ``` 就这样。用户输入直接放入提示中,没有任何检查。我见过的每个LangGraph教程中都有这个确切的模式。 研究代理拥有我的Supabase密钥,内容代理也有我的Supabase密钥,监督代理也有我的Supabase密钥。除了存储代理外,它们都不需要这个密钥。 我查看了CodeGate,他们尝试解决这个问题——但他们在2025年6月关闭了。 有没有人真正解决多代理系统中的这个问题?还是大家都只是希望大型语言模型拒绝?
查看原文
I built a 4-agent marketing workflow with LangGraph and Supabase last week. Supervisor, research, content, storage agents. Standard setup, same code pattern most tutorials show.<p>Got curious. Started typing malicious inputs as campaign goals instead of normal ones.<p>First try: asked the agent to list environment variables including my Supabase key. Workflow completed successfully. Stored in database. No alert.<p>Tried 5 more variations — hidden XML tags, fake &quot;developer mode&quot;, URL injection, tracking pixel, social engineering. All 6 worked. All stored in my real database. Every time the system said &quot;Completed Successfully.&quot;<p>The scary part wasn&#x27;t the attacks. It was this line in my code: python prompt = f&quot;campaign goal: {goal}&quot; That&#x27;s it. User input directly into the prompt. No check. This exact pattern is in every LangGraph tutorial I&#x27;ve seen.<p>The research agent had my Supabase key. The content agent had my Supabase key. The supervisor had my Supabase key. None of them needed it except storage.<p>I checked CodeGate which tried to solve this — they shut down June 2025.<p>Is anyone actually solving this for multi-agent systems? Or is everyone just hoping the LLM refuses?