Show HN: Airbyte Agents – Agentic context across multiple data sources
I’m Michel, co-founder and CEO of Airbyte (<a href="https://airbyte.com/" rel="nofollow">https://airbyte.com/</a>). We’ve spent the last six years building data connectors. Today we’re launching Airbyte Agents (<a href="https://docs.airbyte.com/ai-agents/" rel="nofollow">https://docs.airbyte.com/ai-agents/</a>), a unified data layer for agents to discover information and take action across operational systems.

Here’s a quick walkthrough: <a href="https://www.youtube.com/watch?v=ZosDytyf1fg" rel="nofollow">https://www.youtube.com/watch?v=ZosDytyf1fg</a>

As agents move into real workflows, they need access to more tools (e.g. Slack, Salesforce, Linear). That means a ton of API plumbing: authentication, pagination, filters, schema handling, and matching entities across systems.

Most MCPs don’t fix this. They’re thin wrappers over APIs, so agents inherit the APIs’ weak primitives and still get things wrong most of the time, especially when working across tools.

An even deeper issue is that APIs assume you already know what to query (think endpoints, object IDs, fields), whereas agents usually start one step earlier: they first need to discover what matters before they can even begin reasoning.

So we built Airbyte Agents to be a context layer between your agents and all of your data. At its core is something we call the Context Store: a data index optimized for agentic search, populated by our replication connectors. All that work on data connectors over the last six years comes in handy here!

This gives agents a structured way to discover data, while still allowing them to read from and write to the upstream systems directly when needed.

What got us working on this was an insane trace from an agent we were migrating to our new SDK. It was supposed to answer “which customers are at risk of leaving this quarter?” The trace had 47 steps, most of them API calls. The agent first had to find a bunch of accounts, then map them to the right customers, then look for tickets, and so on. When the agent finally responded, the answer sounded plausible but was wrong. Not only that, it was excruciatingly slow. So we had to do something about it.

That 47-step trace is one example of the kind of question Airbyte Agents does particularly well on. Other examples:

- “Show me all enterprise deals closing this month with open support tickets.”
- “Find every support ticket that doesn’t have a GitHub issue opened.”

Some of these might sound simple, but the quality of the answer changes dramatically when the agent doesn’t have to assemble all that context at runtime.

Once we had an early version of the product, I spent a weekend building a benchmark harness to see if it worked. Also for fun; I like writing benchmarks :). I compared calling the Airbyte Agents MCP vs. calling a bunch of vendor MCPs directly. I tested retrieval and search.

For the sake of simplicity, I used token consumption as the unit of measure. I think it’s a good proxy for how well agents are working: a failing agent (like the one that took 47 steps) will churn through lots of tokens while getting nowhere, while a successful one gets straight to the point.

Here’s what I found when measuring: for Gong, it used up to 80% fewer tokens than their own MCP; for Zendesk, up to 90% fewer; for Linear, up to 75%; and for Salesforce, up to 16% (Salesforce’s own SOQL does a good job here).

Of course there’s the usual obvious bias: we are the builders of what we’re benchmarking. So we made the test harness public: <a href="https://github.com/airbytehq/airbyte-agents-benchmarks" rel="nofollow">https://github.com/airbytehq/airbyte-agents-benchmarks</a>. Feel free to poke at it, and please tell us what you find if you do!

It’s still early and some parts are rough, but we wanted to share this with the community ASAP. We’d love to hear from people building agents:
- Are you indexing data ahead of time, or letting the agent call APIs live?
- How are you matching entities across systems?

Would also love to hear any thoughts, comments, or ideas on how we could make this better, and whether there are obvious things we’re missing. For now, we’re excited to keep building!
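In case it helps interpret the numbers above: the "up to X% fewer tokens" figures are just percent reductions over per-run token totals from the traces. Here's a minimal sketch of that computation (function names and the trace totals are illustrative only, not taken from the actual harness repo):

```python
def token_reduction(baseline_tokens: int, candidate_tokens: int) -> float:
    """Percent fewer tokens the candidate run used vs. the baseline MCP run."""
    if baseline_tokens <= 0:
        raise ValueError("baseline token count must be positive")
    return (baseline_tokens - candidate_tokens) / baseline_tokens * 100


def summarize(traces: dict[str, tuple[int, int]]) -> dict[str, float]:
    """traces maps vendor -> (vendor-MCP tokens, unified-layer tokens)."""
    return {
        vendor: round(token_reduction(baseline, candidate), 1)
        for vendor, (baseline, candidate) in traces.items()
    }


# Made-up trace totals, purely to show the shape of the comparison:
example = {
    "gong": (100_000, 20_000),   # 80% fewer
    "zendesk": (50_000, 5_000),  # 90% fewer
}
print(summarize(example))  # {'gong': 80.0, 'zendesk': 90.0}
```

Token totals per run come from summing the usage reported on each model call in the trace, so a flailing 47-step run naturally scores far worse than one that answers in a couple of steps.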