Ask HN: Replacing RAG pipelines with a filesystem interface for AI agents
Every AI agent project I start ends up with the same boilerplate: chunk docs, pick an embedding model, set up a vector store, write retrieval logic, wire it into a custom tool.

It works, but it's plumbing, and it needs to be rebuilt for every new agent or runtime.

The idea I'm exploring: mount a drive at /drive/ with two directories:

- /drive/files/: actual documents (PDF, code, markdown, etc.)
- /drive/search/: virtual directory where the filename IS the semantic query

So instead of calling a custom RAG tool, the agent just does:

    cat "/drive/search/refund policy enterprise customers"

Any runtime that can read files works immediately. No integration code. I expect context cost to drop roughly 10-20x, since the agent gets a relevant chunk, not the full document.

Under the hood: markitdown for conversion, sqlite-vss for vector search, and a virtual filesystem layer to wire it all together.

Before I build this: is this a solved problem I'm not aware of? Does the filesystem interface make sense, or am I overcomplicating something simpler?

If there's enough interest, I'll build this in public and share the GitHub repo, implementation details, and updates.

Follow along: @r_klosowski on X
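
To make the question concrete, here is a rough Python sketch of what a read under /drive/search/ would resolve to. It is not the implementation: a toy bag-of-words similarity stands in for the embedding model and sqlite-vss, the corpus is hard-coded instead of coming from markitdown, and the names `CHUNKS`, `embed`, `query_from_path`, and `read_search_path` are all hypothetical. The virtual filesystem layer itself is not shown; this is only the lookup it would call when a file is read.

```python
import math
import re
from collections import Counter

SEARCH_ROOT = "/drive/search/"  # mount layout described in the post

# Hypothetical pre-chunked corpus as (source file, chunk text) pairs.
# In the proposed design these would be markitdown output indexed in sqlite-vss.
CHUNKS = [
    ("policies/refunds.md", "Enterprise customers may request a refund within 60 days of invoice."),
    ("policies/refunds.md", "Self-serve customers may request a refund within 14 days of purchase."),
    ("eng/deploy.md", "Deploys run from the main branch after CI passes."),
]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def query_from_path(path: str) -> str:
    """Treat everything after /drive/search/ as the semantic query."""
    if not path.startswith(SEARCH_ROOT):
        raise ValueError(f"not a search path: {path!r}")
    query = path[len(SEARCH_ROOT):].strip("/")
    if not query:
        raise ValueError("empty query")
    return query


def read_search_path(path: str, top_k: int = 1) -> str:
    """Return the text a read of the virtual file would produce: the top_k chunks."""
    query_vec = embed(query_from_path(path))
    ranked = sorted(CHUNKS, key=lambda c: cosine(query_vec, embed(c[1])), reverse=True)
    return "\n\n".join(f"[{source}]\n{text}" for source, text in ranked[:top_k])


if __name__ == "__main__":
    print(read_search_path("/drive/search/refund policy enterprise customers"))
```

The agent-facing contract is just "path in, ranked chunks out", so the ranking function could be swapped for a real vector query without the agent noticing. Tagging each chunk with its source file is one way to let the agent follow up by reading the full document under /drive/files/.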