Show HN: I built a small open-source kernel for replaying and diffing AI decisions
I’ve been hacking on a small open-source project called Verist and figured I’d share it here to get some early feedback.

What kept bothering me with AI features in production wasn’t really how to build them, but everything that comes after: explaining why something happened, reproducing it weeks later, or changing prompts/models without breaking things in subtle ways.

Logs helped a bit, but not enough. Agent frameworks felt too implicit for my taste. And model upgrades were honestly scary: outputs would change, and it wasn’t always obvious where or why.

So I ended up building a very small, explicit kernel where each AI step can be replayed, diffed, and reviewed. Think something like Git-style workflows for AI decisions, but without trying to be a framework or a runtime.

It’s not an agent framework, not a chat UI, and not a platform; it’s just a TypeScript library focused on explicit state, audit events, and replay + diff (rough sketch of the idea at the end of the post).

Repo: https://github.com/verist-ai/verist

I’m especially curious whether others here have run into similar issues shipping AI features to prod, or whether this feels like overkill. Happy to answer questions or hear criticism.
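To make the core idea concrete, here’s a minimal TypeScript sketch of the pattern. This is not Verist’s actual API; the names (`StepRecord`, `runStep`, `replayAndDiff`) are made up for illustration of what “explicit state + audit events + replay/diff” means in practice:

```typescript
// One recorded AI step: everything needed to reproduce it later is
// captured explicitly, rather than living in hidden framework state.
interface StepRecord {
  id: string;
  model: string;      // e.g. the exact model version used
  prompt: string;     // the fully rendered prompt, not a template
  input: unknown;     // structured input that went into the step
  output: string;     // what the model returned
  timestamp: string;
}

// Append-only audit log of step records.
const auditLog: StepRecord[] = [];

// Run a step through a caller-supplied model function and record it.
async function runStep(
  id: string,
  model: string,
  prompt: string,
  input: unknown,
  callModel: (model: string, prompt: string) => Promise<string>
): Promise<StepRecord> {
  const output = await callModel(model, prompt);
  const record: StepRecord = {
    id,
    model,
    prompt,
    input,
    output,
    timestamp: new Date().toISOString(),
  };
  auditLog.push(record);
  return record;
}

// Replay an old record against a (possibly different) model and diff the
// outputs, so a prompt or model change shows exactly what moved and where.
async function replayAndDiff(
  old: StepRecord,
  newModel: string,
  callModel: (model: string, prompt: string) => Promise<string>
): Promise<{ changed: boolean; before: string; after: string }> {
  const after = await callModel(newModel, old.prompt);
  return { changed: after !== old.output, before: old.output, after };
}
```

The actual library is structured differently, but the shape is the same: each decision is an explicit record you can store, replay, and compare, instead of a log line you hope is complete.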