Ask HN: How do you handle persistent memory in local Ollama sessions?

2 points by null-phnix, about 2 months ago
I build a lot of small AI tools locally, mostly on top of Ollama, and the thing I keep running into is that every session starts from zero. Whatever context I built up the night before, whatever the model learned about how I like things structured, whatever half-finished reasoning we were working through together, it is just gone when I open a new terminal.

For a while I was just manually pasting in context at the start of every session which is exactly as painful as it sounds. Eventually I built a small proxy that sits between my client and Ollama and tries to solve this. It embeds recent interactions, stores them locally, and injects the relevant chunks when a new session starts. It works well enough that I actually use it every day now, but I built it the way someone with no formal CS background builds things, which means I patched it into shape and I am not totally confident the architecture is right.

The part that still bothers me is scoping. I work on a few different projects at the same time and I do not want context from one bleeding into another. Right now I am managing that by hand, basically just keeping separate directories and being careful, but that feels like a workaround not a solution.

Genuinely curious what other people have landed on. Are you using a vector DB for retrieval, or plain files, or something MCP based, or have you just accepted that local sessions are stateless and built your workflow around that? And if you have solved the scoping problem cleanly I really want to know how.
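
For anyone who wants something concrete to react to, here is a minimal sketch of the embed, store, and inject loop described above, with the project scope stored as a column rather than implied by directory layout. It is not the proxy from the post, just one possible shape under stated assumptions: `ScopedMemory`, `toy_embed`, `build_prompt`, and the table layout are hypothetical names invented for illustration, and `toy_embed` is a hashed bag-of-words stand-in that would be swapped for a real embedding model in practice.

```python
import hashlib
import json
import math
import sqlite3
from typing import Callable, List

Vector = List[float]


def toy_embed(text: str, dim: int = 64) -> Vector:
    """Hashed bag-of-words vector. ASSUMPTION: a stand-in for a real embedding model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ScopedMemory:
    """Stores text chunks per project and retrieves only within one project."""

    def __init__(self, path: str = ":memory:",
                 embed: Callable[[str], Vector] = toy_embed):
        self.db = sqlite3.connect(path)
        self.embed = embed
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memory ("
            "id INTEGER PRIMARY KEY, project TEXT NOT NULL, "
            "text TEXT NOT NULL, embedding TEXT NOT NULL)"
        )
        self.db.execute(
            "CREATE INDEX IF NOT EXISTS idx_project ON memory(project)"
        )

    def remember(self, project: str, text: str) -> None:
        """Embed a chunk and store it under the given project."""
        self.db.execute(
            "INSERT INTO memory (project, text, embedding) VALUES (?, ?, ?)",
            (project, text, json.dumps(self.embed(text))),
        )
        self.db.commit()

    def recall(self, project: str, query: str, k: int = 3) -> List[str]:
        """Return up to k chunks from this project, most similar first."""
        # The WHERE clause is the scoping boundary: rows from other
        # projects are never loaded, so they can never be injected.
        rows = self.db.execute(
            "SELECT text, embedding FROM memory WHERE project = ?", (project,)
        ).fetchall()
        q = self.embed(query)
        scored = sorted(
            rows, key=lambda r: cosine(q, json.loads(r[1])), reverse=True
        )
        return [text for text, _ in scored[:k]]


def build_prompt(memory: ScopedMemory, project: str, user_message: str) -> str:
    """Prepend recalled chunks to the user message before it goes to the model."""
    chunks = memory.recall(project, user_message)
    if not chunks:
        return user_message
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Relevant notes from earlier sessions:\n{context}\n\n{user_message}"


if __name__ == "__main__":
    mem = ScopedMemory()
    mem.remember("blog", "Posts are written in Markdown with front matter")
    mem.remember("cli-tool", "The CLI uses argparse with subcommands")
    print(build_prompt(mem, "blog", "How are posts written?"))
```

The design choice worth noting is that scoping happens in the query, not in the ranking: similarity is only ever computed over rows that already belong to the current project, so a strong match from another project cannot leak in. How the project name gets chosen (a flag, an environment variable, the working directory) is left open here.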