2作者: beacon2944 个月前原帖
This is a declarative orchestration framework for stateless short-lived LLM agents.<p>Agents are orchestrated by hierarchical state machines (HSMs).<p>The HSMs and agents are defined in YAML, with hooks and webhooks for additional functionality.<p>* FlatAgent: A single LLM call: model + prompts + output schema.<p>* FlatMachine: A state machine that orchestrates multiple agents, actions, and state machines.<p>Examples: - <a href="https:&#x2F;&#x2F;github.com&#x2F;memgrafter&#x2F;flatagents&#x2F;tree&#x2F;main&#x2F;sdk&#x2F;python&#x2F;examples" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;memgrafter&#x2F;flatagents&#x2F;tree&#x2F;main&#x2F;sdk&#x2F;pytho...</a> - <a href="https:&#x2F;&#x2F;github.com&#x2F;memgrafter&#x2F;research-crawler-flatagents" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;memgrafter&#x2F;research-crawler-flatagents</a> - <a href="https:&#x2F;&#x2F;github.com&#x2F;memgrafter&#x2F;claude-skills-flatagents" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;memgrafter&#x2F;claude-skills-flatagents</a>
4作者: raymondtana4 个月前原帖
Raymond here from Butter.dev, an LLM response cache built as a chat-completions proxy. Today we&#x27;re launching a key feature for the platform: the ability to generalize on dynamic, templated inputs.<p>Caching at the HTTP request level has the obvious problem of generalizability. Nearly no request is identical, due to templated variables (like names) and metadata (like timestamps), so exact-match cache lookups rarely hit. We solve this at Butter by using LLMs to detect dynamic content in requests and derive their inter-relationships, allowing the cache entry to be stored as a template + variables + deterministic code. This allows future requests to contain different variable data, yet still serve from cache.<p>We&#x27;ve found this approach greatly improves cache hit rate, and believe it could be useful for agents performing repetitive back-office tasks, computer use, or data transformations where input data is frequently of the same shape.<p>- You can see a demo of learning patterns here: <a href="https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=ORDfPnk9rCA" rel="nofollow">https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=ORDfPnk9rCA</a><p>- We wrote more about the technical approach here: <a href="https:&#x2F;&#x2F;blog.butter.dev&#x2F;on-automatic-template-induction-for-response-caching">https:&#x2F;&#x2F;blog.butter.dev&#x2F;on-automatic-template-induction-for-...</a><p>- It&#x27;s free to try out here: <a href="https:&#x2F;&#x2F;butter.dev&#x2F;auth">https:&#x2F;&#x2F;butter.dev&#x2F;auth</a>