Show HN: LMM for LLMs - A mental model for building LLM apps
I've been building agentic apps for some large Fortune 500 companies (T-Mobile, Twilio, etc.) and developed a mental model that serves as a practical guide for building them: separate the high-level, agent-specific logic from the low-level platform capabilities. I call it the L-MM: the Logical Mental Model for LLM applications.
This mental model has not only been tremendously helpful in building agents, it also helps customers reason about the development process - so when I finish a consulting engagement they can move faster across the stack, with engineers and platform teams working concurrently without stepping on each other, which boosts productivity.
So what is the high-level logic versus the low-level platform work?
### High-Level Logic (Agent & Task Specific)
**Tools and Environment** - The specific integrations and capabilities that let agents interact with external systems or APIs to perform real-world tasks. Examples include:
```
Booking a table via the OpenTable API
Scheduling calendar events via Google Calendar or Microsoft Outlook
Retrieving and updating data in CRM platforms like Salesforce
Using payment gateways to complete transactions
```
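As a rough sketch of what one of these tools can look like when exposed to an agent: a plain function plus an OpenAI-style function-tool schema. The endpoint, payload, and field names below are placeholders, not the real OpenTable API.
```
import requests  # any HTTP client works; requests is assumed to be installed

# Hypothetical endpoint and payload shape; the real OpenTable API differs.
def book_table(restaurant_id: str, party_size: int, iso_time: str) -> dict:
    """Book a table and return the confirmation payload."""
    resp = requests.post(
        "https://api.example-opentable.com/v1/bookings",  # placeholder URL
        json={"restaurant_id": restaurant_id, "party_size": party_size, "time": iso_time},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

# OpenAI-style function-tool schema the agent layer advertises to the model.
BOOK_TABLE_TOOL = {
    "type": "function",
    "function": {
        "name": "book_table",
        "description": "Book a restaurant table for the user.",
        "parameters": {
            "type": "object",
            "properties": {
                "restaurant_id": {"type": "string"},
                "party_size": {"type": "integer"},
                "iso_time": {"type": "string", "description": "ISO-8601 start time"},
            },
            "required": ["restaurant_id", "party_size", "iso_time"],
        },
    },
}
```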
**Role and Instructions** - Clearly defining an agent's persona, responsibilities, and explicit instructions is essential for predictable, coherent behavior. This includes:
```
The "personality" of the agent (e.g., a professional assistant)
Explicit boundaries around task completion ("done criteria")
Behavioral guidelines for handling unexpected inputs or situations
```
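One way to keep the persona, done criteria, and fallback behavior explicit is to treat them as structured configuration rather than an ad-hoc prompt string. A minimal sketch, with the `AgentSpec` shape invented for illustration:
```
from dataclasses import dataclass, field

@dataclass
class AgentSpec:
    persona: str
    done_criteria: list[str]
    fallback_behavior: str
    tools: list[dict] = field(default_factory=list)

    def system_prompt(self) -> str:
        # Render the spec into a single system prompt for the model.
        done = "\n".join(f"- {c}" for c in self.done_criteria)
        return (
            f"You are {self.persona}.\n"
            f"The task is complete only when:\n{done}\n"
            f"If the input is unexpected or out of scope: {self.fallback_behavior}"
        )

booking_agent = AgentSpec(
    persona="a professional restaurant-booking assistant",
    done_criteria=["a reservation is confirmed", "the user has the confirmation number"],
    fallback_behavior="ask a clarifying question instead of guessing",
)
```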
### Low-Level Logic (Common Platform Capabilities)
**Routing** - Efficiently coordinating tasks across multiple specialized agents, ensuring seamless hand-offs and effective delegation:
```
Intelligent load balancing and dynamic agent selection based on task context
Support for retries, failover strategies, and fallback mechanisms
```
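A toy version of that routing layer, assuming a naive keyword-based selector and placeholder agents; a production router would likely use a classifier or an LLM to pick the agent.
```
import time

# Placeholder "agents"; in practice these would be full agent invocations.
AGENTS = {
    "booking": lambda task: f"[booking agent] handled: {task}",
    "billing": lambda task: f"[billing agent] handled: {task}",
    "general": lambda task: f"[general agent] handled: {task}",
}

def select_agent(task: str) -> str:
    # Naive keyword routing based on task context.
    if "reservation" in task or "table" in task:
        return "booking"
    if "invoice" in task or "refund" in task:
        return "billing"
    return "general"

def route(task: str, retries: int = 2, fallback: str = "general") -> str:
    name = select_agent(task)
    for attempt in range(retries + 1):
        try:
            return AGENTS[name](task)
        except Exception:
            time.sleep(0.5 * (attempt + 1))  # brief backoff before retrying
    # Failover: hand the task to the fallback agent rather than dropping it.
    return AGENTS[fallback](task)
```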
**Guardrails** - Centralized mechanisms that safeguard interactions and ensure reliability and safety:
```
Filtering or moderating sensitive or harmful content
Real-time compliance checks for industry-specific regulations (e.g., GDPR, HIPAA)
Threshold-based alerts and automated corrective actions to prevent misuse
```
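A minimal sketch of a centralized guardrail check that runs before input reaches any agent. The regex patterns below are stand-ins; real deployments would use proper PII/content classifiers and policy engines.
```
import re

# Placeholder patterns; production systems use real PII/content classifiers.
BLOCKED_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN-like strings
    re.compile(r"\b\d{16}\b"),             # bare card-number-like strings
]

class GuardrailViolation(Exception):
    pass

def check_message(text: str) -> str:
    """Reject messages that trip a guardrail before they reach an agent."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(text):
            raise GuardrailViolation("message contains sensitive data")
    return text
```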
**Access to LLMs** - Robust, centralized access to multiple LLMs, ensuring high availability and scalability:
```
Smart retry logic with exponential backoff
Centralized rate limiting and quota management to optimize usage
Transparent handling of diverse LLM backends (OpenAI, Cohere, local open-source models, etc.)
```
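The gateway idea in miniature: one call site, exponential backoff with jitter, and a backend registry so callers never hardcode a provider. The registry and backend names here are assumptions for illustration.
```
import random
import time
from typing import Callable

# Registry of provider callables; swap implementations without touching callers.
BACKENDS: dict[str, Callable[[str], str]] = {
    "primary": lambda prompt: f"primary model answer to: {prompt}",
    "fallback": lambda prompt: f"fallback model answer to: {prompt}",
}

def complete(prompt: str, backend: str = "primary", max_retries: int = 4) -> str:
    """Call an LLM backend with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return BACKENDS[backend](prompt)
        except Exception:
            if attempt == max_retries - 1:
                raise
            # 0.5s, 1s, 2s, ... plus jitter to avoid thundering herds.
            time.sleep(0.5 * (2 ** attempt) + random.random() * 0.1)
```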
**Observability** - Comprehensive visibility into system performance and interactions using industry-standard practices:
```
W3C Trace Context-compatible distributed tracing for clear visibility across requests
Detailed logging and metrics collection (latency, throughput, error rates, token usage)
Easy integration with popular observability platforms like Grafana, Prometheus, Datadog, and OpenTelemetry
```
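A small tracing sketch using the OpenTelemetry Python API; the span and attribute names are made up, and a real setup would also configure an SDK and an exporter pointed at Grafana, Prometheus, or Datadog.
```
from opentelemetry import trace

tracer = trace.get_tracer("agent-platform")

def traced_completion(prompt: str) -> str:
    # W3C Trace Context propagation comes along once an OTel SDK and exporter are configured.
    with tracer.start_as_current_span("llm.completion") as span:
        span.set_attribute("llm.prompt_chars", len(prompt))
        result = f"model answer to: {prompt}"  # stand-in for the real LLM call
        span.set_attribute("llm.completion_chars", len(result))
        return result
```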
### Why This Matters
By adopting this structured mental model, teams get a clear separation of concerns, which improves collaboration, reduces complexity, and accelerates the development of scalable, reliable, and safe agentic applications.
I'm actively working on the challenges in this space. If you're navigating similar problems or have insights to share, let's discuss - I'll also leave some links about the stack below if folks want them.
High-level framework - [https://openai.github.io/openai-agents-python/](https://openai.github.io/openai-agents-python/)
Low-level infrastructure - [https://github.com/katanemo/archgw](https://github.com/katanemo/archgw)