Show HN: Cut LLM token usage by ~30% with this MCP/CLI tool (benchmarked with Claude)

Author: jahala, 2 months ago (original post)
Smart code reading for humans and AI agents. Tilth is what happens when you give ripgrep, tree-sitter, and cat a shared brain.

--

v0.4.4: Added adaptive 2nd-hop impact analysis to callers search: when a function has ≤10 unique callers, tilth automatically traces callers-of-callers in a single scan. First full 26-task Opus baseline (previously 5 hard tasks only). Haiku adoption improved from 42% to 78%, flipping Haiku from a cost regression to -38% $/correct.

v0.4.5: Bumped TOKEN_THRESHOLD from 3500 to 6000 estimated tokens (~24KB), so mid-sized files return their full content instead of an outline that agents then read back via 5–7 sequential --section calls. Fixed two major regressions: gin_radix_tree (+35% → ~tie) and rg_search_dispatch (+90% → -26% win). Sonnet hit 100% accuracy (52/52) and -34% $/correct overall.

--

https://github.com/jahala/tilth/

Full results: https://github.com/jahala/tilth/blob/main/benchmark/README.m...

--

PS: I don't have the budget to run the benchmark many times (especially with Opus), so if any token whales have capacity to run some benchmarks, please feel free to PR results.
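To make the TOKEN_THRESHOLD behavior concrete, here is a minimal sketch of that kind of size-based switch. The names (TOKEN_THRESHOLD, estimate_tokens, read_file, make_outline) and the ~4-bytes-per-token estimate (implied by 6000 tokens ≈ 24KB) are my assumptions for illustration, not tilth's actual internals.

```python
# Hypothetical sketch of a TOKEN_THRESHOLD-style heuristic; names and the
# bytes-per-token ratio are assumptions, not tilth's real implementation.

TOKEN_THRESHOLD = 6000   # v0.4.5 value; was 3500
BYTES_PER_TOKEN = 4      # rough estimate implied by "6000 tokens (~24KB)"

def estimate_tokens(text: str) -> int:
    """Cheap size-based token estimate (~4 bytes per token)."""
    return len(text.encode("utf-8")) // BYTES_PER_TOKEN

def make_outline(text: str) -> str:
    # Placeholder outline: top-level (non-indented) lines only.
    # The real tool would use tree-sitter for a structural outline.
    return "\n".join(
        line for line in text.splitlines() if line and not line[0].isspace()
    )

def read_file(text: str) -> str:
    """Return full content for mid-sized files; an outline above threshold."""
    if estimate_tokens(text) <= TOKEN_THRESHOLD:
        return text              # one call, full content
    return make_outline(text)    # agent drills in via --section follow-ups
```

The point of raising the threshold is that a single full-content response is cheaper than an outline followed by 5–7 sequential --section round trips.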
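The adaptive 2nd-hop analysis can be sketched roughly as follows. The call-graph representation, the SECOND_HOP_LIMIT name, and find_callers are illustrative assumptions; tilth builds its graph from ripgrep/tree-sitter output rather than a prebuilt dict.

```python
# Hypothetical sketch of adaptive 2nd-hop caller analysis: when a function
# has <=10 unique callers, also trace callers-of-callers. Data structures
# and names are assumptions, not tilth's implementation.

SECOND_HOP_LIMIT = 10

def find_callers(call_graph: dict[str, set[str]], target: str) -> dict[str, set[str]]:
    """call_graph maps callee -> set of direct callers."""
    first_hop = set(call_graph.get(target, set()))
    result = {target: first_hop}
    # Adaptive 2nd hop: only expand when the fan-in is small enough,
    # so widely-called utilities don't blow up the result.
    if len(first_hop) <= SECOND_HOP_LIMIT:
        for caller in first_hop:
            result[caller] = set(call_graph.get(caller, set()))
    return result

# Example: callers of "parse", plus their callers since fan-in <= 10.
graph = {"parse": {"main", "repl"}, "main": {"entry"}}
impact = find_callers(graph, "parse")
```

Capping the expansion at a small fan-in keeps the "single scan" cheap while still surfacing second-order impact for the functions where it is most informative.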