Structural and semantic components for improving code review with local models

2 points by rs545837 2 days ago
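I got curious about improving code reviews because they still suck, so I've been researching a triage layer that you can attach to your local LLMs or API calls for better code reviews.

Most review tools dump a PR diff into a model and hope it finds bugs. The model sees added/removed lines, hunk headers, and context lines. It has no idea that the function it's looking at is called by x other functions across y files, or that a type change here breaks an interface three directories away.

The triage layer parses source code into ASTs using tree-sitter, extracts semantically meaningful entities (functions, classes, methods, structs), and builds a cross-file dependency graph. It then ranks every changed entity by its transitive blast radius, meaning how many other entities depend on it directly or indirectly. That cuts the review surface by 80-90% and focuses the model's attention on the code where the bug is much more likely to be. I'm sure there will be out-of-distribution cases where it misses, but for fast code reviews this tradeoff is worth making.

Once you've narrowed the problem to "here are the n riskiest entities in this PR," you don't need a frontier model. You need a model that just knows your code. A 7B model fine-tuned on your codebase knows your patterns, your conventions, and your common bugs. Structural triage handles the global reasoning, which leaves your model to handle the judgment call, and it handles that really well.

Commands:

- inspect diff: entity-level diff with risk scoring and blast radius
- inspect predict: show which unchanged entities are at risk of breaking
- inspect review: structural triage + LLM review
- inspect pr: review a GitHub PR

21 language parsers. Written in Rust. Open source.

GitHub: https://github.com/Ataraxy-Labs/inspect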
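To make the blast-radius ranking described above concrete, here is a minimal sketch. It is not the tool's actual implementation (the tool is written in Rust and builds its graph from tree-sitter ASTs). It assumes the dependency graph already exists as a reverse map from each entity to the entities that use it. The entity names, `CALLERS`, `blast_radius`, and `rank_changed` are all hypothetical, and the score here is just the count of transitive dependents; the real risk score may weigh other factors.

```python
from collections import deque


def blast_radius(callers, entity):
    """Return every entity that depends on `entity`, directly or transitively.

    `callers` is an assumed reverse dependency map: entity -> entities using it.
    """
    seen = set()
    queue = deque([entity])
    while queue:
        current = queue.popleft()
        for dependent in callers.get(current, ()):
            # Skip entities already counted, and the start entity itself (cycles).
            if dependent not in seen and dependent != entity:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def rank_changed(callers, changed):
    """Rank changed entities by the size of their transitive blast radius."""
    scored = [(entity, len(blast_radius(callers, entity))) for entity in changed]
    # Largest radius first; ties are broken by name so the order is deterministic.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical graph, for illustration only.
CALLERS = {
    "db.connect": ["repo.load_user", "repo.save_user"],
    "repo.load_user": ["api.get_user", "api.login"],
    "repo.save_user": ["api.login"],
    "utils.format_date": ["api.get_user"],
}

ranking = rank_changed(CALLERS, ["utils.format_date", "db.connect", "api.login"])
print(ranking)
# [('db.connect', 4), ('utils.format_date', 1), ('api.login', 0)]
```

In this toy PR, a reviewer (or a small model) would look at `db.connect` first, since four other entities sit downstream of it, and could reasonably skip `api.login`, which nothing depends on.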