Ask HN: Why is dead code detection in Python harder than most tools admit?

2 points by duriantaco about 8 hours ago
I've been thinking about why dead code detection (and static analysis in general) feels so unreliable in Python compared to other languages. I understand that Python is highly dynamic by nature.

In theory it should be simple: parse the AST, build a call graph, find symbols with zero references. In practice it breaks down quickly because of things like:

1. dynamic dispatch (getattr, registries, plugin systems)
2. framework entrypoints (Flask/FastAPI routes, Django views, pytest fixtures)
3. decorators and implicit naming conventions
4. code invoked only via tests or runtime configuration

Most tools seem to pick one of two bad tradeoffs:

1. be conservative and miss lots of genuinely dead code
2. be aggressive and flag false positives, so people stop trusting the results

What's worked best for me so far is treating dead-code results as a confidence score of sorts rather than a yes/no verdict, and layering in limited runtime info (e.g. what actually executed during tests) instead of relying 100% on static analysis.

Curious how others handle this in real codebases. Do y'all just accept false positives? Or do you ignore dead code detection entirely? Has anyone seen approaches that actually scale? I'm aware that SonarQube is very noisy.

I built a library with a VS Code extension, mainly to explore these tradeoffs (link below if relevant), but I'm more interested in how others think about the problem. Also, I hope I'm in the right channel.

Repo for context: https://github.com/duriantaco/skylos
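To make the "simple in theory" pass and the confidence-score idea concrete, here is a rough sketch. It is not taken from skylos: the helper names, the sample module, and the 0.0 / 0.5 / 1.0 weights are all made up for illustration. A real tool would more likely take its runtime data from a coverage run than from `sys.setprofile`. The static pass collects function names that are never loaded by name, and the runtime pass records which functions were actually called.

```python
import ast
import sys

# Hypothetical module under analysis (illustration only).
SAMPLE = '''
ROUTES = {}

def route(path):
    def register(fn):
        ROUTES[path] = fn
        return fn
    return register

@route("/health")
def health():          # framework-style entrypoint, never called by name
    return "ok"

def handle_ping():     # only reachable through a dynamic lookup
    return "pong"

def dispatch(name):
    return globals()["handle_" + name]()

def used():
    return dispatch("ping")

def unused():          # genuinely dead
    return 2

used()
'''

def static_candidates(source):
    """Return {name: has_decorator} for functions with no static reference."""
    funcs, referenced = {}, set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            funcs[node.name] = bool(node.decorator_list)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            referenced.add(node.attr)
    return {n: deco for n, deco in funcs.items() if n not in referenced}

def executed_functions(source, filename="<sample>"):
    """Run the source and record the names of functions that were called."""
    called = set()

    def profiler(frame, event, arg):
        if event == "call" and frame.f_code.co_filename == filename:
            called.add(frame.f_code.co_name)

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        exec(compile(source, filename, "exec"), {})
    finally:
        sys.setprofile(previous)
    return called

def score_findings(source):
    """Assumed weights: executed -> 0.0, decorated -> 0.5, otherwise 1.0."""
    executed = executed_functions(source)
    scores = {}
    for name, decorated in static_candidates(source).items():
        if name in executed:
            scores[name] = 0.0   # static pass was wrong: it ran at runtime
        elif decorated:
            scores[name] = 0.5   # decorators often register entrypoints
        else:
            scores[name] = 1.0   # no static reference, never executed
    return scores

if __name__ == "__main__":
    for name, score in sorted(score_findings(SAMPLE).items()):
        print(f"{name}: {score:.1f}")
```

On this sample the static pass alone flags `health`, `handle_ping`, and `unused`. With the runtime layer, `handle_ping` drops to 0.0 because the dynamic lookup really did call it, `health` sits at 0.5 because of its decorator, and only `unused` stays at 1.0. Matching on bare function names is deliberately naive here; it ignores scopes, methods, and imports.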