Ask HN: Releasing code under AGPLv3, but want to prevent LLM rewrites?
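I am preparing to release a software project under the AGPLv3. The goal is traditional copyleft reciprocity: if you use it or host it, share your changes.
However, I am realistic about the current legal landscape. Big tech corps are treating public code as free raw material for LLM training under the banner of "Fair Use". I am concerned that a company will ingest my codebase and use an LLM to effectively launder the logic, allowing their users to prompt a clean, closed-source recreation of my software without triggering the AGPL.
Is there a licence designed specifically to prevent this while still keeping OSS healthy and alive? Is there an llm.txt / robots.txt that LLM scrapers actually respect? I feel that the whole OSS model is under threat here, even more than before (e.g. big corps earn billions from Linux instances without having to pay any software licensing cost, but they're more than happy to charge others for their own OS).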