The AI conundrum: we live in heavily subsidized, interesting times
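If you trace how large language models (LLMs) went from a technologist's dream to early text-generation toys, to the world-shifting launch of ChatGPT, and finally to the daily drivers of modern programming (Sonnet, Opus), it has all taken less than a decade. It's a thrilling, almost unbelievable tale.
Let's look at how we got here, and at the wall the industry is now hitting.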
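- **The Dream Phase (2010-2016)**. At the dawn of the last decade (2011), something interesting was happening. Two platforms, Wikipedia and Stack Overflow, were gaining tremendous traction, and people were collaborating on them to exchange knowledge in the open. Looking back, this feels like a more ideal, community-driven path for humanity, one we abandoned for the centralized architecture we have today.
- **The Disruption Phase (2016-2021)**. A perfect storm of unrelated events paved the way for AI. By 2017, new programmers were growing deeply frustrated with Stack Overflow's rigid policies, subjective question rejections, and the pedantry of senior coders. In retrospect, those strict moderators laid the first stones of what would later become Copilot and ChatGPT. If the community won't answer a beginner's question without downvoting it, a private LLM gladly will.
Add to this Google's landmark 2017 paper "Attention Is All You Need", which unlocked the Transformer architecture, and the forced isolation of COVID-19 in 2020. The ground was suddenly fertile for virtual assistants that could act as programming partners for isolated developers.
- **The Hook Phase (2023-2025)**. The launch of ChatGPT left no doubt about how easy the "hook" would be. For non-technical people, it was pure magic. It didn't take long for LLM-based tools like Copilot, Claude and DeepSeek to become an indispensable part of the programmer's toolbox. Meanwhile, OpenAI was still advertising its "non-profit" roots, and the consensus was that all of this was purely about empowering humanity.
- **The Endgame Phase (2025-present/future)**. By this point, AI companies had miscalculated a lot of things. They were optimizing for the "long term", but as John Maynard Keynes wrote in 1923, "In the long run we are all dead". The VCs are losing patience today because, while the technology itself has gained massive ubiquity and appreciation, the revenue isn't arriving nearly as fast. The hook worked well enough to win users, but not well enough to make them pay what the service costs.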
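Most frontier models, such as Sonnet, Opus and GPT 5.5, are still running in "subsidized mode". The monthly subscription they charge users (USD 10/20/30 per month) is a pittance compared with the compute and memory needed to run all those "thinking..." and "pondering..." tokens. To show real profits on the books and leave subsidized mode, they would have to charge in proportion to the input/output tokens consumed, and that appears to be difficult. Very few companies can sustain an open-ended budget for such unpredictable hardware scaling; the recent Uber story shows exactly what happens when they try.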
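The gap between a flat subscription and token-metered billing can be made concrete with a small sketch. Everything numeric here is a placeholder chosen for illustration: the per-million-token rates and the usage figures are assumptions, not any provider's actual prices or costs, and the names `Usage`, `metered_cost` and `subsidy` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """One subscriber's monthly token usage (illustrative figures only)."""
    input_tokens: int
    output_tokens: int


def metered_cost(usage: Usage, usd_per_m_input: float, usd_per_m_output: float) -> float:
    """USD the user would owe if billed per token instead of a flat fee."""
    input_cost = usage.input_tokens / 1_000_000 * usd_per_m_input
    output_cost = usage.output_tokens / 1_000_000 * usd_per_m_output
    return input_cost + output_cost


def subsidy(flat_fee: float, usage: Usage,
            usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Metered cost minus the flat fee.

    Positive: the provider absorbs this much per month for this user.
    Negative: the flat fee more than covers the metered cost.
    """
    return metered_cost(usage, usd_per_m_input, usd_per_m_output) - flat_fee


# Hypothetical rates: USD 3 per million input tokens, USD 15 per million output tokens.
RATE_IN, RATE_OUT = 3.0, 15.0
FLAT_FEE = 20.0  # one of the subscription tiers mentioned in the post

heavy_user = Usage(input_tokens=50_000_000, output_tokens=5_000_000)
light_user = Usage(input_tokens=1_000_000, output_tokens=200_000)

for name, usage in [("heavy", heavy_user), ("light", light_user)]:
    gap = subsidy(FLAT_FEE, usage, RATE_IN, RATE_OUT)
    print(f"{name}: metered={metered_cost(usage, RATE_IN, RATE_OUT):.2f} subsidy={gap:.2f}")
```

Under these made-up numbers the heavy user costs far more than the flat fee while the light user costs less, which is the shape of the problem the post describes: a flat price only works if light users outnumber heavy ones by enough to cover the difference.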
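The frontier models are trying to replace something that has never been successfully delegated or automated in all of human history: the highest cognitive skills of the human brain, such as reasoning, deduction and logic. Yet the effort continues, and the goals are long term. The conundrum is that if the companies stop subsidizing, the hook phase may be undone. There is a strong possibility that people will revert to the older ways of Wikipedia and Stack Overflow, or pivot entirely to open-source "dry/academic" models like Llama and Qwen that run locally on their own hardware. And yet the companies also can't keep subsidizing and draining their funds indefinitely.
What happens when the subsidy mirror cracks?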