Ask HN: Release path for a "transformer alternative"?

2 points | by adinhitlore | 11 days ago
So, a side project I've spent/wasted ~1000 hours on, with two goals set in mind:

1. Faster than transformers on CPU.
2. Smarter than transformers.

A couple of screenshots below (the black/red parts are censored on purpose... for now):

https://i.imgur.com/r0equ55.png https://i.imgur.com/fohRbIr.png https://i.imgur.com/5Xx1RGX.png

Summary: what the hell is this? Two architectures:

1. A linear RNN that addresses the long-memory problem in the current front-runner RNN transformer alternatives (RWKV, Mamba). It is also CPU-friendly and written entirely in C from scratch, yet not too big: ~4000 lines.

2. Two experimental SNN programs (originally in C, also ported to C# and F#) that turned out better than expected but, for the time being, are dumber than the linear RNN one (I need more tests).

The question is: what to do with them? Google Gemini Pro 3.1 / Sonnet 4.6 told me to patent it and file for IP, estimating the value in the many millions, and while that is clearly a mistake: I've already uploaded all the code to Claude/Gemini for analysis, and seeing how the project is ~70% vibecoded, I think it would be snobby to act like a gatekeeper.

The thing is: I don't want millions, but at the same time I see several issues with a free open-source rollout:

* Completely unaligned. I don't believe in the "AGI hype", but potential risks may exist, such as in cybersecurity.
* I frankly hate xAI and Musk, and since the companies that might be interested in running AI models as a B2C solution number maybe ~20, one of them will be xAI.
* Very unorthodox implementation: all in C, with ports in C#/F#. No Python or Rust, which means some people in ML unfamiliar with these languages would likely run into issues, so I'd have to provide support nonstop, which is time-consuming, and let's face it, I'd have to do it for free once it's open source.
* It may die completely unheard of somewhere on GitHub even if it has potential; organic traffic rarely works unless you hit the lottery.

This is NOT a flex, by the way. I'm convinced there are programmers better than me, people who understand ML better than me, mathematicians better than me. Frankly, though, I possess a special kind of persistence combined with arrogance, which goes a long way in terms of technology/inventions/novelty.

Like I said, this is the result of hundreds of hours of work on top of many years of programming experience in other areas; this wasn't a one-weekend "Claude, give me AGI" kind of shot.

All of the projects compile with zero warnings, logically seem to work, and are visibly faster than transformers, with an obvious ability to generalize and create new/unique content. The missing parts are scaling and evaluation on classic benchmarks.

What I lack is an understanding of technology adoption. 10x!