Is there a future for SNN (spiking neural network)-inspired generative AI implemented in C/C#, with no external AI libraries?
So over the past months I've experimented with many potential alternatives to transformers, one of which is the so-called Spiking Neural Network (SNN). For those unfamiliar with them: SNNs are an approach to AI that seeks to make it as similar to the human brain as possible. In other words, while transformers, RNNs, and LSTMs are result-driven (they don't try to simulate humans 1:1), SNNs actually try to mimic the way humans learn. The idea of using them for NLP isn't entirely new (google SpikeGPT); however, to the best of my knowledge no one has yet tried the following:

* inclusion of episodic long-term memory in the SNN every n tokens behind (my own idea...);
* implementations in C and C# without Torch/TensorFlow (SpikeGPT is in Python with Torch);
* several types of 'attention' mechanisms, training modes, and memory modes;
* training/learning without backpropagation;
* CPU-friendly in the sense that while it's still kind of slow (unfortunately), at least a GPU isn't mandatory.

Here are screenshots of both the C# Windows Forms implementation and the C/Cygwin port, plus two random screenshots of Claude Sonnet 4.6 and Gemini Pro 3.1 discussing the program:

https://imgur.com/a/SAQqKmm

Why is the text generated from a seed still far from perfect? Two reasons: the corpus is very small, and the C# version has <100% accuracy.

However, the big nice surprise: it seems like grammar and semantics are both learned. This, coupled with my idea of adding episodic long-term memory so that context outside the tiny 'ctx' window can easily be extended to thousands of tokens behind without a decrease in speed, could make it a practical program. Generation is also very fast.

Future work:

* BPE; right now it's just a word tokenizer...not good for code;
* Did I say "code"? It may actually be a total failure for coding...or maybe not: completely untested;
* The program actually has two versions; the other one deviates noticeably from this one and has C and even F# ports. However, the F# port just doesn't work...it always produces complete gibberish...a major bug;
* never tested on an actual neuromorphic CPU, just good ol' general-purpose Intel laptop ones;
* python port should be possible;
* finally, the big test: a large text corpus (megabytes) and accuracy over 95% <- the ultimate test.