HackerNews Chinese Edition

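Tenmo: a lightweight tensor library and neural network framework written in pure Mojo.

https://github.com/ratulb/tenmo

Tenmo focuses on:

- SIMD optimization
- Explicit memory layout
- Zero-copy views
- A minimal but practical autograd system

Status: Tenmo evolves alongside Mojo itself. APIs may change. Not production-ready yet.

Performance

MNIST (4-layer MLP, 105K params, 15 epochs)

| Platform | Device         | Avg Epoch | Total | Test Acc |
|----------|----------------|-----------|-------|----------|
| Tenmo    | CPU (Mojo)     | 11.4s     | 171s  | 97.44%   |
| PyTorch  | CPU            | 14.5s     | 218s  | 98.26%   |
| PyTorch  | GPU (Tesla T4) | 15.2s     | 227s  | 97.87%   |

Notes

- Tenmo uses SIMD-vectorized kernels on contiguous buffers.
- No BLAS was used in the MNIST run; everything executes as pure Mojo code.
- GPU overhead dominates for models of this size; larger models benefit more from GPU acceleration.

Quick Example

```mojo
from testing import assert_true
from tenmo import Tensor

fn main() raises:
    var a = Tensor.d1([1.0, 2.0, 3.0], requires_grad=True)
    var b = a * 2
    var c = a * 3
    var d = b + c  # d = 2a + 3a = 5a
    d.backward()
    # Each element of a contributes to d with slope 5
    assert_true(a.grad().all_close(Tensor.d1([5.0, 5.0, 5.0])))
```

Feedback highly appreciated!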
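For readers who want to check why the Quick Example expects a gradient of 5 for every element, here is a minimal reverse-mode autograd sketch. It is plain Python rather than Mojo and does not use Tenmo's API: the `Value` class and its methods are hypothetical names made up for illustration, and seeding `backward()` with ones for a non-scalar output is an assumption about what the example implies, not a statement about how Tenmo works internally.

```python
class Value:
    """Hypothetical 1-D tensor node for illustration; not Tenmo's API."""

    def __init__(self, data, parents=()):
        self.data = [float(x) for x in data]
        self.grad = [0.0] * len(self.data)
        # Each entry is (parent_node, local_derivative) for the op that made this node.
        self.parents = parents

    def __mul__(self, k):
        # Scalar multiply: d(out_i)/d(self_i) = k
        return Value([x * k for x in self.data], ((self, float(k)),))

    def __add__(self, other):
        # Elementwise add: the local derivative is 1 for both operands
        summed = [x + y for x, y in zip(self.data, other.data)]
        return Value(summed, ((self, 1.0), (other, 1.0)))

    def backward(self):
        # Topologically order the graph so a node's gradient is fully
        # accumulated before it is propagated to its parents.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        # Assumption: seed the output gradient with ones.
        self.grad = [1.0] * len(self.data)
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad = [g + local * up
                               for g, up in zip(parent.grad, node.grad)]


a = Value([1.0, 2.0, 3.0])
b = a * 2
c = a * 3
d = b + c
d.backward()
print(a.grad)  # [5.0, 5.0, 5.0]
```

Because `a` feeds both `b` and `c`, its gradient is the sum of the two paths (2 + 3), which is why the accumulation step uses `+` rather than overwriting.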



Ask HN: Built a tensor and neural network framework entirely in Mojo. Any feedback?