Show HN: SPL – a multi-language pipeline and your own mini-FaaS on a single machine
Hello, Hackers!
I'd like to share a pet project my teammate and I have been working on. The core idea is to build a multi-language computational graph that also lets you quickly deploy a mini-FaaS (Function as a Service) platform on your local machine. In other words, you can easily mix and match code from various sources (and even different third-party tools) using a local framework and server. We're calling this project SPL.
How did this idea come about?
While working on a complex model, we realized we needed to combine code and utilities written in different languages, pulled from earlier projects. We had separate DB queries, several fundamentally different methods of preprocessing large datasets, a two-stage training process, plus final evaluation and validation of the resulting model.
We considered well-known tools like Airflow, Dagster, and Prefect. However, they felt a bit heavy for some simpler scenarios, and they weren't ideal for rapid prototyping. Besides, part of our dataset required lower-level processing in C++ rather than standard Python. That's how the idea for a pet project arose: something that would let us seamlessly bring together code that otherwise wouldn't play nicely, and also let us share our work within the team. Essentially, we had a few key goals:
1. Build a connected computational graph made up of functions or utilities, regardless of their language or dependencies.
2. Support both local and remote execution of these graphs (so teams can share their work).
3. Make it possible to run only part of a graph, keeping the state and results of previous steps, to simplify testing new approaches.
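Goal 3 (re-running only part of a graph while keeping earlier results) is essentially step-level memoization. The post doesn't show SPL's actual mechanism, so the sketch below is only an illustration of the idea in plain Python: a step wrapper that fingerprints its inputs and skips recomputation when nothing changed.

```python
import hashlib
import json

class CachedStep:
    """A pipeline step that skips recomputation when its inputs are unchanged."""

    def __init__(self, fn):
        self.fn = fn
        self.cache = {}  # input fingerprint -> stored result

    def __call__(self, *args):
        # Fingerprint the inputs; JSON keeps the demo simple
        # (a real system would also hash the code and upstream artifacts).
        key = hashlib.sha256(json.dumps(args, sort_keys=True).encode()).hexdigest()
        if key not in self.cache:
            self.cache[key] = self.fn(*args)
        return self.cache[key]

calls = []

@CachedStep
def preprocess(rows):
    calls.append("preprocess")  # track how often the body actually runs
    return [r * 2 for r in rows]

preprocess([1, 2, 3])  # computed
preprocess([1, 2, 3])  # served from cache
print(len(calls))      # the function body ran only once
```

With steps cached this way, editing one node and re-running the graph only re-executes the nodes downstream of the change.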
Some implementation details
A computational graph is a directed, connected graph with nodes (functions or utilities) and links between them (inputs and outputs). Each node takes input parameters, performs a specific task, and sends the result along to the next node.
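That structure can be sketched in a few lines of plain Python (this is an illustration of the concept, not SPL's API): nodes are named functions, links are references to upstream nodes, and execution resolves each node's inputs before running it.

```python
class Node:
    """One unit of work: a function plus the names of the nodes feeding it."""
    def __init__(self, name, fn, inputs=()):
        self.name, self.fn, self.inputs = name, fn, tuple(inputs)

def run_graph(nodes):
    """Evaluate every node, resolving upstream dependencies recursively."""
    by_name = {n.name: n for n in nodes}
    results = {}

    def evaluate(name):
        if name not in results:
            node = by_name[name]
            args = [evaluate(dep) for dep in node.inputs]  # inputs first
            results[name] = node.fn(*args)
        return results[name]

    for n in nodes:
        evaluate(n.name)
    return results

graph = [
    Node("load", lambda: [3, 1, 2]),
    Node("sort", sorted, inputs=["load"]),
    Node("top", lambda xs: xs[-1], inputs=["sort"]),
]
print(run_graph(graph)["top"])  # 3
```

Because every intermediate result lands in `results`, partial inspection and reuse of upstream outputs falls out naturally.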
Our approach pairs a language-specific framework, for actually running the code, with a server that handles the FaaS side, orchestrates the nodes in the graph, and takes care of passing artifacts around correctly.
We chose to build the first version of the SPL framework for Python, since that's the language we use most. The end result will be a library that lets you intuitively create computational graphs right from a Python notebook.
To manage the graph itself (adding or removing nodes, saving results, and running only certain parts), we settled on a mechanic similar to PyTorch, where you build a model by sequentially adding layers to `nn.Sequential()`.
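For readers unfamiliar with that mechanic: `nn.Sequential` is a container whose children run in order, each feeding the next, and which can be extended incrementally. The dependency-free sketch below mimics it with plain functions; it shows the mechanic being borrowed, not SPL's builder itself.

```python
class Sequential:
    """A Sequential-style container: steps run in order, each output feeding
    the next step's input (the mechanic nn.Sequential uses for layers)."""
    def __init__(self, *steps):
        self.steps = list(steps)

    def add(self, step):
        # Graphs can grow incrementally, the way layers are appended in PyTorch.
        self.steps.append(step)
        return self

    def __call__(self, x):
        for step in self.steps:
            x = step(x)
        return x

pipeline = Sequential(
    lambda rows: [r for r in rows if r is not None],  # clean
    lambda rows: sorted(rows),                        # order
)
pipeline.add(lambda rows: rows[:2])                   # extend later
print(pipeline([5, None, 1, 4]))  # [1, 4]
```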
A pocket-sized FaaS
One of the coolest features, in our opinion, is that you can quickly spin up your own mini-FaaS right on your machine. If your computer has internet access, your functions and graphs become instantly available to other users.
Right now, the SPL server supports:
- An HTTP API for remotely executing functions.
- Import and export of graphs in JSON format.
- Task coordination and distributed result caching.
- A simple web interface for viewing and editing graphs.
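The post doesn't show SPL's wire format, but JSON import/export of a graph might look roughly like the round-trip below. Every field name here (`nodes`, `runtime`, `inputs`) is an illustrative assumption, not SPL's actual schema.

```python
import json

# Hypothetical graph description; the shape is an assumption for illustration.
graph = {
    "name": "train-eval",
    "nodes": [
        {"id": "load",  "runtime": "python", "inputs": []},
        {"id": "prep",  "runtime": "cpp",    "inputs": ["load"]},
        {"id": "train", "runtime": "python", "inputs": ["prep"]},
    ],
}

exported = json.dumps(graph, indent=2)  # what an export endpoint could return
restored = json.loads(exported)         # what an import endpoint could accept
print(restored == graph)                # round-trip preserves the structure
```

A declarative description like this is what makes the other listed features possible: the web UI can render it, and the server can schedule nodes from it without importing the user's code.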
Possible use cases for SPL:
1. Local development: Build graphs and functions you can reuse across different projects without constantly copying code.
2. Production usage: Keep business logic and infrastructure separate, with easy zero-downtime updates.
3. Personal FaaS (including a function marketplace): Potentially publish your work for others, including monetization, delivering only results instead of the entire codebase.
4. Visualizing business processes: The server supports graph rendering and displays input and output ports, which can be handy for high-level project management.
Why am I writing this post?
We'd really love to hear what you think:
- Have you faced similar challenges?
- Would such a tool be useful for you?
- What features would you like to see in a project like this?
We're excited to discuss these ideas in the comments!