Launch HN: Nao Labs (YC X25) – Cursor for Data
Hey HN, we’re Claire and Christophe from nao Labs (<a href="https://getnao.io/">https://getnao.io/</a>). We just launched nao, an AI code editor for working with data: a local editor, directly connected to your data warehouse and powered by an AI copilot with built-in context of your data schema and data-specific tools.<p>See our demo here: <a href="https://www.youtube.com/watch?v=QmG6X-5ftZU" rel="nofollow">https://www.youtube.com/watch?v=QmG6X-5ftZU</a><p>Writing code with LLMs is the new normal in software engineering, but not when it comes to manipulating data. Tools like Cursor don’t interact natively with data warehouses: they autocomplete SQL blindly, without knowing your data schema. Most of us still juggle multiple tools: writing code in Cursor, checking results in the warehouse console, troubleshooting with an observability tool, and verifying in the BI tool that no dashboard broke.<p>When you write code on data with LLMs, you don’t care much about the code itself; you care about the data output. You need a tool that helps you write code relevant to your data, lets you visualize its impact on the output, and quality-checks it for you.<p>Christophe and I have each spent 10 years in data: Christophe was a data engineer who built data platforms for dozens of orgs, and I was a head of data who helped data teams build their analytics and data products. We’ve seen how the business asks you to ship data fast while you sit there wondering whether one small line of code will mistakenly multiply the revenue on your CEO’s dashboard by 5. That leaves you two choices: test extensively and ship slow, or skip the tests and ship fast. That’s why we wanted to create nao: a tool truly adapted to data work, one that lets data teams ship at the pace of the business.<p>nao is a fork of VS Code with built-in connectors for BigQuery, Snowflake, and Postgres. We built our own AI copilot and tab system and gave them a RAG over your data warehouse schemas and your codebase.
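nao’s actual RAG isn’t public, so as a rough sketch of what “a RAG of your data warehouse schemas” means in practice, here is a toy version in Python: it reads table definitions out of the database catalog and retrieves the ones relevant to a prompt, so they can be prepended as context for the model. sqlite3 stands in for the warehouse, and all table and column names are invented for illustration:

```python
import sqlite3

# Toy warehouse: sqlite3 stands in for BigQuery/Snowflake/Postgres.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, revenue REAL, created_at TEXT);
CREATE TABLE customers (customer_id INTEGER, country TEXT, signup_date TEXT);
""")

def schema_docs(conn):
    """One text 'document' per table: its name and typed columns."""
    docs = {}
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for t in tables:
        cols = conn.execute(f"PRAGMA table_info({t})").fetchall()
        docs[t] = f"table {t} (" + ", ".join(f"{c[1]} {c[2]}" for c in cols) + ")"
    return docs

def relevant_schemas(conn, prompt, k=1):
    """Naive retrieval: rank tables by word overlap with the prompt."""
    words = set(prompt.lower().replace(",", " ").split())
    docs = schema_docs(conn)
    scored = sorted(docs.items(),
                    key=lambda kv: -len(words & set(kv[1].lower().split())))
    return [doc for _, doc in scored[:k]]

# Context you would prepend to the LLM prompt, instead of letting
# the model autocomplete SQL blindly.
context = relevant_schemas(conn, "total revenue per customer_id from orders")
print(context[0])
```

A real implementation would use embeddings rather than word overlap, but the effect is the same: the model sees the actual schema instead of guessing column names.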
We added a set of agent tools to query data, compare data, understand data tools like dbt, and assess the downstream impact of a change across your whole data lineage.<p>The AI tab and the AI agent immediately write code that matches your schema, whether it’s SQL, Python, or YAML. nao shows you code diffs and data diffs side by side, so you can visualize what your change did to the data output. And you can leave the data quality checks to the agent: detect missing or duplicated values and outliers, anticipate breaking changes downstream, or compare dev and production data.<p>Data teams usually use nao to write SQL pipelines, often with dbt. It helps them create data models, document them, and test them, while making sure they don’t break data lineage or the figures in the BI tool. In run mode, they also use it to do some analytics and to identify data quality bugs in production. For less technical profiles, it’s also a great help in strengthening code best practices. For large teams, it ensures the code and metrics stay well factored and consistent.<p>Software engineers use nao for database exploration: writing SQL queries with the nao tab, exploring the data schema with the agent, and writing DDL.<p>A question we often get is: why not just use Cursor with MCPs? Cursor has to trigger many MCP calls to get the full context of your data, while nao always has that context available in one RAG. MCPs also stay confined to a small part of Cursor: they don’t bring data context into the tab, and they don’t make the UI any more adapted to data workflows. Besides, nao comes pre-packaged for data teams: no setting up extensions, no installing and authenticating MCPs, no building CI/CD pipelines. That means even non-technical data teams can have a great developer experience.<p>Our long-term goal is to become the best place to work with data. We want to fine-tune our own models for SQL, Python, and YAML to give the most relevant code suggestions for data.
We want to broaden our understanding of every data stack tool, to become the one agnostic editor for any of your data workflows.<p>You can try it here: <a href="https://sunshine.getnao.io/releases/">https://sunshine.getnao.io/releases/</a> - download nao, sign up for free, and start using it. Just for the HN launch, you can create a temporary account with a simple username if you’d prefer not to use your email. For now we only have a Mac version, but Linux and Windows are coming.<p>We’d love to hear your feedback, and to get your thoughts on how we can further improve the data dev experience!
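As a postscript for the curious: the quality checks described above (missing values, duplicate keys, outliers) are conceptually simple. Here is a hedged sketch in Python of what such checks compute; this is not nao’s actual implementation, sqlite3 stands in for the warehouse, and the table and column names are invented:

```python
import sqlite3
import statistics

# Stand-in for a warehouse table (names are invented for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, revenue REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 10.0), (2, 12.0), (2, 12.0),  # duplicate order_id
                  (3, None),                        # missing revenue
                  (4, 11.0), (5, 900.0)])           # 900.0 is an outlier

def quality_report(conn, table, key, value):
    rows = conn.execute(f"SELECT {key}, {value} FROM {table}").fetchall()
    keys = [r[0] for r in rows]
    vals = [r[1] for r in rows if r[1] is not None]

    report = {
        "rows": len(rows),
        "missing_values": sum(1 for r in rows if r[1] is None),
        "duplicate_keys": len(keys) - len(set(keys)),
    }
    # Outliers via median absolute deviation (robust on small samples;
    # note a MAD of 0 would flag every non-median value).
    med = statistics.median(vals)
    mad = statistics.median(abs(v - med) for v in vals)
    report["outliers"] = [v for v in vals if abs(v - med) > 3 * mad]
    return report

print(quality_report(conn, "orders", "order_id", "revenue"))
```

The dbt equivalent of the first two checks would be `not_null` and `unique` tests in a model’s YAML; the point of putting them behind an agent is that you don’t have to remember to write them.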