HackerNews Chinese Edition

I’m a Linux ops engineer working in the DevOps/SRE space. Over the past few months, I’ve been working on a small *website monitoring* side project in my spare time: https://inostop.com/en/

Most of the monitoring and ops tools I’ve built before were used internally within companies. This is my first attempt to turn a relatively complete tool into something publicly usable.

In day-to-day operations, website monitoring usually involves:

- Infrastructure monitoring
- Application / API monitoring
- Partial CDN monitoring

These are often built on top of tools like Prometheus or Zabbix, combined with log systems (ELK / OpenObserve) and distributed tracing (OpenTelemetry). While powerful, this stack can feel *heavyweight and overkill* when you just want to quickly monitor a website’s availability.

That led me to experiment with a simpler approach:

- Non-intrusive (no code changes or sidecars required)
- Out-of-band probing to estimate website availability
- Conservative thresholds to reduce false alarms

So far, the project covers:

- Domain and TLS certificate monitoring, plus Ping and Telnet checks
- Basic alert thresholds and multi-stage alert silencing to reduce alert fatigue

There are still open challenges:

- There’s still room to improve the UX of the website monitoring results (the backend is written in Go).
- AI currently works only as an analysis layer on collected data, rather than actively performing real network probes.

This project is still evolving (I’ve rewritten parts of it more times than I’d like to admit).

If you’d like to try it out, there’s an early access code *95f40841e4888668c4d5f7e88506075d*, valid for 1 month, mainly for collecting early feedback.

I’d love to hear feedback from the community:

- Does a lightweight, non-intrusive website monitoring approach make sense in practice?
- Are there better patterns or architectures worth exploring?
- If you’re a QA or test engineer, I’d love to hear your thoughts.
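To illustrate the out-of-band probing and conservative-threshold ideas described in the post, here is a minimal Go sketch. It is not the project's actual code: the names `FailureGate` and `CertDaysLeft`, the threshold of 3 consecutive failures, and the `example.com` target are all assumptions made up for this example.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
	"time"
)

// FailureGate is a hypothetical conservative threshold: it signals an alert
// only when the number of consecutive failed probes reaches Threshold, and
// it resets on any success. It fires once per outage, not on every failure.
type FailureGate struct {
	Threshold   int // assumed value for illustration, e.g. 3
	consecutive int
}

// Observe records one probe result and reports whether an alert should fire.
func (g *FailureGate) Observe(ok bool) bool {
	if ok {
		g.consecutive = 0
		return false
	}
	g.consecutive++
	return g.consecutive == g.Threshold
}

// CertDaysLeft connects to host:443 from the outside (no agent or code
// change on the target) and returns the whole days remaining before the
// leaf TLS certificate expires.
func CertDaysLeft(host string, timeout time.Duration) (int, error) {
	dialer := &net.Dialer{Timeout: timeout}
	addr := net.JoinHostPort(host, "443")
	conn, err := tls.DialWithDialer(dialer, "tcp", addr, &tls.Config{ServerName: host})
	if err != nil {
		return 0, err
	}
	defer conn.Close()

	certs := conn.ConnectionState().PeerCertificates
	if len(certs) == 0 {
		return 0, fmt.Errorf("no peer certificates from %s", host)
	}
	return int(time.Until(certs[0].NotAfter).Hours() / 24), nil
}

func main() {
	gate := &FailureGate{Threshold: 3} // assumed threshold, not from the project

	// A single probe; a real checker would repeat this on a timer.
	days, err := CertDaysLeft("example.com", 5*time.Second)
	if err != nil {
		fmt.Println("probe failed:", err)
	} else {
		fmt.Println("certificate days left:", days)
	}
	if gate.Observe(err == nil) {
		fmt.Println("ALERT: consecutive failure threshold reached")
	}
}
```

Requiring several consecutive failures trades slower detection for fewer false alarms from a single dropped packet or brief network blip, which is one way to read "conservative thresholds" here.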

A lightweight, non-intrusive approach to website monitoring (ops perspective)