Research tooling experience

1 | by hodltothestars | 2 months ago
I’ve been using wandb quite a bit and have looked into neptune ai and some open source alternatives, but I’ve always felt that collaboration and version control (e.g. associating code snapshots with training runs) are clunky. I was also thinking it’d be nice to have some kind of monitoring on my longer runs that alerts me when certain criteria are met, or even lets me stop or restart a run with hyperparameter modifications remotely (potentially taking agentic actions), like from my phone.

I’m curious what your experiences have been with these (and similar) AI developer platforms / observability layers, and what you’ve found lacking or what gripes you have with the existing solutions (if any). I’ve found the research process extremely painful and was wondering if it’s just me.