Who discovered "grokking", and why is the name so hard to find?

2 | by asmodeuslucifer | 3 months ago
Apologies if this is old news to everyone, but perhaps the hive mind knows the answer. I was watching a YouTube video, "The most complex model we actually understand" by Welch Labs, and heard the story about a researcher who left a model training while going on vacation; the model then learned to generalize after thousands of training steps. But when I try to look up the discoverer's name, I find it has not been made public, which seems a shabby way to treat someone. What's the real story?