Ask HN: Is everything data compression?

1 point by mnky9800n 11 days ago
After thinking about it for a while, I am wondering: is everything data compression? Perhaps trivially so?

Chaotic and non-chaotic systems alike can be described by how compressible they are: the less compressible, the more chaotic [1]. Here I define chaos explicitly as the regime of positive Lyapunov exponents. That creates an intrinsic link between dynamics and information. And, to put it metaphorically, information is what the universe is fundamentally made of.

That would also be why neural networks are so good, compared to anything else, at predicting chaotic things [2]: they are literally attempting to compress the space of data, and compressibility is a descriptor of chaos.

The reason we say that neural networks are unexplainable, or at least hard to explain, is that they don't directly show the solution. But if we switch our mental model to one where the universe can only be described by literally compressing it into something, then neural networks are the best estimator for any function [3].

Which implies that neural networks are the best we can get, so all you can do to understand anything is work towards more sophisticated architectures. Because there isn't anything that's a true data model, function approximations are all we will ever get [4]. And the reason they can predict further into the future than previously thought possible is that we simply used to have worse compression algorithms than we do now.

To me this all seems trivially true, or likely less than that, because there seem to be no implications from it beyond cursory observations; perhaps I am simply naive. So I am hoping someone can point me in the right direction to read or do more to understand these things. I have organized a conference to understand these ideas [5], and I think I should organize another one, but I'm also not sure how to engage with these ideas because most of my time is spent doing geophysics and data science and not this.

[1] https://www.sciencedirect.com/science/article/abs/pii/S0370157301000254

[2] https://link.aps.org/doi/10.1103/PhysRevResearch.5.043252

[3] https://www.sciencedirect.com/science/article/abs/pii/S089360809700097X

[4] https://projecteuclid.org/journals/statistical-science/volume-16/issue-3/Statistical-Modeling--The-Two-Cultures-with-comments-and-a/10.1214/ss/1009213726.full

[5] Conference talks on YouTube (not all the talks were recorded, and the conference was broader in topic than this post): https://youtube.com/playlist?list=PL6zSfYNSRHalAsgIjHHsttpYfxJ_XIPbt&si=VVWAE-fsv_WfFwfK
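
The claimed link between compressibility and positive Lyapunov exponents can be poked at numerically. What follows is only a minimal sketch under several assumptions that are not from the post or its references: the logistic map stands in for "a dynamical system", `zlib` stands in as a crude proxy for compressibility, and the parameter values `r = 3.2` (periodic) and `r = 3.9` (chaotic) are illustrative choices. The function names are hypothetical.

```python
import math
import zlib


def logistic_orbit(r, x0=0.2, n=20000, burn_in=1000):
    """Iterate x -> r*x*(1-x), discard a transient, and return the next n points."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    orbit = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit


def lyapunov_exponent(r, orbit):
    """Estimate the Lyapunov exponent as the orbit average of ln|f'(x)|,
    where f'(x) = r*(1 - 2x) for the logistic map."""
    total = 0.0
    for x in orbit:
        # Clamp to avoid log(0) if the orbit lands exactly on x = 0.5.
        total += math.log(max(abs(r * (1.0 - 2.0 * x)), 1e-300))
    return total / len(orbit)


def compression_ratio(orbit):
    """Turn the orbit into a bit string (1 if x > 0.5 else 0), pack 8 bits
    per byte, and return compressed size divided by raw size."""
    bits = [1 if x > 0.5 else 0 for x in orbit]
    raw = bytes(
        sum(b << k for k, b in enumerate(bits[i:i + 8]))
        for i in range(0, len(bits) - 7, 8)
    )
    return len(zlib.compress(raw, 9)) / len(raw)


if __name__ == "__main__":
    for r in (3.2, 3.9):  # assumed example values: periodic vs. chaotic
        orbit = logistic_orbit(r)
        print(
            f"r={r}: lyapunov={lyapunov_exponent(r, orbit):.3f}, "
            f"compression ratio={compression_ratio(orbit):.3f}"
        )
```

Under these assumptions, the periodic orbit should give a negative exponent and a symbol sequence that compresses to almost nothing, while the chaotic orbit should give a positive exponent and a sequence that `zlib` can barely shrink. A general-purpose compressor is a weak stand-in for the information-theoretic quantities discussed in [1], so treat this as an illustration of the direction of the effect rather than a measurement.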