Postgres上的唯一索引失败 - 是我的错
我最近在PostgreSQL上遇到了一个独特的索引故障。为了澄清,这并不是Postgres的问题,而是我的错误。
幸运的是,这个问题出现在测试环境而不是生产环境,但我认为其他PostgreSQL用户可能会对此感兴趣。
在写入一组数据和下一组依赖于ON CONFLICT来更新/插入行之间,我切换了Docker镜像,以便使用一些内置工具。
最初的镜像是基于alpine/musl的,而后来的镜像是基于Debian/Glib的。
据我所知,字节排序是不同的,索引操作没有找到应该存在的一组字节。
如果我进行了转储/恢复,这就不会成为问题,但我只是直接在现有存储卷上切换了镜像。哎呀。
表面上看,索引是活动的,并处于强制状态。
然而,REINDEX失败了,因为存在重复项!
对我来说,我修复了重复项,进行了REINDEX,然后进行了转储恢复。
但值得注意的是,考虑到许多备份机制依赖于存储卷快照,全局重建索引在恢复后会捕捉到这些类型的问题,如果这是任何人的潜在问题。
我在此邀请即将到来的批评浪潮,我会选择忽略,纯粹是为了让其他人能够找到这个信息,可能会和我做同样的事情。
查看原文
I recently had a unique index failure on PostgreSQL. To be clear it wasn't Postgres' fault - it was mine.<p>Thankfully it reared it's head on Staging not Live, but here's something other PostgreSQL users might find interesting.<p>In between writing one set of data, and the next which relies on ON CONFLICT to update/insert rows, thus fully relying on the unique index holding the integrity, I switched docker image, to get some tooling built in.<p>The original was alpine/musl based. The subsequent was Debian/Glib based.<p>As far as I can tell the byte collation is different and the index operations just didn't find one set of bytes where it ought to be.<p>If I had dumped/restored - it wouldn't have been a problem, but I just flipped the image on the existing storage volume. oops.<p>On the surface, the index was active, and in an enforcing state.<p>a REINDEX failed however, there were dupes!<p>For me, I fixed the dupes, REINDEXed, then dump-restored.<p>But worth bearing in mind given how many backup mechanisms rely on storage volume snapshots, and a global reindex after restore will catch these types of issues if that's a potential for anyone.<p>I hereby invite the forthcoming tsunami of criticism, that I will ignore,
purely so i can post this for others to find interest, and who might have done the same thing as me.