Missing: a dataset for clean-topology 3D mesh generation?
We need a Blender bpy.ops dataset for autoregressive modeling and mesh generation; as it stands, the field is a dead end being eaten alive by publish-or-perish.

Everyone knows neural mesh generation is a bit of a useless toy, but in 2022 Autodesk Research published SkexGen (ICML 2022), an autoregressive model that generates CAD construction sequences. You sketch a 2D profile, extrude it, boolean it, and each step is a valid CAD operation. SCAD, Adam, and the like are already extrapolating that to LLMs, and it's fine: if they were actually specialized, they'd be quite good, and the remaining gap is that specialization plus an LLM vision problem.

And it works because CAD files natively store their construction history: the construction sequences are always saved, and the DeepCAD dataset gave them thousands of these sequences for free. Where is the equivalent for Blender?

The research community has instead pursued direct mesh generation, tokenizing vertices and faces into sequences (PolyGen, MeshAnything, MeshXL, MESHTRON, etc.) or reversing mesh simplification (ARMesh). These approaches are getting better, but they're fundamentally fighting the representation: a sequence of vertex coordinates doesn't encode why an edge loop is there, but bpy.ops does. We should make a dataset. Blender already logs every bpy.ops call to its Info panel. A recording addon could capture the full bpy.ops call with all parameters, the selection state at each step (which vertices/edges/faces were selected), a lightweight mesh snapshot at each step (or at key intervals), and the final mesh as the label.

Yes, there are challenges around noise, context, and scale, but...? The autoregressive method is already proven. The gap is entirely in data collection infrastructure. Why not?
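To make the "what would the addon actually record" part concrete, here is a minimal sketch of a per-step record format. All field and class names (`OpStep`, `Recording`, etc.) are hypothetical, not an existing Blender or dataset schema; the example step mirrors how an extrude shows up in Blender's Info panel as `mesh.extrude_region_move` with nested operator properties.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Optional

@dataclass
class OpStep:
    """One recorded modeling step (hypothetical schema)."""
    op: str                  # bpy.ops idname, e.g. "mesh.extrude_region_move"
    params: dict             # operator properties, flattened to plain JSON values
    selected_verts: list     # selection state *before* the op ran
    selected_edges: list
    selected_faces: list
    snapshot: Optional[dict] = None  # lightweight mesh snapshot at key intervals

@dataclass
class Recording:
    steps: list                         # ordered list of OpStep
    final_mesh: dict                    # the label: final vertex/face arrays

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses, so the whole
        # recording serializes to one JSON document per modeling session
        return json.dumps(asdict(self))

# Example: a single extrude-and-translate step on a selected quad
step = OpStep(
    op="mesh.extrude_region_move",
    params={"TRANSFORM_OT_translate": {"value": [0.0, 0.0, 1.0]}},
    selected_verts=[0, 1, 2, 3],
    selected_edges=[0, 1, 2, 3],
    selected_faces=[0],
)
rec = Recording(steps=[step], final_mesh={"verts": [], "faces": []})
print(json.loads(rec.to_json())["steps"][0]["op"])
```

The design choice that matters is keeping selection state alongside each op: a bpy.ops call like an extrude is meaningless without knowing which faces it applied to, which is exactly the context a raw vertex-sequence tokenization throws away.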