


Hey HN, I’m building an app where users upload “real life” clothing photos (e.g. a wrinkly shirt folded on the floor). The goal is to transform that single photo into a clean, ecommerce-style image of the garment.

One key UX requirement: the output needs to be a PNG with transparency (alpha) so we can consistently crop/composite the garment into an on-rails UI (cards, outfit layouts, etc.). Think “subject cutout that drops cleanly into templates.”

My current pipeline looks like:

1. User-uploaded photo (messy background, weird angles)
2. The upload is matched to a “query” image (style target) + prompt (currently using Nano Banana)
3. Background removal model to get transparency and save as RGBA PNG

This works, but it feels hacky and occasionally introduces edge artifacts. Also, the generation model sometimes invents shadows/background cues that confuse the background removal step. It feels like the two steps are fighting one another.

I’m trying to understand what “good” looks like in production for this kind of workflow:

Are people still doing gen/edit → separate background removal as the standard?

Are any of you using alpha-native generation (RGBA outputs) in production? If so, what’s the stack/workflow?

If you’ve done “messy UGC photo → catalog asset” specifically: what broke most often and what fixed it?

I’m not looking for vendor pitches, mostly practical patterns people are using (open source workflows, model classes, ComfyUI/SD pipelines, API-based stacks, etc.). Happy to share more details if helpful.
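For concreteness, here is a rough sketch of the part after background removal: cropping the cutout to its alpha bounds and placing it in a fixed template slot, plus a crude check for leftover halo or shadow. This is only an illustration, not the app's actual code. The function names, the thresholds (8 and 247), and the representation of the alpha channel as nested lists of 0-255 values are all assumptions made to keep the example dependency-free.

```python
from typing import List, Optional, Tuple

# (left, top, right, bottom); right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def alpha_bbox(alpha: List[List[int]], threshold: int = 8) -> Optional[Box]:
    """Bounding box of pixels whose alpha exceeds `threshold`.

    A small non-zero threshold (an assumed value) ignores faint residue
    so a stray shadow pixel does not inflate the crop.
    """
    rows = [y for y, row in enumerate(alpha) if any(a > threshold for a in row)]
    if not rows:
        return None  # fully transparent image
    cols = [x for row in alpha for x, a in enumerate(row) if a > threshold]
    return (min(cols), rows[0], max(cols) + 1, rows[-1] + 1)


def fit_into_slot(
    box: Box, slot_w: int, slot_h: int, padding: int = 0
) -> Tuple[float, int, int]:
    """Uniform scale and top-left offset that centers `box` in the slot."""
    w, h = box[2] - box[0], box[3] - box[1]
    scale = min((slot_w - 2 * padding) / w, (slot_h - 2 * padding) / h)
    off_x = round((slot_w - w * scale) / 2)
    off_y = round((slot_h - h * scale) / 2)
    return scale, off_x, off_y


def soft_edge_ratio(alpha: List[List[int]], lo: int = 8, hi: int = 247) -> float:
    """Fraction of visible pixels that are semi-transparent.

    A high value can hint at halos or invented shadows surviving the
    background removal step; the cutoff to act on is left to the caller.
    """
    visible = [a for row in alpha for a in row if a > lo]
    if not visible:
        return 0.0
    return sum(1 for a in visible if a < hi) / len(visible)


if __name__ == "__main__":
    # Tiny hand-made alpha channel: a 2x3 garment with one soft pixel
    # and one faint residue pixel (alpha 4) to the right of it.
    alpha = [
        [0, 0, 0, 0, 0, 0],
        [0, 0, 255, 255, 4, 0],
        [0, 0, 255, 255, 0, 0],
        [0, 0, 128, 255, 0, 0],
        [0, 0, 0, 0, 0, 0],
    ]
    box = alpha_bbox(alpha)  # (2, 1, 4, 4): the alpha-4 pixel is ignored
    print(box, fit_into_slot(box, 100, 100, padding=10), soft_edge_ratio(alpha))
```

Normalizing every asset this way (same padding, same centering rule) is what keeps the cards consistent regardless of how the cutout was produced. In a real pipeline the alpha values would come from the image library in use rather than from nested lists.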

Ask HN: Is anyone using image models for production-grade image editing? How?