Search results for DiffusionModels

Tweets including DiffusionModels

cv usk @cv_usk

33 minutes ago

🖼 Test-time scaling for image editing tends to hand every edit the same compute budget, wasting a lot of it. By allocating budget by difficulty and pruning with edit-specific verification, this work hits up to 2.2x speedup while preserving quality.

Title: From Scale to Speed: Adaptive Test-Time Scaling for Image Editing
URL:

📝 Overview
ADE-CoT is a test-time scaling method tailored to goal-directed image editing. Instead of reusing Image-CoT methods built for text-to-image generation, it combines three strategies (difficulty-aware allocation, edit-specific early verification, and opportunistic stopping) to cut compute substantially while preserving quality.

❓ Challenges Solved
Prior methods had three mismatches.
・Fixed sampling budgets waste compute on easy edits that barely improve
・General MLLM scores wrongly prune about 40% of samples that start low but ultimately score high
・Large-scale sampling produces redundant identical correct outputs, adding needless compute

💡 Methodology & Proposed Approach
・It reads edit difficulty, giving easy edits a minimal budget and expanding the search for hard ones
・A one-step preview estimates clean latents from noisy intermediates without extra denoising, making early verification reliable
・Grounded SAM2 checks that only the intended region changed, and DINOv2 embeddings remove redundant candidates
・It generates candidates sequentially and stops, via depth-first opportunistic stopping, once enough intent-aligned results are found

🎯 Use Cases
It fits complex pose changes, multi-object removal or replacement, fine-grained regional edits, multi-turn editing, and high-quality editing under compute constraints, and is especially valuable where inference cost matters, like a production image-editing API.

📊 Experimental Results
・On GEdit-Bench, FLUX.1 Kontext is 2.2x, BAGEL 1.8x, and Step1X-Edit 2.0x faster than Best-of-N
・Reasoning efficiency more than doubles on a fixed 32-sample budget, and outcome efficiency rises 4.9x, 2.7x, and 2.9x across three benchmarks
・On hard multi-object edits like "remove the person standing next to the lady in white," it fixes the baseline's misidentification

#ImageEditing# #DiffusionModels#
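
To make the "budget by difficulty, stop once enough good results are found" idea concrete, here is a minimal Python sketch. It is not the paper's implementation: the linear budget rule, the function names `allocate_budget` and `edit_with_early_stop`, the `needed` and `threshold` parameters, and the `generate` / `verify` callables are all hypothetical stand-ins for the editing model and the edit-specific verifier.

```python
from typing import Callable, List, Tuple


def allocate_budget(difficulty: float, min_budget: int = 2, max_budget: int = 32) -> int:
    """Map a difficulty score in [0, 1] to a sample budget.

    Assumption: a simple linear rule. The post only says easy edits get a
    minimal budget and hard edits get a wider search.
    """
    d = min(max(difficulty, 0.0), 1.0)  # clamp to [0, 1]
    return min_budget + round(d * (max_budget - min_budget))


def edit_with_early_stop(
    generate: Callable[[int], object],   # hypothetical: produces candidate i
    verify: Callable[[object], float],   # hypothetical: score in [0, 1]
    difficulty: float,
    needed: int = 2,
    threshold: float = 0.8,
) -> Tuple[List[object], int]:
    """Generate candidates one at a time; stop once `needed` pass verification."""
    budget = allocate_budget(difficulty)
    accepted: List[object] = []
    used = 0
    for i in range(budget):
        candidate = generate(i)
        used += 1
        if verify(candidate) >= threshold:
            accepted.append(candidate)
            if len(accepted) >= needed:
                break  # opportunistic stop: the remaining budget is never spent
    return accepted, used


if __name__ == "__main__":
    # Toy stand-ins: candidates are integers, and even ones "pass" verification.
    results, samples_used = edit_with_early_stop(
        generate=lambda i: i,
        verify=lambda c: 1.0 if c % 2 == 0 else 0.0,
        difficulty=1.0,
    )
    print(results, samples_used)
```

The saving comes from two places in this sketch: easy edits start with a small budget, and any edit stops as soon as enough candidates clear the verifier, so the worst-case budget is only spent when the verifier keeps rejecting.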

cv usk @cv_usk

2026.06.13 04:29

🎨 The reason diffusion models lose quality during generation turns out to be an "SNR vs timestep" mismatch. A training-free correction cuts FID by up to 47%.

Title: Elucidating the SNR-t Bias of Diffusion Probabilistic Models
URL:

📝 Overview
During training, a diffusion model's signal-to-noise ratio (SNR) is deterministically tied to the timestep. During generation, accumulated errors break that coupling, so a sample's SNR no longer matches its assigned timestep, an "SNR-t bias." This paper elucidates the mechanism and proposes a correction called DCW.

❓ Challenges Solved
Reverse-denoising samples consistently have lower SNR than forward samples at the same timestep. As a result, the network systematically overestimates its outputs, degrading generation quality.

💡 Methodology & Proposed Approach
・At each denoising step it applies a differential correction using the difference between the predicted and reconstructed samples
・It works in the wavelet domain, leveraging how diffusion models reconstruct low frequencies first and high-frequency detail later
・A discrete wavelet transform splits samples into frequency subbands, with low-frequency weights decaying over time and high-frequency weights increasing
・It is training-free and plug-and-play, working with IDDPM, EDM, DDIM, FLUX, and many others

🎯 Use Cases
It can boost the quality of existing pretrained diffusion models after the fact, ideal when you want high-quality generation with few sampling steps.

📊 Experimental Results
・On IDDPM (CIFAR-10) it cuts 20-step FID by 42.6%
・On EDM (CIFAR-10) it reduces FID by 47.1% / 47.4% / 36.4% at 13/21/35 NFE
・It adds further gains even on top of SOTA bias-correction methods (A-DPM-FR improves from 12.38 to 10.91 FID at 10 steps)
・Compute overhead is tiny: ~0.47% on CelebA and 0.08% on ImageNet

#DiffusionModels# #GenerativeAI#
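
A rough Python sketch of what a frequency-weighted correction in the wavelet domain could look like. This is an illustration only, not the paper's DCW formula: the one-level Haar transform, the linear weight schedule, the sign of the correction, and the names `haar_dwt2`, `haar_idwt2`, `wavelet_correction`, `progress`, and `lam` are all assumptions made for the example.

```python
import numpy as np


def haar_dwt2(x: np.ndarray):
    """One-level orthonormal 2D Haar transform of an even-sized 2D array.

    Returns the low-frequency band and a tuple of three detail bands.
    """
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    low = (a + b + c + d) / 2.0
    details = ((a - b + c - d) / 2.0, (a + b - c - d) / 2.0, (a - b - c + d) / 2.0)
    return low, details


def haar_idwt2(low: np.ndarray, details) -> np.ndarray:
    """Exact inverse of haar_dwt2."""
    d1, d2, d3 = details
    out = np.empty((low.shape[0] * 2, low.shape[1] * 2), dtype=float)
    out[0::2, 0::2] = (low + d1 + d2 + d3) / 2.0
    out[0::2, 1::2] = (low - d1 + d2 - d3) / 2.0
    out[1::2, 0::2] = (low + d1 - d2 - d3) / 2.0
    out[1::2, 1::2] = (low - d1 - d2 + d3) / 2.0
    return out


def wavelet_correction(x, x_pred, x_recon, progress: float, lam: float = 0.1):
    """Add a frequency-weighted multiple of (x_pred - x_recon) to x.

    progress: 0.0 at the first denoising step, 1.0 at the last (assumed).
    Assumed schedule: low-frequency weight decays linearly with progress,
    high-frequency weight grows linearly.
    """
    w_low = lam * (1.0 - progress)
    w_high = lam * progress
    low, details = haar_dwt2(np.asarray(x_pred, dtype=float) - np.asarray(x_recon, dtype=float))
    correction = haar_idwt2(w_low * low, tuple(w_high * band for band in details))
    return np.asarray(x, dtype=float) + correction


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 8))
    x_pred = rng.normal(size=(8, 8))
    x_recon = rng.normal(size=(8, 8))
    early = wavelet_correction(x, x_pred, x_recon, progress=0.0)
    late = wavelet_correction(x, x_pred, x_recon, progress=1.0)
    print(np.abs(early - x).mean(), np.abs(late - x).mean())
```

Because the transform and the weighting are only a handful of elementwise operations per step, a correction of this shape adds little compute next to a network forward pass, which is consistent with the small overhead the post reports.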
