Diffusion sampling/samplers
Last updated
There are many sampling methods available in AUTOMATIC1111.
What are samplers?
How do they work?
What is the difference between them?
Which one should you use?
To produce an image, Stable Diffusion first generates a completely random image in the latent space. The noise predictor then estimates the noise in the image, and the predicted noise is subtracted from the image. This process is repeated a dozen or so times. In the end, we get a clean image.
This denoising process is called sampling because Stable Diffusion generates a new sample image in each step. The method used in sampling is called the sampler or sampling method.
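The loop described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the "image" is a short list of numbers, and `toy_noise_predictor` is a hypothetical stand-in that knows the clean image, whereas the real noise predictor is a trained neural network. The point is the shape of the loop (predict noise, subtract part of it, repeat), not the actual Stable Diffusion implementation.

```python
import random

def toy_noise_predictor(x, clean):
    # Hypothetical stand-in for the trained noise predictor.
    # It "cheats" by knowing the clean image, so the predicted noise is exact.
    return [xi - ci for xi, ci in zip(x, clean)]

def sample(clean, steps=12, seed=0):
    rng = random.Random(seed)
    # Start from a completely random "image".
    x = [rng.gauss(0.0, 1.0) for _ in clean]
    for i in range(steps):
        predicted_noise = toy_noise_predictor(x, clean)
        # Remove an equal share of the original noise at every step,
        # so the noise level falls linearly and reaches zero at the last step.
        fraction = 1.0 / (steps - i)
        x = [xi - fraction * ni for xi, ni in zip(x, predicted_noise)]
    return x

clean_image = [0.2, 0.5, 0.9]
result = sample(clean_image)
```

After the last step, `result` matches `clean_image` up to floating-point rounding, because the toy predictor is exact. A real predictor is only approximate, which is why the choice of sampler matters.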
Below is a sampling process in action. The sampler gradually produces cleaner and cleaner images.
While the framework is the same, there are many ways to carry out this denoising process. It is often a trade-off between speed and accuracy.
The noisy image gradually turns into a clear one in the picture above. The noise schedule controls the noise level at each sampling step. The noise is highest at the first step and gradually reduces to zero at the last step.
At each step, the sampler's job is to produce an image with a noise level matching the noise schedule.
What's the effect of increasing the number of sampling steps? A smaller noise reduction between each step. This helps to reduce the truncation error of the sampling.
Compare the noise schedules of 15 steps and 30 steps below.
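The comparison can also be made numerically. The sketch below assumes a simple linear noise schedule that runs from 1 down to 0; the schedules used by real samplers have different shapes, and the function names are made up for illustration. It only shows that doubling the number of steps halves the noise reduction per step.

```python
def linear_noise_schedule(steps, sigma_max=1.0):
    # Noise level before each step, plus the final level (zero).
    # Assumption: a linear ramp, chosen only to keep the example simple.
    return [sigma_max * (1 - i / steps) for i in range(steps + 1)]

def largest_drop(schedule):
    # Largest noise reduction between two consecutive steps.
    return max(a - b for a, b in zip(schedule, schedule[1:]))

schedule_15 = linear_noise_schedule(15)
schedule_30 = linear_noise_schedule(30)

print(largest_drop(schedule_15))  # about 1/15
print(largest_drop(schedule_30))  # about 1/30
```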
What are the differences between them?
Euler: The simplest possible solver.
Heun: A more accurate but slower version of Euler.
LMS (Linear multi-step method): Same speed as Euler but (supposedly) more accurate.
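To see why a second-order solver is more accurate but slower, it helps to try both on a toy problem. The sketch below solves the ordinary differential equation dx/dt = -x (an assumption chosen because its exact solution, exp(-t), is known) with the Euler method and with Heun's method. It is not the sampler code itself, only the underlying numerical idea: Heun evaluates the function twice per step, so each step costs about twice as much.

```python
import math

def f(t, x):
    # Toy ODE dx/dt = -x, whose exact solution is x(t) = x(0) * exp(-t).
    return -x

def euler(x, t_end, steps):
    h = t_end / steps
    t = 0.0
    for _ in range(steps):
        x = x + h * f(t, x)  # one function evaluation per step
        t += h
    return x

def heun(x, t_end, steps):
    h = t_end / steps
    t = 0.0
    for _ in range(steps):
        k1 = f(t, x)                 # slope at the start of the step
        k2 = f(t + h, x + h * k1)    # slope at the Euler-predicted end point
        x = x + 0.5 * h * (k1 + k2)  # average the two slopes
        t += h
    return x

exact = math.exp(-1.0)
euler_error = abs(euler(1.0, 1.0, 10) - exact)
heun_error = abs(heun(1.0, 1.0, 10) - exact)
```

With the same 10 steps, `heun_error` is much smaller than `euler_error`, at the cost of twice as many function evaluations. In a diffusion sampler, each function evaluation is a call to the noise predictor.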
An ancestral sampler, marked with an "a" in its name (for example, Euler a), adds noise to the image at each sampling step. Ancestral samplers are stochastic samplers because the sampling outcome has some randomness to it.
Be aware that many others are also stochastic samplers, even though their names do not have an "a" in them.
The drawback of using an ancestral sampler is that the image does not converge as the number of sampling steps increases. Compare the images generated using Euler a and Euler below.
Images generated with Euler "a" do not converge at high sampling steps. In contrast, images from Euler converge well.
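The convergence behavior can be reproduced on a one-dimensional toy problem. Everything in the sketch below is an assumption made for illustration: the "image" is a single number, the data is taken to be a standard normal distribution (so the ideal denoiser is x / (1 + sigma^2)), the noise schedule is linear, and the ancestral step follows the general shape of an Euler ancestral update (a shorter deterministic step followed by fresh noise). It is not AUTOMATIC1111's code.

```python
import math
import random

def toy_denoiser(x, sigma):
    # Assumption: data ~ N(0, 1), for which the ideal denoiser is known exactly.
    return x / (1.0 + sigma * sigma)

def sample(steps, seed, ancestral, sigma_max=10.0):
    rng = random.Random(seed)
    sigmas = [sigma_max * (1.0 - i / steps) for i in range(steps + 1)]
    x = sigma_max * rng.gauss(0.0, 1.0)  # start from pure noise
    for s_from, s_to in zip(sigmas, sigmas[1:]):
        if ancestral:
            # Split the target noise level into a deterministic part (s_down)
            # and freshly injected noise (s_up).
            s_up = s_to * math.sqrt(max(0.0, 1.0 - (s_to / s_from) ** 2))
            s_down = math.sqrt(max(0.0, s_to ** 2 - s_up ** 2))
        else:
            s_up, s_down = 0.0, s_to
        d = (x - toy_denoiser(x, s_from)) / s_from  # estimated noise direction
        x = x + d * (s_down - s_from)               # Euler step
        if s_up > 0.0:
            x = x + rng.gauss(0.0, 1.0) * s_up      # ancestral noise injection
    return x

def mean_change(ancestral, seeds=range(20)):
    # Average change in the result when going from 100 to 200 steps.
    diffs = [abs(sample(100, s, ancestral) - sample(200, s, ancestral)) for s in seeds]
    return sum(diffs) / len(diffs)

euler_change = mean_change(ancestral=False)
euler_a_change = mean_change(ancestral=True)
```

In this toy, `euler_change` is small and keeps shrinking as the step counts grow, while `euler_a_change` stays large: with the same seed, the ancestral sampler lands on a different result whenever the number of steps changes. For a fixed seed and a fixed number of steps, both samplers are still reproducible.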
For reproducibility, it is desirable to have the image converge. If you want to generate slight variations, you should use the variation seed.
The samplers with the label "Karras" use the noise schedule recommended in the Karras article. In this schedule, the noise step sizes are smaller near the end. The authors found that this improves the quality of images.
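A minimal sketch of such a schedule is shown below. It interpolates linearly between sigma_max^(1/rho) and sigma_min^(1/rho) and raises the result to the power rho, which is the form of the schedule given in the Karras article as far as we can tell. The values of `sigma_min` and `sigma_max` are placeholders, not the ones a particular Stable Diffusion model uses, and `rho=7.0` should be treated as an assumed default.

```python
def karras_sigmas(n, sigma_min=0.1, sigma_max=10.0, rho=7.0):
    # sigma_i = (max^(1/rho) + t_i * (min^(1/rho) - max^(1/rho)))^rho,
    # with t_i running from 0 to 1. Requires n >= 2.
    min_inv_rho = sigma_min ** (1.0 / rho)
    max_inv_rho = sigma_max ** (1.0 / rho)
    ramp = [i / (n - 1) for i in range(n)]
    return [(max_inv_rho + t * (min_inv_rho - max_inv_rho)) ** rho for t in ramp]

sigmas = karras_sigmas(10)
step_sizes = [a - b for a, b in zip(sigmas, sigmas[1:])]
```

The entries of `step_sizes` shrink from the first step to the last, so the sampler spends more of its steps at low noise levels, where the fine details of the image are decided.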
DDIM (Denoising Diffusion Implicit Model) and PLMS (Pseudo Linear Multi-Step method) were the samplers shipped with the original Stable Diffusion v1. DDIM is one of the first samplers designed for diffusion models. PLMS is a newer and faster alternative to DDIM.
They are generally seen as outdated and not widely used anymore.
DPM (Diffusion Probabilistic Model solver) and DPM++ are samplers designed for diffusion models and released in 2022. They represent a family of solvers with a similar architecture.
DPM and DPM2 are similar, except that DPM2 is second order (more accurate but slower).
DPM++ is an improvement over DPM.
DPM adaptive adjusts the step size adaptively. It can be slow since it doesn't guarantee finishing within the number of sampling steps.
UniPC (Unified Predictor-Corrector) is a new sampler released in 2023. Inspired by the predictor-corrector method in ODE solvers, it can achieve high-quality image generation in 5-10 steps.
k-diffusion simply refers to Katherine Crowson's k-diffusion GitHub repository and the samplers associated with it. The repository implements the samplers studied in the Karras 2022 article. Basically, all samplers in AUTOMATIC1111 except DDIM, PLMS, and UniPC are borrowed from k-diffusion.