California Institute of Technology
Diffusion models excel at creating visually convincing images, but they often struggle to meet subtle constraints inherent in the training data. Such constraints could be physics-based (e.g., satisfying a PDE), geometric (e.g., respecting symmetry), or semantic (e.g., including a particular number of objects). When the training data all satisfy a certain constraint, enforcing this constraint on a diffusion model makes it more reliable for generating valid synthetic data and solving constrained inverse problems. However, existing methods for constrained diffusion models are restricted in the constraints they can handle. For instance, recent work proposed to learn mirror diffusion models (MDMs), but analytical mirror maps only exist for convex constraints and can be challenging to derive. We propose neural approximate mirror maps (NAMMs) for general, possibly non-convex constraints. Our approach only requires a differentiable distance function from the constraint set. We learn an approximate mirror map that transforms data into an unconstrained space and a corresponding approximate inverse that maps data back to the constraint set. A generative model, such as an MDM, can then be trained in the learned mirror space and its samples restored to the constraint set by the inverse map. We validate our approach on a variety of constraints, showing that compared to an unconstrained diffusion model, a NAMM-based MDM substantially improves constraint satisfaction. We also demonstrate how existing diffusion-based inverse-problem solvers can be easily applied in the learned mirror space to solve constrained inverse problems.
Berthy T. Feng, Ricardo Baptista, and Katherine L. Bouman. "Neural Approximate Mirror Maps for Constrained Diffusion Models." In The Thirteenth International Conference on Learning Representations (ICLR), 2025.
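The only constraint-specific ingredient our method requires is a differentiable distance function to the constraint set. To make this concrete, below is a minimal PyTorch-style sketch of such a distance for a total-brightness constraint (all valid images share the same pixel sum); the function name, batch layout, and target value are illustrative assumptions, not taken from the paper.

```python
import torch

def brightness_distance(x: torch.Tensor, target: float = 100.0) -> torch.Tensor:
    """Per-image distance of a batch x of shape (B, C, H, W) from the set of
    images whose total pixel sum equals `target`; zero iff the constraint holds."""
    return (x.sum(dim=(1, 2, 3)) - target).abs()
```

Any constraint with a differentiable (or subdifferentiable) distance function can be plugged in the same way.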
During training, the forward mirror map \( \mathbf{g}_\phi \) maps training images into a "mirror" space.
We perturb these mirror images with additive white Gaussian noise at varying levels to create training data for the inverse mirror map \( \mathbf{f}_\psi \).
The inverse mirror map attempts to map these noisy mirror images back to the original space.
Both maps are trained jointly to minimize the following objective, which consists of a cycle-consistency loss, a constraint-distance loss, and a regularization term:
$$ \mathcal{L}(\phi,\psi):=\mathcal{L}_{\text{cycle}}(\mathbf{g}_\phi,\mathbf{f}_\psi)+\lambda_{\text{constr}}\mathcal{L}_{\text{constr}}(\mathbf{g}_\phi,\mathbf{f}_\psi)+\lambda_{\text{reg}}\mathcal{R}(\mathbf{g}_\phi). $$
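To make the objective concrete, here is a minimal PyTorch-style sketch of one loss evaluation. The noise schedule, loss weights, and the particular regularizer below are illustrative placeholders, not the paper's exact choices.

```python
import torch

def namm_loss(g, f, x, constraint_dist,
              sigma_max=1.0, lam_constr=1.0, lam_reg=0.1):
    """One evaluation of the NAMM objective on a batch x of constrained images.
    g is the forward mirror map g_phi, f is the inverse mirror map f_psi, and
    constraint_dist is a differentiable distance to the constraint set."""
    z = g(x)                                       # map into mirror space
    # Perturb mirror images with Gaussian noise at a random level per sample.
    sigma = sigma_max * torch.rand(x.shape[0], 1, 1, 1, device=x.device)
    z_noisy = z + sigma * torch.randn_like(z)

    x_cycle = f(z_noisy)                           # map back to image space
    loss_cycle = (x_cycle - x).pow(2).mean()       # cycle-consistency loss
    loss_constr = constraint_dist(x_cycle).mean()  # stay near the constraint set
    reg = (z - x).pow(2).mean()                    # placeholder for R(g_phi)

    return loss_cycle + lam_constr * loss_constr + lam_reg * reg
```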
An illustration of the training procedure is shown below.
Our method includes an optional finetuning step, which is detailed in the paper.
Improved constraint satisfaction. We tested our method on a variety of constraints, which we refer to as Total Brightness, 1D Burgers', Divergence-free, Periodic, and Count. The baseline is a vanilla diffusion model trained on the constrained dataset. As the figure below shows, image samples from our approach are nearly indistinguishable from baseline samples, yet there is a significant difference in their distances from the constraint set.
The table below quantifies the improved constraint distance (CD) of samples from our method.
It includes measures of distribution-matching accuracy—maximum mean discrepancy (MMD) and Kernel Inception Distance (KID)—to verify that our approach does not sacrifice image quality for constraint satisfaction. In terms of KID, our approach actually improves distribution-matching accuracy.
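For reference, KID can be computed with an off-the-shelf implementation; here is a minimal sketch using torchmetrics with random stand-in data (the sample counts and subset size are arbitrary, and this is not necessarily the evaluation pipeline used in the paper).

```python
import torch
from torchmetrics.image.kid import KernelInceptionDistance

# Real and generated images as uint8 tensors of shape (N, 3, H, W).
real_imgs = torch.randint(0, 256, (100, 3, 64, 64), dtype=torch.uint8)
fake_imgs = torch.randint(0, 256, (100, 3, 64, 64), dtype=torch.uint8)

kid = KernelInceptionDistance(subset_size=50)  # subset_size must be <= N
kid.update(real_imgs, real=True)
kid.update(fake_imgs, real=False)
kid_mean, kid_std = kid.compute()
```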
Solving physics-constrained inverse problems. A compelling use case of NAMMs is to solve scientific inverse problems using constrained diffusion-model priors.
The figure below shows results for data assimilation of (a) a 1D Burgers' system and (b) a divergence-free, or incompressible, Kolmogorov flow.
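To sketch how an off-the-shelf diffusion-based solver plugs into the learned mirror space, below is one schematic sampler update with DPS-style measurement guidance: the prior step uses the mirror-space diffusion model, and the data-fidelity step differentiates the measurement residual through the inverse map \( \mathbf{f}_\psi \). The guidance scheme, names, and step sizes here are illustrative assumptions, not the exact solver used in our experiments.

```python
import torch

def mirror_guidance_step(z_t, t, score_model, f, forward_op, y,
                         step=0.01, guidance=1.0):
    """One schematic update for solving y = forward_op(x) with a diffusion
    prior trained in mirror space; score_model is the mirror-space score,
    f is the inverse mirror map f_psi, forward_op is the measurement operator."""
    # Prior move: a plain Euler step along the learned mirror-space score.
    z_prior = z_t + step * score_model(z_t, t)

    # Data-fidelity move: differentiate the measurement residual through the
    # inverse mirror map so that updates stay consistent with the data y.
    z_req = z_t.detach().requires_grad_(True)
    residual = (forward_op(f(z_req)) - y).pow(2).sum()
    grad = torch.autograd.grad(residual, z_req)[0]

    return (z_prior - guidance * grad).detach()
```

Because the final sample is passed through \( \mathbf{f}_\psi \), the reconstruction is pulled toward the constraint set by construction, regardless of which sampler performs the prior updates.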