Deep image prior

Jun-06-2024


Basic information related to "Deep Image Prior (DIP)":

1. What's the idea of deep image prior (DIP)?

[Method Section of DIP paper]

A deep generator network is a parametric function \(x = f_{\theta}(z)\) that maps a code vector \(z\) to an image \(x\). Generators are often used to model a complex distribution \(p(x)\) over images as the transformation of a simple distribution \(p(z)\) over the codes, such as a Gaussian.

One might think that the distribution \(p(x)\) is encoded in the parameters \(\theta\) of the network. However, the authors of DIP argue that a significant amount of information about the image distribution is contained in the structure of the network itself, even without any training of the model parameters.

The generalized formulation for tasks such as denoising, inpainting, and super-resolution: $${x^* = \underset{x}{\mathrm{argmin}}\; E(x; x_0) + R(x)}$$ where \(x_0\) is the noisy/occluded/low-resolution image and \(E(x; x_0)\) is a task-dependent data term. \(R(x)\) is a regularizer, such as total variation (TV), the nuclear norm, or a wavelet-domain sparsity penalty.
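To make the two terms concrete, here is a minimal numpy sketch of the classical objective for denoising: a squared-\(L_2\) data term \(E(x; x_0)\) plus an anisotropic TV regularizer \(R(x)\). The toy image, noise level, and weight `lam` are illustrative choices, not values from the paper.

```python
import numpy as np

def data_term(x, x0):
    """E(x; x0) for denoising: squared L2 distance to the noisy image x0."""
    return np.sum((x - x0) ** 2)

def tv_norm(x):
    """Anisotropic total-variation regularizer R(x) on a 2-D image."""
    dh = np.abs(np.diff(x, axis=0)).sum()  # vertical finite differences
    dw = np.abs(np.diff(x, axis=1)).sum()  # horizontal finite differences
    return dh + dw

rng = np.random.default_rng(0)
clean = np.zeros((8, 8))
clean[2:6, 2:6] = 1.0                                 # toy piecewise-constant image
x0 = clean + 0.1 * rng.standard_normal(clean.shape)   # noisy observation

lam = 0.1                                  # regularization weight (illustrative)
energy = data_term(x0, x0) + lam * tv_norm(x0)   # objective evaluated at x = x0
```

A classical solver would now minimize `energy` over `x` directly; DIP replaces exactly the `tv_norm` part, as described next.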

Instead, DIP replaces \(R(x)\) with the implicit prior captured by the neural network parametrization: $${ \theta^* = \underset{\theta}{\mathrm{argmin}}\; E(f_{\theta}(z); x_0)}, \;\;\;\; {x^* = f_{\theta^*}(z)}$$ where the (local) minimizer \(\theta^*\) is obtained using an optimizer such as gradient descent, starting from a random initialization of the parameters \(\theta\).
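The optimization above can be sketched end to end in a few lines of numpy: a randomly initialized one-hidden-layer generator \(f_\theta(z)\) (standing in for the paper's CNN), a fixed random code \(z\), and plain gradient descent on \(E = \|f_\theta(z) - x_0\|^2\). All sizes, the learning rate, and the toy 1-D "image" are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed random code vector z and a noisy observation x0 (a toy 1-D signal).
z = rng.standard_normal(16)
x0 = np.sin(np.linspace(0, 2 * np.pi, 32)) + 0.1 * rng.standard_normal(32)

# Randomly initialized generator f_theta(z); note there is no pre-training.
W1 = 0.5 * rng.standard_normal((64, 16)); b1 = np.zeros(64)
W2 = 0.5 * rng.standard_normal((32, 64)); b2 = np.zeros(32)

def forward(z):
    h = np.tanh(W1 @ z + b1)
    return W2 @ h + b2, h

loss0 = np.sum((forward(z)[0] - x0) ** 2)   # E at random initialization

lr = 0.01
for _ in range(500):
    x, h = forward(z)
    r = x - x0                       # residual; grad of E up to a factor of 2
    gW2 = np.outer(r, h); gb2 = r
    gpre = (W2.T @ r) * (1 - h ** 2) # backprop through tanh
    gW1 = np.outer(gpre, z); gb1 = gpre
    W2 -= lr * gW2; b2 -= lr * gb2   # gradient descent on theta
    W1 -= lr * gW1; b1 -= lr * gb1

x_star, _ = forward(z)               # x* = f_{theta*}(z)
loss_final = np.sum((x_star - x0) ** 2)
```

In the paper the key point is early stopping: the network fits the clean structure before the noise, so the iteration count itself acts as the regularizer.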

2. DIP in MRI

I found several articles using DIP in dynamic imaging, motion correction, and parameter mapping; some of them are discussed below:

In dynamic imaging, instead of reconstructing the image at each time point independently, the temporal dependencies of the dynamic measurements should be taken into account. The TDDIP method uses a one-dimensional manifold parametrized by time to capture these temporal dependencies. Even when a temporally meaningful manifold is chosen, a hand-crafted design limits performance, so the authors introduced a MapNet [1] to add flexibility to the model. The MapNet consists of fully connected layers with nonlinearities, which map the fixed manifold into a more expressive latent space: \(\Omega = g_{\phi}(Z)\).
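A minimal sketch of that mapping step, assuming a helix as the hand-crafted time manifold and a two-layer MLP as the MapNet (both the manifold shape and the layer sizes are illustrative choices, not the paper's actual design):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_latent = 20, 8   # number of time frames and latent dimension (illustrative)

# Fixed hand-crafted 1-D manifold parametrized by time: here a helix in R^3.
t = np.linspace(0, 1, T)
Z = np.stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t), t], axis=1)  # (T, 3)

# MapNet g_phi: fully connected layers with a nonlinearity, lifting the fixed
# manifold into a more expressive latent space Omega = g_phi(Z).
W1 = rng.standard_normal((3, 32)); b1 = np.zeros(32)
W2 = rng.standard_normal((32, d_latent)); b2 = np.zeros(d_latent)

def mapnet(Z):
    H = np.maximum(Z @ W1 + b1, 0.0)   # ReLU hidden layer
    return H @ W2 + b2

Omega = mapnet(Z)   # one latent code per time frame, fed to the CNN generator
```

The MapNet parameters \(\phi\) are optimized jointly with the generator during the DIP fit, so the latent trajectory can deform away from the initial hand-crafted curve.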

Zhongsen Li et al., however, argue that the TDDIP model limits expressiveness because it employs a pyramid-shaped CNN generator shared by all image frames. They therefore proposed a two-stage generative network: (1) an independent CNN recovers the image at each time point; (2) the spatio-temporal correlations within the feature space are exploited by a graph neural network (GNN).
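Stage (2) can be illustrated with a single GCN-style propagation step over a temporal chain graph: per-frame features are mixed through a symmetrically normalized adjacency matrix. The chain topology, feature sizes, and single-layer design below are my illustrative assumptions, not the architecture from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
T, F = 6, 4   # number of frames and feature dimension (illustrative sizes)

# Per-frame features from stage (1): one feature vector per time point.
X = rng.standard_normal((T, F))

# Temporal chain graph: each frame linked to its neighbours, plus self-loops.
A = np.eye(T)
for i in range(T - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

# Symmetrically normalized adjacency D^{-1/2} A D^{-1/2}, as in a basic
# graph convolution layer.
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt

W = rng.standard_normal((F, F))           # learnable feature transform
X_mixed = np.maximum(A_hat @ X @ W, 0.0)  # one propagation step + ReLU
```

Each frame's features now carry information from neighbouring time points, which is how the graph model encodes the spatio-temporal correlation.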

3. Q&A

1. Is it also unnecessary to train the MapNet (TDDIP) or the GNN (graph image prior)?

Correct, there is no pre-training. But note that the CNN \(h_\psi\) alone should not be taken as the parameterized network \(f_\theta\); rather, the whole composition \(f_\theta = h_\psi \circ g_\phi\) is the untrained network, and both \(\phi\) and \(\psi\) are optimized jointly from random initialization.
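A tiny sketch of that composition, with toy stand-ins for the two sub-networks (all shapes and the tanh/linear choices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the two parameter sets.
phi = rng.standard_normal((3, 8))    # MapNet parameters (g_phi)
psi = rng.standard_normal((8, 16))   # generator parameters (h_psi)

def g(z, phi):
    return np.tanh(z @ phi)          # MapNet: manifold point -> latent code

def h(w, psi):
    return w @ psi                   # generator: latent code -> flattened image

def f(z, theta):
    """The full DIP network f_theta = h_psi o g_phi, with theta = (phi, psi)."""
    phi, psi = theta
    return h(g(z, phi), psi)

z = rng.standard_normal(3)
x = f(z, (phi, psi))   # the DIP fit updates phi and psi together
```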

[to be continued]

© 2024 wrr6ps