Can Neural Networks Improve Classical Optimization of Inverse Problems?
Abstract
We show that a joint neural network training procedure can enhance the solution quality for general inverse problems without requiring domain knowledge.
Reviews and Discussion
This paper proposes to use neural networks instead of iterative optimisation algorithms (such as limited-memory BFGS or Gauss-Newton) to solve general inverse problems. The role of the neural networks is to infer the parameters of the system directly from observations. The paper formulates inverse problems as supervised machine learning problems. The approach is experimentally validated on six different inverse problems.
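For concreteness, my reading of this supervised formulation is roughly the sketch below; the forward model F, the network architecture, and the parameter prior are placeholders of my own choosing, not the paper's actual setup.

```python
# Sketch (my reading, not the paper's code): a network g maps observations y to
# parameters x and is trained on simulated pairs (x, y = F(x)).
import torch

def F(x):  # hypothetical forward model standing in for the true simulator
    return torch.sin(3.0 * x) + 0.5 * x

g = torch.nn.Sequential(torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
opt = torch.optim.Adam(g.parameters(), lr=1e-3)

for _ in range(2000):
    x = torch.rand(128, 1) * 2 - 1     # parameters drawn from an assumed prior
    y = F(x)                           # simulated observations
    loss = ((g(y) - x) ** 2).mean()    # supervised loss on the inferred parameters
    opt.zero_grad(); loss.backward(); opt.step()
```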
Strengths
The paper addresses an interesting topic.
Weaknesses
The main weakness of this paper is its lack of novelty.
The authors cite a number of related papers and claim: "However, since these approaches rely on loss terms formulated with neural network derivatives, they are not applicable to general inverse problems".
I think the authors overlook a lot of work that is directly related to the question of solving inverse problems with neural networks. I mention some of it below, but there is much more. Compared to this literature, it is hard to find any originality in the proposed method. In any case, if there is any originality, it should be articulated with respect to this (uncited) literature on inverse problems with neural networks.
Aggarwal, H.K., Mani, M.P., Jacob, M., 2019. MoDL: Model Based Deep Learning Architecture for Inverse Problems. IEEE Trans. Med. Imaging 38, 394–405. https://doi.org/10.1109/TMI.2018.2865356
Lucas, A., Iliadis, M., Molina, R., Katsaggelos, A.K., 2018. Using deep neural networks for inverse problems in imaging: Beyond analytical methods. IEEE Signal Process. Mag. 35, 20–36. https://doi.org/10.1109/MSP.2017.2760358
Mukherjee, S., Carioni, M., Öktem, O., Schönlieb, C.-B., 2021. End-to-end reconstruction meets data-driven regularization for inverse problems, in: NeurIPS.
Ongie, G., Jalal, A., Metzler, C.A., Baraniuk, R.G., Dimakis, A.G., Willett, R., 2020. Deep Learning Techniques for Inverse Problems in Imaging.
Peng, P., Jalali, S., Yuan, X., 2020. Solving Inverse Problems via Auto-Encoders. IEEE Journal on Selected Areas in Information Theory 1, 312–323. https://doi.org/10.1109/JSAIT.2020.2983643
Questions
What is new about this approach compared to the literature on neural networks for inverse problems?
This paper presents several algorithms for solving the optimization of inverse problems using neural networks, including (1) supervised learning on simulated data; (2) reparameterization, using an untrained neural network as an implicit prior; and (3) a neural adjoint method, which approximates the forward process with a neural network trained on simulated data and then uses the pre-trained network in the optimization solver.
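To make sure I understand the third variant, here is a minimal sketch of the neural adjoint idea as I read it: first fit a surrogate of the forward process on simulated pairs, then freeze it and optimize the unknown parameters against a target observation through the surrogate. The forward model F, the architecture, and the target observation are illustrative placeholders rather than the paper's setup.

```python
import torch

def F(x):  # hypothetical true forward process
    return torch.sin(3.0 * x) + 0.5 * x

# Pre-train a surrogate f ≈ F on simulated data
f = torch.nn.Sequential(torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
opt = torch.optim.Adam(f.parameters(), lr=1e-3)
for _ in range(2000):
    x = torch.rand(128, 1) * 2 - 1
    loss = ((f(x) - F(x)) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

for p in f.parameters():          # freeze the surrogate
    p.requires_grad_(False)

y_obs = F(torch.tensor([[0.3]]))  # observation whose pre-image we seek
x_hat = torch.zeros(1, 1, requires_grad=True)
opt_x = torch.optim.Adam([x_hat], lr=1e-2)
for _ in range(500):              # optimize x through the frozen surrogate
    loss = ((f(x_hat) - y_obs) ** 2).mean()
    opt_x.zero_grad(); loss.backward(); opt_x.step()
```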
Strengths
The proposed algorithms cover different uses of neural networks in the optimization of inverse problems. The experiments cover a variety of inverse problems.
Weaknesses
My main concerns are about the contributions of this paper:
From the method aspect:
- Supervised learning has been widely used for decades.
- The reparameterization shares the same concept as the well-known deep image prior. This paper claims that "these effects have yet to be investigated for general inverse problems or in the context of joint optimization". Shouldn't imaging optimization be an instance of a general inverse problem? Could the authors also clarify the meaning of joint optimization, and whether the proposed reparameterization is applied to joint optimization in this work? (My reading of the reparameterization is sketched after this list.)
- For the neural adjoint, I cannot follow why we need to approximate the forward process with a neural network, as it is usually assumed to be known in an inverse problem.
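For reference, the sketch below shows how I read the reparameterization for a single problem, in the spirit of deep image prior: the solution is written as the output of an untrained network with a fixed input, and the network weights are optimized against the forward-model mismatch. The forward model F and all sizes are placeholders of my own.

```python
import torch

def F(x):  # placeholder forward model
    return torch.sin(3.0 * x) + 0.5 * x

z = torch.randn(1, 8)             # fixed random input to the network
y_obs = F(torch.tensor([[0.3]]))  # single observation
g = torch.nn.Sequential(torch.nn.Linear(8, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
opt = torch.optim.Adam(g.parameters(), lr=1e-3)
for _ in range(2000):
    x_hat = g(z)                              # reparameterized solution x = g_theta(z)
    loss = ((F(x_hat) - y_obs) ** 2).mean()   # forward-model mismatch
    opt.zero_grad(); loss.backward(); opt.step()
```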
From the application aspect: the experiments in this paper are all small-scale toy problems. How does the proposed method work on real-world problems?
In short, I think this paper uses a unified notation to formulate inverse problems, introduces existing works under this notation, and validates them on various toy problems. I think the novelty of this paper is limited.
Questions
See the weaknesses above.
This manuscript proposes a neural network-based framework for solving multiple inverse problems in a joint fashion. The core idea is to parameterize the inverse mapping as a neural network and then optimize the model parameters by minimizing a mismatch loss. This loss is defined based on pairs of data points and a known forward model. This method is referred to as the "reparameterized method," and the paper argues that leveraging information across multiple inverse problems can mitigate the challenging landscapes often encountered in these problems.
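To fix ideas, the following is a minimal sketch of the reparameterized method as I understand it: one network produces candidate solutions for a whole batch of inverse problems, and its weights are trained by minimizing the mismatch between the known forward model applied to those candidates and the observations, without ground-truth solutions. The forward model F, the architecture, and the batch of observations are placeholders I chose for illustration.

```python
import torch

def F(x):  # known forward model (placeholder)
    return torch.sin(3.0 * x) + 0.5 * x

y_obs = F(torch.linspace(-1, 1, 64).unsqueeze(1))  # 64 inverse problems solved jointly

g = torch.nn.Sequential(torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
opt = torch.optim.Adam(g.parameters(), lr=1e-3)
for _ in range(3000):
    x_hat = g(y_obs)                          # candidate solutions for all problems
    loss = ((F(x_hat) - y_obs) ** 2).mean()   # mismatch loss; no ground-truth x used
    opt.zero_grad(); loss.backward(); opt.step()
```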
This approach shares similarities with amortized optimization methods widely studied in contexts like variational inference and stochastic control/reinforcement learning, as discussed in the recent review "Tutorial on Amortized Optimization" (arXiv: 2202.00665). This is the first work I have seen that applies this idea to inverse problems, although I am not sure whether existing work has already explored this straightforward extension in the context of amortized optimization.
However, even if we put aside the paper's novelty, it still suffers from various technical and presentation issues. Specifically, the paper's setting and methodology are not clearly explained, and the rationale for its comparisons with other neural network methods is not well justified.
Strengths
The paper empirically demonstrates that the reparameterized approach, when augmented with BFGS refinement, delivers better solutions compared to the standard BFGS method for inverse problems.
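For clarity on what "BFGS refinement" means in my reading: the network's prediction for each problem is used as the initial guess for a classical BFGS run on the data-mismatch objective. Below is a self-contained sketch with a placeholder forward model, observation, and initial guess (the initial guess stands in for the network's output).

```python
import numpy as np
from scipy.optimize import minimize

def F(x):  # placeholder forward model
    return np.sin(3.0 * x) + 0.5 * x

y_obs = F(np.array([0.3]))   # observation for a single inverse problem
x0 = np.array([0.25])        # stand-in for the network's predicted solution

def objective(x):            # data-mismatch objective ||F(x) - y_obs||^2
    return float(np.sum((F(x) - y_obs) ** 2))

result = minimize(objective, x0, method="BFGS")
print(result.x, result.fun)
```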
Weaknesses
Major Comments:
1. According to the introduction and conclusion, my understanding is that the paper's central thesis positions the reparameterized method as a more effective alternative to vanilla BFGS for solving multiple inverse problems, particularly when enhanced with BFGS refinement. However, the paper, which has a very broad title, also delves heavily into comparisons with two other neural-network approaches: supervised learning and the neural adjoint method. This leaves me uncertain whether the authors aim primarily to promote the reparameterized method or to provide a broader discussion of various neural network-based strategies for inverse problems.
2. While I believe it is justifiable to compare the BFGS and reparameterized methods, I question the fairness of comparing the reparameterized approach with supervised learning and the neural adjoint method, because the latter two require an additional dataset for pre-training, unlike the reparameterized method. Following this point, I am extremely confused by the meaning of subfigures (c) and (d) in the experiments.
2(a) What exactly does "loss" mean on the y-axis of subfigure (c)? Does this term refer to a specific error metric (in terms of L?) for the obtained solutions? At first sight, "loss" would seem to indicate the objective functions associated with the different methods. However, supervised learning and the neural adjoint method have their own loss functions for training the neural networks, which makes a direct comparison with the loss of the reparameterized method unclear.
2(b) Similarly, what is meant by "dataset size" in subfigure (d) for the supervised learning and neural adjoint methods? Does it refer to the size of the training dataset for the networks, or to the number of inverse problems to be solved? If it is the former, the numbers appear too small. If it is the latter, the confusion persists: during the problem-solving phase, methods like BFGS, supervised learning, and the neural adjoint solve each individual problem independently, so the rationale for considering L/n across different values of n is unclear, as these would merely be the same statistic recalculated from different sample sizes.
3. The paper introduces the reparameterized method under the assumption that the relevant quantity is known. However, some experiments operate in settings where it is unknown. I do not see a straightforward way to extend the method to this new context, given that the forward model needs both the solution and this quantity to produce the observations. This omission leaves a critical gap in the technical exposition.
4. On page 4, within the paragraph discussing supervised learning, it is inaccurately stated that "if we additionally have a prior on the solution space, we can generate synthetic training data." What is actually needed is knowledge of the joint distribution of solutions and observations.
5. On page 5, in the first paragraph, the quoted expression appears to be incorrect; should it read differently? Otherwise, I do not understand how the neural adjoint method works.
Minor Issues:
- In the second line below Figure 2, the mathematical expression appears to contain an error.
- The conclusion asserts that this is "the only network-based method that does not require domain-specific information." The term "domain-specific information" is ambiguous here. To utilize the proposed reparameterized method, a forward model is necessary, which could reasonably be categorized as domain-specific information.
Questions
See the questions raised above.
Iterative optimization algorithms can find solutions to simple inverse problems, but they suffer from their reliance on local information, so their effectiveness for complex problems involving local minima, chaos, or zero-gradient regions is severely limited. In this study, the authors employ neural networks to reparameterize the solution space and leverage the training procedure as an alternative to classical optimization. Numerical experiments demonstrate that the neural networks can indeed improve the accuracy of solutions.
Strengths
The idea of introducing a parameter \theta to reparameterize the problem is interesting. Involving neural networks in the procedure is also effective for optimizing the objective function. Extensive simulations illustrate the promising performance of the method.
Weaknesses
In the simulations, Figure 2(c) is not very informative and is in fact a little confusing. From the graph, it seems that "neural adjoint" has a larger loss than "BFGS", yet the explanation below says that "The neural adjoint method finds better solutions than BFGS for about a third of examples for n = 256". To me, the explanation looks inconsistent with the graph. Please explain this or clarify it in the paper.
Questions
No question.