SReNet: Spectral Refined Network for Solving Operator Eigenvalue Problem
Abstract
Reviews and Discussion
The authors propose a neural-network-based approach for solving operator eigenvalue problems, i.e., approximating eigenvalues and eigenfunctions of a given operator. The proposed Spectral Refined Network (SReNet) is based on the power method.
Strengths
- Highly relevant problem in a variety of disciplines
- Sound introduction of the eigenvalue basics given the page restrictions
Weaknesses
- The derivation of the suggested method is difficult to follow, and it is also unclear to me what the advantage of a neural-network-based approach is.
- Convergence behaviour not well understood
- Comparison to non-learning-based approaches is missing
- The notation for the neural network is confusing, since the same symbol appears in the next line as the number of sampling points
- The authors introduce a term in equation (10), but its meaning only becomes clear in equation (12), where the term based on the power method is introduced
- Are the results in Table 1 really absolute errors? What about relative errors (see the formulas after this list)? These could be much more meaningful, as it is otherwise not clear whether the methods produce anything sensible.
- Most of the references are formatted poorly. What is the first reference supposed to be? DME Zurich is likely not an author. Capitalize the names of Krylov, Schur, Fokker, Planck, etc.
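For concreteness on the error question above: writing $\lambda$ for a reference eigenvalue and $\hat{\lambda}$ for the computed approximation, the two measures would be

$$\text{absolute error} = |\hat{\lambda} - \lambda|, \qquad \text{relative error} = \frac{|\hat{\lambda} - \lambda|}{|\lambda|},$$

and the relative error is the scale-invariant one, i.e., it indicates how many correct digits are obtained regardless of the magnitude of $\lambda$.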
Questions
- What do the authors know about the performance of the method? How does it depend on the spectral gap between the two dominating eigenvalues (cf. the convergence rate recalled after these questions)?
- Filtering and deflation are essentially sequential in nature if you compare to a state-of-the-art Krylov-based eigenvalue solver. Why is it even sensible to switch to a learning-based procedure?
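For reference on the spectral-gap question above (standard power-method theory, not specific to the manuscript): if the eigenvalues are ordered $|\lambda_1| > |\lambda_2| \ge \dots$, the power iteration converges to the dominant eigenvector at the asymptotic rate

$$\|u_k - u_1\| = \mathcal{O}\!\left(\left|\frac{\lambda_2}{\lambda_1}\right|^{k}\right),$$

so a small gap between the two dominating eigenvalues implies slow convergence, and it would be good to know whether the proposed method inherits or mitigates this dependence.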
This paper proposes a method to solve eigenvalue problems for differential operators by using neural networks to represent the eigenvectors.
Strengths
The writing is clear and the methodology is explained well.
Weaknesses
- The authors claim that eigenvalue problems suffer from the curse of dimensionality; however, the computation of the first few eigenvalues is actually much more benign, e.g., using sparse grids. So if the goal is to compute the first few eigenfunctions, the authors should compare this method against existing sparse-grid methods.
- The approach of the authors is not completely distinct from the NeuralEF method: their minimization problem can be written as finding the critical points of a related objective once (12) is expanded. In this light, the method can be viewed as a variant that seeks to accelerate the convergence of NeuralEF. The authors should comment on this connection.
Questions
- Did the authors really mean to cite the reference (LeVeque, 2002), which is a book on finite volume methods, not finite element methods?
- Does the normalization refer to a division by a pointwise scalar? This would then be different from the usual power method. Do the authors mean normalization by a norm of the function, or similar?
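To make the distinction in the last question concrete, here is a minimal sketch of the two operations in a discretized setting (the function names and the matrix operator are placeholders, not the authors' implementation):

```python
import numpy as np

def power_step_norm(A, u):
    # Standard power-method step: apply the operator, then rescale by a single
    # scalar, the norm of the whole vector.
    v = A @ u
    return v / np.linalg.norm(v)

def power_step_pointwise(A, u, ref):
    # "Pointwise" normalization: divide entry-by-entry by another vector.
    # This is a different operation from the usual power method and is only
    # well defined where `ref` is nonzero.
    v = A @ u
    return v / ref
```

The two generally produce different iterates, so it matters which one the manuscript intends.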
This paper proposes a neural-network-based approach to solving operator eigenvalue problems. The main technical tools are derived from numerical linear algebra, particularly the power method and the deflation projection widely applied in solving matrix eigenproblems. The authors present comparisons with existing approaches.
Strengths
The operator eigenproblem has broad applications in the physics and engineering domains. Recent works have actively explored the possibility of using deep learning tools to solve such problems and overcome the limitations of numerical linear algebra tools. This paper attempts to combine numerical tools with deep learning, an important topic.
Weaknesses
Though the paper presents evaluations to suggest the advantages of the proposed method, there are significant technical flaws, from the framework and algorithm design to the evaluation approach. Therefore, the conclusions in the manuscript could be misleading.
- Formulation: The paper extensively discussed and compared with previous works on neural eigensolvers, including NeuralEF [Deng2022], NeuralSVD [Ryu2024], and PMNN [Yang2023]. Though Figure 2 of the manuscript resembles Figure 1 of NeuralSVD, this work does NOT provide a training objective (loss function) for neural network training, unlike NeuralEF or NeuralSVD. Instead, the implementation requires an iterative optimization step over two terms [cf. Eq. (12)]. However, the details, e.g., the parameter design or the stability of the iterative procedure, were not discussed in the manuscript.
- Results & Evaluation: From the manuscript (appendix included), it is unclear how the authors addressed the generalization issue. At a high level, the work is built on discretizing the function by taking random samples, while the samples are fixed during training [cf. Algorithm 2, P19]. Notably, in Eq. (33), the residual error is computed from the same collection of samples. In other words, the training procedure was designed to minimize the error on a collection of fixed samples, and the training error was reported as the residual, which causes overfitting and is problematic. The main challenge in eigenfunction computation is generalizing to the whole domain, which was not mentioned in the manuscript (a minimal sketch of this point follows the reference list below).
- In D.2, the network architecture and optimization for SReNet differ from those of the other baselines (NeuralSVD and NeuralEF), which makes the comparison questionable.
- The result in Figure 1 of the manuscript, showing the comparison between the proposed approach and NeuralSVD, is inconsistent with the result reported in the NeuralSVD paper [Ryu2024, Figure 4].
The presentation of the manuscript is not very clear, particularly the implementation of Eq. (12), i.e., the "iterative optimization." The notation in Section 3 is hard to follow, and there are typos in some of the symbols.
- Zhijie Deng, Jiaxin Shi, and Jun Zhu. NeuralEF: Deconstructing kernels by deep neural networks. In International Conference on Machine Learning, pp. 4976–4992. PMLR, 2022.
- J. Jon Ryu, Xiangxiang Xu, H. S. Erol, Yuheng Bu, Lizhong Zheng, and Gregory W. Wornell. Operator SVD with neural networks via nested low-rank approximation. arXiv preprint arXiv:2402.03655, 2024.
- Qihong Yang, Yangtao Deng, Yu Yang, Qiaolin He, and Shiquan Zhang. Neural networks based on power method and inverse power method for solving linear eigenvalue problems. Computers & Mathematics with Applications, 147:14–24, 2023.
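To illustrate the generalization concern raised above: a minimal sketch, assuming a Monte Carlo residual of the form in Eq. (33) (the function names below are placeholders, not the authors' code), shows the difference between the residual on the fixed training samples and the residual on fresh samples from the domain:

```python
import numpy as np

def residual(apply_operator, u, lam, xs):
    # Monte Carlo estimate of the eigen-residual ||L u - lam * u||^2 on samples xs.
    r = apply_operator(u, xs) - lam * u(xs)
    return np.mean(r ** 2)

rng = np.random.default_rng(0)
xs_train = rng.uniform(-1.0, 1.0, size=(1024, 1))  # fixed samples used during training
xs_fresh = rng.uniform(-1.0, 1.0, size=(1024, 1))  # independent samples from the domain

# Reporting residual(..., xs_train) measures training error only;
# residual(..., xs_fresh) is what a claim about the whole domain would require.
```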
Questions
See the weaknesses part. In particular, it would be better if the authors could clarify where the performance gain comes from (supposing the evaluation issues have been fixed).
Two specific questions are:
- How does the approach address the generalization issue?
- How much of the performance gain is due to the deflation approach? Or, more generally, to the incorporation of numerical linear algebra into the learning approach?
Details of Ethics Concerns
N/A
The paper proposes a learning-based approach for solving an operator eigenvalue problem. The difference from existing studies is the incorporation of deflation and a shift-and-invert preconditioning technique into the learning approach, so that the neural solver can find the top-L eigenpairs and converge faster. Experiments are conducted on several operator eigenvalue problems.
Strengths
The strength of the paper is improving the existing learning-based approach to top-eigenpair computation for operators via deflation and preconditioning. Extensive experiments are conducted.
Weaknesses
There are several concerns.
- It's unclear why Eq. (10) can find the largest eigenpair. The optimal value of the objective function in Eq. (10) is 0, which can be attained at any eigenpair.
- There is no analysis of the sample complexity required to guarantee the optimal solution.
- It's unclear how to understand Eq. (18) and Eq. (22), since they mix operators and matrices.
- Why can the inverse operation be omitted in Eq. (24), given the goal of amplifying the target eigenvalues (see the standard shift-and-invert identity recalled after this list)?
- What is the difference between the proposed algorithm and the work by Yang et al. (2023)?
- In Table 1, a nine-order-of-magnitude improvement occurs for the largest eigenpair, which does not require deflation at all. It is unclear how the proposed approach achieves this.
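On the shift-and-invert question in the list above (a standard identity from numerical linear algebra, recalled here for context, not a claim about the manuscript's Eq. (24)): if $A u_i = \lambda_i u_i$, then

$$(A - \sigma I)^{-1} u_i = \frac{1}{\lambda_i - \sigma}\, u_i,$$

so the eigenvalues closest to the shift $\sigma$ become the dominant ones precisely because of the inverse; applying $(A - \sigma I)$ without the inverse instead amplifies the eigenvalues farthest from $\sigma$, which is why omitting the inverse needs justification.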
Questions
See above.