Neural Time Integrator with Stage Correction
Summary
Reviews and Discussion
This paper proposes NeurTISC, a neural time integrator based on stage correction. Unlike NeurVec, NeurTISC compensates for the error at each stage of the integrator. NeurTISC achieves better results on three different equations.
Strengths
By correcting errors at each stage, NeurTISC can predict more accurately and more stably than NeurVec and traditional numerical methods.
Weaknesses
- Note the indentation at the beginning of the paragraph.
- Many citations need parentheses: use a parenthetical citation when the author's name is not part of the sentence. This clearly separates citation information from the main content.
- Related work is insufficiently covered. There are many other works on introducing neural networks to accelerate traditional methods [1, 2].
- Only two time windows are considered for the viscous Burgers equation and the Kuramoto–Sivashinsky equation. So few settings make the results unconvincing, and it is difficult to see how the results of the different methods trend as the time window grows.
- It would be more convincing to compare the methods at different coarse time steps, e.g., doubling and halving the current time step.
- Experiments on the speedup achieved when the traditional method and NeurTISC reach the same accuracy would help demonstrate the effectiveness of NeurTISC. You could reduce the time step of the traditional numerical method or increase the time step of NeurTISC until the accuracies match.
- l146: the letter after commas should be lowercase: In this section, we...
- l202: average the loss
[1] Bar-Sinai, Yohai, et al. "Learning data-driven discretizations for partial differential equations." Proceedings of the National Academy of Sciences 116.31 (2019): 15344-15349.
[2] Greenfeld, Daniel, et al. "Learning to optimize multigrid PDE solvers." International Conference on Machine Learning. PMLR, 2019.
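The matched-accuracy speedup experiment suggested above could be sketched as follows. This is a toy stand-in, not NeurTISC code: plain Heun's method on the test ODE du/dt = -u plays the role of the baseline solver, and the error target is an assumed placeholder; the protocol is simply "halve the step until the error matches, then time the solve."

```python
import time
import numpy as np

def heun_solve(f, u0, t_end, h):
    """Integrate du/dt = f(u) with Heun's method at a fixed step h."""
    u = np.array(u0, dtype=float)
    n = int(round(t_end / h))
    for _ in range(n):
        k1 = f(u)
        k2 = f(u + h * k1)
        u = u + 0.5 * h * (k1 + k2)
    return u

# Toy comparison: shrink the baseline step until its error at t = 1
# matches an assumed target, then record the wall-clock cost per solve.
f = lambda u: -u
exact = np.exp(-1.0)          # exact solution of u' = -u, u(0) = 1, at t = 1
target_err = 1e-6             # assumed accuracy target
h = 0.1
while abs(heun_solve(f, [1.0], 1.0, h)[0] - exact) > target_err:
    h /= 2                    # refine until the target accuracy is met

t0 = time.perf_counter()
heun_solve(f, [1.0], 1.0, h)
elapsed = time.perf_counter() - t0
print(h, elapsed)
```

The same timing would then be done for the learned integrator at its (coarser) step, and the ratio of the two wall-clock times is the reported speedup at matched accuracy.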
Questions
- l122: what is and ?
- l208: what is ?
- In the experiments, what kind of metric is used in the loss function?
This paper proposes a new type of neural time integrator based on a stage-correction strategy (NeurTISC) to address the challenge of balancing accuracy and computational efficiency in dynamical-system simulation. To this end, the authors introduce a novel integrator with stage corrections, inspired by methods like Runge–Kutta, that corrects errors at each stage and allows larger time steps while maintaining stability and accuracy. The algorithmic contribution is demonstrated with a suite of numerical experiments including the elastic pendulum, the viscous Burgers equation, and the Kuramoto–Sivashinsky equation.
Strengths
(1) The proposed method is easy to use and effective. (2) Related work is adequately covered. (3) The proposed method is well presented, which makes it relatively easy to understand.
Weaknesses
(1) The paper does not provide a mathematical proof of the convergence or stability of NeurTISC. (2) The numerical experiments are not sufficient to demonstrate the superiority of NeurTISC. (3) The paper is imprecise and unpolished. The formatting should be checked for readability: some paragraphs begin with an indentation while others do not, which makes the paper look messy. Make sure the format follows the ICLR guidelines. In addition, the figures should carry more information. For example, the horizontal axis in Figure 3 is labeled only ‘x’, which is ambiguous; use a more descriptive label or explain what ‘x’ means in the caption. (4) The conclusion states that “therefore, it allows the use of large step sizes to accelerate the simulation of dynamical systems, including pure ODEs and ODEs converted from PDEs after spacial discretization.” However, the authors do not perform any experiment comparing the number of iteration steps or the computing time of the different methods.
Questions
(1) Could the stage-by-stage correction you introduce affect the stability of the integrator and make it difficult to converge? You should provide a mathematical proof of the stability of your method, or at least cite relevant papers that prove it. (2) In your numerical simulations, you only compare NeurTISC with NeurVec and ETDRK4. Why not provide numerical simulations of other relevant methods? I do not think it would take much time.
In this paper, the authors propose numerical integration methods for differential equations in which Runge-Kutta methods are modified by neural networks. According to the authors, the proposed numerical method provides more reliable numerical results than purely machine-learning approaches. In particular, in the proposed method, the numerical values at each stage are modified by a neural network, thereby improving the accuracy.
Strengths
Unlike operator learning and PINNs, the proposed method yields modifications of Runge-Kutta methods. Hence numerical solutions obtained by this approach may be able to accurately approximate solutions of differential equations.
Weaknesses
I believe that this paper has several weaknesses.
- In the method proposed in this paper, no constraints are imposed on the neural networks. Therefore, numerical methods obtained by this approach are not guaranteed to approximate solutions of the differential equation (i.e., the order of accuracy of the numerical method drops to 0). This undermines the authors' claim that their method is based on numerical integrators and, hence, highly reliable. For example, for the Heun method described on page 3, [the corrected increment] must approximate [the vector field] to define a method with at least first-order accuracy. To this end, a certain condition is required on [the correction networks]; for example, if NN_1 + NN_2 = 0, then [the increment] will approximate [the vector field].
- As far as I understand, this method requires the neural network to be re-trained whenever the step size is changed. This is not practical.
- The authors discuss the computational time below (6). It is stated that the proposed method is faster when is small, but in what cases is this expected? Evaluating neural networks requires matrix operations, which require a certain amount of computation.
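The order-condition concern in the first point above can be made concrete. The following is a sketch under an assumed additive form of the corrections (the paper's exact notation was not preserved in this review):

```latex
% Heun's method with assumed additive stage corrections NN_1, NN_2:
k_1 = f(u_n) + \mathrm{NN}_1(u_n), \qquad
k_2 = f(u_n + h\,k_1) + \mathrm{NN}_2(u_n + h\,k_1),
\\[4pt]
u_{n+1} = u_n + \tfrac{h}{2}\,(k_1 + k_2)
        = u_n + h\,f(u_n)
          + \tfrac{h}{2}\bigl(\mathrm{NN}_1 + \mathrm{NN}_2\bigr) + O(h^2).
% The correction term enters at O(h), so first-order consistency requires
% NN_1 + NN_2 -> 0 as h -> 0; in particular, NN_1 + NN_2 = 0 recovers a
% consistent method, which is the condition the review alludes to.
```

Without such a constraint on the trained networks, the expansion shows the corrections perturb the update at first order, which is the source of the reviewer's accuracy concern.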
Questions
My questions are about the practical aspects of the proposed method. Under what conditions can we expect this method to be practical?
This paper combines neural networks with conventional time-integration methods, e.g. Heun's method (a second-order Runge–Kutta scheme), to speed up time integration while maintaining accuracy and stability. The conventional method advances in time with much larger time steps, and the error of this inaccurate marching is compensated by neural networks. While this approach has been explored in other papers, e.g. NeurVec, the authors propose a separate network for each "stage" of the RK scheme (stage-by-stage correction), observing that the stages have different magnitude, frequency, and error properties. They apply this method to both ODEs and PDEs; for the latter they use a method-of-lines approach (first discretizing the spatial part, then using NeurTISC for time integration).
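The contrast between NeurVec's single end-of-step correction and the stage-by-stage correction described above can be sketched as follows. This is an illustrative mock-up, not the paper's implementation: the `nn_*` arguments are hypothetical stand-ins for trained networks, here exercised with zero corrections so both variants reduce to plain Heun on the toy ODE du/dt = -u.

```python
import numpy as np

def heun_neurvec(u, h, f, nn_correct):
    """NeurVec-style step: one correction applied after both RK stages
    (hypothetical interface; nn_correct stands in for a trained network)."""
    k1 = f(u)
    k2 = f(u + h * k1)
    return u + 0.5 * h * (k1 + k2) + h * nn_correct(u)

def heun_neurtisc(u, h, f, nn_stage1, nn_stage2):
    """Stage-by-stage correction: each stage value gets its own network,
    so each correction can match that stage's error profile."""
    k1 = f(u) + nn_stage1(u)
    k2 = f(u + h * k1) + nn_stage2(u + h * k1)
    return u + 0.5 * h * (k1 + k2)

# With zero corrections, both variants reduce to plain Heun.
zero = lambda v: np.zeros_like(v)
u = np.array([1.0])
for _ in range(100):
    u = heun_neurtisc(u, 0.01, lambda v: -v, zero, zero)
print(float(u[0]))  # close to exp(-1) ≈ 0.3679
```

In the paper's setting, the per-stage networks would be trained so that large coarse steps reproduce the fine-step trajectory; the structural difference above is the whole point of the contribution.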
Strengths
Kuramoto–Sivashinsky in the chaotic regime is a good choice to show the strength of the method. The idea is explained clearly and the paper is well written.
Weaknesses
- The use of neural networks for time integration is now a mature field of its own, and this paper does not succeed in positioning itself within the large body of work in this domain. For instance, Neural ODEs (NODE), a state-of-the-art approach, are not mentioned. There are also hierarchical methods, e.g. "Hierarchical Deep Learning of Multiscale Differential Equation Time-Steppers" by Liu et al., where a very different architecture achieves the same results as the authors. Other advanced methods also address issues this method cannot fix; for example, the large dimensionality of PDEs can be handled by NIF ("Neural Implicit Flow: a mesh-agnostic dimensionality reduction paradigm of spatio-temporal data" by Pan et al.) or CROM ("CROM: Continuous Reduced-Order Modeling of PDEs Using Implicit Neural Representations" by Chen et al.), which not only handle the large dimensionality of the dynamical system to be integrated but also provide a time-continuous solution manifold, meaning the solution can be evaluated by the neural network at any instant.
- The contribution is very incremental, even compared to NeurVec. In theory, the multi-frequency behavior of the stages could still be captured by a single NN at the last stage, say by using more advanced RNN or autoregressive models.
- There is no theoretical analysis providing guarantees; the paper rests mostly on experimental results, which are not the most complicated problems arising in dynamical systems.
Questions
The introduction could be revised. For example, the authors categorize methods as data-driven and physics-informed, but there could be hybrid versions of the two even without referring to conventional methods; what do the authors think of this? How large a time step can be chosen for the RK scheme in this method, and what is the bound beyond which the method may fail? There should be some analysis of this choice. The NN architecture is very simple, which is not necessarily negative, as it can help with fast inference; but have the authors also considered more complicated architectures for the correction part?
This paper explores the integration of neural networks with multi-stage Runge-Kutta (RK) methods to accelerate time integration of dynamical systems. Unlike prior work such as NeurVec, which applies a neural network correction after completing all RK stages in an integration step, this paper introduces a stage-by-stage correction, using a separate neural network for each stage. This approach aims to address stage-specific variations in magnitude, frequency, and error properties. The method is evaluated on several ODEs and PDEs, showing improved integration speed and reasonable accuracy.
Despite these results, as reviewers noted, the contribution of this work is quite incremental. The paper does not provide a theoretical explanation for how the proposed method outperforms NeurVec, nor does it include computational efficiency comparisons, such as the number of iterations required to meet a specific error threshold. Furthermore, the method’s practicality is very limited, as the network must be retrained whenever the step size changes.
Given these limitations, the paper is not ready for publication in its current state.
Additional Comments from Reviewer Discussion
The authors did not respond to the reviewers’ comments during the discussion period, and no updates were provided.
Reject