PaperHub
Score: 6.1/10 · Poster · ICML 2025
4 reviewers (ratings: 4, 3, 3, 3; min 3, max 4, std 0.4)

Analytical Lyapunov Function Discovery: An RL-based Generative Approach

OpenReview | PDF
Submitted: 2025-01-21 · Updated: 2025-07-24

Abstract

Keywords
Control Theory · Lyapunov Function · Symbolic Transformer · Risk-seeking Policy Optimization · Analytical Function Discovery · Stability

Reviews and Discussion

Official Review (Rating: 4)

This paper proposes to use transformers and reinforcement learning (RL) to construct local analytical Lyapunov functions for high-dimensional non-polynomial systems. The proposed framework consists of three components: 1) a symbolic transformer to generate candidate Lyapunov functions; 2) a numerical verifier for finding counterexamples; and 3) a risk-seeking policy gradient algorithm to optimize the symbolic transformer based on the Lyapunov risk. Furthermore, to improve the training efficiency, a genetic programming component is also included in the training framework. Experimental results on dynamical systems with up to 10 dimensions show that the proposed algorithm successfully found the analytical Lyapunov function in most cases, and even discovered new Lyapunov functions for the 2-bus lossy system.

Update after rebuttal

The rebuttal addresses my concerns, and I think the paper can benefit from the additional results provided in the rebuttal. I have also read other reviewers' comments and the authors' responses. The comments on the scalability of the SOS methods raised by Reviewer uz6L were also my concerns on the additional experimental results, but I think the authors' reply is fair. Overall, I think this is a good paper that provides valid insights on learning-based Lyapunov function discovery. I have increased my score.

Questions for the Authors

  1. In Figure 2, where does the $x_1$ in the bottom-right green box (the input to the encoder) come from?

  2. Can the authors discuss the completeness of the proposed framework? What if the algorithm is applied to an unstable system?

  3. Is it possible to add more experiments considering my comments in Experimental Designs Or Analyses?

I will be willing to increase the score if the questions are addressed.

Claims and Evidence

The claims made in the submission are supported by clear and convincing evidence.

Methods and Evaluation Criteria

The proposed methods and/or evaluation criteria (e.g., benchmark datasets) make sense for the problem or application at hand.

Theoretical Claims

I checked the correctness of any proofs for theoretical claims. They are standard results from the literature, and there are no issues.

Experimental Designs or Analyses

I checked the soundness/validity of the experimental designs or analyses. Below are some potential improvements that could be made.

  1. The baselines of the experiments can be improved. The proposed framework specifically targets analytical Lyapunov functions, but the introduced baselines are for finding neural Lyapunov functions. It would be great if more baselines for analytical Lyapunov functions were introduced, e.g., sum-of-squares methods (at least for the polynomial systems).

  2. For the experiment demonstrating the scalability of the proposed framework on the 10-D system, it would be more interesting to see a case where the ground-truth Lyapunov function is not of a simple form like $\sum_{i=1}^{10}x_i^2$, since this is generally the first analytical form that a human would try. Potentially, the authors could change the variables a bit to design a more difficult ground-truth Lyapunov function, which would make the results more impressive.

Supplementary Material

I reviewed the appendix.

Relation to Broader Literature

The key contributions of the paper are strongly related to the construction of Lyapunov functions using machine learning techniques. This is crucial for verifying the stability of dynamical systems and developing stable controllers for dynamical systems.

Essential References Not Discussed

N/A

Other Strengths and Weaknesses

Other Strengths:

  1. The writing of the paper is clear and easy to follow.

  2. All parts of the proposed framework are ablated in the experiments.

  3. The newly discovered Lyapunov function part is impressive.

Other Comments or Suggestions

I suggest the authors use \citep and \citet carefully; the current citation format is messy.

Author Response

We thank reviewer s9Wv for the valuable time and constructive feedback. We provide point-by-point responses below.

Q1: It would be great if more baselines for analytical Lyapunov functions were introduced, e.g., sum-of-squares methods (at least for the polynomial systems).

Thanks for the valuable suggestion. We have added a comparison to sum-of-squares (SOS) methods as a baseline for both polynomial and non-polynomial systems, as detailed below.

Polynomial systems: For a given $n$-dimensional polynomial (poly) dynamics $f(x)$ and a degree $2d$, to check the global asymptotic stability of $f(x)$, the SOS method aims to find a poly $V(x)$ of degree $2d$ such that 1) $V(x)-\sum_{i=1}^{n}\sum_{j=1}^{d}\epsilon_{ij}x_i^{2j}$ is SOS, where $\sum_{j=1}^{d}\epsilon_{ij}>\gamma,\ \forall i=1,\dots,n$ with $\gamma>0$ and $\epsilon_{ij}\geq 0$ for all $i$ and $j$, and 2) $-\frac{\partial V}{\partial x}f(x)$ is SOS [1].

For local stability analysis, consider a ball of radius $r$ centered at the origin, $\mathcal{B}_r(0)$, which can be represented by the semialgebraic set

$S=\{x : g(x,r)\geq 0\}$, where $g(x,r)=r-\sum_{i=1}^{n}x_i^2$.

We require that the stability condition holds in $S$. Retaining the same optimization objective and constraints on $V(x)$ as before, a modified constraint on the Lie derivative is imposed: $-\frac{\partial V}{\partial x}f(x)-s(x)g(x,r)$ is SOS for some SOS poly $s(x)$. If such an $s(x)$ exists, we can certify local stability.

We developed our code based on the findlyap function from SOSTOOLS (MATLAB) and issue-16 of SOSTOOLS' official GitHub repo to examine the SOS method on the poly systems in Appendix F. The table below summarizes the results: SOSTOOLS identifies valid Lyapunov functions for the systems in Appendix F.1, F.2, and F.3 (up to 3-D). For example, $V(x)=0.86x_1^2-0.60x_1x_2-6.63\times 10^{-7}x_1x_3+0.90x_2^2+4.49\times 10^{-7}x_2x_3+0.79x_3^2$ certifies the global stability of the 3-D system in Appendix F.2.

Table: Performance of the SOS Method on Poly Systems

| System | Degree $2d$ | Runtime | Region |
|---|---|---|---|
| App. F.1 | 2 | 0.697 s | $\mathcal{B}_1(0)$ |
| App. F.2 | 2 | 0.832 s | Global |
| App. F.3 - I | 2 | 0.497 s | Global |
| App. F.3 - II | 4 | 2.509 s | Global |

However, for the higher-dimensional systems considered in Appendix F.4, F.5, and F.6, SOSTOOLS fails to certify global stability - as expected, since these systems are inherently not globally stable. Furthermore, it is unable to verify local stability due to computational limitations arising from the complexity of the $s(x)g(x,r)$ term in the Lie derivative constraint, which grows quadratically with the system dimension. For the system in Appendix F.4, $g(x,1)=1-\sum_{i=1}^{6}x_i^2$ contains 7 terms, and a degree-2 SOS poly $s(x)$ has at most 36 terms. Thus, the constraint on the Lie derivative would have hundreds of terms, which is unsolvable by SOSTOOLS.

Please refer to Response to Reviewer uz6L (Q1) for SOS method on non-poly systems.

Q2: It would be more interesting to see if the ground truth Lyapunov function is not a simple form like $\sum_{i=1}^{n}x_i^2$.

Consider the synthetic dynamics adapted from Appendix F.4 & G.2 with interactions between two subsystems:

$\dot{x}_1=-x_1+0.5x_2-0.1x_5^2$, $\dot{x}_2=-0.5x_1-x_2+0.1x_8$, $\dot{x}_3=-x_3+0.5x_4-0.1x_1^2$,
$\dot{x}_4=-0.5x_3-x_4$, $\dot{x}_5=-x_5+0.5x_6$, $\dot{x}_6=-0.5x_5-x_6+0.1x_2^2$,
$\dot{x}_7=x_8$, $\dot{x}_8=-\sin(x_7)\cos(x_7)-x_8-\sin(x_9)\cos(x_9)-0.1x_2$, $\dot{x}_9=x_8-x_9$.

To properly address the trigonometric terms in $\dot{x}_8$, the Lyapunov function for these dynamics cannot be of a simple form like $\sum_{i=1}^{n}x_i^2$ and should include some trigonometric terms. Setting the state space $\mathcal{D}=\{x\in\mathbb{R}^9 \mid |x_i|\leq 1.5,\ \forall i=1,\dots,9\}$, our method successfully identifies a valid Lyapunov function $V=\sum_{i=1}^{6}x_i^2+\sin^2(x_7)+x_8^2-\cos(x_9)+1$, which passes formal verification following the settings in Section 5. This example and the power system examples in Appendix G.3 & G.5 demonstrate our method's capability to recover Lyapunov functions with complex structures.
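As an independent sanity check of the reported candidate (our own sketch, not the paper's SMT-based verifier), the Lyapunov conditions for this 9-D system can be spot-checked on random samples from the stated domain; the gradient of $V$ is written out by hand:

```python
import numpy as np

# 9-D synthetic dynamics stated in the rebuttal (Q2 response).
def f(x):
    x1, x2, x3, x4, x5, x6, x7, x8, x9 = x
    return np.array([
        -x1 + 0.5 * x2 - 0.1 * x5**2,
        -0.5 * x1 - x2 + 0.1 * x8,
        -x3 + 0.5 * x4 - 0.1 * x1**2,
        -0.5 * x3 - x4,
        -x5 + 0.5 * x6,
        -0.5 * x5 - x6 + 0.1 * x2**2,
        x8,
        -np.sin(x7) * np.cos(x7) - x8 - np.sin(x9) * np.cos(x9) - 0.1 * x2,
        x8 - x9,
    ])

# Candidate V = sum_{i<=6} x_i^2 + sin^2(x7) + x8^2 - cos(x9) + 1.
def V(x):
    return np.sum(x[:6]**2) + np.sin(x[6])**2 + x[7]**2 - np.cos(x[8]) + 1.0

def grad_V(x):
    g = 2.0 * x.copy()                        # d/dx_i of x_i^2 terms (and x8^2)
    g[6] = 2.0 * np.sin(x[6]) * np.cos(x[6])  # d/dx7 of sin^2(x7)
    g[8] = np.sin(x[8])                       # d/dx9 of -cos(x9)
    return g

def lie_derivative(x):
    return grad_V(x) @ f(x)

# Spot-check V > 0 and LfV < 0 on random points of D = {|x_i| <= 1.5},
# excluding a small ball around the origin.
rng = np.random.default_rng(0)
samples = rng.uniform(-1.5, 1.5, size=(20000, 9))
samples = samples[np.linalg.norm(samples, axis=1) > 1e-2]
ok = all(V(x) > 0 and lie_derivative(x) < 0 for x in samples)
```

Such sampling cannot replace formal verification (it can only find counterexamples, not prove their absence), but it is a quick plausibility check on the printed expression.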

Q3: Messy citation format.

We thank the reviewer for the helpful suggestion. We will fix the citation format in the final version.

Q4: In Figure 2, where does the $x_1$ in the bottom-right green box come from?

Consider the binary tree in the top-right corner: when generating the last token $x_2$, its parent is '+' and its sibling is $x_1$. As a result, $x_1$ shows up in the bottom-right green box.

Q5: Can the authors discuss the completeness of the proposed framework? What if the algorithm is applied to an unstable system?

Please refer to the response to reviewer uz6L (Q1) for the completeness of the proposed method, and to the response to reviewer ve2p (Q3) for the analysis of unstable systems.

[1] A. Papachristodoulou and S. Prajna. A tutorial on sum of squares techniques for systems analysis. In Proceedings of the IEEE American Control Conference, 2005.

Reviewer Comment

Thanks for the rebuttal! The rebuttal addresses my concerns, and I think the paper can benefit from the additional results provided in the rebuttal. Please include them in the final submission. I have also read other reviewers' comments and the authors' responses. The comments on the scalability of the SOS methods raised by Reviewer uz6L were also my concerns on the additional experimental results, but I think the authors' reply is fair. Overall, I think this is a good paper that provides valid insights on learning-based Lyapunov function discovery. I have increased my score.

Author Comment

Thank you very much for your thoughtful review and positive feedback! We are glad that the additional results addressed your concerns and improved the paper. We will definitely include these results in the final paper. Thanks again for your time and helpful suggestions.

Official Review (Rating: 3)

  1. Similarly to other lines of work, the authors train a new model for a single nonlinear dynamical system.
  2. The key innovation lies in the symbolic formulation of the problem: instead of training a fixed model M (which would serve as the Lyapunov candidate) as in previous lines of work, they work with a symbolic transformer.
  3. They employ an RL loop to discover local Lyapunov candidates. A good component introduced by the authors is the SHGO algorithm.
  4. The most practically significant results are that (a) the approach is not constrained by dimension, and (b) they discovered new local Lyapunov functions.

Questions for the Authors

  1. Given the symbolic answer provided by the model, can this help to enlarge the stability area given you are not overfitting on a particular locality domain?
  2. It's not clear how general the approach can be: due to the nature of the symbolic representation, the candidate is forced to be the same function over the entire domain. Can this be problematic?
  3. Did you find some systems where the algorithm cannot find a Lyapunov candidate?
  4. To get a better sense of the proposed improvement, would it be possible to sample random dynamical systems (similar to Alfarano et al) and see how many you can recover compared to Alfarano, ANLC and FOSSIL 2.0?

Claims and Evidence

The new framework can effectively discover valid analytical Lyapunov functions. The authors support the claim by considering several dynamical systems for which a solution was not previously known. Table 1 provides data on runtime, verification time, discovered Lyapunov functions, stability type (local or global asymptotic stability), and success rates for various systems.

Methods and Evaluation Criteria

The symbolic component was what was really missing from previous lines of work, which parametrized the Lyapunov function as a nonlinear neural network.

Theoretical Claims

There is no particular theoretical claim made by the authors.

Experimental Designs or Analyses

The authors compared the new framework with standard baselines (ANLC and FOSSIL 2.0).

Supplementary Material

N/A

Relation to Broader Literature

The current scientific literature either relies on a parametrized model (the Lyapunov candidate) with backpropagation + falsification to find a local Lyapunov function, or trains a general model and asks it to guess a global Lyapunov candidate. The key contributions are the symbolic parametrization and the use of RL to successfully train the model.

Essential References Not Discussed

N/A

Other Strengths and Weaknesses

N/A

Other Comments or Suggestions

N/A

Author Response

We thank reviewer ve2p for the valuable time and constructive feedback. We provide point-by-point responses below.

Q1: Given the symbolic answer provided by the model, can this help to enlarge the stability area given you are not overfitting on a particular locality domain?

Thanks for the insightful question.

Firstly, our approach implicitly promotes generalization beyond the local sample region. Since the symbolic transformer directly operates on analytical expressions of dynamics rather than finite samples from the state space, the resulting Lyapunov functions naturally generalize beyond the sampled region. Empirically, as demonstrated by the 6-D polynomial system (Appendix F.4), our method successfully identified Lyapunov functions valid across expanded state-space domains. Additionally, the Lyapunov function obtained for the 6-D quadrotor (Appendix G.4) is globally valid.

Furthermore, to explicitly enlarge the stability region, a promising future direction is to characterize the region of attraction (ROA) of the learned symbolic Lyapunov function and incorporate it into the reward function of risk-seeking policy gradient. Related efforts to enlarge the ROA in learning-based Lyapunov function search include Chang et al. (2019), Wu et al. (2023), and Yang et al. (2024). The corresponding references can be found in the reference section of the original manuscript.

Q2: It's not clear how general the approach can be: due to the nature of the symbolic representation, the candidate is forced to be the same function over the entire domain. Can this be problematic?

As mentioned in Section 3.1, Line 156 of our manuscript, ``Without loss of generality, we assume the origin to be the equilibrium point.'' For a general nonlinear system with an equilibrium not at the origin, or with more than one equilibrium, our approach applies in the following way.

Suppose $\dot{x}=f(x)$, $x\in\mathbb{R}^n$, has two equilibrium points $x_1^*,x_2^*\in\mathbb{R}^n$ with $f(x_1^*)=f(x_2^*)=0$. For each equilibrium, define a change of variables. First, for the equilibrium point $x_1^*$, define $\tilde{x}_1=x-x_1^*$. The dynamics in the transformed coordinates, $\dot{\tilde{x}}_1=f(\tilde{x}_1+x_1^*):=f_1(\tilde{x}_1)$, have an equilibrium at the origin. We can apply the proposed method to find a local Lyapunov function for $f_1$ near the origin, which corresponds to a local Lyapunov function for $f(x)$ near the equilibrium point $x_1^*$. A similar procedure can be applied to $x_2^*$ to obtain a local Lyapunov function for $f(x)$ near the equilibrium point $x_2^*$.

In summary, our method applies to general nonlinear systems. Users first compute the equilibrium points of the original nonlinear system. For each equilibrium, they define the change of variables and transform the system dynamics so that the equilibrium is at the origin. Then, they apply the proposed approach to find a local Lyapunov function for each transformed system separately; these correspond to different local Lyapunov functions near each equilibrium. We will include the above discussion in the final version.
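The change-of-variables recipe can be sketched in a few lines (the toy 2-D dynamics and equilibrium below are ours, chosen purely for illustration):

```python
# Toy 2-D dynamics with an equilibrium at x* = (1, 2) (illustrative only).
def f(x, y):
    return (-(x - 1.0) + (y - 2.0), -(y - 2.0))

x_star = (1.0, 2.0)

# Change of variables x~ = x - x*: define f1(x~) := f(x~ + x*).
def f1(xt, yt):
    return f(xt + x_star[0], yt + x_star[1])

# The transformed system has its equilibrium at the origin, so a local
# Lyapunov function found for f1 near 0 maps back to one for f near x*.
origin_value = f1(0.0, 0.0)
```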

We hope the above explanation clarifies this question. Should there be any additional questions, we would be happy to address them during the discussion period.

Q3: Did you find some systems where the algorithm cannot find a Lyapunov candidate?

Thanks for the valuable question. Our framework fails to return a valid Lyapunov function if the dynamical system is unstable. The framework may also encounter difficulties on highly complex dynamics. For example, the success rate for high-dimensional systems is not consistently 100%, indicating occasional failures within a predefined number of training epochs under some random seeds. Importantly, such failures do not imply system instability. As shown in Table 1, one can successfully find valid Lyapunov functions with other random seeds (i.e., different network initializations). In comparison, for the unstable inverted pendulum without control, using the same experimental settings as for the simple pendulum (Appendix G.1), our framework fails to recover an expression satisfying the Lyapunov conditions over 200 epochs of training under different initializations.

Q4: To get a better sense of the proposed improvement, would it be possible to sample random dynamical systems (similar to Alfarano et al) and see how many you can recover compared to Alfarano, ANLC and FOSSIL 2.0?

Thanks for the constructive suggestion. To complete the comparison, we contacted the authors of Alfarano et al. (2024), who conducted the evaluation of their pre-trained model on our test systems. Due to constraints related to global stability and the dimensionality of their training data, their model can only successfully find Lyapunov functions for systems in Appendix F.1, F.2, F.3, & G.1. These systems have dimensions lower than 6 and the found Lyapunov functions offer global stability guarantees. We will include these results in the experiment section of the final paper.

Reviewer Comment

Thanks for these answers! For Q2, I was referring to the fact that you are forcing the candidate Lyapunov function representation to be defined in R^n, so you are excluding other possible candidates in this way.

Author Comment

We really appreciate reviewer ve2p's follow-up clarification. We are not sure whether we correctly understand the original question "Due to the nature of symbolic representation, this forces the candidate to be always the same function in the entire domain. Can this be problematic?" and the follow-up clarification "For Q2, I was referring to the fact that you are forcing the candidate Lyapunov function representation to be defined in $\mathbb{R}^n$, so you are excluding other possible candidates in this way". Thus, we provide a response based on our two interpretations of the question. If our response does not fully address your question, please let us know and we are happy to provide further clarification.

Our first interpretation is that the reviewer may be concerned that the Lyapunov function form is constrained to be the same across the whole domain, which may be restrictive for complex nonlinear systems with multiple equilibrium points. That is why, in the original response, we clarified how to apply our method to systems with multiple equilibrium points, giving a recipe for identifying different Lyapunov functions around the different equilibrium points. Thus, our method is not limited to a single Lyapunov function for the entire domain, but can be used to identify different local Lyapunov functions around the different equilibrium points.

However, if that is not what the reviewer is asking, our second interpretation of the question is: for a dynamical system $\dot{x}=f(x)$, $x\in\mathbb{R}^n$, should the Lyapunov function candidate be defined on i) $\mathbb{R}^n$, or ii) some $\mathbb{R}^m$, $m<n$, or iii) some $\mathbb{R}^m$, $m>n$?

Based on the definition of a Lyapunov function (Nonlinear Systems, Khalil, 2002, Theorem 4.1), for a dynamical system in $\mathbb{R}^n$, a Lyapunov function is a continuously differentiable function $V: D\to\mathbb{R}$, where $D\subseteq\mathbb{R}^n$. If $D=\mathbb{R}^n$, $V$ is a global Lyapunov function. If $D$ is a proper subset of $\mathbb{R}^n$, i.e., $D\subset\mathbb{R}^n$, then $V$ is a local Lyapunov function, which only certifies stability within the region $D$. Thus, cases i) and ii) are possible, and case iii) is not.

Next, we would like to clarify that our method can represent Lyapunov functions in both case i) and case ii). Note that in our method, the region $D$ is specified by the user and is provided as an input to the transformer-based reinforcement learning pipeline. $D$ can be either the entire state space, i.e., $D=\mathbb{R}^n$ (case i), or a subset of it, i.e., $D\subset\mathbb{R}^n$ (case ii), depending on the task. Furthermore, in our symbolic library, the different dimensions of the state variable are included as different tokens, e.g., $\{x_1,\dots,x_n\}$, together with the operation tokens. Thus, the Lyapunov function candidates produced by our transformer model may contain all of $x_1,\dots,x_n$ or only a subset of them.

However, we do want to note that for Lyapunov asymptotic stability, $V$ is required to be positive definite, with $V(0)=0$ and $V(x)>0,\ \forall x\in D\backslash\{0\}$, and the Lie derivative must be negative definite, with $L_f V(0)=0$ and $L_f V(x)<0,\ \forall x\in D\backslash\{0\}$. Therefore, when $D$ is a set in $\mathbb{R}^n$ (not in a lower-dimensional subspace), the Lyapunov function should contain all of $x_1,\dots,x_n$. Otherwise, it will not meet the positive definiteness requirement and will require methods beyond Lyapunov stability theory (such as LaSalle's invariance principle), which is beyond the scope of this work.
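The positive-definiteness point can be seen with a two-line example (ours, for illustration): a candidate that omits a state variable vanishes on an entire subspace, not only at the origin:

```python
# On D ⊂ R^2, V(x1, x2) = x1^2 ignores x2, so it vanishes on the whole
# line x1 = 0; positive definiteness fails at any (0, x2) with x2 != 0.
def V(x1, x2):
    return x1**2

violating_state = (0.0, 1.0)              # a nonzero state
value_at_violation = V(*violating_state)  # V = 0 although the state is nonzero
```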

Official Review (Rating: 3)

The paper presents an RL framework for discovering analytical Lyapunov functions for nonlinear dynamical systems. The proposed approach trains a symbolic transformer from scratch. The transformer generates candidate Lyapunov functions in a symbolic form, which is then refined and verified via a combination of global optimization (using SHGO), risk-seeking policy gradient updates, and genetic programming (GP) as expert guidance. The authors compared their approach with two other tools over a series of benchmarks.

Questions for the Authors

See previous sections.

Thanks for the additional effort. I have gone through the additional response. My concerns are fairly addressed, with some reservations about the local robustness. I have updated my rating. I suggest the authors further compare with the latest SOS-based techniques on local robustness in a later version.

Claims and Evidence

The authors claimed to address the challenge: "Can neural networks effectively discover valid analytical Lyapunov functions directly from complex system dynamics?"

It is partially supported by the experimental results, which show an improvement in scalability compared to ANLC and FOSSIL. However, all the benchmarks are addressable by classical SDP-based Lyapunov function synthesis techniques, e.g., using [1] to convert ODEs with elementary functions to polynomial ODEs and using SOSTools with SeDuMi to address them. Classical SDP-based techniques can typically handle systems up to about 10 dimensions; e.g., among the demos provided by SOSTools, sosdemo5 is an 8-dimensional example using SDP, though it is not for Lyapunov stability analysis. Thus, the given experiments cannot fully support the claim.

Similarly, the authors claimed in the introduction that "despite the progress of SOS methods, several theoretical results (Ahmadi et al., 2011; Ahmadi, 2012; Ahmadi & El Khadir, 2018) show that asymptotically stable systems may not admit a polynomial Lyapunov function of any degree. In addition, SOS methods suffer numerical sensitivity issues in practice." No experiment is given to demonstrate that the proposed technique outperforms SDP/SOS on sensitivity issues. In addition, the authors did not provide a proof of completeness of their approach. Thus, it is not fair to blame the incompleteness of the SDP/SOS techniques.

Finally, beyond the classical SDP/SOS techniques, there are existing learning-based techniques, e.g., [2], that can address harder systems (hybrid systems) with up to 23 dimensions to generate analytical (Lyapunov-like) barrier functions. Considering the fundamental similarity between Lyapunov functions and barrier functions, it is necessary to compare with these works.

[1] Liu, Jiang, et al. "Abstraction of elementary hybrid systems by variable transformation." International Symposium on Formal Methods. Cham: Springer International Publishing, 2015.

[2] Zhao, H., Liu, B., Dehbi, L., Xie, H., Yang, Z., & Qian, H. (2024). Polynomial Neural Barrier Certificate Synthesis of Hybrid Systems via Counterexample Guidance. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 43(11), 3756-3767.

Methods and Evaluation Criteria

Overall, the proposed technique follows the paradigm of data-driven or counter-example guided abstraction refinement (CEGAR) techniques.

However, there is a concern regarding Section 4.2, which introduces an intermediate soft verification -- I call it soft verification because it does not provide hard guarantees, and an additional SMT-based verification step is needed. It is not clear why this step is necessary. In existing approaches, e.g., Chang et al. (2019), no soft verification is used during training, and hard verification is conducted when the training converges. Is there any specific reason to introduce this step into the framework?

Theoretical Claims

This paper does not involve theoretical claims.

Experimental Designs or Analyses

As mentioned in Claims and Evidence, a direct comparison with SOS/SDP methods on the given benchmarks is expected. An experiment on higher-dimensional systems (>10), beyond the capacity of SOS/SDP methods, is also expected; it would help demonstrate the advantages of the proposed technique.

Supplementary Material

The authors provided code for two examples, which is runnable.

Relation to Broader Literature

None.

Essential References Not Discussed

There is a large amount of work [2] on analytical neural barrier function learning that needs to be cited, considering the relation between barrier functions and Lyapunov functions. In [2], they handled a 13-D continuous system and a 23-D hybrid system, which is a significant result compared to the examples considered in this work.

Other Strengths and Weaknesses

The idea of using a symbolic transformer is quite interesting, and it is expected to see deeper analysis and discussion in this part, e.g., convergence, (statistical) completeness.

Other Comments or Suggestions

As mentioned in the Other Strengths And Weaknesses, it is expected to see deeper analysis and discussion in symbolic transformer, e.g., convergence, (statistical) completeness. While the whole framework follows the classical paradigm, it is suggested to mostly focus on the symbolic transformer in this work.

Author Response

Q1 and Q2: Justify the claimed advantage over SOS techniques. Sensitivity issues of SOS. Completeness of their approach.

Thanks for this valuable advice. We tested SOS using SOSTOOLS on polynomial (poly) and non-polynomial (non-poly) systems. Below is a summary of the setup and results.

Polynomial systems: SOS techniques successfully identify Lyapunov functions for poly systems up to 3-D (Appendix F.1, F.2, & F.3), but failed to retrieve valid local Lyapunov functions for the systems in Appendix F.4, F.5, & F.6 of dimension $\geq 6$. Please refer to our response to reviewer s9Wv (Q1) for the detailed setup and results.

Non-polynomial systems: Following [3], we applied the SOS method to two non-poly systems: the simple pendulum and the 3-D trig dynamics (Appendix G.1 & G.2). Both are recast into poly form by introducing new variables and the necessary (in)equality constraints. SOSTOOLS identifies valid Lyapunov functions in both cases (see the table below). For the pendulum, it succeeds in under 20 s. However, for the locally stable 3-D trig system, the addition of one state introduces four extra constraints, and the resulting formulation leads to over an hour of solving time, in contrast to our 157 s.

Table: Performance of the SOS Method on Non-Poly Systems

| System | Degree $2d$ | Runtime | Region |
|---|---|---|---|
| App. G.1 | 2 | 19.58 s | Global |
| App. G.2 | 2 | 6163 s | $\{x\in\mathbb{R}^3 : |x_i|\leq 1.5\}$ |

To apply SOS methods to a non-poly system $\dot{z}=f(z)$, $z\in\mathcal{D}_1$, define $x_1=z$ as the original states and $x_2$ as the new variables. Let $x=(x_1,x_2)$. The system dynamics can then be written in rational poly form: $\dot{x}_1=f_1(x)$, $\dot{x}_2=f_2(x)$, with constraints $x_2=F(x_1)$, $G_1(x)=0$, $G_2(x)\geq 0$, where $F,G_1,G_2$ are vectors of functions.
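For concreteness, a standard instance of this recasting (our illustration in the spirit of [3], not one of the rebuttal experiments) is the unit-constant pendulum $\dot z_1 = z_2$, $\dot z_2 = -\sin z_1 - z_2$:

```latex
% Introduce x_2 = F(x_1) = (\sin z_1,\ \cos z_1) =: (s, c).
% The lifted dynamics are polynomial in (z_2, s, c):
\dot{z}_2 = -s - z_2, \qquad
\dot{s} = c\, z_2, \qquad
\dot{c} = -s\, z_2,
% subject to the equality constraint
G_1(x) = s^2 + c^2 - 1 = 0 .
```

Here $\dot s = \cos z_1 \, \dot z_1$ and $\dot c = -\sin z_1 \, \dot z_1$ follow from the chain rule, so the transcendental terms disappear at the cost of one extra state and one algebraic constraint.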

Define $g(x)$ as the collective denominator of $f_1,f_2$, and the local region of interest as the semialgebraic set $\{x\in\mathbb{R}^{n+m} \mid G_D(x)\geq 0\}$, where $G_D(x)$ is a vector of polys designed to match the original state space.

Let $x_{2,0}=F(0)$. Suppose there exist poly functions $V(x)$, $\lambda_1(x)$, $\lambda_2(x)$, and SOS polys $\sigma_i(x)$, $i=1,2,3,4$, of appropriate dimensions, such that

(1) $V(0,x_{2,0})=0$,

(2) $V(x)-\lambda_1^T(x)G_1(x)-\sigma_1^T(x)G_2(x)-\sigma_3^T(x)G_D(x)-\phi(x)\in\text{SOS}$,

(3) $-g(x)\left(\frac{\partial V}{\partial x_1}(x)f_1(x)+\frac{\partial V}{\partial x_2}(x)f_2(x)\right)-\lambda_2^T(x)G_1(x)-\sigma_2^T(x)G_2(x)-\sigma_4^T(x)G_D(x)\in\text{SOS}$,

where $\phi(x)$ is a scalar poly with $\phi(x_1,F(x_1))>0,\ \forall x_1\in\mathcal{D}_1\backslash\{0\}$. If (1), (2), and (3) hold, then $z=0$ is stable.

Compared with ours, the SOS method on non-poly systems has a few limitations: a) it requires substantial domain expertise for the recasting ($G_D,F,G_1,G_2$) and manual design of coefficient constraints (ensuring $V(0)=0$, $V(x)>0$); b) the additional state variables and rapidly growing constraint complexity hinder scalability: constraints (1), (2), and (3) are more complex than those for polynomial systems, which already struggle at dimension $\geq 6$; c) the approach does not extend to certifying asymptotic stability.

As for completeness, symbolic transformers have shown strong empirical performance in expression search, supported by recent work (Alfarano et al., 2024; Holt et al., 2023). However, if our method fails to return a valid Lyapunov function, this does not imply that the system is unstable, as is also the case for other constructive Lyapunov function methods. Formal completeness proofs remain an open question for future research. We will clearly state this limitation in the final paper.

Sensitivity issues are noted in Dawson et al. (2023b). Since our work lacks a thorough analysis of this point, we agree with the reviewer and will remove the claim.

Q3: Soft verification

We introduce soft verification to improve training efficiency and candidate quality. Specifically, our framework employs SHGO to find the minimizer of the Lyapunov function $V$ and the maximizer of the Lie derivative $L_f V$. We sample around these points to find counterexamples, which avoids the frequent timeouts observed with SMT-based verification during training and enhances candidate quality (Appendix H.2). Formal SMT verification is applied post-training. Similar strategies to reduce verification costs include adaptive sampling (Grande et al., 2023) and projected gradient descent (Yang et al., 2024).
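A minimal sketch of this soft-verification step, using scipy's `shgo` on a toy 2-D system and candidate of our own choosing (not the paper's implementation):

```python
import numpy as np
from scipy.optimize import shgo

# Toy system xdot = (x2, -x1 - x2) with candidate V = x1^2 + x1*x2 + x2^2;
# its Lie derivative works out to LfV = -x1^2 - x1*x2 - x2^2.
def V(x):
    return x[0]**2 + x[0] * x[1] + x[1]**2

def lie_V(x):
    grad = np.array([2 * x[0] + x[1], x[0] + 2 * x[1]])
    fx = np.array([x[1], -x[0] - x[1]])
    return grad @ fx

bounds = [(-1.5, 1.5), (-1.5, 1.5)]

# Soft verification: globally minimize V and maximize LfV over the domain.
min_V = shgo(V, bounds)                       # minimum of V (0 at the origin)
max_LfV = shgo(lambda x: -lie_V(x), bounds)   # maximum of LfV via -min(-LfV)

# The candidate survives the check if V's minimum over the domain is 0
# (attained only at the origin) and LfV's maximum is <= 0; otherwise,
# points near the optimizers are sampled as counterexamples.
passes = min_V.fun >= -1e-8 and -max_LfV.fun <= 1e-8
```

Since SHGO returns explicit optimizers (`min_V.x`, `max_LfV.x`), the neighborhoods to sample for counterexamples fall out of the same call.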

Q4: Discussion of [2]

This paper uses NN poly expansions to reformulate the barrier function search from a nonconvex bilinear problem into CEGAR training with LMI verification. It handles continuous systems up to 13-D and hybrid systems up to 23-D, achieving scalability by using SOS only for verification. However, it is limited to polynomial dynamics and candidates. We will include [2] in the final paper.

[3] A. Papachristodoulou and S. Prajna. Analysis of non-polynomial systems using the sum of squares decomposition. In Positive Polynomials in Control. Springer Berlin Heidelberg, 2005.

Reviewer Comment

Thanks for the effort.

I still have concerns about the supplementary experiments. The results showed that the SOS relaxation with SDP can only scale to 3D examples, which is not quite convincing: in a tutorial on SOS 20 years ago, Papachristodoulou already showed a 4D example [a], 12D examples were given a decade ago [b], and even 22D examples were addressed by improved SOS variants (DSOS, SDSOS) [b]. May I know what solver is used? Please try MOSEK as the backend SDP solver.

I suggest the authors conduct two more experiments over the 10D example (5-link pendulum system) and the 12D example (6-link pendulum system) in [b], as we already know SOS can address these two. In [b], the authors tried to find the ROA, so the V function in the optimization is the local Lyapunov function.

[a] Papachristodoulou, Antonis, and Stephen Prajna. "A tutorial on sum of squares techniques for systems analysis." In Proceedings of the 2005 American Control Conference, pp. 2686-2700. IEEE, 2005.

[b] Majumdar, Anirudha, Amir Ali Ahmadi, and Russ Tedrake. "Control and verification of high-dimensional systems with DSOS and SDSOS programming." In 53rd IEEE Conference on Decision and Control, pp. 394-401. IEEE, 2014.

Author Comment

We thank reviewer uz6L for constructive feedback. Here we provide point-by-point responses to address your concerns.

  1. Scalability of the SOS methods:

First of all, we would like to clarify the correctness and rationale of the SOS comparison results on our test cases. The 6-D, 8-D, and 10-D polynomial test systems in our Appendix F.4-F.6 are locally stable but not globally stable. Identifying Lyapunov functions for local stability is significantly more challenging for the SOS method due to the added complexity of the $s(x)g(x,r)$ terms, which complicate the optimization constraints on the Lie derivative. Consequently, SOSTOOLS is only able to find a Lyapunov function for the polynomial systems in Appendix F.1-F.3 (2-D and 3-D), and becomes infeasible for the high-dimensional test systems (6-D, 8-D, and 10-D). Detailed results are given in our response to Reviewer s9Wv (Q1).
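To make the added difficulty concrete, a generic local SOS program with an S-procedure multiplier (a standard sketch in our own notation, not necessarily the exact SOSTOOLS encoding used) reads:

```latex
\begin{aligned}
\text{find } & V(x),\ s(x) \\
\text{s.t. } & V(x) - \epsilon \lVert x \rVert^2 \in \Sigma[x], \\
& -\nabla V(x)^{\top} f(x) - s(x)\, g(x, r) \in \Sigma[x], \\
& s(x) \in \Sigma[x], \qquad g(x, r) = r^2 - \lVert x \rVert^2,
\end{aligned}
```

where $\Sigma[x]$ denotes the SOS cone and $g(x,r) \geq 0$ carves out the ball of radius $r$. The product $s(x)g(x,r)$ is bilinear in the unknown coefficients of $V$ and $s$, which is exactly what makes the local problem harder than the global one, where the multiplier term is absent.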

Furthermore, in response to the reviewer's question, we added experiments applying the SOS method to globally stable polynomial systems. We tested our implementation on the globally stable 4D system (Example 7 from [a]) and on an additional globally stable synthetic 10D polynomial system. Using SOSTOOLS, we successfully computed degree-4 Lyapunov functions for the 4D system (10.104 seconds with SeDuMi, 9.441 seconds with MOSEK) and for the 10D system (95.878 seconds with SeDuMi, 15.54 seconds with MOSEK). These results illustrate that SOS methods can indeed scale effectively up to 10 dimensions for globally stable systems. The dynamics of the synthetic 10D polynomial system used in the test are:

$\dot{x}_1=-x_1+0.5y_1+0.1x_1x_4^2$, $\dot{y}_1=-0.5x_1-y_1+0.2y_1^3y_5^2$, $\dot{x}_2=-x_2+0.5y_2$, $\dot{y}_2=-0.5x_2-y_2$, $\dot{x}_3=-x_3+0.5y_3$, $\dot{y}_3=-0.5x_3-y_3$, $\dot{x}_4=-x_4+0.5y_4-0.1x_1^2x_4$, $\dot{y}_4=-0.5x_4-y_4$, $\dot{x}_5=-x_5+0.5y_5$, $\dot{y}_5=-0.5x_5-y_5-0.2y_1^4y_5$.
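As a quick numerical sanity check (ours, not part of the rebuttal experiments), a forward-Euler simulation of this 10D system suggests trajectories contract toward the origin; this is of course evidence, not a stability proof.

```python
import math

# Forward-Euler sanity check that the synthetic 10D system above is
# attracted to the origin (an illustrative numerical check, not a proof).
def step(s, dt):
    x1, y1, x2, y2, x3, y3, x4, y4, x5, y5 = s
    ds = [
        -x1 + 0.5 * y1 + 0.1 * x1 * x4 ** 2,
        -0.5 * x1 - y1 + 0.2 * y1 ** 3 * y5 ** 2,
        -x2 + 0.5 * y2,
        -0.5 * x2 - y2,
        -x3 + 0.5 * y3,
        -0.5 * x3 - y3,
        -x4 + 0.5 * y4 - 0.1 * x1 ** 2 * x4,
        -0.5 * x4 - y4,
        -x5 + 0.5 * y5,
        -0.5 * x5 - y5 - 0.2 * y1 ** 4 * y5,
    ]
    return [si + dt * di for si, di in zip(s, ds)]

state = [0.5] * 10            # arbitrary initial condition
norm0 = math.sqrt(sum(v * v for v in state))
for _ in range(5000):         # integrate to t = 50 with dt = 0.01
    state = step(state, 0.01)
norm_T = math.sqrt(sum(v * v for v in state))
print(norm0, norm_T)          # the final norm is far below the initial one
```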

  2. Fair comparison

For a fair comparison on the non-polynomial systems, we followed your original suggestion to recast them into equivalent polynomial forms. We tested the recasting approach with SOS optimization on both the 2D simple pendulum (Appendix G.1) and the 3D Trig dynamics (Appendix G.2). Although SOS-based methods can handle these recast systems, the approach has practical limitations: recasting requires significant domain expertise, and the additional variables and complicated constraints it introduces substantially increase computational cost. For instance, the recast 3D Trig dynamics required more than 6000 seconds (SeDuMi) / 5800 seconds (MOSEK) for SOS optimization, while our proposed method discovered a valid Lyapunov function in just 157 seconds.

Regarding the suggested 5-link (10D) and 6-link (12D) pendulum examples from [b], we note two critical differences from our experimental settings:

(1) Reference [b] employs polynomial approximations of non-polynomial chaotic systems, reducing complexity at the expense of potential modeling errors. Our work, however, deals explicitly with original nonlinear dynamics without approximations. Thus, we regard the recasting approach as a more appropriate benchmark.

(2) Additionally, the examples in [b] focus on computing the region of attraction (ROA) by optimizing $\rho$ with a fixed, pre-defined Lyapunov function (the cost-to-go of an LQR controller). This pre-selection using domain knowledge significantly simplifies the optimization task, whereas our method requires no expert-driven Lyapunov function selection. For completeness, the optimization in (4) of [b] is $\max_{\rho, L(x)} \rho$, s.t. $(x^Tx)(V(x)-\rho)+L(x)\dot{V}(x)\in DSOS_{2N,6}$, where DSOS denotes the diagonally dominant sum of squares. We believe the success of the method depends heavily on the selection of $V(x)=x^TSx$.

Furthermore, when reference [b] optimizes both the Lyapunov function and ROA simultaneously, it addresses only a 4D system, aligning with our previous results.

  3. Polynomial Systems without Polynomial Lyapunov Functions

An explicit example of a 2-D polynomial system is presented by Ahmadi & El Khadir (2018), which is globally asymptotically stable but does not admit a (global) polynomial Lyapunov function. As a result, SOS techniques cannot certify the global stability of this system. In contrast, our framework is not limited by this constraint. As a proof of concept, we applied our approach to this system using three different random initializations. Our approach successfully identified a valid Lyapunov function $V(x)=x_2^2+\log(x_1^2+1)$, with an average runtime under 185 seconds.
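A Lyapunov function of this form can be spot-checked numerically. The sketch below assumes the dynamics $\dot{x}_1=-x_1+x_1x_2$, $\dot{x}_2=-x_2$ (the classic globally stable polynomial vector field of this kind from the Ahmadi et al. line of work; whether this is the exact system used in the rebuttal is our assumption) and samples random points to confirm $V>0$ and $\dot{V}<0$ away from the origin.

```python
import math, random

# Numerical spot-check of V(x) = x2^2 + log(x1^2 + 1) under the assumed
# dynamics x1_dot = -x1 + x1*x2, x2_dot = -x2 (assumption, see lead-in).
def V(x1, x2):
    return x2 ** 2 + math.log(x1 ** 2 + 1)

def V_dot(x1, x2):
    # dV/dt = (2*x1/(1+x1^2)) * x1_dot + 2*x2 * x2_dot
    return (2 * x1 / (1 + x1 ** 2)) * (-x1 + x1 * x2) + 2 * x2 * (-x2)

rng = random.Random(0)
violations = 0
for _ in range(10000):
    x1, x2 = rng.uniform(-5, 5), rng.uniform(-5, 5)
    if x1 ** 2 + x2 ** 2 < 1e-6:   # exclude a tiny ball around the origin
        continue
    if V(x1, x2) <= 0 or V_dot(x1, x2) >= 0:
        violations += 1
print(violations)  # expected 0: V > 0 and V_dot < 0 away from the origin
```

Sampling is only a falsification check, of course; certifying validity still requires the formal verification step described in the paper.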

We sincerely thank the reviewer for the valuable feedback. We hope the above new experimental results and explanations help clarify the questions.

Review
3

This paper presents a novel and promising approach to discovering analytical Lyapunov functions for nonlinear dynamical systems using reinforcement learning and transformer-based generative models. The work addresses a fundamental challenge in control theory with significant practical implications. The framework consists of three key components: (1) a symbolic transformer that generates candidate Lyapunov functions, (2) a global-optimization-based numerical verifier that checks Lyapunov conditions and provides counterexamples, and (3) a risk-seeking policy gradient algorithm that optimizes the transformer parameters. The approach is enhanced with genetic programming for expert guidance.

update after rebuttal: I have decided not to change my assessment and maintain my score.

Questions for Authors

I am curious about the exploration-exploitation trade-off in your approach: Since the space of potential analytical expressions grows exponentially with expression complexity, how did you address the exploration challenge in your RL framework?

Claims and Evidence

Primary claim: The RL-based framework can discover valid Lyapunov functions for complex systems

Evidence:

  1. The method achieves 100% success rate on most low-dimensional systems and maintains 60-80% success on higher-dimensional systems
  2. The framework successfully handles non-polynomial systems with trigonometric terms

Secondary Claim 1: The approach discovers previously unknown Lyapunov functions

Evidence:

  1. For the 4-D lossy power system, they discover two distinct valid Lyapunov functions
  2. The authors claim these are the first analytical Lyapunov functions for certifying local stability of a 2-bus lossy power system

Secondary Claim 2: Risk-seeking policy gradient improves discovery performance

Evidence:

  1. Ablation study (referenced in section 5.5) shows risk-seeking with α=0.1 achieves 66.67% success rate compared to 33.33% with α=0.5 and 0% with standard policy gradient (α=1)

Methods and Evaluation Criteria

The paper doesn't use standard benchmark datasets but rather a collection of dynamical systems of varying complexity. It compares against a couple of baselines (ANLC and FOSSIL). Formal verification is performed on the discovered Lyapunov functions using various methods.

Theoretical Claims

I did not check for the correctness of any proofs or theoretical claims.

Experimental Designs or Analyses

The experiments are quite thorough and the design seems valid. The evaluation metrics are clear and the ablation studies are well designed.

Limitations of the experiments:

  1. Alfarano et al. (2024) was excluded from direct comparison due to its focus on global Lyapunov functions
  2. Limited to only two neural network-based baselines
  3. No comparison with traditional symbolic regression or optimization-based methods

Supplementary Material

No, I did not review the supplementary material.

Relation to Prior Literature

The topic is closely related to learning-based search and can be applied to many areas of science, not just control theory. The approach could be adapted to discover invariants and correctness certificates for software verification: instead of dynamical systems, the framework could analyze program state transitions and generate invariants that prove program correctness. Automated theorem proving could be another area of application.

Essential References Not Discussed

None that I know of.

Other Strengths and Weaknesses

Weakness:

  1. The method is quite complicated and contains multiple parts. It's difficult to understand which part provides the most gains.

Other Comments or Suggestions

Could not find typos.

Author Response

We thank reviewer SwEU for the valuable time and constructive feedback. We provide point-by-point responses below.

Q1: Alfarano et al. (2024) was excluded from direct comparison

To complete the comparison, we contacted the authors of Alfarano et al. (2024), who evaluated their pre-trained model on our test systems. Due to constraints related to global stability and the dimensionality of their training data, their model can only find Lyapunov functions for the systems in Appendices F.1, F.2, F.3, & G.1. These systems have dimension lower than 6, and the found Lyapunov functions offer guarantees of global stability. We will include these results in the experiment section of the final paper.

Q2: No comparison with traditional symbolic regression or optimization-based methods.

  1. We conducted a comparison with the Genetic Programming (GP) algorithm, a representative symbolic regression method, on the 6-D polynomial dynamics; see Appendix H.3 of the manuscript. Our method achieves a 100% success rate within 20 epochs, while GP alone failed all trials. As shown in Figure 7, although GP initially improved generation quality (measured by the Lyapunov risk reward) rapidly from random initialization, its evolutionary exploration failed to identify any valid Lyapunov functions due to insufficient incorporation of system dynamics.

  2. We added the sum-of-squares (SOS) method as an optimization-based baseline. In our experiments, SOS techniques successfully identify Lyapunov functions for polynomial systems of dimension up to 3 (Appendix F.1, F.2, & F.3) but fail to retrieve any valid local Lyapunov functions for the polynomial test systems of dimension $\geq 6$ in Appendix F.4, F.5, & F.6. Due to the character limit, please refer to our response to Reviewer s9Wv (Q1) for the detailed comparison setup and results.

  3. In addition, we applied the SOS method to non-polynomial systems by recasting the non-polynomial dynamics into polynomial form. It successfully identifies a valid Lyapunov function for the pendulum system (Appendix G.1) but suffers from scalability issues on complex high-dimensional non-polynomial systems and requires substantial domain expertise in the recasting process and in the design of the (in)equality constraints. Please refer to our response to Reviewer uz6L (Q1) for detailed results and analysis.

We will highlight the comparisons to classic symbolic regression methods (GP) and add the comparison to optimization-based SOS method in the final version.

W1: It's difficult to understand which part provides the most gains.

Thanks for the valuable feedback. The backbone of our framework is a symbolic transformer trained via a risk-seeking policy gradient algorithm to efficiently discover Lyapunov functions; this provides the most gains. To clarify the gains from the other parts of the model design, we performed detailed ablation studies in the appendices. The global optimization module evaluates the generated expressions and guides the training, as shown in Appendix H.2 (Figure 6). The GP module provides expert guidance to accelerate training, as illustrated in Appendix H.3 (Figures 7 & 8). Together, the global optimization and GP modules assist the training of the symbolic transformer.

Q4: Exploration-exploitation trade-off in your approach:

Thanks for this insightful question. In our framework, exploration occurs in the candidate expression generation step and in GP's evolutionary operations, while the risk-seeking policy gradient and the expert guidance loss are responsible for exploitation.

As symbolic tokens in candidates are sampled according to a learned conditional probability distribution (the output of the decoder), and the GP algorithm employs mutation, crossover, and selection operations to introduce randomness into the search for higher-quality candidates, exploration is encouraged in both steps.

For exploitation, the risk-seeking policy gradient optimizes around high-quality expressions, reinforcing high-reward candidates to focus the search on promising regions. The expert guidance loss also facilitates exploitation by encouraging the learned distribution to match the best-known candidates. We tuned the risk-seeking hyperparameter $\alpha$ to achieve the trade-off; in practice, we find $\alpha=0.1$ achieves the best performance, with details available in Appendix H.1.
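The core of the risk-seeking update can be sketched in a few lines. This is our simplification of a Petersen et al.-style risk-seeking policy gradient (the function name and toy rewards are ours): only candidates above the empirical $(1-\alpha)$-quantile contribute, weighted by their advantage over that quantile.

```python
# Minimal sketch of risk-seeking policy-gradient weighting (our
# simplification; function name and example rewards are illustrative).
def risk_seeking_weights(rewards, alpha):
    """Keep only the top alpha-fraction of sampled candidates, weighting
    each by its advantage over the empirical (1 - alpha)-quantile, so the
    gradient targets best-case rather than average-case performance."""
    n = len(rewards)
    r_alpha = sorted(rewards)[int((1 - alpha) * n)]  # (1-alpha)-quantile
    return [max(r - r_alpha, 0.0) for r in rewards]

# Each weight multiplies grad log p(candidate) in the policy update;
# low-reward candidates receive weight 0 and are ignored.
weights = risk_seeking_weights([0.1, 0.9, 0.4, 0.7], alpha=0.5)
print(weights)
```

Shrinking $\alpha$ sharpens the filter toward the very best candidates (more exploitation); $\alpha=1$ recovers the standard average-reward policy gradient.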

Final Decision

This paper presents a novel reinforcement learning (RL) framework that uses a symbolic transformer to discover analytical Lyapunov functions for nonlinear dynamical systems. The approach combines a risk-seeking policy gradient, a numerical verifier based on global optimization, and optional guidance from genetic programming. Reviewers found the paper potentially impactful, particularly due to its symbolic formulation, ability to handle non-polynomial systems, and success in discovering Lyapunov functions for challenging benchmarks such as the 2-bus lossy power network.

A key strength highlighted by reviewers is the use of a symbolic transformer, which contrasts favorably with prior approaches that rely on parameterized neural models. The method demonstrates strong empirical performance across a diverse suite of benchmarks, notably solving cases that were previously unsolved and overcoming limitations of dimensionality and polynomial constraints found in prior work.

Before the rebuttal, reviewers raised concerns about the lack of direct comparisons with established SDP/SOS-based techniques, which can scale up to 10-dimensional systems. The rebuttal partially addressed these concerns, but a more direct empirical comparison should be included in the camera-ready version. Additionally, several reviewers suggested that the paper would benefit from a theoretical or statistical analysis of the symbolic transformer's behavior, e.g., convergence guarantees or completeness. The authors should clearly address this in the revision. Further motivation or an ablation justifying the design choice of soft verification would also strengthen the paper.

In summary, this work introduces a promising new direction for learning-based discovery of analytical Lyapunov functions, effectively bridging symbolic synthesis and reinforcement learning. The rebuttal has helped alleviate the major concerns raised by reviewers. The meta reviewer recommends acceptance, contingent on the authors incorporating the suggested additional comparisons and clarifications in the final version.