PaperHub
Score: 6.4/10
Decision: Rejected (4 reviewers)
Ratings: 5, 5, 2, 4 (min 2, max 5, std 1.2)
Confidence: 3.3
Novelty: 3.0 · Quality: 3.0 · Clarity: 3.0 · Significance: 2.5
NeurIPS 2025

Discovering Symbolic Differential Equations with Symmetry Invariants

OpenReview · PDF
Submitted: 2025-04-16 · Updated: 2025-10-29
TL;DR

Enforcing symmetry via the use of differential invariants in symbolic differential equation discovery algorithms to improve their accuracy and efficiency.

Abstract

Keywords

equation discovery, symbolic regression, symmetry, PDE, Lie point symmetry, equivariance

Reviews and Discussion

Review
Rating: 5

This paper addresses the problem of discovering partial differential equations (PDEs) under symmetry constraints by employing differential invariants of the symmetry group as the variable set in symbolic regression (SR) algorithms. The authors provide a theoretical analysis demonstrating that differential invariants are capable of capturing the full range of PDEs that can be discovered under such constraints. Additionally, the paper illustrates how to effectively incorporate differential invariants into three commonly used SR algorithms and presents evaluation results across a variety of dynamical systems to demonstrate improved performance.

Strengths and Weaknesses

Strengths:

  1. [Significance] The paper is well-motivated and addresses an important challenge in symbolic regression for PDEs—namely, how to effectively incorporate prior knowledge about invariance structures into the discovery process.
  2. [Novelty] The idea of using differential invariants instead of the original variables is both novel and theoretically grounded, offering a principled approach to encoding symmetry constraints.
  3. [Clarity] The proposed method is clearly presented, with consistent notation and well-structured formulations. The paper is well-written and accessible to readers with a relevant technical background.
  4. [Quality] The authors explore multiple integration strategies for incorporating differential invariants into SR algorithms and evaluate their performance on a range of challenging scenarios, including cases involving imperfect symmetry and noisy data.

Weakness:

The limitations of the approach are relatively minor and are adequately discussed by the authors.

Questions

A major advantage of the proposed approach is its compatibility with various existing algorithms, such as symbolic regression (SR) and SINDy. However, all the methods discussed in the paper rely on RMSE (with regularization) as the objective function. It would also be worthwhile to discuss and clarify in the paper whether the proposed approach can be extended to different objective functions, such as the variational objective used in [1].

[1] D-CIPHER: Discovery of Closed-form Partial Differential Equations. NeurIPS 2023.

Limitations

Yes

Final Justification

The authors have addressed my concerns in the rebuttal with additional experimental results.

Formatting Issues

N/A

Author Response

We thank the reviewer for the constructive feedback. We address the question below:

A major advantage of the proposed approach is its compatibility with various existing algorithms, such as symbolic regression (SR) and SINDy. However, all the methods discussed in the paper rely on RMSE (with regularization) as the objective function. It would also be worthwhile to discuss and clarify in the paper whether the proposed approach can be extended to different objective functions, such as the variational objective used in D-CIPHER [1].

First, we would like to note that we have implemented our method on Weak SINDy, which uses a variational objective. Weak SINDy is used in all experiments in Sec. 4.4.

Nonetheless, this is a great suggestion, and we have conducted an additional experiment to implement our method on D-CIPHER and tested both the original D-CIPHER algorithm and our symmetry-enforced method on the Darcy flow dataset. The results are available in the table below.

| Method | Discovered equation |
| --- | --- |
| D-CIPHER | $u_y + 1.16\, u_{xy} = 18.74\,(u + e^u - 1.35)\, e^{y^2}$ |
| D-CIPHER$^*$ | $x u_x + y u_y - 2.09\, x u_y - 2.09\, y u_x - 0.19\, u_y = 7.98\, x^2 y^2 + 2.51\, xy + 0.80$ |
| D-CIPHER-SI (ours) | $(x u_x + y u_y) - 0.13\,(u_{xx} + u_{yy}) = 0.13\, e^{4.12(x^2+y^2)}$ |

We further explain the three methods in the table. The first method, D-CIPHER, uses the original implementation from the D-CIPHER official codebase. While their formulation of the extended differential operator in the paper is $a(\mathbf x)\,\partial_\alpha [h(\mathbf x, \mathbf u)]$, their implementation only addresses operators of the form $\partial_\alpha [h(\mathbf x, \mathbf u)]$. This does not cover terms like $x u_x$ and $y u_y$ in the Darcy flow equation. Therefore, in the second method of the table, D-CIPHER$^*$, we extended their code implementation by allowing function coefficients $a(\mathbf x)$ in the operator. Only in this way is the method theoretically capable of finding the correct functional form of Darcy flow. The final row represents our method, where we apply the procedure in Appx. C.3 (Lines 778-809) to provide D-CIPHER with a set of rotationally invariant differential operators and fundamental rotational invariants, so that the spatial rotation symmetry is enforced in any discovered equation. More specifically, the inner optimization of D-CIPHER solves for a 5-dimensional coefficient vector over 5 invariant operators in a constrained least-squares problem, and the outer optimization uses genetic programming to find a free-form expression of the 2 fundamental rotational invariants, $x^2 + y^2$ and $u$.

It can be seen from the table that our method can find the correct functional form, while D-CIPHER with the original variables and derivative operators cannot. The benefit of symmetry is even greater here for D-CIPHER than for other SR methods like SINDy, because D-CIPHER requires the user to specify both the function coefficient $a(\mathbf x)$ and the function to be differentiated $h(\mathbf x, \mathbf u)$ for an extended derivative (i.e., step 1 in their Figure 1). Such choices of functions can be largely arbitrary if no prior knowledge is available. On the other hand, our symmetry-based approach automatically selects this dictionary of differential functions.

In the revision, we will include this experiment and clarify that our approach can be extended to different objective functions, such as the variational objective.

Comment

Thank you for the additional clarification and experiments. I'll keep the positive review.

Review
Rating: 5

The submission addresses the problem of discovering symbolic (differential) equations with known symmetries. The key contribution is to incorporate differential invariants into the set of candidate functions for equation discovery (e.g., via SINDy, SR) to automatically satisfy symmetry constraints like SO(2). Experiments show improved performance over symmetry-agnostic baselines on differential equations with known symmetries.

Strengths and Weaknesses

For context, I am writing this review as someone with more background in differential equation discovery than in differential invariants. For example, I have not read Olver's (1993) book. This background affects my evaluation because some parts of the paper assume certain prior knowledge (I mention this as a weakness below). Regardless, I enjoyed reading the paper, and I think it should be accepted. Below, I detail the strengths and weaknesses of the submission.

Strengths

The proposed algorithm, which combines equation discovery methods such as SINDy or SR with candidate functions that are certain differential invariants, is a convincing approach to enforcing symmetry in equation discovery. It is not surprising that the proposed approach outperforms non-symmetry-enforcing ones on problems that have this kind of structure. I especially appreciate the careful presentation: even if some nuances of the technical aspects might require relatively advanced backgrounds in, for example, Lie groups, the rest of the paper is still easy to follow.

Weaknesses

As mentioned above, the mathematical formulation is more involved than some of the papers that cover a similar subject; e.g., it assumes knowledge of "jet spaces" or "prolongations", which might not be typical knowledge at NeurIPS. I understand that these concepts may be necessary for the mathematical setup. However, including some examples, mainly in Section 2.1 and Appendix B.2, would make the content a bit more accessible in my view. I also recommend removing terminology like "trivial", e.g., in Appendix B. However, I leave this decision to the author's discretion.

Another weakness, which is also presentational, is the main paper's heavy reliance on cross-references to the appendix. I think it is okay that the proofs of the propositions and some detailed experiment setups are deferred to the appendices. However, the paper delegates related work, examples for computing differential invariants, implementation details, and most experiments to the appendices, and sometimes explicitly links equations from the appendix in the main paper (e.g. in line 222). Overall, I like the submission, but the reliance on back-and-forth cross-referencing across 30+ pages makes it difficult to navigate.

Summary

I recommend accepting this work. Most weaknesses I identify pertain to the paper's presentation, particularly the amount of background knowledge required in certain parts of this paper and the extensive usage of the appendix. Nonetheless, I believe the conceptual contribution -- enforcing symmetries via differential invariants -- is novel, well-executed, and makes a valuable addition to the (differential) equation discovery literature.

Questions

  • Line 71: "though generalization is possible" - could you maybe elaborate?
  • Line 108/109: "we estimate the partial derivative terms" - how expensive is that? Would it be problematic for high-order differential operators?
  • Line 276: Why is $\eta_{(1,0)}$ excluded?

Limitations

/

Final Justification

My original evaluation already appreciates the contributions of this work and recommends acceptance. The rebuttal clarified some specific questions about the presentation, which I believe improve the work (but not to the point of raising my score to a 6). Therefore, I continue recommending acceptance with a score of 5.

Formatting Issues

/

Author Response

We thank the reviewer for the constructive feedback. We address the questions/concerns below:

Presentational issues

We thank the reviewer for pointing out the presentational issues. In Sec. 2.1, we will elaborate on the intuition of prolonged group actions on jet spaces: to analyze the symmetry of a PDE, we must know how the group transforms not just the variables, but also their derivatives. We will also include a visualization of the prolonged group action on a 2D space in the revision. We will also change the terminology in Appx. B, where “trivial” refers to the identity transformation/function in that context.

As suggested by the reviewer, we will also reduce the paper’s reliance on cross-references to the appendix in the revision. We will make sure to include important implementation details and experiment results, such as those in Appx. C.1, C.2 and D.2, in the main text. We will remove the explicit link to equations in the Appendix in L222, and instead write out the full example.

L71: Elaborate on generalization to multiple variables and equations

Here, we mean our method can also be applied to discovering PDEs or systems of multiple PDEs containing more than one dependent variable ($q>1$). Several definitions/statements need to be extended in this case. For example, the infinitesimal generator in Eq. (3) becomes $\mathbf v = \xi^j \frac{\partial}{\partial x^j} + \phi^k \frac{\partial}{\partial u^k}$ to account for the multiple dimensions in $\mathbf u$. The computation of differential invariants still follows the same procedure; however, in Prop. 3.3, because the total space becomes $X \times U \simeq \mathbb R^p \times \mathbb R^q$, the number of lower-order invariants needed also becomes larger: $q_n = q \cdot \binom{p+n-1}{n}$. The rest of our method remains the same.
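As a quick sanity check of this count (an illustrative example of ours, not taken from the paper): with $p=2$ independent and $q=2$ dependent variables, each dependent variable has $\binom{p+n-1}{n} = \binom{3}{2} = 3$ derivatives of order $n=2$ (namely $u_{xx}$, $u_{xy}$, $u_{yy}$), so $q_2 = q \cdot \binom{p+n-1}{n} = 2 \cdot 3 = 6$.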

In Sec. 2, we restricted the setting to a single dependent variable to avoid introducing too much notational and conceptual complexity at once. This is used as the default setup in the paper. When the discussion of multiple dependent variables is needed, we explicitly mention such generalization, such as in Prop. 3.4. Again, this is a choice of how we present the paper – we could also make the most general statement at the beginning, possibly at the cost of increased presentational complexity and less clarity. We’d appreciate the reviewer’s thoughts on this.

Line 108/109: "we estimate the partial derivative terms" - how expensive is that? Would it be problematic for high-order differential operators?

In our experiments, we use the central finite difference method to estimate the partial derivatives (e.g., using numpy.gradient). The computational complexity is linear w.r.t. the dataset size and the differential order, so it is not expensive even for high-order differential operators. The problem for high-order derivative estimation is not computational cost, but robustness to noise. A small noise in the solution function would lead to a highly inaccurate estimation of high-order derivatives with the finite difference method. In that case, we use the variational objective proposed by WSINDy [1] to bypass the need for derivative estimation.
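For reference, here is a minimal sketch of this estimation step (an illustrative example of ours, not the paper's code), assuming the solution is sampled on a uniform (t, x) grid:

```python
import numpy as np

def estimate_derivatives(u, dt, dx):
    """u: solution values on a uniform (t, x) grid with spacings dt, dx."""
    u_t = np.gradient(u, dt, axis=0)        # central finite differences in time
    u_x = np.gradient(u, dx, axis=1)        # first spatial derivative
    u_xx = np.gradient(u_x, dx, axis=1)     # each extra order is one more linear-cost pass,
    u_xxx = np.gradient(u_xx, dx, axis=1)   # but measurement noise is amplified at every pass
    return u_t, u_x, u_xx, u_xxx
```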

References

[1] Messenger, D. A. and Bortz, D. M. Weak sindy for partial differential equations. Journal of Computational Physics, 443:110525, 2021a.

Line 276: Why is $\eta_{(1,0)}$ excluded?

Because $\eta_{(1,0)} = \frac{u_x}{u_x} = 1$ is a constant. Constant terms are already handled by symbolic regression algorithms. Thus, we do not add a constant function to the SINDy library or a constant feature to genetic programming.

Comment

Thank you for the helpful reply!

Regarding presentation: thanks for these changes, they sound helpful.

Regarding L71: Thanks for elaborating. Maybe summarising the explanation in a subclause to L71 would help, but this is quite minor. In general, I like that the explanation in Section 2 starts with intuition, not with the most general result, I was simply asking for a brief "how" when it comes to generalising because while reading L71 it was not immediately obvious whether this generalisation would be easy or difficult.

Line 108: Thanks for the explanation. Is this explained in the submission and I have missed it? If not, it would be good to include it.

Thanks again for the replies! My review already recommends accepting this work, so I will maintain my score.

Review
Rating: 2

This paper proposes a method that uses symmetry invariants of PDEs, instead of the original variables (i.e., u, x and their differentials), for equation discovery. Given a solution of an unknown PDE with known symmetries, this algorithm first computes its symmetry constraints (up to a predefined order K) and then runs various symbolic equation discovery algorithms.

Strengths and Weaknesses

Strengths:

This method proposes a novel equation discovery algorithm for PDEs. It actively uses symmetry invariants, which are likely to be more ‘important’ or ‘intrinsic’ variables than the original variables $u$ and $x$. The paper shows that finding symbolic equations can be facilitated by using the symmetries of the differential equations.

Weaknesses:

  • This method considers the scenario that the symmetries of the equations are known while the equations themselves are unknown. Is this a reasonable assumption? In a real-world scenario, is there a situation where the equations are known, symmetries are unknown?
  • Isn’t this method just a reparametrization of the original variables ($x$, $u$ and their differentials) by symmetry invariants? Can you clarify any more novelties of this method?
  • Is this method applicable when the symmetries are more complicated, so that the symmetry invariants are difficult to compute? The paper mentioned a generic algorithm for computing symmetry invariants, but didn’t use it – it instead used a symbolic computation of symmetry invariants for simple symmetries.
  • This paper only verified the method with three PDEs, having small numbers of variables. Is this method applicable for more general, real-world equations?
  • e.g. The Darcy flow contains an explicit $x^2 + y^2$ term, and the reaction-diffusion contains an explicit $u^2 + v^2$ term – they are symmetry invariants. Aren’t these equations well-chosen equations that work especially well with this method?

Questions

  • The paper mentioned a generic algorithm for computing differential invariants (Proposition 3.3), but in fact, it used precomputed differential invariants. Is it feasible to compute differential invariants using Proposition 3.3? For example, there are determinant operations, which might cause numerical instability.
  • How beneficial is it that this method can handle non-SINDy-format (i.e. not linearized) equations? e.g. fractions can be linearized, and even when there are non-linearizable terms, series expansion may handle them.

Limitations

yes

Final Justification

After reading the rebuttal and engaging in the discussion, I appreciate the authors’ clarifications. However, I remain concerned that the assumption of knowing the Lie point symmetries before knowing the equation is unrealistic, and that the novelty of this method is limited. I maintain my score.

Formatting Issues

None

Author Response

Thank you for the constructive feedback. We address the questions below:

  1. This method considers the scenario that symmetries are known while equations themselves are unknown. Is this a reasonable assumption?

Yes, this is a reasonable assumption. As we mentioned in Sec. 5, this assumption aligns with common practice – physicists often begin by hypothesizing the symmetries of a system and seek the most economical equations respecting those symmetries. Classic examples include deriving Navier-Stokes from rotational+translational symmetry and conservation laws, writing effective field-theory Lagrangians by enumerating all operators allowed by a symmetry group, etc. The assumption of symmetry is also commonly used in related works on equation discovery [1,2].

References

[1] Gurevich, Daniel R., et al. "Learning fluid physics from highly turbulent data using sparse physics-informed discovery of empirical relations (SPIDER)." Journal of Fluid Mechanics 996 (2024): A25.

[2] Messenger, Daniel A., Joshua W. Burby, and David M. Bortz. "Coarse-graining Hamiltonian systems using WSINDy." Scientific Reports 14.1 (2024): 14457.

  2. In a real-world scenario, is there a situation where the equations are known, symmetries are unknown?

Such cases indeed exist. Some complex PDEs are written down before their full Lie-symmetry structure is classified. A separate line of work tackles this task of automated symmetry detection. We have cited those works in Sec. 5 and mentioned the integration of their methods and ours as future work.

While interesting, that scenario is orthogonal to our contribution: we show how given symmetry knowledge can be injected into any symbolic regression method to reduce the hypothesis space and guarantee physically valid outputs.

Note: We did not quite understand the transition from the previous question to this one. If the reviewer meant the reverse, please refer to the answer to Q1.

  3. Isn’t this method just a reparameterization of the original variable by symmetry invariants?

No, our contributions are much more than that. Simply reparameterizing the variables is not enough because it is not compatible with some symbolic regression (SR) algorithms, such as GP which fits an explicit equation, and Weak SINDy which uses a variational objective. Thus, in addition to this general idea of using invariants:

  • We developed Alg. 1 and Alg. 2 (in Appx. C.1) to implement this idea in different SR methods, with the additional procedure of LHS selection. As mentioned in Sec. 3.3, these adaptations are necessary because base SR methods have their own hypothesis spaces for equations, and we aim to implement the intersected hypothesis spaces (Fig. 2).
  • For sparse regression (SINDy), we developed another technique to convert the symmetry condition (Thm. 3.2) to linear constraints on SINDy parameters. This allows us to impose symmetry on SINDy and its variants (e.g., Weak SINDy) easily. In this case, we do not reparameterize the variables, but use a derived theoretical result (Prop. 3.4) for easier implementation when it is applicable.
  • In Sec. 3.4, we also described how to relax the symmetry constraint in our method. This extends the applicability of our method to systems with imperfect symmetry. The experiments showed that our “SI-relaxed” variant still outperforms an unconstrained baseline, showing that symmetry information, even if slightly misspecified, provides visible benefits.
  4. Is this method applicable when the symmetries are more complicated, so that the symmetry invariants are difficult to compute? The paper mentioned a generic algorithm for computing symmetry invariants, but didn’t use it.

This is a misunderstanding. We did use the generic algorithm – see Appx. B.4, for example, for a full derivation for the rotation symmetry. And yes, the method is applicable in generic cases. The first step solves a first-order linear PDE (the characteristic equations for the Lie generators). This is algorithmic and implemented in standard computer-algebra systems such as sympy (pdsolve, for relatively simple systems) and Mathematica (DSolve). Then, the second step uses Eq. (6) from Prop. 3.3 to recursively construct higher-order functions. This reduces to symbolic differentiation and multiplication, again provided by off-the-shelf CAS libraries.
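To illustrate how little machinery this requires (a minimal sketch of ours, not the paper's code): treating the jet coordinates as independent symbols, one can verify candidate invariants of the planar rotation symmetry by checking that the prolonged generator annihilates them:

```python
import sympy as sp

# Jet coordinates treated as independent symbols; rotation acts on (x, y), u is a scalar.
x, y, u, ux, uy = sp.symbols('x y u u_x u_y')

def pr_v(f):
    # First prolongation of the rotation generator v = -y*d/dx + x*d/dy (phi = 0):
    # pr v = -y*d/dx + x*d/dy - u_y*d/du_x + u_x*d/du_y
    return (-y*sp.diff(f, x) + x*sp.diff(f, y)
            - uy*sp.diff(f, ux) + ux*sp.diff(f, uy))

# Fundamental and first-order differential invariants of the rotation symmetry
for inv in [x**2 + y**2, u, x*ux + y*uy, ux**2 + uy**2]:
    assert sp.simplify(pr_v(inv)) == 0     # pr v(inv) = 0 means inv is invariant
```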

Regarding more complicated symmetry: although some classical PDEs, e.g. the 1D heat equation, admit large (even infinite-dimensional) Lie-symmetry algebras once the equation is known, our setting is the reverse: the equation is unknown and the symmetry is supplied a priori as an inductive bias. Hence, we typically posit only the symmetries that are evident from empirical observations or fundamental principles, such as spatial homogeneity (translations), isotropy (rotations), simple scalings, etc, not the full algebra that might emerge later. In other words, the groups we consider are often low-dimensional products of these elementary transformations, all readily handled by the method in Sec. 3.2. Moreover, should an unusually large group be encountered, the pipeline still functions; the only effect is a larger (but still finite) invariant basis.

  5. Is this method applicable to more general, real-world equations?

Yes. To demonstrate this, we conduct another experiment on a simulated dataset for the Navier-Stokes equation. We follow the setup in [3], aiming to discover the vorticity equation $\omega_t = -\partial_x(\omega u) - \partial_y(\omega v) + 0.01\,\Delta\omega$. The following table shows the results. Our method uses invariants of the rotation symmetry and has fewer variables and functional terms. As a result, it can discover a parsimonious equation matching the functional form of the ground truth.

| Method | Number of variables/terms | Discovered equation |
| --- | --- | --- |
| SINDy | 16, 136 | $\omega_t = 58.17\,\omega + 57.41\, u_y - 59.85\, v_x + 0.01\,\omega_{yy} - 0.74\, v\,\omega_y + \dots$ |
| SINDy-SI (ours) | 10, 55 | $\omega_t = -0.99\, u\,\omega_x - 0.99\, v\,\omega_y - 1.26\,\omega u_x - 1.26\,\omega v_y + 0.01\,\omega_{xx} + 0.01\,\omega_{yy}$ |

We note that the result is based on our implementation of SINDy and our own dataset. The code and data for exactly reproducing the results in [3] are unavailable. Nonetheless, this result shows that our method using symmetry invariants is better than using original variables under the same backbone method implementation and data for equation discovery.

[3] Messenger, Daniel A., and David M. Bortz. "Weak SINDy for partial differential equations." Journal of Computational Physics 443 (2021): 110525.

  6. The selected equations contain explicit invariant terms. Aren’t those well-chosen equations that work especially well with this method?

They are not deliberately chosen, but rather, those invariant terms arise naturally once the assumed symmetry holds. For example, the rotational invariance of Darcy flow implies isotropic permeability, and $x^2 + y^2$ is the only ordinary function satisfying that property. These terms were not injected to “fit” our algorithm; they are the physically correct building blocks that any symmetry-respecting model would employ.

Also, our experiments tested beyond the perfect symmetry setting. In Sec. 4.4, we showed that even if there is symmetry breaking in the system and non-invariant terms in the equation, our method can still discover the correct equations via constraint relaxation. This further demonstrates that our method does not work only on tailored problems.

  7. The paper mentioned a generic algorithm for computing invariants, but in fact, it used precomputed invariants.

This is a misunderstanding. Please see our answer to Q4 for details. In short, the “precomputation” is exactly implementing the generic algorithm in Sec. 3.2.

  8. Is it feasible to compute differential invariants using Prop. 3.3? E.g., there are determinant operations, which might cause numerical instability.

Yes, it is feasible. In Prop. 3.3, we compute the invariant functions instead of their evaluations on data points, which is purely symbolic and does not involve numerical issues.

We do note that evaluation of the precomputed invariant functions can cause numerical problems such as division by 0. However, handled properly, these numerical problems won’t affect the feasibility of the method. For example, we might encounter a rational function $f(x, u^{(n)}) / g(x, u^{(n)})$ as one of the invariants. In this case, we apply a filter $|g| > \epsilon$ for some small threshold $\epsilon$ to the dataset to discard large values that destabilize the regression. We will make sure we describe such procedures in the experimental details section in the revision.
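A minimal sketch of this filtering step (illustrative only; the function name and threshold value are our placeholders):

```python
import numpy as np

def evaluate_rational_invariant(f_vals, g_vals, eps=1e-6):
    """Evaluate an invariant of the form f/g on data, discarding ill-conditioned points."""
    mask = np.abs(g_vals) > eps            # keep points where the denominator is safely nonzero
    return f_vals[mask] / g_vals[mask], mask
```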

  9. How beneficial is the ability to handle non-SINDy-format (not linearized) equations? Even when there are non-linearizable terms, series expansion may handle them.

The ability to handle “non-linearized” equations is critical. The series expansion of nonlinear terms introduces infinite power series and explodes the search space. While truncation is possible, we may still need higher-degree polynomials in order for the equation to match the data accurately. As a result, the regression is less efficient due to a larger function library, less accurate due to truncation errors, and the results are less interpretable due to the mix of expanded terms from nonlinear functions. Apart from our method, other related works, such as D-CIPHER [4] mentioned by other reviewers, also used genetic programming to handle non-SINDy-format equations and demonstrated its advantage over SINDy-based approaches.
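As a concrete illustration (our own example, using the Darcy flow term from the table above): representing a single term like $e^{4.12(x^2+y^2)}$ in a polynomial library requires the infinite series $\sum_{k \ge 0} \frac{4.12^k}{k!}(x^2+y^2)^k$; any finite-degree truncation both enlarges the library with many extra monomials and leaves a residual truncation error, whereas a free-form representation keeps it as a single interpretable term.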

[4] Kacprzyk, Krzysztof, Zhaozhi Qian, and Mihaela van der Schaar. "D-CIPHER: discovery of closed-form partial differential equations." Advances in Neural Information Processing Systems 36 (2023): 27609-27644.

We are happy to discuss further if there are remaining questions. If your concerns have been addressed, we kindly request that you adjust the evaluation accordingly.

Comment

I thank the authors for their responses, and some of my concerns (2, 4, 6, 7, 8, 9) are addressed. However, I have some concerns remaining about this paper.

  1. Lie point symmetries are highly PDE-specific and can be quite nontrivial. Some are elementary—such as translations or rotations—whereas others are far more intricate. For example, the KdV equation $u_t + 6 u u_x + u_{xxx} = 0$ admits the dynamical-scaling generator $3t\partial_t + x\partial_x - 2u\partial_u$ [1]. The cubic nonlinear Schrödinger equation $i\phi_t + \phi_{xx} + 2|\phi|^2\phi = 0$ has a “pseudo-conformal inversion” symmetry generated by $t^2\partial_t + tx \partial_x + \frac{i}{2}x^2 \phi \partial_{\phi} + t\phi \partial_{\phi}$ [2].

Because of this variety, identifying a Lie point symmetry is not as straightforward as, e.g., recognizing that the isometry group of $\mathbb{R}^3$ is $SO(3)$. Could you give concrete guidance—or name an explicit class of physical equations—for which your method applies? In other words, how large is the shaded region in your Venn diagram (Figure 2)?

  2. The novelty of the proposed work is still uncertain.

(a) (SR) Algorithm 1 is a brute-force application (for LHS search) of an SR algorithm using symmetry invariants.

(b) (SINDy) If I understood it correctly, the algorithm explained in the “sparse regression” section could be seen as a parametrization of the symmetry invariants by linear combinations of library functions, imposing linear constraints on $W$. This method seems novel, but its effectiveness seems to depend strongly on the particular library selected. Could you clarify how broadly applicable the method is and how sensitive it is to the choice of basis functions?

[1] Cantwell, Brian J. Introduction to Symmetry Analysis. Cambridge University Press, 2002.

[2] Devi, Preeti, and K. Singh. “Lie Symmetry Analysis of the Nonlinear Schrödinger Equation with Time Dependent Variable Coefficients.” International Journal of Applied and Computational Mathematics 7 (2021): 23.

Comment

Thank you for following up! We are glad that our previous response addresses most of your concerns. We address the remaining below:

Lie point symmetries are highly PDE-specific and can be quite nontrivial. Some are elementary—such as translations or rotations—whereas others are far more intricate. Because of this variety, identifying a Lie point symmetry is not always straightforward. Could you give concrete guidance on where your method applies?

We agree with the reviewer’s point that some Lie point symmetries can be intricate and PDE-specific. However, our method is intentionally aimed at the relatively simple symmetries, such as products of rotations, translations, scalings, etc, which we can usually read off from basic experiments or first principles before the governing equation is known. These symmetries are ubiquitous in a wide range of physical systems, and their presence can be easily identified by the infinitesimal criterion for Lie point symmetry (Thm 2.31, [1]). Assuming such elementary knowledge about symmetry therefore aligns with common practice and injects only a moderate inductive bias, rather than the full, equation-specific Lie algebra that would possibly reveal the PDE itself immediately.

We deliberately did not incorporate more intricate symmetries, because those are rarely known a priori. For example, in the reference you provided, identifying the conformal symmetry, as well as other symmetries, of the nonlinear Schrödinger equation requires extremely tedious calculations, even when we know the equation. Discovering such symmetries from data is itself an open research problem and is beyond the scope of this paper. Admittedly, integrating automatic symmetry discovery tools with our method would be a powerful extension. We have highlighted this synergy as future work in Sec. 5.

[1] Olver, P. J. Applications of Lie groups to differential equations, volume 107. Springer Science & Business Media, 1993.

Novelty

First, we would like to highlight the value of our key contribution of switching to using invariant functions instead of regular variables in equation discovery. While it is straightforward, it has proved very effective in experiments across different PDE systems, compressing the search space and dramatically increasing the chance of discovering the correct equations. This key principle of our paper is also recognized by other reviewers to be convincing (Reviewer Aa7F), technically sound (Reviewer 5nPf), novel and theoretically grounded (Reviewer KgEF). We also respectfully argue that simplicity does not deny the novelty of our method.

Then, regarding your comments on the other parts of our contribution:

Algorithm 1 is a brute-force application (for LHS search) of SR algorithm using symmetry invariants.

Yes, we agree. However, this LHS search is essential for adapting the use of the invariant functions to SR algorithms designed for finding explicit functions, as we have discussed in Sec. 3.3.

If I understood it correctly, the algorithm explained in the “sparse regression” section could be seen as parametrization of the symmetry invariants by linear combinations of library functions, by imposing linear constraints on W.

Yes, your understanding is correct.

This method seems novel, but its effectiveness seems to depend strongly on the particular library selected. Could you clarify how broadly applicable the method is and how sensitive it is to the choice of basis functions?

This method is applicable when the conditions in Prop. 3.3 are met with the chosen SINDy basis functions and the symmetry invariants. This is true when the SINDy basis functions contain all monomials of variables and derivatives up to a certain degree and order, and the symmetry invariants are also polynomials of those terms. This setup of SINDy basis functions is widely used in related literature, including almost all of the experiments in the seminal works of PDE-FIND [2] and Weak SINDy [3].
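For concreteness, here is a minimal sketch of this constrained fit (an illustrative example, not the authors' implementation; the sparsity-promoting thresholding of SINDy is omitted). If the symmetry condition reduces to linear equality constraints $Cw = 0$ on a coefficient vector $w$, the regression can be solved in the null space of $C$:

```python
import numpy as np
from scipy.linalg import null_space, lstsq

def symmetry_constrained_fit(Theta, dudt, C):
    """Theta: (samples, n_lib) library matrix; dudt: regression targets; C: constraints C @ w = 0."""
    N = null_space(C)               # basis of coefficient vectors satisfying the symmetry constraints
    z, *_ = lstsq(Theta @ N, dudt)  # ordinary least squares in the reduced parameter space
    return N @ z                    # symmetry-consistent coefficient vector w
```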

We would like to additionally comment that, in general, if the assumption from SINDy is correct (that the equation can be written down as a linear combination of given candidate terms), but the conditions in Prop. 3.3 are not necessarily met, then the set of all possible $W$ can be an algebraic variety arising from some rank constraint, instead of a linear space. We will discuss this briefly in the revision, but leave the actual implementation of such symmetry constraints to future work.

[2] Rudy, Samuel H., et al. "Data-driven discovery of partial differential equations." Science advances 3.4 (2017): e1602614.

[3] Messenger, Daniel A., and David M. Bortz. "Weak SINDy for partial differential equations." Journal of Computational Physics 443 (2021): 110525.

We hope this clarifies your questions!

Comment

Dear Reviewer FfNQ,

Thanks for engaging in a discussion with the authors. Your opinion diverges from that of the other reviewers, so it would help my work if you could give a qualified statement during the AC-reviewer discussion period.

If you need further information from the authors, can you try to use the remaining time for clarifying questions or answering their comment?

Thanks, AC

Comment

While I appreciate the authors' responses, I believe key issues remain unresolved. The method's focus on simple Lie point symmetries limits its applicability to more complex PDEs, which are often encountered in practice. Additionally, the novelty of the approach is still unclear, as it heavily depends on specific assumptions about the choice of basis functions. The method’s general applicability is not well-established, and these concerns undermine its impact. Therefore, I maintain my score.

Comment

Thank you for the follow-up. While we believe our previous response already discusses the two points you raised, we would like to elaborate a little further on those.

The method's focus on simple Lie point symmetries limits its applicability to more complex PDEs.

First, it is necessary to distinguish between the complexity of Lie point symmetries and the complexity of PDEs. Many complex PDEs also exhibit simple symmetries from first principles, e.g., the Navier-Stokes equation with spatial rotation symmetry. Thus, our method can be applied to complex PDEs in the real world; see, for example, our initial answer to Q5 for the results on discovering the Navier-Stokes equation in vorticity form.

Then, if the reviewer is still concerned that our method mainly aims at simple symmetries, we would reiterate the explanations in our last response: these simple symmetries are ubiquitous, and their presence can be easily tested numerically, making them accessible inductive biases. Meanwhile, more complex symmetries are often PDE-specific, enumerating the possible forms of these symmetries is nontrivial, and it is difficult to obtain such knowledge before discovering the equation itself.

In terms of methodology, given any symmetry group, our pipeline of computing and using the invariant functions as regression features still works -- see our initial answer to Q4. But we intentionally choose not to consider more complex symmetries because such knowledge is not often available. In other words, this is not an intrinsic limitation of our method, but a limitation of available prior knowledge, which could possibly be addressed by incorporating symmetry discovery methods.

The novelty of the approach is still unclear, as it heavily depends on specific assumptions about the choice of basis functions. The method’s general applicability is not well-established.

We respectfully disagree. Consider our base method of using invariant functions instead of regular variables. It does not require additional assumptions about the choice of (SINDy) basis functions. It is general and can be applied to different SR algorithms and PDE datasets.

We understand that the reviewer has been questioning the novelty of this base method, and may not have referred to it when making the above statement. We would like to again highlight that this method of using invariants, alone, is a valuable and novel contribution recognized by all other reviewers.

Then, we understand that the above statement may specifically refer to the variant of our method that imposes linear constraints on SINDy parameters. We agree that it cannot be applied to any given symmetry group and SINDy basis functions. However, we have clearly established the conditions for it to work, which can be easily verified with a few symbolic calculations. We also show that common setups in related works on SINDy satisfy these conditions. And when these conditions are not met, we can always fall back to the base method, directly using invariant functions as features and targets for SINDy regression.

We hope the reviewer can take these clarifications into account when forming the final judgement. Thank you again for engaging in these discussions.

Review
Rating: 4

Conventional symbolic regression methods face several challenges, including difficulty dealing with the vast search space and the tendency to produce complex but less interpretable equations. Efforts have been made to incorporate symmetry into equation discovery, but they are often limited to specific types of equations or algorithms. This paper aims to fill this gap and proposes a general approach to discovering symbolic differential equations that reflect the intrinsic symmetries of complex systems. Specifically, the proposed strategy constructs a complete set of invariant functions under a given symmetry group to replace the original set of variables used in symbolic regression. The authors incorporated their new approach into three algorithmic classes: (1) sparse regression, (2) genetic programming, and (3) a pretrained symbolic transformer. The authors compare the performance of their symmetry-invariant method with the original approach that uses standard jet space variables. They performed experiments on three different PDE systems with clean and noisy datasets as well as imperfect symmetry. The results show relatively robust performance of the symmetry-invariant method, and the symmetry-invariant relaxed strategy outperforms the baseline as the symmetry-breaking factor increases.

Strengths and Weaknesses

Strengths:

  1. Symmetry plays a key role in physical laws and equations, so it is expected that incorporating symmetry into equation discovery shows superior performance compared to regular symbolic regression methods. The authors proposed a systematic and general approach for constructing symmetry invariants for a given symmetry group. The effectiveness of this approach is demonstrated in the experimental results.
  2. The proposed strategy appears technically sound, and the authors' claims are supported by experimental results on three different PDE systems and three SR algorithms.
  3. The paper is clearly written and well organized. The authors provided more details in the supplementary materials.

Weaknesses:

  1. The new approach using symmetry invariants as the variable set relies on explicitly known symmetry groups without end-to-end learnable capability to extract symmetries from data, making it less flexible for data-driven discovery when prior knowledge is limited. Thus, the application of this approach is limited.
  2. Although the authors performed experiments on sparse regression, genetic programming, and transformer methods, there is a lack of sufficient comparison to recent symbolic regression methods developed for PDE systems in the related work and benchmarks, e.g. Ref. [1] Kacprzyk et al. (2023).

Reference:

  1. Kacprzyk, Krzysztof, Zhaozhi Qian, and Mihaela van der Schaar. "D-CIPHER: discovery of closed-form partial differential equations." Advances in Neural Information Processing Systems 36 (2023): 27609-27644.

Questions

  1. How does this symmetry invariant approach compare with recent symbolic regression methods developed for PDE systems in the related work and benchmarks, e.g. Ref. [1] Kacprzyk et al. (2023)?

  2. The authors stated that the proposed method has the ability to recover interpretable equations. However, even if symmetry is enforced by using symmetry invariants as variable set in the SR algorithms, the resulting symbolic equations can still be very complex and not necessarily physically meaningful, especially when dealing with noisy data and imperfect symmetry. So, how can the proposed method avoid complicated solutions with excessively high order that lose physical meaning? Can the authors elaborate further?

Limitations

yes

Final Justification

The authors have conducted additional experiments to compare with D-CIPHER and addressed the questions, so I adjust my score accordingly.

Formatting Issues

No paper formatting issues were found.

Author Response

We thank the reviewer for the constructive feedback. We address the questions/concerns below:

The method relies on explicitly known symmetry groups without end-to-end learnable capability to extract symmetries from data. Thus, the application of this approach is limited.

Discovering symmetries and discovering equations from data are two distinctive tasks. Extracting symmetries from data was done by a separate line of work [1,2,3]. It is not the task we are solving in this paper.

We argue that using known symmetry groups is a reasonable problem setup. As we mentioned in Sec. 5, this aligns with common practice – physicists often begin by hypothesizing the symmetries of a system and seek the most economical equations respecting those symmetries. Classic examples include deriving Navier-Stokes from rotational and translational symmetry and conservation laws, or writing effective field-theory Lagrangians by enumerating all operators allowed by a symmetry group. The assumption of symmetry is also very common in related works on equation discovery [4,5].

In addition, the application of our approach is not limited to systems matching the given symmetry perfectly. We have also developed a method to relax the symmetry constraint in Sec. 3.4. This extends the applicability of our method to systems with imperfect symmetry. The experiments also showed that our “SI-relaxed” variant still outperforms an unconstrained baseline, demonstrating that symmetry information, even if slightly misspecified, provides a visible benefit.

References

[1] Yang, Jianke, et al. "Latent space symmetry discovery." arXiv preprint arXiv:2310.00105 (2023).

[2] Ko, Gyeonghoon, Hyunsu Kim, and Juho Lee. "Learning infinitesimal generators of continuous symmetries from data." Advances in Neural Information Processing Systems 37 (2024): 85973-86003.

[3] Hu, Lexiang, Yikang Li, and Zhouchen Lin. "Symmetry discovery for different data types." Neural Networks (2025): 107481.

[4] Gurevich, Daniel R., et al. "Learning fluid physics from highly turbulent data using sparse physics-informed discovery of empirical relations (SPIDER)." Journal of Fluid Mechanics 996 (2024): A25.

[5] Messenger, Daniel A., Joshua W. Burby, and David M. Bortz. "Coarse-graining Hamiltonian systems using WSINDy." Scientific Reports 14.1 (2024): 14457.

Comparison with D-CIPHER (Kacprzyk et al. (2023))

| Method | Discovered equation |
| --- | --- |
| D-CIPHER | $u_y + 1.16\, u_{xy} = 18.74\,(u + e^u - 1.35)\, e^{y^2}$ |
| D-CIPHER$^*$ | $x u_x + y u_y - 2.09\, x u_y - 2.09\, y u_x - 0.19\, u_y = 7.98\, x^2 y^2 + 2.51\, xy + 0.80$ |
| D-CIPHER-SI (ours) | $(x u_x + y u_y) - 0.13\,(u_{xx} + u_{yy}) = 0.13\, e^{4.12(x^2+y^2)}$ |

We agree with the reviewer that D-CIPHER is a relevant baseline and should be included. We ran D-CIPHER and also implemented our method upon D-CIPHER in an additional experiment. We tested the methods on the Darcy flow dataset in our paper and report the results in the table above.

First, we explain the three methods in the table. The first method, D-CIPHER, uses the original implementation from the D-CIPHER official codebase. While their formulation of the extended differential operator in the paper is $a(\mathbf x)\,\partial_\alpha [h(\mathbf x, \mathbf u)]$, their implementation only addresses operators of the form $\partial_\alpha [h(\mathbf x, \mathbf u)]$. This does not cover terms like $x u_x$ and $y u_y$ in the Darcy flow equation. Therefore, in the second method of the table, D-CIPHER$^*$, we extended their code implementation by allowing function coefficients $a(\mathbf x)$ in the operator. Only in this way is the method theoretically capable of finding the correct functional form of Darcy flow. The final row represents our method, where we apply the procedure in Appx. C.3 (Lines 778-809) to provide D-CIPHER with a set of rotationally invariant differential operators and fundamental rotational invariants, so that the spatial rotation symmetry is enforced in any discovered equation. More specifically, the inner optimization of D-CIPHER solves for a 5-dimensional coefficient vector over 5 invariant operators in a constrained least-squares problem, and the outer optimization uses genetic programming to find a free-form expression of the 2 fundamental rotational invariants, $x^2 + y^2$ and $u$.

It can be seen from the table that our method can find the correct functional form, while D-CIPHER with the original variables and derivative operators cannot. The benefit of symmetry is even greater here for D-CIPHER than for other SR methods like SINDy, because D-CIPHER requires the user to specify both the function coefficient $a(\mathbf x)$ and the function to be differentiated $h(\mathbf x, \mathbf u)$ for an extended derivative (i.e., step 1 in their Figure 1). Such choices of functions can be largely arbitrary if no prior knowledge is available. On the other hand, our symmetry-based approach automatically selects this dictionary of differential functions.

Even if symmetry is enforced by using symmetry invariants in the SR algorithms, the resulting symbolic equations can still be very complex and not necessarily physically meaningful, especially when dealing with noisy data and imperfect symmetry.

While it may be true that our method can produce complex equations in the presence of imperfect symmetry or just generally challenging setups, this type of complexity is the result of the underlying system, not a failure mode of our algorithm.

Our method aims to find the equation, ideally more parsimonious, that can explain the data well. But if the data itself comes from a complex system, then our method would unavoidably need to include more terms in the symbolic equation. E.g., for imperfect symmetry, we have developed a method to relax the symmetry constraint and discover symmetry-breaking components (Sec. 3.4) and showed how it worked experimentally in Sec. 4.4. The discovered equations indeed contain more terms, but they fit and explain the data better than simpler equations. Also, the additional terms are still physically meaningful, as they correspond to different ways symmetry can be broken in that reaction-diffusion system.

The reviewer also mentioned noisy data. We believe this is a somewhat different challenge – the underlying system can be simple, but the data can be noisy due to inaccurate measurements. However, in this scenario, our symmetry-based approach still works well with weak-form SR algorithms that improve robustness to noise, including Weak SINDy (Sec. 4.4) and D-CIPHER (as described above).

How can the proposed method avoid complicated solutions with excessively high orders that lose physical meaning?

The maximal derivative order is a hyperparameter of our proposed method. We only compute invariant functions up to this order and use them as input features to symbolic regression. We agree that excessively high-order derivatives lose physical meaning and make regression more difficult by introducing irrelevant features. We will include this discussion in the revision.

If the reviewer meant something different by “high orders”, we would appreciate clarification.

Please let us know if our response has addressed your concerns. We are happy to discuss further if there are remaining questions. If your concerns have been addressed, we kindly request that you adjust the evaluation accordingly.

Comment

The authors have conducted additional experiments to compare with D-CIPHER and addressed the questions, so I will adjust my score accordingly.

Final Decision

The submission proposes injecting differential invariants of a supplied Lie point symmetry group into symbolic PDE discovery (SINDy, SR/GP, and a transformer), arguing this reduces hypothesis space and enforces symmetry by construction. Experiments on several PDEs (incl. noisy / imperfect-symmetry settings) show improvements over symmetry-agnostic baselines; the rebuttal adds a D-CIPHER comparison and a Navier–Stokes vorticity example.

Reviewer:

  • R_Aa7F (5: accept): Finds the core idea convincing and the paper well executed; main concern is presentation and over-reliance on the appendix. After rebuttal, remains positive and believes issues are fixable.
  • R_KgEF (5: accept): Rates quality/originality high; asked about compatibility with variational objectives. After authors’ D-CIPHER add-on, keeps a positive score.
  • R_5nPf (4: borderline accept, after rebuttal): Initially wanted broader baselines (e.g., D-CIPHER) and worried about interpretability under noise; raised to borderline-accept after the added D-CIPHER experiment.
  • R_FfNQ (2: reject): Questions realism of assuming known Lie symmetries, the extent of novelty (reparameterization vs. algorithmic contribution), dependence on chosen basis libraries, and applicability beyond simple symmetries. Engaged, but ultimately withheld a score change.

The discussion was active and constructive. The majority favors acceptance on technical merit; one reviewer remains firmly negative based on scope/novelty.

For my own assessment as AC, I performed a careful read because of the split opinions.

  • The main technical novelty that adapts invariants to specific SR backbones (notably the SINDy constraint formulation and LHS selection logic) is largely buried in Appendix C. This is not just a style nit: it materially obscures the algorithmic contribution in the 9-page main paper, which otherwise leans heavily on standard invariant theory. The authors promise to surface Appendix C/D content, but we must judge the submission as submitted. In its current form, the main paper under-communicates the ML contribution.
  • The clarity suffers from heavy cross-referencing; many implementation details and examples live in the appendix. Several reviewers flagged accessibility for the NeurIPS audience. I have no specific suggestion for improvement here. The topic is hard to digest and math heavy, but the current write-up does not help.
  • Review and rebuttal largely treat invariants locally. However, data are global, and evaluating invariant maps on full domains raises singularities and potential monodromy that the paper does not address beyond numerical filtering. I became particularly concerned about monodromies around singularities (e.g., phenomena akin to 2D Laplace on $\mathbb{R}^2\setminus\{0\}$), and the paper explicitly does not assume simply connected domains. Once branch ambiguity/monodromy enters (nontrivial fundamental group), composing invariants with SR models can break global consistency (“same loop, different value after one turn”). The rebuttal’s point that invariants are symbolically computed up to order 2 sidesteps topological (not numerical) failure modes: rank drops/degenerate orbits, undefined charts, and branch cuts are not treated. If these arise, “all hell breaks loose” for training targets/features and any global regression objective. I did not find a principled treatment (charting/patching, domain restrictions, coverings, or explicit assumptions) in the current submission. In a future version, you could make domain/topology assumptions explicit. E.g., require simply connected domains (or specify coverings/patchwise training), characterize singular sets of invariants/moving frames, and outline chart-stitching or branch-selection strategies. If relying on invariants that are only locally defined, formalize a patchwise regression scheme with consistency constraints.
  • I agree with R_FfNQ that assuming a given Lie point symmetry is restrictive in practice, although I also agree with other reviewers that it can still be a reasonable and useful setup. The paper would benefit from a crisper statement of exact applicability: which symmetry classes (rot/translation/scale products) and which SR bases guarantee the linear-constraint formulation (Prop. 3.3/3.4) without hidden singular sets? The current text spreads this across main + appendix.
  • I found the idea novel enough for NeurIPS.

Taken together, these issues go beyond camera-ready polish: they touch correctness and clarity of the global problem formulation and of the stated contribution.

Despite these problems, the paper has clear strengths, as indicated by the reviewers:

  • It uses a clear, appealing principle: the use of invariant features/targets to encode symmetry and shrink the search space.
  • It allows and tests multiple backbones (SINDy, GP, transformer; plus Weak SINDy and D-CIPHER extensions in the rebuttal), showing practical compatibility.
  • It shows evidence that “SI-relaxed” helps under imperfect symmetry; overall empirical signals are positive.
  • Theoretical statements sketch when invariant dictionaries cover symmetry-respecting PDE classes.

While I suggest rejection, I find the idea very promising and the empirical evidence very encouraging, but in its current form the paper does not meet the bar on (i) clearly surfacing the algorithmic/ML contribution in the main paper and (ii) addressing global mathematical issues (singularities, monodromy, domain assumptions) that can undermine correctness when invariants are used on real, global datasets. Given that I could quickly surface such issues, I am concerned more will arise upon deeper scrutiny. I would like to see an improved version published at a high-ranking venue.