PaperHub
Overall: 7.8/10 · Spotlight · NeurIPS 2025
Ratings from 4 reviewers: 5, 5, 5, 4 (min 4, max 5, std 0.4) · Confidence: 3.5
Novelty: 3.8 · Quality: 3.8 · Clarity: 3.0 · Significance: 3.5

Boundary-Value PDEs Meet Higher-Order Differential Topology-aware GNNs

OpenReview · PDF
Submitted: 2025-05-08 · Updated: 2025-10-29

Abstract

Keywords
Partial differential equations · Neural operator · Higher-order graph neural network · Exterior differential calculus · Electromagnetism

Reviews and Discussion

Review
Rating: 5

This paper introduces a novel method called DEC-HOGNN in the category of neural operators. It makes use of higher-order graph neural networks that can handle interactions between simplices of different orders, not just same orders. It also designs an encoder-decoder structure for differential forms of different orders involved to make predictions. By writing down fields as differential forms, the physical losses can also be easily implemented. This approach is tested on boundary value problems in electrostatics and magnetostatics for both 2D and 3D.

Strengths and Weaknesses

Strengths:

  1. This is an interesting attempt to implement physical losses elegantly and effectively in neural operators, which is an inspiring and meaningful first exploration for the AI-for-PDE community. The method does not require differentiation for physical losses.
  2. The mathematical motivations of this work are presented in an easy-to-follow manner.
  3. There are clear proofs for the important theorems in this paper, including a universal approximation property for the proposed neural operator.

Weaknesses:

  1. Though a large part of this paper is devoted to FEEC, the data generation is still carried out with traditional FEM rather than FEEC, so this work is not done in a "pure" FEEC manner.
  2. The datasets explored in the experiments of this work do not cover a wide enough range of problems. Only electrostatics and magnetostatics are covered, which does not provide enough evidence for the superiority of this approach.
  3. The inference speed is not discussed and compared with other methods.

Questions

  1. How are the boundary conditions $\nabla \phi = u(\mathbf{x}),\ \mathbf{x} \in \partial \Omega$ implemented as an input for this neural operator?
  2. Related to point 1 in weaknesses, why are the data still generated with traditional FEM, instead of FEEC? If data are generated with FEEC, then no encoding is required and all differential forms are naturally solved out.

Limitations

Yes, the limitations are discussed in this paper.

Final Justification

All of my concerns have been resolved during the rebuttal period. This paper is of good quality, and I have no further concerns. Therefore, I recommend acceptance of this paper.

Formatting Concerns

No obvious concerns are noticed by the reviewer.

Author Response

We sincerely appreciate your insightful feedback and valuable assessment of our work. Below we address each comment in detail.

Q1. The inference speed is not discussed and compared with other methods.

Thanks for your comment. We have updated the manuscript to compare our model with the baselines on computation cost. Below we report FLOPs, memory, and inference time on the 2D and 3D electrostatics benchmarks, respectively.

Since our experiments mainly focused on the effectiveness of the model, there is still much room to reduce its overhead. Also, higher-order elements come with extra cost: the number of adjacencies soars when a finer tetrahedralization produces many tetrahedra, some of which may be unnecessary. An interesting direction for future research is how to fully leverage higher-order topological adjacencies without being hindered by the proliferation of small tetrahedra. Nevertheless, our results show that the inference efficiency of the proposed model is acceptable and remains significantly faster than classical solvers, which typically require several seconds per inference.

Tab 1: Comparison on the 2D electrostatics benchmark.

| Model | FLOPs (M) | Memory (MB) | Inference Time (ms) |
| --- | --- | --- | --- |
| DeepONet | 0.10 | 2.74 | 0.33 |
| MKGN | 0.22 | 19.80 | 3.73 |
| Galerkin-Type | 2.96 | 18.20 | 1.67 |
| GNOT | 1.99 | 25.30 | 1.55 |
| Transolver | 1.34 | 19.05 | 1.84 |
| GAT-based | 0.10 | 7.69 | 1.48 |
| Graph UNet-based | 0.14 | 3.36 | 4.69 |
| GT-based | 10.35 | 53.96 | 1.54 |
| DEC-HOGNN (Ours) | 2.85 | 55.59 | 5.00 |

Tab 2: Comparison on the 3D electrostatics benchmark.

| Model | FLOPs (M) | Memory (MB) | Inference Time (ms) |
| --- | --- | --- | --- |
| DeepONet | 0.10 | 93.41 | 0.28 |
| MKGN | 0.53 | 957.25 | 5.50 |
| Galerkin-Type | 0.95 | 1122.03 | 5.29 |
| GNOT | 1.99 | 281.99 | 2.13 |
| Transolver | 1.35 | 235.51 | 2.46 |
| GAT-based | 0.11 | 264.42 | 2.08 |
| Graph UNet-based | 0.78 | 257.22 | 6.76 |
| GT-based | 10.35 | 367.42 | 3.41 |
| DEC-HOGNN (Ours) | 5.37 | 9380.33 | 94.09 |

Q2. How are the boundary conditions $\nabla \phi = u(\mathbf{x})$ implemented as an input for this neural operator?

A scalar potential $\phi$ is unavailable in many physics scenarios. In electromagnetism, $\nabla \phi$ may be represented by the electric field intensity $\mathbf{E}$ or the magnetic field intensity $\mathbf{H}$, which can easily be measured by sensors in practice. Therefore, our approach takes these vector fields as both the input and the final output. In fact, we introduce $\phi$ only to state the theorems.

But still, given a scalar field $\phi$, we can apply the gradient operator defined in graph calculus [1], or simply learn one as in NIsoGCN [2], to obtain a vector field $\nabla \phi$. It can then be interpreted as a vector proxy of a 1-form and thereby represented as integrated forms on boundary edges.
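To make the last point concrete, here is a minimal sketch (our own illustrative code, not the authors' implementation) of how a scalar field on mesh vertices can be encoded as integrated 1-form features on edges:

```python
# Sketch (not the authors' code): turning a scalar field phi on mesh
# vertices into a discrete 1-form by integrating its gradient along edges.
# By the fundamental theorem of calculus, the exact integral of grad(phi)
# over an oriented edge (i, j) is simply phi[j] - phi[i].

def scalar_field_to_one_form(phi, edges):
    """phi: dict vertex -> value; edges: list of oriented (i, j) pairs.
    Returns the 1-form as a dict edge -> integrated value."""
    return {(i, j): phi[j] - phi[i] for (i, j) in edges}

# Toy example: a single oriented triangle with vertices 0, 1, 2.
phi = {0: 0.0, 1: 1.0, 2: 0.5}
edges = [(0, 1), (1, 2), (2, 0)]
form = scalar_field_to_one_form(phi, edges)

# The resulting 1-form is exact, so its sum around the closed triangle
# boundary vanishes, mirroring curl(grad phi) = 0 in the discrete setting.
assert abs(sum(form.values())) < 1e-12
```

This is exactly the sense in which edge features "integrate" a field rather than differentiate it, which is why no automatic differentiation is needed for such losses.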

Ref.

[1] Calculus on Graphs. Joel Friedman et al.

[2] Physics-embedded neural networks: Graph neural pde solvers with mixed boundary conditions.

Q3. Though a large part of this paper is devoted to FEEC, the data generation is still achieved with traditional FEM methods, but not FEEC. This point makes this work not in a "pure" FEEC manner. Related to point 1 in weaknesses, why are the data still generated with traditional FEM, instead of FEEC? If data are generated with FEEC, then no encoding is required and all differential forms are naturally solved out.

Reviewer dQYA has similar concerns. We fully understand your concerns about the compatibility among traditional FEM and its FEEC and DEC counterparts. For data generation, any of these methods is acceptable, as they yield the same solution for a well-posed PDE. Therefore, the method used for data generation does not compromise the purity or integrity of this work.

Regarding your second point, if the input data were presented in a FEEC manner, we could indeed discard the encoding process. However, such an assumption is too idealized for many applications: it is easy to deploy sensors to sample vector intensities like $\mathbf{B}, \mathbf{E}$, but hard to measure a differential form directly. Hence, the encoding process is still necessary for broader applicability.

Q4. The datasets explored in the experiments of this work do not cover a wide enough range of problems. Only electrostatics and magnetostatics are covered, which does not provide enough evidence for the superiority of this approach.

We use Maxwell's equations as a case study, as they encompass a broad range of boundary value problems (BVPs) and exhibit a well-structured formulation within the framework of exterior calculus. Specifically, they involve differential operators such as $\operatorname{div}$ and $\operatorname{curl}$, which also appear in BVPs like Darcy flow, as well as in time-dependent PDEs such as the heat and wave equations. Upon closer examination, these equations share structural similarities with Maxwell's equations, suggesting they follow the same underlying patterns. Consequently, experiments on electrostatics and magnetostatics may provide evidence for the framework's broader applicability to a variety of linear PDEs. We will include them in the next version of our paper.

We also note that this framework can potentially be applied to non-linear PDEs like the Navier-Stokes equations (NSE), mainly because $\operatorname{div}$ and $\operatorname{curl}$ can be encoded as higher-order element features. We briefly sketch a possible approach in our response to Reviewer 5x2r's Q4 and kindly invite you to refer to it.

Comment

I would like to thank the authors for their answers to my questions. They have resolved my concerns. Since I originally recommended acceptance of this paper, I will simply finalize my original rating.

Review
Rating: 5

This paper proposes a novel GNN neural operator (NO) that is topology-aware by leveraging discrete exterior calculus (DEC). The advantages of this novel NO include: 1. it preserves the topological structure of the manifold, which is also the problem domain of the PDE; 2. it can turn some equations involving divergence and curl into boundary conditions via Stokes' theorem, so that the physical loss from the equation no longer requires taking derivatives (e.g., divergence and curl) the way PINNs do.

In order to leverage DEC, the authors design a GNN that handles 4 types of neighborhoods (see Eqs. (8)-(12)). In experiments on 2D and 3D problems, the proposed method outperforms all baselines, showing a strong advantage.

Strengths and Weaknesses

Strengths

This paper is highly innovative and original in introducing DEC to operator learning with GNNs. As mentioned in the summary, there are two advantages: 1. preserving the topological structure of the manifold; 2. simplifying the equation constraints to boundary constraints (see Eqs. (19)-(22)). In prior works, topological structure is either ignored (e.g., transformer-based NOs and FNO) or only partially encoded (via the graph adjacency of a GNN). In contrast, DEC utilizes 4 types of neighborhoods, which is a complete methodology. The ablation study (Table 2) shows that the complete setting outperforms incomplete ones. The second advantage naturally enables the proposed method to incorporate a physics-informed loss without taking derivatives (for some equations, e.g., Maxwell, Navier-Stokes, etc.).

The paper is well written and elegantly presented. Also, for self-completeness, the paper provides universal approximation analysis.

Weaknesses

  1. In Table 3, I think the bold numbers are the best of each column. Please clarify accordingly to avoid confusion.
  2. The current data is generated with finite element method, which may not be compatible with DEC (as the discussion in line 104-105). If data generation by DEC is not so convenient, this may limit the application of the method to some extent.
  3. Typo? On line 476, $H\in\Omega^1(\star M)$ -> $H\in\Omega^1(M)$ to align with Fig. 2.

Questions

N.A.

Limitations

N.A.

Final Justification

After reading other reviews and rebuttals, I see that all reviews are positive and concerns are addressed. Therefore, my score remains.

Formatting Concerns

N.A.

Author Response

We sincerely appreciate your constructive feedback and meticulous evaluation of our work. Below, we provide responses to each point raised.

Q1. The current data is generated with finite element method, which may not be compatible with DEC (as the discussion in line 104-105). If data generation by DEC is not so convenient, this may limit the application of the method to some extent.

Thanks for your comments. Reviewer RTKj has similar concerns. We can fully understand your concerns on incompatibility between data and model design. However, FEM-based solvers are significantly more mature and widely adopted than their DEC and FEEC counterparts. Theoretically, all of these solvers can converge to the same solution, provided that the PDE is well-posed. As such, FEM, FEEC, and DEC are all valid choices for data generation when no numerical issues arise. This is why data generation using DEC is not strictly necessary.

Q2. On minor issues. (i) Typo? On line 476, $H\in \Omega^1(\star M) \to H\in\Omega^1(M)$ to align with Fig. 2. (ii) In Table 3, I think the bold numbers are the best of each column. Please clarify accordingly to avoid confusion.

Thanks for your careful reading and corrections. For (i), $H$ should indeed be defined on $M$ since $B$ is on $\star M$; for (ii), we will clarify its meaning in the caption.

Comment

Thank you for the clarification.

Review
Rating: 5

This paper proposes a neural operator framework for solving boundary value problems (BVPs) governed by partial differential equations (PDEs). The key innovation lies in leveraging Discrete Exterior Calculus (DEC) and Finite Element Exterior Calculus (FEEC) to explicitly encode higher-order topological structures (nodes, edges, faces, cells) as k-simplex features in graph neural networks (HOGNNs). This allows for principled modeling of scalar and vector fields as differential forms, enabling a topologically-aware neural solver.

The approach includes physics-informed encoder-decoder mappings between vector fields and forms, integrated physical loss functions that correspond to conservation laws (e.g., Gauss’s law), and theoretical guarantees of universal approximation for certain BVP classes. There are limited experimental evaluations on 2D and 3D electrostatics and magnetostatics.

Strengths and Weaknesses

Strengths

  • The methodology is technically sound and grounded in well-established mathematical frameworks (DEC and FEEC), with rigorous theoretical support (e.g., encoder-decoder preservation, universal approximation).
  • This work is an important step toward building structure-preserving, physically grounded neural solvers for BVPs, providing a new approach. The proposal to use higher-order GNNs aligned with exterior calculus opens a new direction for learning-based PDE solvers in domains like electromagnetism, fluid dynamics, and elasticity.
  • The synthesis of DEC/FEEC with HOGNNs for BVPs appears novel and nontrivial. While GNNs and physics-informed learning are well-trodden, the topological interpretation via differential forms and the dual-manifold message passing are particularly unique contributions (for solving PDEs).
  • The physics-informed loss design is very elegant and potentially efficient! I love how the work leverages integrated features (which appear naturally in DEC) directly, which avoids the need for expensive quadrature.

Weaknesses

  • This is a very dense paper, with some parts written with excessive jargon (particularly early on), making it hard for a broader audience to parse. Key ideas (like Eq. 5) are underexplained at first reading.
  • Figures (e.g., Figure 1 and 2) are central to the understanding but require much more extensive captions and annotations to be fully self-contained.
  • It is unclear why the 3D magnetostatic results are not competitive (relative to other baselines) even though the model is designed to generalize naturally to 3D.
  • The comparison misses modern BVP-specific baselines such as FNO, CNO, SCOT, etc., which are more appropriate than DeepONet or Transolver (designed for time-evolving systems).
  • The experimental results are very perfunctory, and should be strengthened.
  • While the stated focus is on time-dependent PDEs (in the abstract), all analysis and results are for time-independent PDEs. This is rather confusing.

Questions

  • The formulation of the neural operator in Eq. (5) is too abstract and hard to interpret until the reader reaches the concrete instantiation in Eq. (7). The authors could either move the motivating example earlier or introduce a simplified case (e.g., scalar Poisson) first.

  • Figures 1 and 2 are critical but lack self-contained descriptions. For a technical audience unfamiliar with DEC or the specific PDE being solved, these need extensive legends explaining what each symbol and arrow denotes.

  • More focused experiments: pick 2-3 shapes in 2D and 3D (exhibiting different topological characteristics), and comprehensively evaluate performance against other state-of-the-art methods. Additionally, while the paper briefly notes that performance is affected by mesh shape and quality, this is an important limitation for deployment in practical engineering settings. Additional empirical analysis (e.g., a mesh convergence study) would strengthen the impact.

Limitations

Yes

Final Justification

Thank you to the authors for actively engaging during this process. I also appreciate the additional results that the authors added. I am happy to increase my score to 5

Formatting Concerns

none

Author Response

We are sincerely grateful for the time and effort you have dedicated to reviewing our manuscript. Below, we address each of your comments in detail.

Q1. While the stated focus is on time-dependent PDEs (in the abstract), all analysis and results are for the time-independent PDE. This is rather confusing.

In the abstract, we actually state that time-independent boundary value problems (BVPs) in electromagnetism are instantiated to illustrate the proposed framework, not time-dependent ones.

Though this framework can be easily generalized to time-evolving PDEs, we evaluate our method on time-independent cases, aiming to verify its capability of capturing local behaviors by introducing higher-order elements.

Q2. More focused experiments: Pick 2-3 shapes in 2D and 3D (exhibiting different topological characteristics), and comprehensively evaluate performance against other state of art.

Due to the character limit, only the results on 2D magnetostatics are provided. Eight related benchmarks (A1 to C2) are used for evaluation, each with different topological properties in terms of the number of connected components and holes:

| Benchmark Id | A1 | A2 | A3 | B1 | B2 | B3 | C1 | C2 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Connected Components | 1 | 1 | 1 | 2 | 2 | 2 | 4 | 4 |
| Holes | 2 | 4 | 8 | 2 | 4 | 8 | 2 | 4 |

Performance on these benchmarks is compared with the other baselines. The table below shows the test loss of each model. It turns out that the advantage of our approach persists as the underlying topology changes.

| Model / Benchmark Id | A1 | A2 | A3 | B1 | B2 | B3 | C1 | C2 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FourierType | 3.969 | 3.670 | 3.532 | 4.963 | 4.639 | 5.561 | 3.457 | 4.211 |
| GalerkinType | 2.457 | 2.883 | 3.207 | **1.907** | 2.511 | 2.912 | **2.093** | 3.275 |
| MKGN | 9.177 | 5.208 | 6.769 | 8.428 | 5.813 | 4.689 | 4.259 | 6.722 |
| Transolver | 2.733 | <u>2.479</u> | <u>2.167</u> | 2.301 | <u>2.273</u> | <u>2.112</u> | 2.234 | **1.901** |
| Geo-FNO | 9.423 | 9.448 | 9.453 | 9.448 | 9.450 | 9.451 | 9.448 | 9.451 |
| GNOT | <u>2.406</u> | 3.277 | 3.009 | 2.799 | 3.006 | 2.588 | 3.373 | 2.213 |
| DEC-HOGNN (Ours) | **1.573** | **1.487** | **1.498** | <u>2.088</u> | **1.901** | **1.727** | <u>2.181</u> | <u>2.176</u> |

Rmk. Baselines achieving top-1 and top-2 performance are marked in bold and underline, respectively.

Q3. Additionally, while the paper briefly notes that performance is affected by mesh shape and quality, this is an important limitation for deployment in practical engineering settings. Additional empirical analysis (e.g., mesh convergence study) would strengthen the impact.

Thanks for your suggestion. We analyze the negative impact of mesh degradation by randomly dropping a certain number of edges from a mesh hierarchically; several extra edges are further removed to keep the topology fixed. Due to the character limit, only partial data can be shown.

To measure mesh quality quantitatively, three indicators are adopted; for all of them, larger values indicate a more degraded mesh. The results reflect that dropping edges from a triangulated mesh usually comes with mesh degeneration.

We then train models on meshes suffering from different levels of degeneration. The average mesh quality and the validation loss within 500 epochs are reported. We find that minor degeneration does not affect model performance much, while major degeneration leads to a salient performance drop. Note that it is also infeasible to run classical solvers on meshes with such prominent quality issues, so these negative effects are tolerable.

| Edge Drop | #E | #V | #F | AR | EAS | ATR | Loss@0 | Loss@100 | Loss@200 | Loss@300 | Loss@400 | Loss@500 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 487 | 180 | 306 | 1.286 | 0.595 | 0.197 | 9.75 | 4.60 | 2.95 | 2.14 | 1.85 | 1.44 |
| 25 | 462 | 180 | 281 | 1.301 | 0.594 | 0.251 | 9.32 | 4.24 | 2.89 | 2.18 | 1.65 | 1.42 |
| 50 | 437 | 180 | 256 | 1.318 | 0.607 | 0.316 | 9.32 | 4.44 | 3.26 | 2.42 | 2.01 | 1.51 |
| 100 | 387 | 180 | 206 | 1.365 | 0.670 | 0.404 | 9.33 | 4.58 | 3.52 | 2.66 | 2.32 | 2.04 |
| 150 | 334 | 177 | 156 | 1.401 | 0.783 | 0.487 | 9.41 | 4.90 | 3.76 | 2.99 | 2.51 | 2.11 |
| 200 | 275 | 168 | 106 | 1.458 | 0.811 | 0.584 | 9.59 | 5.22 | 3.89 | 3.35 | 3.04 | 2.70 |
| 300 | 139 | 114 | 24 | 1.929 | 0.816 | 0.860 | 10.67 | 6.46 | 5.78 | 5.27 | 4.93 | 4.82 |

Rmk. #E, #V, #F are the numbers of edges, vertices and faces, respectively. Usually, a mesh is considered to be of good quality if its elements are uniform and regular. We briefly introduce the three indicators:

  • Aspect Ratio (AR). There are many versions of AR. We define it as the ratio of the maximum edge length to the minimum.
  • Equi-Angle Skewness (EAS). An N-polygon's ideal angle is $\theta := \pi(N-2)/N$. For an actual angle $\theta_i$, its offset is $\delta_i := |\theta_i - \theta| / \max\{\theta, \pi - \theta\}$, and the EAS is $\max_i \delta_i$.
  • Area Transition Ratio (ATR). It depicts the uniformity of element sizes. Let $A_i, A_j$ be the areas of two neighboring elements; then the ATR is $|A_i - A_j| / \max\{A_i, A_j\}$.
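The three indicators can be sketched in a few lines of plain Python (our own illustrative helpers, not the paper's code); an equilateral triangle scores perfectly under AR and EAS:

```python
# Sketch of the three mesh-quality indicators defined above, for a single
# triangle and a pair of neighboring elements (illustrative names only).
import math

def aspect_ratio(lengths):
    # AR: longest edge over shortest edge; equals 1.0 for an equilateral triangle.
    return max(lengths) / min(lengths)

def equi_angle_skewness(angles, n_sides=3):
    # EAS: worst normalized deviation from the ideal angle pi*(N-2)/N.
    ideal = math.pi * (n_sides - 2) / n_sides
    return max(abs(a - ideal) / max(ideal, math.pi - ideal) for a in angles)

def area_transition_ratio(a_i, a_j):
    # ATR: size jump between two neighboring elements; 0 for equal areas.
    return abs(a_i - a_j) / max(a_i, a_j)

# An equilateral triangle is "perfect" under AR and EAS.
assert aspect_ratio([1.0, 1.0, 1.0]) == 1.0
assert equi_angle_skewness([math.pi / 3] * 3) < 1e-12
assert area_transition_ratio(2.0, 2.0) == 0.0
```

All three scores grow as the mesh degrades, matching the trend in the table above.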

Q4. It is unclear why the 3D magnetostatic results are not competitive (relative to other baselines) even though the model is designed to generalize naturally to 3D.

This seemingly abnormal result mainly stems from the different complexities of the datasets. Take the Poisson equation ($\Delta u(\mathbf{x})=f(\mathbf{x}),\ \mathbf{x}\in\Omega$; $u(\mathbf{x})=g(\mathbf{x}),\ \mathbf{x}\in\partial\Omega$) as an example: this BVP is governed by both the source term $f(\mathbf{x})$ and the boundary condition $g(\mathbf{x})$. As stated in Appendix G, the source term $f(\mathbf{x})$ in the 2D cases behaves in a more complicated way, in the sense that every triangle is endowed with random charges, whereas the 3D cases involve only a few embedded cubes with charges or circuits. Moreover, higher-order structure only captures local features, since higher-order features originate from differential operators, which reflect local behaviors. Consequently, the more a BVP is dominated by the local restriction $f(\mathbf{x})$ rather than the global restriction $g(\mathbf{x})$, the better our proposed method performs. From this perspective, the result that the 2D cases appear better than their 3D counterparts is consistent with both the data and the architecture.

Besides, due to the nature of the integrated-form representation, our method focuses more on capturing the local behaviors governed by $f(\mathbf{x})$ than the global behaviors governed by $g(\mathbf{x})$. We leave further combining global behaviors reflected by boundary conditions with integrated-form representations for future work.

Q5. The comparison misses modern BVP-specific baselines such as FNO, CNO, SCOT, etc., which are more appropriate than DeepONet or Transolver (designed for time-evolving systems).

Thanks for your suggestion. However, these baselines cannot directly fit our experimental settings because they require a regular mesh. Both FNO and CNO require the spatial domain or its parametric space to be a regular grid (e.g., an elliptic mesh for the airfoil problem (p. 55, Fig. 23) [1]; Appendix 3 [2]). This fact is also supported by "state-of-the-art neural operators such as FNO, CNO, and scOT which are tailored for Cartesian grids" [3] (p. 8).

Thus these baselines are ill-suited to the irregular meshes shown in Fig. 5, which are more common in practical use. It is also worth mentioning that included baselines like GNOT outperform earlier baselines such as interpolated FNO and Geo-FNO even in regular-mesh settings (Sec. 4, Tab. 1) [4].

Still, we understand your concerns about insufficient experiments. Therefore, we further evaluate performance against Geo-FNO and the FourierType transformer operator, which can handle arbitrary meshes. We will clarify in the manuscript why baselines like FNO, CNO, and scOT are not adopted, and add these extra comparisons in the updated version.

| Model | 2D Electrostatics | 2D Magnetostatics |
| --- | --- | --- |
| Geo-FNO | 2.017 | 3.289 |
| FourierType | 1.553 | 1.986 |
| GNOT | 2.064 | 1.142 |
| DEC-HOGNN (Ours) | 0.623 | 0.875 |

Ref.

[1] (CNO) Convolutional Neural Operators for Robust and Accurate Learning of PDEs. arXiv.

[2] (FNO) Fourier Neural Operator for Parametric Partial Differential Equations.

[3] RIGNO: A Graph-based Framework for Robust and Accurate Operator Learning for PDEs on Arbitrary Domains. arXiv.

[4] GNOT: A General Neural Operator Transformer for Operator Learning.

Q6. Suggestions on writing. (i) This is a very dense paper, with some parts written with excessive jargon (particularly early on), making it hard for a broader audience to parse. Key ideas (like Eq. 5) are underexplained at first reading. (ii) Figures (e.g., Figure 1 and 2) are central to the understanding but require much more extensive captions and annotations to be fully self-contained. (iii) The formulation of the neural operator in Eq. (5) is too abstract and hard to interpret until the reader reaches the concrete instantiation in Eq. (7). The authors could either move the motivating example earlier or introduce a simplified case (e.g., scalar Poisson) first.

Thanks for your advice. For (i) and (ii), we have added more detailed explanations of these terms, equations and figures, making the paper friendlier to a broader audience. For (iii), we have swapped the abstract presentation of the neural operator and the concrete example appropriately.

Comment

Thank you for the effort to address my questions. I appreciate the additional comparisons with other techniques. I am happy to increase the score to 5.

Review
Rating: 4

This paper proposes a topology-aware GNN neural operator to solve boundary value PDE problems. The authors incorporate discrete and finite element exterior calculus into GNNs to deal with geometrical relationships between differential forms (e.g., charge density and electric field). They also derive the physics-informed loss to measure the consistency of the output against the considered physical law. In addition, the authors provide a universal approximation theorem for Poisson problems, justifying the use of the model for electromagnetic phenomena. The experimental results suggest that the proposed method has a high capacity to predict electromagnetic phenomena in 2D and 3D settings.

Strengths and Weaknesses

Strengths:

  • The writing is well-constructed and easy to follow. The authors reveal the connection of the present method to existing ones from the viewpoint of DEC (discrete exterior calculus), clarifying the position of the research.
  • The derivation of the physics-informed loss for GNNs is useful. Thanks to the DEC formulation, a certain physical constraint turns out to be a sum of integrals, which is easy to compute.
  • The method is demonstrated to have high accuracy compared to the considered baseline models, showing the expressibility of the model.

Weakness:

  • The relation to existing exterior-calculus-based or simplicial-complex-based methods is not clearly stated. For instance, differences and advantages against research on cellular complexes [Alain+ ICML 2024] and simplicial complexes [Ebli+ NeurIPS Workshop 2020] could be stated, and possibly, compared experimentally.
  • There is no evaluation of computation time. Since the classical solvers have high accuracy (with possibly a lot of computation resources), the evaluation of the accuracy alone is not enough. Therefore, the reviewer recommends evaluating the speed-accuracy tradeoff with changing spatiotemporal resolution (at least for classical solvers).

Minor points:

  • "Simplex complex" (l. 63, p. 2) could be called "simplicial complex" instead.
  • Since $x$ is not used in the PDE, $x \in \Omega$ in Equation 6 is confusing. Either using $x$ in the PDE or writing $\text{in } \Omega$ would be preferable.

Questions

  • In the experiments, how much was the error of PDE constraints, e.g., divergence-free condition?
  • What would be necessary to extend the method to other types of PDEs, e.g., the Navier–Stokes equations?

Limitations

yes

Final Justification

Although most of the concerns are addressed by the authors, the evaluation of the speed-accuracy tradeoff compared to classical solvers, which is the most essential one, is yet to be addressed. Therefore, I keep the original score.

Formatting Concerns

N/A

Author Response

We are grateful for your thoughtful review and constructive suggestions. All your comments have been carefully addressed in the point-by-point responses below.

Q1. For instance, differences and advantages against research on cellular complexes [Alain+ ICML 2024] and simplicial complexes [Ebli+ NeurIPS Workshop 2020] could be stated, and possibly, compared experimentally.

Thanks for pointing out these important works on incorporating higher-order topology. A crucial observation is that one can define the neighborhood of a $k$-cell as a set of $(k-1)$-, $k$- or $(k+1)$-cells, respectively. Since many canonical results on graphs are based on a certain type of adjacency, one can generalize the usual graph Laplacian to higher-order Laplacians, based on which Ebli et al. proposed a higher-order GCN while Alain et al. derived a higher-order graph Gaussian process. The message passing mechanism in our paper actually covers Ebli et al.'s work. These methods take advantage of multi-body interactions at different scales of a topological structure, since a $k$-cell can be interpreted as the result of the interactions among $k$ individuals (nodes).

Our approach goes a step beyond merely leveraging topological inductive bias. By connecting higher-order features with differential forms, it paves the way for further leveraging geometric inductive bias, since forms are ubiquitous in differential geometry. By encoding input data as forms and building their relations properly, our method is able to turn differential operators into higher-order feature interactions. In all, our approach is an attempt to open the door to the world of geometry.

Q2. Since the classical solvers have high accuracy (with possibly a lot of computation resources), the evaluation of the accuracy alone is not enough. Therefore, the reviewer recommends evaluating the speed-accuracy tradeoff with changing spatiotemporal resolution (at least for classical solvers).

Thanks for your suggestion. Neural operators often outperform classical solvers by several orders of magnitude in speed, so time cost comparisons are less commonly discussed. As stated in FNO [1] (Sec. 1), "On a 256×256 grid, the Fourier neural operator has an inference time of only 0.005s compared to the 2.2s of the pseudo-spectral method used to solve Navier-Stokes."

In our setting, the Ansys 3D electrostatics solver takes 14 seconds in total and 427 MB of memory for adaptive meshing to reach 0.1% energy loss, not to mention the time-consuming preliminary mesh checking. Conversely, most prevailing neural operators run at the millisecond level. Furthermore, our experiments mainly concentrate on the effectiveness of differential-form neural representations, so only time-independent PDEs are considered, i.e., no spatiotemporal resolution is involved.

But still, to validate the time efficiency of our approach, we compare our method with the neural baselines. We kindly invite you to refer to our response to Reviewer RTKj's Q1, where a comprehensive comparison of computation cost against neural operators is presented.

Q3. In the experiments, how much was the error of PDE constraints, e.g., divergence-free condition?

Let us further clarify some technical details on this topic.

In the magnetostatics experiments, we simultaneously optimize the divergence-free loss $\mathcal{L}_{\text{div}}$, weighted by face area (in 2D cases) or cell volume (in 3D cases), and the data loss $\mathcal{L}_{\text{data}}$. Note that the divergence-free loss should be viewed as a requirement rather than a regularization term; otherwise, the model may overfit the data at the vertices, and the complete field recovered via interpolation can thereby violate physical laws.

A further experiment is conducted: we set the total loss $\mathcal{L} := \mathcal{L}_{\text{data}} + 10^2 \mathcal{L}_{\text{div}}$ and then observe the training curves with and without this extra divergence-free term.

We define the ratio $R := 10^2 \mathcal{L}_{\text{div}} / \mathcal{L}$. The higher $R$ is, the more likely the model overfits the data at the vertices and breaks the physical law. We also study whether this extra term negatively affects the fit to the given data points by observing the validation data loss $\mathcal{L}_{\text{data}}$. The table below shows the two validation curves within 500 epochs. The result shows that merely fitting the data points is likely to violate the divergence-free condition.

| Train with $\mathcal L_{\text{div}}$ | Item | 0 | 100 | 200 | 300 | 400 | 500 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ✗ | $R$ | 1.05% | 51.76% | 74.89% | 85.88% | 90.09% | 92.47% |
| ✗ | $\mathcal L_{\text{data}}$ | 9.44 | 4.39 | 2.97 | 2.26 | 1.81 | 1.54 |
| ✗ | $10^2\mathcal L_{\text{div}}$ | 0.10 | 4.71 | 8.86 | 13.74 | 16.45 | 18.91 |
| ✓ | $R$ | 0.73% | 1.56% | 4.00% | 5.23% | 6.67% | 8.98% |
| ✓ | $\mathcal L_{\text{data}}$ | 9.47 | 5.06 | 3.60 | 2.90 | 2.52 | 2.23 |
| ✓ | $10^2\mathcal L_{\text{div}}$ | 0.07 | 0.08 | 0.15 | 0.16 | 0.18 | 0.22 |

Thus the conclusion is: without the divergence-free loss term, the model is prone to overfit the data at the sampled points and hence greatly violates the divergence-free condition. With the term, although the model sometimes fits the given data points more slowly, properties of the field such as being divergence-free are preserved, coinciding with the insight of preserving symmetries in Lagrangian and Hamiltonian Neural Networks.

Q4. What would be necessary to extend the method for other type of PDEs, e.g., Navier–Stokes equations?

As stated, our current method covers various linear PDEs involving common operators such as $\text{grad}$, $\text{curl}$ and $\text{div}$, whereas additional geometric neural representations are needed for non-linear PDEs. Below, we briefly sketch some possible technical approaches for solving the Navier-Stokes equations (NSE) and leave them for future work.

  • The viscosity term $\mu\Delta\mathbf{u}$ in the NSE can be decomposed as $\mu\,\text{grad}(\text{div}\,\mathbf{u}) - \mu\,\text{curl}(\text{curl}\,\mathbf{u})$. In 2D cases, the integral of $\mathbf{u}\cdot\mathbf{n}$ along an edge $e$ ($\mathbf{n}$ is the normal of a primal edge and also the tangent of its dual counterpart $\star e$) can be interpreted as the flux through $e$ and thus as a 1-form on $\star e$. This again represents $\text{div}\,\mathbf{u}$ properly via the edge coboundary in HOGNN, while $\text{grad}$ can be implemented on the dual graph based on graph calculus. The other term $\mu\,\text{curl}(\text{curl}\,\mathbf{u})$ can be handled likewise.
  • Directly handling the non-linear term $\mathbf{u}\cdot\nabla\mathbf{u}$ may be difficult. However, this term is associated with the Lie derivative $\mathcal L_X$, which is more fundamental in differential geometry and generalizes to non-Euclidean spaces. There already exist some preliminary neural implementations of Lie derivatives and vector fields. Nevertheless, how to represent them effectively in neural networks remains open and is worthwhile to explore in the future.
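As a sanity check of the viscosity decomposition mentioned above, the vector-calculus identity $\Delta\mathbf{u} = \text{grad}(\text{div}\,\mathbf{u}) - \text{curl}(\text{curl}\,\mathbf{u})$ can be verified on a simple polynomial field (the test field below is an arbitrary example of ours, with derivatives written in closed form):

```python
import numpy as np

# Arbitrary smooth 2D test field u = (x^2 y, x y^2); all derivatives below
# are the closed-form expressions for this particular field.
def laplacian(x, y):
    # (u_xx + u_yy) componentwise = (2y, 2x)
    return np.array([2.0 * y, 2.0 * x])

def grad_div(x, y):
    # div u = 2xy + 2xy = 4xy  =>  grad(div u) = (4y, 4x)
    return np.array([4.0 * y, 4.0 * x])

def curl_curl(x, y):
    # scalar curl u = y^2 - x^2  =>  curl(curl u) = (d_y, -d_x) = (2y, 2x)
    return np.array([2.0 * y, 2.0 * x])

# Verify laplacian = grad_div - curl_curl at a few sample points.
pts = [(0.3, 0.7), (-1.2, 0.5), (2.0, -0.4)]
ok = all(np.allclose(laplacian(x, y), grad_div(x, y) - curl_curl(x, y))
         for (x, y) in pts)
```

The identity holds pointwise, which is what justifies treating the two terms separately in the discrete setting.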

Q5. On some minor issues. (i) Simplex complex (l.63 p. 2) could be called simplicial complex instead. (ii) Since $\mathbf x$ is not used in the PDE, $\mathbf x \in \Omega$ in Equation 6 is confusing. Either using $\mathbf x$ in the PDE or writing "in $\Omega$" would be preferable.

Thanks for your correction. We have modified both in our manuscript. For (ii), it is indeed a bit confusing, though $\mathbf{x}$ is implicitly used in the fields.

评论

Thank you for the response.

Q2: Evaluation of the speed is not enough. We have to evaluate the speed-accuracy tradeoff. Because machine learning is not perfectly accurate, the comparison based only on computation time is not complete. If we are allowed to reduce accuracy, we can reduce spatial resolution or increase the convergence threshold on the classical solvers.

That's why I recommend evaluating the speed-accuracy tradeoff both for machine learning and classical solvers. I expect to see a fair comparison, similar to Figure 4 of [Horie and Mitsume, Physics-Embedded Neural Networks: Graph Neural PDE Solvers with Mixed Boundary Conditions, NeurIPS 2022], as we are aware that machine learning methods often struggle to improve the speed-accuracy tradeoff compared to classical solvers in a fair manner.

Q3 I see that the divergence-free loss term is important for the setting in the paper. But I guess we can achieve (at least mathematically) perfect divergence-free when we use the vector potential. Why is the vector potential not used for the experiments?

评论

Thanks for your further explanation.

Q6. Evaluation of the speed is not enough. We have to evaluate the speed-accuracy tradeoff. Because machine learning is not perfectly accurate, the comparison based only on computation time is not complete. If we are allowed to reduce accuracy, we can reduce spatial resolution or increase the convergence threshold on the classical solvers.

Thanks for your suggestion. It is indeed important to evaluate the speed-accuracy tradeoff of different methods. Due to the limited time of the rebuttal period, however, we cannot offer comprehensive results by varying many factors such as grid resolution and model hyperparameters. Here we only choose some competitive baselines and vary model hyperparameters as [1] did. More results will be added in the next version of our manuscript.

By changing the hyperparameters of each model, we evaluate its computation overhead by recording the computation time on one CPU core, following [1]. It can be seen that this trade-off is more complicated than in numerical methods: more computation overhead (lower speed) does not always bring faster convergence and higher accuracy, since performance is also constrained by dataset size, the sparsity of model parameters, overfitting, and mesh regularity (some of which also appear in numerical methods).

This phenomenon on the 2D magnetostatics benchmark (see the tables below) also aligns with Fig. 4 in [1], where data points on the MSE-time plane are not necessarily strongly correlated (PENN and OpenFOAM are the most salient examples). Also, as in Table 5 of [1] (rows 2 and 4; rows 3 and 5), more computation can sometimes bring a performance drop and training difficulty. In summary, under finite data and other constraints, neural architectures exhibit distinct sweet spots of computational budget; in the tradeoff against accuracy, speed is a key factor but not the only one.

Tab 1: Models with different computation cost on the 2D magnetostatics benchmark.

| Model | Configuration | Computation Time (s) | Test Loss |
| --- | --- | --- | --- |
| GNOT | 32 | 1.60 | 6.128 |
| GNOT | 64 | 2.42 | 3.362 |
| GNOT | 128 | 4.75 | 2.406 |
| GNOT | 256 | 12.32 | 4.189 |
| DEC-HOGNN (Ours) | 32 | 3.58 | 2.959 |
| DEC-HOGNN (Ours) | 64 | 5.78 | 2.831 |
| DEC-HOGNN (Ours) | 128 | 11.68 | 1.573 |
| DEC-HOGNN (Ours) | 192 | 18.82 | 2.437 |
| Transolver | (128, 8, 32) | 2.66 | 3.312 |
| Transolver | (256, 8, 32) | 4.59 | 2.829 |
| Transolver | (512, 8, 32) | 10.11 | 3.888 |
| Transolver | (512, 16, 64) | 10.93 | 3.758 |
| GalerkinType | 32 | 2.99 | 2.649 |
| GalerkinType | 48 | 5.36 | 2.899 |
| GalerkinType | 64 | 11.81 | 1.932 |
| GalerkinType | 128 | 62.77 | 2.492 |

Tab 2: Best performance with different levels of computation.

| Time (s) | [0, 5) | [5, 10) | [10, 15) | [15, ∞) |
| --- | --- | --- | --- | --- |
| Best Model | GNOT (128) | GalerkinType (64) | DEC-HOGNN (128) | DEC-HOGNN (192) |
| Loss | 2.406 | 1.932 | 1.573 | 2.437 |

Rmk. The configurations of GNOT, DEC-HOGNN and GalerkinType only specify the latent-space dimension. As for Transolver, $(x, y, z)$ means it has $x$-dimensional hidden features, $y$ heads and $z$ slices.
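The single-core timing protocol described above can be sketched as follows; the dummy two-layer model stands in for the actual networks, and all names here are illustrative rather than the paper's code:

```python
import os, time
os.environ["OMP_NUM_THREADS"] = "1"   # pin BLAS to one core, before numpy import
import numpy as np

def dummy_forward(x, w1, w2):
    # Stand-in for a neural-operator forward pass: two dense layers with ReLU.
    return np.maximum(x @ w1, 0.0) @ w2

rng = np.random.default_rng(0)
x = rng.standard_normal((1024, 128))
w1 = rng.standard_normal((128, 128))
w2 = rng.standard_normal((128, 3))

dummy_forward(x, w1, w2)              # warm-up run, excluded from timing
t0 = time.perf_counter()
reps = 10                             # average over several repetitions
for _ in range(reps):
    dummy_forward(x, w1, w2)
elapsed = (time.perf_counter() - t0) / reps
```

The warm-up run and averaging over repetitions avoid measuring one-time allocation or caching effects rather than steady-state inference cost.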

Ref.

[1] Horie and Mitsume. Physics-Embedded Neural Networks: Graph Neural PDE Solvers with Mixed Boundary Conditions.

 Q7. Why is the vector potential not used for the experiments?

By introducing a vector potential $\mathbf A$, one can indeed guarantee the divergence-free condition. However, this has several drawbacks.

  • First, one must estimate its curl, since $\mathbf B=\nabla\times\mathbf A$. But an accurate computation is infeasible because only finite observations at the graph vertices are available. One may obtain an approximation via interpolation-based methods such as reproducing-kernel Hilbert spaces or via machine-learning approaches, but either way, the divergence-free condition is still violated.
  • Second, if we directly use a network to represent this potential (that is, take a 3D coordinate as input and output the potential $\mathbf A$), the notorious issues of PINNs arise:
    • It is usually hard to converge, since higher-order derivative behaviors are harder to control during learning.
    • In our benchmarks, the media are different in different parts of the domain, indicating that such a potential is not continuous around these interfaces.
    • Irregularities of this potential can also arise in some scenarios (e.g., around a current-carrying thin wire).
    • Furthermore, such a vector potential is not unique unless restricted by a gauge, such as the divergence-free Coulomb gauge. Whether this extra degree of freedom gives rise to learning difficulties is unknown.

In summary, by introducing integrated forms, our proposed approach avoids directly handling higher-order derivatives of quantities like the vector potential, which behave irregularly and are usually tough to learn.
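For completeness, in a purely discrete setting $\mathbf B = \nabla\times\mathbf A$ is divergence-free by construction, because applying the coboundary twice gives zero ($d \circ d = 0$). A toy check of our own on a single cube: assign arbitrary values to the 12 edges as a discrete $\mathbf A$, take face circulations as $\mathbf B$, and the net flux out of the closed cell cancels, since every edge is traversed twice with opposite signs:

```python
import numpy as np

rng = np.random.default_rng(0)

# Six faces of a unit cube (vertex i has coords (i&1, (i>>1)&1, (i>>2)&1)),
# each listed as a vertex loop oriented so its normal points outward.
faces = [
    (0, 2, 3, 1),  # z = 0, outward normal -z
    (4, 5, 7, 6),  # z = 1, outward normal +z
    (0, 1, 5, 4),  # y = 0, outward normal -y
    (2, 6, 7, 3),  # y = 1, outward normal +y
    (0, 4, 6, 2),  # x = 0, outward normal -x
    (1, 3, 7, 5),  # x = 1, outward normal +x
]

# Discrete vector potential A: one arbitrary value per undirected edge.
edges = sorted({tuple(sorted((f[i], f[(i + 1) % 4]))) for f in faces
                for i in range(4)})
A = {e: rng.standard_normal() for e in edges}

def circulation(face):
    # Discrete curl: oriented sum of A over the face boundary (B on the face).
    total = 0.0
    for i in range(4):
        a, b = face[i], face[(i + 1) % 4]
        total += A[(a, b)] if a < b else -A[(b, a)]
    return total

# Net flux of B out of the cube: zero up to roundoff, since d(dA) = 0 --
# every edge contributes +A and -A via its two adjacent faces.
net_flux = sum(circulation(f) for f in faces)
```

This is exactly the mechanism that makes the discrete divergence-free property gauge-independent: it holds for any edge values of $\mathbf A$, not just for carefully chosen ones.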
评论

Thank you for the answer. I see the complexity behind the vector potential.

Although I agree that the time for rebuttal is quite limiting, I believe the authors should clearly present the benefit of using machine learning over classical solvers (and other machine learning models, which is addressed in the last reply) in a quantitative way. Therefore, I will keep my score as is.

Nevertheless, I acknowledge the mathematical solidity of the work.

最终决定

The paper presents a GNN framework for solving partial differential equations that incorporates higher-order interactions based on discrete and finite element exterior calculus. The method outperforms existing neural operators and provides theoretical guarantees of universal approximation for certain boundary value problem classes. Reviewers noted that the methodology is technically sound and highly innovative, and appreciated the rigorous theoretical support. They also commented that the work opens up a new direction for learning-based PDE solvers, that the physics-informed loss is elegant, and they appreciated the improved accuracy compared to baselines. On the other hand, reviewers raised concerns about clarifying the relationship to existing methods based on exterior calculus or simplicial complexes, the clarity of the writing, the relatively limited number of experimental results, and the omission of certain baselines (e.g., FNO, CNO, SCOT).

The authors responded to the reviewer concerns with a rebuttal, which provided additional experimental evaluation, clarified why certain baselines were not included (e.g., due to their requirement for a regular grid), added an assessment of computational costs, and replied to a number of other technical questions. The reviewers agreed that all or most of their concerns were addressed by the rebuttal, and there was consensus to accept the paper.

I recommend the paper be accepted, and I also note that it stands out from other papers because of the thoroughness of the work, the technical novelty of the approach, and the potential for broad impact in the area of neural operators and physics-informed neural networks.