PEINR: A Physics-enhanced Implicit Neural Representation for High-Fidelity Flow Field Reconstruction
Abstract
Reviews and Discussion
The paper introduces two contributions:
- Multi-Fidelity Large-Scale CFD Dataset: A comprehensive dataset generated using the WENO solver, comprising five different configurations of 2D and 3D flows that include shock phenomena. Each configuration is provided at two resolutions: a low-resolution version (approximately 100×100) and a corresponding high-resolution version (approximately 1000×1000).
- Physics-Enhanced Implicit Neural Representation (PEINR) Framework: A novel neural architecture designed to learn the discrepancy between low-fidelity and high-fidelity simulation data. The PEINR incorporates several specialized components:
- Spatial and temporal encoding using RBF and PCA
- Transformer blocks
- Spectral blocks
The framework aims to bridge the gap between computationally efficient low-resolution simulations and more accurate high-resolution simulations by learning the difference between these fidelity levels.
Questions To Authors
N/A
Claims And Evidence
The dataset presented appears to be a valuable contribution to the community. However, I have several concerns regarding the PEINR model that require clarification:
Model Architecture and Training Approach
- Is a separate model trained for each flow case (as in traditional INR approaches), or is a single model trained to handle a family of cases (similar to super-resolution/operator learning paradigms)?
- What precisely serves as the model input? Are these spatial coordinates or low-resolution images?
Unclear Motivation
The fundamental motivation for using an INR to represent the difference between low- and high-fidelity simulations requires further explanation:
- If the goal is efficient storage of high-resolution data, why not directly downsample the high-resolution dataset and use a low-resolution solver instead?
- If the aim is super-resolution from low-fidelity to high-fidelity simulations, how does the model generalize across different simulation cases?
Model Components
The model architecture incorporates several components that appear somewhat arbitrarily selected. Additional ablation studies would significantly strengthen the paper by demonstrating the necessity and contribution of each component.
Methods And Evaluation Criteria
The paper proposes a new dataset, but I am not sure what the motivation is for reconstruction between low- and high-fidelity simulations.
Theoretical Claims
N/A
Experimental Design And Analysis
N/A
Supplementary Material
The dataset looks great.
Relation To Broader Scientific Literature
If the goal is to reconstruct a fluid field, it is very similar to physics-informed neural networks (PINNs) [1], which should be discussed. If the goal is super-resolution, standard super-resolution models as well as neural operators [2] should be discussed. [1] Karniadakis, George Em, et al. "Physics-informed machine learning." Nature Reviews Physics 3.6 (2021): 422-440. [2] Kovachki, Nikola, et al. "Neural operator: Learning maps between function spaces with applications to PDEs." Journal of Machine Learning Research 24.89 (2023): 1-97.
Essential References Not Discussed
Previous works on augmenting coarse grid solution: Pathak, Jaideep, et al. "Using machine learning to augment coarse-grid computational fluid dynamics simulations." arXiv preprint arXiv:2010.00072 (2020). Li, Zongyi, et al. "Physics-informed neural operator for learning partial differential equations." ACM/JMS Journal of Data Science 1.3 (2024): 1-27.
Other Strengths And Weaknesses
Overall, the method section could be improved. The inverse problem is never defined; please first define the problem setting and then discuss the method.
Are all components, such as the neighbor coordinates, RBF and PCA for time encoding, and the spectral block, necessary? Each component in PEINR may need an ablation study. If some components of the architecture are not so important, I suggest moving them to the appendix and using the space to define the problem.
Fig 2 is too small and it is hard to understand.
Other Comments Or Suggestions
I like the dataset, and I am happy to raise my score if my questions on the method can be addressed.
Thank you for your review. Below are our responses to the weaknesses and questions raised.
1.Response to comments on transferability and generalization of PEINR
We appreciate your query regarding the generalization of our model. Since the physical phenomena in the flow field vary greatly from problem to problem, there is no truly infinite generalization. As demonstrated in our experiments (Fig. 4(a)-(f)), the three types of physical problems differ drastically in both macroscopic and microscopic aspects. At present, our model is not universally generalizable to all problems. However, with a short period of transfer learning, we can quickly adapt the model to converge across different numerical schemes for the same problem, as shown in Figure 2 (https://drive.google.com/file/d/1Ghv7-cWKf2DQUEt-5O3JUdZfXVkIf3v-/view).
2.Response to comments on the input of PEINR
Our method PEINR is based on implicit neural representation, which takes spatial and temporal coordinates as inputs, not images, as described in Sec. 1, Line 41.
3.Response to comments on the motivation of using INR
We appreciate your inquiry into the rationale for our high-fidelity reconstruction method compared to simple downsampling. Our method, PEINR, aims to reconstruct high-fidelity flow fields from low-fidelity data while simultaneously enhancing grid resolution and numerical accuracy. This approach not only enables efficient storage of high-fidelity data but also alleviates the significant computational resources required for high-fidelity simulations.
While the reviewer suggests directly downsampling high-resolution data and using a low-resolution solver, this approach is problematic because flow fields, unlike images, have inherent physical meanings. Such downsampling would irreversibly lose critical physical information, and the truncation errors of a low-resolution solver would accumulate and propagate over time, leading to deviations from the true solution in long-term predictions.
In contrast, PEINR only stores low-fidelity data and reconstructs high-fidelity details on demand, capturing small-scale flow structures without explicitly solving them. Moreover, since PEINR takes spatiotemporal coordinates as input, it can generate local flow regions on demand rather than the entire global field, significantly reducing computational overhead.
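The following is a minimal, self-contained sketch of this idea: a coordinate-based INR trained on the LF-HF discrepancy and then queried only at the coordinates of interest. The network size, the synthetic LF/HF values, and the training loop are illustrative assumptions, not the actual PEINR architecture.

```python
# Minimal sketch (not the actual PEINR architecture): a coordinate-based INR
# that learns the LF-to-HF error field. The network size, synthetic data, and
# training loop below are illustrative assumptions.
import torch
import torch.nn as nn

class ErrorFieldINR(nn.Module):
    def __init__(self, in_dim=3, hidden=128, out_dim=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, coords):          # coords: (N, 3) = (x, y, t)
        return self.net(coords)         # predicted LF-to-HF correction

# Synthetic stand-ins for LF and HF values sampled at the same coordinates
# (in practice the LF field would be interpolated onto the HF query points).
coords = torch.rand(4096, 3)                                  # (x, y, t)
u_lf = torch.sin(4 * torch.pi * coords[:, :1])                # smooth LF field
u_hf = u_lf + 0.1 * torch.sin(32 * torch.pi * coords[:, :1])  # HF adds fine scales

model = ErrorFieldINR()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):
    opt.zero_grad()
    delta = model(coords)               # learn the discrepancy, not u_hf itself
    loss = ((u_lf + delta - u_hf) ** 2).mean()
    loss.backward()
    opt.step()

# At inference time the reconstruction is u_lf + model([x, y, t]), queried only
# at the coordinates of interest (e.g., a local region or a single time slice).
```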
4.Response to comments on the relation to broader scientific literature
Our method, PEINR, aims to reconstruct high-fidelity flow fields from low-fidelity data while simultaneously improving grid resolution and numerical accuracy, rather than solely performing reconstruction or super-resolution.
Nevertheless, we appreciate the opportunity to discuss these works. Our CFD data is obtained from discrete numerical solvers, which inherently introduce numerical dissipation and truncation errors, deviating from the governing Navier-Stokes equations. Forcing a PINN to simultaneously fit the data while satisfying residual equations may lead to non-convergence or unphysical solutions. Instead, our approach combines data-driven learning with physics constraints to ensure physical consistency.
Regarding super-resolution, neural operators leverage function space mappings and efficient spectral-domain computations to overcome the fixed-grid limitations of traditional methods. However, when high-frequency energy decays rapidly, neural operators may struggle to retain fine details compared to INR. Additionally, INR supports on-demand generation of localized regions, such as boundary layers or vortex structures, without requiring full-field computation, making it more adaptable and efficient.
5.Response to comments on missing citations
Thank you for your valuable suggestions. We will incorporate discussions of these references in the revised version.
6.Response to comments on inverse problems definition
In our work, the inverse problem refers to reconstructing high-fidelity physical fields from sparse data[1]. We will clarify this definition in the revised version before discussing our proposed method.
[1] Chen, Yang, et al. "HOIN: High-Order Implicit Neural Representations." arXiv preprint arXiv:2404.14674 (2024).
7.Response to comments on ablation study
The ablation study of the neighbor coordinates, the RBF and PCA time encoding, and the spectral block is shown in Table 2. PEINR1 removes both the spectral block and the physical encoding, while PEINR2 removes only the physical encoding. As shown in Table 2, PEINR outperforms PEINR2 on the DD metric in all cases, which indicates that our physical encoding method preserves more of the physical characteristics. The superiority of PEINR2 compared with PEINR1 highlights that the added spectral block also helps the model perform better.
8.Response to comments on Figure 2
Thanks for your suggestion. We have enlarged Figure 2 to make it easier to read, as shown in Figure 1 of the link (https://drive.google.com/file/d/1Ghv7-cWKf2DQUEt-5O3JUdZfXVkIf3v-/view).
Thank you to the authors for clarifying that PEINR is based on implicit neural representation, taking spatial and temporal coordinates as inputs. As with other INR methods, PEINR requires case-by-case training and generally cannot generalize to other scenarios.
However, the motivation remains unclear. In the rebuttal, the authors mention two potential scenarios:
This approach not only enables efficient storage of high-fidelity data but also alleviates the significant computational resources required for high-fidelity simulations.
These scenarios should be addressed separately:
- For efficient storage: If high-fidelity simulation data is already available, it's unclear why re-running low-fidelity simulations and learning the difference is necessary. Direct subsampling of high-fidelity data and reconstruction via INR would be more straightforward. Additionally, no experiments demonstrate storage efficiency gains.
- For computational cost reduction: When high-fidelity simulations are unknown, it's unclear how to train the INR to capture discrepancies between low and high-fidelity simulations without access to the high-fidelity data. In this context, super-resolution models such as those proposed by Pathak et al. in "Using machine learning to augment coarse-grid computational fluid dynamics simulations" would be more appropriate, as they can generalize to unseen cases.
While I appreciate the dataset contribution, the problem formulation appears artificial. The authors need to better justify their approach of using INR to bridge the gap between low and high-fidelity simulations, given the access to the high-fidelity simulations.
Thank you for your insightful question. The motivation of PEINR lies in reducing computational costs and storage requirements while enabling efficient, high-fidelity flow reconstruction. In many practical engineering analyses, limited storage resources make it impractical to save high-fidelity data for all time steps, especially for long-term simulations. Our method leverages implicit neural representations (INR) to learn the discrepancy between low-fidelity (LF) and HF fields, allowing for compact storage and efficient reconstruction for both available and unavailable time steps.
We provide extrapolation and interpolation perspectives to evaluate the model performance. Extrapolation datasets measure the generalizability of our model at future time steps beyond the observed range, while interpolation datasets assess the model performance on intermediate time steps within the training temporal domain. In the 2D experiments, by training on early time steps (e.g., 460-480) using LF-HF differences, our model can predict HF corrections at future steps (e.g., 481-500) without directly running costly HF simulations, thereby reducing storage requirements. The 3D experiments demonstrate the interpolation ability of our method. All the experiments demonstrate that our method is capable of reconstructing the flow field at unavailable time steps. This implies that our method does not need to store all the high-fidelity flow field data and can infer the uncalculated flow field data.
In addition, you may also wonder why we learn the error field instead of directly learning the HF field. The reasons are mainly twofold: (1) Reduced learning complexity and improved accuracy. Directly learning the HF flow requires the network to fit a highly complex, high-frequency physical field, which is particularly challenging for multi-scale turbulent flows. Learning the error field allows the model to focus on refining LF solutions, making training more efficient. (2) Better generalization and computational efficiency. The error field typically exhibits a smaller dynamic range and smoother variations than the original HF field, making it easier for neural networks to learn and leading to improved efficiency and generalization.
PEINR can generate HF data for future time steps without the need for costly HF simulations, significantly reducing computational costs. Moreover, the predicted HF data can be used as the initial field for the HF solver, thereby accelerating its convergence. Additionally, we believe there may be some misunderstandings regarding Pathak et al.'s ML-PDE hybrid approach. First, it is a supervised learning method that still relies on high-resolution data to correct the model error in a low-resolution simulation. Second, its PDE constraints are problem-specific and cannot generalize to unseen cases. Lastly, our PEINR approach is solver-agnostic and can be applied to arbitrary meshes, whereas Pathak et al.'s CNN-based method is limited to Cartesian grids.
Recent advancements inspired by super-resolution (SR) techniques have introduced deep-learning methodologies for recovering high-fidelity simulations from corresponding low-fidelity counterparts. Existing approaches show promise in overcoming computational and storage challenges to generate high-fidelity simulations. Despite recent advancements, current research faces a significant challenge: the absence of standardized benchmark datasets for comparing and validating the performance of different SR approaches.
To address this crucial gap, we introduce HFR-Bench, an innovative benchmark dataset that fills the need for standardized evaluation and comparison of SR methods within scientific domains. Incorporating actual coarse data, as we do in HFR-Bench, aligns more closely with real-world challenges and contributes to the advancement of high-fidelity reconstruction methods. Additionally, our method addresses the limitations of existing INR approaches, such as weak spatiotemporal representation and poor adaptability. Once the mapping between LF and HF simulations is learned, the trained model can be applied to different field variables (e.g., pressure, velocity, density) or varying numerical precision levels (as shown in Fig. 2 of the link: https://drive.google.com/file/d/1Ghv7-cWKf2DQUEt-5O3JUdZfXVkIf3v-/view). We acknowledge that PEINR requires retraining for different flow problems and plan to explore pretraining strategies or meta-learning in the future to accelerate convergence on new cases.
We will clarify our motivation in the revised version to better reflect its relevance and significance.
This paper introduces HFR-Bench, a large-scale CFD dataset, with 33,600 unsteady 2D and 3D vector fields across various grid resolutions and numerical precisions, providing a benchmark for flow field reconstruction. It also proposes PEINR, a novel physics-enhanced INR model that improves flow field reconstruction by enhancing numerical precision and grid resolution using physical encoding and a transformer-based spatiotemporal fuser (TransSTF). Experiments show that PEINR outperforms existing methods like NIF and CoordNet.
Questions To Authors
- Why are training times in Table 1 shorter than the inference times?
- How does PEINR perform with long time series? There are 500 timesteps in the experimental data, but only a few timesteps are used.
- Are the quantitative results in Table 2 averages or values from a single timestep (as described in the Implementation Details)? The average results are more convincing.
- See weaknesses for other questions.
Claims And Evidence
Yes
Methods And Evaluation Criteria
Yes.
Theoretical Claims
Yes.
Experimental Design And Analysis
Yes.
Supplementary Material
Yes.
Relation To Broader Scientific Literature
The main contribution is a novel large-scale CFD dataset, providing a benchmark for flow field reconstruction.
Essential References Not Discussed
Yes. Line 246, missing citation because the ResuMLP architecture is proposed by CoordNet.
Other Strengths And Weaknesses
Strengths:
- This paper introduces a novel large-scale CFD dataset, providing a benchmark for flow field reconstruction.
- The experiments are extensive and demonstrate the advantage of PEINR.
Weaknesses:
- PEINR seems unable to perform temporal interpolation with the proposed temporal encoding, or else it requires retraining.
- The RBF-enhanced NTK is confusing. How is it implemented in detail? For example, Line 235 says “By incorporating the RBF kernel”; how does the MLP combine with the RBF kernel?
- Line 246, missing citation because the ResuMLP architecture is proposed by CoordNet.
Other Comments Or Suggestions
- Line 64: “can … enhancing” -> “can … enhance”
- Line 89: “concentrate” -> “concentrates”
- Figure 3 (c)(d): the σ values are not labeled on the curves.
- Line 328: “9 points” -> “8 points”?
- Line 369: “lower values indicating better performance” is redundant.
Thank you for your review. Below are our responses to the weaknesses and questions raised.
1.Response to comments on the missing citation CoordNet(Line 246)
Thanks for your suggestion; we will correct this in our revised paper. ResuMLP was proposed by CoordNet and combines residual connections with MLPs, aiming to alleviate the vanishing-gradient problem in deep MLP training and enhance the model's representational capacity.
2.Response to comments on temporal interpolation
By utilizing an INR framework, a neural network learns the implicit expression of flow field data, mapping input spatiotemporal positions to corresponding attribute values. This enables continuous modeling and representation of the flow field, allowing for time interpolation without the need for retraining.
3.Response to comments on the RBF-enhanced NTK
We appreciate the reviewer’s question regarding the integration of the RBF kernel with the MLP in our NTK framework. Below is a detailed explanation of the implementation.
The RBF kernel achieves translation invariance by measuring similarity between time points (based on relative distances rather than absolute positions), which is crucial for temporal modeling. In our implementation, the original 1D temporal input is first mapped to a high-dimensional space via the RBF kernel to generate a kernel matrix. Kernel PCA is then applied to extract principal components, yielding low-dimensional temporal features that serve as one input to the MLP.
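As an illustration of this pipeline, here is a minimal sketch using scikit-learn's KernelPCA; the normalization of timestamps to [0, 1], the kernel width, and the number of components are assumptions for illustration, not the exact settings used in the paper.

```python
# Sketch of the temporal encoding pipeline described above: 1D timestamps ->
# RBF kernel similarities -> Kernel PCA features. Normalizing time to [0, 1],
# sigma = 0.1, and n_components = 8 are illustrative assumptions.
import numpy as np
from sklearn.decomposition import KernelPCA

timestamps = np.linspace(0.0, 1.0, 40).reshape(-1, 1)   # normalized time steps
sigma = 0.1
gamma = 1.0 / (2.0 * sigma ** 2)   # scikit-learn uses exp(-gamma * |t - t'|^2)

# The RBF kernel depends only on relative distances |t - t'|, which gives the
# translation invariance discussed above.
kpca = KernelPCA(n_components=8, kernel="rbf", gamma=gamma)
temporal_features = kpca.fit_transform(timestamps)       # shape (40, 8)

# These low-dimensional temporal features are concatenated with the spatial
# coordinates and fed to the MLP.
print(temporal_features.shape)
```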
The NTK of a standard MLP inherently lacks translation invariance. However, the high-dimensional features generated by the RBF kernel enhance the network’s sensitivity to input variations. This modification alters the NTK’s eigenvalues (and spectral distribution), ultimately improving the network’s ability to capture high-frequency information and its generalization performance.
The NTK can be directly computed as $\Theta(x, x') = \nabla_\theta f(x;\theta)^{\top} \nabla_\theta f(x';\theta)$, where $f(\cdot;\theta)$ represents the output of the MLP and $\theta$ represents the parameters of the MLP. In our framework, the MLP's input combines spatial coordinates $\mathbf{x}$ with RBF-enhanced temporal features $\tilde{t}$, enabling the model to simultaneously resolve high-frequency spatiotemporal details and long-term dynamics; our MLP input takes the form $[\mathbf{x}, \tilde{t}]$. By incorporating the RBF kernel's stationary properties, the RBF-enhanced NTK becomes $\Theta_{\mathrm{RBF}} \approx c\, K_{\mathrm{RBF}}$, where $K_{\mathrm{RBF}}$ is the Gaussian RBF kernel matrix over the temporal inputs and $c$ is a constant arising from the weight initialization. The RBF-enhanced NTK is more stable during training and enables PEINR to simultaneously model high-frequency details and long-term dynamics, significantly improving the accuracy of flow field reconstruction.
4.Response to comments on syntax errors
We sincerely appreciate the reviewer's careful reading and valuable feedback on grammar and phrasing. We have thoroughly revised the manuscript to address these issues.
5.Response to comments on training time
We appreciate the reviewer's attention to the experimental details. To clarify, the "Training Time" reported in Table 1 refers specifically to the time required per training epoch (not the total training time). We acknowledge that this distinction was not explicitly stated in the original manuscript and will add a clear note in the revised version to avoid any confusion.
6.Response to comments on training timesteps
In this paper, the original CFD data were saved at every 20 physical computation steps, so the 20 steps tested actually correspond to the evolution of 400 computational steps. During this phase, the key physical processes have fully evolved, and this experimental configuration has fully validated the long-term stability of the proposed method.
7.Response to comments on average results of our experiments
We repeated this process 3-5 times and calculated the error bars for all models to compare their robustness on high-fidelity flow field enhancement, as shown in Table 1 (https://drive.google.com/file/d/1Ghv7-cWKf2DQUEt-5O3JUdZfXVkIf3v-/view). It can be seen that the overall performance of the proposed PEINR is better than that of the other state-of-the-art baseline models, indicating the effectiveness of our approach.
Thanks for the authors' rebuttal. Most of my concerns have been addressed. However, there are still some issues that I am unclear about.
Temporal interpolation
According to the definition of Temporal Encoding, the RBF kernel for a given timestamp sequence, such as {1, 2, ..., 40}, needs to be computed first, and then the time features are input into the subsequent network. If this timestamp sequence is altered, for example to {1, 1.5, 2, ..., 39.5, 40}, a new RBF kernel will be generated, which not only changes the time feature for each time t, but also alters the length of the time features, thereby affecting their input into the subsequent network. How should this issue be handled?
RBF-enhanced NTK
Thank you for your explanation of the RBF-enhanced NTK calculation details. However, how is the RBF-enhanced NTK expression obtained from the standard NTK? Could you please elaborate on this in more detail?
Averages results
I'm sorry for not clearly expressing the meaning of "average results" in the Questions section. What I am concerned about is whether the results in Table 2 represent the average results over multiple test timestamps or the results from a single timestamp. The former would be more convincing.
We thank the reviewer for raising this critical concern.
(1) Temporal interpolation.
Regarding the potential temporal feature variations induced by interpolation, we have carefully addressed this in our design. Specifically, PEINR precomputes and fixes a set of RBF bases over the entire temporal domain during the initial training phase. Consequently, even when new timestamps (e.g., interpolated ones) are introduced, the corresponding temporal features can still be derived using the same pre-trained bases without retraining or adjusting the network architecture. Figure 3 (https://drive.google.com/file/d/1Ghv7-cWKf2DQUEt-5O3JUdZfXVkIf3v-/view) shows the visualization of the 3D feature trajectory obtained by encoding time steps using fixed Radial Basis Functions (RBFs). Each point corresponds to a time feature generated via the same fixed basis functions, ensuring a stable distribution and a fixed feature dimension. The blue markers highlight interpolated time steps, which are encoded using the same bases without retraining or modifying the network architecture. This demonstrates the ability of the proposed encoding to flexibly handle arbitrary continuous time inputs.
This design ensures consistency in both the dimensionality and statistical properties of temporal features. The network inherently handles interpolated temporal inputs, thereby avoiding issues such as input length variation or feature distribution drift.
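To illustrate why the feature width stays fixed, the short sketch below encodes both the original and the interpolated timestamps against the same precomputed RBF centers; the number of centers and the kernel width are assumptions, and the Kernel PCA projection (which would also be fixed after fitting) is omitted for brevity.

```python
# Sketch of the fixed-basis property: both training and interpolated timestamps
# are encoded against the same precomputed RBF centers, so the feature width
# never changes. The number of centers and sigma are illustrative assumptions.
import numpy as np

centers = np.linspace(0.0, 1.0, 16)   # RBF bases fixed over the temporal domain
sigma = 0.1

def encode_time(t):
    """Map scalar timestamps to a fixed-width RBF feature vector."""
    t = np.atleast_1d(np.asarray(t, dtype=float))[:, None]              # (N, 1)
    return np.exp(-((t - centers[None, :]) ** 2) / (2.0 * sigma ** 2))  # (N, 16)

train_t = np.linspace(0.0, 1.0, 40)            # original training time steps
interp_t = 0.5 * (train_t[:-1] + train_t[1:])  # new, in-between time steps

print(encode_time(train_t).shape)    # (40, 16)
print(encode_time(interp_t).shape)   # (39, 16): same width, no retraining needed
```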
(2) RBF-enhanced NTK.
Regarding the formulation $\Theta_{\mathrm{RBF}} \approx c\, K_{\mathrm{RBF}}$, we provide a more detailed explanation.
In a standard MLP, the Neural Tangent Kernel (NTK) at initialization is approximated as the inner product of the input features multiplied by a constant determined by the variance of the weight initialization: $\Theta \approx c\, X X^{\top}$, where $c$ primarily arises from the cumulative effect of weight variances across network layers during initialization and $X$ is the original input feature matrix. For our RBF-enhanced NTK, the input features are replaced by the feature relationship matrix $K_{\mathrm{RBF}}$ obtained after transformation via the RBF kernel function. The RBF kernel better captures nonlinear similarities between inputs. Consequently, the NTK in the initial state naturally generalizes from $\Theta \approx c\, X X^{\top}$ to $\Theta_{\mathrm{RBF}} \approx c\, K_{\mathrm{RBF}}$. The kernel matrix $K_{\mathrm{RBF}}$ replaces the original input inner product, enabling higher-order temporal relationship modeling (e.g., capturing multiscale physics in fluid dynamics).
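For readability, the two initial-state approximations referred to above can be written out explicitly in our (assumed) notation, where $c$ collects the layer-wise initialization-variance factors and $\sigma$ is the RBF width:

```latex
\Theta_{\mathrm{MLP}}(x_i, x_j) \;\approx\; c\, x_i^{\top} x_j,
\qquad
\Theta_{\mathrm{RBF}}(t_i, t_j) \;\approx\; c\, K_{\mathrm{RBF}}(t_i, t_j)
\;=\; c\, \exp\!\left(-\frac{\lVert t_i - t_j \rVert^{2}}{2\sigma^{2}}\right)
```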
This adjustment maintains the theoretical structure of NTK while enabling it to better handle complex, nonlinear input patterns. We will include this clarification in the revised manuscript to improve the transparency of our method. Thank you again for your valuable feedback.
(3) Averages results.
Our experimental design has clear physical significance: the training data is taken from steps 460–480, and the model predicts results at step 500. Since the original CFD data is saved every 20 computational steps, this 20-step test interval corresponds to an evolution of 400 computational steps. At this stage, key physical processes (e.g., turbulence, shock waves) are fully developed, and this setup is optimal for validating the long-term stability of the method.
Table 2 presents the prediction results at step 500, which is intentionally selected because it is one of the most challenging timestamps for accurate prediction. We believe this is a rigorous way to validate model robustness: testing multiple nearby timesteps and averaging the results would dilute the difficulty of predicting the most evolved, complex flow structures, and thus would not accurately reflect the true performance under the most demanding conditions. Therefore, we report the results at this critical single timestep.
The paper introduces PEINR (Physics-enhanced Implicit Neural Representation), a framework designed for high-fidelity flow field reconstruction. The authors highlight the limitations of current implicit neural representation (INR) methods in handling complex spatiotemporal dynamics and accurately capturing fine-scale flow structures, especially when applied to computational fluid dynamics (CFD). To overcome these challenges, PEINR integrates physical encoding techniques and a transformer-based spatiotemporal fuser (TransSTF). The physical encoding decouples spatial and temporal components, improving resolution and accuracy. TransSTF captures long-range temporal dependencies using a multi-head attention mechanism, enhancing the model's ability to reconstruct high-fidelity flow fields. The authors introduce a large-scale dataset, HFR-Bench, which includes over 33,600 unsteady 2D and 3D flow fields across various grid-resolutions and numerical precisions. Experimental results demonstrate that PEINR significantly outperforms existing INR-based methods in both qualitative and quantitative evaluations, achieving superior reconstruction quality in flow simulations.
Questions To Authors
- Please clarify the localized spatial encoding and how the information of the neighbors is encoded.
- As claimed as one of the major contributions, will the dataset be publicly available?
Claims And Evidence
The paper is self-evident with its claims.
Methods And Evaluation Criteria
Yes.
Theoretical Claims
N/A
Experimental Design And Analysis
Experimental designs and analyses are valid.
Supplementary Material
Code is provided but not reviewed.
Relation To Broader Scientific Literature
The method and dataset are valuable for physics and simulation applications, especially CFD / fluid dynamics.
Essential References Not Discussed
de Vito et al. (2024) Implicit Neural Representation For Accurate CFD Flow Field Prediction. Du et al. (2024) Conditional neural field latent diffusion model for generating spatiotemporal turbulence.
Other Strengths And Weaknesses
Strengths:
- The introduction of HFR-Bench, a large-scale dataset containing 5.4 TB of flow field data, provides a valuable resource for benchmarking and advancing research in flow field reconstruction.
- Innovative Physical Encoding Techniques: The use of Gaussian coordinate encoding for temporal information and localized encoding for spatial coordinates enhances the model's ability to capture fine-scale structures and spatiotemporal dynamics.
- Architecture: The introduction of the TransSTF block, leveraging multi-head attention, allows PEINR to model long-range temporal dependencies effectively, which is crucial for capturing the full complexity of fluid dynamics.
Weaknesses:
- Some important method details are not clear enough for me to fully understand the method, e.g. the localized spatial encoding.
- The method assumes grid independence for flow field attributes across different mesh resolutions, which, as acknowledged in the paper, may not hold true in all real-world simulations. This assumption could limit the model's application in certain practical scenarios where grid dependence is significant.
Other Comments Or Suggestions
I would expect an explanation of why PEINR performs worse than CoordNet in terms of PSNR on the Riemann problem.
Thank you for your review. Below are our responses to the weaknesses and questions raised.
1.Response to comments on missing citations
[1] de Vito et al. (2024) Implicit Neural Representation For Accurate CFD Flow Field Prediction.
[2] Du et al. (2024) Conditional neural field latent diffusion model for generating spatiotemporal turbulence.
We will add the citations in the revised manuscript to ensure completeness. The following are our descriptions: Recent work by de Vito et al.[1] in implicit neural representation for accurate CFD flow field prediction demonstrates the potential of INR architectures for steady-state CFD simulations. However, their method primarily focuses on single-resolution flow fields with stationary boundary conditions and does not address the critical challenges of reconstructing unsteady, multi-resolution flow fields with spatiotemporal coupling – a key limitation our PEINR framework resolves through physical encoding and TransSTF. We acknowledge the CoNFiLD model [2] for synthesizing turbulence via neural field latent diffusion, whereas PEINR uniquely prioritizes physics-constrained reconstruction of multi-resolution flow fields through physical encoding and spatiotemporal attention mechanisms.
2.Response to comments on spatial localization encoding clarification
The spatial localization encoding in PEINR directly aligns with CFD's stencil-based computation paradigm, where numerical solutions inherently depend on local neighborhood interactions (e.g., finite volume discretization). By embedding this locality into our spatial localization encoding, PEINR explicitly leverages CFD's stencil computation logic to resolve fine-scale flow structures. In the spatial localization encoding, we expand the input spatial coordinates to include not only the original coordinates but also those of nearby neighboring points. For instance, in a 2D case, we consider the nearest 4 points (up, down, left, right), transforming the input from a single coordinate tensor of shape (batch_size, 2) to an augmented tensor of shape (batch_size, 5, 2). This design explicitly encodes the local spatial correlations required for solving the discretized Navier-Stokes equations, where each grid point's solution depends on its neighbors' physical quantities.
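A minimal sketch of this neighbor-coordinate expansion for the 2D case is given below; the grid spacing h and the function/tensor names are illustrative assumptions.

```python
# Sketch of the 2D localized spatial encoding: each query point is augmented
# with its 4 stencil neighbors (up, down, left, right). The grid spacing h and
# the function/tensor names are illustrative assumptions.
import torch

def localize(coords, h=0.01):
    """coords: (batch_size, 2) -> (batch_size, 5, 2), the query point first."""
    offsets = torch.tensor([
        [0.0, 0.0],   # the query point itself
        [0.0,  h],    # up
        [0.0, -h],    # down
        [-h, 0.0],    # left
        [ h, 0.0],    # right
    ], dtype=coords.dtype)
    return coords[:, None, :] + offsets[None, :, :]   # broadcast to (B, 5, 2)

coords = torch.rand(32, 2)
print(localize(coords).shape)   # torch.Size([32, 5, 2])
```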
3.Response to comments on clarification on grid-independence
We sincerely appreciate the reviewer's insightful feedback. We acknowledge that there might be a misunderstanding regarding our method's assumption of grid dependence/independence, which we aim to clarify in this response. In conventional implicit neural representations (INRs) for fluid super-resolution, it is typically assumed that the physical attributes (e.g., velocity, pressure) at a given spatial coordinate remain consistent across different grid resolutions (the grid independence assumption). However, as the reviewer rightly noted, this assumption often fails in real-world flow simulations, where grid resolution significantly affects the discretized field values due to numerical diffusion, truncation errors, and turbulence modeling. In contrast, our method explicitly addresses this limitation by abandoning the grid independence assumption. Instead, we propose a novel grid-conditioned implicit neural representation that encodes both spatiotemporal coordinates and grid-resolution-dependent features. This allows our model to learn how physical attributes vary with grid resolution, thereby capturing the inherent grid dependence observed in practical CFD simulations. In Figure 1, we illustrate that baseline INR methods fail to generalize across resolutions due to their grid independence assumption, while our method achieves consistent accuracy by accounting for grid dependence.
4.Response to comments on PSNR result of Riemann problem
We appreciate your observation. The observed PSNR difference stems from PEINR's design prioritizing physical accuracy over numerical optimization for discontinuous flows. Riemann problems involve strong discontinuities (shocks, contact surfaces) where PEINR's localized spatial encoding intentionally preserves sharp gradients, making it more sensitive to errors in these regions. While this approach leads to slightly lower PSNR (a mean-squared-error metric favoring smoothness), it avoids the physical distortions visible in CoordNet's results (Figure 4(c)(d)) where large errors near discontinuities appear. We will clarify this rationale in the revised manuscript.
5.Response to comments on our dataset availability
We confirm that HFR-Bench, our large-scale CFD dataset, will be publicly available to support further research. The dataset contains 33,600 unsteady 2D and 3D flow fields (totaling 5.4 TB), covering diverse grid resolutions and numerical precisions for high-fidelity flow reconstruction.
The rebuttal resolved my concerns and I would keep my original score to recommend acceptance.
This paper introduces PEINR, a physics-enhanced implicit neural representation framework for high-fidelity flow field reconstruction. The authors address three key limitations of existing INR methods: 1) invalid grid independence assumption, 2) temporal-spatial complexity disparity, and 3) spectral bias. The work contributes: 1) HFR-Bench: A 5.4TB CFD dataset (33,600 fields) with uniform/non-uniform meshes; 2) PEINR framework combining physical encoding and transformer-based fusion; 3) Novel temporal encoding via Gaussian RBF kernels and spatial stencil-aware encoding.
Questions To Authors
1 How does the localized encoding scale to 3D+time with 6D coordinates? In a 3D + time scenario, the input coordinate dimension is increased to 4D (x, y, z, t). If physical parameters or boundary conditions are further considered, it may rise to 6D. The paper does not verify such high-dimensional extendibility, nor does it analyze the trend of parameter increase in the encoding layer.
2 Could adaptive σ in Gaussian encoding improve temporal modeling? In this paper, the parameter σ in the Gaussian RBF kernel is set to a fixed value of 0.1 (Section 4), while the NTK analysis in Fig. 3c-d indicates that σ affects the spectral properties (a small σ captures high frequencies, while a large σ smooths out the spectrum). The paper does not explore dynamically adjusting σ to adapt to different time scales (such as transient versus steady-state flow), which might optimize multi-scale dynamics modeling.
3 Have you explored uncertainty quantification for reconstruction errors? In the field of CFD (Computational Fluid Dynamics), there is a high sensitivity to the credibility of errors (for example, in aviation safety assessments). However, the experiments only report mean metrics (such as PSNR and SSIM) without providing confidence intervals or analyses of error distributions. I would like the authors to discuss this question. But this does not detract from the value of the work itself.
4 What's the practical limit for numerical precision enhancement (β_max)?
5 How transferable is PEINR across different fluid regimes (e.g., turbulent vs laminar)?
Claims And Evidence
1 Grid independence assumption violation: Fig. 1 shows that the attribute values at the same location change as the grid resolution (in the row direction) and numerical accuracy (in the column direction) increase.
2 Nonlinear Temporal Encoding effectiveness: The analysis of NTK in Fig.3 demonstrates improved temporal consistency.
3 Spectral bias mitigation: Ablation study (PEINR1 vs. PEINR2 in Table 2) shows spectral blocks reduce the dissipation difference (DD).
Methods And Evaluation Criteria
Methods: The proposed method, for the first time, addresses both grid-resolution enhancement and numerical-precision enhancement simultaneously through a single model, breaking through the limitations of traditional methods that focus solely on a single objective.
Benchmark: The 5.4 TB of data proposed cover five classic flow problems (2D/3D, structured/unstructured grids), and for the first time provide a strict pairing of grid resolution (α = 1~64) and numerical precision (β = 1~7 orders), filling the gap in the CFD community's lack of standardized benchmarks.
Metrics: All metrics are explained in the supplementary material and are quite relevant to this issue.
Theoretical Claims
- NTK theory adaptation for temporal encoding shows translation invariance (Eqs. 2-3)
- The combination of the RBF kernel and Kernel PCA (Eqs. 4-5)
- Spectral blocks enable high-frequency capture (Table 2)
Experimental Design And Analysis
- The introduction of HFR-Bench provides a comprehensive benchmark, and the experiments isolate the effects of grid resolution (α) and numerical precision (β) by pairing low/high-fidelity fields with identical initial conditions and physical parameters. The testing on diverse flow regimes and the ablation study are sufficient. There are still some concerns about the experiments.
One of my concerns is that only one 3D case (the SV problem) is tested; what about the scalability to larger 3D domains (e.g., a turbulent boundary layer)? I may not know much about it, but hopefully there is a reasonable explanation. Another question is about the training-test temporal split: why only test on the final 20 steps for uniform meshes (the second column of L312 on page 6)? I think this may not fully stress long-term stability; would a more aggressive split (e.g., training on the first 50% of steps and testing on the latter 50%) better assess robustness? After searching the relevant literature, I found that in addition to regular and irregular grids, there are also grid-free methods (e.g., PINNs). The omission of this part may discount the value of the proposed method to a certain extent. I hope the authors can discuss this issue.
- The evaluation and the ablation study are convincing, and the proposed theories are well supported. While PEINR avoids the flawed grid-independence assumption, the paper does not quantify how mesh non-uniformity impacts performance, and there is no comparison to uniform-mesh performance under similar α/β.
Supplementary Material
I have reviewed the supplementary material for the explanations of the problems (e.g., RM, RT, FFS) for the 2D flow fields with uniform Cartesian mesh and the SV problem for the 3D flow field, as well as details about the definition of the evaluation metrics.
Relation To Broader Scientific Literature
N/A
Essential References Not Discussed
The following references should be discussed in the paper:
- MeshGraphNets (Learning Mesh-Based Simulation with Graph Networks, 2020) for irregular meshes
Other Strengths And Weaknesses
Strengths
- Comprehensive dataset with rigorous numerical basis
- Novel integration of spectral blocks with transformer
- Strong empirical results
Weaknesses
- Limited 3D validation (only 1 test case)
- Could there be a comparison with grid-free methods or some discussion on those grid-free methods?
- No comparison to uniform-mesh performance under similar α/β; I think such experiments would be essential for supporting the paper's theoretical insights.
Other Comments Or Suggestions
NA
Thank you for your review. Below are our responses to the weaknesses and questions raised.
1.Response to comments on the single 3D dataset
The SV case is a classical benchmark in CFD that incorporates both shock waves and vortical structures, which can effectively validate our method's capability in capturing multi-scale, strongly nonlinear physical phenomena. While turbulent boundary layers were not directly tested, they share fundamental flow characteristics and field structures with the SV case, thereby supporting the extensibility of our method. The present study focuses on the fundamental validation of the method using canonical flow problems rather than comprehensive engineering applications. In future work, we plan to develop more test cases to further verify the method's applicability across multiple scenarios.
2.Response to comments on the experimental setting
In this paper, the original CFD data were saved every 20 physical computation steps, so the 20 steps tested actually correspond to the evolution of 400 computational steps. We trained on 20 steps (460-480) and tested on 20 steps (480-500). During this phase, the key physical processes have fully evolved, and this experimental configuration has fully validated the long-term stability of the proposed method.
3.Response to comments on the difference from other methods (grid-free methods and PINNs)
In CFD, while regular/irregular grids rely on explicit geometric discretization, grid-free methods represent a novel numerical computation technique that can discretize the solution domain using nodes or particles, thereby eliminating the need for complex mesh generation processes.
PINNs are an extended application of implicit neural representations in scientific computing. In this study, CFD data are solved based on discrete meshes, which inherently introduce numerical dissipation and truncation errors, deviating from the Navier-Stokes equations governing fluid flow. Forcing PINNs to simultaneously fit this data and satisfy the equation residuals may lead to network convergence issues or generate non-physical solutions[1].
[1] Farea et al. Understanding physics-informed neural networks: techniques, applications, trends, and challenges.
4.Response to comments on the quantization of the impact of mesh non-uniformity
To clarify, our method avoids grid-resolution independence, not grid-type uniformity. While mesh non-uniformity can affect traditional solvers, PEINR’s implicit neural representation inherently decouples performance from both resolution and local cell shapes, as it operates on continuous coordinate inputs rather than discrete grid topology. Quantifying uniform-mesh comparisons would deviate from our core contribution.
5.Response to comments on missing citations
We appreciate the references you suggested, but we could only find the “MeshGraphNets” paper; we will include this citation in our revision.
6.Response to comments on 6D coordinates
We appreciate the reviewer's insightful question.
In studies involving coordinate expansion, the implicit mapping is typically from spatiotemporal coordinates to physical quantities, so the dataset we provide does not consider physical parameters or boundary conditions. Directly increasing the input dimensions by adding physical parameters or boundary conditions to form a 6D input can lead to several issues, such as semantic ambiguity and optimization conflicts. To address physical parameters or boundary conditions, we plan to conduct comprehensive tests based on hypernetworks in the future.
7.Response to comments on sigma
We thank the reviewer for highlighting the potential of adaptive σ in Gaussian encoding. Our fixed σ (set to 0.1) was empirically chosen to balance spectral coverage for the cases in HFR-Bench, but we agree that a dynamic σ could better handle multi-scale temporal dynamics.
We will explore σ schedules (e.g., curriculum learning from high to low σ) or physics-aware σ (e.g., linking σ to the local vorticity magnitude).
8.Response to comments on uncertainty quantification
We repeated this process 3-5 times and calculated the error bars for all models to compare their robustness on high-fidelity flow field enhancement, as shown in Table 1 (https://drive.google.com/file/d/1Ghv7-cWKf2DQUEt-5O3JUdZfXVkIf3v-/view).
9.Response to comments on the practical limit for numerical precision enhancement
β_max depends on the training data: the higher the numerical precision of the training data, the higher β_max can be.
10.Response to comments on transferability of PEINR
Since the physical phenomena in the flow field vary greatly from problem to problem, there is no truly infinite generalization. Our model is not generalizable to all problems now, but with transfer learning, we can quickly adapt the model to converge across different numerical schemes for the same problem, as shown in Fig. 2 (https://drive.google.com/file/d/1Ghv7-cWKf2DQUEt-5O3JUdZfXVkIf3v-/view).
Thanks for the authors' rebuttal. Most of my concerns have been addressed.
Thank you for your valuable feedback. We are pleased that most of your concerns have been addressed in our rebuttal. Our method offers a novel and practical framework for high-fidelity field reconstruction, with unique capabilities in handling unseen temporal configurations without retraining—a key advantage over existing approaches. We will carefully incorporate all suggestions into the revised manuscript, and we hope these refinements will demonstrate the method’s readiness for acceptance in its current form.
Reviewers all appreciated the method and found it sound and well supported by the experiments, though there are questions around the motivation and generality of the approach. Overall, the rebuttal was thorough and addressed some of the concerns, leading to a clear consensus towards acceptance. The dataset was found to be a valuable contribution. As a result, I believe that this work would be of interest to the community and I recommend acceptance. I would ask the authors to incorporate the discussions into the updated manuscript and revisit the motivation in light of the discussions.