From Uncertain to Safe: Conformal Adaptation of Diffusion Models for Safe PDE Control
We propose Safe Diffusion Models for PDE Control, which introduces the uncertainty quantile as model uncertainty quantification to achieve optimal control under safety constraints through both post-training and inference phases.
Abstract
Reviews and Discussion
This paper proposes a safe PDE control method based on the diffusion model inspired by conformal prediction. Specifically, the authors propose two new phases of post-training and inference-time fine-tuning to accommodate the quantified uncertainty score from the calibration set for the safety constraint and objective. Experiments show that the proposed method can achieve better overall results and meet safety constraints compared to baselines.
Questions for Authors
See above.
Claims and Evidence
From the perspective of PDE control theory, this paper solves a similar problem of in-domain PDE control, but the problem formulation in Eq (1) is more like offline safe RL because it misses the boundary and time conditions for the PDE dynamical system. I don't think the PDE safe control setting is the best manner to showcase the idea of a diffusion model with a conformal prediction for the safe sequential decision-making task. That is to say, as shown in (Liu et al. 2023a), there are tons of offline safe RL datasets and benchmarks to formulate the constrained optimization problem in Eq (1) and validate the methodology. The method part is disconnected from the motivation of the safe PDE control problem and can work on other general safe RL settings if it is a genuinely effective method.
Methods and Evaluation Criteria
There are some concerns and comments on the method.
- Overall, the method amounts to fine-tuning over the calibration set so that performance improves, which is trivial and lacks in-depth insight. Post-training and inference-time fine-tuning are similar to each other and neither novel nor impressive.
- The distribution shift in Eq (7) is handled directly through existing weighted conformal prediction (Tibshirani et al. 2019), and the post-training of Eq (12) directly follows the preliminary part of Eq (3), which greatly weakens the contribution and novelty.
- Regarding conformal prediction, I agree that exchangeability is the key assumption, but it may not hold in sequential decision-making settings. The reasons are not only the distribution shift but also the correlation of sequential data. Using the samples as the probability density function in Eq (7) seems problematic because of such correlation between and within trajectories.
Theoretical Claims
The theories and proofs directly follow the conformal prediction literature but under different settings. The theorems are not essential to the significance of the proposed method since it is very mature in conformal prediction.
Experimental Design and Analyses
I appreciate the different PDE settings and multiple baselines in the experiment part. However, it is expected to experiment on the PDE control benchmark https://github.com/lukebhan/PDEControlGym for off-the-shelf baselines and for fair comparison. Besides, in Table 1 and Table 3, some baselines can achieve 0% regarding safety constraints so it is expected to make the safety constraint harder to be satisfied to highlight the performance of the proposed method. I also wonder if the proposed method can do inference-time scaling, e.g. to adapt to different safety constraints without fine-tuning.
Supplementary Material
Yes, most parts are reviewed.
Relation to Broader Scientific Literature
It is beneficial to scientific ML and AI4Science.
Essential References Not Discussed
Yes, some essential references are not discussed. PDE control problem has been well studied in the control and system community [1,2,3,4]. Another recent paper studies safe PDE boundary control [5], which should be discussed and compared. Also, regarding safe RL and safe control, control barrier function (CBF) based methods are missing [6,7].
[1] Krstic et al. Boundary control of PDEs: A course on backstepping designs, 2008
[2] Smyshlyaev et al. Adaptive control of parabolic PDEs, 2010
[3] Zhang et al. Controlgym: Large-scale control environments for benchmarking reinforcement learning algorithms, 2024
[4] Bhan et al. Pde control gym: A benchmark for data-driven boundary control of partial differential equations, 2024
[5] Hu et al. On the Boundary Feasibility for PDE Control with Neural Operators, 2024
[6] Dawson et al. Safe control with learned certificates: A survey of neural lyapunov, barrier, and contraction methods for robotics and control. 2023
[7] Zhao et al. Model-free safe control for zero-violation reinforcement learning, 2021
Other Strengths and Weaknesses
See above.
Other Comments or Suggestions
N/A
Thanks for comments. Below are our responses.
Q1. Problem formulation (Eq. 1): like offline safe RL, no boundary and time conditions.
- in Eq. 1 is the PDE constraint. Following your advice, we will add boundary and time conditions along with explanations.
Q2. PDE-safe control setting may not be ideal, as safe RL can formulate the same. The method is decoupled from PDE and applicable to safe RL.
- PDE control scenario is important and complex. Its dimensions are high, and dynamics are nonlinear.
- We evaluate several SOTA safe RL baselines like CDT and TREBI.
- Our paper's motivation lies in highlighting the importance of safe PDE control and introducing it into deep learning-based control. To address this, we design methods and develop experimental scenarios, datasets, and baselines across domains.
- Given PDEs' high-dimensional and complex dynamics, we choose diffusion models for their strong modeling power [1, 2]. To satisfy initial conditions, we apply conditional generation to enforce strict adherence.
- To address the concern, we provide results on a safe RL benchmark [3]. https://anonymous.4open.science/r/Safe-Diffusion-Models-for-PDE-Control-213C/Rebuttal.md
Q3. The method is some fine-tuning on the calibration set and trivial. Post-training and finetuning are similar and not novel.
- Our method goes beyond finetuning, incorporating post-training, guidance and key uncertainty quantification via conformal prediction (CP).
- It does not adjust only on the calibration set. Post-training uses both training and calibration sets, while finetuning and guidance are applied to control targets and the calibration set.
- Post-training and inference-time finetuning adjust outputs completely differently: Post-training uses a reweighted loss (Eq. 12), while finetuning uses a sampling-based loss (Eq. 15) with additional guidance.
Q4. The distribution shift in Eq. 7 is handled directly by weighted CP; Eq. 12 directly follows Eq. 3.
- We do not directly use Eq. 7. As noted in Section 4.2, applying it requires estimating the intractable ratio between generated and dataset distributions. To address this, we design a loss (Eq. 12) and prove Theorem 4.2 and 4.3 to justify using Eq. 13 for valid coverage.
- Eq. 12 does not directly follow Eq. 3. Loss in Eq. 12 is designed to both enable ratio estimation and promote safer, more optimal distributions. Its form is derived in Appendix A.
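For context, the generic weighted conformal quantile of Tibshirani et al. (2019), which this response contrasts against, might be sketched as follows. This is only an illustration: the function name and arguments are hypothetical, and it assumes the likelihood-ratio weights are already available, which is exactly the quantity the response describes as intractable to estimate directly. It is not the paper's Eq. 12 or Eq. 13.

```python
import numpy as np

def weighted_conformal_quantile(scores, weights, test_weight, coverage=0.9):
    """Weighted conformal quantile (generic form, hypothetical helper).

    scores      : nonconformity scores on the calibration set
    weights     : assumed likelihood ratios w(x_i) for each calibration point
    test_weight : assumed likelihood ratio w(x_test) for the test point
    coverage    : target coverage level
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # The test point carries its own mass, placed at +infinity.
    total = weights.sum() + test_weight
    order = np.argsort(scores)
    cumulative = np.cumsum(weights[order]) / total
    # First sorted score whose cumulative mass reaches the coverage level.
    idx = np.searchsorted(cumulative, coverage, side="left")
    if idx >= scores.size:
        return float("inf")  # not enough calibration mass for this level
    return float(scores[order][idx])
```

With all weights equal to 1 this reduces to the usual split-conformal quantile; unequal weights shift the quantile toward calibration points that are more likely under the target distribution.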
Q5. Exchangeability for CP breaks in sequential decision-making due to correlations between and within trajectories.
- As noted after Eq. 5, the distribution is defined over full state and control trajectory pairs, so dependencies between states and controls, and within trajectories, do not affect exchangeability.
Q6. Theories directly follow CP literature but under different settings, and are not critical to method's significance.
- Except for Theorem 4.1, theorems are largely independent of existing CP theory, with separate derivations.
- Theorems are essential: Theorem 4.2 aligns post-training distribution with the desired form, and Theorems 4.1–4.3 jointly ensure valid conformal intervals for the post-trained model.
Q7. Experiment on PDE ContRol Gym for off-the-shelf baselines and fair comparison.
- This benchmark targets an online setting and is not designed for safe PDE control. As noted in the Introduction, interaction with environments in safe control is risky, so we focus on offline control. In the absence of safe PDE benchmarks, we contribute safe datasets and environments. Our experiments follow widely used prior works (PhiFlow: 1.6k stars) and use the same equations as PDE ContRol Gym, including Burgers' and NS.
- This benchmark's baselines are not for safe control, while we compare SOTA safe RL methods. Our comparison is fair, with open-sourced code in the paper and tuned hyperparameters.
- We will cite this valuable benchmark in Related Works of the next version.
Q8. Some baselines achieve 0% unsafety (Tables 1 and 3), so the safety constraints should be made harder to satisfy.
- There are two criteria for safe control methods: safety and control accuracy under safety. If only one method is safe, we can't compare the second criterion, limiting comprehensive analysis.
Q9. Can this method adapt to different safety constraints without finetuning?
- Different constraints: Our method remains applicable by computing each safety score minus its bound, combining them via smooth max [4], and constraining the result to be below 0, as in the current method.
- Without finetuning: We can take guidance to direct generated data to meet constraints.
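The multi-constraint recipe in the first bullet above (score minus bound, smooth max, result below 0) could look roughly like the sketch below. The log-sum-exp form of the smooth max and the `temperature` parameter are assumptions made for illustration; the response does not specify which smooth max from [4] is used, and the function names are hypothetical.

```python
import numpy as np

def smooth_max(values, temperature=0.1):
    """Log-sum-exp smooth maximum (one common choice, assumed here)."""
    v = np.asarray(values, dtype=float) / temperature
    m = v.max()  # subtract the max for numerical stability
    return float(temperature * (m + np.log(np.exp(v - m).sum())))

def combined_violation(scores, bounds, temperature=0.1):
    """Aggregate several safety constraints into a single scalar.

    Each margin is score_i - bound_i; a constraint holds when its margin <= 0.
    """
    margins = np.asarray(scores, dtype=float) - np.asarray(bounds, dtype=float)
    return smooth_max(margins, temperature)
```

Because log-sum-exp is never smaller than the true maximum, requiring `combined_violation(...) <= 0` is a conservative condition: it implies every individual margin is at most 0.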
Q10. References.
- We will add them in the paper.
References
[1] Synthetic Lagrangian turbulence by generative diffusion model.
[2] DiffusionPDE: Generative PDE-solving under partial observation.
[3] Datasets and Benchmarks for Offline Safe Reinforcement Learning.
[4] Magnetic control of tokamak plasmas through deep reinforcement learning.
Thanks for the detailed clarification. I am not an expert in offline safe RL, and I suggest the authors add more explicit discussion regarding "online" PDE ContRol Gym [1] and the "online" safety filter based safe PDE boundary control [2] in the updated version. Also, since PDE control has been an old topic for decades in the control theory community, and so has safe control, the listed literature should be included in the related work part in the updated version. Other than that, most of my concerns have been addressed. I raise my score to 3.
[1] Bhan et al. Pde control gym: A benchmark for data-driven boundary control of partial differential equations, 2024
[2] Hu et al. On the Boundary Feasibility for PDE Control with Neural Operators, 2024
Thank you for the helpful suggestions. We will include a more explicit discussion about the "online" PDE ContRol Gym [1] and the "online" safety-filter-based safe PDE boundary control [2] in the next version. We will also add the listed references to the Related Work section and include more literature on traditional safe control.
References
[1] Bhan, et al. Pde control gym: A benchmark for data-driven boundary control of partial differential equations.
[2] Hu, et al. On the Boundary Feasibility for PDE Control with Neural Operators.
This paper introduces an approach that maintains safety constraints in PDE-constrained control problems. It employs uncertainty estimation with conformal prediction to optimize control while preserving safety. It fine-tunes a diffusion model using conformal prediction to produce safe control sequences. The experimental results on tasks such as the 1D Burgers' equation show the effectiveness of the proposed method.
update after rebuttal
My major concerns, such as running time, have been resolved. Therefore, I maintain my score as weak accept.
Questions for Authors
N/A
Claims and Evidence
Yes. I think the claims are correct and clear.
Methods and Evaluation Criteria
Yes. The evaluation criteria follow previous works and effectively assess performance, and the method is designed to address the safety issue.
Theoretical Claims
Yes. I think it is mostly correct.
Experimental Design and Analyses
The paper has a sound experimental design that assesses SafeDiffCon's performance across three PDE control tasks: 1D Burgers’ equation, 2D incompressible fluid, and controlled nuclear fusion. The chosen baselines are also relatively new. Therefore, the experimental design is valid.
Supplementary Material
N/A. No supplementary material is uploaded.
Relation to Broader Scientific Literature
This research makes a good contribution by combining diffusion models, conformal uncertainty quantification, and safe control frameworks within PDE-constrained control systems, which fills a research gap.
Essential References Not Discussed
No
Other Strengths and Weaknesses
Strengths
- This paper is well-written. The proof is well performed.
- The evaluation is comprehensive, and the experimental results show the effectiveness of the proposed method.
Weaknesses
- Lack of evaluation of its performance in real-world scenarios.
- The performance of the proposed framework heavily relies on the collected training data.
- The introduced framework is a little complex, and the time needed is unclear.
Other Comments or Suggestions
- It would be better for the authors to provide a table about the running time of the proposed methods and the baselines.
Thanks for the constructive review. Below are the responses.
Q1. Lack of evaluation of its performance in real-world scenarios.
- In fact, the tokamak control in the paper is a near real-world experiment for controlled nuclear fusion which is a highly-nonlinear and highly coupled system. This environment is trained using real data collected from the KSTAR tokamak device (https://github.com/jaem-seo/KSTAR_tokamak_simulator). Additionally, its simulation has been tested with many real discharges, showing reasonable predictions and acceptable prediction accuracy.
Q2. The performance of the proposed framework heavily relies on the collected training data.
- Compared to baselines, our method does not rely more heavily on the collected training data. The table below shows the average unsafety rate of the training set and the generated data. It can be seen that the distribution of data generated by our method differs from that of the training data. Moreover, only our method achieves full safety.
| Datasets | Training data | Generated data |
|---|---|---|
| Burgers' equation | 89.7% | 0% |
| Incompressible fluid | 53.1% | 0% |
| Tokamak fusion reactor | 71.2% | 0% |
- As mentioned in the third and fourth paragraphs of the Introduction, interacting with the environment in the safe PDE control problem is hazardous. Therefore, we choose a more appropriate offline setting, where we can only rely on the data that has already been collected.
Q3. The introduced framework is a little complex.
- Overall framework:
- The method begins with the introduction of uncertainty quantification through the concept of uncertainty quantiles. This is integrated throughout the algorithm to address potential gaps between the actual and predicted safety scores, which could lead to unsafe events not anticipated by the model. By introducing them, we address the critical issue of predictive uncertainty and its impact on safety scores, enabling more robust and safe control actions.
- After pre-training, we employ a reweighted loss function to post-train the model’s output distribution, guiding it toward regions with better safety and more optimal objectives.
- During the inference phase, task-specific fine-tuning is performed on the post-trained model, allowing it to achieve better control performance and stronger safety guarantees for specific tasks, with minimal adjustments needed.
- To aid understanding, we have presented the proposed method through the framework outlined in Figure 1, Algorithm 1 and the first paragraph of Method. We are happy to provide a clearer explanation of the algorithm at the beginning of the Method section to enhance readability.
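The role of the uncertainty quantile described in the framework overview above can be illustrated with a plain split-conformal sketch. It assumes the nonconformity score is the absolute gap between true and predicted safety scores on a calibration set and that a sample is certified safe when the predicted score plus the quantile stays within the bound; the paper's actual score, its distribution-shift correction, and the function names here may differ.

```python
import math
import numpy as np

def uncertainty_quantile(true_scores, predicted_scores, coverage=0.9):
    """Split-conformal quantile of |true - predicted| on a calibration set."""
    residuals = np.abs(
        np.asarray(true_scores, dtype=float)
        - np.asarray(predicted_scores, dtype=float)
    )
    n = residuals.size
    # Finite-sample corrected rank: ceil((n + 1) * coverage).
    rank = math.ceil((n + 1) * coverage)
    if rank > n:
        return float("inf")  # calibration set too small for this level
    return float(np.sort(residuals)[rank - 1])

def is_certified_safe(predicted_score, quantile, bound):
    """Conservative check: predicted score inflated by the quantile."""
    return predicted_score + quantile <= bound
```

Under exchangeability, the true safety score then lies within `quantile` of the prediction with probability at least `coverage`, which is why inflating the predicted score by the quantile gives a conservative safety check.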
Q4. Comparison of running time.
- Thanks for the good suggestion. The table below presents the comparison of the inference time between our method and baselines on the Burgers' equation on A800 with 8 CPUs. We would like to highlight that we have enhanced the inference efficiency using several techniques. Firstly, by introducing post-training, the data distribution is already closely aligned with the desired distribution prior to inference. Secondly, we significantly speed up the sampling process of the diffusion model using DDIM [1]. As a result, our method demonstrates a reasonable inference speed and is considerably faster than TREBI, which is also based on the diffusion model.
| Methods | BC | PID | SL-Lag | MPC-Lag | CDT | TREBI | Ours |
|---|---|---|---|---|---|---|---|
| Inference Time (min) | 0.1351 | 0.1091 | 0.8842 | 26.2905 | 0.0890 | 13.5525 | 2.3575 |
References
[1] Denoising Diffusion Implicit Models.
Thanks for the reply! I have one more question here. Which base diffusion model are you using? Is it the DDIM model?
Thank you for your question! The diffusion model we use is trained following the standard DDPM framework [1], while the sampling process follows DDIM [2]. However, our code also provides an option for sampling using the standard DDPM procedure.
References:
[1] Ho J, Jain A, Abbeel P. Denoising diffusion probabilistic models[J]. Advances in neural information processing systems, 2020.
[2] Song J, Meng C, Ermon S. Denoising Diffusion Implicit Models[C]. International Conference on Learning Representations, 2021.
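The split described in this reply (DDPM-style training, DDIM-style sampling) can be sketched with the deterministic DDIM update below. This is a generic illustration with hypothetical names (`eps_model`, `alpha_bars`), using the standard DDIM rule with zero added noise; it is not the authors' released code and omits their guidance and fine-tuning.

```python
import numpy as np

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0)."""
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    # Re-noise the estimate to the earlier (less noisy) step.
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred

def ddim_sample(eps_model, shape, alpha_bars, num_steps, rng):
    """Sample with a strided schedule over a DDPM-trained noise predictor.

    eps_model(x, t) is assumed to return the predicted noise at step t;
    alpha_bars holds the cumulative products of (1 - beta_t) used in training.
    """
    total_steps = len(alpha_bars)
    timesteps = np.linspace(total_steps - 1, 0, num_steps).round().astype(int)
    x = rng.standard_normal(shape)
    for i, t in enumerate(timesteps):
        # After the last step the sample is treated as noise-free.
        a_prev = alpha_bars[timesteps[i + 1]] if i + 1 < num_steps else 1.0
        x = ddim_step(x, eps_model(x, t), alpha_bars[t], a_prev)
    return x
```

Using far fewer steps than the training schedule (for example 10 to 50 instead of 1000) is what makes DDIM sampling faster than ancestral DDPM sampling with the same trained network.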
This paper introduces SafeDiffCon, a method that integrates safety constraints into diffusion models for PDE control tasks. By leveraging conformal prediction to quantify model uncertainty, the approach employs post-training with a reweighted loss and inference-time fine-tuning to align generated control sequences with safety requirements. Experiments on 1D Burgers’ equation, 2D fluid dynamics, and nuclear fusion control demonstrate that SafeDiffCon uniquely satisfies safety constraints while achieving superior control performance compared to baselines like BC, MPC-Lag, and TREBI. The key innovation lies in combining uncertainty-aware conformal adaptation with diffusion models to address distribution shifts in offline settings.
Questions for Authors
- Can you provide inference time cost comparison with baselines?
- Why is the diffusion model chosen for the control problem, which is essentially different from the generation task?
- What PDE is the Tokamak control problem driven by?
Claims and Evidence
The paper claims that SafeDiffCon achieves optimal control under safety constraints, and the experiment results support the claim.
Methods and Evaluation Criteria
The post-training and fine-tuning methods are suitable for constrained problems. The benchmark datasets Burgers, NS, and Tokamak make sense for the control applications.
Theoretical Claims
The theoretical claims seem to be correct.
Experimental Design and Analyses
Inference efficiency (e.g., inference time) is not compared. The proposed inference-time fine-tuning might incur significant extra time cost.
Supplementary Material
I reviewed the experiment details in Appendix E.
Relation to Broader Scientific Literature
Related to Control Theory, Numerical Methods.
Essential References Not Discussed
No.
Other Strengths and Weaknesses
No
Other Comments or Suggestions
No
Thanks for your insightful comments. Below are our responses.
Q1. Inference time cost comparison with baselines. The proposed model's inference finetune might induce significant extra time cost.
- Great suggestion. We would like to point out that we have improved inference efficiency. On one hand, by introducing post-training, the data distribution is already close to the desired distribution before inference. On the other hand, we have significantly reduced the diffusion model's sampling time using DDIM [9].
- The table below presents a comparison of the inference time between our method and the baseline on the Burgers' equation on A800 with 8 CPUs. It shows that our method achieves a moderate speed and is much faster compared to TREBI, which is also based on the diffusion model.
| Methods | BC | PID | SL-Lag | MPC-Lag | CDT | TREBI | Ours |
|---|---|---|---|---|---|---|---|
| Inference Time (min) | 0.1351 | 0.1091 | 0.8842 | 26.2905 | 0.0890 | 13.5525 | 2.3575 |
Q2. Why is the diffusion model chosen for the control problem, which is essentially different from the generation task?
- In fact, diffusion models have already been widely used in decision-making tasks, including scenarios such as PDE systems [1, 2], robotics [3, 4], and traditional reinforcement learning [5, 6, 7]. This is because the task can naturally be modeled as a probabilistic distribution over state and control sequences, and the diffusion model has many unique advantages in handling such tasks.
- It has superior modeling capabilities due to its ability to capture complex distributions and generate high-fidelity outputs by progressively refining predictions over multiple denoising steps.
- By denoising from a Gaussian distribution, it models the entire trajectory, meaning it learns to generate states at different time steps simultaneously, aiding in capturing long-range dependencies. So it can perform global optimization.
- Furthermore, studies have shown that it is robust to noise which is essential in safe control problems [8].
Q3. What PDE is the Tokamak control problem driven by?
- Tokamak control involves the coupling of multiple PDEs. The primary equation is the Grad–Shafranov equation, along with others such as the Heat Transport Equation and the Skin Effect Equation [10].
References
[1] A generative approach to control complex physical systems.
[2] Wavelet diffusion neural operator.
[3] Diffusion policy: Visuomotor policy learning via action diffusion.
[4] Hierarchical diffusion policy for kinematics-aware multi-task robotic manipulation.
[5] Planning with diffusion for flexible behavior synthesis.
[6] Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning.
[7] Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling.
[8] Diffusion Models are Certifiably Robust Classifiers.
[9] Denoising Diffusion Implicit Models.
[10] Reconstruction of current profile parameters and plasma shapes in tokamaks.
The paper introduces SafeDiffCon, a method integrating safety constraints into deep learning-based control of PDE systems through diffusion models. Addressing the gap in existing methods that neglect safety, SafeDiffCon employs conformal prediction to estimate uncertainty quantiles, which guide both post-training and inference phases. Evaluated on 1D Burgers equation, 2D incompressible fluid flow, and a nuclear fusion control problem, SafeDiffCon uniquely satisfies all safety constraints across tasks, outperforming classical and deep learning baselines in control performance. Key contributions include the integration of uncertainty quantification via conformal prediction, safety-constrained diffusion training, and adaptive inference mechanisms, demonstrating robust and safe PDE control in complex scenarios.
Questions for Authors
- To what extent can the conformal prediction intervals in SafeDiffCon provide reliable coverage guarantees for safety scores when applied to novel control tasks during inference, especially considering the distribution shift between calibration data and the optimal control distribution?
- While the paper presents an interesting approach to safe PDE control, I'm curious about the specific choice of diffusion models as the foundation. Could you elaborate on the intrinsic advantages that diffusion models provide for your method compared to other generative or deep learning approaches? Where specifically does the synergy between your uncertainty quantification framework and the diffusion model architecture manifest? Would applying similar conformal adaptation techniques to alternative model architectures yield comparable safety guarantees and performance improvements? These are my main concerns about the paper.
Post Rebuttal
Most of my concerns are appropriately addressed. I choose to increase the score.
Claims and Evidence
Yes.
Methods and Evaluation Criteria
Yes.
Theoretical Claims
Yes. I have checked the proofs.
Experimental Design and Analyses
Following previous works, the designs of experiments are clear and well-organized. The analysis is also convincing.
To improve the quality of the paper, I suggest the authors provide more ablation analysis on the other two tasks, 1D Burgers Equation and nuclear fusion control, to better demonstrate the effectiveness of the proposed method.
Supplementary Material
I have reviewed all of the supplementary material.
Relation to Broader Scientific Literature
The work builds upon existing diffusion models for physical control (like Wei et al., 2024; Hu et al., 2024) but extends them with conformal prediction techniques (Vovk et al., 2005; Tibshirani et al., 2019) to handle distribution shifts between training data and desired safe controls. The approach combines post-training with reweighted loss functions and inference-time fine-tuning, enabling models to satisfy safety constraints while optimizing control objectives. This relates to safe offline reinforcement learning methods like CPQ (Xu et al., 2022), COptiDICE (Lee et al., 2022), and TREBI (Lin et al., 2023), but differs by specifically addressing PDE-constrained systems and using uncertainty quantiles based on conformal prediction to provide safety guarantees. The work demonstrates applications in three different physical domains (1D Burgers' equation, 2D incompressible fluid dynamics, and controlled nuclear fusion), showing greater effectiveness than traditional control methods (PID, MPC-Lag) and deep learning alternatives.
Essential References Not Discussed
To the best of my knowledge, the paper has discussed all essential references.
Other Strengths and Weaknesses
Strengths:
- The paper addresses a critical gap in existing methods by integrating safety constraints into deep learning-based control of PDE systems.
- This paper is well-written and clearly organized, with a strong motivation and clear contributions.
- The experimental results are well-presented and provide a comprehensive evaluation of the proposed method in various physical domains, highlighting its effectiveness.
Weaknesses:
- The effectiveness of the uncertainty quantile seems to depend on the choice of various parameters, such as the coverage probability α, the split ratio of training/calibration data, and the weight of the objective γ. It is suggested to provide more ablation analysis on these parameters to better understand their impact on the performance of the proposed method.
- As mentioned above, more ablation analysis on the other two tasks, 1D Burgers Equation and nuclear fusion control, would further demonstrate the effectiveness of the proposed method.
Other Comments or Suggestions
There are a few typos in the paper that need to be corrected:
- On page 6, in Figure 2's caption, the authors refer to their method as "SafeConPhy" instead of "SafeDiffCon".
- Likewise on page 19, in Appendix H.1, the authors refer to their method as "SafeConPhy" instead of "SafeDiffCon".
- On page 22, in the "H.7. PID" section, there's a typo in the first sentence where "Propercentageal" is written instead of "Proportional" when describing the PID control method.
We greatly appreciate your recognition. Below are our responses.
Q1. Ablation studies on other two tasks, 1D Burgers' Equation and tokamak control.
- Thanks for the suggestion. We conduct ablation studies on these two tasks. Results still show that the absence of any module affects the model, causing it to fail to meet the safety constraints. Therefore, each module is effective. These will be added to the manuscript.
Burgers':
| Methods | J | |||
|---|---|---|---|---|
| SafeDiffCon | 0.0011 | 0% | 0% | 0% |
| w/o post-training | 0.0014 | 4% | 50% | 0.01% |
| w/o fine-tuning | 0.0007 | 40% | 20% | 3% |
| w/o Q | 0.0006 | 30% | 10% | 1% |
Tokamak:
| Methods | J | ||
|---|---|---|---|
| SafeDiffCon | 0.0094 | 0% | 0% |
| w/o post-training | 0.0153 | 10% | 0.52% |
| w/o fine-tuning | 0.0210 | 50% | 7.74% |
| w/o Q | 0.0269 | 28% | 0.65% |
Q2. Analysis on parameters, such as coverage probability α, the split ratio of training/calibration data, and the weight of objective γ.
- Great suggestion! We conduct experimental analysis on these parameters. The tables show that they have little impact on performance, indicating that the model is robust. This will be added to the next version.
- Tokamak:
| α | J | ||
|---|---|---|---|
| 0.8 | 0.01017 | 0% | 0% |
| 0.85 | 0.0091 | 0% | 0% |
| 0.9 | 0.0094 | 0% | 0% |
| 0.95 | 0.0095 | 0% | 0% |

| Split ratio | J | ||
|---|---|---|---|
| 0.005 | 0.0124 | 4% | 0.03% |
| 0.01 | 0.0093 | 0% | 0% |
| 0.02 | 0.0094 | 0% | 0% |
- Incompressible fluid:
| γ | J | SVM | |
|---|---|---|---|
| 0.01 | 0.3548 | 0 | 0% |
| 0.1 | 0.4953 | 0.004 | 2% |
| 0.3 | 0.498 | 0.003 | 2% |
Q3. How reliable are the conformal intervals in SafeDiffCon for safety scores on novel control tasks, given the distribution shift between calibration data and the optimal control distribution during inference?
- Good questions! Novel control tasks include three cases: (1) new equation forms (i.e., dynamics), (2) new safety constraints and bounds, and (3) a shift between the optimal control distribution and the distribution of the calibration set.
- We primarily discuss the third case in the paper. Both theoretical and experimental results demonstrate that our method can adapt to this distribution shift, ensuring that the actual score is covered. However, the theoretical analysis assumes that the diffusion model's approximation of the training set distribution during pretraining is not too poor.
- We lack discussion of the first and second cases in the paper, which will be added in the next version. Since Eq. 5 and Eq. 11 are tied to specific dynamics and safety constraints, the conformal interval doesn't generalize to new ones. However, we can skip the post-training and use inference-time finetuning to compute conformal prediction intervals based on new tasks, ensuring reliable coverage guarantees.
Q4. Why choose diffusion models? What advantages do they offer over other generative or deep learning methods? Where specifically does the synergy between your uncertainty quantification framework and the diffusion model architecture manifest? Could similar conformal techniques applied to other models achieve comparable safety and performance?
- Since our approach to uncertainty quantification is probabilistic, we require a model with an explicit probabilistic framework. Diffusion models meet this requirement, as they allow us to analyze the relationship between the generated distribution and training set distribution. This enables us to incorporate probability-related theory into its training and sampling algorithms, such as our consideration of distribution shift and the design of reweighted loss.
- With multi-step denoising, diffusion models have strong modeling capabilities, allowing them to handle high-dimensional, long-term problems. They have been used in weather and PDE system modeling tasks [1, 2]. Additionally, because they denoise from a Gaussian distribution and model the entire trajectory rather than transition pairs, they can perform global optimization, and have been applied in many decision-making scenarios [3, 4]. Furthermore, studies have shown that they are robust, meaning they can withstand noise attacks [5], which is critical in safe control. Experimental results also demonstrate their superiority in control problems.
- Alternative model architectures can also draw from ideas of conformal prediction. However, due to the strong capabilities of diffusion models and their ability to explicitly model the relationship between training and generated distributions, they integrate with conformal prediction more naturally and effectively.
Q5. Typos.
- Thanks. We will correct them.
References
[1] Dyffusion: a dynamics-informed diffusion model for spatiotemporal forecasting.
[2] Synthetic Lagrangian turbulence by generative diffusion models.
[3] Planning with diffusion for flexible behavior synthesis.
[4] Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning.
[5] Diffusion Models are Certifiably Robust Classifiers.
The paper introduces SafeDiffCon, a method integrating safety constraints into deep learning-based control of PDE systems through diffusion models. In relation to existing work, this work builds upon existing diffusion models for physical control but extends them with conformal prediction techniques to handle distribution shifts between training data and desired safe controls. There was limited discussion on this paper: reviewer 54oM requested an ablation analysis, which the authors provided, and which led to an increased score. Reviewer yisF raised two key concerns: first, that the method is "essentially finetuning", and second, that they would like to see evaluation on the PDE control suite. These concerns were adequately addressed in rebuttal, again leading to an increased score. After discussion there is consensus that this paper be accepted, and I concur.