PaperHub
Rating: 6.6/10
Poster · 4 reviewers
Lowest: 3 · Highest: 4 · Standard deviation: 0.5
Individual ratings: 3, 4, 3, 4
ICML 2025

Perceptually Constrained Precipitation Nowcasting Model

Submitted: 2025-01-20 · Updated: 2025-08-04


Keywords
Precipitation Nowcasting, Spatio-Temporal Prediction, Generative model

Reviews and Discussion

Review
Rating: 3

This paper proposes a model called PercpCast for precipitation nowcasting, aiming to predict future rainfall patterns more accurately while also improving how realistic those predictions appear. The authors use a two-stage approach in a single end-to-end framework: first, they generate a "posteriori mean" sequence of future precipitation using a ConvLSTM-based estimator, then refine those estimates through a "rectified flow" module that better aligns the predicted distributions with the ground truth. To address the challenge of longer-horizon forecasting, the authors introduce a frame-sampling strategy that assigns more weight to frames further in the future. The model incorporates an LPIPS-based loss function to enforce perceptual consistency. Experiments on radar datasets (SEVIR and MeteoNet) show that the proposed method outperforms various existing approaches in terms of accuracy metrics and visual quality metrics.

Questions for Authors

  1. The authors often label the ConvLSTM’s output as a "posteriori mean sequence", but do not clearly explain why it represents the average of future rainfall. A brief note on how minimizing MSE leads the model to predict an "average" outcome would make this point clearer.

  2. In Section 5.3, the authors state "During end-to-end training, the gradient transfer between the precipitation estimator and the rectified flow model is stopped....". Could you clarify why this is done and how it aligns with the assumption that the predicted and true frames are independent, given the historical data?

  3. Please include details about the hardware specifications on which the model was trained. It would help readers compare the resource requirements of the proposed method with simpler baselines.

  4. In Figure 4, PercpCast appears to overestimate precipitation in certain areas (indicated by the pink colors), compared to the ground-truth maps. Could you clarify what might cause these overestimations?

  5. The paper mentions using LPIPS loss to make the predictions look more realistic, but it's not clear why a specific weight was chosen for it in the loss function. Did the authors run any experiments to figure out the best value, or was it selected based on intuition?

  6. The frame sampling strategy gives different importance to frames depending on how far they are in time. Could the authors share any experimental results or tests that show how this choice affects the model's accuracy, especially for longer predictions?

Claims and Evidence

The paper's main claims about improved accuracy and perceptual quality are largely supported by experiments on two types of datasets. However, some points could benefit from clearer evidence or explanation. Specifically, it is not entirely clear why the authors stop the gradient transfer between the precipitation estimator and the rectified flow model. Additionally, it is unclear how the authors decided on the specific weight for the LPIPS loss.

Methods and Evaluation Criteria

The authors use recognized datasets (SEVIR, MeteoNet) and metrics (CSI, HSS, MSE, SSIM, LPIPS) that directly relate to precipitation forecasting and capture both accuracy and perceptual realism.

Theoretical Claims

In the appendix, the authors include a theoretical derivation that connects the precipitation nowcasting objective to an optimal transport framework. The derivation appears logically consistent with prior work on rectified flows.

Experimental Designs or Analyses

The experiments use known radar datasets and standard precipitation metrics, aligning with typical nowcasting research. The train/validation/test splits are standard, and the comparisons with multiple baselines are appropriate. However, ablation studies on the LPIPS loss weight or frame sampling would further validate the design choices.

Supplementary Material

I reviewed the supplementary appendix. It includes a theoretical derivation of the flow-based approach and several ablation results, for example varying the scale factor (K) on SEVIR, testing 1-rectified flow, and comparing precipitation estimators (such as SimVP) on SEVIR. There are also additional visual results on both SEVIR and MeteoNet showing comparisons with different baseline models.

Relation to Broader Scientific Literature

The paper extends ongoing research in precipitation nowcasting, which traditionally separates into deterministic approaches that focus on reducing mean-squared error, and probabilistic approaches that aim for more realistic detail. By combining a precipitation estimator with a rectified flow model, this work bridges both views: it maintains the long-term accuracy of deterministic models while incorporating the realistic detail of generative methods. The introduction of a frame-sampling strategy also connects to broader ideas in temporal modeling tasks.

Essential References Not Discussed

No additional foundational works appear to be missing.

Other Strengths and Weaknesses

Strength: PercpCast integrates a precipitation estimator with a rectified flow model to achieve forecasts that are both accurate and visually realistic. The model's frame-sampling strategy, which emphasizes distant frames, makes it particularly effective for long-horizon predictions. The proposed work is supported by thorough evaluations on established public datasets and comparisons with multiple baselines.

Additionally, the paper combines theoretical explanations with visual demonstrations, providing a well-rounded view of its strengths.

Weakness:

One notable weakness is the lack of clarity regarding why gradient transfer is stopped between the precipitation estimator and the rectified flow model. The paper does not clearly explain how this decision aligns with the assumption that the predicted and true frames are independent, which leaves an important theoretical justification underexplored.

Additionally, the paper omits details about the hardware specifications used during training, making it challenging to assess the method's computational requirements compared to simpler baselines.

Other Comments or Suggestions

Suggestions are addressed in the “Questions for Authors” section.

Author Response

We appreciate the reviewer’s detailed feedback. We will address their concerns and are eager to engage in a more detailed discussion with the reviewer.

Q1.

Thank you for the comment. In precipitation nowcasting, the high uncertainty in short-term evolution means a single historical observation can correspond to multiple future scenarios. ConvLSTM minimizes the mean squared error loss $\mathcal{L}_{pe}=\mathbb{E}\left[\|Y-\hat{Y}^*\|^2\right]$, causing predictions $\hat{Y}^*$ to converge to the conditional expectation $\mathbb{E}[Y|X]$. This produces a posterior mean sequence: a statistical average of all possible future precipitation outcomes under given input conditions.
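
The statement that MSE minimization pulls predictions toward an average can be checked numerically. The snippet below is a minimal sketch, not code from the paper: it assumes a single pixel whose future value is either 0.0 (no rain) or 1.0 (heavy rain) with equal probability, and the helper name `best_constant_under_mse` is hypothetical.

```python
import numpy as np

def best_constant_under_mse(samples, candidates):
    """Return the candidate value with the lowest mean squared error against samples."""
    errors = [np.mean((samples - c) ** 2) for c in candidates]
    return candidates[int(np.argmin(errors))]

# Hypothetical futures for one pixel given the same history X:
# roughly half the time no rain (0.0), half the time heavy rain (1.0).
rng = np.random.default_rng(0)
futures = rng.choice([0.0, 1.0], size=10_000)

# Search a grid of constant predictions between 0 and 1 (step 0.01).
candidates = np.linspace(0.0, 1.0, 101)
best = best_constant_under_mse(futures, candidates)

# The MSE-optimal prediction sits at the sample mean, a value that
# never occurs in any individual future.
print(best, futures.mean())
```

The optimum is a value between the two real outcomes, which is the blurring effect that motivates refining the posterior mean with a second stage.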

Q2 & W1

In Section 3 and Appendix A, we outlined the gradient-stopping operation. To clarify: Due to diverging learning objectives between the precipitation estimator and the rectified flow model, gradient stopping is applied to prevent the rectified flow model from interfering with the estimator's acquisition of physical motion dynamics. This constraint forces the rectified flow model to actively learn physical consistency (e.g., motion continuity and distribution alignment) directly from input data. Meanwhile, end-to-end training improves model robustness and alleviates suboptimal solutions inherent in two-stage frameworks.

Regarding independence assumptions, our approach aligns with Freirich et al. (2021) in theoretical framework but diverges in the training methodology. Under the condition of stopping gradient propagation from $\hat{Y}^*$ to $\hat{Y}$, the generation of $\hat{Y}$ does not influence $\hat{Y}^*$. This establishes a Markov process: $\hat{Y} \leftarrow \hat{Y}^* \leftarrow X \rightarrow Y$. Consequently, given historical data $X$, $Y$ remains independent of both $\hat{Y}$ and $\hat{Y}^*$, while $\hat{Y}$ depends solely on $\hat{Y}^*$. This ensures compliance with the predictive independence assumption with theoretical justification, achieving causal decoupling between variables.
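
A toy sketch of the gradient-stopping idea, using hand-derived gradients for scalar stand-ins. Everything here is an illustrative assumption rather than the paper's architecture: the estimator is `y_star = a * x`, the refinement is an additive offset `y_hat = y_star + b`, and the function name `toy_gradients` is hypothetical.

```python
def toy_gradients(x, y_true, a, b, stop_gradient):
    """Gradients of L_pe + L_rf for a scalar toy model.

    L_pe = (y_true - y_star)^2 trains the estimator parameter a.
    L_rf = (y_true - y_hat)^2 trains the refinement parameter b.
    """
    y_star = a * x          # estimator output (posterior-mean stand-in)
    y_hat = y_star + b      # refined output (flow-model stand-in)

    grad_a = -2.0 * (y_true - y_star) * x   # contribution of L_pe to a
    grad_b = -2.0 * (y_true - y_hat)        # contribution of L_rf to b

    if not stop_gradient:
        # Without stop-gradient, L_rf also back-propagates into a through y_star.
        grad_a += -2.0 * (y_true - y_hat) * x
    return grad_a, grad_b

with_stop = toy_gradients(x=2.0, y_true=3.0, a=1.0, b=0.5, stop_gradient=True)
without_stop = toy_gradients(x=2.0, y_true=3.0, a=1.0, b=0.5, stop_gradient=False)
print(with_stop, without_stop)
```

With the stop in place, the estimator's update depends only on its own MSE objective, while the refinement parameter receives the same gradient in both cases.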

Q3 & W2.

Thank you for the comment. Our model employs mixed-precision training (FP16) on a single NVIDIA A100 80GB GPU, with supporting hardware including an Intel(R) Xeon(R) Platinum 8350C CPU @ 2.60GHz. Resource utilization metrics of our method compared to the ConvLSTM on the SEVIR dataset are summarized in the following Table.

Method | Parameters (M) | GPU Memory (Batch Size=1, GB) | Training Time (hours) | Inference Time (Batch Size=1, seconds)
Proposed Model | 55.874.1162.47
ConvLSTM | 17.813.4111.92

Q4.

To understand the issue, Figure 9 shows that short-term predictions are relatively easy because the predicted and real frame distributions are closely aligned, while long-term predictions suffer significant distribution drift. This discrepancy is amplified by frame sampling strategies that prioritize learning long-term variations, thereby compromising short-term prediction accuracy. As shown in Figure 9, the issue can be alleviated by using a moderately small $k$. However, a smaller $k$ may reduce long-term prediction accuracy. Hence, it is a trade-off.

Q5.

We present the experimental results of different LPIPS loss weight configurations in Table 4, Table 5, and Figure 6. These demonstrate that incorporating the LPIPS loss effectively suppresses checkerboard artifacts in the rectified flow model. Notably, within a reasonable range (0.5 to 1), the specific weight does not significantly affect the experimental outcomes. Detailed results for additional weight configurations are given in the table below.

| LPIPS weight | CSI | HSS | SSIM | LPIPS | MSE |
|---|---|---|---|---|---|
| 0.0 | 0.256 | 0.328 | 0.701 | 0.324 | 0.0102 |
| 0.2 | 0.260 | 0.342 | 0.714 | 0.287 | 0.0098 |
| 0.5 | 0.267 | 0.360 | 0.722 | 0.268 | 0.0092 |
| 0.7 | 0.265 | 0.357 | 0.718 | 0.265 | 0.0095 |
| 1.0 | 0.265 | 0.358 | 0.711 | 0.272 | 0.0094 |

Q6.

The experimental results and visualisations (Tables 4 and 6; Figures 5 and 7-9) evaluate our exponential sampling strategy, where K determines the range of variation of the sampling probability. Figure 5 shows the relationship between k and the sampling probability, while Table 4 evaluates different k settings. Figure 7 confirms that a larger k improves distant frame prediction accuracy. Figure 8 compares linear and exponential sampling. Experiments show that, with more iterations, moderate increases in the sampling probability of distant frames improve long-term prediction. However, over-amplification undermines learning of the other frames, causing performance drops.
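
A minimal sketch of what an exponential frame-sampling rule could look like. The exact formula is not given in this discussion, so the form `p_i ∝ exp(k * i)` is an assumption, as are the function name `frame_sampling_probs` and the choice of sampling 4 of 36 frames; the only grounded facts are that k = 0 appears as a setting and that larger k favours distant frames.

```python
import numpy as np

def frame_sampling_probs(num_frames, k):
    """Hypothetical exponential weighting: p_i proportional to exp(k * i), i = 0..num_frames-1."""
    idx = np.arange(num_frames)
    logits = k * idx
    weights = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return weights / weights.sum()

# k = 0 reduces to uniform sampling over the 36 future frames.
probs_uniform = frame_sampling_probs(36, k=0.0)

# A small positive k shifts probability mass toward distant frames.
probs_skewed = frame_sampling_probs(36, k=0.05)

# Draw a subset of frame indices to supervise in one training step.
rng = np.random.default_rng(0)
sampled = rng.choice(36, size=4, replace=False, p=probs_skewed)
print(probs_skewed[0], probs_skewed[-1], sampled)
```

Under this assumed form, the ratio between the last and first frame probabilities is `exp(k * (num_frames - 1))`, which makes the trade-off in the response concrete: a large k starves the early frames of training signal.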

Review
Rating: 4

This work proposes PercpCast, which integrates a Precipitation Estimator (a video prediction model) and a Rectified Flow module. The Rectified Flow module learns the transport from the distribution of the posterior mean predicted by the Precipitation Estimator to the distribution of the ground truth. Further, LPIPS regularization is introduced in addition to the two typical loss terms for the Precipitation Estimator and Rectified Flow modules. Besides, temperature-distance weighted scheduling is implemented to ensure the model focuses on the later frames. With all these techniques, PercpCast demonstrates its effectiveness in the outlined evaluation setting across two radar echo datasets: SEVIR and MeteoNet.
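
For readers unfamiliar with rectified flow, the transport described above can be sketched in a few lines. This is the generic straight-path formulation, not the paper's implementation: the function names are hypothetical, the 2×2 arrays are stand-ins for frames, and an oracle velocity replaces the trained network.

```python
import numpy as np

def interpolate(z0, z1, t):
    """Point at time t on the straight path from source sample z0 to target sample z1."""
    return (1.0 - t) * z0 + t * z1

def target_velocity(z0, z1):
    """Regression target for the velocity field along the straight path."""
    return z1 - z0

def euler_sample(z0, velocity_fn, num_steps=10):
    """Integrate dz/dt = velocity_fn(z, t) from t=0 to t=1 with Euler steps."""
    z = np.array(z0, dtype=float)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        z = z + dt * velocity_fn(z, step * dt)
    return z

# Toy check with an oracle velocity (a trained network would replace this).
z0 = np.zeros((2, 2))                       # stand-in for a posterior-mean frame
z1 = np.array([[0.0, 1.0], [0.5, 0.25]])    # stand-in for a ground-truth frame
oracle = lambda z, t: target_velocity(z0, z1)
z_end = euler_sample(z0, oracle, num_steps=10)
print(np.allclose(z_end, z1))
```

In training, a network `v(z_t, t)` would be regressed onto `target_velocity(z0, z1)` at points produced by `interpolate`; at inference, `euler_sample` starts from the estimator's output rather than from noise.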

Questions for Authors

Questions are asked in the above sections.

Claims and Evidence

Most claims are supported by the quantitative results in Tables 2 and 3.

Methods and Evaluation Criteria

The proposed methods and evaluation criteria make sense for this problem. To verify the generalisation of PercpCast, its performance is evaluated across two datasets and compared with several SOTA methods.

Theoretical Claims

The use of Rectified Flow is well grounded in ample previous work, and I have no concern about it here. However, I am confused about what the authors attempt to show in Appendix A by proving $\mathbb{E}[\|\hat{Z}_1 - \hat{Y}^*\|^2] \leq \mathbb{E}[\|Y - \hat{Y}^*\|^2]$. The said MSE is between the final prediction and the precipitation estimator’s output, not the ground truth. Do the authors intend to use this to show that equation 20 is small enough so that equation 2 can also be satisfied?

Experimental Designs or Analyses

The experimental design is generally fine, except for a minor problem:

  • LPIPS is chosen to be one of the evaluation metrics. Meanwhile, it is introduced as a regularization term during the training of PercpCast. This might have fairness issues compared with other baselines. The authors can consider using FVD or pooled CSI as an alternative to LPIPS like the PreDiff paper.

Supplementary Material

The appendix is read and reviewed.

Relation to Broader Scientific Literature

This work adopts a common strategy of using a precipitation estimator and fine-tuning module for precipitation nowcasting like DiffCast and CasCast. Different from previous works like DiffCast and PreDiff which utilize diffusion models in the second stage, the use of the Rectified Flow Module presents a similar but novel idea to model the difference in distribution. I believe this will very much benefit future studies.

Essential References Not Discussed

Most essential references are discussed. It will be better to compare a few more diffusion-based models with PercpCast, such as PreDiff and CasCast (Gong et al., ICML 2024) since they have a closer structure with PercpCast.

Other Strengths and Weaknesses

This section summarizes the strengths and weaknesses discussed above.

Strengths:

  • Using Rectified Flow to learn the distribution difference between the posterior mean and the ground truth is quite a new idea in this task.
  • It showcases its remarkable performance compared with SOTA in the perspectives of perceptuality and accuracy.

Weaknesses:

  • The current evaluation scheme (LPIPS) might be unfair.
  • A few minor but confusing parts in the appendix.

Overall, this paper delivers an interesting solution to precipitation nowcasting. Judging from the good performance results, I am inclined to accept the paper.


Update after rebuttal

The authors mostly addressed my concerns and adopted my suggestions. I will keep the recommendation.

Other Comments or Suggestions

  • Some information is not very consistent. In Table 1, the output sequence length is shown to be 49, but in the main text it is described to be 36. Does that mean the PercpCast model also reconstructs the input besides forecasting the future?
  • The writing in the appendix is quite messy, especially in Appendix C. Please proofread and fix.
  • A lot of previous works (PreDiff, DiffCast, CasCast, etc.) also report a pooled CSI with different thresholds to evaluate the “skillfulness” of the forecasts. Observing the tables in the papers, realistic and clear forecasts tend to have higher pooled CSI. This will also be a good indicator to replace LPIPS.

Author Response

We are grateful for the reviewer's acknowledgment of our work and their detailed feedback, which will help us refine our research.

Theoretical Claims.

Equation (2) can be solved through either Equation (20) or our proposed method, which have different error bounds. Freirich et al. showed that the theoretical optimal solution for Equation (20) is 2MMSE. Here we show that the error bound of our method does not exceed 2MMSE by proving $\mathbb{E}\left[\left\|\hat{Z}_1-\hat{Y}^*\right\|^2\right] \leq \mathbb{E}\left[\left\|Y-\hat{Y}^*\right\|^2\right]$ in Appendix A. Since $\hat{Z}_1$ is the final output, under the independence assumptions we reach the conclusion using the following equation:

$$
\begin{aligned}
\mathbb{E}\left[\left\|Y-\hat{Z}_1\right\|^2\right] &= \mathbb{E}\left[\left\|Y-\hat{Y}^*\right\|^2\right]+\mathbb{E}\left[\left\|\hat{Z}_1-\hat{Y}^*\right\|^2\right] \\
&\leq 2\,\mathbb{E}\left[\left\|Y-\hat{Y}^*\right\|^2\right] = 2\,\mathrm{MMSE}
\end{aligned}
$$
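
The equality in the first line relies on a cross term vanishing. The following is a sketch of that step under two assumptions that are implied but not spelled out in the response: the estimator output equals the conditional mean exactly, $\hat{Y}^* = \mathbb{E}[Y \mid X]$, and $Y$ and $\hat{Z}_1$ are conditionally independent given $X$.

```latex
\begin{aligned}
\mathbb{E}\left[\|Y-\hat{Z}_1\|^2\right]
  &= \mathbb{E}\left[\|(Y-\hat{Y}^*) - (\hat{Z}_1-\hat{Y}^*)\|^2\right] \\
  &= \mathbb{E}\left[\|Y-\hat{Y}^*\|^2\right]
   + \mathbb{E}\left[\|\hat{Z}_1-\hat{Y}^*\|^2\right]
   - 2\,\mathbb{E}\left[\langle Y-\hat{Y}^*,\ \hat{Z}_1-\hat{Y}^* \rangle\right], \\[4pt]
\mathbb{E}\left[\langle Y-\hat{Y}^*,\ \hat{Z}_1-\hat{Y}^* \rangle \,\middle|\, X\right]
  &= \left\langle \mathbb{E}[Y \mid X]-\hat{Y}^*,\ \mathbb{E}[\hat{Z}_1 \mid X]-\hat{Y}^* \right\rangle
   = 0 .
\end{aligned}
```

The inner product factorizes because $\hat{Y}^*$ is a function of $X$ and the two remaining variables are conditionally independent; its first factor is zero by the definition of the conditional mean. Taking the expectation over $X$ removes the cross term, and the bound proved in Appendix A then gives the factor of 2.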

W1 & S3.

Thank you for the advice. We ran additional experiments with CasCast and report the pooled CSI results in the following table. The experimental results further validate the effectiveness of our method. The results will be updated in the revised manuscript.

| Method | SEVIR Pool1 | SEVIR Pool4 | SEVIR Pool16 | MeteoNet Pool1 | MeteoNet Pool4 | MeteoNet Pool16 |
|---|---|---|---|---|---|---|
| MAU | 0.241 | 0.268 | 0.285 | 0.197 | 0.231 | 0.260 |
| ConvLSTM | 0.240 | 0.266 | 0.292 | 0.192 | 0.236 | 0.264 |
| SimVP | 0.241 | 0.263 | 0.283 | 0.165 | 0.196 | 0.214 |
| Earthformer | 0.214 | 0.254 | 0.265 | 0.158 | 0.189 | 0.207 |
| Earthfarseer | 0.209 | 0.252 | 0.267 | 0.161 | 0.193 | 0.212 |
| STRPM | 0.213 | 0.236 | 0.271 | 0.154 | 0.190 | 0.203 |
| CasCast | 0.238 | 0.262 | 0.289 | 0.183 | 0.207 | 0.231 |
| DiffCast | 0.244 | 0.270 | 0.294 | 0.199 | 0.235 | 0.265 |
| PercpCast | 0.267 | 0.287 | 0.299 | 0.209 | 0.240 | 0.268 |
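
For reference, a pooled CSI at a single threshold can be sketched as below. This is an illustrative assumption about the metric, not the evaluation code behind the table: it uses non-overlapping max pooling, whereas published implementations may differ in stride or pooling type, and the function names are hypothetical.

```python
import numpy as np

def max_pool2d(binary, size):
    """Non-overlapping max pooling over size x size blocks (H and W must be divisible by size)."""
    h, w = binary.shape
    blocks = binary.reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

def pooled_csi(pred, truth, threshold, pool=1):
    """CSI = hits / (hits + misses + false alarms), after thresholding and max pooling."""
    p = max_pool2d(pred >= threshold, pool)
    t = max_pool2d(truth >= threshold, pool)
    hits = np.sum(p & t)
    misses = np.sum(~p & t)
    false_alarms = np.sum(p & ~t)
    denom = hits + misses + false_alarms
    return hits / denom if denom > 0 else float("nan")

# A forecast whose single rain pixel is displaced by one pixel from the truth.
truth = np.zeros((4, 4)); truth[0, 0] = 1.0
pred = np.zeros((4, 4)); pred[1, 1] = 1.0

csi_pool1 = pooled_csi(pred, truth, threshold=0.5, pool=1)
csi_pool2 = pooled_csi(pred, truth, threshold=0.5, pool=2)
print(csi_pool1, csi_pool2)
```

The toy case shows why pooled scores rise with pool size: a small spatial displacement is penalized fully at Pool1 but forgiven once both pixels fall inside the same pooling block.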

W2 & S2.

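Thanks for your careful review. We have reviewed Appendix C and corrected the experimental results in Tables 5–6, which will be updated as follows:

| ($\mathcal{L}_{pe}$, $\mathcal{L}_{rf}$, $\mathcal{L}_{lpips}$) | CSI | HSS | SSIM | LPIPS | MSE |
|---|---|---|---|---|---|
| (0, 1, 0.5) | 0.044 | 0.312 | 0.311 | 0.369 | 0.0217 |
| (1, 0, 0.5) | 0.240 | 0.307 | 0.663 | 0.233 | 0.0085 |
| (1, 1, 0.0) | 0.256 | 0.328 | 0.701 | 0.324 | 0.0102 |
| (2, 1, 0.5) | 0.266 | 0.360 | 0.717 | 0.269 | 0.0091 |
| (1, 2, 0.5) | 0.264 | 0.355 | 0.712 | 0.270 | 0.0093 |
| (1, 1, 0.5) | 0.267 | 0.360 | 0.722 | 0.268 | 0.0092 |
| (1, 1, 1.0) | 0.265 | 0.358 | 0.711 | 0.272 | 0.0094 |

| $K$ | CSI | HSS | SSIM | LPIPS |
|---|---|---|---|---|
| 0.00 | 0.262 | 0.348 | 0.703 | 0.278 |
| 0.02 | 0.263 | 0.343 | 0.709 | 0.276 |
| 0.05 | 0.267 | 0.360 | 0.722 | 0.268 |
| 0.07 | 0.266 | 0.352 | 0.716 | 0.265 |
| 0.1 | 0.266 | 0.346 | 0.705 | 0.280 |
| 0.2 | 0.250 | 0.327 | 0.682 | 0.292 |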

S1.

Thank you for identifying this issue. The precipitation estimator reconstructs the 13 input frames and predicts 36 future frames, resulting in a total sequence length of 49. To eliminate ambiguity, we will revise Table 1 as follows:

| Dataset | Train | Valid | Test | Seq Len (In) | Seq Len (Out) | Spatial Resolution (H × W) |
|---|---|---|---|---|---|---|
| SEVIR | 13020 | 1000 | 2000 | 13 | 36 | 128 × 128 |
| MeteoNet | 8640 | 500 | 1500 | 13 | 36 | 128 × 128 |

Review
Rating: 3

This article proposes a new precipitation forecasting model PercpCast, which introduces perceptual constraints into precipitation forecasting tasks. This method first uses ConvLSTM as a precipitation estimator to obtain the posterior mean sequence of future frames. Then, a module based on "rectified flow" is used to adjust the distribution of the posterior mean sequence to the distribution of the real target frame. Finally, a distance weighted frame sampling strategy is used to further enhance the attention to future frames. The experimental part was thoroughly validated on two public datasets, SEVIR and MeteoNet, and the results showed that the method exhibited certain advantages in perceptual quality (LPIPS, SSIM) and event detection metrics (CSI, HSS) while maintaining a low mean square error (MSE).

Questions for Authors

See the above questions.

Claims and Evidence

Yes. The problem being addressed is the inaccurate landing point of precipitation in GAN- and diffusion-based forecasts, that is, the inability to balance CSI against the image quality metric LPIPS.

Methods and Evaluation Criteria

Yes, it does.

Theoretical Claims

The paper states: "The images are rescaled to the range [0, 1] and binarized." Are you sure about "binarized"? Because the forecast is based on values in the range of 0-255, "binarized" doesn't look right.

Experimental Designs or Analyses

  1. The paper's stated innovation is that current GAN- and diffusion-based refinement relies on random sampling and therefore cannot balance CSI and the image quality metric LPIPS (poor CSI, good LPIPS). Under the LPIPS perceptual constraint, the second-stage flow matching can follow a deterministic path towards the target distribution, reducing errors in the landing points of high echoes. The obvious difference between this method and diffusion-based ones is that flow matching replaces diffusion for refinement in the second stage. However, the paper lacks an ablation that compares diffusion with your Rectified Flow model when ConvLSTM is used as the precipitation estimator. This makes it difficult to verify the advantage of flow matching in balancing CSI and LPIPS, and it is unclear whether the temperature weighting, the LPIPS loss, or flow matching itself leads to that advantage.
  2. The quantitative comparison includes MAU and Earthfarseer, but there is no visual comparison with these two models (Figure 3 and Figure 4, as well as the visualizations in the supplementary materials).

Supplementary Material

Yes, it contains analysis and proof, datasets, more experimental analysis, and more precipitation cases.

Relation to Broader Scientific Literature

The problem being addressed is the inaccurate landing point of precipitation in GAN- and diffusion-based forecasts, that is, the inability to balance CSI against the image quality metric LPIPS.

Essential References Not Discussed

No.

Other Strengths and Weaknesses

Strength: The innovation lies in the use of flow matching in the second stage, together with a strategy that assigns greater weights to frames with longer lead times, resulting in better forecast performance after 1 hour compared to other models. The motivation is clear.

Weakness: Some comparison methods are relatively old, and the paper lacks comparisons with more recent representative SOTA methods.

Other Comments or Suggestions

Fig. 4 with 'preparation study (in the blue box)' appears to be a black box, not a blue box.

Author Response

We thank the reviewer for the valuable suggestions. We will try to address the reviewer's concerns and are eager to engage in a more detailed discussion with the reviewer.

Theoretical Claims.

Thank you for pointing out this issue. We perform normalization (not binarization) to rescale images to [0, 1]: SEVIR (0–255) is divided by 255, and MeteoNet (0–70) by 70. We will replace 'binarized' with 'normalized' in the revised manuscript.
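
The rescaling described here amounts to a division by each dataset's maximum raw value. A minimal sketch follows; the constant and function names are hypothetical, and only the divisors 255 and 70 come from the response.

```python
import numpy as np

# Maximum raw values stated in the response; the dictionary name is illustrative.
MAX_VALUE = {"SEVIR": 255.0, "MeteoNet": 70.0}

def normalize(frames, dataset):
    """Rescale raw radar frames to [0, 1] by dividing by the dataset's maximum value."""
    return np.asarray(frames, dtype=np.float32) / MAX_VALUE[dataset]

def denormalize(frames, dataset):
    """Map normalized frames back to the dataset's raw value range."""
    return np.asarray(frames, dtype=np.float32) * MAX_VALUE[dataset]

sevir_norm = normalize([0, 255], "SEVIR")
meteonet_norm = normalize([35], "MeteoNet")
print(sevir_norm, meteonet_norm)
```

Unlike binarization, this keeps the full intensity range, so thresholds defined on raw values can still be applied after scaling them by the same divisor.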

Experimental Designs Or Analyses 1

We would like to clarify that, unlike diffusion models that reconstruct precipitation predictions via conditional integration, our method employs end-to-end learning to directly optimize the posterior mean sequence distribution from the precipitation estimator. To enhance this framework, we introduce two key components: temperature-weighted scaling and LPIPS perceptual loss. Comprehensive ablation studies demonstrate: (1) LPIPS regularization successfully suppresses checkerboard artifacts in rectified flow, enhancing visual coherence (Tables 4 & 5); (2) temperature weighting significantly improves long-term frame prediction accuracy (Tables 4 & 6); (3) the rectified flow module models the data distribution well, generating meteorologically plausible precipitation patterns that address issues such as high echo attenuation and missing details (Table 5, Figures 4 & 6).

To further compare the performance of diffusion models and Rectified Flow in precipitation prediction, we conducted experiments by replacing the Rectified Flow module with a diffusion model. Specifically, due to the instability caused by adapting end-to-end training to diffusion models, we first constructed a pre-trained precipitation estimator. While keeping other configurations unchanged, we then utilized noise and predicted frame as inputs for diffusion modeling during the frame sampling process. Additionally, we employed CasCast as a baseline comparison. CasCast is a non-end-to-end precipitation prediction framework where its first stage originally uses a Vision Transformer (ViT) for precipitation estimation, followed by a second stage that applies diffusion models for distribution refinement. In our implementation, we replaced CasCast's ViT-based precipitation estimator with a ConvLSTM model. The experimental results, presented in the following Table, further validate the effectiveness of the rectified flow model. The results will be supplemented in the revised manuscript.

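| Method | SEVIR CSI | SEVIR HSS | SEVIR SSIM | SEVIR MSE | SEVIR LPIPS | MeteoNet CSI | MeteoNet HSS | MeteoNet SSIM | MeteoNet MSE | MeteoNet LPIPS |
|---|---|---|---|---|---|---|---|---|---|---|
| with Diffusion | 0.223 | 0.288 | 0.697 | 0.0135 | 0.297 | 0.177 | 0.269 | 0.797 | 0.0065 | 0.268 |
| CasCast | 0.238 | 0.301 | 0.709 | 0.0120 | 0.285 | 0.183 | 0.274 | 0.810 | 0.0062 | 0.252 |
| Proposed Model | 0.267 | 0.360 | 0.722 | 0.0092 | 0.268 | 0.209 | 0.305 | 0.820 | 0.0049 | 0.237 |

| Method | SEVIR CSI74 | SEVIR CSI133 | SEVIR CSI160 | SEVIR CSI181 | SEVIR CSI219 | MeteoNet CSI16 | MeteoNet CSI24 | MeteoNet CSI32 | MeteoNet CSI36 | MeteoNet CSI40 |
|---|---|---|---|---|---|---|---|---|---|---|
| with Diffusion | 0.437 | 0.185 | 0.075 | 0.054 | 0.021 | 0.299 | 0.215 | 0.098 | 0.035 | 0.022 |
| CasCast | 0.440 | 0.193 | 0.105 | 0.067 | 0.023 | 0.315 | 0.228 | 0.108 | 0.043 | 0.020 |
| Proposed Model | 0.496 | 0.251 | 0.134 | 0.099 | 0.037 | 0.354 | 0.276 | 0.132 | 0.068 | 0.027 |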

Experimental Designs Or Analyses 2

These materials will be supplemented in the revised manuscript.

W

Our experiments already include the SOTA methods DiffCast (CVPR 2024) [1] and Earthfarseer (AAAI 2024) [2]. Following your suggestion, we also compare our method with CasCast (ICML 2024) [3], as shown in the tables above, and the results will be updated in the revised manuscript. This ensures the necessary comparisons with 2024 conference benchmarks.

[1] Yu, D., Li, X., Ye, Y., Zhang, B., Luo, C., Dai, K., Wang, R., and Chen, X. Diffcast: A unified framework via residual diffusion for precipitation nowcasting. In The IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024.

[2] Wu, H., Liang, Y., Xiong, W., Zhou, Z., Huang, W., Wang, S., and Wang, K. Earthfarseer: Versatile spatio-temporal dynamical systems modeling in one model. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pp. 15906–15914, 2024.

[3] Gong, J., Bai, L., Ye, P., Xu, W., Liu, N., Dai, J., Yang, X., and Ouyang, W. Cascast: Skillful high-resolution precipitation nowcasting via cascaded modelling. In International Conference on Machine Learning, pp. 15809–15822. PMLR, 2024.

S

Thank you for identifying this inconsistency. We will correct the description from "blue box" to "black box" in the revised manuscript.

Review
Rating: 4

This paper proposes a precipitation forecast model based on perceptual constraints. Its main contributions include: proposing a new perspective on the precipitation forecast problem, that is, converting it into a posterior mean square error problem under specific constraints; designing a model architecture based on a precipitation estimator and a rectified flow to predict precipitation sequences while maintaining their authenticity and continuity; and proposing a weighted sampling strategy for long-distance frames to improve the model's prediction ability for long-term sequences. Experimental results show that the model has better prediction accuracy than the existing best models.

Questions for Authors

Please refer to the weaknesses.

Claims and Evidence

The precipitation forecast model proposed in this paper is based on a new perspective and its effectiveness is demonstrated through experiments.

Methods and Evaluation Criteria

The method and evaluation criteria proposed in this paper are meaningful for solving the current precipitation forecast problem. This paper proposes a new perspective to solve the problems of existing methods in predicting long series, and adopts appropriate evaluation indicators to measure the accuracy and perceived quality of the model.

Theoretical Claims

This paper argues that the introduction of perceptual constraints can improve the performance of the current precipitation forecast model. Specifically, the model transforms the current precipitation forecast problem into a posterior mean square error problem and implements perceptual constraints by constructing a transmission between distributions. The experimental results of the model show that its performance is better than the current state-of-the-art model.

Experimental Designs or Analyses

From the paper, it is evident that the authors have considered multiple factors in their experimental design and analysis, conducting detailed comparisons and evaluations. They selected several representative baseline models for comparison and tested the model performance under different parameter settings. Additionally, the authors provided a thorough explanation of the hyperparameter selection process and presented concrete experimental results, including specific data and figures. Therefore, I believe the experimental design and analysis in this paper are sound.

Supplementary Material

The supplementary material of this paper provides an introduction to the dataset and more experimental analysis.

Relation to Broader Scientific Literature

The main contribution of this paper is the proposal of a perceptually constrained precipitation prediction model, which improves prediction accuracy and image quality by introducing perceptual constraints. This is different from the current precipitation prediction methods that only focus on minimizing the mean square error (MSE). This model addresses the limitations of existing methods by reconstructing the precipitation prediction problem and using perceptual constraints. The model also uses a sparse sampling strategy based on the attention mechanism and a residual flow structure to enhance the ability to focus on distant frames and capture future changes. These methods have better performance and stability than existing precipitation prediction methods. Therefore, the research results of this paper are meaningful for improving related research in the field of precipitation prediction.

Essential References Not Discussed

N/A

Other Strengths and Weaknesses

Strengths:

A new perceptually constrained precipitation prediction model is proposed, which can effectively improve the prediction accuracy and image quality.

The residual flow structure and sparse sampling strategy are used to enhance the ability to focus on distant frames and capture future changes.

Experimental verification is carried out on two public datasets, and better performance and stability are achieved than existing methods.

Weaknesses:

The prediction effect in some extreme cases has not been analyzed in detail and needs further discussion.

The experimental results do not provide detailed parameter settings and hyperparameter adjustment processes, making it difficult to reproduce the experimental results.

Other Comments or Suggestions

N/A

Author Response

We thank the reviewer for recognizing our ideas and theory.

W1

Thank you for your question. Thanks to the perceptual constraint, our model accurately preserves the high-value regions of the predicted image, which indicate extreme storms. As shown in Figures 4 and 13, our model accurately predicts the evolution of the heavy precipitation band (above 160) and gives reliable intensity estimates. Despite this advantage, our model may fail to predict sudden convective storms in which precipitation develops abruptly with no storm signal at the beginning. Improving such predictions requires incorporating atmospheric variables, including temperature, humidity, and wind patterns during precipitation formation, which is a key objective of our subsequent research. We will add the necessary discussion in our final version.

W2

We have elaborated on the impact of weight configurations for the loss functions and distance sampling (Tables 4-6) in both the experiments and the appendices. For other hyperparameters (e.g., learning rates), we identified appropriate values within the range of 1e-3 to 1e-5 and documented them in the main text (Section 5.1, Implementation Details). All hyperparameters in the model have been specified, and the experimental code will be made publicly available on a community platform shortly.

Final Decision

All reviewers support the acceptance of the paper, with two Accepts and two Weak Accepts. The main contribution is an end-to-end precipitation prediction model based on perceptual constraints using a Rectified Flow framework. I agree with the reviewers that the paper has merits and therefore recommend acceptance. However, the authors are encouraged to incorporate the points discussed in the rebuttal into the final version.