PaperHub
Overall rating: 5.7/10 (Rejected)
6 reviewers; ratings 8, 6, 6, 3, 8, 3 (min 3, max 8, std 2.1)
Confidence: 4.5 | Correctness: 2.5 | Contribution: 2.5 | Presentation: 3.0
ICLR 2025

PGLearn - An Open-Source Learning Toolkit for Optimal Power Flow

Submitted: 2024-09-27 | Updated: 2025-02-05
TL;DR

Open-source dataset and learning toolkit for optimal power flow

Abstract

Keywords

optimal power flow, machine learning, dataset

Reviews & Discussion

Review (Rating: 8)

The paper tackles an important issue in the field of machine learning for Optimal Power Flow (OPF) problems by introducing PGLearn, a standardized suite of datasets and evaluation tools. This addresses the lack of uniformity in data and metrics that has hindered progress in this area.

Strengths

• The paper presents robust procedures for generating realistic datasets.
• It offers support for multiple OPF formulations, including AC, DC, and second-order cone models.
• The dataset is structured in the N-1 contingency format, which can be used to enhance power systems reliability.
• Standardized datasets are made publicly accessible for broader use.
• A comprehensive toolkit is provided for training, evaluating, and benchmarking machine learning models for OPF.
• The limitations of the dataset are clearly documented and reported for transparency.

Weaknesses

• The input data is generated using a uniform sampling procedure; to extend the work in the future to Unit Commitment or multi-period OPF, time-series data should also be incorporated. For instance, the recent paper "PowerGraph: A Power Grid Benchmark Dataset for Graph Neural Networks" provides an OPF dataset that uses time-series data to model input demand.

Questions

• Could you provide comments on the results of benchmarking the datasets using the provided metrics across different machine-learning methods? A leaderboard would be the easiest way to access and interpret these results.

Comment

We thank the reviewer for their close review and insightful comments. Detailed answers to the reviewer's questions follow.

The input data is generated using a uniform sampling procedure; to extend the work in the future to Unit Commitment or multi-period OPF, time-series data should also be incorporated. For instance, the recent paper "PowerGraph: A Power Grid Benchmark Dataset for Graph Neural Networks" provides an OPF dataset that uses time-series data to model input demand.

We thank the reviewer for pointing us to this recent paper. We agree that capturing temporal structure is an important avenue for generating more realistic datasets, and we intend to do this in future versions of PGLearn. Please refer to our general response for additional details.

Could you provide comments on the results of benchmarking the datasets using the provided metrics across different machine-learning methods? A leaderboard would be the easiest way to access and interpret these results.

We thank the reviewer for their suggestion to create a leaderboard. While we unfortunately do not have the resources (both time-wise and computing-wise) to re-implement and benchmark all existing ML methodologies for Optimal Power Flow problems, we agree that a public leaderboard would provide valuable information to the community. We hope to develop a community-driven leaderboard where researchers can submit their results, to facilitate the integration of multiple methodologies, similar to, e.g., a Kaggle competition.

Comment

I thank the authors for the point-by-point response. I understand the challenges and limitations you mentioned. I would recommend rephrasing the statement in the paper that implies a full benchmarking was conducted, as it may not accurately reflect the current state of the work.

Comment

We thank the reviewer for their suggestion to improve the paper. We indeed do not want to convey that we conducted a full benchmark, and would appreciate it if the reviewer could point us to the specific statement they are referring to, so we can update it accordingly.

Review (Rating: 6)

This paper presents PGLearn, a set of datasets and evaluation tools for optimal power flow (OPF), which builds upon the lessons of existing open-source initiatives. Its main contribution lies in a different data generation process that considers local and global variability, better representing actual use cases. It supports different formulations of the OPF problem, namely AC, DC, and second-order cone formulations. The motivation behind this resource is to standardize performance evaluation in the OPF community, which appears fragmented and lacking in reproducibility between contributions.

Strengths

  • The paper revisits the OPF problem and identifies a gap in existing approaches, such as the sampling strategy. The authors then propose a better method that considers the correlation between loads and better represents real-life scenarios.
  • Including the primal and dual solutions is another novelty, as few existing datasets include the dual, which specific research directions need.
  • The authors propose a complete pipeline to interact with PGLearn containing standardized datasets, existing optimizers, data augmentation schemes, distinct OPF formulations, and metric calculation.
  • The paper is well structured and clear, and introduces the contribution smoothly.
  • The authors clearly stated the limitations of their contribution.

Weaknesses

  • In Algorithm 1, the notation of what you return could be misleading, as you use (p^d, q^d) for both the reference demand and the sampled demand. It looks as if you are returning the same reference value at the end of the algorithm.
  • The sampling strategy requires tuning ϵ and the global range. You could add more detail on how to pick these values, giving an example or explaining why you chose the quantities you set.
  • The supplementary material is not self-explanatory for someone outside of the OPF domain; I can understand each file from what I read in your paper's appendix, but you could have added basic instructions. Without instructions, it is just a folder of h5 files.
  • For a work like yours, you should have added an anonymized version of your code so readers can assess how you implemented what you described only in words in the main paper.
  • If sharing the code is restrictive at this stage, you could also give an idea of the workflow of PGLearn so readers can get an idea of what kind of interaction they can have with your dataset.

Questions

  • When you mention having over 1,000,000 OPF samples, are they all precomputed in your dataset, or do you provide a method to access the results (h5 files)?
  • Can you please provide me with step-by-step instructions on empirically using the supplementary material to check your contribution?
  • How can you ensure that your sampling strategy stays within normal ranges? Why did you choose the values you picked?
  • In section 5.1, what is the main difference between the Optimality gap and the Distance to optimal solution?

Ethics Concerns

N/A

Comment

We thank the reviewer for their close review and valuable feedback. Detailed answers to the reviewer's questions follow.

In Algorithm 1, the notation of what you return could be misleading, as you used the same symbols for the reference demand and the sampled demand. It looks as if you are returning the same reference value at the end of the algorithm.

We thank the reviewer for their careful review of the notation used in the paper. We have updated the notation in Algorithm 1 to avoid confusion.

The supplementary material is not self-explanatory for someone outside of the OPF domain; I can understand each file from what I read in your paper's appendix, but you could have added basic instructions. Without instructions, it is just a folder with h5 files.

For a work like yours, you should have added an anonymized version of your code so readers can assess how you implemented what you described only in words in the main paper.

We thank the reviewer for raising this point, and have added anonymized code repositories to the supplementary material. Please review the README files at the root and within each repository for basic usage instructions, and note that the source code is documented via docstrings. The official repositories (which cannot be shared under ICLR anonymity rules) also provide live documentation.
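For readers outside the OPF domain, a minimal sketch of inspecting such HDF5 files with the `h5py` library follows. The file name and dataset keys below are hypothetical, for illustration only; the actual layout is described in the README files.

```python
import h5py
import numpy as np

# Write a small HDF5 file mimicking a hypothetical layout; the real
# PGLearn group/key names are documented in the repository READMEs.
with h5py.File("example.h5", "w") as f:
    f.create_dataset("input/pd", data=np.random.rand(4, 3))    # active demands
    f.create_dataset("primal/pg", data=np.random.rand(4, 2))   # generator setpoints

# Read it back: discover the stored groups/datasets, then load one
# dataset into a NumPy array.
with h5py.File("example.h5", "r") as f:
    names = []
    f.visit(names.append)       # collects "input", "input/pd", ...
    pg = f["primal/pg"][:]      # load as a NumPy array

print(names)
print(pg.shape)                 # (4, 2)
```

`h5py` exposes datasets as NumPy arrays, so the files can be consumed directly by any Python ML stack.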

Comment

I thank the authors for addressing some of my comments in the Weaknesses section, especially for including the anonymized code in the supplementary material.

I understand that you might be working on addressing the concerns of other reviewers who have numerous other questions. Still, I would appreciate it if you could address my questions, as you overlooked them and focused only on the weaknesses.

Review (Rating: 6)

The paper introduces PGLearn, a suite of standardized datasets and evaluation tools designed to advance ML applications in OPF problems. It aims to standardize the generation of varied data and evaluation metrics by providing realistic datasets that reflect real-world conditions and support multiple OPF formulations.

Strengths

  1. The dataset is timely and important given the current active research in ML for OPF
  2. The presentation is clear and easy to follow

Weaknesses

  1. To the best of my understanding, the paper mainly focuses on introducing the dataset; it does not deliver technical advances in the research field
  2. While the datasets aim to capture real-world conditions, unforeseen variability in actual grids may still limit the applicability of the models trained on these datasets.
  3. It focuses only on OPF problems, which may not fit the general audience of the ML community

Questions

  1. The authors are suggested to include a general review of ML-for-OPF papers, e.g., how ML is generally applied to OPF problems. There have been several literature review papers on this.

  2. The authors are suggested to state clearly whether the kit contains functions to train/test different ML models. If so, please clarify how to integrate the numerous DNN models and approaches; if not, how should the mentioned metrics be calculated? Is it still up to the user to do so manually?

  3. It would also be useful to add a set of adversarial samples to the dataset for robustness testing under worst-case scenarios

  4. Can the suite allow users to self-define the problem and use, e.g., API calls to load the generated dataset? This would be helpful for users who have uncommon reformulated OPF formulations or other general optimization formulations

Comment

We thank the reviewer for their careful reading of our paper, and insightful feedback. Answers to the reviewer's specific questions follow. The response is split into two comments due to the character limit.

While the datasets aim to capture real-world conditions, unforeseen variability in actual grids may still limit the applicability of the models trained on these datasets.

We thank the reviewer for raising this point, as we recognize the importance of capturing real-world conditions as closely as possible. As discussed in the paper, the vast majority of existing works consider only uncorrelated load-level noise, which yields a very narrow range of total demand across samples (as shown in Figure 2). In contrast, our approach explicitly captures total demand fluctuations by introducing a global scaling factor (denoted by b in Algorithm 1). Figure 2 demonstrates that this strategy not only yields a broader range of total demand, but also results in datasets with more complex dynamics. We believe this is a meaningful step towards more realistic datasets, as this mechanism better captures diurnal/seasonal variations in total demand. Nonetheless, we agree that more sophisticated correlation structures derived from real-world data are valuable for future research, and we intend to include this aspect in future releases of PGLearn.
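For intuition, the effect of this global scaling factor on the spread of total demand can be sketched as follows (the reference demands and noise ranges are hypothetical, not the exact values used in PGLearn):

```python
import numpy as np

rng = np.random.default_rng(42)
n_loads, n_samples = 100, 10_000
pd_ref = rng.uniform(10.0, 100.0, n_loads)  # hypothetical reference demands

# Standard approach: independent per-load noise only; fluctuations
# average out across loads, so total demand barely moves.
eps = rng.uniform(0.9, 1.1, (n_samples, n_loads))
total_local = (eps * pd_ref).sum(axis=1)

# With a global scaling factor b applied on top of the local noise,
# total demand varies over a much wider range.
b = rng.uniform(0.8, 1.2, (n_samples, 1))
total_global = (b * eps * pd_ref).sum(axis=1)

print(total_local.std(), total_global.std())
```

Because the per-load noise averages out over many loads, only the global factor materially widens the distribution of total demand, which is the effect illustrated in Figure 2.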

It focuses only on OPF problems, which may not fit the general audience of the ML community

We thank the reviewer for this feedback regarding the breadth of PGLearn. We would like to point out that there has been significant interest from the ML community in developing new methodologies for optimization problems that arise in power systems, as demonstrated by the vast body of existing work (see the survey papers we cite and the references therein). Furthermore, OPF has proved to be a popular testbed for developing general-purpose ML methodologies for optimization and constrained learning, including DC3, Primal-Dual Learning, Gauge Mapping, Homeomorphic Projections, and Dual Optimization Learning. In that context, PGLearn provides large-scale datasets that comprise multiple OPF formulations spanning three classes of optimization problems: ACOPF (nonlinear, non-convex), SOC-OPF (nonlinear, convex), and DCOPF (linear, convex). The collection also provides user-friendly, vectorized data structures that are readily usable by ML researchers, without needing a complete understanding of power systems dynamics. Therefore, we believe that PGLearn has the potential to support not only research in ML for OPF problems, but also broader topics such as ML for optimization and constrained learning.

To the best of my understanding, the paper mainly focuses on introducing the dataset; it does not deliver technical advances in the research field

The main contributions of this paper are 1) the PGLearn dataset, 2) the standardized evaluation methodology, and 3) the accompanying open source data-generation and evaluation code. We believe these contributions align with the scope and objective of the "datasets and benchmarks" track which this paper has been submitted to.

It would also be useful to add a set of adversarial samples to the dataset for robustness testing under worst-case scenarios

We thank the reviewer for raising this point regarding adversarial samples and robustness. To the authors' knowledge, adversarial samples are typically considered "adversarial" against a particular model. We are curious what exactly the reviewer has in mind regarding adversarial OPF samples at the dataset level.

The authors are suggested to include a general review of ML-for-OPF papers, e.g., how ML is generally applied to OPF problems. There have been several literature review papers on this.

We thank the reviewer for their suggestion. The paper's introduction and related work sections (Section 1 and Section 1.2) provide an overview of existing methods and refer the reader to the relevant literature review papers. We have added a citation to an earlier survey paper on machine learning and optimal power flow: Hasan, F., Kargarian, A., & Mohammadi, A. (2020). A survey on applications of machine learning for optimal power flow. 2020 IEEE Texas Power and Energy Conference (TPEC). Space limitations prevent us from providing a self-contained literature review in the paper, which is why we elected to cite existing surveys on the topic.

Comment

This comment continues the response to Reviewer ACrA.

The authors are suggested to state clearly whether the kit contains functions to train/test different ML models. If so, please clarify how to integrate the numerous DNN models and approaches; if not, how should the mentioned metrics be calculated? Is it still up to the user to do so manually?

We thank the reviewer for their valuable question; please note that we have included anonymized source code in our supplementary material. As mentioned in the paper, AnonymousRepo1 handles data generation, i.e., OPF input data generation, OPF instance building and solving, solution extraction, and data post-processing. AnonymousRepo2 provides standardized APIs for evaluating ML models in a generic fashion. While AnonymousRepo2 does include functionality to train some baseline ML models, users may elect to build and train their own models and evaluate them using our tool. The documentation contains more details on the exact APIs for computing the mentioned metrics. In general, the design allows us to provide best-practices training/modeling tools within AnonymousRepo2 while also supporting, e.g., users who have already implemented a model and wish to evaluate its performance on the PGLearn datasets.

Can the suite allow users to self-define the problem and use, e.g., API calls to load the generated dataset? This would be helpful for users who have uncommon reformulated OPF formulations or other general optimization formulations

Both libraries are designed to be modular: adding a custom formulation, data augmentation scheme, etc., is straightforward. Internally, the libraries use generic APIs where possible, such that many of the features, e.g., the parallelization/HPC integration, dataset reading/writing, constraint violation utilities, PyTorch layers, and loss functions, can be reused directly without modification when using a custom formulation.


Review (Rating: 3)

This work describes a pipeline for sampling training data for machine-learning tasks on Optimal Power Flow (OPF). It proposes a unified framework for sampling training samples for the OPF problem based on pre-defined grid models. The proposed PGLearn implements realistic data generation procedures that capture load-side variability, and a set of OPF formulations is implemented.

Strengths

  • The idea of performing standard sampling of OPF data is constructive and shall help the power systems and machine learning community.
  • The paper calls for a standard implementation and testing of Machine learning for OPF problems, which are important.
  • The paper provides descriptions of a set of metrics for algorithm evaluation.

Weaknesses

  • The proposed method is still limited to user-defined synthetic settings; scalability to other grid models/data distributions, other power system tasks, or settings other than supervised learning is not discussed.
  • The contribution is limited to dataset creation and standardized testing. No algorithmic insights or new findings on learning to solve optimal power flow problems are reported.
  • Data augmentation is treated as the sole solution for tackling the non-standardized implementation problem in ML-OPF. Are there other choices, such as sim-to-real transfer, multi-task, or meta learning?
  • Though the paper claims to consider the correlation between samples, only uniform noise is added in a two-step approach. Uniform sampling is still far from real-world load data correlations, and the paper omits a discussion of how the proposed data augmentation approach resembles realistic settings.
  • The design of standard grid testbed models was rooted in abstraction and representation of real-world transmission grids. However, the paper falls short in explaining in detail how real-world data distributions can be applied to construct datasets for PGLearn.
  • In the demand sampling part, "The power factor at each node is kept constant, i.e. the active and reactive power at each load is scaled by the same amount." This is not a very realistic setting, as the power factor varies in practice and has a big impact on OPF solutions.
  • There is no inclusion or analysis of contingencies in OPF problems. Essentially, the introduction mentions renewable uncertainties multiple times, while the proposed method does not address such uncertainty issues.
  • The PGLearn module is limited to implementations of a few methods by a similar group of authors. A more thorough literature review and inclusion of methods are recommended: Zhang, Ling, Yize Chen, and Baosen Zhang. "A convex neural network solver for DCOPF with generalization guarantees." IEEE Transactions on Control of Network Systems 9, no. 2 (2021): 719-730. Zhou, Min, Minghua Chen, and Steven H. Low. "DeepOPF-FT: One deep neural network for multiple AC-OPF problems with flexible topology." IEEE Transactions on Power Systems 38, no. 1 (2022): 964-967. Owerko, Damian, Fernando Gama, and Alejandro Ribeiro. "Unsupervised optimal power flow using graph neural networks." In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6885-6889. IEEE, 2024.
  • It seems the results on the four accuracy metrics and computation metrics are not reported for the implemented methods.

Questions

I have the following questions regarding the current paper:

  • Could the authors point to the link for the anonymous repos? And in these two repos, are the proposed four accuracy metrics and computation metrics realized?
  • PGLearn generates a lot of data in terms of primal and dual solutions, but some more interactive representations should be considered, e.g., generator/line values w.r.t. capacity, prediction error, visualization of the solution on the grid, correlation between nodes, etc.

Some minor issues and questions:

  • In the Introduction, "The increased volatility and scale of energy generation in modern and future grids..." should read "power generation" in the scope of OPF. Similarly in the second paragraph: "account for uncertainties in renewable energy generation and/or demand..."
  • The Introduction mentions "previously intractable applications such as real-time risk analysis", yet in this work there is no discussion of how ML can help with OPF risk.
  • "inherent volatility of wind and solar generation creates OPF problems that are, or will be, orders of magnitude larger than today's instances." Why will the increase in volatility make the OPF instances larger?
  • PGLib-OPF seems to be an important reference. Could the authors explain more about "PGLib-OPF only provides a single snapshot per grid"? Does that mean only one data sample is provided for each power grid model?
  • In Appendix A.2.2, the authors mention that the bus-pair variables were changed to per-branch variables. Could the authors explain why this has marginal impact on the relaxation? And does this mean the ML model is learning the newly defined variables instead?

Ethics Concerns

NA

Comment

We thank the reviewer for their close review of our paper; it has helped us further improve PGLearn. Detailed answers to the reviewer's questions follow. The response is split into two comments due to the character limit.

Could the authors point to the link for the anonymous repos? And in these two repos, are the proposed four accuracy metrics and computation metrics realized?

We replaced the names of our official repositories with "AnonymousRepo1" and "AnonymousRepo2" to comply with ICLR anonymity rules, since that information would have made it possible to de-anonymize our submission. Please find anonymized source code in the updated supplemental materials. This code indeed contains the implementation of the various performance metrics reported in our paper. Naturally, the final version of the paper will contain the names and full URLs of the code repositories and datasets.

There is no inclusion or analysis of contingencies in OPF problems. Essentially, the introduction mentions renewable uncertainties multiple times, while the proposed method does not address such uncertainty issues.

We thank the reviewer for raising this point. We would like to point out that PGLearn includes instances with so-called N-1 cases, wherein an individual generator or line is unavailable; this is similar to OPFData. Furthermore, uncertainty in renewable production is typically captured via the so-called "net load," i.e., demand minus renewable generation.

The PGLearn module is limited to implementations of a few methods by a similar group of authors. A more thorough literature review and inclusion of methods are recommended: Zhang, Ling, Yize Chen, and Baosen Zhang. "A convex neural network solver for DCOPF with generalization guarantees." IEEE Transactions on Control of Network Systems 9, no. 2 (2021): 719-730. Zhou, Min, Minghua Chen, and Steven H. Low. "DeepOPF-FT: One deep neural network for multiple AC-OPF problems with flexible topology." IEEE Transactions on Power Systems 38, no. 1 (2022): 964-967. Owerko, Damian, Fernando Gama, and Alejandro Ribeiro. "Unsupervised optimal power flow using graph neural networks." In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6885-6889. IEEE, 2024.

We thank the reviewer for their valuable suggestions. The goal of our work is to provide standardized datasets and performance metrics, together with a unified interface that enables researchers to evaluate their method on PGLearn. An exhaustive re-implementation and benchmark of all existing approaches in the literature would require far more (time and computing) resources than we have available.

The Introduction mentions "previously intractable applications such as real-time risk analysis", yet in this work there is no discussion of how ML can help with OPF risk.

The mention of real-time risk analysis in this sentence refers to the work of Chen, W., Tanneau, M., & Van Hentenryck, P. (2024). Real-time risk analysis with optimization proxies. Electric Power Systems Research, 235, 110822. This work has shown that large-scale Monte-Carlo simulations involving the resolution of multiple OPF problems become computationally tractable by leveraging ML-based proxy models. Naturally, the accuracy of these simulations directly depends on the accuracy of the underlying ML models.

PGLib-OPF seems to be an important reference. Could the authors explain more about "PGLib-OPF only provides a single snapshot per grid"? Does that mean only one data sample is provided for each power grid model?

We thank the reviewer for raising this point. Indeed, PGLib provides only one problem instance for each power grid. The PGLib collection was released in 2019 to "benchmark AC optimal power flow algorithms". It now comprises 66 instances of optimal power flow problems, each corresponding to a different power grid. Most cases are artificial power grids (e.g. IEEE standard test cases), some are snapshots of real transmission systems, e.g., the RTE and Pegase cases. To support the ML community in working on OPF problems, our work thus provides open-source data-augmentation code, standardized datasets generated using this code, and open-source code to evaluate the performance of ML models.

Comment

This comment continues the response to Reviewer 19cY.

In Appendix A.2.2, the authors mention that the bus-pair variables were changed to per-branch variables. Could the authors explain why this has marginal impact on the relaxation? And does this mean the ML model is learning the newly defined variables instead?

We thank the reviewer for pointing this out. First, we would like to clarify that our codebase and the PGLearn datasets follow the models presented in the paper (Models 1-5, stated in the Appendix). The remark in Appendix A.2.2 notes that different implementations of the Jabr relaxation exist, which present slight discrepancies. Again, our data-augmentation code and ML evaluation code use the formulation presented in Model 2, which yields more consistent data structures. We have empirically compared both formulations on multiple cases and found the differences to be marginal.

In the Introduction, "The increased volatility and scale of energy generation in modern and future grids..." should read "power generation" in the scope of OPF. Similarly in the second paragraph: "account for uncertainties in renewable energy generation and/or demand..."

We thank the reviewer for their close reading of the paper. We have adjusted the wording accordingly.

PGLearn generates a lot of data in terms of primal and dual solutions, but some more interactive representations should be considered, e.g., generator/line values w.r.t. capacity, prediction error, visualization of the solution on the grid, correlation between nodes, etc.

We thank the reviewer for this feedback regarding data visualization. AnonymousRepo2 contains some visualization utilities, e.g. for plotting aggregated and component-wise prediction errors. Interactive plots and visualizing correlations/solutions on a grid are interesting suggestions which we will consider including in future updates to the library.

Though the paper claims to consider the correlation between samples, only uniform noise is added in a two-step approach. Uniform sampling is still far from real-world load data correlations, and the paper omits a discussion of how the proposed data augmentation approach resembles realistic settings. In the demand sampling part, "The power factor at each node is kept constant, i.e. the active and reactive power at each load is scaled by the same amount." This is not a very realistic setting, as the power factor varies in practice and has a big impact on OPF solutions.

We thank the reviewer for their remarks regarding the demand sampling procedure. Section 3 discusses how the chosen sampling procedure is more realistic than the current standard approach, with Figure 2 as evidence. Moreover, we have updated the demand sampling procedure to vary the power factor of load demands by sampling the local noise independently for the active/reactive components.
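A minimal sketch of this updated step, with independent noise on the active and reactive components (the reference load and noise ranges are illustrative, not the values used in PGLearn):

```python
import numpy as np

rng = np.random.default_rng(0)
pd_ref, qd_ref = 100.0, 30.0                 # hypothetical reference load (MW, MVAr)
pf_ref = pd_ref / np.hypot(pd_ref, qd_ref)   # reference power factor

b = rng.uniform(0.8, 1.2)        # global scaling factor
eps_p = rng.uniform(0.9, 1.1)    # local noise on active power
eps_q = rng.uniform(0.9, 1.1)    # independent local noise on reactive power

pd, qd = b * eps_p * pd_ref, b * eps_q * qd_ref
pf = pd / np.hypot(pd, qd)
# Because eps_p and eps_q are drawn independently, pf differs from
# pf_ref; scaling both components by the same factor would keep it fixed.
print(pf_ref, pf)
```

Scaling p and q by a common factor cancels out in the ratio q/p, so only independent per-component noise actually moves the power factor.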

The design of standard grid testbed models was rooted in abstraction and representation of real-world transmission grids. However, the paper falls short in explaining in detail how real-world data distributions can be applied to construct datasets for PGLearn.

We thank the reviewer for raising this point. Since real-world data distributions are typically aggregated to the regional/zonal level, a disaggregation must be performed to obtain bus-level information; how to accurately do this disaggregation is an important direction for future research. Moreover, the required detailed grid topology data, e.g. generator/line/transformer parameters, are typically not released publicly due to security/privacy concerns.

"inherent volatility of wind and solar generation creates OPF problems that are, or will be, orders of magnitude larger that today’s instances." Why the increase of volatility will make the OPF instances larger?

We thank the reviewer for their close reading of the paper and for pointing out the confusion in our original phrasing. What we meant to convey is that the growth of uncertainty (mainly driven by growth in renewable generation) is driving changes in the structure of optimization problems used to clear markets and control the grid. In particular, there is growing interest in considering optimization formulations that explicitly capture this uncertainty through, e.g., stochastic or robust optimization models, which brings a significant increase in problem scale. We have updated this sentence to clarify our message.

Comment

The reviewer appreciates the authors' efforts in improving the manuscript. Yet I still feel there are several important concerns regarding the current work. I acknowledge the improvements and clarifications about the anonymous repo, contingencies, and technical descriptions. However, the previous concerns about data visualization, data sampling, and the connection to practical grid settings still hold.

The biggest concern is still the significance of the work. Essentially, this work aggregates a set of already-published grid instances, runs OPF, and fits ML surrogate models. And the machine learning models are largely trained on labeled data generated offline. But there are three unsolved issues as far as I am aware. First, these benchmarks are already publicly available; why do power engineers not directly use MatPower, Julia, or other Python libraries to run and generate the data? Second, according to the manuscript, one of the contributions is to standardize the data generation procedures, yet there is no comparison to other frameworks, codebases, or realistic load and generation; this limits the practicability and reproducibility of the proposed framework. Last but not least, the authors only tested limited ML algorithms for solving OPF, while overlooking much of this effort in the literature; this limits the impact and feasibility of the proposed approach.

Another major contribution the authors mentioned is the data sampling. I think the proposed approach still has a large gap from practical power demand, which has very strong spatiotemporal correlation. In this work, only local noise is injected independently, which overlooks many scenario generation and multi-variable load forecasting papers in the literature. The visualization in Fig. 2 does not convey such complex correlations. Furthermore, it is strongly advised to benchmark the performance using real load data. Some platforms such as CATS [1] have already published realistic grids with realistic load vectors. Moreover, when I checked the code repo, it is hard for users to directly work with their own load datasets or test cases, limiting the practical usage of the proposed framework.

Overall, I agree there is a strong motivation to publish standardized test cases for ML solving OPF, but given the already established test cases containing more fundamental information on topology, line parameters, and base loads; the practical need for realistic demand and generation data; and the need to benchmark not only data but also algorithms, I think the work in its current form does not have significant impact and does not possess technical novelty.

Some unsolved issues from the authors' responses are listed below:

The mention of real-time risk analysis in this sentence refers to the work of Chen, W., Tanneau, M., & Van Hentenryck, P. (2024). Real-time risk analysis with optimization proxies. Electric Power Systems Research, 235, 110822. This work has shown that large-scale Monte-Carlo simulations involving the resolution of multiple OPF problems become computationally tractable by leveraging ML-based proxy models. Naturally, the accuracy of these simulations directly depends on the accuracy of the underlying ML models.

I am a bit confused, as the cited paper talks about risk, but that does not mean this work is related to risk analysis.

We thank the reviewer for raising this point. Since real-world data distributions are typically aggregated to the regional/zonal level, a disaggregation must be performed to obtain bus-level information; how to accurately do this disaggregation is an important direction for future research. Moreover, the required detailed grid topology data, e.g. generator/line/transformer parameters, are typically not released publicly due to security/privacy concerns.

I think in most of Matpower source files for grid cases, there are generators and transformers data. How do the authors process these data to preserve security or tackle the privacy issue?

References [1] Taylor, Sofia, Aditya Rangarajan, Noah Rhodes, Jonathan Snodgrass, Bernie Lesieutre, and Line A. Roald. "California test system (CATS): A geographically accurate test system based on the California grid." IEEE Transactions on Energy Markets, Policy and Regulation (2023).

Comment

We thank the reviewer for their feedback and their time reviewing the updated manuscript and supplementary material. In the response below, we specifically address the 3 main questions raised by the reviewer.

Importance of standardized datasets and evaluation metrics.

The reviewer is correct that the original snapshots (from, e.g., PGLib) are publicly-available, and that there exist open-source OPF implementations in various languages. Thus, individual researchers can indeed use such tools to generate their own datasets.

We argue that this creates the following issues. First, it has resulted in inconsistencies in performance evaluation, as different teams often consider different data distributions and power grids. This makes it hard to objectively compare different approaches, akin to having computer vision papers report accuracy on different collections of images. In addition, performance metrics are not standardized, further complicating objective performance comparison.

Second, the computational cost associated with data generation incentivizes teams to consider only small-scale networks (for which data generation is faster and cheaper). Most academic studies currently consider networks with at most 300 buses, which are 20x - 100x smaller than real-life power grids.

Third, this restricts the ability of teams without substantial computational resources (whether HPC or cloud computing credits) to conduct high-quality research. It is noteworthy that the only other open dataset containing large OPF instances was released by DeepMind, which has substantial computational resources.

In that context, our work makes it possible to evaluate and compare performance objectively, by providing standardized, open datasets and performance metrics. We also democratize access to this growing field of research by providing open-source implementations of our data-generation and evaluation code. We believe this is a meaningful step for the research community, and we recognize this is a first step towards more extensive datasets.
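As a concrete illustration of the kind of standardized metrics discussed above, the sketch below computes two quantities commonly reported in the ML-for-OPF literature: the mean relative optimality gap and the worst-case constraint violation. This is a hypothetical sketch, not PGLearn's actual evaluation API; the function names and the convention that feasibility is encoded as residuals g(x) <= 0 are our own assumptions.

```python
import numpy as np

def mean_optimality_gap(pred_cost, opt_cost):
    """Mean relative optimality gap (%) of predicted vs. ground-truth objective values."""
    pred_cost = np.asarray(pred_cost, dtype=float)
    opt_cost = np.asarray(opt_cost, dtype=float)
    return 100.0 * float(np.mean((pred_cost - opt_cost) / np.abs(opt_cost)))

def max_constraint_violation(residuals):
    """Worst-case violation across all constraints, where residuals g(x) <= 0 encode feasibility."""
    return float(np.maximum(np.asarray(residuals, dtype=float), 0.0).max())
```

Reporting both quantities on a shared test set is what enables the apples-to-apples comparisons argued for above.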

Practicability and Reproducibility of the approach

Our work is the first to provide, together with open datasets containing large-scale OPF instances, data-generation and model evaluation code. For instance, the datasets released by DeepMind (OPFData) do not include any associated code. In addition, we use open-source optimization solvers and document random seeds and configuration options to ensure maximal reproducibility of our datasets.

To our knowledge, no existing work provides such a comprehensive toolkit.

Testing of ML algorithms

The scope of this paper is to advance research on machine learning and optimal power flow by providing open, standardized datasets and evaluation metrics. This is akin to the role played by, e.g., MNIST or ImageNet datasets in advancing the state-of-the-art in computer vision. The paper does not attempt to develop state-of-the-art ML models for solving OPF problems. Rather, it provides the research community with the means to do so, and to evaluate their performance in a principled fashion.

Comment

I appreciate the authors' further clarifications. But the real computational burden for researchers does not necessarily come from solving the optimization for large-grid OPF offline. Rather, it comes from training larger machine learning models, which this work does not shed any light on. In addition, the claim "Our work is the first to provide, together with open datasets containing large-scale OPF instances, data-generation and model evaluation code" is also not exact, as works such as OPF-Learn [1] already did that, which limits this work's novelty.

And considering the lack of realistic data rather than just sampling based on statistical distributions, the existence of standard grid test libraries, and the absence of enough benchmarked ML-for-OPF algorithms to compare performance against, I would still doubt this work's novelty and contributions. And just as the authors suggested, it would be meaningful to evaluate current algorithms' performance under standardized datasets and metrics, while this work falls short of that.

Reference [1] Joswig-Jones, Trager, Kyri Baker, and Ahmed S. Zamzam. "Opf-learn: An open-source framework for creating representative ac optimal power flow datasets." In 2022 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT), pp. 1-5. IEEE, 2022.

Review
8

The authors proposed a library that generates datasets to standardize ML for OPF research. While each study in the literature shows improvement over earlier models, it is hard to fairly compare the model performance across different studies.

The authors proposed a library that constructs input-output pairs for different formulations of the OPF problem (AC-OPF, DC-OPF, and the second-order cone relaxation of AC-OPF) on medium to large test cases. They also address data augmentation issues in the literature that result in a narrow demand range, and implement a procedure that captures broader, global demand characteristics. The library also contains solvers implemented in efficient libraries to find the corresponding optimal solutions, and evaluation metrics to measure model accuracy and computational cost.

Strengths

  1. A typical ML for OPF study goes through the steps of generating input parameters (e.g. demand), finding corresponding solutions, model training, and evaluation. The authors contribute to the literature a toolkit that standardizes all of these processes.
  2. I think the main contribution of the study is that the proposed library improves comparability and fair assessment of existing models. Many studies in the literature show improvement over some baselines but are not able to compare against the state of the art. The library enables researchers to compare their approaches with previously reported benchmarks without implementing each specific model.
  3. Another contribution is the data augmentation procedure that potentially generates different load characteristics.
  4. The library also implements widely used OPF formulations with efficiently written solvers.

Weaknesses

  1. There are not many systems implemented in the library and most of them are large systems. Researchers may want to work on smaller systems for proof of concept or visualizing the model behaviour. It would be nice to add them.
  2. While the proposed data augmentation method generates a wider range of demand dynamics than sampling a random factor and multiplying the base load, I am not sure this approach captures the global dynamics. The demand characteristics can differ widely within the hours of a day or among different seasons, which can result in different constraints binding at the solution. This is mentioned as a limitation, but I think more detailed demand scenarios should be considered to train more reliable NN models.

Questions

See weaknesses.

Comment

We thank the reviewer for their close reading of our paper and for their insightful comments. The response addresses the raised points individually:

There are not many systems implemented in the library and most of them are large systems. Researchers may want to work on smaller systems for proof of concept or visualizing the model behaviour. It would be nice to add them.

We agree with the reviewer that including smaller cases can be valuable for preliminary studies. We have amended the list of benchmarks to include additional small cases 14_ieee, 30_ieee, 118_ieee, and 300_ieee, as well as large cases 13659_pegase and midwest24k (as suggested by Reviewer EiRM). The full case statistics and references are included in the general response.

While the proposed data augmentation method generates wider range of demand dynamics than sampling random factor and multiplying the base load, I am not sure this approach captures the global dynamics. The demand characteristics can differ widely within the hours of a day or among different seasons, which can result in different constraints to bind at the solution. This is mentioned as a limitation, but I think more detailed demand scenarios should be considered to train more reliable NN models.

We thank the reviewer for raising this important point, as we agree with the importance of capturing global dynamics. As noted in our paper, the vast majority of existing works consider uncorrelated, bus-level perturbations, which yield a very narrow range of total demand (as illustrated by Figure 2 in our paper). In contrast, our approach explicitly captures variations in total demand by introducing a global scaling factor (denoted by bb in Algorithm 1). Figure 2 demonstrates that this strategy not only yields a broader range of total demand, but also results in datasets with more complex dynamics. We believe this is a meaningful step towards more realistic datasets, as this mechanism better captures diurnal / seasonal variations in total demand. We nonetheless agree that more granular correlation structures are valuable for future research, and intend to include this aspect in future releases of PGLearn.
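For concreteness, the two-step procedure described above (a single global scaling factor b plus independent per-load noise, sampled separately for active and reactive components so that power factors vary) can be sketched as follows. This is a simplified illustration, not the exact Algorithm 1 from the paper; the function name `sample_demand` and the 5% local-noise width are assumptions for the example.

```python
import numpy as np

def sample_demand(pd_ref, qd_ref, global_range=(0.6, 1.0), local_eps=0.05, rng=None):
    """Two-step demand sampling sketch.

    A global scaling factor b (drawn from the case's global range) captures
    system-wide variation in total demand; independent multiplicative noise
    per load, drawn separately for active and reactive components, adds
    local variability and makes power factors vary between samples.
    """
    rng = np.random.default_rng(rng)
    b = rng.uniform(*global_range)  # global scaling factor (b in Algorithm 1)
    eta_p = rng.uniform(1.0 - local_eps, 1.0 + local_eps, size=len(pd_ref))
    eta_q = rng.uniform(1.0 - local_eps, 1.0 + local_eps, size=len(qd_ref))
    return b * eta_p * np.asarray(pd_ref), b * eta_q * np.asarray(qd_ref)
```

Because b is shared across all loads, the total demand of the sampled instances spans the full global range, rather than concentrating around the reference total as with purely local perturbations.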

Comment

Thanks to the authors for their response and adding smaller test systems to the library.

About capturing the global dynamics, it is true that the majority of existing studies consider bus-level perturbations. In some cases, this approach leads to models that perform well on the test set but still cannot generalize outside of the training distribution.

I agree that the proposed approach is a meaningful step towards realistic datasets and it is a valuable contribution, but the word "global" in the abstract ("PGLearn implements realistic data generation procedures that capture both global and local variability") may not be accurate in this context. I doubt that if researchers train their models using the proposed library, the resulting models can predict with the same performance in case of dramatic changes in load characteristics, such as pandemics or natural disasters. However, the other appearances of the word (e.g. "global scaling", "global range") are more meaningful within the context.

Overall, this does not impact my decision on the value of the study.

Review
3

The authors argue that the numerous datasets used in the benchmarking of ACOPF codes, incl. the Grid Optimization Competition with $9M+ in prize money (https://gocompetition.energy.gov/) and Google DeepMind OPFData (https://arxiv.org/pdf/2406.07234), have minor issues, and need to be replaced by their benchmark. The authors consider 65,536 samples generated for instances on up to 9241 buses (the Pegase project instance), which are smaller than the OPFData (Lovett et al., 2024) instances on up to 13,659 buses, but bigger than those of PowerModels (Coffrin et al., 2018) on up to 118 buses with 10,000 samples. The authors consider correlated demand perturbations, but do not consider diurnal variations, or similar.

Strengths

The writeup is reasonably clear.

Weaknesses

The instances on up to 9241 buses (the Pegase project instance) are smaller than those of the DeepMind OPFData (Lovett et al., 2024) with instances on up to 13,659 buses.

While considering correlated demand perturbations may improve the realism somewhat, the authors should ideally consider real-world time-series capturing diurnal variations, or similar. Likewise, they could consider time-varying production limits, which are readily available to all market participants in Europe via the ENTSO-E Transparency Platform (https://transparency.entsoe.eu/).

The supplementary material contains no code, so the code could not be checked for reproducibility or usefulness.

Questions

Have you considered the use of the ENTSO-E Transparency Platform (https://transparency.entsoe.eu/)?

Comment

We thank the reviewer for their time reviewing our paper; the feedback has helped us further improve the paper. The response will address the raised points individually:

The instances on up to 9241 buses (the Pegase project instance) are smaller than those of the DeepMind OPFData (Lovett et al., 2024) with instances on up to 13,659 buses.

We thank the reviewer for their suggestion to include more large-scale cases in PGLearn. We have amended the list of benchmarks to include additional large cases 13659_pegase and midwest24k, as well as small cases 14_ieee, 30_ieee, 118_ieee, and 300_ieee (as suggested by Reviewer M2Kx). The updated version of the benchmarks table including the full case statistics can be found in the general response. Finally, we would like to point out that, compared to DeepMind's OPFData, we provide 1) both primal and dual solutions, 2) solutions for multiple OPF formulations, 3) a more realistic data generation procedure that covers a broader range of total demand, and 4) open-source data generation and model evaluation code.

While considering correlated demand perturbations may improve the realism somewhat, the authors should ideally consider real-world time-series capturing diurnal variations, or similar. Likewise, they could consider time-varying production limits, which are readily available to all market participants in Europe via the ENTSO-E Transparency Platform (https://transparency.entsoe.eu/).

We thank the reviewer for this insightful feedback, and we agree that better capturing temporal structure is important for further research. The inclusion of a global perturbation in our data-generation methodology makes it possible to capture a broader range of total demand which, in turn, better captures diurnal variations. We believe that this approach yields a meaningful improvement over existing publicly-available datasets. We intend to capture temporal structure more explicitly in future versions of the PGLearn collection.

We also thank the reviewer for pointing out the ENTSO-E transparency platform. This platform releases time series information at the regional/zonal level, which must therefore be disaggregated to obtain bus-level information; how to accurately do this disaggregation is an important direction for future research. We would also like to point out that ENTSO-E does not release detailed grid topology information, e.g., line impedance, generator capacities, individual load profile, etc. which is typically protected.

The supplementary material contains no code, so the code could not checked for reproducibility or usefulness.

We thank the referee for pointing this out. We have included anonymized versions of the repositories in the updated supplemental materials. Please note that although our code will be available under an open-source license, we cannot directly share those official repositories as it would violate ICLR anonymity rules.

Comment

Many thanks for adding 13659_pegase and midwest24k. Having said that, I still find the contribution over and above 13659_pegase (as made available by the Pegase project, https://cordis.europa.eu/project/id/211407/reporting, including scenarios to consider in stochastic models, remedial actions that could be taken, etc.) somewhat limited.

This platform releases time series information at the regional/zonal level, which must therefore be disaggregated to obtain bus-level information;

This is not correct. The ENTSO-E Transparency Platform has data on each generating unit.

Overall, this does not impact my decision on the value of the study.

Comment

Dear EiRM, as you are one of the reviewers who expressed Confidence level 5, I can assume you are a subject expert, and I am sure you can take some time to provide more detailed feedback to the authors; I am sure they would appreciate that, as their main objective is to convince you that their work is worth acceptance.

I notice from your review that although your confidence is the highest, you did not spend much time writing this review and did not follow the guidelines of the sections (you addressed weaknesses in the summary space), preventing the authors from understanding the details that could make you reconsider your rating, which is the whole idea of the peer review process.

My contribution to the discussion here is as follows:

  • I believe that the contribution is not centered around the size of the instances, so I don't consider it a strong factor to compare against works like Google DeepMind OPFData.
  • I agree that real-life perturbations would be ideal for the authors. Still, the authors' contribution lies in considering the correlation in the sampling strategy, which is an advance and still a contribution. It is, in a way, a modeling effort that, although simplistic, makes the case more realistic. Reviewer 19cY made more detailed points on this decision from the authors; you might want to take a look.
  • I agree with the authors that including real-time data from the ENTSO-E Transparency Platform could be a different problem and have other limitations (like the access to topologies), so it might not be a strong point the authors could address.
  • I agree about the lack of code in the supplementary material; you will notice that they have added an update.

Thanks in advance for your attention!

Comment

Dear Uide,

I have expressed Confidence level 5, because I have worked with multiple TSOs on real dispatch & control systems, and have a fair understanding of the academic benchmarks available.

The main message of my review is that the authors essentially do no original work. They repackage existing instances, while adding noise with different properties than considered previously. Having said that, the noise is not demonstrated to be any closer to the underlying data (which would be available for the RTE and Pegase instances, in principle, via the ENTSOE Transparency platform, which the authors tried to dismiss rather than look into), and does not extend the "snapshots" (realizations of a multi-variate random variable) into time series (realizations of multi-variate stochastic processes).

Despite your suggestion to the contrary, I have addressed the weaknesses in the correct section as well as -- to a lesser extent -- in the summary space.

real-life perturbations would be ideal for the authors.

Ideal would be real-life time series, which are readily available from the ENTSOE Transparency platform.

considering the correlation in the sampling strategy,

This seems to be a contribution any undergraduate can make?

Comment

We thank the reviewer for their additional feedback.

Our previous comment regarding the zonal/regional data in ENTSO-E referred to load information which, to our knowledge, is not available at the nodal level. We are aware that ENTSO-E releases time series data regarding generation units, and did not mean to dismiss the reviewer's suggestion.

We did not directly use data from ENTSO-E for the following two reasons:

  • nodal load time series data is not available (to our knowledge), and
  • the PEGASE cases in PGLib do not include enough information to identify individual generators nor individual buses' geographic location. Although PGLib cases do include generator fuel type information, this was added artificially by the curators of PGLib, and reflects the fuel mix of US-based generators.

As the reviewer mentions, the ideal would be to use actual grid topology and nodal time series of load and generation, however, such complete information is generally not available because of confidentiality/security concerns. This has motivated the use of statistical data augmentation techniques in virtually all published works on ML and Optimal Power Flow.

Our work aims at broadening access to and streamlining the benchmark of ML works for Optimal Power Flow problems. The datasets released as part of PGLearn represent tens of thousands of CPU-hours worth of computations and more than a terabyte of data; most of that computational cost stems from solving large-scale instances. We also provide cross-platform, tabular datasets that can be easily parsed and leveraged by the ML community, without requiring deep knowledge of optimization nor power systems. Those standardization efforts (open datasets and evaluation code) have been explicitly called for in the survey of Khaloie et al. Therefore, we believe this work will have a meaningful impact on the community, and drive further research in this topic. Naturally, we recognize that this is a first step towards more realistic datasets that capture, e.g., temporal structure. Finally, we would like to point out that our paper is submitted as part of the dataset & benchmark track. Accordingly, the paper focuses on the data-generation and benchmarking processes, rather than presenting new methodologies or model architectures.

Comment

We thank the reviewers for evaluating our paper and for their constructive feedback which has helped us improve the paper. In what follows, we review the most common feedback and summarize the changes they inspired.

Paper positioning

The core contribution of our paper is twofold: the PGLearn collection of datasets, and standard evaluation metrics to benchmark the performance of ML methods for OPF problems. Both are accompanied by open-source code for data-generation and model evaluation. We believe that releasing these large standardized datasets is a meaningful step forward for the community, as it will enable multiple research teams to contribute to this field without requiring extensive computational resources, and will streamline the comparison of various methodologies.

Given the large breadth of work in this field, we do not have the resources to implement, train and evaluate, by ourselves, all existing methods for ML and OPF. Rather, we provide a standard evaluation framework for researchers to measure and report performance in a principled fashion. We thank reviewer tMt6 for suggesting a community leaderboard; we hope to host one in the future to which the community can contribute.

Additional cases

Several reviewers (M2Kx, EiRM) suggested including additional cases as part of PGLearn, also noting that the selection focuses too much on the medium/large benchmarks. While our focus on large systems is motivated by the gap between existing academic works and the scale of real-life systems, we recognize the need to include smaller systems for testing / development. Therefore, we have extended our collection with four small cases from the PGLib collection (14_ieee, 30_ieee, 118_ieee, 300_ieee), and two larger cases 13659_pegase and midwest24k. The updated version of Table 1, showing the full case statistics, is included below for reference:

Case name     |  Nodes |  Loads | Generators |  Edges | Total PD | Total PG | Global range
--------------|--------|--------|------------|--------|----------|----------|-------------
14_ieee       |     14 |     11 |          5 |     20 |   0.3 GW |   0.4 GW |  70% -- 110%
30_ieee       |     30 |     21 |          6 |     41 |   0.3 GW |   0.4 GW |  60% -- 100%
89_pegase     |     89 |     35 |         12 |    210 |     6 GW |    10 GW |  60% -- 100%
118_ieee      |    118 |     99 |         54 |    186 |     4 GW |     7 GW |  80% -- 120%
300_ieee      |    300 |    201 |         69 |    411 |    24 GW |    36 GW |  60% -- 100%
1354_pegase   |   1354 |    673 |        260 |   1991 |    73 GW |   129 GW |  70% -- 110%
nyiso_2030    |   1576 |   1446 |        323 |   2427 |    33 GW |    42 GW |  70% -- 110%
1888_rte      |   1888 |   1000 |        290 |   2531 |    59 GW |    89 GW |  70% -- 110%
2869_pegase   |   2869 |   1491 |        510 |   4582 |   132 GW |   231 GW |  60% -- 100%
6470_rte      |   6470 |   3670 |       1330 |   9005 |    97 GW |   118 GW |  60% -- 100%
Texas7k       |   6717 |   4541 |        637 |   9140 |    75 GW |    97 GW |  80% -- 120%
9241_pegase   |   9241 |   4895 |       1445 |  16049 |   312 GW |   530 GW |  60% -- 100%
13659_pegase  |  13659 |   5544 |       4092 |  20467 |   381 GW |   981 GW |  60% -- 100%
midwest24k    |  23643 |  11727 |       5646 |  33739 |   104 GW |   318 GW |  90% -- 130%

Moreover, we now group the 14 cases into four categories based on the number of buses:

Small (<1k buses): 14_ieee, 30_ieee, 89_pegase, 118_ieee, 300_ieee

Medium (<5k buses): 1354_pegase, nyiso_2030, 1888_rte, 2869_pegase

Large (<10k buses): 6470_rte, Texas7k, 9241_pegase

Extra-Large (>10k buses): 13659_pegase, midwest24k

Finally, we would like to point out that, compared to the recent OPFData dataset, we provide 1) our data-generation code, 2) additional test cases, 3) multiple formulations (DC, SOC and AC optimal power flow), and 4) both primal and dual solutions.

Demand Sampling Procedure

Reviewers M2Kx, EiRM, 19cY, and tMt6 remarked that PGLearn does not incorporate time series information, as we noted in our Limitations section. We agree that capturing temporal information is an important aspect for impactful ML research in this field. Nevertheless, we believe that two more pressing needs are 1) increasing the scale of standard test cases used by the community and 2) capturing a broader range of variation in total demand (as evidenced by Figure 2 in the paper). We are currently working on integrating time series information in our data-generation and evaluation procedures, which will be part of a future version of PGLearn.

Following the feedback of reviewer 19cY, we have updated our demand sampling procedure to vary both active and reactive power demand components, thus introducing additional variability in power factors (see updated Algorithm 1).

Comment

Dear reviewers and area chairs,

We would like to thank you for your time reviewing our manuscript and for the chance to discuss your feedback. Please find below a summary of our overall discussion.

Paper positioning and contribution

Our goal is to broaden access to and improve research on ML and Optimal Power Flow problems, by releasing open standardized datasets and evaluation tools. In particular, we provide

  • public datasets that comprise
    • large-scale OPF instances (up to 22,000 buses) with a broad range of total demand
    • multiple OPF formulations (DC, SOC, AC)
    • complete primal and dual solutions and solver metadata such as computing times
  • cross-platform, array-based data formats that allow researchers to use our dataset without requiring deep power systems knowledge
  • open-source data-generation and evaluation code for researchers to benchmark their methods in a principled fashion

This addresses several of the most pressing concerns raised in previous work, e.g., the survey by Khaloie et al., as well as some crucial limitations in existing datasets (namely, that uncorrelated load perturbations result in a very narrow range of total demand). Therefore, we believe the paper fits well within the dataset and benchmark ICLR track, and provides meaningful benefits to the community.

Main points raised by the reviewers

The main points raised by reviewers are the following

  • the initial datasets should include larger and smaller instances
  • the distribution of OPF instances assumes constant power factor and does not include time series information
  • the absence of code in the supplementary material
  • the absence of new ML methodology / architecture for learning OPF solutions

Changes we have made in response to the reviewers' comments

In order to address the reviewers' feedback, we made the following changes:

  • we included smaller (down to 14 buses) and larger (up to 22,000 buses) OPF instances
  • we updated our data-generation procedure to allow for non-constant power factors

Furthermore, we emphasized the following aspects of our work:

  • we pointed out that very few public datasets exist that combine grid topology information and load and generation time series. We agree that capturing temporal structure is an important next step, which we will include in future releases of PGLearn.
  • we noted that our paper is part of the dataset & benchmark track, and therefore focuses on data generation and evaluation metrics.
AC Meta-Review

This paper introduces PGLearn, a toolkit with open-source datasets and evaluation tools for machine learning (ML) in Optimal Power Flow (OPF) problems. The authors aim to solve the problem of inconsistent datasets and evaluation metrics in this field. The paper has some strengths, such as providing datasets with different OPF formulations (AC, DC, and SOC), adding variability in data generation, and offering a framework for standardized evaluation. These tools make it easier for the research community to compare different ML methods. The authors also discuss the limitations of current datasets and provide code to help other researchers in this area. Despite that, there are a few key weaknesses. The proposed method mostly uses synthetic data and does not include realistic time-series data or real-world grid dynamics. Reviewers pointed out that more comprehensive data, such as data from the ENTSO-E Transparency Platform, could make this work stronger. Also, the paper does not offer any new ML methods or technical insights for solving OPF problems, so the contribution is limited to dataset creation. There is little comparison with other benchmarks, and no detailed testing of ML models on the proposed datasets.

Additional Comments from the Reviewer Discussion

The reviewer discussion highlighted several recurring concerns that reinforce the decision to reject the paper. One major point was the limited novelty of the work, as it primarily repackages existing datasets with synthetic augmentation rather than addressing deeper technical challenges in ML-OPF integration. While the authors responded to comments about real-world data and temporal correlations, the discussion revealed that the dataset still lacks practical utility due to its reliance on simplified assumptions, such as constant power factors and uniform noise. Reviewers also agreed that the absence of thorough benchmarking across diverse ML methods and the lack of alignment with existing standards, such as ENTSO-E data or OPF-Learn, reduces the paper’s overall contribution. Additionally, several reviewers noted that this work, with its focus on datasets and evaluation frameworks, might be a better fit for a venue specializing in power systems, open-source tools, or applied ML, where the audience would likely find it more impactful and relevant. While the toolkit aims to support the community, it does not sufficiently address practical or scalability challenges in the field.

Final Decision

Reject