Dynamic Configuration for Cutting Plane Separators via Reinforcement Learning on Incremental Graph
We propose a novel Dynamic Separator Configuration (DynSep) method that models separator configuration in different rounds as a reinforcement learning task, making decisions based on an incremental triplet graph updated by iteratively added cuts.
Abstract
Reviews and Discussion
The paper proposes DynSep, a reinforcement learning-based method for dynamically configuring cutting plane separators in mixed-integer linear programming (MILP) solvers.
Strengths and Weaknesses
Strengths of this paper:
(1) Triplet graphs are used to capture evolving problem states during branch and cut, with lightweight subgraph updates. (2) The decoder-only design with blocked positional encoding captures temporal dependencies across rounds while ensuring permutation equivariance.
Weaknesses of this paper:
(1) Limited scope of hyperparameter tuning (separator activation/timing). (2) Limited scope of testing datasets (MIPLIB neos and supportcase).
Questions
(1) Could you specify which instances you used to test performance on MIPLIB? What do you mean by mixed neos and mixed supportcase specifically? (2) For real-world use, what is the latency of policy inference during B&C?
Limitations
(1) DynSep only optimizes separator activation status (+1, 0, -1) and round termination (m_t). It ignores critical hyperparameters like: Cut depth/aggressiveness (e.g., sepastore/age in SCIP); Cut selection thresholds (e.g., efficacy vs. orthogonality) and Separation frequency per node.
(2) Memory/compute inference overhead is not discussed. The scenarios and practical cases in which the data-driven approach is appropriate should also be addressed.
Final Justification
In the final minutes of the rebuttal, the authors provided additional evaluation experiments and addressed my concerns.
Formatting Issues
None
We thank the reviewer for the insightful and valuable comments. We respond to your comments as follows and sincerely hope that our rebuttal can properly address your concerns. If so, we would deeply appreciate it if you could raise your score (currently "3: Borderline Reject"). If not, please let us know your further concerns, and we will continue actively responding to your comments and improving our submission.
Q1. Limited scope of hyperparameter tuning, ignoring critical hyperparameters.
A1.
Thank you for the insightful comment. We claim that our proposed DynSep framework is inherently extensible to a broader set of solver hyperparameters beyond separator activation and timing. This is achieved by adapting the policy network: we model the outputs of additional parameters as a logistic‑normal distribution, followed by optional discretization to support both integer and continuous parameters. This enables flexible, differentiable control of arbitrary solver parameters.
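For concreteness, the logistic-normal output head described above can be sketched as follows. This is a minimal illustration under our own naming conventions (the function `sample_param` and the affine range mapping are ours for this sketch, not the exact implementation); in the actual policy the Gaussian parameters would be produced by the network.

```python
import numpy as np

def sample_param(mu, sigma, low, high, integer=False, rng=None):
    """Sample one solver parameter from a logistic-normal distribution.

    A Gaussian sample (whose mu/sigma would come from the policy
    network) is squashed through a sigmoid, giving a logistic-normal
    variable on (0, 1), then mapped affinely to [low, high]; integer
    parameters are discretized by rounding.
    """
    rng = rng or np.random.default_rng()
    z = rng.normal(mu, sigma)        # Gaussian in logit space
    u = 1.0 / (1.0 + np.exp(-z))     # squash to (0, 1)
    x = low + u * (high - low)       # map to the parameter's range
    return int(round(x)) if integer else float(x)
```

Because the sigmoid is differentiable, gradients can flow through the continuous sample; the rounding step is applied only at deployment time.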
To validate this, we conducted additional experiments on three critical hyperparameter groups (as suggested), while retaining the original tuning mechanism for separator activation and maximum rounds.
- Para group 1: Cut depth/aggressiveness (sepastore/age in SCIP). Controlled by solver parameters separating/cutagelimit and separating/poolfreq.
- Para group 2: Cut selection thresholds (e.g., efficacy vs. orthogonality). Controlled by solver parameters separating/minefficacy and cutselection/hybrid/minortho.
- Para group 3: Separation frequency per node. Controlled by solver parameters separating/cutagelimit and separating/poolfreq.
We provide the experimental results as follows.
| Instance | Knapsack | Knapsack | Knapsack | Corlat | Corlat | Corlat | Supportcase | Supportcase | Supportcase |
|---|---|---|---|---|---|---|---|---|---|
| Metric | Time(s) | PD integral | PD gap | Time(s) | PD integral | PD gap | Time(s) | PD integral | PD gap |
| DynSep (Ours) | 0.52 | 9.71 | 0 | 22.96 | 2233.42 | 0 | 132.50 | 9212.24 | 7.86 |
| Para Group 1 | 0.65 | 9.01 | 0 | 47.36 | 4580.9 | 4e+18 | 141.54 | 8610.24 | 0.17 |
| Para Group 2 | 0.36 | 7.91 | 0 | 42.17 | 1971.18 | 0.01 | 167.64 | 8661.57 | 0.16 |
| Para Group 3 | 0.65 | 10.03 | 0 | 24.71 | 1922.77 | 0 | 141.09 | 8362.53 | 0.17 |
We evaluated our method on three benchmark datasets: Knapsack, Corlat, and MIPLIB mixed supportcase, measuring solving time (Time), primal–dual gap integral (PD integral), and primal–dual gap (PD gap). All three metrics are lower-is-better.
We compared our original DynSep method with its extended versions incorporating three additional parameter groups. Results show that:
- Across all datasets, DynSep and its extended variants consistently outperform the default solver configuration reported in the main paper.
- The differences among the parameter groups are relatively small, yet certain combinations yield further performance improvements in both Time and PD integral.
These results validate that DynSep can flexibly extend to control a broader set of solver hyperparameters. Furthermore, carefully chosen parameter combinations can yield additional gains in solving speed and convergence.
Q2. Limited scope of testing datasets (MIPLIB neos and supportcase).
A2.
Thank you for raising this point. We have extended our evaluation beyond MIPLIB mixed neos and mixed supportcase to include two real-world datasets from the Distributional MIPLIB benchmark [1]:
- Maritime Inventory Routing Problem (MIRP). MIRP arises in bulk shipping logistics, integrating vessel routing and port inventory decisions, featuring 15080 binary variables, 19576 continuous variables, and 44430 constraints.
- Seismic‑Resilient Pipe Network Planning (SRPN). SRPN involves optimizing municipal water pipe network design to ensure resilience under seismic disturbances, featuring 3016 binary variables, 3016 continuous variables, and 5917 constraints.
We set the time limit to seconds. The results are summarized in the table below.
| Instance | mirp | mirp | mirp | srpn | srpn | srpn |
|---|---|---|---|---|---|---|
| Metric | Time(s) | PD integral | PD gap | Time(s) | PD integral | PD gap |
| NoCuts | 486.68 | 34735.55 | 6.67e+18 | 280.7 | 11020.69 | 0.28 |
| Default | 580.98 | 52362.46 | 5.38e+19 | 332.04 | 11687.01 | 0.21 |
| Search(20) | 492.85 | 33028.81 | 6.30e+18 | 384.02 | 14571.56 | 0.26 |
| Prune | 501.49 | 35542.82 | 8.33e+18 | 296.08 | 9211.76 | 0.19 |
| LLM4sepasel | 511.97 | 34573.52 | 5e+18 | 300.66 | 8421.65 | 0.14 |
| DynSep(Ours) | 482.13 | 30838.39 | 1.38 | 294.77 | 7581.16 | 0.1 |
These results show that our method (DynSep) delivers marked performance gains on additional real‑world datasets (MIRP and SRPN). Compared to the other configuration methods, DynSep significantly improves both convergence speed (evidenced by reduced PD integral) and solution quality (evidenced by lower PD gap).
Q3. Specification of MIPLIB mixed neos and mixed supportcase.
A3.
We apologize for the confusion. MIPLIB mixed neos and mixed supportcase evaluated in our paper are two well-established subsets derived from MIPLIB 2017 [2], which were constructed by the prior learning-based MILP research [3].
Each subset comprises a cluster of similar instances selected from MIPLIB 2017, where similarity is quantified using 100 human-designed instance features [1]. Briefly, a seed instance is first selected: neos‑1456979 for the mixed neos set, representing knapsack‑constraint problems, and supportcase40 for the mixed supportcase set, representing set‑packing problems. Then, a data-driven clustering algorithm [3] is applied to gather MIPLIB instances that closely match the seed instance in feature space, constructing the mixed neos and mixed supportcase subsets.
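The gathering step can be illustrated schematically as below. This is only a stand-in for the data-driven clustering of [3] (the actual 100 instance features, distance metric, and clustering algorithm differ); the helper name `cluster_around_seed` is ours.

```python
import numpy as np

def cluster_around_seed(features, seed_idx, k):
    """Return indices of the k instances nearest to a seed instance
    in standardized feature space, seed included.

    Schematic only: the real pipeline of [3] uses 100 human-designed
    MILP instance features and its own clustering procedure.
    """
    feats = np.asarray(features, dtype=float)
    # Standardize each feature so no single scale dominates the distance.
    mu, sd = feats.mean(axis=0), feats.std(axis=0) + 1e-9
    z = (feats - mu) / sd
    dist = np.linalg.norm(z - z[seed_idx], axis=1)
    return np.argsort(dist)[: k + 1]
```

Applied with neos-1456979 or supportcase40 as the seed, this kind of procedure yields a cluster of MIPLIB instances that are structurally similar to the seed.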
Due to space limitations in the rebuttal, we only present a subset of MIPLIB mixed neos and mixed supportcase datasets used in our experiments:
- Mixed neos contains 17 instances: neos-1456979, icir97_tension, etc.
- Mixed supportcase contains 37 instances: supportcase40, acc-tight2, acc-tight4, etc.
Q4. The latency of policy inference during B&C.
A4.
Thank you for raising this point. We provide the per-decision latency (the latency of policy inference per decision step) and the total inference time over the whole solving process as follows.
- Per-decision latency: The table below reports the average ("Avg. latency") and worst-case ("Max. latency") time taken for a single policy call across the entire branch-and-cut process.
| Set Cover | MIS | Knapsack | Corlat | MIK | Anonymous | LB | Neos | Supportcase | |
|---|---|---|---|---|---|---|---|---|---|
| Avg. Latency (s) | 0.33 | 0.22 | 0.09 | 0.05 | 0.31 | 0.41 | 3.04 | 0.11 | 0.39 |
| Max. Latency (s) | 1.09 | 1.01 | 0.71 | 0.70 | 1.07 | 1.94 | 4.51 | 2.02 | 6.85 |
Our results reveal that per-decision latency increases as instance size grows. The average inference latency stays below 0.5 seconds across all datasets, with the exception of the largest dataset, load balancing (LB).
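The per-call latencies were collected with a simple wall-clock wrapper around each policy invocation; a sketch of such instrumentation (class and method names are illustrative, not our exact code) is:

```python
import time
import statistics

class LatencyTracker:
    """Record per-call wall-clock latency of a policy's decision step."""

    def __init__(self):
        self.samples = []

    def timed_call(self, policy_fn, *args, **kwargs):
        # Time a single policy invocation and keep its latency sample.
        t0 = time.perf_counter()
        out = policy_fn(*args, **kwargs)
        self.samples.append(time.perf_counter() - t0)
        return out

    def report(self):
        # Avg./Max. latency correspond to the rows in the table above;
        # the total feeds the overall inference-time tables.
        return {"avg": statistics.mean(self.samples),
                "max": max(self.samples),
                "total": sum(self.samples)}
```

Wrapping every policy call this way gives exactly the average, worst-case, and total inference-time statistics reported in the tables.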
The two tables below provide the total inference time ("Infer. Time") over different datasets and solver time limits, along with the inference overhead rate ("Overhead Rate"), which represents the percentage of policy inference in the total solving runtime.
- Total inference time (nine datasets in our manuscript, -second time limit):
| SetCover | MIS | Knapsack | Corlat | MIK | Anonymous | LB | Neos | Supportcase | |
|---|---|---|---|---|---|---|---|---|---|
| Infer. Time (s) | 1.05 | 0.41 | 0.42 | 16.56 | 1.57 | 14.73 | 10.54 | 23.9 | 14.16 |
| Overhead Rate (%) | 69.54 | 77.36 | 80.77 | 72.13 | 14.29 | 6.09 | 3.51 | 10.16 | 10.69 |
- Total inference time (hard datasets in our manuscript, -second time limit):
| Anonymous | LB | Neos | Supportcase | |
|---|---|---|---|---|
| Infer. Time (s) | 21.32 | 16.01 | 86.96 | 12.31 |
| Overhead Rate (%) | 0.89 | 0.44 | 3.19 | 2.17 |
These tables show that on larger, more complex instances and with longer solver time limits, a smaller share of the total solving time is spent on decision making. This implies that the computational cost of our policy calls becomes more justifiable at scale, enabling meaningful improvements in overall solver efficiency with only modest overhead investment.
Q5. Memory/compute inference overhead is not discussed.
A5.
We have analyzed the compute inference overhead in Q4. Here, we focus on the memory inference overhead.
- Per-decision memory overhead: The table below shows the peak CPU and GPU memory usage per policy call.
| SetCover | MIS | Corlat | LB | Neos | |
|---|---|---|---|---|---|
| Max. CPU mem. (MB) | 3081.70 | 2877.39 | 2892.15 | 4852.27 | 3100.90 |
| Max. GPU mem. (MB) | 2.10 | 2.10 | 2.11 | 2.11 | 2.11 |
- Peak memory overhead for overall inference:
The memory footprint required to store the encoder and policy model weights is MB for each dataset. We track the peak GPU memory during inference via the tool torch.cuda.max_memory_allocated(), as listed in the table below.
| SetCover | MIS | Corlat | LB | Neos | |
|---|---|---|---|---|---|
| Peak GPU mem. (MB) | 35.56 | 57.85 | 15.68 | 257.60 | 28.85 |
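The CPU-side peak can be measured with the standard-library tracemalloc module; a sketch is below (the helper name `peak_cpu_memory_mb` is ours; the GPU-side peak is tracked analogously with torch.cuda.reset_peak_memory_stats() / torch.cuda.max_memory_allocated()).

```python
import tracemalloc

def peak_cpu_memory_mb(fn, *args, **kwargs):
    """Run fn and return (result, peak Python-heap usage in MB).

    tracemalloc traces allocations made while fn runs; the GPU
    analogue in the rebuttal uses torch.cuda.max_memory_allocated().
    """
    tracemalloc.start()
    try:
        result = fn(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()  # (current, peak) bytes
    finally:
        tracemalloc.stop()
    return result, peak / (1024 * 1024)
```

Note that tracemalloc only sees Python-level allocations; C-level buffers (e.g., inside the solver) would need an OS-level tool such as resident-set-size monitoring.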
Q6. Which scenarios and practical cases the data-driven approach is appropriate for should be addressed.
A6.
Our data-driven configuration approach proves effective across both synthetic and real-world MILP benchmarks [1,2,3]. It is particularly well suited to scenarios where one has access to a batch of instances drawn from a distribution similar to that of the unseen examples. In these cases, the policy network can be trained on those example instances and then deployed during branch-and-cut to flexibly tune solver parameters via the solver's existing parameter interfaces, dynamically adapting to individual problem structure.
[1] Huang, et al. "Distributional MIPLIB: a multi-domain library for advancing ml-guided milp methods." arXiv 2024.
[2] Gleixner, et al. "MIPLIB 2017: data-driven compilation of the 6th mixed-integer programming library." Mathematical Programming Computation (2021).
[3] Wang, et al. "Learning cut selection for mixed-integer linear programming via hierarchical sequence model." ICLR (2023).
Dear Reviewer Fjx6,
Thank you very much for taking the time to read our rebuttal and for promptly acknowledging it. We truly appreciate your engagement with our work during the review process.
If you have any further thoughts, questions, or suggestions---either regarding our current response or the broader direction of the work---we would be genuinely grateful to hear them. We highly value your perspective and would welcome any opportunity for continued discussion or clarification.
Best regards,
Authors of Submission 22899
Since cutting-plane methods play a critical role in solving MILPs, my concerns remain unresolved.
(1) Hyperparameters have a significant impact on MILP performance, and even a single cut method—such as the Flow Cover Inequalities Separator in Gurobi—can involve a large number of parameters. The separator mechanism alone includes dozens of them. Therefore, fine-tuning only a limited subset of parameters, as done in the current submission and the rebuttal, is insufficient to fully assess the method's effectiveness.
(2) The performance concern also remains. Achieving strong results on a few instances is not particularly challenging. I am interested in seeing how the proposed method performs on the full set of 240 instances in MIPLIB2017, rather than just a limited subset.
Q1. Fine-tuning a limited subset of parameters is insufficient to fully assess the method's effectiveness.
Thank you for the valuable feedback. We have expanded our separator parameter set from a total of 46 to 144 parameters, more than tripling its size. The expanded parameter space includes both shared and separator-specific parameters and covers various numeric types (int, float, bool). The results in Table 1 show that our method with the enlarged parameter space delivers substantial performance gains over the default solver setting. A detailed description follows.
In our original manuscript, our configuration policy controlled 46 parameters in SCIP, namely:
- the global parameters for all separators: separating/maxrounds and separating/maxroundsroot (2 parameters);
- for each of the 22 built-in separators, both its separating/<separator_name>/freq and separating/<separator_name>/priority parameters (22 × 2 = 44 parameters).
We appreciate the reviewer’s point that a single separator can involve multiple tuning parameters beyond frequency and priority. Thus, we expand our configuration scope to include approximately 5–7 internal hyperparameters per separator, yielding a total of 144 parameters, more than three times our previous controlled parameters.
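A sketch of how raw policy outputs in [0, 1] can be cast to typed SCIP parameters and pushed through the solver's parameter interface is shown below. The listed ranges are illustrative assumptions (in practice they come from SCIP's parameter metadata), and `set_param` would be bound to, e.g., pyscipopt's Model.setParam.

```python
# Hypothetical specs mirroring a few SCIP parameter names from Table 2;
# the (low, high) ranges here are illustrative, not SCIP's exact bounds.
PARAM_SPECS = {
    "separating/maxrounds":          ("int",   -1,  100),
    "separating/gomory/freq":        ("int",   -1,  100),
    "separating/gomory/maxsepacuts": ("int",    0, 1000),
    "separating/minefficacy":        ("float", 0.0, 1.0),
    "separating/closecuts/delay":    ("bool", None, None),
}

def apply_config(set_param, raw):
    """Cast raw policy outputs in [0, 1] to each parameter's type and
    range, then push them to the solver via set_param (in practice,
    pyscipopt's Model.setParam)."""
    for name, u in raw.items():
        kind, lo, hi = PARAM_SPECS[name]
        if kind == "bool":
            set_param(name, u >= 0.5)          # threshold for booleans
        elif kind == "int":
            set_param(name, int(round(lo + u * (hi - lo))))  # discretize
        else:
            set_param(name, lo + u * (hi - lo))  # continuous parameter
```

One table of specs plus one casting routine is what makes the framework scale uniformly over int, float, and bool parameters.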
As shown in Table 1, our learned configuration policy—now encompassing the enlarged parameter space—delivers substantial performance gains over the default solver configuration.
Table 1. Evaluation of the default setting (Default) and our configuration method with an enlarged parameter space (DynSep-expanded).
| Instance | Knapsack | Knapsack | Corlat | Corlat | neos | neos |
|---|---|---|---|---|---|---|
| Metric | Time(s) | PD integral | Time(s) | PD integral | Time(s) | PD integral |
| Default | 2.01 | 18.6 | 111.55 | 10573.14 | 282.98 | 18500.5 |
| DynSep-expanded (ours) | 0.57 | 10.46 | 56.03 | 5161.01 | 249.43 | 8602.66 |
Compared to traditional automated configuration tools such as ParamILS [1] (typically targeting 81 CPLEX parameters) and SMAC [2] (focusing on 76 parameters), as well as recent ML approaches like Deep Metric Learning [3], which tunes 17 parameters, our method supports a larger and extensible parameter space. DynSep's policy employs a unified and scalable framework capable of handling both shared and separator-specific parameters across multiple data types (int, float, bool).
The comprehensive list of newly included parameters is detailed in Table 2. Note that there are 142 separator parameters in the table; adding the two global parameters, separating/maxrounds and separating/maxroundsroot, makes a total of 144 parameters.
Table 2. List of separator parameters controlled by our configuration policy.
| Separator | Shared Params | Specific Params | Separator | Shared Params | Specific Params |
|---|---|---|---|---|---|
| closecuts | freq, priority, maxbounddist, delay, expbackoff | sepacombvalue, closethres | cgmip | freq, priority, maxbounddist, delay, expbackoff | minnodelimit, maxnodelimit |
| disjunctive | freq, priority, maxbounddist, delay, expbackoff | maxinvcuts, maxrank | strongcg | freq, priority, maxbounddist, delay, expbackoff | / |
| minor | freq, priority, maxbounddist, delay, expbackoff | maxminorsconst, maxminorsfac | aggregation | freq, priority, maxbounddist, delay, expbackoff | maxaggrs, maxsepacuts |
| mixing | freq, priority, maxbounddist, delay, expbackoff | uselocalbounds, iscutsonints | clique | freq, priority, maxbounddist, delay, expbackoff | maxsepacuts, scaleval |
| rlt | freq, priority, maxbounddist, delay, expbackoff | maxusedvars, goodscore | zerohalf | freq, priority, maxbounddist, delay, expbackoff | maxsepacuts, minviol |
| interminor | freq, priority, maxbounddist, delay, expbackoff | mincutviol | mcf | freq, priority, maxbounddist, delay, expbackoff | maxsepacuts, nclusters |
| convexproj | freq, priority, maxbounddist, delay, expbackoff | nlpiterlimit | eccuts | freq, priority, maxbounddist, delay, expbackoff | maxsepacuts, minviolation |
| gauge | freq, priority, maxbounddist, delay, expbackoff | nlpiterlimit | oddcycle | freq, priority, maxbounddist, delay, expbackoff | maxsepacuts, scalingfactor |
| impliedbounds | freq, priority, maxbounddist, delay, expbackoff | usetwosizecliques | flowcover | freq, priority, maxbounddist, delay, expbackoff | / |
| intobj | freq, priority, maxbounddist, delay, expbackoff | / | cmir | freq, priority, maxbounddist, delay, expbackoff | / |
| gomory | freq, priority, maxbounddist, delay, expbackoff | maxsepacuts, away | rapidlearning | freq, priority, maxbounddist, delay, expbackoff | maxcalls, maxnodes |
[1] Hutter, Frank, et al. "ParamILS: an automatic algorithm configuration framework." Journal of artificial intelligence research 36 (2009).
[2] Hutter, Frank, et al. "Sequential model-based optimization for general algorithm configuration." International conference on learning and intelligent optimization (2011).
[3] Hosny, Abdelrahman, and Sherief Reda. "Automatic MILP solver configuration by learning problem similarities." Annals of Operations Research (2024).
Q2. How the proposed method performs on the full set of 240 instances in MIPLIB2017?
Thank you for your valuable feedback. We have tested our method on the full set of 240 MIPLIB 2017 benchmark instances. The results in Table 3 show that our approach (DynSep) delivers notable improvements in solving efficiency on the challenging MIPLIB 2017 dataset.
Specifically, we set the time limit to seconds and excluded six instances whose presolving time exceeded 300 seconds: neos-3402454-bohle, neos-4722843-widden, mzzv42z, neos-5052403-cygnet, proteindesign121hz512p9, and proteindesign122trx11p8, which is a common removal criterion for the MIPLIB 2017 benchmark [4,5].
The remaining 235 instances were split into a 70% training set and a 30% test set. Table 3 reports the overall average performance of our method across all 235 instances, and Table 4 provides detailed results for every individual instance in the MIPLIB 2017 benchmark. These experiments confirm that our approach delivers notable improvements in solving efficiency, even when evaluated on the more challenging benchmark set.
[4] Turner, Mark, et al. "Adaptive cut selection in mixed-integer linear programming." arXiv preprint arXiv:2202.10962 (2022).
[5] Wang, Jie, et al. "Learning to cut via hierarchical sequence/set model for efficient mixed-integer programming." IEEE Transactions on Pattern Analysis and Machine Intelligence (2024).
Table 3. Comparison between default setting and our method (DynSep) on all 235 MIPLIB 2017 instances.
| Instance | Miplib2017 | Miplib2017 |
|---|---|---|
| Metric | Time(s) | PD integral |
| Default | 258.77 | 17153.69 |
| DynSep (ours) | 238.88 | 15092.16 |
Table 4. Per-instance comparison of Default and our method (DynSep) on the full set of 235 MIPLIB 2017 benchmark instances.
| instance_name | solving_time | primaldualintegral | Method | instance_name | solving_time | primaldualintegral | Method | |
|---|---|---|---|---|---|---|---|---|
| 0 | 30n20b8 | 102.77 | 7918.64 | DynSep(Ours) | neos-4387871-tavua | 300.00 | 13809.79 | DynSep(Ours) |
| 1 | 30n20b8 | 294.10 | 21053.08 | Default | neos-4387871-tavua | 300.00 | 26555.13 | Default |
| 2 | 50v-10 | 300.00 | 3753.26 | DynSep(Ours) | neos-4413714-turia | 300.04 | 30003.96 | DynSep(Ours) |
| 3 | 50v-10 | 300.00 | 3553.53 | Default | neos-4413714-turia | 302.13 | 30213.00 | Default |
| 4 | CMS750_4 | 119.32 | 8927.87 | DynSep(Ours) | neos-4532248-waihi | 300.03 | 30002.91 | DynSep(Ours) |
| 5 | CMS750_4 | 300.20 | 21969.51 | Default | neos-4532248-waihi | 300.23 | 10000.84 | Default |
| 6 | academictimetablesmall | 300.00 | 30000.43 | DynSep(Ours) | neos-4647030-tutaki | 300.07 | 15994.51 | DynSep(Ours) |
| 7 | academictimetablesmall | 302.93 | 30292.50 | Default | neos-4647030-tutaki | 302.40 | 20970.49 | Default |
| 8 | air05 | 28.47 | 724.52 | DynSep(Ours) | neos-4738912-atrato | 300.00 | 903.73 | DynSep(Ours) |
| 9 | air05 | 222.33 | 2019.36 | Default | neos-4738912-atrato | 300.01 | 992.78 | Default |
| 10 | app1-1 | 6.09 | 594.63 | DynSep(Ours) | neos-4763324-toguru | 304.60 | 21333.61 | DynSep(Ours) |
| 11 | app1-1 | 18.49 | 1695.55 | Default | neos-4763324-toguru | 303.26 | 25246.69 | Default |
| 12 | app1-2 | 187.93 | 15555.94 | DynSep(Ours) | neos-4954672-berkel | 300.00 | 10422.23 | DynSep(Ours) |
| 13 | app1-2 | 300.33 | 29112.20 | Default | neos-4954672-berkel | 300.00 | 10030.91 | Default |
| 14 | assign1-5-8 | 1.26 | 18.48 | DynSep(Ours) | neos-5049753-cuanza | 300.27 | 30027.25 | DynSep(Ours) |
| 15 | assign1-5-8 | 31.90 | 490.61 | Default | neos-5049753-cuanza | 300.35 | 30035.39 | Default |
| 16 | atlanta-ip | 300.00 | 30000.45 | DynSep(Ours) | neos-506428 | 303.78 | 30377.81 | DynSep(Ours) |
| 17 | atlanta-ip | 302.42 | 30241.88 | Default | neos-506428 | 301.50 | 30150.24 | Default |
| 18 | b1c1s1 | 300.00 | 16908.55 | DynSep(Ours) | neos-5093327-huahum | 300.00 | 18587.41 | DynSep(Ours) |
| 19 | b1c1s1 | 300.04 | 27161.92 | Default | neos-5093327-huahum | 300.22 | 27226.51 | Default |
| 20 | bab2 | 300.36 | 30036.36 | DynSep(Ours) | neos-5104907-jarama | 300.35 | 30034.80 | DynSep(Ours) |
| 21 | bab2 | 300.43 | 30042.55 | Default | neos-5104907-jarama | 300.47 | 30046.90 | Default |
| 22 | bab6 | 323.32 | 32331.94 | DynSep(Ours) | neos-5107597-kakapo | 300.00 | 28661.57 | DynSep(Ours) |
| 23 | bab6 | 300.18 | 30017.88 | Default | neos-5107597-kakapo | 300.05 | 30004.63 | Default |
| 24 | beasleyC3 | 300.05 | 6123.19 | DynSep(Ours) | neos-5114902-kasavu | 301.44 | 30144.12 | DynSep(Ours) |
| 25 | beasleyC3 | 32.49 | 1394.72 | Default | neos-5114902-kasavu | 300.74 | 10041.43 | Default |
| 26 | binkar10_1 | 300.00 | 283.00 | DynSep(Ours) | neos-5188808-nattai | 300.00 | 30000.33 | DynSep(Ours) |
| 27 | binkar10_1 | 172.39 | 1093.98 | Default | neos-5188808-nattai | 300.31 | 30031.18 | Default |
| 28 | blp-ar98 | 300.00 | 11455.66 | DynSep(Ours) | neos-5195221-niemur | 300.01 | 30000.54 | DynSep(Ours) |
| 29 | blp-ar98 | 300.00 | 17007.64 | Default | neos-5195221-niemur | 300.33 | 30033.01 | Default |
| 30 | blp-ic98 | 300.00 | 5253.91 | DynSep(Ours) | neos-631710 | 300.09 | 0.00 | DynSep(Ours) |
| 31 | blp-ic98 | 300.00 | 6227.12 | Default | neos-631710 | 303.56 | 0.00 | Default |
| 32 | bnatt400 | 300.29 | 30029.25 | DynSep(Ours) | neos-662469 | 192.55 | 15312.80 | DynSep(Ours) |
| 33 | bnatt400 | 300.08 | 30008.34 | Default | neos-662469 | 299.14 | 25187.40 | Default |
| 34 | bnatt500 | 300.00 | 30000.08 | DynSep(Ours) | neos-787933 | 4.36 | 435.77 | DynSep(Ours) |
| 35 | bnatt500 | 300.08 | 30007.55 | Default | neos-787933 | 5.77 | 576.29 | Default |
Table 4 (continued). A representative subset of MIPLIB 2017 instances is shown below for brevity—full results are available upon request.
| instance_name | solving_time | primaldualintegral | Method | instance_name | solving_time | primaldualintegral | Method | |
|---|---|---|---|---|---|---|---|---|
| 36 | bppc4-08 | 5.99 | 256.75 | DynSep(Ours) | neos-827175 | 128.83 | 2135.73 | DynSep(Ours) |
| 37 | bppc4-08 | 300.00 | 2177.85 | Default | neos-827175 | 171.48 | 3056.52 | Default |
| 38 | brazil3 | 300.00 | 30000.17 | DynSep(Ours) | neos-848589 | 330.87 | 33087.08 | DynSep(Ours) |
| 39 | brazil3 | 300.03 | 30002.79 | Default | neos-848589 | 311.46 | 11124.61 | Default |
| 40 | buildingenergy | 300.03 | 30003.02 | DynSep(Ours) | neos-860300 | 152.44 | 4393.38 | DynSep(Ours) |
| 41 | buildingenergy | 300.03 | 30002.63 | Default | neos-860300 | 152.19 | 4196.21 | Default |
| 42 | cbs-cta | 93.76 | 9371.95 | DynSep(Ours) | neos-873061 | 300.03 | 30003.18 | DynSep(Ours) |
| 43 | cbs-cta | 300.10 | 30010.24 | Default | neos-873061 | 300.03 | 30003.02 | Default |
| 44 | chromaticindex1024-7 | 300.12 | 30011.77 | DynSep(Ours) | neos-911970 | 300.00 | 8842.15 | DynSep(Ours) |
| 45 | chromaticindex1024-7 | 300.15 | 30015.01 | Default | neos-911970 | 300.00 | 11450.20 | Default |
| 46 | chromaticindex512-7 | 304.15 | 16647.67 | DynSep(Ours) | neos-933966 | 300.15 | 28824.98 | DynSep(Ours) |
| 47 | chromaticindex512-7 | 301.59 | 25040.11 | Default | neos-933966 | 300.09 | 29583.14 | Default |
| 48 | cmflsp50-24-8-8 | 300.00 | 30000.40 | DynSep(Ours) | neos-937815 | 300.04 | 23919.16 | DynSep(Ours) |
| 49 | cmflsp50-24-8-8 | 300.26 | 30026.38 | Default | neos-937815 | 300.18 | 23355.06 | Default |
| 50 | co-100 | 303.37 | 27088.52 | DynSep(Ours) | neos-950242 | 300.00 | 30000.30 | DynSep(Ours) |
| 51 | co-100 | 305.11 | 27836.45 | Default | neos-950242 | 300.01 | 29970.15 | Default |
| 52 | cod105 | 300.11 | 15312.43 | DynSep(Ours) | neos-957323 | 368.33 | 17950.78 | DynSep(Ours) |
| 53 | cod105 | 300.04 | 16051.30 | Default | neos-957323 | 301.42 | 17405.53 | Default |
| 54 | comp07-2idx | 99.30 | 9929.90 | DynSep(Ours) | neos-960392 | 305.71 | 30571.17 | DynSep(Ours) |
| 55 | comp07-2idx | 300.12 | 30011.60 | Default | neos-960392 | 303.98 | 30397.71 | Default |
| 56 | comp21-2idx | 43.22 | 4091.79 | DynSep(Ours) | neos17 | 16.88 | 681.30 | DynSep(Ours) |
| 57 | comp21-2idx | 300.08 | 30004.57 | Default | neos17 | 29.33 | 1376.34 | Default |
| 58 | cost266-UUE | 300.00 | 3938.64 | DynSep(Ours) | neos5 | 300.00 | 1800.15 | DynSep(Ours) |
| 59 | cost266-UUE | 300.05 | 14030.49 | Default | neos5 | 300.01 | 2036.73 | Default |
| 60 | cryptanalysiskb128n5obj14 | 300.07 | 30007.22 | DynSep(Ours) | neos8 | 21.93 | 2015.16 | DynSep(Ours) |
| 61 | cryptanalysiskb128n5obj14 | 300.13 | 0.00 | Default | neos8 | 24.73 | 2361.97 | Default |
| 62 | cryptanalysiskb128n5obj16 | 300.04 | 0.00 | DynSep(Ours) | neos859080 | 1.26 | 126.35 | DynSep(Ours) |
| 63 | cryptanalysiskb128n5obj16 | 300.07 | 10003.04 | Default | neos859080 | 1.77 | 176.57 | Default |
| 64 | csched007 | 27.22 | 2720.19 | DynSep(Ours) | net12 | 300.00 | 23510.79 | DynSep(Ours) |
| 65 | csched007 | 300.01 | 30001.17 | Default | net12 | 300.20 | 26938.98 | Default |
| 66 | csched008 | 300.07 | 30006.74 | DynSep(Ours) | netdiversion | 300.29 | 30028.80 | DynSep(Ours) |
| 67 | csched008 | 300.02 | 30001.81 | Default | netdiversion | 301.77 | 30176.71 | Default |
| 68 | cvs16r128-89 | 300.00 | 11471.81 | DynSep(Ours) | nexp-150-20-8-5 | 19.32 | 1898.55 | DynSep(Ours) |
| 69 | cvs16r128-89 | 300.16 | 16150.14 | Default | nexp-150-20-8-5 | 65.86 | 6465.64 | Default |
| 70 | dano3_3 | 300.00 | 5711.73 | DynSep(Ours) | ns1116954 | 303.16 | 30316.21 | DynSep(Ours) |
| 71 | dano3_3 | 300.11 | 7975.16 | Default | ns1116954 | 300.07 | 30007.18 | Default |
| 72 | dano3_5 | 300.00 | 5865.82 | DynSep(Ours) | ns1208400 | 301.12 | 30111.61 | DynSep(Ours) |
| 73 | dano3_5 | 300.64 | 7590.88 | Default | ns1208400 | 300.07 | 30006.55 | Default |
| 74 | decomp2 | 35.72 | 1475.39 | DynSep(Ours) | ns1644855 | 300.03 | 29906.66 | DynSep(Ours) |
| 75 | decomp2 | 17.53 | 1572.22 | Default | ns1644855 | 300.01 | 29891.60 | Default |
| 76 | drayage-100-23 | 49.37 | 3359.66 | DynSep(Ours) | ns1760995 | 300.38 | 0.00 | DynSep(Ours) |
| 77 | drayage-100-23 | 67.45 | 5195.09 | Default | ns1760995 | 300.53 | 0.00 | Default |
| 78 | drayage-25-23 | 300.00 | 2613.06 | DynSep(Ours) | ns1830653 | 300.00 | 8709.04 | DynSep(Ours) |
| 79 | drayage-25-23 | 300.05 | 8676.32 | Default | ns1830653 | 300.05 | 24382.60 | Default |
| 80 | dws008-01 | 300.00 | 26082.29 | DynSep(Ours) | ns1952667 | 304.78 | 30478.39 | DynSep(Ours) |
| 81 | dws008-01 | 300.01 | 30000.88 | Default | ns1952667 | 304.65 | 30464.90 | Default |
| 82 | eil33-2 | 255.11 | 4152.53 | DynSep(Ours) | nu25-pr12 | 4.77 | 356.16 | DynSep(Ours) |
| 83 | eil33-2 | 300.02 | 4917.81 | Default | nu25-pr12 | 5.33 | 356.76 | Default |
| 84 | eilA101-2 | 303.30 | 24543.44 | DynSep(Ours) | nursesched-medium-hint03 | 302.30 | 30230.15 | DynSep(Ours) |
| 85 | eilA101-2 | 304.72 | 27328.43 | Default | nursesched-medium-hint03 | 302.10 | 30182.35 | Default |
| 86 | enlight_hard | 0.01 | 1.26 | DynSep(Ours) | nursesched-sprint02 | 300.00 | 30000.18 | DynSep(Ours) |
| 87 | enlight_hard | 0.01 | 1.46 | Default | nursesched-sprint02 | 284.22 | 24857.05 | Default |
| 88 | ex10 | 148.85 | 0.00 | DynSep(Ours) | nw04 | 143.64 | 6569.14 | DynSep(Ours) |
| 89 | ex10 | 203.21 | 0.00 | Default | nw04 | 183.54 | 8531.62 | Default |
| 90 | ex9 | 49.94 | 0.00 | DynSep(Ours) | opm2-z10-s4 | 301.22 | 27698.56 | DynSep(Ours) |
| 91 | ex9 | 54.95 | 0.00 | Default | opm2-z10-s4 | 300.02 | 29157.60 | Default |
Dear Reviewer Fjx6,
Thank you for clicking the “Mandatory Acknowledgement” button to confirm you’ve read the authors’ rebuttal. Please also comment as soon as possible on whether the authors have addressed your concerns, and raise any follow-up questions you may have.
Thank you for your continued contributions to the review process.
Best, AC
This paper proposes DynSep, a dynamic separator configuration method that models separator configuration in different rounds as a reinforcement learning task and tokenizes the incremental subgraphs with a transformer policy model to autoregressively predict when to halt separation and which separators to activate at each round. Evaluated on synthetic and large-scale real-world MILP problems, DynSep substantially accelerates solving on easy and medium instances and reduces the integrality gap on hard instances.
Strengths and Weaknesses
Strengths
- The paper is clear and well-written.
- The extension from static separator configuration to dynamic separator configuration is non-trivial. The authors handle this extension with incremental subgraph tokenization and blocked positional encoding for the transformer architecture, and these special treatments in the architecture design make sense.
- The authors show significant performance improvement over baselines on a variety of benchmarks, with detailed ablation study to justify the method’s design choices.
Weaknesses
- I feel like the methods developed in this paper are not particularly specific to configuring cutting plane separators, and they should be applicable to configuring other parameters (e.g., branching, heuristics). Can the authors comment on the difficulty/flexibility of adapting their framework to configure other parameters, and potentially run small experiments to show the feasibility of this extension?
- The authors set separator frequency as 10 and set the maximum separator round as T = 5. It is unclear how these two parameters (i.e. more or less frequent separator configurations) affect the performance of the method. The authors should perform some robustness studies in terms of varying these two parameters and comparing the performance.
- Can the authors conduct experiments on Gurobi (or other MILP solvers) to see if their algorithm can improve Gurobi as well?
Questions
See weaknesses.
Limitations
The authors explain the limitation of their method in the conclusion section: “the current implementation of dynsep lacks a lightweight decision model to retrain historical information across the global B&B tree”. I wonder if the authors could slightly expand the conclusion to propose a few potential ways to address this limitation?
Final Justification
I'm overall in favor of accepting this paper, as it extends prior work from single-turn cutting-plane hyperparameter configuration to adaptive, multi-turn hyperparameter configuration, and it observes significant improvement from additional configurations. From the rebuttal, one weakness of the paper is that their current implementation cannot easily adapt to configure other sets of hyperparameters in the MILP solver, so the study of the proposed method's generalizability to different sets of MILP hyperparameters is bottlenecked by the engineering challenges. Also, the paper focuses on configuring the SCIP solver, and faces engineering challenges to adapt to other MILP solvers, so the generalizability across solvers is not tested. Given this, I would like to keep my initial score 4, borderline accept.
Formatting Issues
N/A
We thank the reviewer for the positive and insightful comments. We respond to each comment as follows and sincerely hope that our rebuttal will properly address your concerns. If so, we would deeply appreciate it if you could raise your score. If not, please let us know your further concerns, and we will continue actively responding to your comments and improving our submission.
Q1. The difficulty / flexibility of adapting their framework to configure other parameters.
A1.
Thank you for the insightful comment. We believe our proposed DynSep framework is inherently extensible to a broader set of solver hyperparameters beyond separators (e.g., branching, heuristics), because DynSep is fundamentally a node-level adaptive configuration framework over the B&C tree.
Compared to configuring separators, adapting DynSep to branching or heuristics requires nontrivial solver-specific engineering because these components expose different plugin/callback interfaces. Specifically, the adaptation would involve:
- Implement analogous instrumentation via the BRANCH and HEURISTIC plugin callbacks, such as `BRANCHEXECLP` and `HEUREXEC`.
- Augment the state representation with relevant branching/heuristic signals provided by the solver.
- Adapt the policy network of DynSep: model the outputs of other parameters as a logistic‑normal distribution, followed by optional discretization to support both integer and continuous parameters. This enables flexible, differentiable control of arbitrary solver parameters.
- Train the policy with the same PPO-based scheme.
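As an illustration of the logistic-normal parameterization mentioned in the third step, the following is a minimal sketch; the function name, parameter names, and value ranges are our own illustrative choices, not DynSep's actual implementation:

```python
import math
import random

def logistic_normal_param(mu, sigma, low, high, integer=False, rng=random):
    """Sample one solver parameter from a logistic-normal policy head (sketch).

    A Gaussian sample is squashed through a sigmoid into (0, 1), then rescaled
    to the parameter's [low, high] range; integer parameters are discretized
    by rounding. All names here are illustrative.
    """
    z = rng.gauss(mu, sigma)            # latent Gaussian output of the policy head
    u = 1.0 / (1.0 + math.exp(-z))      # sigmoid -> (0, 1)
    value = low + u * (high - low)      # rescale to the parameter range
    return round(value) if integer else value

# Hypothetical examples: a float threshold in [0, 1] and an integer frequency in [1, 20].
random.seed(0)
minefficacy = logistic_normal_param(mu=0.0, sigma=1.0, low=0.0, high=1.0)
freq = logistic_normal_param(mu=0.0, sigma=1.0, low=1, high=20, integer=True)
```

The sigmoid squashing keeps gradients well-defined while the optional rounding covers both continuous and integer solver parameters, which is what makes this head flexible across parameter types.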
Due to time constraints, we have not yet migrated DynSep to branching or heuristic configuration, but we have already extended it to additional cut-related parameters to demonstrate feasibility. We conducted additional experiments on three critical hyperparameter groups, while retaining the original tuning mechanism for separator activation and max round.
- Para group 1: cut depth / aggressiveness (`sepastore/age` in SCIP), controlled by solver parameters `separating/cutagelimit` and `separating/poolfreq`.
- Para group 2: cut selection thresholds (e.g., efficacy vs. orthogonality), controlled by solver parameters `separating/minefficacy` and `cutselection/hybrid/minortho`.
- Para group 3: separation frequency per node, controlled by solver parameters `separating/cutagelimit` and `separating/poolfreq`.
We provide the experimental results as follows.
| Instance | Knapsack | Knapsack | Knapsack | Corlat | Corlat | Corlat | Supportcase | Supportcase | Supportcase |
|---|---|---|---|---|---|---|---|---|---|
| Metric | Time(s) | PD integral | PD gap | Time(s) | PD integral | PD gap | Time(s) | PD integral | PD gap |
| DynSep (Ours) | 0.52 | 9.71 | 0 | 22.96 | 2233.42 | 0 | 132.50 | 9212.24 | 7.86 |
| Para Group 1 | 0.65 | 9.01 | 0 | 47.36 | 4580.9 | 4e+18 | 141.54 | 8610.24 | 0.17 |
| Para Group 2 | 0.36 | 7.91 | 0 | 42.17 | 1971.18 | 0.01 | 167.64 | 8661.57 | 0.16 |
| Para Group 3 | 0.65 | 10.03 | 0 | 24.71 | 1922.77 | 0 | 141.09 | 8362.53 | 0.17 |
We evaluated our method on three benchmark datasets: Knapsack, Corlat, and MIPLIB mixed supportcase, measuring solving time (Time), primal–dual gap integral (PD integral), and primal–dual gap (PD gap). All three metrics are lower-is-better.
We compared our original DynSep method with its extended versions incorporating three additional parameter groups. Results show that:
- Across all datasets, DynSep and its extended variants consistently outperform the default solver configuration reported in the main paper.
- The differences among the parameter groups are relatively small, yet certain combinations yield further performance improvements in both Time and PD integral.
These results validate that DynSep can flexibly extend to control a broader set of solver hyperparameters. Furthermore, carefully chosen parameter combinations can yield additional gains in solving speed and convergence.
Q2. Robustness studies in terms of separator frequency and maximum separation round.
A2.
We conducted robustness ablations over separator frequency (1, 5, 10, 20) and maximum separation rounds (3, 5, 10, 20) on three benchmarks (MIK, Corlat, and MIPLIB mixed Neos). Overall, across almost all tested settings, DynSep outperforms the default configuration, showing that the approach is reasonably stable to these hyperparameters.
On each benchmark, we report three metrics: solving time (Time), primal–dual gap integral (PD integral), and primal–dual gap (PD gap). All three metrics are lower-is-better. Below are our key observations.
- Frequency: A moderate frequency (e.g., 5 or 10) gives the best trade-off. A small frequency causes separators to fire excessively, incurring high cut-generation overhead, whereas a large frequency reduces opportunities for timely dual-bound tightening.
| Instance | MIK | MIK | MIK | Corlat | Corlat | Corlat | Neos | Neos | Neos |
|---|---|---|---|---|---|---|---|---|---|
| Metric | Time(s) | PD integral | PD gap | Time(s) | PD integral | PD gap | Time(s) | PD integral | PD gap |
| Default | 16.65 | 82.80 | 0.0 | 111.55 | 10573.14 | 2.73e+19 | 282.98 | 18500.5 | 2.5e+19 |
| Freq=1 | 12.74 | 124.22 | 0.0 | 53.77 | 4582.15 | 4e+18 | 252.12 | 8756.3 | 2.5e+19 |
| Freq=5 | 13.04 | 86.55 | 0.0 | 38.44 | 3513.13 | 0.0 | 243.91 | 9248.52 | 2.5e+19 |
| Freq=10 (Ours) | 10.99 | 134.15 | 0.0 | 22.96 | 2233.42 | 0.0 | 235.19 | 8511.58 | 2.5e+19 |
| Freq=20 | 12.83 | 213.15 | 0.0 | 49.14 | 4658.09 | 0.0 | 261.86 | 8789.54 | 2.5e+19 |
- Maximum separation rounds: Setting this value too low produces weak cuts and degrades performance, while setting it excessively high increases the computational cost of cut generation. Although this parameter shows some dataset sensitivity, MaxRound=5 is empirically near-optimal in our tests.
| Instance | MIK | MIK | MIK | Corlat | Corlat | Corlat | Neos | Neos | Neos |
|---|---|---|---|---|---|---|---|---|---|
| Metric | Time(s) | PD integral | PD gap | Time(s) | PD integral | PD gap | Time(s) | PD integral | PD gap |
| Default | 16.65 | 82.80 | 0.0 | 111.55 | 10573.14 | 2.73e+19 | 282.98 | 18500.5 | 2.5e+19 |
| MaxRound=3 | 13.63 | 150.19 | 0.0 | 107.26 | 8965.26 | 2.4e+19 | 239.22 | 8581.5 | 2.5e+19 |
| MaxRound=5 (Ours) | 10.99 | 134.15 | 0.0 | 22.96 | 2233.42 | 0.0 | 235.19 | 8511.58 | 2.5e+19 |
| MaxRound=10 | 17.6 | 182.71 | 0.0 | 40.45 | 3717.64 | 2e+18 | 249.41 | 8846.91 | 2.5e+19 |
| MaxRound=20 | 16.2 | 169.8 | 0.0 | 36.71 | 3609.11 | 0.0 | 252.59 | 10945.82 | 2.5e+19 |
Q3. Can the authors conduct experiments on gurobi (or other MILP solvers)?
A3.
We appreciate the reviewer’s suggestion to apply our method to other MILP solvers like Gurobi. Although our current experiments are conducted in SCIP, translating our separator-activation tuning approach to Gurobi is technically feasible. Below, we outline the feasibility and limitations of such an implementation.
- Feasibility. Gurobi provides separators such as `CliqueCuts`, `FlowCoverCuts`, `MIRCuts`, and `StrongCGCuts`, each accepting a configuration parameter in {0, 1, 2} (or −1 for automatic): 0 disables the cut separator, 1 enables it moderately, and 2 enables it aggressively, precisely matching DynSep’s configuration schema. These parameters can be set statically via `model.setParam("CliqueCuts", value)` and can also be changed dynamically at the node level via `cbSetIntParam(...)` within a `where == GRB.Callback.MIPNODE` callback. Using these interfaces, our separator configuration method can be ported to Gurobi fairly directly.
- Limitation. Gurobi’s callback mechanism only invokes user code once per node for separators, which prevents us from adjusting parameters at a finer, per-separation-round granularity. However, node-level tuning remains effective for most practical scenarios.
Given the preceding discussion, the adaptation requires substantial solver-specific engineering to rewire state collection, accommodate Gurobi’s callback semantics, and translate our dynamic decision logic to its API. Due to time constraints, we have not yet completed the Gurobi experiments. Nevertheless, we remain confident that DynSep’s separator-configuration strategy can be properly implemented in Gurobi. We will prioritize these tests and include preliminary Gurobi results once feasible.
Q4. A lightweight decision model to retain historical information across the global B&B tree.
A4.
Here we provide more discussion on the potential ways that can address the mentioned limitation. The current DynSep only reasons over the local, per-node incremental graph. To capture global, historical dynamics across the entire branch-and-bound tree, one could:
- Maintain a tree-wide memory of separator performance and variable bound changes (e.g. via a lightweight recurrent summary or graph coarsening).
- Aggregate cut efficacy statistics at each depth level (or over recent sibling nodes) to inform decisions in new nodes—trading off memory overhead against improved long-term planning.
While logging every cut and bound update could incur runtime overhead, careful sampling strategies could keep this overhead minimal. Embedding such a lightweight global module would allow DynSep to adapt not only to per-round shifts but also to the evolving solver trajectory across the entire search tree.
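A minimal sketch of such a tree-wide memory, assuming a simple per-separator exponential moving average (EMA) of a reward signal such as dual-bound improvement; the class and method names are illustrative, not part of DynSep:

```python
class SeparatorMemory:
    """Tree-wide running summary of separator efficacy (illustrative sketch).

    Keeps an exponential moving average of a per-separator reward signal
    (e.g., dual-bound improvement) observed across B&B nodes, so a policy
    could condition on history beyond the current node at low memory cost.
    """

    def __init__(self, alpha=0.1):
        self.alpha = alpha   # EMA smoothing factor in (0, 1]
        self.ema = {}        # separator name -> running efficacy estimate

    def update(self, separator, reward):
        # Seed the EMA with the first observation, then smooth.
        prev = self.ema.get(separator, reward)
        self.ema[separator] = (1 - self.alpha) * prev + self.alpha * reward

    def summary(self, separator):
        # Unseen separators default to a neutral 0.0 estimate.
        return self.ema.get(separator, 0.0)

mem = SeparatorMemory(alpha=0.5)
mem.update("gomory", 1.0)   # first observation seeds the EMA at 1.0
mem.update("gomory", 0.0)   # EMA moves halfway toward 0 with alpha=0.5
```

A fixed-size summary like this costs O(#separators) memory regardless of tree size, which is the trade-off the paragraph above alludes to.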
Thanks for the response. I do not have any follow up questions and would like to keep my score as I think the adaptive cutting plane parameter configuration is a valuable contribution to the community given prior works mainly focus on single turn configuration. I understand the engineering difficulty, but I encourage the authors to explore adaptive configuration for other MILP parameters and potentially other MILP solvers to further strengthen the work.
Thank you very much for your encouraging feedback. We are glad to know that you view the adaptive cutting-plane parameter configuration as a valuable contribution. We fully appreciate your insightful suggestion regarding extending adaptive configuration to other MILP parameters and solvers, and we look forward to exploring these directions in our future work.
Notably, we have conducted new experimental trials exploring an enlarged MILP parameter space. Although these additional parameters focus on the separator module due to engineering complexity, we hope that the results could demonstrate the extensibility of our method across a broader parameter space.
Specifically, we expand our controlled separator parameter set from a total number of to , more than tripling its original size. The expanded parameter space includes shared and separator-specific parameters and covers various numeric types (int, float, bool).
The results in Table 1 show that our method (DynSep) with an enlarged parameter space delivers substantial performance gains over the default solver configuration.
Table 1. Evaluation of the default setting (Default) and our configuration method with an enlarged parameter space (DynSep-expanded).
| Instance | Knapsack | Knapsack | Corlat | Corlat | neos | neos |
|---|---|---|---|---|---|---|
| Metric | Time(s) | PD integral | Time(s) | PD integral | Time(s) | PD integral |
| Default | 2.01 | 18.6 | 111.55 | 10573.14 | 282.98 | 18500.5 |
| DynSep-expanded (ours) | 0.57 | 10.46 | 56.02 | 5161.01 | 249.43 | 8602.66 |
Compared to traditional automated configuration tools such as ParamILS [1]—typically targeting 81 CPLEX parameters—and SMAC [2]—focusing on 76 parameters—as well as recent machine learning approaches like Deep Metric Learning [3], which tune 17 parameters, our method supports a larger and extensible parameter space. It employs a unified and scalable framework capable of handling both shared and separator-specific parameters across multiple data types (int, float, bool).
The comprehensive list of newly included parameters is detailed in Table 2. Note that there are separator parameters in the table. Adding two global parameters, `separating/maxrounds` and `separating/maxroundsroot`, makes a total of parameters.
Table 2. List of separator parameters controlled by our configuration policy.
| Separator | Shared Params | Specific Params | Separator | Shared Params | Specific Params |
|---|---|---|---|---|---|
| closecuts | freq, priority, maxbounddist, delay, expbackoff | sepacombvalue, closethres | cgmip | freq, priority, maxbounddist, delay, expbackoff | minnodelimit, maxnodelimit |
| disjunctive | freq, priority, maxbounddist, delay, expbackoff | maxinvcuts, maxrank | strongcg | freq, priority, maxbounddist, delay, expbackoff | / |
| minor | freq, priority, maxbounddist, delay, expbackoff | maxminorsconst, maxminorsfac | aggregation | freq, priority, maxbounddist, delay, expbackoff | maxaggrs, maxsepacuts |
| mixing | freq, priority, maxbounddist, delay, expbackoff | uselocalbounds, iscutsonints | clique | freq, priority, maxbounddist, delay, expbackoff | maxsepacuts, scaleval |
| rlt | freq, priority, maxbounddist, delay, expbackoff | maxusedvars, goodscore | zerohalf | freq, priority, maxbounddist, delay, expbackoff | maxsepacuts, minviol |
| interminor | freq, priority, maxbounddist, delay, expbackoff | mincutviol | mcf | freq, priority, maxbounddist, delay, expbackoff | maxsepacuts, nclusters |
| convexproj | freq, priority, maxbounddist, delay, expbackoff | nlpiterlimit | eccuts | freq, priority, maxbounddist, delay, expbackoff | maxsepacuts, minviolation |
| gauge | freq, priority, maxbounddist, delay, expbackoff | nlpiterlimit | oddcycle | freq, priority, maxbounddist, delay, expbackoff | maxsepacuts, scalingfactor |
| impliedbounds | freq, priority, maxbounddist, delay, expbackoff | usetwosizecliques | flowcover | freq, priority, maxbounddist, delay, expbackoff | / |
| intobj | freq, priority, maxbounddist, delay, expbackoff | / | cmir | freq, priority, maxbounddist, delay, expbackoff | / |
| gomory | freq, priority, maxbounddist, delay, expbackoff | maxsepacuts, away | rapidlearning | freq, priority, maxbounddist, delay, expbackoff | maxcalls, maxnodes |
[1] Hutter, Frank, et al. "ParamILS: an automatic algorithm configuration framework." Journal of artificial intelligence research 36 (2009).
[2] Hutter, Frank, et al. "Sequential model-based optimization for general algorithm configuration." International conference on learning and intelligent optimization (2011).
[3] Hosny, Abdelrahman, and Sherief Reda. "Automatic MILP solver configuration by learning problem similarities." Annals of Operations Research (2024).
Dear Reviewer E4cm,
Thank you for clicking the “Mandatory Acknowledgement” button to confirm you’ve read the authors’ rebuttal. Please also leave comments as soon as possible on whether the authors have addressed your concerns and any follow-up questions in your mind, if any.
Thank you for your continued contributions to the review process.
Best, AC
This paper introduces DynSep, a dynamic separator configuration method for cutting plane selection in Mixed-Integer Linear Programming (MILP) solvers. DynSep formulates the round-wise activation and deactivation of cutting plane separators as a reinforcement learning (RL) problem. By modeling the evolving MILP instance as an incremental triplet graph, DynSep uses a GCN encoder and a decoder-only Transformer to autoregressively predict both when to halt separation and which separators to activate at each round. Experiments on synthetic and large-scale real-world MILP benchmarks demonstrate that DynSep significantly accelerates solving time (by 64% on average for easy/medium datasets) and reduces the primal-dual gap integral (by 16% on challenging datasets) compared to both human-designed and learning-based baselines. The method also shows strong generalization to much larger MILP instances than those used in training.
Strengths and Weaknesses
Strengths:
- DynSep delivers substantial improvements over both human-designed and ML baselines across a wide range of MILP benchmarks, including large-scale industrial and public datasets.
- The method generalizes to larger instances and unseen problem sizes, a key requirement for practical MILP solving.
- Ablation studies confirm the necessity of each component, and visualization of learned separator policies aligns with domain insights.
Weaknesses:
- All experiments are conducted within SCIP; comparisons to commercial solvers (e.g., Gurobi, CPLEX) or other solver backends are missing, which limits claims of generality and practical impact.
- The approach configures only the built-in separators in SCIP, and does not explore integration with advanced or user-defined cutting plane families, potentially limiting extensibility.
Questions
- Could you add comparisons with these mainstream commercial solvers (such as Gurobi or CPLEX) so I can better understand the current level of DynSep’s capabilities in an industrial context?
- How does DynSep handle the addition of new, user-defined, or more advanced separators beyond SCIP’s built-in set? What is required to support extensibility?
Limitations
Yes
Final Justification
Thank you for your detailed response. It has clarified the questions I raised and resolved my earlier concerns. Given this, I am confident in the contributions of the work and maintain my positive score of 4.
Formatting Issues
No
We thank the reviewer for the positive and insightful comments. We respond to each comment as follows and sincerely hope that our rebuttal will properly address your concerns. If so, we would deeply appreciate it if you could raise your score. If not, please let us know your further concerns, and we will continue actively responding to your comments and improving our submission.
Q1. Extension to commercial solvers. (e.g., Gurobi, CPLEX)
A1.
We appreciate the reviewer’s suggestion to apply our method to commercial solvers like Gurobi or CPLEX. Although our current experiments are conducted in SCIP, translating our separator-activation tuning approach to Gurobi is technically feasible. Below, we outline the feasibility and limitations of such an implementation.
- Feasibility. Gurobi provides separators such as `CliqueCuts`, `FlowCoverCuts`, `MIRCuts`, and `StrongCGCuts`, each accepting a configuration parameter in {0, 1, 2} (or −1 for automatic): 0 disables the cut separator, 1 enables it moderately, and 2 enables it aggressively, precisely matching DynSep’s configuration schema. These parameters can be set statically via `model.setParam("CliqueCuts", value)` and can also be changed dynamically at the node level via `cbSetIntParam(...)` within a `where == GRB.Callback.MIPNODE` callback. Using these interfaces, our separator configuration method can be ported to Gurobi fairly directly.
- Limitation. Gurobi’s callback mechanism only invokes user code once per node for separators, which prevents us from adjusting parameters at a finer, per-separation-round granularity. However, node-level tuning remains effective for most practical scenarios.
Given the preceding discussion, the adaptation requires substantial solver-specific engineering to rewire state collection, accommodate Gurobi’s callback semantics, and translate our dynamic decision logic to its API. Due to time constraints, we have not yet completed the Gurobi experiments. Nevertheless, we remain confident that DynSep’s separator-configuration strategy can be properly implemented in Gurobi. We will prioritize these tests and include preliminary Gurobi results once feasible.
Q2. Integration with advanced or user-defined cutting plane families.
A2.
SCIP exposes a plugin API for user-defined separators: once a custom separator is registered in SCIP, it appears in the separator nodes of our triplet graph alongside the other built-in separators. DynSep treats every separator node uniformly—built-in or custom—so any new separator simply becomes an additional token whose activation status can be tuned. In fact, our own “configuration separator” is implemented via this same API, and is invoked ahead of all other separators to adjust their parameters dynamically.
Thank you for your detailed explanation. I understand that due to time constraints, the relevant experiments have not yet been completed. However, adding these experiments in the future would greatly enhance the impact and strength of your paper. Overall, your response has addressed my concerns.
Thank you for your thoughtful feedback and understanding. We’re pleased that our responses addressed your concerns, and we fully appreciate the importance of the additional experiments. We are committed to conducting them and will include the results in future updates.
Dear Reviewer rt7i,
It appears that you have not yet replied to the authors' rebuttal for Submission 22899, and the discussion period (July 31–August 6) is halfway over. Please get involved as soon as you can to give us time for a fruitful conversation.
In particular, do you feel the authors have addressed your concern regarding the applicability and the impact of the proposed method?
Lastly, don’t forget to click the “Mandatory Acknowledgement” button to confirm you’ve participated.
Thank you for your continued contributions to the review process.
Best, AC
This work proposes DynSep, a reinforcement learning–based method to dynamically control how cutting planes are added during the Branch-and-Cut process in MILP solving. The authors claim that instead of relying on fixed or manually tuned separator settings, DynSep learns to decide both which separators to activate in each separation round and when to stop adding more cuts at each node, adapting these choices as the problem evolves across rounds and nodes. To keep it efficient, DynSep only looks at the incremental subgraphs added in each round, then uses a GCN and a decoder-only Transformer to track the context from earlier rounds and predict the next separator configuration. The authors claim this design helps the solver avoid unnecessary separation rounds, capture useful separator interaction patterns automatically, and improve both speed and solution quality. In experiments, DynSep reduced average solve times by up to 64% and improved solution quality on harder benchmarks, while also generalizing well to larger, unseen MILP instances.
Strengths and Weaknesses
Strengths
- The main strength of this work is that DynSep is an adaptive method that learns not just which separators to use but also when to stop adding cuts, adjusting dynamically as the Branch-and-Cut process evolves.
- Its design is efficient, since it processes only new cuts each round and uses a Transformer decoder to capture temporal context.
- The proposed DynSep has demonstrated consistent and significant improvements in runtime and/or primal-dual gaps on various benchmarks, and datasets. The ablation study also helps with interpretability.
- Built on the open-source SCIP solver, DynSep seems to be also practical and reproducible.
Weaknesses
- There is a lack of clarity in the formulation of the solution. For example, the difference between cutting planes and separators is unclear until the very end of the experiments. As a key concept in the solution and graph model, the separator and how it generates cutting planes are not explained, and it is unclear whether the set of separator nodes remains the same for each triplet graph. It is also unclear how separator nodes are represented in the triplet graph: do they have node features, or are they simply represented by node identity?
- The transformer generates an action based on the history of token blocks. But the tokenization or encoding of the vectorized action is unclear in the paper. Is the action a sequence of tokens or a single token? If an action is a token, is the token space the same as the action space?
- Within each block, node embeddings have identical positional encoding. What about graph positional encodings, such as spectral PE from the graph transformer literature? If the authors deliberately chose not to use graph PE, some explanation would be good.
- Many notations are used before their first introduction, such as PD integral, MDP, and node- and round-wise dynamic configurations.
- The generalizability of DynSep is unclear due to the lack of training information. Is it trained on the same types of NP-hard problems as in the benchmarks? If so, how well does it generalize to other unseen benchmarks? This could be a major limitation and needs more clarification.
Questions
- It is unclear how the GCN encoder is trained, is it included in the transformer training?
- How might DynSep be adapted or extended to handle nonlinear or mixed-integer nonlinear programming (MINLP) problems?
- How does the blocked positional encoding impact performance?
- Could alternative architectures such as graph transformers improve the modeling of separator dynamics?
- Is the training based on the same types of MILPs as in the test benchmarks? How well does it generalize to unseen types of MILPs?
Limitations
- The method is limited to MILP problems and does not cover nonlinear formulations that frequently occur in real-world applications.
- The method may be limited to the types of MILPs seen during training, and its generalizability to unseen types of MILPs is unclear.
Final Justification
The authors have addressed my concerns on clarity of presentation and generalizability of the proposed method across problem sets. Therefore I raise my original rating of 4 to 5.
Formatting Issues
Please ensure consistency by using "decoder-only Transformer," as stated in Experiment 2: Ablation Study, instead of "encoder-only Transformer."
We thank the reviewer for the positive and insightful comments. We respond to each comment as follows and sincerely hope that our rebuttal will properly address your concerns. If so, we would deeply appreciate it if you could raise your score. If not, please let us know your further concerns, and we will continue actively responding to your comments and improving our submission.
Q1. Lack of clarity in the formulation of the solution.
A1.
We apologize for the confusion. Below, we provide the detailed specifications as suggested, and we will include them in the revised manuscript.
- Concept of separators. Separators are core components of a mixed‑integer programming (MIP) solver. Each separator implements a dedicated cutting‑plane algorithm to generate a family of valid inequalities (i.e., cuts) to tighten the LP relaxation. For example, the Gomory cut separator derives Gomory cuts from the current simplex tableau to remove fractional solutions.
- Difference between cutting planes and separators. Cutting planes are valid inequalities that tighten the LP relaxation, which exclude fractional solutions while preserving all feasible integer solutions. Separators are built-in algorithms that generate these cutting planes.
- Separator nodes. In our triplet graph, each separator is represented as a dedicated node. Separator node features comprise a one-hot encoding of its type, plus the separator's performance statistics (e.g., execution time, number of applied cuts, and dual-bound improvement) to capture historical efficacy. Full definitions of these features appear in Table 4 of the appendix. Notably, the set of separator nodes is fixed (corresponding to SCIP’s 22 built-in separators) and remains the same for every instance, while their feature values are updated dynamically at each decision step.
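The separator node features described above could be assembled as in the following hypothetical sketch; the statistic names are an illustrative subset standing in for the full feature list in the appendix:

```python
NUM_SEPARATORS = 22  # SCIP's built-in separators, as stated above
# Illustrative subset of performance statistics; the paper's appendix defines the full set.
STAT_KEYS = ("exec_time", "n_applied_cuts", "dualbound_improvement")

def separator_node_features(sep_index, stats):
    """Build one separator node's feature vector: one-hot type + statistics.

    `sep_index` identifies the separator among the 22 built-ins; `stats`
    maps statistic names to their current values (missing entries default
    to 0.0). Names and layout here are illustrative assumptions.
    """
    one_hot = [0.0] * NUM_SEPARATORS
    one_hot[sep_index] = 1.0
    return one_hot + [float(stats.get(k, 0.0)) for k in STAT_KEYS]

# Example: separator #3 with two statistics observed so far.
feats = separator_node_features(3, {"exec_time": 0.02, "n_applied_cuts": 5})
```

Because the one-hot block has fixed width, every separator node has the same feature dimensionality, so the GCN encoder can process all 22 nodes uniformly.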
Q2. How are actions tokenized/encoded for the transformer?
A2.
Each action can be viewed as a block of tokens, consisting of one token for the maximum round threshold and one token per separator for its activation status. Below, we describe the tokenized formulation of the action.
- Maximum-round token m_t: drawn from a T-dimensional categorical distribution (its token space is {1, ..., T}).
- Separator-activation tokens: for each separator, one token drawn from a 3-dimensional categorical distribution over {-1, 0, +1}, indicating the three activation statuses.
Notably, we clarify that we do not feed the sampled action back into the transformer. Instead, we execute the action in the solver environment and obtain the next incremental state (i.e., the incremental graph), which we input as the subsequent token block into the transformer. This yields an autoregressive loop for both the training and inference phases: each next-step prediction conditions on the entire history of past incremental states, which together constitute the current state s_t. This procedure matches the modeling of a standard online RL policy pi(a_t | s_t), where the policy depends only on the current state s_t, not on prior actions.
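The action-block sampling described above could be sketched as follows; the softmax/categorical helpers and the flat logit shapes are illustrative stand-ins for the actual policy heads:

```python
import math
import random

T = 5                    # maximum separation rounds, as in the paper's setup
NUM_SEPARATORS = 22      # SCIP built-in separators
STATUSES = (-1, 0, 1)    # deactivate / default / activate

def _softmax(xs):
    m = max(xs)                              # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def _categorical(probs, rng):
    r, acc = rng.random(), 0.0               # inverse-CDF sampling
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1

def sample_action_block(round_logits, sep_logits, rng=random):
    """Sample one action block: a max-round token plus one status token per separator.

    `round_logits` has length T; `sep_logits` is a list of NUM_SEPARATORS
    triples. Both would come from the transformer's policy heads in practice.
    """
    max_round = _categorical(_softmax(round_logits), rng) + 1   # token in {1, ..., T}
    statuses = [STATUSES[_categorical(_softmax(l), rng)] for l in sep_logits]
    return max_round, statuses

# Uniform logits just to exercise the sampler.
random.seed(0)
m_t, a_t = sample_action_block([0.0] * T, [[0.0, 0.0, 0.0]] * NUM_SEPARATORS)
```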
Q3. Why not use graph-based positional encodings (PEs) (e.g., spectral PE)?
A3.
We acknowledge that, since all separator nodes are fully connected to variable and constraint nodes, they share the same graph topology, thus receiving identical spectral PE. However, we deliberately opt for blocked PE for two main reasons:
- Computational Efficiency: Computing spectral or other graph-based PEs requires eigendecomposition or additional operations on the graph Laplacian (or related matrices), which scales poorly on large graphs. In contrast, our blocked PE leverages the prior knowledge that separator nodes have no inherent sequence order and incurs minimal overhead.
- Dynamic Temporal Modeling: Graph‑based PEs encode static topological similarity and cannot capture the evolving, time‑series nature of the dynamic separator configuration. In contrast, our blocked PE preserves the chronological order of each incremental subgraph while treating separator nodes as an unordered set. This ensures the model focuses on the temporal progression of separator activations rather than a fixed structural pattern.
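A toy sketch of blocked PE, assuming a standard sinusoidal encoding indexed by block (round) rather than by token position; `D_MODEL` and the block sizes are arbitrary illustrative values:

```python
import math

D_MODEL = 8  # embedding width (toy value for illustration)

def sinusoidal_pe(pos, d=D_MODEL):
    """Standard sinusoidal positional encoding for a single position index."""
    pe = []
    for i in range(0, d, 2):
        angle = pos / (10000 ** (i / d))
        pe.append(math.sin(angle))
        pe.append(math.cos(angle))
    return pe[:d]

def blocked_positional_encodings(block_sizes):
    """Blocked PE: every token in round t's block receives the PE of index t.

    Tokens inside one block (the separator nodes of one incremental subgraph)
    are treated as an unordered set, so they all share one position; only the
    block index advances, encoding the chronological round order.
    """
    encodings = []
    for block_idx, size in enumerate(block_sizes):
        pe = sinusoidal_pe(block_idx)
        encodings.append([list(pe) for _ in range(size)])
    return encodings

# Three rounds with 4 tokens each: identical PE within a block, distinct across blocks.
blocks = blocked_positional_encodings([4, 4, 4])
```

This is exactly what gives permutation equivariance within a round while still exposing the round ordering to the decoder.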
Q4. Several notations appear before their first introduction.
A4.
We appreciate this feedback. We’ll move those definitions earlier for clarity.
Q5. The generalization of DynSep.
A5.
Thank you for raising this point. In our evaluation, DynSep policies are trained and tested within the same problem class.
- To evaluate cross-domain generalization, we train our policy on one problem family and apply the learned policy to unseen problem families. Specifically, we train 3 separate DynSep policies (on Knapsack, MIK, and Supportcase) and evaluate each of them across all nine benchmark families used in our manuscript. The table below lists the results, where we report solving time for easy and medium datasets and PD integral for hard datasets. As shown in the table, policies trained on one problem type yield improvements over SCIP’s default on most unseen benchmarks, indicating effective transfer of our learned configuration strategy across NP-hard families.
| SetCover | MIS | Knapsack | CORLAT | MIK | Anonymous | LB | Neos | Supportcase | |
|---|---|---|---|---|---|---|---|---|---|
| Time (s) | Time (s) | Time (s) | Time (s) | Time (s) | PD integral | PD integral | PD integral | PD integral | |
| Default | 5.24 | 30.4 | 2.01 | 111.55 | 16.65 | 27069 | 15187.19 | 18500.5 | 21561.09 |
| Train on Knapsack | 2.02 | 0.76 | / | 27.56 | 12.48 | 19183.3 | 4583.45 | 13343.48 | 13216.93 |
| Train on MIK | 7.74 | 5.21 | 1.2 | 34.54 | / | 20715.25 | 5559.96 | 13366.13 | 11583.1 |
| Train on supportcase | 2.21 | 0.67 | 0.78 | 20.94 | 163.6 | 17922.89 | 9421 | 13639.73 | / |
- To further demonstrate generalization, we also evaluated on real-world datasets. Although they come from the same application domain, geographic, temporal, and related variations induce significant distribution shift, thus cross-dataset performance could serve as a proxy for transfer to unseen problems.
As shown in the table below, our method (DynSep) delivers marked performance gains on two real‑world datasets (MIRP and SRPN) from the Distributional MIPLIB benchmark [1], providing evidence of its ability to generalize to unseen, distribution-shifted real-world instances.
| Instance | MIRP | MIRP | MIRP | SRPN | SRPN | SRPN |
|---|---|---|---|---|---|---|
| Metric | Time(s) | PD integral | PD gap | Time(s) | PD integral | PD gap |
| Default | 580.98 | 52362.46 | 5.38e+19 | 332.04 | 11687.01 | 0.21 |
| DynSep(Ours) | 482.13 | 30838.39 | 1.38 | 294.77 | 7581.16 | 0.1 |
Q6. How the GCN encoder is trained?
A6.
We use separate decoder-only Transformers for actor and critic. The GCN encoder is updated only during critic training (backpropagated with the critic loss); it is frozen during actor updates.
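The update rule above (encoder gradients flow only through the critic loss, never through the actor loss) can be sketched with a minimal, framework-agnostic toy; the classes and numbers below are purely illustrative, not our actual training code.

```python
class Module:
    """Toy parameter container with a freeze switch, mimicking how a
    shared encoder can be excluded from one head's gradient step."""
    def __init__(self, params):
        self.params = dict(params)
        self.frozen = False

    def apply_grads(self, grads, lr=0.1):
        if self.frozen:
            return  # frozen modules are skipped during this update
        for k, g in grads.items():
            self.params[k] -= lr * g

encoder = Module({"w": 1.0})  # shared GCN encoder (illustrative)
actor   = Module({"w": 1.0})
critic  = Module({"w": 1.0})

# Critic step: gradients flow into the encoder as well.
encoder.frozen = False
critic.apply_grads({"w": 0.5})
encoder.apply_grads({"w": 0.5})

# Actor step: the encoder is frozen, so only actor weights change.
encoder.frozen = True
actor.apply_grads({"w": 0.5})
encoder.apply_grads({"w": 0.5})  # no-op while frozen

print(encoder.params["w"])  # 0.95 — changed only by the critic step
```

In a deep-learning framework the same effect is typically achieved by disabling gradient tracking on the encoder parameters during actor updates.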
Q7. Can DynSep handle nonlinear or MINLP problems?
A7.
Thank you for the insightful comment. We believe DynSep can naturally extend to MINLPs because SCIP’s unified Constraint Integer Programming (CIP) framework handles both MILPs and MINLPs within the same branch‑cut‑and‑price architecture. When solving MINLPs, SCIP automatically applies nonlinear handlers and MINLP-specific separators, such as the RLT cut separator and the minor cut separator, to strengthen relaxations via valid cuts.
By incorporating MINLP‑specific separators into the action space, our instance‑aware configuration method could extend seamlessly to nonlinear problems. Meanwhile, we could enrich our state representation to capture nonlinear structure following the existing approach [2]. In other words, the core decision‑making paradigm—choosing which separators to fire at each node and when to stop—remains valid for both MILP and MINLP, making this extension both straightforward and feasible.
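To make the proposed extension concrete, the sketch below shows how the per-round action (a status in {+1, 0, -1} for each separator, plus the termination flag m_t) widens when MINLP-specific separators are appended. Separator names are an illustrative subset, and `make_action` is a hypothetical helper, not part of DynSep's actual interface.

```python
# Per-round action: a status for each separator
#   +1 = force on, 0 = solver default, -1 = off
# plus a binary stop flag m_t that terminates separation.
MILP_SEPARATORS = ["gomory", "clique", "flowcover", "zerohalf"]  # illustrative subset
MINLP_SEPARATORS = ["rlt", "minor"]  # SCIP's MINLP-specific cut separators

def make_action(statuses, stop):
    """Bundle separator statuses and the round-termination flag m_t."""
    assert all(s in (-1, 0, 1) for s in statuses.values())
    return {"statuses": statuses, "m_t": int(stop)}

# Extending the action space only widens the status vector:
separators = MILP_SEPARATORS + MINLP_SEPARATORS
action = make_action({s: 0 for s in separators}, stop=False)
print(len(action["statuses"]))  # → 6
```

The decision-making paradigm is unchanged: the policy still emits one status per separator and one stop decision per round, regardless of whether the separators target linear or nonlinear cuts.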
Q8. How does the blocked positional encoding impact performance?
Q9. Would alternative architectures like graph transformers improve separator modeling?
A8. & A9.
We merged the results for Q8 and Q9 into a single table for compactness and easier comparison.
- w/o BlockPE (for Q8): This row shows the results of replacing the blocked positional encoding with a standard token-level positional encoding. Its degraded performance relative to DynSep indicates that masking intra-block order information (i.e., using BlockPE) benefits training and improves final performance.
- GT (for Q9): This row shows the results of replacing the encoder’s GCN with a custom bipartite graph transformer (GT): a multi-head TransformerConv for edge-aware message passing, followed by residual-connected LayerNorm and a two-layer feed-forward block. "GT" shows no consistent improvement over DynSep, which may be due to increased inference overhead or the need for finer stability tuning to realize gains from transformer-style aggregation.
| Instance | Knapsack | Knapsack | Corlat | Corlat | neos | neos | supportcase | supportcase |
|---|---|---|---|---|---|---|---|---|
| Metric | Time(s) | PD integral | Time(s) | PD integral | Time(s) | PD integral | Time(s) | PD integral |
| w/o BlockPE | 0.93 | 11.12 | 34.4 | 2870.01 | 242.39 | 8291.49 | 153.29 | 12715.64 |
| GT | 0.71 | 11.02 | 46.26 | 4563.26 | 243.52 | 12134.57 | 150.38 | 9157.61 |
| DynSep | 0.52 | 9.71 | 22.96 | 2233.42 | 235.19 | 8511.58 | 132.50 | 9212.24 |
Q10. Typo in Ablation Study.
A10.
Thank you for pointing this out. We will keep all statements consistent.
[1] Huang, et al. "Distributional MIPLIB: a multi-domain library for advancing ml-guided milp methods." arXiv 2024.
[2] Tang, Bo, Elias Boutros Khalil, and Jan Drgona. "Learning to Optimize for Mixed-Integer Nonlinear Programming." (2024).
Thanks to the authors for their detailed rebuttal on the issues I raised.
The authors' response improved the clarity of the work. However, my major concern about the generalizability of the proposed method across different problem sets and distributional shift is only partially addressed. The authors show that it is capable of generalizing across problem types, but users still need to pick different trained models for their problems to obtain the performance gain. A truly generalizable approach should not depend on such handpicked ML models. Therefore the contribution is still limited. Unless the authors can justify the real-world merit of manually selecting ML models for particular problem sets, I will maintain my current rating.
Dear Reviewer ZkPt,
It appears that you have not yet replied to the authors' rebuttal for Submission 22899, and the discussion period (July 31–August 6) is halfway over. Please get involved as soon as you can to give us time for a fruitful conversation.
In particular, please share whether their responses adequately address your concerns, especially regarding the technical clarity and the generalization of the proposed method.
Lastly, don’t forget to click the “Mandatory Acknowledgement” button to confirm you’ve participated.
Thank you for your continued contributions to the review process.
Best, AC
Thank you for raising the concerns about our generalization study. Below, we’ve improved our generalization study and detailed the real-world merits of domain-specific models.
- Improved generalization study for our method.
We select diverse instances from the MIPLIB 2017 benchmark [1] as our training set and learn a separator configuration policy on these instances. Notably, because the MIPLIB 2017 instances cover a wide variety of problem types and mixed-scenario structures [1], this learned policy could serve as a general configuration model. We then evaluate it—without any additional tuning—on four unseen, domain-specific datasets: Corlat, Load Balancing (LB), Maritime Inventory Routing Problem (MIRP), and Seismic‑Resilient Pipe Network Planning (SRPN).
As shown in Table 1, the general configuration model consistently outperforms the solver’s default settings in both solve time and convergence behavior, demonstrating good generalizability of our method.
Table 1. Generalization performance of our DynSep model trained on MIPLIB 2017, evaluated on four unseen MILP scenarios. (300-second time limit for Corlat & LB; 600-second for MIRP & SRPN)
| | Corlat | Corlat | LB | LB | MIRP | MIRP | SRPN | SRPN |
|---|---|---|---|---|---|---|---|---|
| | Time (s) | PD integral | Time (s) | PD integral | Time (s) | PD integral | Time (s) | PD integral |
| Default | 111.55 | 10573.14 | 300.14 | 15187.19 | 580.98 | 52362.46 | 332.04 | 11687.01 |
| DynSep (ours) trained on miplib2017 | 46.05 | 4289.33 | 300.08 | 4792.71 | 487.59 | 33193.25 | 288.48 | 7582.60 |
- Real-world merits of domain-specific models.
Although a domain-general model reduces the workload of manually picking or retraining for each new scenario, the structural diversity of MILP instances makes domain-specific models considerably more competitive, especially in computationally intensive industrial settings. There, domain-specific models play a pivotal role in accelerating the solving process and enhancing solution quality within a time limit for particular problem sets. For example,
- By applying Gurobi’s parameter-optimization model to 97 quadratic knapsack instances from Beasley’s OR-Library, the solver highlights five key parameters and achieves up to 7‑fold runtime improvements [2].
- SMAC, a configuration method applied to CPLEX’s 76 expert parameters, delivers up to 50‑fold runtime improvements over default settings on domain‑specific tasks such as combinatorial auction and knapsack problems [3].
- Configuration approaches such as ParamILS [4] and irace [5] have shown significant, problem-specific speed-ups on real-world wildlife-corridor design problems.
These case studies illustrate that scenario-aware tuning could incur minimal overhead while delivering critical latency reductions in real-world scenarios.
- Clarification of our original generalization results.
In the across-domain evaluation from our original rebuttal, each of the three trained policies outperforms the default configuration on the vast majority of problem families (see Table 2). Consequently, any single trained policy can deliver reliable speed-ups across all tested datasets, eliminating the need for a model-selection step.
Table 2. Relative improvement over the default solver settings in solving time (Time) or primal-dual integral (PD integral), for three models trained on different problem families (Knapsack, MIK, Supportcase).
| | SetCover | MIS | Knapsack | Corlat | MIK | Anonymous | LB | Neos | Supportcase | Average improvement |
|---|---|---|---|---|---|---|---|---|---|---|
| | Time (%) | Time (%) | Time (%) | Time (%) | Time (%) | PD integral (%) | PD integral (%) | PD integral (%) | PD integral (%) | |
| Train on Knapsack | +61.45 | +97.50 | / | +75.29 | +25.00 | +29.13 | +69.82 | +27.88 | +38.70 | +53.10% |
| Train on MIK | -32.30 | +82.86 | +40.30 | +68.14 | / | +23.47 | +63.39 | +27.75 | +46.28 | +39.99% |
| Train on Supportcase | +57.82 | +97.80 | +61.19 | +81.23 | -89.82 | +33.79 | +37.97 | +26.27 | / | +38.25% |
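The percentages in Table 2 follow directly from the raw values in the earlier across-domain table: relative improvement over the default is `(default - ours) / default * 100`. A quick sketch (using values from that table for the policy trained on Knapsack) reproduces the reported entries:

```python
def rel_improvement(default, ours):
    """Relative improvement over the default configuration, in percent.
    Positive means the learned policy is faster / has a lower PD integral."""
    return round((default - ours) / default * 100, 2)

# Raw values from the across-domain table, policy trained on Knapsack:
print(rel_improvement(5.24, 2.02))       # SetCover time → 61.45
print(rel_improvement(30.4, 0.76))       # MIS time → 97.5
print(rel_improvement(27069, 19183.3))   # Anonymous PD integral → 29.13
```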
[1] Gleixner, et al. "MIPLIB 2017: data-driven compilation of the 6th mixed-integer programming library." Mathematical Programming Computation(2021).
[2] Rando, D., et al. "The importance of fine-tuning Gurobi parameters when solving quadratic knapsack problems: A guide for OR practitioners." Pure and New Mathematics in AI (2024).
[3] Hutter, Frank, Holger H. Hoos, and Kevin Leyton-Brown. "Sequential model-based optimization for general algorithm configuration." International conference on learning and intelligent optimization (2011).
[4] Hutter, Frank, et al. "ParamILS: an automatic algorithm configuration framework." Journal of artificial intelligence research 36 (2009).
[5] Pérez Cáceres, L., et al. "An experimental study of adaptive capping in irace: Supplementary material." (2017).
This paper introduces DynSep, a reinforcement learning-based framework aimed at enhancing a backbone MILP solver through dynamic separator configuration. The approach integrates an incremental subgraph-based state representation into a transformer policy model, which autoregressively predicts both when to terminate separation and which separators to activate during branch-and-cut. Extensive empirical evaluation demonstrates that the method significantly accelerates the SCIP solver across diverse datasets, while also exhibiting strong generalization to unseen MILP instances.
There is a clear consensus among the reviewers that the proposed method is technically sound, effective, and delivers notable performance improvements. The extensive experimental results, generalization capability, and the inclusion of interpretability analysis are all strengths of the paper. Most of the key claims are well substantiated through ablation studies. No critical flaws are found. However, the main limitation lies in the lack of study on the broader applicability of the method, particularly its applicability to commercial solvers such as Gurobi, and a wider range of branch-and-cut parameters. A further concern is the limited discussion on the effectiveness of the incremental triplet graph compared to full-graph encoding. While the paper attributes its use to improved inference speed, ablation results show that it also improves performance. The authors are encouraged to revise the paper to include a more detailed analysis explaining this.
Despite these cons, the reviewers agree that the paper makes a meaningful contribution to the field. The authors' rebuttal addressed the technical and empirical concerns effectively, providing additional evaluations that further validate the approach. Following the discussion, all reviewers recommend acceptance. The required revisions are expected to be manageable in the camera-ready version. I am happy to recommend acceptance, provided that the camera-ready version includes the promised experiments and adequately addresses the above weaknesses.