Neural Solver Selection for Combinatorial Optimization
We demonstrate that neural combinatorial optimization solvers often exhibit complementary strengths and propose a general selection framework to coordinate multiple solvers for improved performance.
Abstract
Reviews and Discussion
This paper proposes an ensemble framework that selects appropriate neural solvers from a solver pool for each instance to solve. The framework includes a feature extraction step to extract instance-level features. Based on these features, a selection model, alongside several selection strategies, is proposed to select dedicated neural solvers for the corresponding instances. Extensive experiments on TSP and CVRP, from small to large scales, show that this framework improves the overall performance of current state-of-the-art neural solvers.
Questions For Authors
See weaknesses.
Claims And Evidence
Yes, the claims made in the submission are supported by clear and convincing evidence.
Methods And Evaluation Criteria
Yes, it does make sense.
Theoretical Claims
There are no theoretical claims.
Experimental Designs Or Analyses
I have checked the experimental design in Section 4.1, the experimental analyses in Section 4.2, and some discussions in Section 5. They are sound.
Supplementary Material
I have read the appendix, including the ablation study (RQ3, A.5, and A.8) and the hyperparameter study (A.9), and so on.
Relation To Broader Scientific Literature
This paper combines multiple advanced NCO models in an ensemble manner at the instance level to solve VRPs, improving on current state-of-the-art performance.
Essential References Not Discussed
To my knowledge, no related works that are essential to understanding are missing.
Other Strengths And Weaknesses
Strengths:
(1) This paper is well-motivated and easy to understand.
(2) The method is simple yet efficient, improving performance on top of advanced neural solvers in both small-scale (Table 1) and large-scale (Table 9) scenarios.
(3) The ablation study (RQ3, A.5, and A.8) and hyperparameter study (A.9) are detailed, providing a clear illustration of the effect of different components and hyperparameters.
Weaknesses:
(1) For DIFUSCO and T2T, this paper collects the models trained on both the N = 100 and N = 500 datasets, while only a single model is used for each of the other methods, which may make it inappropriate to rank the methods in Table 1 on TSP.
(2) This work primarily reports the summary results on diverse problem instances with significantly varying distributions and scales, while the detailed performance on instances with specific characteristics (e.g., instances with the same scale) is less analyzed. This may restrict the scope of performance evaluation for the proposed method.
(3) An implicit assumption in this paper is that computational resources are constrained, requiring neural solvers to operate sequentially. This assumption highlights the runtime efficiency of solver selection. While I acknowledge the high cost of computational resources and generally appreciate the contributions of the proposed selection framework, the scenario of adequate computational resources is also possible and important, and it should be discussed in the paper.
Other Comments Or Suggestions
The structure of the selection model could be visualized for better understanding.
Thank you for your positive review. We sincerely appreciate your recognition of the significance of our neural solver selection framework. Here are the detailed responses to your questions.
- In Other Strengths And Weaknesses, weakness 1: "the number of collected models is not strictly consistent across different methods".
Thank you for raising this concern. To demonstrate the effectiveness of our neural solver selection framework, we directly collect models of different methods from their released repositories and construct the solver pool according to the process introduced in Appendix A.3, where redundant solvers are removed. Because some works have released multiple models (e.g., models trained on different scales of data, or models trained with different hyper-parameters), we performed a simple pre-selection before collecting them: we keep the models trained on different scales of data, since they usually show complementary performance, whereas for models trained with different hyper-parameters, we only select the best one. Note that Table 1 is not aimed at ranking the previous methods but simply at demonstrating the performance of typical collected single models. We will revise the paper to clarify this. Thank you for your advice.
- In Other Strengths And Weaknesses, weakness 2: "performance on instances of specific scales is expected".
Thank you for your suggestion. We will revise our paper to provide separate results on different scales of instances for a deeper investigation. The following tables demonstrate that our selection method consistently outperforms the single best solver across different problem scales on both TSP and CVRP datasets.
Separate results according to problem scale on the synthetic TSP dataset. We report the mean (standard deviation) optimality gap over five independent runs.
| Methods | | | | |
|---|---|---|---|---|
| Single best solver | 0.96% | 2.34% | 2.78% | 2.98% |
| Oracle | 0.39% | 1.19% | 1.70% | 2.18% |
| Ours (Greedy) | 0.84% (0.03%) | 2.01% (0.02%) | 2.43% (0.02%) | 2.71% (0.03%) |
| Ours (Top-k, k=2) | 0.61% (0.02%) | 1.53% (0.03%) | 1.99% (0.03%) | 2.41% (0.05%) |
| Ours (Rejection, 20%) | 0.75% (0.04%) | 1.86% (0.04%) | 2.33% (0.03%) | 2.62% (0.02%) |
| Ours (Top-p, p=0.5) | 0.71% (0.02%) | 1.70% (0.02%) | 2.24% (0.04%) | 2.57% (0.04%) |
Separate results according to problem scale on the synthetic CVRP dataset.
| Methods | | | | |
|---|---|---|---|---|
| Single best solver | 3.95% | 6.06% | 7.76% | 9.24% |
| Oracle | 2.17% | 4.33% | 5.74% | 7.40% |
| Ours (Greedy) | 2.85% (0.03%) | 4.87% (0.02%) | 6.47% (0.05%) | 8.09% (0.01%) |
| Ours (Top-k, k=2) | 2.32% (0.02%) | 4.54% (0.02%) | 5.91% (0.03%) | 7.55% (0.03%) |
| Ours (Rejection, 20%) | 2.64% (0.02%) | 4.70% (0.03%) | 6.22% (0.02%) | 7.91% (0.03%) |
| Ours (Top-p, p=0.8) | 2.36% (0.02%) | 4.70% (0.04%) | 6.21% (0.05%) | 7.81% (0.02%) |
- In Other Strengths And Weaknesses, weakness 3: "the implicit assumption of this paper is that computational resources are constrained, requiring neural solvers to operate sequentially".
Thank you for raising this suggestion. Take VRPs, one of the most popular classes of combinatorial problems, as an example: the number of automatic routing requests every day can be extremely large. Compared to running all of the solvers in parallel, even though the cost saved by neural solver selection might be limited when handling a single request, the total cost saved across all requests can be considerable. We will revise the paper to add more discussion clarifying this benefit. Thank you very much.
- In Other Comments Or Suggestions: "The structure of the selection model can be visualized".
Thank you for your suggestion. As introduced in our paper, we currently use a simple MLP as the selection model: the instance features are the input, and each head of the output layer represents the logit for selecting a specific solver. Since this is a simple model (which already works well), we did not visualize it due to limited space and instead described it directly in Section 3.2 of our paper. We will revise the paper to clarify this. Thank you.
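For concreteness, a minimal PyTorch sketch of this kind of selection model (the dimensions and names below are illustrative placeholders, not our actual configuration):

```python
import torch
import torch.nn as nn

class SolverSelector(nn.Module):
    """MLP mapping instance features to one selection logit per solver."""
    def __init__(self, feature_dim: int, num_solvers: int, hidden_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_solvers),  # output head: logits over the solver pool
        )

    def forward(self, instance_features: torch.Tensor) -> torch.Tensor:
        return self.net(instance_features)  # shape: (batch, num_solvers)

# Greedy strategy: run the solver with the highest logit on each instance.
selector = SolverSelector(feature_dim=64, num_solvers=8)
chosen = selector(torch.randn(16, 64)).argmax(dim=-1)  # one solver index per instance
```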
Thanks for the detailed response, which addresses my main concerns well. I will raise the score to 4 accordingly.
Thank you very much for taking the time to review our paper. We are pleased to hear that our responses have addressed your main concerns. We sincerely appreciate your thoughtful comments and constructive suggestions. Following your suggestions, we will carefully revise our paper to include the additional results, elaborate on the practical benefits of solver selection, and provide more details on how the candidate models are chosen.
The authors propose to train a neural network that selects the most appropriate solver to use on a given instance. They test their method on TSP and CVRP problems using a pool of state-of-the-art neural solvers. This solver selection is effective and allows a better tradeoff between computation time and optimality. Multiple modelling approaches are explored.
Questions For Authors
- What are the parameter counts of the different feature extractors? It might explain the difference in performance.
- The hierarchical graph encoder seems a bit complex. Have you tried simpler pooling approaches that keep the hierarchical aspect? I would expect a simpler model to work just as well.
- The citation of Velickovic et al., 2018, for feature extraction, may be inaccurate, since their architecture (graph attention network) is different from the transformer one you use. Do you agree?
- When you mention the No-Free-Lunch Theorem, this also affects your overall selection procedure. From my understanding, your selection tends to guarantee that the resulting solver will typically be as good as the best one in the pool, but one could argue the existence of a distribution that fools its selection. Can you comment on this?
- How is the performance on TSPLIB and CVRPLIB affected from changing the synthetic distribution (e.g. varying the number of components c, the scale of the instances, or removing the covariances and considering the classical identity covariance matrix)?
- Why didn't you consider larger instances?
Claims And Evidence
The claims are supported by comparisons between two baselines:
- The oracle which knows for each instance which solver to use
- The single overall best solver.
They close the gap between the oracle and the single overall best solver by 77% on average while using only two solvers instead of the full pool required by the oracle. Other selection strategies are tested, and they are all strictly better than the single overall best solver in terms of optimality and computation time.
They also present ablation studies on the choice of the loss function and the feature extraction method. They compare the performance of a classification loss with a ranking loss; the ranking loss is shown to be better overall. They compare two learnable feature extractors and show that the hierarchical one generalizes better to new and bigger instances.
The trained models are evaluated on TSPLIB and a subset of CVRPLIB, on relatively small graphs only. It would have been nice to see how the model selection behaves on bigger instances with up to 10,000 nodes.
Methods And Evaluation Criteria
The method is well designed. While the hierarchical feature extractor is a bit complicated, it shows better adaptivity to bigger instances than the more standard graph attention encoder. The selection strategies are the ones you would expect, and the evaluation criteria are appropriate.
Theoretical Claims
There is no theoretical claim in this work.
Experimental Designs Or Analyses
The final method achieves a nice tradeoff between optimality and computation time. This is what we expect from such an approach, and the results are sound.
Supplementary Material
The appendix gives more precise information about the implementation and shares additional results. One interesting comment is about handling new solvers that the model has not been trained on. While the method is still in its early stages, it is an attractive approach to handling a dynamic pool of solvers. More detailed ablations can be found in the appendix.
Relation To Broader Scientific Literature
The idea of selecting which solver to use for a given instance has been explored in the past, but usually the model selection is done with metaheuristic approaches or machine learning techniques other than neural networks. In this work, the proposed model selection is done with neural networks, and features are extracted directly from the raw instances, leading to less bias and potentially better performance.
Essential References Not Discussed
To the best of my knowledge, all essential references are cited.
Other Strengths And Weaknesses
While selecting the right solver is not new, the idea of using a fully neural approach is interesting, and not limited to neural solvers.
It seems that the loss function is not so important here. The authors could have kept the ranking loss function and explored the design of the selection strategies further.
Other Comments Or Suggestions
- Numbers are not percentages in Fig 1, y axis.
- L302: extra comma
- Table 3: "encdoer"
- L434: "seach"
Thank you for your positive review and constructive comments. Below please find our responses. Corresponding experimental results can be found at link.
R1: How the model selection behaves on larger instances with up to 10,000 nodes
Thanks for your insightful comment. Scaling neural solvers to very large instances (e.g., 10,000+ nodes) is a challenging and active research area. Current neural solvers often struggle with such scales and extracting meaningful instance features from such large-scale instances also poses additional challenges.
As this paper is the first to explore solver selection in NCO, we focused on commonly studied instances (fewer than 1,000 nodes). Nevertheless, we also extended to 2,000 nodes in Appendix A.11 using additional solvers such as GLOP (Ye et al., 2024) and UDC (Zhang et al., 2024), and demonstrated that our proposed neural solver selection framework remains effective in this setting.
We believe that our framework holds potential for very large-scale instances (e.g., 10,000+ nodes) with the inclusion of an expanded solver pool and the development of advanced feature extraction methods. This represents an exciting avenue for future research. We will revise to include more discussions on this topic. Thank you again for your valuable feedback.
R2: The parameter counts of the different feature extractors
For a fair comparison, we adjusted the depth of the hierarchical encoder to match the naive graph encoder in our experiments. Specifically, the naive graph encoder uses 4 attention layers, while the hierarchical encoder has 2 blocks, each with an attention layer and an attention-based pooling layer. Their parameter counts are approximately the same, so it is more reasonable to attribute the improvements to the hierarchical design.
R3: The difference from graph attention
When graph attention is used on fully connected graphs like TSP, it works just like a Transformer without positional encoding.
R4: The hierarchical graph encoder seems a bit complex; try simpler pooling approaches that keep the hierarchical aspect
Thanks for your insightful comment. Our pooling method involves three steps: (a) structure-aware embedding computation via attention, (b) importance score calculation with a linear layer, and (c) top-k node selection and embedding updates. Although this process may seem somewhat complex, each step is conceptually grounded. Following your suggestion, we tested a simpler pooling mechanism that skips step (a) and computes importance scores directly from the node embeddings. While it still benefits from the hierarchical design, this simpler method shows slightly degraded performance, as shown in Table S1. We agree that the hierarchical design plays a more critical role than the specific pooling implementation; however, it is worth noting that advanced pooling implementations, such as the attention-based approach, also provide gains.
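For illustration, a minimal sketch of one such pooling block following steps (a)-(c), with illustrative layer sizes rather than our exact implementation:

```python
import torch
import torch.nn as nn

class AttentionPoolingBlock(nn.Module):
    """One hierarchical pooling block: attention -> importance scores -> top-k."""
    def __init__(self, dim: int, num_heads: int = 4, ratio: float = 0.5):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.score = nn.Linear(dim, 1)  # step (b): importance score per node
        self.ratio = ratio

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, _ = self.attn(x, x, x)            # step (a): structure-aware embeddings
        s = self.score(h).squeeze(-1)        # (batch, n) importance scores
        k = max(1, int(self.ratio * x.size(1)))
        idx = s.topk(k, dim=-1).indices      # step (c): keep the top-k nodes
        gate = torch.sigmoid(s.gather(-1, idx)).unsqueeze(-1)
        h_sel = h.gather(1, idx.unsqueeze(-1).expand(-1, -1, h.size(-1)))
        return h_sel * gate                  # coarsened node embeddings

# e.g., two stacked blocks with ratio 0.5 reduce a 100-node instance to 25 nodes.
```

The simpler variant mentioned above corresponds to computing `self.score(x)` directly, i.e., skipping step (a).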
R5: Discussion on the NFL Theorem and performance under changes to the synthetic distribution
Thanks for your interesting question! When the distribution shifts significantly, the performance of the selector may degrade substantially, i.e., the selector can be fooled by the training distribution. This challenge, commonly referred to as the out-of-distribution (OOD) problem, is inherent to all machine learning methods. To mitigate this issue, we have made several attempts in our methodology design: 1) a ranking loss, which leverages the relationships among all solvers, thereby making the selection more robust; 2) top-k selection, which increases the likelihood of including effective solvers, particularly under distribution shifts; 3) a hierarchical encoder, which extracts transferable patterns; and 4) diverse training data. As you described in Question 5, we have made many modifications to diversify the synthetic data. Without such diversity, the selector may easily overfit the training data; it cannot generalize well to TSPLIB and CVRPLIB if the training distribution is too simple, such as training only on uniform datasets.
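To make points 1) and 2) concrete, here is an illustrative sketch of a pairwise ranking loss and top-k selection, assuming `gaps` stores each solver's optimality gap per training instance (our illustration, not our exact formulation):

```python
import torch
import torch.nn.functional as F

def pairwise_ranking_loss(logits: torch.Tensor, gaps: torch.Tensor) -> torch.Tensor:
    """Push logits[i] above logits[j] whenever solver i has a smaller gap than j."""
    diff = logits.unsqueeze(-1) - logits.unsqueeze(-2)          # (batch, m, m)
    better = (gaps.unsqueeze(-1) < gaps.unsqueeze(-2)).float()  # 1 where i beats j
    return (better * F.softplus(-diff)).sum() / better.sum().clamp(min=1)

def top_k_select(logits: torch.Tensor, k: int = 2) -> torch.Tensor:
    """Return indices of the k highest-scoring solvers; the best solution is kept."""
    return logits.topk(k, dim=-1).indices
```

Unlike a plain classification loss, which only supervises the single best solver, this loss orders the whole pool, so near-best solvers still receive high scores; top-k selection then hedges against a wrong top-1 choice under distribution shift.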
From a broader perspective, the development of strong neural solvers can benefit significantly from combining multiple solvers with a selection model. Compared to training a single neural solver to handle all possible problem distributions, training a selection model is more tractable for addressing the OOD challenge, as it only needs to identify patterns within instances rather than solve the problems directly. For this reason, we believe that selection-based methods represent a promising direction for advancing neural solvers, even though they also face OOD challenges.
In response to Question 5, we conducted new experiments using smaller-scale or simplified training data (see Table S2). The results indicate that generalization on TSPLIB degrades as training data becomes simpler.
Thank you for your rebuttal, which clarifies many points and adds valuable information to your work. I don't have further questions. I'm still leaning toward weak acceptance, but this may change depending on the discussion with the other reviewers.
Thank you very much for taking the time to review our paper. We are pleased to hear that our responses have clarified many points. We sincerely appreciate your insightful comments and constructive suggestions. Accordingly, we will carefully revise our paper to incorporate the additional experimental results and include the discussions on extending to larger instances, the design of pooling layers, and the approaches to address the OOD challenge.
Besides, we are also happy to know that you recognize our efforts in exploring neural solver feature extraction. We fully agree that leveraging solver features to manage a dynamic solver pool presents a promising research direction. In our future work, we plan to delve deeper into this line of research and explore more advanced methods to further enhance the selection framework.
The paper proposes a framework to coordinate neural solvers for combinatorial optimization problems (COPs), addressing the complementary performance of individual solvers across instances. It introduces a three-component framework: (1) feature extraction using graph attention networks or hierarchical encoders, (2) a selection model trained via classification or ranking losses, and (3) selection strategies (e.g., top-k, rejection-based) to balance performance and efficiency. Experiments demonstrate that the framework reduces optimality gaps over state-of-the-art individual solvers on synthetic and real-world benchmarks. The results highlight the benefits of coordinating diverse neural solvers, particularly under distribution shifts and larger problem scales.
Update after rebuttal
I acknowledge the author’s responses to the questions raised and recommend a weak accept for this paper.
Questions For Authors
- This paper claims a general solver selection framework. However, it seems that for each type of COP, we need to carefully design a feature extraction component. Can you explain how to apply this framework to other COPs besides TSP and CVRP? Does a more general approach exist?
Claims And Evidence
The claims are well supported by clear and convincing evidence.
Methods And Evaluation Criteria
Yes, the evaluation criteria are reasonable for learning-based methods for COPs.
Theoretical Claims
This is an empirical work and does not have any theoretical claims that need to be checked.
Experimental Designs Or Analyses
It would be better to compare the proposed learning-based selection method with some traditional selection methods. Although they are presented in Appendix A.5, I think it would be better to put them in the experiments section to highlight the effectiveness of the proposed method.
Supplementary Material
No.
Relation To Broader Scientific Literature
This paper applies the algorithm selection to neural solvers for COPs and shows promising results.
Essential References Not Discussed
I think the related works are well discussed.
Other Strengths And Weaknesses
Strengths:
- The experiments are solid and clearly show the superiority of the learning-based solver selection method.
- The presentation of the paper is well organized.
Weaknesses:
- The link between the selection method and the architecture of COPs is unclear. It seems that this paper only applies a general selection method to neural solvers for COPs.
Other Comments Or Suggestions
In the references, it seems that there is a typo in “Learning to aolve large-scale TSP instances”.
Thank you for your valuable and encouraging comments. We sincerely appreciate your agreement on the effectiveness of the neural solver selection framework in our paper, which, we believe, has the potential to be a new branch of techniques for the application of NCO solvers. We summarize the concerns in your review. Here are the detailed responses.
- In Experimental Designs Or Analyses, about "putting the comparison with traditional selection methods from Appendix A.5 into the main paper".
Thank you for your suggestion. Limited by space, the current version of our paper only included the results of a typical traditional method (Smith-Miles et al., 2010), labeled "Manual" in Table 3, and left the implementation details and the results of other methods to Appendix A.4 and Appendix A.5, respectively. We will revise our paper to clarify this better.
- In Weaknesses: "It seems that this paper only applies a general selection method to neural solvers for COPs".
Thank you for raising this concern. The general idea of solver selection (or algorithm selection) has been implemented in a variety of scenarios. However, how to adapt it to the context of neural solvers for COPs had never been explored before our work. Note that neural solvers themselves are quite different from traditional COP solvers: they use neural network models (such as Transformers or diffusion models) as backbones and generate solutions in an end-to-end manner. Our work first investigated the instance-level performance of prevailing NCO solvers and found that they demonstrate clear complementarity. We then experimentally revealed that a change in the problem distribution can change the dominance relationships among solvers. These two observations verify the potential benefit of neural solver selection for COPs. Inspired by them, we propose the first neural solver selection framework for COPs, which is the main contribution of this work and which we believe can benefit the community and inspire future research in this area.
On the other hand, the implementation of our neural solver selection framework has some advanced components. For example, we propose a new method of extracting instance features for NCO, which differs from previous work on classical algorithm selection for TSP. We first examined the manual features proposed in classical algorithm selection for TSP (https://tspalgsel.github.io/) and found that they achieve only limited performance (Table 3 of the original paper). To address this, we proposed a novel pooling-based hierarchical encoder designed to extract richer instance features, leading to significantly better generalization performance. We believe that such an instance feature extraction method may also help improve other NCO methods, not limited to our neural solver selection framework.
Thank you again for your thoughtful comments. We sincerely hope the above clarification can address your concern.
- Question 1 in Questions For Authors: "how to apply this framework to other COPs".
Thank you for raising this concern. Our paper aims to demonstrate the great potential of solver selection in the context of NCO (through the observations in Figure 1(a)-(b)) and to propose a general neural solver selection framework consisting of three components: feature extraction, a selection model, and a selection strategy. We believe the methods in the selection model and selection strategy can be easily adapted to a very wide spectrum of COPs. On the other hand, different kinds of COPs may possess different inherent characteristics, making it very challenging to obtain a fully general feature extraction method. In this work, we focus on TSP and CVRP and propose to use graph attention for feature extraction. For other kinds of COPs, model structures may need to be specifically designed for feature extraction, e.g., MatNet (Kwon et al., 2021). General feature extraction methods (e.g., utilizing LLMs) are also interesting directions for future work. Besides, we propose hierarchical pooling in feature extraction and show its effectiveness; we believe the idea behind hierarchical pooling can benefit the design of feature extraction methods for other COPs. Thank you again for your suggestions. We will revise our paper to include more discussion.
Thank you again for dedicating your time to reviewing our paper. We also welcome any further questions and discussions.
Thanks for providing further details. My decision to assign a weak acceptance to this paper still holds.
Thank you very much for taking the time to review our paper. We sincerely appreciate your insightful comments and constructive suggestions. Following your suggestions, we will carefully revise our paper to provide more details about how our method differs from traditional selection approaches and how our framework can be extended to a broader range of COPs.
This submission introduces a framework for intelligently coordinating multiple neural solvers to tackle combinatorial optimization problems (COPs). The core idea involves feature extraction from problem instances, training a selection model to identify the most suitable solver, and employing robust selection strategies to balance performance and efficiency. The framework's components include extracting features that characterize problem instances, a model to select the optimal solver, and a method that selects one or more solvers to address the problem.
Questions For Authors
na
Claims And Evidence
The submission’s claims are well supported by experimental results. For example, the reason the authors developed a multi-solver selection framework is that there exists no single neural solver that dominates all others on every instance, and instance distribution shifts can also significantly affect the solvers’ performance relationship. These claims are supported by Figure 1. In addition, the paper claims the proposed framework has achieved significantly better results than individual solvers. The claim is again supported by experimental results: the framework reduces the optimality gap by 0.82% on synthetic TSP, 2.00% on synthetic CVRP, 0.88% on TSPLIB, and 0.71% on CVRPLIB Set-X compared to the best individual solver.
Methods And Evaluation Criteria
The problem is stated in the introduction of the submission: no individual solver is dominantly better at solving all instances. To this end, the paper proposes a solver selection framework that incorporates feature extraction, a solver selection model, and selection strategies, dedicated to selecting the best few solvers to solve COP instances. The benchmarks include the best individual solvers and solver portfolios, which are good baselines for comparison. The evaluation also considers generalization: the framework is tested on out-of-distribution datasets (TSPLIB and CVRPLIB Set-X) and larger-scale instances. Overall, the problem, the proposed methods, and the selected benchmarks remain consistent throughout the submission and make sense for the problem.
Theoretical Claims
na
Experimental Designs Or Analyses
Experiments on Traveling Salesman and Capacitated Vehicle Routing Problems demonstrate the framework's effectiveness in selecting appropriate solvers. The framework leads to improved solution quality and comparable time consumption compared to using the best individual neural solver alone. The work also explores future research directions such as incorporating solver features, addressing runtime awareness, and enhancing the collection of neural solvers.
Supplementary Material
Yes, Appendix A.7, based on the authors' response.
Relation To Broader Scientific Literature
na
Essential References Not Discussed
na
Other Strengths And Weaknesses
I generally like the idea proposed in this paper, where each instance can be assigned to the most appropriate solver. However, the technical novelty is limited: the authors just use an MLP to calculate the compatibility scores of neural solvers. There is still room for improvement.
The authors are also advised to improve their presentation, especially Figure 1. In addition, the authors could describe how the selected solvers work cooperatively to solve an instance, which could be an essential part of the framework.
Other Comments Or Suggestions
na
Thank you for taking the time to review our paper. We are delighted to learn that you generally appreciate the core idea of our work, and we sincerely value your insightful comments. However, we believe there may be some misunderstandings regarding certain aspects of the paper, which we would like to clarify.
We first want to emphasize that our main contribution is proposing neural solver selection in the community of Neural Combinatorial Optimization (NCO) for the first time and showing its effectiveness even with a very straightforward implementation. Inspired by the No-Free-Lunch theorem, we investigated the instance-level performance of prevailing NCO solvers and found that they demonstrate clear complementarity. This phenomenon emphasizes the potential of combining the advantages of state-of-the-art neural solvers and motivates our proposal of adaptively selecting suitable solvers for each instance. Since our work is supposed to be a pioneering attempt at neural solver selection, our main goal is to verify the possibility and benefits of solver selection for NCO. In our experiments, we found that even a straightforward method using hand-crafted features and MLP classification models can outperform the state-of-the-art neural solver, which strongly indicates that combining multiple neural solvers through solver selection is a promising direction for NCO. We believe our work can benefit the NCO community and inspire future research in this area.
In response to your concerns regarding technical novelty, we would like to highlight two key advancements in our work:
- Handling a dynamic neural solver pool through instance-solver matching. In Appendix A.7, we explore a novel feature extraction method for neural solvers, where an instance tokenizer and a two-layer transformer are utilized to summarize each neural solver's features from representative instances (i.e., instances on which that solver performs well). Based on these learned features, we train a matching network to compute compatibility scores between instance features and solver features (see the sketch after this list). This architecture, detailed in Appendix A.7, is significantly more sophisticated than the MLP classifier used in our main experiments. Importantly, this instance-solver matching mechanism allows the framework to generalize to previously unseen solvers, enabling it to handle a dynamic solver pool flexibly. Reviewer Qe6z recognized this aspect as "interesting." Moreover, we highlight that our work is the first to introduce the method of leveraging representative instances to extract solver features, further emphasizing its technical originality. Since your decision may have been made without reviewing the appendix in detail, we kindly refer you to Appendix A.7 for additional information.
- Instance feature extraction through a hierarchical encoder. While hierarchical encoding for COPs has been explored in prior work (e.g., Goh et al., 2024), existing methods primarily rely on cluster-based embedding aggregation. In contrast, our approach employs a pooling-based network design, which offers greater flexibility in identifying important but non-central nodes (as illustrated in Figure 6). This flexibility enhances the model's ability to capture richer local patterns, thereby improving generalization. Compared to cluster-based aggregation, our method provides a novel and more robust solution for hierarchical feature extraction.
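As a rough illustration of the instance-solver matching idea in the first point above (a sketch with assumed dimensions and mean pooling, not the Appendix A.7 architecture verbatim):

```python
import torch
import torch.nn as nn

class SolverMatcher(nn.Module):
    """Summarize a solver from its representative instances and score compatibility."""
    def __init__(self, dim: int = 128, num_heads: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads, batch_first=True)
        self.summarizer = nn.TransformerEncoder(layer, num_layers=2)

    def solver_embedding(self, rep_tokens: torch.Tensor) -> torch.Tensor:
        # rep_tokens: (num_rep, dim) tokenized instances on which this solver excels
        return self.summarizer(rep_tokens.unsqueeze(0)).mean(dim=1)  # (1, dim)

    def compatibility(self, inst_feat: torch.Tensor, solver_feats: torch.Tensor) -> torch.Tensor:
        # Dot-product scores between instances (batch, dim) and solvers (m, dim).
        return inst_feat @ solver_feats.t()
```

Because solver features are derived from representative instances rather than fixed output heads, adding a new solver to the pool only requires computing its embedding, not retraining the selector.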
We hope these explanations can address your concerns regarding the technical contributions of our work. Additionally, we acknowledge your suggestion to improve the presentation and will carefully revise the paper accordingly. For example, we will enhance Figure 1 to include more details, such as the process of running the selected subset of solvers and obtaining the best solution from their results.
Once again, we greatly appreciate your time and thoughtful feedback. Please do not hesitate to reach out if you have further questions or suggestions.
This paper has the following identified issues:
- Limited Technical Novelty
- Generalization to Other COPs
- Evaluation Scope
- Baseline Consistency and Fairness
However, the following points are positively evaluated, and considering these factors comprehensively, the recommendation is Weak Accept:
- Novel and impactful contribution to the NCO field
- Demonstrates strong improvements in optimality gaps and robustness
- Modular and general design with potential beyond TSP/CVRP