A near linear query lower bound for submodular maximization
Abstract
Reviews and Discussion
This paper investigates discrete submodular and modular function optimization under a cardinality constraint. It establishes an improved lower bound on the query complexity of submodular maximization by linking it to the communication complexity of distributed set detection. Additionally, the paper proves that finding an approximately optimal subset for additive function maximization still requires near-linear query complexity, but the authors propose an algorithm that estimates the optimal value of additive function maximization in a sublinear number of queries.
Strengths
- The topic of proposing query-efficient algorithms for submodular optimization problems is important and has many applications in the field of machine learning.
- The proposed lower bound for monotone submodular maximization improves upon existing results. The proof idea introduced in this paper seems interesting and can be useful for the submodular optimization community.
Weaknesses
- The algorithm for the additive function, presented as one of the key results in the paper, is insufficiently explained. Additionally, the authors do not discuss any related work in this area, leaving questions about how their work compares with prior studies on similar problems. A comparison with relevant literature would strengthen the paper by situating this result within the broader context of related research.
- Several theoretical contributions, including the final theorem, are not fully clarified. Without adequate explanations, readers may struggle to understand the implications and validity of these findings.
- The presentation of proofs is unclear and challenging to follow.
- Although the authors present proofs for their main results, they do not discuss the technical innovations and challenges in sufficient depth. A clearer articulation of the novel aspects of their approach would help readers appreciate the paper's unique contributions.
- It is claimed that this paper produces query-efficient algorithms for estimating the optimal value of the studied problem, but there is no experimental evidence to support the claim in the paper.
- Typos:
- Line 093: "The studied of the query...";
- Line 187-188: "for any , "
Questions
- The paper introduces an algorithm that estimates the optimal value of the problem of additive function maximization under a cardinality constraint. What is the related work of this algorithm, and how does it compare to existing results?
- What are the major technical difficulties the authors overcome in proving the lower bound of the query complexity for the submodular maximization problem?
We thank the reviewer for the suggestions on the exposition. We have included a technique overview section (Section 1.1) in the updated paper that discusses our technical contributions in detail.
The algorithm for the additive function, presented as one of the key results in the paper, is insufficiently explained. Additionally, the authors do not discuss any related work in this area, leaving questions about how their work compares with prior studies on similar problems. A comparison with relevant literature would strengthen the paper by situating this result within the broader context of related research.
We discuss the intuition behind our algorithm in Section 1.1. We did not find literature that studies the same problem (i.e., the query complexity of additive function maximization).
Several theoretical contributions, including the final theorem, are not fully clarified. Without adequate explanations, readers may struggle to understand the implications and validity of these findings.
Can you point out to us which theorem is not adequately explained?
Although the authors present proofs for their main results, they do not discuss the technical innovations and challenges in sufficient depth. A clearer articulation of the novel aspects of their approach would help readers appreciate the paper's unique contributions.
See Section 1.1 in the updated paper.
It is claimed that this paper produces query-efficient algorithms for estimating the optimal value of the studied problem, but there is no experimental evidence to support the claim in the paper.
The paper in its current version is a theoretical study, which settles important problems in the literature. We leave empirical study for future work.
The paper introduces an algorithm that estimates the optimal value of the problem of additive function maximization under cardinality constraint. What is the related work of this algorithm and how does this compare to existing results?
We do not find literature that studies the same problem (i.e., query complexity for additive function).
What are the major technical difficulties the authors overcome in proving the lower bound of the query complexity for the submodular maximization problem?
See Section 1.1 in the updated paper; our technique is completely different from that of previous work.
This paper studies the problem of selecting k-out-of-n elements, specifically focusing on the submodular maximization problem. It improves the inapproximability result on query complexity from to for achieving any constant-factor approximation by relating submodular maximization to the distributed set detection problem. Moreover, they prove that finding an approximately optimal subset of an additive function requires near-linear query complexity, though the value of the optimal subset can be estimated in a sublinear number of queries. Finally, they propose a sublinear algorithm for submodular maximization on additive functions. The paper emphasizes the inapproximability result and the approximation algorithm for additive functions.
Strengths
This paper proves inapproximability results of query complexity for both general submodular function and additive function by reducing from the distributed set detection problem. It seems novel. Also, it provides a sublinear time algorithm that approximates the optimal value within factor.
Weaknesses
The paper's presentation is lacking in clarity. Some definitions are not as precise and formal as in other works. For example, the definition of submodularity on Line 136 is wrong: if the function is non-monotone, the stated inequality can fail. The actual definition should be as follows:
The set function $f$ is submodular if $f(A \cup \{e\}) - f(A) \ge f(B \cup \{e\}) - f(B)$ for every two sets $A \subseteq B$ and element $e \notin B$.
Moreover, the additional related work section is not detailed enough. The second paragraph, in particular, merely lists a collection of papers without further explanation.
The contribution of the paper seems limited. The inapproximability result on query complexity provided in Theorem 3.7 is , with . However, [1] proves that an -approximation algorithm obeying must use value oracle queries, which is tighter than the one provided in Theorem 3.7.
References
[1] Wenxin Li, Moran Feldman, Ehsan Kazemi, and Amin Karbasi. Submodular maximization in clean linear time. In Sanmi Koyejo, S. Mohamed, A. Agarwal, Danielle Belgrave, K. Cho, and A. Oh (eds.), Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022. URL http://papers.nips.cc/paper_files/paper/2022/hash/6faf3b8ed0df532c14d0fc009e451b6d-Abstract-Conference.html.
Questions
When the oracle is discussed on Line 031, it is unclear what the oracle model is. The formal definition of the value oracle model is that the value oracle returns $f(S)$ for any given set $S$. However, the examples provided in the last sentence, "estimating the quality of prediction from a subset of features or samples by training a smaller model on them", are closer to the noisy value oracle model discussed in [1].
Typically, approximation algorithms find a subset with a constant approximation ratio for the objective value. In this paper, the algorithm approximates the optimal value. Its potential applications are not discussed.
References
[1] Horel, Thibaut, and Yaron Singer. "Maximization of approximately submodular functions." Advances in neural information processing systems 29 (2016).
Thank you for your response. I would like to provide further clarification on my review.
The contribution of the paper seems limited. The inapproximability result on query complexity provided in Theorem 3.7 is , with . However, [1] proves that an -approximation algorithm obeying must use value oracle queries, which is tighter than the one provided in Theorem 3.7.
Let me clarify my comment further. [1] provides two inapproximability results on query complexity: and , where the first one holds for all values and the second one holds when and is the approximation ratio. Upon careful consideration, I realized that the improvement in this paper lies in providing a tighter bound on the query complexity for cases where and . However, this is not clearly stated in the paper. As I mentioned in the weakness, the paper's presentation is lacking in clarity.
When the oracle is discussed on Line 031, it is unclear what the oracle model is….
I do agree that an oracle that returns the exact value of $f(S)$ is a standard setting in theoretical research. However, there are also other settings for the value oracle. The main point here is not about discussing different settings of the value oracle. Instead, the paper should clearly specify what the value oracle refers to. Both the definition on Line 032 and the example on Line 033 are unclear and confusing.
Our algorithm can first be used to assess whether the dataset is sufficiently valuable—that is, whether it contains elements with large values.
This might be a good point for the motivation. However, it should be more specific if such applications exist. I agree with Reviewer GZiL's comment that the authors do not discuss any related work in this area. Based on your response, there appears to be no related work addressing this problem. In that case, the motivation needs to be detailed enough to help readers understand why this problem is worth studying. The current discussion of both related work and potential applications is insufficient.
This paper provides an improved inapproximability result on query complexity over a wider range of the cardinality parameter. However, this result should be highlighted in the paper rather than left for the reader to uncover. Also, the motivation for studying the approximation of the optimal value is insufficiently addressed. Overall, the paper's writing lacks clarity and needs significant improvement.
We thank the reviewer for their feedback. In the rebuttal, we clarify why our lower bound is significantly stronger than previous work, both conceptually and technically.
The paper's presentation is lacking in clarity. Some definitions are not as precise and formal as in other works. For example, the definition of submodular on Line136 is wrong….
Thanks for the correction; we have revised the definition according to your suggestion.
The contribution of the paper seems limited. The inapproximability result on query complexity provided in Theorem 3.7 is , with . However, [1] proves that an -approximation algorithm obeying must use value oracle queries, which is tighter than the one provided in Theorem 3.7.
We respectfully disagree with this assessment. Our lower bound is significantly stronger than the previous work [1]. The lower bound in [1] applies only to the case where (see Theorem 4.2 in their paper). This implies that their lower bound is restricted to the regime where the subset size is linear, . As discussed in our paper, this is an uncommon setting. In most of the literature, it is implicitly assumed that as values outside this range can lead to anomalous results (see the paragraph starting at Line 68 for more details).
From a technical standpoint, their lower bound is derived using a simple counting argument. Specifically, they argue that determining the most valuable elements (assuming a linear function where these elements have value 1, while others have value 0) requires bits of information. Since each query reveals only bits of information, this results in a lower bound of queries. However, this approach clearly cannot be generalized to establish a lower bound when .
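This style of counting argument can be sketched in generic form (our abstract reconstruction with a generic per-query information budget $b$; the precise parameters in [1] differ):

```latex
% Generic information-counting lower bound (a reconstruction, not the
% exact statement from [1]).  Suppose the hidden set of k valuable
% elements is drawn uniformly from the \binom{n}{k} possibilities, and
% each value-oracle query reveals at most b bits of information.  Then
% any algorithm that identifies the hidden set must make at least
\[
  T \;\ge\; \frac{\log_2 \binom{n}{k}}{b}
  \;=\; \Omega\!\left(\frac{k \log (n/k)}{b}\right)
\]
% queries, since T queries convey at most T \cdot b bits in total.
```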
In contrast, our approach employs entirely different and more sophisticated techniques. We rely on a reduction from query complexity to communication complexity, leveraging a communication lower bound based on advanced methods such as information complexity and the distributed data-processing inequality. Additionally, our construction of the hard instance involves a novel two-level truncation technique. For further details, please see Remark 1.1 in the updated version of our paper.
We would be happy to provide further clarification on this point and to elaborate on why our lower bound is both significantly stronger and derived using non-trivial techniques.
When the oracle is discussed on Line 031, it is unclear what the oracle model is….
We assume the oracle returns the exact value of f(S), which is the standard setting in theoretical research. We acknowledge that real-world applications may involve noise and may not strictly adhere to the submodularity or linearity assumptions. This is a challenge faced by all theoretical studies, and extending the work to account for noisy settings is an exciting direction for future research. (We also note that our algorithm can be generalized to tolerate small amounts of noise.)
Typically, approximation algorithms find a subset with a constant approximation ratio for the objective value. In this paper, the algorithm approximates the optimal value. Its potential applications are not discussed.
This has become a typical goal for sublinear algorithms (e.g., see [1,2]). In terms of application, consider a scenario where one needs to select a subset from a large dataset under a budget constraint of . Our algorithm can first be used to assess whether the dataset is sufficiently valuable—that is, whether it contains elements with large values. If the dataset meets this criterion, one can proceed with further analysis or selection; otherwise, unnecessary efforts can be avoided.
[1] Dynamic Matching with Better-than-2 Approximation in Polylogarithmic Update Time. Sayan Bhattacharya, Peter Kiss, Thatchaphol Saranurak, and David Wajc. SODA 2023
[2] New Streaming Algorithms for High Dimensional EMD and MST. Xi Chen, Rajesh Jayaram, Amit Levi, Erik Waingarten. STOC 2022
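This screening workflow can be illustrated with a simple sampling-based sketch. The function `estimate_topk_sum` below is a hypothetical Monte Carlo illustration of estimating the optimal value of an additive objective, not the LinearSum algorithm from the paper; all names and parameters are our own.

```python
import random

def estimate_topk_sum(values, k, m, seed=0):
    """Estimate the sum of the k largest values of an additive (linear)
    objective by uniform sampling.

    A generic Monte Carlo sketch (NOT the paper's LinearSum algorithm):
    sample m elements uniformly, then rescale the mass of the sample's
    top values to estimate the top-k mass of the full dataset.
    """
    n = len(values)
    rng = random.Random(seed)
    sample = rng.sample(values, m)      # m uniform draws without replacement
    sample.sort(reverse=True)
    # A uniform sample of size m contains roughly k * m / n of the
    # full dataset's top-k elements.
    top = max(1, round(k * m / n))
    return sum(sample[:top]) * (n / m)  # rescale sample mass to full set

# Screening workflow from the rebuttal: cheaply estimate the optimal
# value first, and only run an expensive selection procedure if the
# dataset looks sufficiently valuable.
values = [10.0] * 50 + [0.1] * 950      # 50 high-value elements
estimate = estimate_topk_sum(values, k=50, m=200)
worth_selecting = estimate > 100.0
```

With `m = len(values)` the estimator inspects everything and returns the exact top-k sum; smaller `m` trades accuracy for fewer value queries.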
Thank you for your quick response. We have updated our paper to address your concerns regarding the writing. All changes in this round are highlighted in blue. Specifically:
- We have added motivations for studying the approximation of the optimal value. Please refer to the paragraph at Line 92.
- We now emphasize the improved lower bound for ranging from to , see Line 57 and Line 71.
- We highlight our focus on the exact value oracle model. See Line 210.
Please let us know if there are additional concerns regarding the paper's writing, thanks!
This paper considers the NP-hard problem of maximizing a monotone submodular objective under a cardinality constraint. In the area of submodular optimization, the time bottleneck is generally regarded as the number of queries to the objective f. It was previously shown by Li, Feldman, Kazemi, and Karbasi [2022] that there is a lower bound on the number of queries to f for any algorithm that gives a constant-factor approximation guarantee.
The current paper strengthens this lower bound to for instances where (Theorem 3.9), which is nearly tight since there exist linear-time constant-factor approximation algorithms, e.g., the one given by Li et al. They also consider the very restricted special case of objectives that are additive functions, and show that even in this case finding a solution with a constant approximation ratio requires nearly linear query complexity (Theorem 3.7). They further prove this lower bound holds for the general case (but not the restricted linear case) even for just estimating the value of the optimal subset. In order to prove these lower bounds, the paper makes a connection between the number of queries required for submodular maximization and the communication complexity of the distributed set detection problem (a reduction is given in Algorithm 1). To this end, they prove that the distributed set detection problem requires communication cost (Theorem 3.2) by applying the distributed SDPI inequality of Braverman et al. (2016) (Lemma 3.4). On the other hand, for the case of linear functions, the algorithm LinearSum (Algorithm 2) is provided that estimates the value of the optimal solution closely in a sublinear number of queries (Theorem 4.1), and this is tight up to polylog factors (Theorem 4.7).
Strengths
- Lower bounding the number of queries required for a constant-factor approximation algorithm for submodular maximization is an important problem, and this paper makes a solid contribution beyond those of Li et al.
- The paper seems highly novel to me. Their technique of connecting submodular maximization with the distributed set detection problem is not an approach I have seen used in submodular optimization before.
- Paper is clear and well-written.
Weaknesses
- Since the focus of the paper is on hardness results, one potential con is that the paper doesn't provide much in the way of solutions for applications, and may be highly theoretical relative to other papers featured at ICLR. They do provide the algorithm LinearSum for the restricted case where the objective is linear and we seek to approximate the value of the optimal solution. This is interesting because it shows that approximating the optimal value when the objective is linear is not as hard as the general case. But the linear setting is very restricted, and so I'm not sure that there is much value for this algorithm in applications.
- The paper is missing an important citation. Theorem 4 of Kuhnle [2021] (referenced below) actually gives the lower bound before the work of Li et al.
Kuhnle, Alan. "Quick streaming algorithms for maximization of monotone submodular functions in linear time." International Conference on Artificial Intelligence and Statistics. PMLR, 2021.
Questions
- Could you provide intuition for why the more complicated reduction to distributed set detection gives these stronger bounds, compared to approaches more similar to Li et al.?
We thank the reviewer for the insightful comments!
This is interesting because it shows that finding the approximate value where the objective is linear is not as hard as the general case. But the linear setting is very restricted, and so I'm not sure that there is much value for this algorithm in applications.
We agree that the linearity assumption might be too strong for certain practical applications. However, in many large-scale machine learning scenarios, the "linearity" assumption often holds approximately. For instance, in the data valuation task, [1,2] employ a linear function to measure the value of individual data points, demonstrating strong empirical performance.
[1] Andrew Ilyas, Sung Min Park, Logan Engstrom, Guillaume Leclerc, and Aleksander Madry. Datamodels: Predicting Predictions from Training Data. ICML 2022.
[2] Understanding Black-box Predictions via Influence Functions. Pang Wei Koh, Percy Liang. ICML 2017
The paper is missing an important citation. Theorem 4 of Kuhnle [2021] (referenced below) actually gives the lower bound of before the work of Li et al..
Many thanks for pointing this out; we have added the reference to our paper.
Could you provide intuition for why the more complicated reduction to distributed set detection gives these stronger bounds, compared to approaches more similar to Li et al.?
The distributed set detection task has a linear communication lower bound regardless of the choice – this is the key reason that we can lift it to a linear query lower bound for submodular maximization. From a technical perspective, the communication lower bound for distributed set detection relies on advanced techniques such as information complexity and the distributed data processing inequality. This might explain why our approach results in a stronger lower bound.
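As a toy illustration of this lifting idea (a hypothetical simulation of our own, not the actual reduction in the paper): if each oracle query can be answered with one message from each of the q players holding the input, then T queries can be simulated with T * q * b bits of communication, so a communication lower bound C forces any algorithm to use at least C / (q * b) queries.

```python
def simulate_protocol(player_inputs, queries, bits_per_message=64):
    """Toy illustration of lifting a communication lower bound to a
    query lower bound (a hypothetical setup, not the paper's reduction).

    Each of q players holds a private weight vector; the additive
    objective is f(S) = sum over players of their mass on S.  Answering
    one oracle query costs each player one message, so T queries can be
    simulated with T * q * bits_per_message bits of communication.
    """
    total_bits = 0
    answers = []
    for S in queries:
        partial_sums = []
        for weights in player_inputs:   # each player sends one message
            partial_sums.append(sum(weights.get(e, 0) for e in S))
            total_bits += bits_per_message
        answers.append(sum(partial_sums))
    return answers, total_bits

# Two players holding disjoint elements; one query to f({0, 1, 2}).
players = [{0: 3, 1: 4}, {2: 5}]
answers, bits = simulate_protocol(players, [{0, 1, 2}])
# A communication lower bound C on the distributed task then implies a
# query lower bound of C / (q * bits_per_message) for any algorithm.
```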
We thank the reviewers for their constructive feedback. In response, we have updated the paper to include a technique overview section (Section 1.1), providing a high-level summary of our method and a comparison with prior work.
The paper studies the problem of maximizing a monotone submodular function subject to a cardinality constraint. The main contribution of the paper is a lower bound result showing that queries are needed to achieve a constant factor approximation guarantee. Previously, the best lower bound was queries. For linear objective functions, the paper shows that queries are still required to construct an approximate solution, but one can estimate the value of the optimal solution using queries.
The main strength of the paper is that the lower bound result is nearly optimal and it closes the remaining gap in the query complexity of submodular maximization with a cardinality constraint. This contribution is a valuable addition to this well-studied and important problem. The reviewers appreciated the theoretical contribution. Following the discussion, the reviewers remained concerned about the lack of practical applications of this work. The author response did not identify concrete real-world applications for this result, and the reviewers remained concerned that this work may not be a good fit for a general ICLR audience. The paper's exposition also needs a significant revision as outlined in the reviewers' feedback.
Additional Comments from Reviewer Discussion
The main concerns of the reviewers are that this work lacks practical applications and is not a good fit for a general ICLR audience. Additionally, the exposition lacks sufficient clarity to evaluate the work and it will need a substantial revision. Following the discussion, the reviewers remained concerned about these aspects. There were also concerns about the novelty of the techniques that the authors addressed satisfactorily in their response.
Reject