PaperHub
Overall: 6.8 / 10
Poster · 4 reviewers
Ratings: 4, 4, 4, 5 (min 4, max 5, std 0.4)
Confidence: 3.0
Novelty: 2.8 · Quality: 3.0 · Clarity: 2.0 · Significance: 3.0
NeurIPS 2025

Uniform Wrappers: Bridging Concave to Quadratizable Functions in Online Optimization

OpenReview · PDF
Submitted: 2025-05-12 · Updated: 2025-10-29

Abstract

This paper presents novel contributions to the field of online optimization, particularly focusing on the adaptation of algorithms from concave optimization to more challenging classes of functions. Key contributions include the introduction of uniform wrappers, a class of meta-algorithms that could be used for algorithmic conversions such as converting algorithms for convex optimization into those for quadratizable optimization. Moreover, we propose a guideline that, given a base algorithm $\mathcal{A}$ for concave optimization and a uniform wrapper $\mathcal{W}$, describes how to convert a proof of the regret bound of $\mathcal{A}$ in the concave setting into a proof of the regret bound of $\mathcal{W}(\mathcal{A})$ for the quadratizable setting. Through this framework, the paper demonstrates improved regret guarantees for various classes of DR-submodular functions under zeroth-order feedback. Furthermore, the paper extends zeroth-order online algorithms to bandit feedback and offline counterparts, achieving notable improvements in regret/sample complexity compared to existing approaches.
Keywords
DR-submodular Optimization · Convex Optimization

Reviews and Discussion

Review
Rating: 4

This paper introduces uniform wrappers, a meta-algorithmic framework that systematically converts online convex optimization (OCO) algorithms into algorithms for quadratizable function classes (a generalization of concave/DR-submodular functions). The core innovation lies in providing provable regret guarantees through automated transformation of both algorithms and their regret proofs. By applying this framework to zeroth-order base algorithms (e.g., ZO-FTRL), the work achieves improved regret bounds for zeroth-order feedback and bandit feedback, surpassing prior state-of-the-art results. The framework can recover or obtain superior regret bounds for DR-submodular maximization in different settings.

Strengths and Weaknesses

The article's content is very dense, leading to a somewhat cluttered presentation. I believe the core contribution is that any OCO algorithm can be converted into an algorithm for quadratizable function classes via the proposed uniform wrappers, and that the regret bound proof for the original OCO algorithm can be transformed into a regret bound proof for the new algorithm. If I understand correctly, this would be a very nice result. The remaining parts essentially serve as examples demonstrating the generality of this core conclusion. Therefore, I recommend that the authors significantly enhance the introduction by adding a practical guide outlining the conversion process and proof transformation. Specifically: assuming a researcher in a related field proposes a new OCO algorithm, how should they use the authors' framework to derive an online quadratizable optimization algorithm?

I'd be happy to re-evaluate if the authors can address my confusion, as laid out in the questions below.

Questions

  1. Could the authors provide a concise user guide for converting an OCO algorithm into an online quadratizable optimization algorithm, along with an explanation of how the regret bound proof is transformed? This should include specifying which technical parts of the paper should be applied and in what order. If the authors have already provided such a guide somewhere in the paper, it would be appreciated if the authors could point out its location.

  2. Could the authors provide examples of potential applications for quadratizable function classes, or provide references that demonstrate the significance of this function class?

  3. As future work, could the regret lower bound proof for some OCO setting be converted to one for online quadratizable optimization?

Typos: lines 58-59: "fully adaptive Another..."

Limitations

yes

Final Justification

I believe this work possesses significant scholarly merit despite needing writing improvements. It provides meaningful guidance for future research, though my main reservation remains the lack of a general selection methodology for the wrapper. I increase my rating accordingly.

Formatting Issues

No paper formatting issues were found.

Author Response

Q1.

  1. Could the authors provide a concise user guide for converting an OCO algorithm into an online quadratizable optimization algorithm, along with an explanation of how the regret bound proof is transformed? This should include specifying which technical parts of the paper should be applied and in what order. If the authors have already provided such a guide somewhere in the paper, it would be appreciated if the authors could point out its location.

We first discuss

a concise user guide for converting an OCO algorithm into an online quadratizable optimization algorithm

The conversion of the algorithm is detailed in Meta-algorithm 1. Basically, this meta-algorithm transforms the base algorithm action using $\mathcal{W}^{\operatorname{action}}$ and the base algorithm query oracle using $\mathcal{W}^{\operatorname{query}}$.

For example, consider the Online Gradient Ascent as a base algorithm:


Input: convex set $\mathcal{K}$, $x_1 \in \mathcal{K}$, step-sizes $\eta_t$

For $t = 1$ to $T$:

Play $x_t$

Observe $o_t$, a sample of $\mathcal{Q}_{f_t}(x_t)$, an unbiased estimate of $\nabla f_t(x_t)$

$x_{t+1} \gets P_{\mathcal{K}}(x_t + \eta_t o_t)$


Here $\mathcal{Q}_{f_t}$ denotes the stochastic first-order query oracle for $f_t$, which returns unbiased estimates of $\nabla f_t$.

Now, if we apply the first-order-to-first-order uniform wrapper $\mathcal{W}_1^{NM}$ (for non-monotone up-concave functions over general convex sets; see Appendix C.3 for additional details) according to Meta-algorithm 1, we get a first-order algorithm for online optimization of non-monotone DR-submodular functions over general convex sets:


Input: convex set $\mathcal{K}$, $x_1 \in \mathcal{K}$, $\underline{x} \in \mathcal{K}$, step-sizes $\eta_t$

For $t = 1$ to $T$:

Play $(\mathcal{W}_1^{NM})^{\operatorname{action}}(x_t) = (x_t + \underline{x})/2$

Observe $o_t$, a sample of $(\mathcal{W}_1^{NM})^{\operatorname{query}}(\mathcal{Q}_{f_t})(x_t)$

$x_{t+1} \gets P_{\mathcal{K}}(x_t + \eta_t o_t)$


By plugging in the definitions of $(\mathcal{W}_1^{NM})^{\operatorname{action}}$ and $(\mathcal{W}_1^{NM})^{\operatorname{query}}$ as per Appendix C.3, we get the following algorithm:


Input: convex set $\mathcal{K}$, $x_1 \in \mathcal{K}$, $\underline{x} \in \mathcal{K}$, step-sizes $\eta_t$

For $t = 1$ to $T$:

Play $(x_t + \underline{x})/2$

Sample $z$ according to $\mathcal{Z}^{NM}$ (as defined in Appendix C.3)

Observe $o_t$, a sample of $\mathcal{Q}_{f_t}\left(\frac{z}{2}(x_t - \underline{x}) + \underline{x}\right)$, an unbiased estimate of $\nabla f_t\left(\frac{z}{2}(x_t - \underline{x}) + \underline{x}\right)$

$x_{t+1} \gets P_{\mathcal{K}}(x_t + \eta_t o_t)$


This becomes Algorithm 1 in [Zhang et al., 2024] in the case of non-monotone functions.
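For concreteness, here is a minimal Python sketch of this final wrapped algorithm (names such as `project`, `sample_z_nm`, and `grad_oracle` are hypothetical placeholders for the projection $P_{\mathcal{K}}$, the distribution $\mathcal{Z}^{NM}$, and the stochastic gradient oracle $\mathcal{Q}_{f_t}$, which are problem-specific):

```python
import numpy as np

def wrapped_oga(T, x1, x_under, project, sample_z_nm, grad_oracle, D, G):
    """Sketch of W_1^NM(OGA): online gradient ascent run through the
    non-monotone uniform wrapper. All oracles/arguments are placeholders."""
    x = x1.copy()
    played = []
    for t in range(1, T + 1):
        # Play the wrapped action (x_t + x_under) / 2.
        played.append((x + x_under) / 2)
        # Wrapped query: sample z ~ Z^NM, then query the base gradient
        # oracle at (z/2) * (x_t - x_under) + x_under.
        z = sample_z_nm()
        o_t = grad_oracle(t, (z / 2) * (x - x_under) + x_under)
        # Base-algorithm update: projected gradient ascent step.
        eta_t = D / (G * np.sqrt(t))
        x = project(x + eta_t * o_t)
    return played
```

Note that the only changes relative to plain OGA are the two wrapper lines (the played point and the query point); the update step itself is untouched.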

We will add this example with more explanation in the Appendix to demonstrate exactly how Meta-algorithm 1 works.

an explanation of how the regret bound proof is transformed

The conversion of the regret bound is the focus of Section 6, in the form of a three-step guideline. Section 7 and Appendix D are examples of how to apply this guideline. Since the guideline depends on the structure and details of the original proof, any explanation of its application requires an understanding of that proof. This is why Section 7 and Appendix D are as long as they are: we needed to go through the proof of the original regret bound, explain all the ideas there, and rewrite the proofs in order to match the guideline.

In Section 7, lines 285-310 describe exactly how Theorems 1 and 2 provide an example of how to apply the guideline in this case. In Appendix D, lines 984-997, specifically the statement of Theorems 9 and 10 and the paragraph between the theorems describe how the guideline applies in this case.

The Online Gradient Ascent (OGA) algorithm mentioned above is a special case of what is covered in Appendix D. In this specific case, we may simplify the proofs and the guideline in the following manner:

As per step 1 of the guideline, we first rewrite part of the proof for OGA in a way that does not require concavity, such that the proof of the regret bound for the concave case can be recovered using the inequality $f_t(y) - f_t(x) \leq \langle \nabla f_t(x), y - x \rangle$.

Theorem A. For any differentiable function class $\mathbf{F}$, with stochastic first-order query oracles bounded by $G$ and step-sizes $\eta_t = \frac{D}{G \sqrt{t}}$, the algorithm OGA guarantees that $\max_{x \in \mathcal{K}} \sum_{t=1}^T \mathbb{E}[\langle \nabla f_t(x_t), x - x_t \rangle] \leq \frac{3}{2} G D \sqrt{T}$.

The proof of this theorem is a simple rewriting of the classical results. Note that the boundedness of the stochastic first-order query oracle by $G$ implies that the gradients of functions in $\mathbf{F}$ are bounded by $G$, i.e., functions in $\mathbf{F}$ are $G$-Lipschitz continuous.

Next, we adapt the result of Theorem A to the upper-quadratizable setting.

Theorem B. If $\mathbf{F}$ is a function class over $\mathcal{K}$ that is upper-linearizable with $0 < \alpha \leq 1$ and $\beta \geq 0$ and a first-order uniform wrapper $\mathcal{W}$, then with stochastic first-order query oracles bounded by $G$ and step-sizes $\eta_t = \frac{D}{G \sqrt{t}}$, the algorithm $\mathcal{W}(\mathrm{OGA})$ guarantees that $\max_{x \in \mathcal{K}} \sum_{t=1}^T \mathbb{E}[f_t(x) - f_t(\mathcal{W}^{\operatorname{action}}(x_t))] \leq \frac{3}{2} G D \sqrt{T}$.

At first glance, it may seem that moving from Theorem A to Theorem B might be non-trivial. However, as elaborated in Appendix A, whether we apply $\mathcal{W}(\mathrm{OGA})$ to an adversary $\operatorname{Adv}$ that selects functions from $\mathbf{F}$ or apply OGA to the adversary $\mathcal{W}(\operatorname{Adv})$, the sequence $x_t$ remains the same. Thus, by using Theorem A, we immediately see that $\max_{x \in \mathcal{K}} \sum_{t=1}^T \mathbb{E}[\langle \nabla \mathcal{W}(f_t)(x_t), x - x_t \rangle] \leq \frac{3}{2} G D \sqrt{T}$. Now we may use the definition of uniform wrappers to complete the proof of Theorem B.
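To spell out that last step (a sketch only; the exact constants and normalization follow the paper's definition of upper-linearizability with parameters $\alpha$ and $\beta$): for every comparator $x \in \mathcal{K}$ and every $t$, upper-linearizability gives

$\alpha f_t(x) - f_t(\mathcal{W}^{\operatorname{action}}(x_t)) \leq \beta \, \langle \nabla \mathcal{W}(f_t)(x_t), x - x_t \rangle,$

so summing over $t$, taking expectations, and applying the displayed bound yields

$\max_{x \in \mathcal{K}} \sum_{t=1}^T \mathbb{E}\big[\alpha f_t(x) - f_t(\mathcal{W}^{\operatorname{action}}(x_t))\big] \leq \frac{3}{2} \beta G D \sqrt{T},$

which is an $\alpha$-regret guarantee of the form stated in Theorem B (up to the wrapper-dependent factors $\alpha$ and $\beta$).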

We will include the discussion above in the appendix to show how our guideline may be applied to a setting that is simpler than the zeroth-order algorithm mentioned in Section 7.

Q2.

  2. Could the authors provide examples of potential applications for quadratizable function classes, or provide references that demonstrate the significance of this function class?

So far, the main classes of functions covered by quadratizable functions are concave functions and continuous DR-submodular functions (as described in the paper). However, we would like to point out that this concept was only introduced in the literature in [Pedramfar and Aggarwal, 2024a], published in December 2024. Thus, given more time, we expect that this concept will be better understood and more examples of quadratizable/linearizable functions will be found.

Q3.

  3. As future work, could the regret lower bound proof for some OCO setting be converted to one for online quadratizable optimization?

That is a great question. Unfortunately, we do not think our framework can be used to convert lower-bound proofs. In order to prove a lower bound, it is essential to already know the optimal approximation coefficient. (To see why, note that given a function class $\mathbf{F}$ with a uniform wrapper with $0 < \alpha \leq 1$, it may be possible to find another uniform wrapper with a worse approximation coefficient $0 < \alpha' < \alpha$. If an algorithm for $\mathbf{F}$ has sublinear $\alpha$-regret, then it will have negative $\alpha'$-regret.)

Finding the optimal approximation coefficient is highly non-trivial. For example, for non-monotone DR-submodular functions over downward-closed convex sets, the optimal approximation coefficient is still unknown.

Comment

Dear Reviewer ESFd,

Thank you again for your thoughtful and constructive comments. We wanted to follow up to see if you had any additional questions or concerns following our rebuttal. We’d be glad to provide further clarification during the discussion phase if needed.

Regards,

Authors

Comment

Thanks for the detailed responses. These examples will increase the clarity of the presentation. I noticed that the authors' response to another reviewer happened to address the question I intended to ask about 'how to construct a suitable wrapper'. Therefore, I have no further questions.

Comment

Dear Reviewer ESFd,

Thank you for your thoughtful feedback. We're pleased to hear that our response addressed your concerns. We will incorporate the details provided in the rebuttal into the final version. If there are no remaining issues, we hope you will consider increasing your score.

Thank you,

Authors.

Review
Rating: 4

This paper introduces a meta-algorithmic framework called "uniform wrappers" that transforms algorithms designed for concave/convex optimization into ones applicable to non-convex, quadratizable optimization problems, specifically targeting classes of DR-submodular functions. The main results are the development of a conversion guideline that adapts both algorithms and their proofs, and its application to zeroth-order feedback, bandit feedback, and offline optimization.

Strengths and Weaknesses

Strengths: 1) The proposed uniform wrappers are a novel development. They enable systematic translation of actions and oracles while maintaining regret guarantees. 2) The detailed proof adaptation framework leads to new regret bounds under zeroth-order and bandit settings. These theoretical results either match or improve existing regret/sample complexity bounds. 3) The framework handles multiple function classes, including monotone, non-monotone, and weakly DR-submodular functions. This demonstrates the framework's generality by recovering and unifying prior SOTA results.

Weaknesses: 1) The paper is purely theoretical. No simulations are provided to demonstrate the practical performance of the meta-algorithms (e.g., on DR-submodular test functions).

Questions

  1. For many practical functions, how should we construct a suitable wrapper W? Are there heuristics or automatable steps, especially for less structured non-convex functions?

Limitations

n/a

Formatting Issues

no

Author Response

The paper is purely theoretical. No simulations are provided to demonstrate practical performance of the meta-algorithms

The reason we have not added simulations is two-fold.

First, given the theoretical nature of our work and the extensive coverage of different scenarios, it is not clear what experiments could support our main contributions. An experiment for the zeroth-order algorithm in one of the settings would only demonstrate that, in that setting, that algorithm is performant. Such an experiment only marginally supports our main contribution, which is Meta-algorithm 1 and the guideline described in Section 6.

Second, it is common for theoretical papers such as this one not to include experiments.

For many practical functions, how should we construct a suitable wrapper W? Are there heuristics or automatable steps, especially for less structured non-convex functions?

Given the difficulty of non-convex optimization, finding a uniform wrapper for a single function class is a powerful result. In general, even for offline optimization, there are few families of non-convex functions for which global convergence results are currently known. Thus, any heuristic that could be applied to a general family of function classes would be an immensely powerful breakthrough, as it would result in global convergence guarantees for online optimization of all the function classes in that family.

In order to construct a uniform wrapper $\mathcal{W}$, we need to specify $\mathcal{W}^{\operatorname{action}}$, $\mathcal{W}^{\operatorname{function}}$, and $\mathcal{W}^{\operatorname{query}}$.

So far, except for the identity wrapper which applies to both concave functions and monotone up-concave functions over general convex sets, the examples of uniform wrappers fit into the following framework:

There is a random variable $\mathcal{Z}$ taking values in $[0, 1]$ and a parametrized family of functions $h_z : \mathcal{K} \to \mathcal{K}$ for $z \in [0, 1]$, such that, for some $\alpha \in (0, 1]$ and $\beta, \mu \geq 0$, we have

$\alpha f(y) - f(x) \leq \beta \left( \langle \nabla F(x), y - x \rangle - \frac{\mu}{2}\| y - x \|^2 \right),$

where $F(x) = \mathbb{E}_{z \sim \mathcal{Z}} [f(h_z(x)) - f(h_0(x))]$.

In this case, we define $\mathcal{W}^{\operatorname{action}} = h_1$ and $\mathcal{W}^{\operatorname{function}}(f) = F$. We also define $\mathcal{W}^{\operatorname{query}}(\mathcal{Q}_f)$ such that it converts a query oracle for $f$ into a query oracle for $F$ using the following equality:

$\nabla F(x) = \mathbb{E}_{z \sim \mathcal{Z}}\left[ \nabla f(h_z(x))\, J_{h_z}(x) - \nabla f(h_0(x))\, J_{h_0}(x) \right].$
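As a minimal illustration (a sketch with hypothetical names; `grad_f`, `h`, `jac_h`, and `sample_z` stand in for a gradient oracle for $f$, the map $(z, x) \mapsto h_z(x)$, its Jacobian $J_{h_z}$, and a sampler for $\mathcal{Z}$), the wrapped query oracle can be realized as a one-sample Monte Carlo estimator of $\nabla F$:

```python
import numpy as np

def wrapped_query(grad_f, h, jac_h, sample_z, x):
    """One-sample estimate of grad F(x), where
    F(x) = E_{z ~ Z}[ f(h_z(x)) - f(h_0(x)) ].
    All arguments are problem-specific placeholders."""
    z = sample_z()  # draw z ~ Z
    # Chain rule applied to each term (row-gradient convention).
    term_z = grad_f(h(z, x)) @ jac_h(z, x)
    term_0 = grad_f(h(0.0, x)) @ jac_h(0.0, x)
    return term_z - term_0
```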

We will add a discussion along these lines in the final version.

Review
Rating: 4

The main contribution of the paper is providing a procedure named the uniform wrapper that converts algorithms and their regret guarantees from online concave optimization to online quadratizable optimization, which includes online DR-submodular maximization as a special case. By applying the proposed framework, the paper can recover almost all the known results in the literature for online/offline DR-submodular maximization (and indeed its generalization, online/offline up-concave optimization).

Strengths and Weaknesses

Quality. The paper presents a systematic method for solving online quadratizable optimization. It is, in general, technically sound. All the claims are well supported by theoretical analysis.

Clarity. Perhaps due to the paper's extensive coverage of scenarios, substantial space is devoted to presenting results, leaving limited room for presenting and explaining the approach. The description of conversion steps in Section 6 is overly simplistic, making it difficult to grasp the essence without consulting the appendix. Although Section 7 provides an example of zeroth-order FTRL and proposes a variant algorithm, the authors still fail to explain how this algorithm successfully integrates with the uniform wrapper. In a word, the main body alone provides an inadequate explanation for the algorithm's technical success.

Significance. Seeking a unified algorithmic framework for online quadratizable optimization represents a fascinating research attempt. The proposed method is, in general, successful since it either beats the SOTA or at least recovers it. However, it is worth noting that the approach only beats the SOTA in a limited number of cases. This means that the approach does not contain too many substantial insights that can break the barrier for online up-concave optimization. This paper may be regarded as an interim summary in some sense.

Originality. The paper proposes a unified framework for adapting online concave optimization algorithms to algorithms for online quadratizable optimization, improving the previous results such as [Pedramfar and Aggarwal, 2024a]. This validates its originality.

Questions

It is recommended to leave more room for introducing the uniform wrapper. A good way to do this would be to describe the whole procedure in detail for a concrete scenario.

Limitations

yes

Final Justification

I went through the part about the contributions again. Combined with the reply from the authors, I find it unnecessary to be so demanding about the paper's findings, so I have decided to raise my score. Due to the writing, I will increase it only by 1.

Formatting Issues

None

Author Response

We thank the reviewer for their review and suggestions.

Clarity: ... substantial space is devoted to presenting results, leaving limited room for presenting and explaining the approach.

The main concepts and ideas are explained in Sections 4, 5, and 6 in the main text and Appendices A and C. The core ideas are mathematically complete, but somewhat abstract, which makes them difficult to explain and understand without examples. This is why we have devoted Sections 7 and 8 to examples and applications of our method, which takes a total of 2.5 pages. However, there is a tradeoff between the simplicity of the example and the significance of its application. In Section 7, we have chosen an example that would allow us to obtain novel results.

We believe that, given the abstract nature of the ideas, the best way to increase clarity might be to add a simpler example. In the response to reviewer ESFd, we have detailed how our guideline may be applied to a simpler base algorithm, namely Online Gradient Ascent. The resulting algorithms are not novel, but the procedure will increase the clarity of our presentation. We will include this example in the final version in the main text to further clarify the main ideas.

In the final version, we will expand Section 6 further to clarify the ideas. We will also add pseudo-code examples showing how the specific uniform wrappers we consider apply to the base algorithm mentioned in Section 7.

Significance: ... However, it is worth noting that the approach only beats the SOTA in a limited number of cases. This means that the approach does not contain too many substantial insights that can break the barrier for online up-concave optimization. This paper may be regarded as an interim summary in some sense.

As discussed in the paper, and in the Online Gradient Ascent example in our response to reviewer ESFd, even though the changes to the original algorithm and proof can be very minor, one cannot simply claim that the original proof can be adapted to the format we require without justification. Thus, applying our guideline requires an almost complete rewriting of the proof of the original regret bound.

In this work, we provide an example for the zeroth-order case. However, this is merely because of limited space. Our technique is general enough that, a priori, it is not possible to rule out any algorithm for online concave optimization as a base algorithm. The result of [Pedramfar and Aggarwal, 2024a] covers base algorithms that are deterministic and require first-order feedback. However, there are first-order algorithms that are not deterministic. For example, all Follow-The-Perturbed-Leader type algorithms are by definition stochastic, as they require perturbation. Similarly, the SOTA projection-free online concave optimization algorithm using membership oracles is also stochastic (see [1]). On the other hand, the results of [Pedramfar and Aggarwal, 2024a] do not cover any base algorithm that is not first-order. Thus, none of the zeroth-order or second-order (e.g., Newton's method) feedback algorithms are directly covered by [Pedramfar and Aggarwal, 2024a]. While it is possible to add more results to this paper by considering more base algorithms, we would need to introduce the notation, ideas, and proofs of those results in their entirety in order to do so. The paper [1] is 77 pages. Even if adapting this result using our framework increased our page count by only 30 pages, it would still result in an overly lengthy paper and reduce the clarity of the presentation. Moreover, none of the ideas used in [1], or in second-order algorithms, or even in the zeroth-order algorithm that we have discussed in Section 7, are directly relevant to the main contribution of our work.

[1] Efficient Projection-Free Online Convex Optimization with Membership Oracle - Mhammedi - 35th Annual Conference on Learning Theory - 2022

Comment

Dear Reviewer ALb7,

Thank you again for your thoughtful and constructive comments. We wanted to follow up to see if you had any additional questions or concerns following our rebuttal. We’d be glad to provide further clarification during the discussion phase if needed.

Regards,

Authors

Comment

Dear Reviewer ALb7,

Given that there is only a day left in the discussion phase, we wanted to follow up to see if you had any additional questions or concerns following our rebuttal. We’d be glad to provide further clarification during the discussion phase if needed.

Regards,

Authors

Comment

Thank you for your comment and for appreciating the generality of our framework and contribution.

I am still skeptical about whether the paper would be readable simply by adding an illustrative example.

We understand your concern about readability. In theoretical framework papers, clarity is achieved through a combination of precise mathematical definitions, general explanations, and illustrative examples. The original submission includes formal definitions and detailed explanations in Sections 3–6 and Appendices A and C. The newly added example is specifically intended to complement these by demonstrating the algorithm’s operation in a concrete setting.

Naturally, as a theoretical contribution, there are limits to how much we can simplify without sacrificing generality or rigor. That said, we believe the added example significantly improves accessibility and helps bridge the gap between abstract formulation and practical intuition.

My other concern is that "the approach only beats the SOTA in a limited number of cases". The authors stress that their approach is very general in their response, which I think does not answer my question. I never deny the generality of the approach. My point is that the approach does not bring new results. It would be beneficial if the author could expand on this aspect.

We respectfully disagree with the assertion that our approach does not yield new results. As summarized in Tables 1 and 2 and detailed in the list of contributions, our framework leads to improved theoretical guarantees in multiple settings.

For example:

  • In the offline setting with monotone functions over general convex constraint sets under stochastic value feedback, we improve the oracle complexity from $1/\epsilon^4$ to $1/\epsilon^3$ (Table 2).
  • In the online setting with monotone functions over general convex constraint sets under stochastic full-information value feedback, we improve the regret bound from $T^{3/4}$ to $T^{2/3}$ (Table 1).

We would like to emphasize that the highlighted settings are far from narrow special cases; they represent important and well-studied problems. There have been multiple papers focused on just a single setting (see [1–7]); while some may address both online and offline variants, they typically fix a specific function class (monotone or non-monotone), a single type of constraint set, and one type of feedback. In contrast, our results improve upon the state of the art in multiple such settings (including the two mentioned above). Improving both of these settings within a unified framework is therefore a meaningful and nontrivial contribution.

Our framework yields order-level improvements for monotone functions over general convex constraint sets, which we believe is a key strength of the paper. Additionally, as noted in Contribution 6, we provide six new results under restricted query oracles, further underscoring the breadth of our contributions.

Finally, we emphasize, as the reviewer acknowledges, that our framework is general, meaning it can be instantiated with a wide range of base algorithms.

As explained in our earlier response, the proposed approach can be used to study a wide range of base algorithms, including those based on second-order (e.g., Newton-type) feedback, making it broadly useful beyond the specific results we present.

[1] Guaranteed Non-convex Optimization: Submodular Maximization over Continuous Domains - Bian, Mirzasoleiman, Buhmann, Krause - 2019

[2] Gradient Methods for Submodular Maximization - Hassani, Soltanolkotabi, Karbasi - 2017

[3] Continuous DR-submodular Maximization: Structure and Algorithms - Bian, Levy, Krause, Buhmann - 2019

[4] Constrained Submodular Maximization via New Bounds for DR-Submodular Functions - Buchbinder, Feldman - 2023

[5] Lyapunov function approach for approximation algorithm design and analysis: with applications in submodular maximization - Du - 2022

[6] Resolving the Approximability of Offline and Online Non-monotone DR-Submodular Maximization over General Convex Sets - Mualem, Feldman - 2022

[7] Black Box Submodular Maximization: Discrete and Continuous Settings - Chen, Zhang, Hassani, Karbasi - 2020

Comment

I went through the part about the contributions again. Combined with the reply from the authors, I find it unnecessary to be so demanding about the paper's findings, so I have decided to raise my score. Due to the writing, I will increase it only by 1.

Comment

Thanks to the authors for the rebuttal. The example from the authors' response to Reviewer ESFd helps in understanding the proposed algorithm. However, as the authors claim, the approach is general, but the proof is case-by-case. It is challenging to organize such material. I am still skeptical about whether the paper would be readable simply by adding an illustrative example.

My other concern is that "the approach only beats the SOTA in a limited number of cases". The authors stress that their approach is very general in their response, which I think does not answer my question. I never deny the generality of the approach. My point is that the approach does not bring new results. It would be beneficial if the author could expand on this aspect.

Review
Rating: 5

This paper introduces a general technique for reducing problems involving (online) optimization of quadratizable losses to the much more well-studied problem of online optimization of concave losses (and, in the course of doing so, recovering or improving many state-of-the-art results on quadratizable optimization). Quadratizability is a generalization of concavity that allows for proving approximate optimization guarantees (e.g., that some form of gradient descent will reach an alpha-approximation to the optimum); the condition that f is (upper) quadratizable can roughly be written as f satisfying

alpha * f(y) - f(h(x)) <= beta * (<g(f, x), y - x> - ||y-x||^2).

The key observation of this paper is that many quadratizable functions actually satisfy a slightly stronger constraint, where the functional g(f, x) is equal to $\nabla G(f)(x)$ for some simple transformation $G(f)$. In particular, they can construct such transformations (which they call "uniform wrappers") for fairly broad classes of quadratizable functions, including monotone gamma-weakly concave functions with bounded curvature, monotone gamma-weakly concave functions over convex sets containing the origin, and non-monotone concave functions over general convex sets.

This transformation is fairly powerful and allows for transferring a wide variety of results from the concave optimization regime to the quadratizable optimization regime (although it does seem like there is no completely generic claim along these lines, and every application of this framework does require a little bit of custom work). One example the authors give is the generalization of a classic zeroth-order FTRL algorithm (due to Saha and Tewari) to the quadratizable setting, where it immediately gives an O(T^{2/3}) alpha-regret algorithm for the class of alpha-upper-linearizable losses.

Strengths and Weaknesses

Optimization of (online) quadratizable losses is strongly connected to several problems in online submodular optimization, an area of key importance in machine learning and algorithm design. This paper introduces what seems to me (a non-expert in this area) a fairly powerful technique for translating between this problem and (the very well-studied problem of) online convex optimization.

Earlier work by Pedramfar and Aggarwal (who also introduced the class of quadratizable functions) provided a method to automatically translate between these two regimes, but their method is somewhat more lossy than the method presented in this paper (in the zeroth-order FTRL application mentioned above, the algorithm of Pedramfar and Aggarwal results in suboptimal T^{3/4} rates). The downside is that this new method seems to be somewhat less automatic (e.g., one step is to “rewrite the steps of the analysis so that the application of concavity is limited to a single line”). But the authors do successfully apply it to a fairly large class of problems, and so the generality of this method is fairly convincing.

The paper was in general well-written and easy to understand, although as someone who is not an expert in the area, it would have been helpful to have a bit more exposition explaining which of the new results in Table 1 are the most significant improvements over the existing state-of-the-art algorithms. (Right now my impression is that the improvements are largely limited to previously not-well-studied domains, e.g., stochastic-within-a-cone queries, but there are many results and I could have missed some important ones.)

Questions

See section above. Feel free to reply to any part of this review.

Limitations

Yes.

Final Justification

I have read the authors' response and the other reviews, and my score remains the same. The authors introduce a general framework for studying these generalizations of online convex optimization, and I expect this to be of interest to the optimization / online learning communities at NeurIPS.

Formatting Issues

None.

Author Response

We thank the reviewer for their review and positive assessment of our work.

... Earlier work by Pedramfar and Aggarwal (who also introduced the class of quadratizable functions) provided a method to automatically translate between these two regimes, but their method is somewhat more lossy than the method presented in this paper (in the zeroth-order FTRL application mentioned above, the algorithm of Pedramfar and Aggarwal results in suboptimal $T^{3/4}$ rates)

We note that the result of [Pedramfar and Aggarwal, 2024a] for the zeroth order case is obtained by transforming a first-order algorithm and then converting the resulting first-order algorithm into a zeroth-order algorithm using one-point gradient estimators (a la [Flaxman et al., 2005]). Specifically, they require the base algorithm to be first-order and their result cannot be directly applied to a zeroth-order base algorithm.

... it would have been helpful to have a bit more exposition explaining which of the new results in Table 1 are the most significant improvements over the existing state-of-the-art algorithms. (Right now my impression is that the improvements are largely limited to previously not-well-studied domains, e.g., stochastic-within-a-cone queries, but there are many results and I could have missed some important ones.)

The most significant improvements are for previously studied oracle types. Note that a deterministic oracle is a special case of a stochastic-within-a-cone oracle. Thus, when considering either deterministic or fully stochastic query oracles, we obtain superior regret guarantees in 3 settings. Specifically, for online or offline zeroth-order feedback, we improve the SOTA for monotone up-concave functions over general convex sets, and for bandit feedback, we improve the SOTA for non-monotone up-concave functions over general convex sets. (See Tables 1 and 2.)

However, we note that the results in the tables are by no means the only applications of our work. We have only considered the zeroth-order algorithm of [Saha and Tewari, 2011] due to the limited space in the paper. We refer to our response to reviewer ALb7 for more details on other possible applications.

Final Decision

Overall, all reviewers were positive about this work, and I support their decision.