From Linear to Linearizable Optimization: A Novel Framework with Applications to Stationary and Non-stationary DR-submodular Optimization
Abstract
Reviews and Discussion
This paper studies online optimization with upper linearizable/quadratizable functions, a new class of objectives in this field. This class extends concavity and DR-submodularity in various settings, including monotone and non-monotone cases over different types of convex sets. A general meta-algorithm is proposed that can convert linear/quadratic maximization algorithms into ones that optimize upper linearizable/quadratizable functions. In this way, one can solve concave and DR-submodular optimization problems through a unified approach.
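For intuition, here is a minimal sketch (ours, not the paper's pseudocode) of how such a reduction typically operates; the oracle name `g_oracle` and the inequality quoted in the docstring are illustrative assumptions about the framework rather than its exact definitions.

```python
def maximize_linearizable(linear_alg, g_oracle, T):
    """Hedged sketch of the reduction idea: run any online *linear*
    maximization algorithm, but feed it the vector g(x_t) coming from an
    upper-linearizability inequality of the (assumed) form
        alpha * F(x_star) - F(x_t) <= <g(x_t), x_star - x_t> + error,
    so that regret against the linear rewards <g(x_t), .> controls the
    alpha-regret on F itself.  `linear_alg` is assumed to expose
    .play() -> x_t and .feed(c); these names are ours, not the paper's."""
    iterates = []
    for _ in range(T):
        x_t = linear_alg.play()      # query the online linear maximizer
        iterates.append(x_t)
        c_t = g_oracle(x_t)          # linearizing vector at the current point
        linear_alg.feed(c_t)         # reward <c_t, .> for the linear subproblem
    return iterates
```

The point of the wrapper is that any regret guarantee for the linear subproblem transfers, via the assumed inequality, into an alpha-regret guarantee for the harder objective.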
Strengths
This paper proposes a novel framework and a novel notion of quadratizable functions, and relates the algorithms and regret guarantees for the optimization of linear functions to those of quadratizable functions.
This paper shows that the class of quadratizable-function optimization problems is general, and includes not only concave but also up-concave optimization in several cases.
The authors design a new meta-algorithm, namely SFTT, that captures the idea of random permutations (sometimes referred to as blocking) as used in several papers such as [44, 46, 34]. While previous works used this idea in specific settings, the meta-algorithm is applicable in general settings.
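To make the blocking idea concrete, the following is our hedged illustration of the random-permutation technique referenced above, not SFTT's actual pseudocode; the `base_alg` and `environment` interfaces are assumptions for the sketch.

```python
import random

def blocked_play(base_alg, environment, T, K):
    """Hedged sketch of the random-permutation / blocking idea that SFTT
    builds on: the horizon is cut into T // K blocks; within each block
    the base algorithm's K queries are played in a uniformly random
    order, and the base algorithm receives one aggregated update per
    block, decoupling its queries from the per-round feedback."""
    for _ in range(T // K):
        queries = [base_alg.next_query() for _ in range(K)]
        order = list(range(K))
        random.shuffle(order)                  # random permutation of the block
        feedback = [None] * K
        for i in order:
            feedback[i] = environment.play(queries[i])
        base_alg.update(queries, feedback)     # one update per block
```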
Weaknesses
- The upper linearizable/quadratizable functions seem to extend the online optimization framework, which is fine for a theoretical paper. Still, it would be great if the authors could provide motivation for this kind of model from practical applications.
- It would be better if the authors could provide numerical results or empirical intuition beyond the theoretical guarantees.
- Another way to strengthen this work would be to provide a lower bound, or at least a discussion of one.
I am not familiar with this topic and will look at the comments by other reviewers.
Questions
none
Limitations
none
Thank you for your review.
W1. Our notion of linearizable/quadratizable functions generalizes both DR-submodularity and convexity. There are many applications for DR-submodular maximization. Example applications include experimental design [7], resource allocation [Designing smoothing functions for improved worst-case competitive ratio in online optimization - Eghbali, R. and Fazel, M. (2016)], influence maximization [Maximizing the spread of influence through a social network - Kempe et al., 2003], [Influence estimation and maximization in continuous-time diffusion networks - Gomez-Rodriguez et al., 2016], mean-field inference in probabilistic models [3], and MAP inference in determinantal point processes (DPPs) [Determinantal point processes for machine learning - Kulesza et al., 2012], as well as more recently identified applications like serving heterogeneous learners under networking constraints [Experimental design networks: A paradigm for serving heterogeneous learners under networking constraints - Li et al., 2023] and joint optimization of routing and caching in networks [Jointly optimal routing and caching with bounded link capacities - Li et al., 2023]. We will add a section to the appendix and include some of these applications as discussed in the literature.
W2. Note that our framework is so general that, as corollaries (specifically Corollaries 7 and 8 and Theorem 8), we obtain 33 (14 + 10 + 9) algorithms covering 41 (12 + 12 + 8 + 9) cases, and there is only 1 case where an existing algorithm in the literature (with more assumptions and in a more limited setting) obtains a better regret bound than our result. (There are also 3 results of [35], mentioned in Table 2, which require more assumptions and obtain worse sample complexity than our result, but obtain a better approximation coefficient in certain parameter regimes.) This makes it difficult to run experiments that cover a meaningful subset of results without adding significantly more to the paper. Given the theoretical focus of this paper, we hope that future application papers related to these 41 cases will evaluate and compare the described algorithms in their works.
W3. There are several types of lower bounds one can consider: approximation coefficient, time complexity, sample complexity, static regret, dynamic regret, and adaptive regret. The approximation coefficients discussed in this work are either proven or conjectured to be optimal. For monotone functions over convex sets containing the origin, given full-information first-order feedback, it is known that $O(\sqrt{T})$ is the optimal $(1-1/e)$-regret. In general, as long as the approximation factor $\alpha$ is optimal, we expect an $\alpha$-regret of $\Omega(\sqrt{T})$ to be a lower bound. Note that this is the first work that obtains sublinear dynamic regret or adaptive regret for DR-submodular functions. The base algorithms used here are known to be optimal in the context of convex optimization. However, proving lower bounds would require a more detailed analysis, which we leave to future work. We will add a discussion of lower bounds based on the above in the final version.
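A one-line version of the standard transfer argument behind the $\Omega(\sqrt{T})$ expectation (our paraphrase, not text from the paper): linear functions are up-concave and admit approximation coefficient $\alpha = 1$, so the classical lower bound for online linear optimization applies verbatim to the larger class,
$$\sup_{f_1,\dots,f_T \text{ linear}} \left( \max_{x^* \in \mathcal{K}} \sum_{t=1}^{T} f_t(x^*) - \mathbb{E}\left[\sum_{t=1}^{T} f_t(x_t)\right] \right) = \Omega(\sqrt{T}),$$
so no algorithm for up-concave maximization can beat $\Omega(\sqrt{T})$ regret already on the linear subclass.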
Thanks for the responses. I tend to maintain my current rating, with low confidence.
The paper introduces a novel notion of upper linearizable / quadratizable functions. Using this notion of functions, the paper proposes a general meta algorithm that extends certain online linear / quadratic maximization algorithms to handle upper linearizable / quadratizable function classes. This general meta algorithm features multiple variants, obtaining results for different feedback settings.
Strengths
Disclaimer: This paper is outside my area of expertise. I do not have a good understanding of this paper. My review should be considered an educated guess.
The technical contribution of this paper is strong. The meta algorithm is general and leads to several theoretical results.
Weaknesses
I find this paper hard to follow. This is largely due to my lack of familiarity with this topic, but I think the authors could do a better job with the presentation: e.g., add examples and figures, and, when presenting theoretical results, also add some explanation and intuition.
Questions
I don't have specific questions.
Limitations
The limitations are adequately discussed.
Thank you for your review.
We will expand the explanations for each theorem to make the core ideas clearer. We will also expand Section 6 to include more explanation and intuition about the applications in the final version.
Thank you for your reply. I maintain my score.
A new class of functions is introduced: upper linearizable/quadratizable functions, which extend concavity and DR-submodularity. It is shown how to apply algorithms for linear/quadratic maximization to this class, which offers a unified approach to DR-submodular optimization problems. The abstract framework is then applied to yield results for a broad variety of optimization settings, including online up-concave maximization with monotone functions, non-monotone functions, and a variety of feedback settings. Many of these settings had been studied previously in the literature, and the results obtained here match or improve on prior results in many cases. The authors also convert the online results to offline ones, and apply the framework to non-stationary up-concave maximization.
Strengths
- The abstraction of the class of linearizable functions unifies and improves results in many settings by making a connection with linear optimization.
- In many cases, the results are achieved under fewer assumptions. For example, for offline monotone maximization over a set containing the origin with access to the gradient of the function, the authors obtain the same sample complexity and approximation ratio as [18], but only require the function to be up-concave, differentiable, and Lipschitz, while [18] has more requirements, such as the function's Hessian being Lipschitz.
- Obtaining these results from one general framework enhances the understanding of these problems and of the relationships between them.
- Figure 1 shows how to obtain various results from the theorem statements by following a directed path on a graph.
Weaknesses
minor comments:
- line 169: expect -> except
- line 251: every function of F
Questions
- Are there any implications for non-monotone maximization over downward closed sets?
- The results for non-monotone functions don't seem to apply to gamma-weakly up-concave functions or improve with curvature. Can these results be extended to such cases?
Limitations
Yes
Thank you for your review.
We will correct the typos in the final version.
Q1. Currently, there is no result showing linearizability of non-monotone functions over downward closed sets with a better approximation coefficient/sample complexity than what can be specialized from the results for general convex sets. However, we believe this framework is general enough to allow for such results. All we need is to show that such functions are linearizable (i.e., something similar to Lemmas 1, 2, or 3 for non-monotone functions over downward closed convex sets), and almost all of the results that hold for linearizable functions would then carry over to this case.
Q2. Note that, in the literature, the notion of curvature or gamma-weak DR-submodularity is almost always considered in the context of monotone functions. In fact, we only define curvature for monotone functions in this paper. To properly address the question, we would need to formalize and justify appropriate definitions for gamma-weakly up-concave non-monotone functions with non-zero curvature, and also generalize Lemma 3 so that it applies to such functions.
Specifically, for a continuous monotone function, we define the curvature to be the smallest constant for which the defining inequality holds between all pairs of comparable points. This is already a generalization of what is commonly considered in the literature, where the function is assumed to be differentiable and the curvature is defined componentwise in terms of the gradient vector. Moreover, the definitions of curvature and gamma-weakness in the non-monotone case should be formulated in a way that allows a version of Lemma 3 to go through. We leave the extension of this notion to non-monotone functions, and the proof of the corresponding variant of Lemma 3, to future work.
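For orientation only (our addition from the classical set-function literature, not the authors' continuous definition): the total curvature of a monotone submodular set function $f$ on ground set $V$, which the continuous notions generalize, is
$$c = 1 - \min_{j \in V:\, f(\{j\}) > 0} \frac{f(V) - f(V \setminus \{j\})}{f(\{j\}) - f(\emptyset)},$$
so $c = 0$ for modular (linear) functions and $c = 1$ in the fully curved case; curvature-dependent approximation guarantees interpolate between these extremes.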
The paper explores the concept of linear/quadratic approximation and suggests generic templates for using such approximations within black-box regret-minimization/optimization algorithms. While this approach is quite prevalent and well investigated in convex optimization, the authors show variants of these approximations that apply to DR-submodular functions --- a function class that has been well studied in the past years. Through their approach, the authors recover as well as improve state-of-the-art results in various settings, which demonstrates the power of their framework. I will therefore accept.