PaperHub
8.0
/10
Spotlight3 位审稿人
最低8最高8标准差0.0
8
8
8
3.7
置信度
正确性3.7
贡献度3.3
表达3.7
ICLR 2025

Streaming Algorithms For $\ell_p$ Flows and $\ell_p$ Regression

OpenReviewPDF
提交: 2024-09-27更新: 2025-04-09
TL;DR

We give new streaming algorithms for underconstrained regression when we see the columns one at a time, and obtain related results for flows.

摘要

We initiate the study of one-pass streaming algorithms for underdetermined $\ell_p$ linear regression problems of the form $$ \min_{\mathbf A\mathbf x = \mathbf b} \lVert\mathbf x\rVert_p \,, \qquad \text{where } \mathbf A \in \mathbb R^{n \times d} \text{ with } n \ll d \,, $$ which generalizes basis pursuit ($p = 1$) and least squares solutions to underdetermined linear systems ($p = 2$). We study the column-arrival streaming model, in which the columns of $\mathbf A$ are presented one by one in a stream. When $\mathbf A$ is the incidence matrix of a graph, this corresponds to an edge insertion graph stream, and the regression problem captures $\ell_p$ flows which includes transshipment ($p = 1$), electrical flows ($p = 2$), and max flow ($p = \infty$) on undirected graphs as special cases. Our goal is to design algorithms which use space much less than the entire stream, which has a length of $d$. For the task of estimating the cost of the $\ell_p$ regression problem for $p\in[2,\infty]$, we show a streaming algorithm which constructs a sparse instance supported on $\tilde O(\varepsilon^{-2}n)$ columns of $\mathbf A$ which approximates the cost up to a $(1\pm\varepsilon)$ factor, which corresponds to $\tilde O(\varepsilon^{-2}n^2)$ bits of space in general and an $\tilde O(\varepsilon^{-2}n)$ space semi-streaming algorithm for constructing $\ell_p$ flow sparsifiers on graphs. This extends to $p\in(1, 2)$ with $\tilde O(\varepsilon^{2}n^{q/2})$ columns, where $q$ is the H\"older conjugate exponent of $p$. For $p = 2$, we show that $\Omega(n^2)$ bits of space are required in general even for outputting a constant factor solution. For $p = 1$, we show that the cost cannot be estimated even to an $o(\sqrt n)$ factor in $\mathrm{poly}(n)$ space. On the other hand, if we are interested in outputting a solution $\mathbf x$, then we show that $(1+\varepsilon)$-approximations require $\Omega(d)$ space for $p > 1$, and in general, $\kappa$-approximations require $\tilde\Omega(d/\kappa^{2q})$ space for $p > 1$. We complement these lower bounds with the first sublinear space upper bounds for this problem, showing that we can output a $\kappa$-approximation using space only $\mathrm{poly}(n) \cdot \tilde O(d/\kappa^q)$ for $p > 1$, as well as a $\sqrt n$-approximation using $\mathrm{poly}(n, \log d)$ space for $p = 1$.
关键词
RegressionStreamingOnline algorithmsFlows

评审与讨论

审稿意见
8

The p\ell_p regression minAx=bxp\min_{\mathbf{Ax} = \mathbf{b}} ||\mathbf{x}||_p is a fundamental problem in machine learning, data science, and numerical linear algebra. When p=2p = 2, it is the classical linear regression problem and has a closed-form solution x=A(AA)1b\mathbf{x}^* = \mathbf{A}^\top (\mathbf{A}\mathbf{A}^\top)^{-1} \mathbf{b}. The general p\ell_p regression has been well-studied, in particular, when A\mathbf{A} is the vertex-edge incidence matrix, it is the p\ell_p-norm flow problem, which corresponds to transshipment, Laplacian solver, and maximum flow if p=1,2p = 1, 2, and \infty respectively. This paper considers the scenario that the matrix ARn×d\mathbf{A} \in \mathbb{R}^{n \times d} has ndn \ll d, which leads to the linear system Ax=b\mathbf{Ax} = \mathbf{b} being underdetermined. In this case, it is impractical to store all the dd columns of A\mathbf{A}, which motivates the usage of the column-arrival streaming model. In addition, there are two kinds of solutions to p\ell_p regression: (1) costestimation*cost estimation* that approximates the optimal p\ell_p norm; (2) vectorvalued*vector-valued* that approximates the optimal solution x\mathbf{x}^*. For the first kind, this paper achieves space complexity O~(ϵ2n 1+max1,p/2(p1))\widetilde{O}(\epsilon^{-2} n^{\ 1+max\\{1, p/2(p-1)\\}}), which is O~(ϵ2n2)\widetilde{O}(\epsilon^{-2} n^2) when p>2p > 2, and also gives lower bounds for p=0,1,2p = 0, 1, 2. For the second kind, this paper also provides lower bounds and upper bounds for different values of pp.

优点

(1) This paper researches the general p\ell_p regression problem under the column-arrival streaming model, while the existing work only considers the special case, like p=2p = 2 or A\mathbf{A} is the vertex-edge incidence matrix.
(2) For general pp, this paper establishes both lower and upper bounds on space complexity for the two kinds of solutions: cost estimation and vector-valued problems. These new results strengthen the paper's contributions.
(3) This paper has an extensive set of references and a comprehensive summary of the related work.

缺点

The pp-norm flow problem defined in (1) (Line 61) is not exact. The correct one should be minAx=bdiag(w)xp\min_{\mathbf{Ax} = \mathbf{b}} ||\text{diag}(\mathbf{w}) \cdot \mathbf{x}||_p. To be specific, when p=1p = 1 and w\mathbf{w} is the cost vector, then it is a transshipment problem; when p=2p = 2 and w=r1/2\mathbf{w} = \mathbf{r}^{1/2}, where r\mathbf{r} is the resistance vector, then it is reduced to a Laplacian solver; when p=p = \infty and w=c1\mathbf{w} = \mathbf{c}^{-1}, where c\mathbf{c} is the capacity vector, then it is the maximum flow problem.

By this formulation, it is the same as formulation (2). As a special case, when A\mathbf{A} is a vertex-edge incidence matrix in the graph streaming problem, there is some existing work (Line 91-97) on transshipment, electrical flow, maximum flow, etc. Could you compare this work with their results?

问题

(1) In Theorem 1, Line 153, should "ss rows" be "ss columns"? At Line 164, should the second O~(ϵ2n)\widetilde{O}(\epsilon^{-2} n) be O~(ϵ2n2)\widetilde{O}(\epsilon^{-2} n^2)? Please verify that.
(2) In Theorem 1, when p1p \to 1, like p=1.1p = 1.1, q=p/(p1)q = p/(p-1) would be a large constant. In this case, could you discuss the behavior of the proposed algorithm? If it is indeed a non-trivial issue, could you give a brief analysis or comment in this paper on p1p \to 1?

评论

Weaknesses: Agreed, equation (2) is meant to capture the general setting when the Lp norms are weighted and capture the cases which you mention. Surprisingly, there are in fact not many algorithmic results known for streaming flow problems. The works mentioned in 91-97 are lower bounds against the directed version of the problem, which are hard for streaming, whereas we present new algorithms for the undirected version. Known results are presented in Table 1.

(1) Thank you for catching the typo for "rows" vs "columns", this has been fixed. In the graph setting, the columns are sparse so the number of bits is asymptotically the same as the number of columns. (2) Indeed, if p goes to 1, there are inherent difficulties, as we discuss in 187 and complement with a lower bound in Theorem 3.

评论

Thank the authors for your feedback! I will retain my score.

审稿意见
8

The paper initiates the study of pp-norm regression in the one-pass streaming setting. They consider the underdetermined setting minAx=bxp\min_{Ax=b} \|x\|_p where AA has size n×dn\times d, n<<dn<<d, receives dd column updates. They consider p[1,)p\in [1,\infty). When we consider instances graphs, these updates correspond to edge insertions.

The first result is an algorithm for “p-norm flow sparsifier” in O(n2)O(n^2) space which is a general version of graph sparsifiers which minimize the p\ell_p-norm. This algorithm follows from using the online lewis weight sampling algorithm of Woodruff and Yasuda’22 on the dual problem. The other algorithmic result deals with giving a tradeoff between the size of the sparsifier and objective error when the goal is to maintain the entire solution vector x. The first algorithm only maintains the objective value.

The paper also presents several information theoretic lower bounds which match these upper bounds. They also give some extra lower bounds for the case of p=2p=2.

优点

The paper studies an important problem and makes a significant amount of progress in the new direction of streaming algorithms for regression. They also leave some good open questions in the paper. The techniques are also quite simple and use previous works quite well. The main technical core is the lower bound part, which is of independent interest.

缺点

It seems like there are too many ideas and previous results used in the paper. Would be useful to have some more preliminaries in the appendix or somewhere.

问题

  1. Can the authors add a comparison with what is known in online algorithms in the corresponding settings?
  2. The tables in the paper, that give the lower and upper bounds for different settings, can these be split in a different way so that its easier to contrast the lower and upper bounds for all settings? This would also make clear what settings are still unknown.
评论

It seems like there are too many ideas and previous results used in the paper. Would be useful to have some more preliminaries in the appendix or somewhere.

We have added a preliminaries section in the appendix with the most important definitions and if the reviewer has any further suggestions as to what would be helpful, we would be happy to add them.

  1. We have added a discussion on the work of https://arxiv.org/abs/1604.05448, which is the only other work we are aware of on online graph sparsification.
  2. The tables in the current draft are organized as Table 1 for algorithms and lower bounds for estimating the cost, and Table 2 for algorithms and lower bounds for outputting the solution. We are happy to take suggestions on what other organization would be more clear to readers.
审稿意见
8

The paper gives upper bounds and lower bounds for streaming underdetermined p\ell_p linear regression in the column arrival model. The results apply both to the value of the best fit, and to the actual solution. There are various results, depending on the value of pp and the approximation guarantees. The arguments use duality and sparsification.

优点

The paper explores the full range of values of pp.

缺点

The methods rely substantially on past work.

问题

None

AC 元评审

This paper provides streaming algorithms for solving linear regression problems with an p\ell_p constraint. In the streaming model data arrives one coordinate at a time. The reviewers deem the results to be good, and this is definitely publishable.

In my own reading, this seems to be an algorithms focused paper, and may not have the proper/big audience in ICLR. However, this is no fault of the authors. The reviews have also been short, but positive.

Overall I recommend acceptance.

审稿人讨论附加意见

The reviews have been positive and uniform; little issues raised, and the authors provided precise response.

最终决定

Accept (Spotlight)