Correlation Clustering Beyond the Pivot Algorithm
Summary
Review and Discussion
The paper studies a variant of the pivot algorithm for correlation clustering and gives a dynamic implementation with polylog(n) update time and a 2.997-approximation. Correlation clustering is a classical problem in TCS and machine learning. Here, we are given a complete graph whose edges are labeled '+' (similar) or '−' (dissimilar), and the goal is to partition the vertices such that the total number of '+' edges crossing clusters plus '−' edges inside the same clusters is minimized.
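To make the objective concrete, here is a minimal Python sketch of the disagreement count described above. The function name and the input encoding (an explicit set of '+' pairs, with every other pair implicitly '−') are my own illustrative choices, not from the paper.

```python
from itertools import combinations

def disagreements(nodes, pos_edges, cluster_of):
    """Count '+' edges cut between clusters plus '-' edges kept inside one.

    `pos_edges` lists the '+' (similar) pairs; every other pair of the
    complete graph is implicitly labeled '-'.
    """
    pos = {frozenset(e) for e in pos_edges}
    cost = 0
    for u, v in combinations(nodes, 2):
        same = cluster_of[u] == cluster_of[v]
        if frozenset((u, v)) in pos:
            cost += 0 if same else 1   # '+' edge crossing clusters
        else:
            cost += 1 if same else 0   # '-' edge inside a cluster
    return cost
```

For example, putting all three vertices of a '+' path 1-2-3 into one cluster costs exactly one disagreement: the '−' pair {1, 3} ends up inside the cluster.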
It is well known that the pivot algorithm achieves a 3-approximation for correlation clustering in expectation. At a high level, the pivot algorithm samples a uniform random permutation of the vertices and repeatedly takes the lowest-ranked yet-to-be-clustered vertex as a pivot, merging its yet-to-be-clustered '+' neighbors with it to form a cluster.
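The description above corresponds to the following sketch; the names and the input encoding are illustrative, and `pos_neighbors[v]` is assumed to be the set of '+' neighbors of `v`.

```python
import random

def pivot(nodes, pos_neighbors, seed=0):
    """Classical PIVOT: walk a uniformly random permutation; each still-
    unclustered vertex becomes a pivot and absorbs its still-unclustered
    '+' neighbors into a new cluster."""
    order = list(nodes)
    random.Random(seed).shuffle(order)     # uniform random permutation
    cluster_of, next_id = {}, 0
    for p in order:                        # lowest-ranked vertices act first
        if p in cluster_of:
            continue                       # already absorbed by an earlier pivot
        cluster_of[p] = next_id
        for v in pos_neighbors[p]:
            if v not in cluster_of:        # merge yet-to-be-clustered neighbors
                cluster_of[v] = next_id
        next_id += 1
    return cluster_of
```

On an all-'+' triangle, the first pivot absorbs everything, so the output is a single cluster regardless of the permutation.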
This paper follows a recent line of work that tries to break the 3-approximation barrier with combinatorial algorithms (e.g., CHS [SODA’24]; CLPTYZ [STOC’24]). The paper keeps the framework of the pivot algorithm but additionally introduces ideas from agreement decomposition as explored by, e.g., CLMNPT [ICML’21] and AW [ITCS’22]. In particular, the paper identifies and tackles two issues in the pivot algorithm: it merges neighbors with high-disagreement neighborhoods, and it fails to merge non-neighbors with low-disagreement neighborhoods. In this way, the paper obtains a 2.997-approximation algorithm with much better empirical performance.
Questions for Authors
N/A, I do not have additional questions.
Claims and Evidence
Yes. The theoretical results are supported by proofs, and the algorithm design is reasonable.
Methods and Evaluation Criteria
Yes, the comparison of average and worst-case costs against the vanilla pivot algorithm shows the advantage of the modified pivot algorithm.
Theoretical Claims
At a high level, the main technique of the paper resembles a blend of the pivot algorithm and the agreement decomposition algorithm. As discussed in the paper, the vanilla pivot algorithm suffers from clustering neighbors with high disagreement and from failing to merge non-neighbors with low disagreement. The intuitive fix for these problems would be to artificially adjust the membership of such vertices in the clustering formed by the vanilla pivot algorithm. Formally proving the approximation guarantee, on the other hand, requires quite involved techniques in fractional triangle packing. I did not have time to carefully check the quantities in the charging argument, but the design makes sense to me in general.
Experimental Design and Analysis
Yes, for the most part. However, from the experiment section, it is unclear whether the paper ran multiple experiments with different random seeds, especially given that the pivot algorithm is known to have a large variance for different choices of random seeds. See my comments in the criticism for details.
Supplementary Material
The appendix contains proofs for the approximation guarantees and the efficient implementation. I briefly checked the proof of efficiency, and it looks good to me.
Relation to Broader Literature
Correlation clustering is an important problem in machine learning with broad applications. I believe the algorithm designed in this paper is quite practical.
Essential References Not Discussed
No missing essential references. On a side note, the cited works CLMNPT [ICML’21] and AW [ITCS’22] did not use the pivot algorithm but instead used agreement decomposition. The current discussion in the right column of page 1 appears to suggest that those results were also based on the pivot algorithm. Furthermore, there are some more recent results for correlation clustering in dynamic settings that are related to this work (e.g., ‘Dynamic Correlation Clustering in Sublinear Update Time’ [ICML’24] and ‘Fully Dynamic Adversarially Robust Correlation Clustering in Polylogarithmic Update Time’ [AISTATS’25]).
Other Strengths and Weaknesses
I’m in general supportive of the paper: correlation clustering is an important problem, and getting algorithms with both better theoretical bounds and better empirical performance is an important direction to pursue. To the best of my knowledge, this is the first algorithm that achieves a better-than-3 approximation for dynamic correlation clustering in polylogarithmic update time. Furthermore, compared to the combinatorial algorithm in CLPTYZ [STOC’24], the algorithm in this paper is much more practical. The paper also appears to be well written, and the concepts are explained relatively well despite the charging argument being intrinsically involved.
I do not see any major weaknesses in the work. However, I do think the paper could benefit from reporting the average cost over multiple runs of the pivot algorithm. The pivot algorithm often suffers from instability and demonstrates very high costs in the worst case (see the experiments in, e.g., CLMNPT [ICML’21]; CLMP [ICML’24]; BDPSW [AISTATS’25]). Therefore, comparing against the pivot algorithm with multiple random seeds looks more reasonable. (Maybe the paper already did so, but it’s unclear to me from the paper.)
Other Comments or Suggestions
I think 2.997 is sufficiently smaller than 3, so maybe writing this number out directly in Theorem 1.1 is better. Stating the bound as 3 − ε₀ makes me think the ε₀ is some negligibly small number.
We appreciate your recommendation to run multiple experiments using different random seeds, especially since the Pivot algorithm is known to have instability. To clarify, Figures 1 and 2 already show both average and worst-case outcomes, based on multiple random seeds per dataset. As illustrated, our Modified Pivot algorithm demonstrates notably less instability compared to the pivot algorithm. We will emphasize this point more clearly in the revision.
We agree that writing the approximation number (2.997) explicitly in Theorem 1.1 improves readability. We will make this change in the final version.
Thanks for the clarification of the experiments. I have no further questions. I'll keep my score as it is.
This paper presents a modification of the standard 3-approximation Pivot algorithm for correlation clustering (repeatedly select a random node, cluster it with its neighbors, and remove them) called ModifiedPivot, which avoids the worst-case errors of Pivot by clustering some of its neighbors as singletons (if their neighborhood doesn't overlap well with the pivot's neighborhood), and by adding some of the pivot's non-neighbors to the cluster (if they overlap well enough with the pivot's neighborhood). A careful analysis based on the bad triangle charging technique shows that the method has an approximation guarantee that is strictly better than 3, a barrier that standard Pivot cannot overcome. Furthermore, ModifiedPivot can be implemented in the fully dynamic model with log^{O(1)}(n) update time, providing the first better-than-3 algorithm with such an update time in this model (previously, there was not even an O(n)-update-time algorithm with a better-than-3 approximation). In practice, the modified pivot algorithm leads to empirical improvements on a suite of real-world graphs and stochastic block models.
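To fix ideas, here is a hypothetical sketch of a single modified pivot step in this spirit. The Jaccard overlap measure and the `low`/`high` thresholds are my own stand-ins, not the paper's actual criteria (which involve its epsilon and delta parameters).

```python
def modified_pivot_step(pivot_v, pos_neighbors, unclustered, low=0.2, high=0.8):
    """One illustrative step: '+' neighbors whose neighborhoods overlap
    poorly with the pivot's are split off as singletons; non-neighbors
    that overlap strongly are pulled into the cluster."""
    def overlap(u, v):
        # Jaccard similarity of closed '+' neighborhoods
        a, b = pos_neighbors[u] | {u}, pos_neighbors[v] | {v}
        return len(a & b) / len(a | b)

    cluster, singletons = {pivot_v}, set()
    for v in unclustered - {pivot_v}:
        if v in pos_neighbors[pivot_v]:
            # a neighbor whose neighborhood disagrees heavily becomes a singleton
            (cluster if overlap(pivot_v, v) >= low else singletons).add(v)
        elif overlap(pivot_v, v) >= high:
            cluster.add(v)        # a well-overlapping non-neighbor joins
    return cluster, singletons
```

Here a neighbor with an almost-identical neighborhood joins the cluster, a non-neighbor sharing most of the pivot's neighborhood is pulled in, and a neighbor with a mostly disjoint neighborhood is split off as a singleton.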
Update post rebuttal
Thanks for the reply and clarification. I maintain my positive view of the paper.
Questions for Authors
How many choices of epsilon and delta do you have to test in your numerical experiments?
Can you help me reconcile the apparent contradiction mentioned above in the "other comments and suggestions" section of my review?
Claims and Evidence
Yes, the explanation of the algorithm itself at a high level is very sensible, and the main proof techniques and why they should be expected to work are outlined clearly. Several figures and discussions of the main techniques are provided throughout to help the reader. Empirical results also support the theoretical findings. Detailed proofs are provided to accompany the theoretical results, which are the main contribution of the paper.
Methods and Evaluation Criteria
The paper is primarily about theoretical results. The numerical experiments accompanying the theory are sensible and are performed on a standard collection of graphs used previously for correlation clustering experiments.
Theoretical Claims
I followed the high-level explanations of the proof techniques that are provided throughout the text. The bad triangle charging scheme is standard in correlation clustering analysis and is a very sound approach. I confirmed at a high level that the pieces of the proof fit together and seem sound. However, the main technical results are contained in the dense 16-page appendix, and I did not check these for correctness.
Experimental Design and Analysis
The experimental setup is sound.
Supplementary Material
No
Relation to Broader Literature
The text does a good job placing the paper within its broader context in the literature. In particular, there was previously no fully dynamic algorithm with even a linear update time that maintained a better-than-3 approximation. Aside from settling this question in the fully dynamic setting, having a modified pivot algorithm that is practical and has a better-than-3 approximation is a useful breakthrough result for correlation clustering. While better-than-3 approximation algorithms exist, they are largely impractical. Meanwhile, there are many different practical methods that can achieve a 3-approximation or worse, but improving on factor 3 is a significant barrier.
Essential References Not Discussed
None.
Other Strengths and Weaknesses
The key results of this paper are significant and have the potential to lead to other methods that break the factor-3 approximation barrier while actually being practical. It's also nice that the authors have complemented their theoretical analysis with some experiments on real-world graphs.
One small potential weakness in the experiments section is that it's unclear how many different epsilon and delta parameters were tried in order to obtain the improved results. If many parameter values need to be tested to get good results, then there is a runtime trade-off to consider, since standard Pivot does not need to compute these. Thus, a more accurate comparison may be to run Pivot many times (with different permutations) and take the best result, if you are going to run ModifiedPivot many times (on the same permutation) with different choices of epsilon and delta. This is not a huge issue, as the main contributions of the paper are theoretical. However, depending on how many parameter settings you need to try for epsilon and delta, there could be some weaknesses in the empirical results.
Other Comments or Suggestions
In Algorithm 1 you state "We emphasize that even though vertices in get clustered here, they are not removed from in this step and so can be picked as pivots later."
I'm confused by this, and it seems contradictory. Saying that a node is already clustered (in this case as a singleton) suggests that in the final output clustering, the node will indeed just be in a singleton cluster. But saying that it could later be chosen as a pivot suggests that it could later be clustered with some of its neighbors (and significantly overlapping non-neighbors). I'm confused then as to how a node can simultaneously be clustered already while possibly becoming a pivot later. I may have missed something, but I'm not sure how to reconcile this apparent contradiction.
Thanks for pointing out the confusion regarding Algorithm 1; we will make sure to clarify this in the next version of the paper. To clarify: such vertices will be allowed to be picked as pivots later on, but even if they start clusters, they won't themselves be added to those clusters and will instead be put in singleton clusters. This makes the algorithm easier to implement in the dynamic setting and elsewhere, because its output can be viewed as the output of the PIVOT algorithm that is then post-processed and locally improved. We will clarify this further in the text of the paper in its next version.
In our experiments, we tested at most 8 choices for each of epsilon and delta in all the runs, which adds up to at most 64 combinations. We will clarify this in the next version of the paper.
This paper studies the classic correlation clustering problem, where the objective is to partition objects into clusters while minimizing disagreements with given similarity and dissimilarity labels. The PIVOT algorithm by Ailon et al. (STOC’05) provides a 3-approximation for this problem, but its analysis is tight, and improving this bound has remained an open challenge.
The authors introduce MODIFIEDPIVOT, an extension of PIVOT that locally adjusts the clustering by moving vertices to different or new clusters. Their theoretical analysis proves that MODIFIEDPIVOT achieves an approximation ratio of 3 − Ω(1), improving prior results in dynamic settings. In particular, they show that in a fully dynamic environment, the algorithm maintains this improved approximation while handling updates in polylogarithmic time per operation (Theorem 1.1).
Additionally, the authors implement MODIFIEDPIVOT and evaluate it on real-world datasets, demonstrating that it makes less than 77% of the mistakes made by PIVOT on average, further validating its practical effectiveness.
Questions for Authors
Q1) Could the authors clarify the time complexity overhead introduced by MODIFIEDPIVOT (in non-dynamic settings) compared to the standard PIVOT? Given that the improvement in the approximation factor (see W1) is relatively small, is the additional computational cost justified? A more detailed discussion of this trade-off would be appreciated.
Claims and Evidence
The theoretical claims are supported by proofs while the benefit of the proposed algorithm over the standard pivot is supported by the conducted experiments.
Methods and Evaluation Criteria
Yes, both the proposed method and evaluation criteria make sense for the problem at hand.
Theoretical Claims
Yes, they seem correct, although I did not check in detail.
Experimental Design and Analysis
No issue found.
Supplementary Material
Yes, I reviewed all of it.
Relation to Broader Literature
The main contributions of this paper are closely tied to the classic PIVOT algorithm for correlation clustering, introduced by Ailon et al. (STOC’05), which is known to achieve a 3-approximation for the problem. The analysis of PIVOT is traditionally conducted through a charging scheme based on analyzing “bad triangles.”
In this paper, the authors introduce MODIFIEDPIVOT, an algorithm that improves upon the original PIVOT by achieving an approximation ratio of 3 − ε₀ for some absolute constant ε₀ > 0. This result is obtained through a novel charging scheme. MODIFIEDPIVOT provides a better-than-3 approximation in the dynamic setting with polylogarithmic time per update, improving the 3-approximation previously achieved by Behnezhad et al. (FOCS’19) and Dalirrooyfard et al. (ICML’24).
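For readers unfamiliar with the technique: a "bad triangle" is a triple of vertices spanning exactly two '+' edges and one '−' edge. Any clustering must make at least one mistake on each bad triangle, so a (fractional) packing of bad triangles lower-bounds OPT, and a charging scheme distributes the algorithm's mistakes over such triangles. A small illustrative sketch of the enumeration step (the names and encoding are my own, not from the paper):

```python
from itertools import combinations

def bad_triangles(nodes, pos_edges):
    """Return all vertex triples with exactly two '+' edges and one '-' edge."""
    pos = {frozenset(e) for e in pos_edges}
    tris = []
    for t in combinations(nodes, 3):
        plus = sum(frozenset(p) in pos for p in combinations(t, 2))
        if plus == 2:                 # two '+' edges, one '-' edge: bad
            tris.append(t)
    return tris
```

The '+' path 1-2-3 is the smallest bad triangle: whichever way the triple is clustered, at least one of its three pairs is a disagreement.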
Essential References Not Discussed
Since the proposed MODIFIEDPIVOT algorithm relies on relocating certain nodes to new or alternative clusters compared to the standard PIVOT, it would be valuable for the authors to discuss related works (e.g., [1, 2]) that have applied local search optimization to the correlation clustering objective, starting from the solution provided by PIVOT. These studies have demonstrated the effectiveness of such an approach, aligning with the findings of this paper. While the authors provide a thorough characterization of cases where PIVOT fails, incorporating a discussion of related works would further strengthen the contextualization and significance of the proposed method.
[1] An Efficient Local Search Algorithm for Correlation Clustering on Large Graphs. COCOA 2023.
[2] In and out: Optimizing overall interaction in probabilistic graphs under clustering constraints. KDD 2020.
Other Strengths and Weaknesses
Strengths:
S1) The paper effectively demonstrates, through examples and figures, where the standard PIVOT algorithm fails, which helps in clarifying the understanding of Algorithm 1.
S2) The paper addresses an open problem in the literature: whether it is possible to maintain a 3 − Ω(1) approximation of correlation clustering in polylogarithmic time per update.
S3) The charging scheme used to analyze the approximation factor of the proposed algorithm introduces some novel elements compared to the standard charging scheme applied to analyze the PIVOT algorithm.
S4) The experiments empirically show that the MODIFIEDPIVOT algorithm leads to improvements in all cases across the datasets considered.
Weaknesses:
W1) The approximation factor of MODIFIEDPIVOT (2.997) is very close to that of the standard PIVOT (3). While there is an improvement, it appears to be quite marginal.
W2) The time complexity overhead introduced by MODIFIEDPIVOT (in non-dynamic settings) compared to the standard PIVOT is not sufficiently addressed in the paper. Given that the improvement in the approximation factor (see W1) is relatively small, the additional computational cost may not be justified. A more detailed discussion of this trade-off would be valuable.
W3) The description of the charging scheme algorithm (Algorithm 2) is not clear in the main paper. It would be beneficial to either move the pseudocode and the detailed explanation to the supplementary material, leaving only the intuition of the scheme in the main text, or provide a clearer and more detailed description within the main paper.
W4) There is a need for a baseline that uses local search starting from the solution of PIVOT. If such a local search algorithm accepts only improvements, it would trivially preserve the 3-approximation of PIVOT, allowing for a clearer comparison and understanding of the potential benefits of the proposed modifications.
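The baseline suggested in W4 can be sketched generically as follows. This is a plain hill-climbing illustration under my own design choices (single-vertex moves, strict improvement only), not the algorithm of [1] or [2]; because no move ever increases the objective, starting it from PIVOT's output preserves the 3-approximation.

```python
def local_search(nodes, pos_edges, cluster_of):
    """Greedy single-vertex moves on the disagreement objective,
    accepting only strict improvements."""
    pos = {frozenset(e) for e in pos_edges}
    cluster_of = dict(cluster_of)          # don't mutate the caller's dict

    def delta(v, target):
        """Cost change if v moves to cluster `target`."""
        d = 0
        for u in nodes:
            if u == v:
                continue
            positive = frozenset((u, v)) in pos
            before = cluster_of[u] == cluster_of[v]
            after = cluster_of[u] == target
            if positive:                   # '+' edge costs 1 when cut
                d += (0 if after else 1) - (0 if before else 1)
            else:                          # '-' edge costs 1 when internal
                d += (1 if after else 0) - (1 if before else 0)
        return d

    improved = True
    while improved:
        improved = False
        for v in nodes:
            fresh = max(cluster_of.values()) + 1     # a brand-new singleton
            candidates = set(cluster_of.values()) | {fresh}
            best = min(candidates, key=lambda t: delta(v, t))
            if delta(v, best) < 0:                   # strict improvement only
                cluster_of[v] = best
                improved = True
    return cluster_of
```

For instance, starting from the all-in-one clustering of a graph whose only '+' edge is {1, 2}, the search peels vertex 3 off into its own cluster, dropping the cost from 2 to 0.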
Other Comments or Suggestions
The explanation of the PIVOT algorithm before “Problem 1” repeats information already provided in the introduction, leading to some redundancy. It may be beneficial to streamline this section to avoid unnecessary repetition.
On page 2: “The figure below illustrates this. On the top, we have the optimal clustering. On the bottom, we have the output of PIVOT.”
On page 3, some edges in the first figure appear to be obscured by the blue shape representing the cluster. Clarifying or adjusting the visualization could improve readability and ensure all edges are clearly visible.
We thank the reviewer for the thoughtful comments. We note that while the improvement from 3 to 2.997 in the approximation ratio is indeed rather small quantitatively, it breaks a longstanding barrier of 3-approximation for combinatorial algorithms and has important qualitative value. Additionally, our experiments show that the empirical improvement in the approximation ratio is much more drastic than this theoretical guarantee.
It is also worth pointing out that to break the 3-approximation analysis, our analysis deviates significantly from that of the original pivot algorithm and has to incorporate several new ingredients such as charging non-local triangles and fractional charges. Our hope is that these techniques prove useful in the future for providing analyses that go much below 3 or 2.997 approximations.
The time complexities of our Modified Pivot algorithm and the Pivot algorithm are the same once the parameters of the algorithm are fixed. We'll make sure to clarify this in the next version of the paper.
Thank you for the clarifications. Emphasizing that the proposed method has the same complexity as the original PIVOT is valuable, even though it can be inferred from the paper.
The fact that the experiments show an improvement in the approximation ratio beyond the theoretical guarantee is not surprising, as the proposed algorithm allows for relocating certain nodes to new or alternative clusters compared to the standard PIVOT. This approach has already been shown to be beneficial in related works (see essential references not discussed), which have applied local search optimization to the correlation clustering objective starting from the solution provided by PIVOT. I believe referencing these works would further strengthen the contextualization of the proposed method.
I agree with the authors that the new techniques introduced for the theoretical analysis of the proposed algorithm could be valuable in future studies to improve the theoretical understanding of related algorithms. After also reading the other reviews, I have decided to raise my score to a weak accept.
We appreciate this. We’ll make sure to include more references about local search methods in the next version of the paper.
This submission presents an improved version of the classical pivot algorithm for correlation clustering. While the original version of this algorithm achieves a 3-approximation, the improved version achieves an approximation ratio of 2.997. While this seems to be a minor improvement, it breaks a long-standing barrier and is certainly a very important contribution. Also, the experimental evaluation is valuable, even though the reviewers have some comments on how to improve it. Overall, I consider this to be a strong contribution to ICML.