Metric Distortion Under Probabilistic Voting
We study the problem of metric distortion of popular voting rules under probabilistic voting.
Abstract
Reviews and Discussion
Metric distortion is a framework to evaluate the "accuracy" of social choice rules, by considering a worst-case candidate and voter embedding in a metric space, and by assuming that reported votes are derived from distances in the metric space. So far, votes were assumed to be a deterministic function of the distances. The paper investigates the case where they are probabilistic functions of the distances, in the asymptotic limit of a large number of voters. The key finding is that this inverts the evaluation of some voting rules, in particular Copeland and Random Dictatorship, for highly noisy voting.
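For context, the usual deterministic formulation being generalized here is the following (SC denotes social cost; this is the standard definition from the distortion literature, not the paper's probabilistic variant):

```latex
% Deterministic metric distortion of a voting rule f: the worst case, over preference
% profiles sigma and over metrics d consistent with sigma (written d |> sigma), of the
% ratio between the social cost of the rule's winner and that of the optimal candidate.
\[
\operatorname{dist}(f) \;=\; \sup_{\sigma}\; \sup_{d \,\triangleright\, \sigma}\;
\frac{\operatorname{SC}\big(f(\sigma), d\big)}{\min_{x} \operatorname{SC}(x, d)},
\qquad \operatorname{SC}(x, d) \;=\; \sum_{v} d(v, x).
\]
```

The paper replaces the fixed, distance-consistent profile with votes drawn from distance-dependent distributions and studies the resulting distortion as the number of voters grows.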
Strengths
The model introduced by the authors is an insightful generalization of previous work which, remarkably, provides a markedly different view on social choice rules. Given the growing importance of social choice in machine learning, as well as accounting for noisy inputs while considering embedded vector spaces, I believe that this work scores high in significance.
Additionally, the analysis is quite thorough, with matching asymptotic lower/upper bounds for Plurality, an upper bound for Copeland, and lower/upper bounds for Random Dictatorship.
The paper is also fairly well written.
Weaknesses
My main concern is Lemma 3. The proof seems to argue that the constraints (i.e. inequality constraint on ), but the optimization problem (7) has a constraint . It is not clear to me why these constraints would be equivalent. Note that the latter implies the former set of constraints. Thus if was defined with all voter-wise constraints, then it would be a minimum over a smaller set, and thus . Since Lemma 3's proof actually says , using this inequality seems to imply the actual Lemma 3. Am I reading this correctly?
It is disappointing that the upper bound for Copeland is not tight. It would be helpful if the authors could point out where the argument gets loose.
Questions
Could the authors clarify the proof of Lemma 3? I will happily increase my score given a convincing response.
Can the authors provide insight into why the Copeland upper bound is not tight?
More anecdotally, I wonder why "Metric Distortion" is restricted to the distortion of ranking-based algorithms. I feel that the concept is more general than that. Essentially, it is the worst social cost ratio of any voting rule, when executed on votes derived from distances in the metric space, right? In particular, it could be studied even for the social-cost-minimizing geometric median, but given, e.g., noisy inputs or strategic votes. I believe that this is an interesting research avenue, given that machine learning provides many tools to embed candidate features and voters' preferences in a vector space.
Limitations
The paper stated its results very clearly and factually. I have no concerns about unaddressed limitations.
Thank you very much for your detailed and constructive feedback.
Clarification on the proof of Lemma 3
Indeed, as you note, just using the constraint will not be sufficient to obtain the required constraint in problem (7). However, we additionally have that for all voters and for all voters . Here is the distance between candidates W and B. We use this in Line 217 of the paper. This now implies that and , leading to the conclusion that . This is what we use in the optimization problem (7). Now, the problem in (7) is finding the minimum over a larger set than that implied by using all voter-wise constraints and . Therefore, the result holds. We hope this addresses your concerns about the proof of Lemma 3. We acknowledge that the proof in the paper is a bit too brief; we will add further explanation for clarity.
Intuition for why the Copeland bound is not tight
In the deterministic case, we first provide a straightforward analysis establishing a distortion bound of 3×3=9 for the Copeland Rule. Consider any Copeland winner W; it must belong to an uncovered set [1,Theorem 15]. Let X be the socially optimal candidate. Here, either (a) W defeats X in a pairwise contest, or (b) there exists a candidate Y such that W defeats Y in a pairwise contest, and Y defeats X in their pairwise contest.
According to [1], for any two candidates, A and B, where A defeats B in a pairwise contest, the ratio of their social costs is bounded by 3. Consequently, the distortion of the Copeland winner is bounded by 3×3=9, which comes from case (b) above. Our analysis of the Copeland Rule in the context of probabilistic voting follows a similar approach by multiplying the ratios of social costs, as outlined in Line 240.
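Spelled out, the chain of inequalities behind the 3×3 = 9 figure in case (b) is simply (writing SC for social cost and using the factor-3 pairwise bound from [1]):

```latex
% W defeats Y pairwise and Y defeats X pairwise; each pairwise victory costs at most
% a factor of 3 in social cost, so the two factors multiply.
\[
\operatorname{SC}(W) \;\le\; 3\,\operatorname{SC}(Y) \;\le\; 3 \cdot 3\,\operatorname{SC}(X) \;=\; 9\,\operatorname{SC}(X),
\qquad \text{while case (a) directly gives } \operatorname{SC}(W) \le 3\,\operatorname{SC}(X).
\]
```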
In the deterministic case, a more nuanced analysis of the Copeland Rule, which considers more details of the problem's geometric structure, can achieve a distortion of 5. The proof is in [1]. However, we lack a comparable technique that works within our probabilistic voting framework, so we have a looser analysis. Enhancing this analysis for the Copeland Rule is an intriguing direction for future research.
Regarding using the distortion framework beyond ranking-based voting rules
Thank you for mentioning this exciting idea. Indeed, distortion is a domain-specific name for the classic worst-case analysis in theoretical CS. The study of ranking-based rules is motivated by the fact that real-world elections often use ordinal preferences as opposed to eliciting cardinal utilities. Given noisy or adversarial votes, studying the worst-case performance of the geometric median rule would be a very interesting question.
References
[1] Elliot Anshelevich, Onkar Bhardwaj, and John Postl. Approximating optimal social choice under metric preferences. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, pages 777–783, 2015.
Thank you to the authors for their helpful answer. I upgraded my rating to 7 and my confidence to 4, as my main concern was fully addressed.
This paper extends the framework of metric distortion, measuring how well voting rules minimize the social cost in a given metric space, to probabilistic voting scenarios where the preferences of voters are drawn from a probability distribution defined by the relative distances between candidates and each voter's ground truth position in the metric space.
The authors base their analysis on three different axioms that the induced marginal probabilities of relative preferences must verify, namely scale-freeness, independence of other candidates and strict monotonicity. They define a general class of marginal probabilities that verify the three axioms, and show that it encompasses the widely used Plackett-Luce model.
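For intuition, one concrete family of pairwise marginals satisfying all three axioms (my own illustrative example; the paper's exact parameterization may differ) is

```latex
% Pairwise marginal induced by Plackett-Luce weights w_i = d(v, i)^{-\theta} with \theta > 0:
% rescaling all distances leaves it unchanged (scale-freeness), only d(v,A) and d(v,B)
% appear (independence of other candidates), and it strictly decreases in d(v,A)
% (strict monotonicity).
\[
\Pr\big[A \succ_v B\big] \;=\; \frac{d(v, B)^{\theta}}{d(v, A)^{\theta} + d(v, B)^{\theta}}.
\]
```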
They then provide upper and lower bounds for the distortion of the Plurality rule, both linear in the number of candidates and matching asymptotically when the number of voters grows to infinity.
They also provide an upper bound for the distortion of the Copeland rule and show that it is independent of the number of candidates in the limit of a large number of voters.
Moreover, they give upper and (non-matching) lower bounds for the distortion of Random Dictator.
They finally compare their results under both the Plackett-Luce and Pairwise Quantal Voting models (the latter being inspired by Quantal Response Theory), and show that the classical bounds of the metric distortion literature are recovered in the limit of vanishing randomness (although not for the Copeland rule, hinting at a loose analysis).
Strengths
This paper, in extending the metric distortion analysis to probabilistic voting models that are perhaps closer to reality, is rather original and proposes a more optimistic view of metric distortion in which Random Dictator is beatable in a worst-case distortion sense. The paper is fairly well written and the proofs seem correct.
It is an interesting idea, with an interesting result.
Weaknesses
The main problem of this paper is that it is not a good match to NeurIPS. This paper would work very well at an algorithms conference like SODA or ICALP, or a CS econ conference like EC (okay, probably one tier down like WINE), or possibly even at those AI conferences that have a history in social choice theory, like AAAI or IJCAI. I also understand that NeurIPS has accepted such papers in the past. However, is it really a good fit for NeurIPS 2024?
- There is no mention of the proof of Lemma 1 being in Appendix A.
- In the proof of Theorem 2, is hardly introduced (also not in the Appendix).
- Formatting may be improved in places: e.g., Equations 6, 7, 10, or Theorem 3.
- l.312 "converges to 9 instead of 5". This part is not very clear, reminding the general bound in the deterministic case would improve readability.
- Typos:
- l.220 "and is by solving".
- Equation 24 showcases instead of .
- Equation 34: should be .
- l.618 "LEt".
- l.641 should probably be deleted (equivalent to l.642).
- l.660 weird grammar.
- Footnote 6: missing index .
- l.670: missing in inequalities and Furthermore, Equation 63 is used in rather than in .
Questions
n/a
Limitations
- The existence of distributions on rankings that generate pairwise order marginals of the form described in the paper is assumed and left for future work.
Thank you for your review. We have tried to address your concern regarding the relevance of our paper to the NeurIPS community in the overall response; we will mention it again here.
As Reviewer 2kBY observed, the increasing significance of social choice in machine learning and the importance of handling noisy inputs within embedded vector spaces underscore the timeliness and relevance of our work to the machine-learning community. Preference models, such as the Plackett-Luce model [1], have been instrumental in AI alignment, especially through techniques like Reinforcement Learning from Human Feedback (RLHF) [2,3,4,5] and the more recent Direct Preference Optimization (DPO) [6]. These methods incorporate user preferences into AI models, and voting rules play a critical role in aggregating these preferences. Investigating how these rules perform in the distortion framework under probabilistic voting scenarios can provide valuable insights into refining AI alignment processes.
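Concretely, reward-model training in RLHF [2,3] fits a Bradley-Terry style model to pairwise human preferences; in its standard form (shown here only for illustration, with σ the logistic function and r the learned reward model),

```latex
% Bradley-Terry preference probability commonly used for RLHF reward modelling.
\[
\Pr\big[y_1 \succ y_2 \mid x\big] \;=\; \sigma\!\big(r(x, y_1) - r(x, y_2)\big),
\]
```

so aggregating many such noisy pairwise preferences is close in spirit to the probabilistic vote aggregation analyzed in our framework.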
Moreover, the relevance of our work is further supported by prior NeurIPS publications that have explored related topics, such as random utility models [7] and the smoothed analysis of social choice rules [8]. These connections affirm the alignment of our research with the themes and interests of the NeurIPS community.
References
[1] Robin L Plackett. The analysis of permutations. Journal of the Royal Statistical Society Series C: Applied Statistics, 24(2):193–202, 1975.
[2] Paul F Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. Deep reinforcement learning from human preferences. In Advances in Neural Information Processing Systems, volume 30, 2017.
[3] Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. Training language models to follow instructions with human feedback. arXiv preprint arXiv:2203.02155, 2022.
[4] Daniel M Ziegler, Nisan Stiennon, Jeffrey Wu, Tom B Brown, Alec Radford, Dario Amodei, and Paul F Christiano. Fine-tuning language models from human preferences. arXiv preprint arXiv:1909.08593, 2019.
[5] Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel M Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, and Paul Christiano. Learning to summarize with human feedback. In Advances in Neural Information Processing Systems, volume 33, pages 3008–3021, 2020.
[6] Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. arXiv preprint arXiv:2305.18290, 2023.
[7] Hossein Azari Soufiani, David C. Parkes, and Lirong Xia. Random utility theory for social choice. Advances in Neural Information Processing Systems, 25, 2012.
[8] Lirong Xia. The semi-random satisfaction of voting axioms. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan, editors, Advances in Neural Information Processing Systems, volume 34, 2021.
The paper studies the problem of metric distortion in single-winner elections. The key assumption is that the voters' preferences are not exactly compatible with the metric space, but rather agree with it with a certain probability. The authors propose several axioms that formalize the requirements for the probability distribution for it to make sense in the context of distortion. Then they provide upper bounds on distortion in the probabilistic setting for Plurality, Random Dictator, and Copeland (for the first two rules, they also provide lower bounds).
Strengths
This work is the first one to combine probabilistic voting with metric distortion, hence the novelty is clear. The paper is overall of good quality, the axioms proposed in their work make sense to me, and the results are sound. The research direction introduced in this paper can be continued in further follow-up papers.
Weaknesses
The paper could have been more clearly written --- for example, the formal notation should be introduced at the beginning of Section 2 (together with the model) rather than in the middle of the introduction.
Besides, I think that Axiom 2 (Independence of Other Candidates) could have been better motivated. I can imagine that it was crucial to obtain the authors' results, but it seems rather natural to me that in the real-life scenarios that motivated the research, the presence of additional candidates can impact the voter's probability of ranking one candidate over another.
Another weakness is that the authors only consider three rules, and the analysis of only two of them is complete --- many important rules (like Borda or PluralityVeto) are not considered at all. This raises the question of whether this amount of technical contribution is enough for a top conference like NeurIPS.
Questions
Limitations
The authors adequately addressed the limitations, and there is no potential negative societal impact of their work.
Thank you for your feedback. We agree that the presentation could be improved, and we are happy to incorporate the suggestions in the final version.
Regarding the motivation behind Axiom 2
We acknowledge that Axiom 2 (Independence of Other Candidates) may not always hold in certain real-life scenarios, where the presence of additional candidates could indeed influence a voter's probability of ranking one candidate over another. However, this assumption is well-established and widely accepted in the field of social choice theory [4,5,6,7,8]. Additionally, models such as Bradley-Terry and Plackett-Luce [9] also adhere to this axiom and have demonstrated significant applicability in machine learning and AI alignment contexts [10,11,1,2,3]. This widespread acceptance and practical utility underscore the relevance of Axiom 2 in our work.
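To illustrate why the Plackett-Luce model satisfies Axiom 2, here is a small Monte Carlo check (Python; the candidate names and weights below are purely hypothetical, e.g. the weights could be any decreasing function of a voter's distance to each candidate). Under Plackett-Luce, the marginal probability that A is ranked above B equals w_A / (w_A + w_B), independent of the remaining candidates' weights:

```python
import random

def sample_pl_ranking(weights, rng):
    """Sample one full ranking from a Plackett-Luce model: repeatedly choose the
    next-ranked candidate with probability proportional to its weight."""
    remaining = list(weights)
    ranking = []
    while remaining:
        total = sum(weights[c] for c in remaining)
        r = rng.random() * total
        acc = 0.0
        for c in remaining:
            acc += weights[c]
            if r <= acc:
                ranking.append(c)
                remaining.remove(c)
                break
    return ranking

def pairwise_marginal(weights, a, b, n_samples=100_000, seed=0):
    """Estimate P(a is ranked above b) by sampling full Plackett-Luce rankings."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_samples):
        ranking = sample_pl_ranking(weights, rng)
        if ranking.index(a) < ranking.index(b):
            wins += 1
    return wins / n_samples

# Hypothetical weights for one voter; changing C's or D's weight leaves P(A above B) unchanged.
w = {"A": 2.0, "B": 1.0, "C": 5.0, "D": 0.3}
print(pairwise_marginal(w, "A", "B"))   # Monte Carlo estimate, close to 0.667
print(w["A"] / (w["A"] + w["B"]))       # closed form 2/3: depends only on A's and B's weights
```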
Regarding the number of voting rules analyzed
While we understand the reviewer's concern about the limited number of rules presented, the core contribution of our paper is the introduction of a novel model that combines the distortion framework with the concept of probabilistic voting. The three rules we analyze are sufficiently distinct and incorporate innovative mathematical techniques, making them valuable contributions in their own right.
Extending our analysis to other rules, such as Borda and Plurality Veto, would require the development of new techniques that are likely to differ significantly from those presented here. Therefore, we have chosen to leave this as an area for future research.
References
[1] Paul F. Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. Deep reinforcement learning from human preferences. In Advances in Neural Information Processing Systems, volume 30, 2017.
[2] Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. Training language models to follow instructions with human feedback. arXiv preprint arXiv:2203.02155, 2022.
[3] Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. arXiv preprint arXiv:2305.18290, 2023.
[4] Kenneth J. Arrow. Social Choice and Individual Values. John Wiley and Sons, New York, 1951.
[5] Amartya Sen. Collective Choice and Social Welfare. Holden-Day, San Francisco, 1970.
[6] Allan Gibbard. Manipulation of voting schemes: A general result. Econometrica, 41(4):587–601, 1973.
[7] Mark Allen Satterthwaite. Strategy-proofness and Arrow's conditions: Existence and correspondence theorems for voting procedures and social welfare functions. Journal of Economic Theory, 10(2):187–217, 1975.
[8] Donald G. Saari. The symmetry and complexity of the voting problem. Journal of Economic Perspectives, 9(1):93–105, 1995.
[9] Ralph Allan Bradley and Milton E. Terry. Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika, 39(3/4):324–345, 1952.
[10] David R. Hunter. MM algorithms for generalized Bradley–Terry models. The Annals of Statistics, 32(1):384–406, 2004.
[11] Tzu-Kuo Huang, Ruby C. Weng, and Chih-Jen Lin. Generalized Bradley–Terry models and multi-class probability estimates. In Proceedings of the 23rd International Conference on Machine Learning, pages 425–432. ACM, 2006.
Thank you for the answer. I was convinced by your arguments in favor of Axiom 2, and I decided to increase my score to 6.
This paper considers metric distortion in probabilistic models of voting. In the metric distortion framework the voters and alternatives are embedded in a metric space, and given the ranked preferences the goal is to find an alternative with low distortion. In this setting these rankings come from a probabilistic model.
In the first part of the paper, three axioms are introduced and the authors show which axiom is satisfied by which probabilistic model. In the second half of the paper, the goal is to find the distortion of the Plurality and Copeland rules for a specific class of probabilistic models. The results show matching upper and lower bounds for Plurality and a constant upper bound for Copeland.
Strengths
Defining a model for distortion in stochastic models is a useful idea in future analysis of voting systems. The idea of the paper is novel and it uses novel techniques in the second half. I like the definition of the three axioms. I find them natural and easy to understand.
Weaknesses
My main concern is about the presentation. The preliminaries section is incomplete. The definition of distortion is hard to understand for a general audience and you made it harder by just putting the formula there. You have to add a description in words and give some intuition on why this definition makes sense.
It's not clear how the probabilistic model works and how you define distortion on it until section 3 where you define it for a specific class. You have to formally define your probabilistic approach in Section 1.1 and also define distortion in this model. Not knowing the exact definition makes following the first paragraph of section 2 really hard. Before reading the rest of the paper I didn't understand why is a function of or why the preferences may not be consistent with the distances.
I think you have to add more intuition on the probabilistic models. For instance, you mention a ground truth in the definition of the Mallows model, but you have to explain in more detail how this model distributes the probabilities based on the distance to this ground truth. The same explanations are needed for PQV.
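For concreteness, the standard Mallows parameterization I have in mind is

```latex
% Mallows model: the probability of a ranking sigma decays geometrically in its
% Kendall-tau distance from the ground-truth ranking sigma_0, with dispersion phi in (0, 1].
\[
\Pr[\sigma] \;\propto\; \phi^{\, d_{\mathrm{KT}}(\sigma,\, \sigma_0)},
\]
```

and spelling out explicitly how the paper's distance-based probabilities relate to (or differ from) this would help.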
My understanding is that the analysis that you provide works for any member of , but currently the only members for which we have the final bound are PL and PQV. Is that true? If so, is there any other interesting member of this class?
Questions
See above.
Limitations
Thank you for your feedback on our paper. We acknowledge that there is scope for improvement in the presentation of the paper, and we will rearrange the order in which some concepts are introduced. We also agree with the reviewer that our explanations of the probabilistic models could have been more detailed. We are happy to take these suggestions into account while creating the final version.
We want to clarify that all our results apply to the entire class . Plackett-Luce (PL) and Pairwise Quantal Voting (PQV) are highlighted in our paper because they are well-known in the social choice and machine learning (AI alignment) literature, which is why we provided precise numerical values for these specific cases. However, the structural results on the upper and lower bounds are general and applicable to any member of the class .
We thank the reviewers for their invaluable feedback. We address a common concern below and respond to specific questions separately to each reviewer.
Regarding relevance to NeurIPS
As Reviewer 2kBY observed, the increasing significance of social choice in machine learning and the importance of handling noisy inputs within embedded vector spaces underscore the timeliness and relevance of our work to the machine-learning community. Preference models, such as the Plackett-Luce model [1], have been instrumental in AI alignment, especially through techniques like Reinforcement Learning from Human Feedback (RLHF) [2,3,4,5] and the more recent Direct Preference Optimization (DPO) [6]. These methods incorporate user preferences into AI models, and voting rules play a critical role in aggregating these preferences. Investigating how these rules perform in the distortion framework under probabilistic voting scenarios can provide valuable insights into refining AI alignment processes.
Moreover, the relevance of our work is further supported by prior NeurIPS publications that have explored related topics, such as random utility models [7] and the smoothed analysis of social choice rules [8]. These connections affirm the alignment of our research with the themes and interests of the NeurIPS community.
References
[1] Robin L Plackett. The analysis of permutations. Journal of the Royal Statistical Society Series C: Applied Statistics, 24(2):193–202, 1975.
[2] Paul F Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. Deep reinforcement learning from human preferences. In Advances in Neural Information Processing Systems, volume 30, 2017.
[3] Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. Training language models to follow instructions with human feedback. arXiv preprint arXiv:2203.02155, 2022.
[4] Daniel M Ziegler, Nisan Stiennon, Jeffrey Wu, Tom B Brown, Alec Radford, Dario Amodei, and Paul F Christiano. Fine-tuning language models from human preferences. arXiv preprint arXiv:1909.08593, 2019.
[5] Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel M Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, and Paul Christiano. Learning to summarize with human feedback. In Advances in Neural Information Processing Systems, volume 33, pages 3008–3021, 2020.
[6] Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. arXiv preprint arXiv:2305.18290, 2023.
[7] Hossein Azari Soufiani, David C. Parkes, and Lirong Xia. Random utility theory for social choice. Advances in Neural Information Processing Systems, 25, 2012.
[8] Lirong Xia. The semi-random satisfaction of voting axioms. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan, editors, Advances in Neural Information Processing Systems, volume 34, 2021.
The paper has some clear merits, but there is also a number of issues:
(1) The paper is not really a good fit for NeurIPS. The authors argue why social choice is relevant for ML (and parts of it surely are) and that some previous papers on distortion were published at NeurIPS. But what they should be arguing is why their paper is relevant to NeurIPS/machine learning, etc. I have not seen such arguments in their responses.
(2) The authors study Plurality, Copeland, and random dictator. On the one hand, this includes examples of rules from several important families (scoring rules, tournament-based rules, randomized rules), but on the other hand, feels insufficient.
(3) It is not clear what the value of the results is. Are we supposed to use Plurality or Copeland after all? What did we really learn? How does this paper affect anyone who is not studying distortion themselves? There should be clear answers to such questions in the paper, but I did not really find them.