PaperHub
Overall rating: 5.5/10 — Rejected (4 reviewers; min 5, max 6, std 0.5)
Individual ratings: 6, 5, 6, 5
Confidence: 2.5 · Correctness: 2.5 · Contribution: 2.3 · Presentation: 2.3
ICLR 2025

Machine Unlearning for Contrastive Learning under Auditing

OpenReview · PDF
Submitted: 2024-09-26 · Updated: 2025-02-05
TL;DR

We propose an unlearning framework for contrastive learning.

Abstract

Keywords
Machine unlearning · Contrastive learning

Reviews and Discussion

Official Review
Rating: 6

This work fills a gap in the machine unlearning literature, specifically for contrastive learning pre-training, by proposing a new unlearning method called Alignment Calibration. A new white-box auditing metric is also proposed to ease the auditing process for users who request unlearning.

Strengths

Overall,

  1. The paper is well-written and well-presented.
  2. The setup is clean and clear. Figure 1 helps a lot.
  3. The experiments are thorough and well-organized.

Weaknesses

The main weaknesses concern the motivation behind the design of Alignment Calibration:

  1. The motivation of positive alignment calibration: in the paper, the motivation is to achieve a good FS score. However, as mentioned in Section 3.3, FS itself is not a reliable metric, which raises the question of why we should care about FS at all. With this motivation, the novelty does not seem principled to me.
  2. The design of Alignment Calibration (Equation (5)) is quite heuristic. Moreover, it seems that parts of the design (including both negative and positive AC) directly optimize the proposed auditing metric (e.g., AGM), which seems a bit too ad-hoc and in the proposed method's favor.

I'm unfamiliar with the contrastive learning literature, so perhaps there are theoretical results that support such a heuristic way of designing the loss.

Additionally: 3. The claim is a bit too broad. Specifically, the proposed new auditing tool is quite ad-hoc, and I do not find it visually intuitive. Moreover, if I understand correctly, it is essentially a pair-wise version of FS, which, as mentioned in Section 3.3, is unreliable.

Questions

See Weaknesses. Additionally:

  1. Are there any theoretical results in this field that support the design of Alignment Calibration?

Some minor writing suggestions:

  1. Use $\ell_1$ instead of $l_1$.
  2. Line 215, replace \citep{} with \citet{} for "in evaluating data attribution in ...".
  3. Line 250, the word "shadow models" for MIA appears without explanation. It might cause some confusion for someone who is unfamiliar with MIA.
  4. Line 251, it's unclear what the "lock" means in this context.
  5. Line 255, maybe something is wrong with the LaTeX environment? "[Exact unlearning...]" should not be indented, I suppose.
  6. Line 290, the unlearned model should be $\hat{g}$.
  7. Line 814, maybe a typo in "...pair of features, i.e., 45 pairs..."?
  8. Throughout the paper, the style of "i.e." alternates between "i.e." and "i.e.,".
Comment

We thank the reviewer for the detailed and informative review. Here we address your concerns:

[W1(a): Motivation of positive alignment calibration]:

We clarify that FS may not be a reliable metric for black-box auditing, particularly when evaluating small unlearning sets (e.g., for individual data removal requests) due to high variance, as shown in Section 3.4. However, FS remains valuable as a white-box evaluation metric when analyzing the effectiveness of unlearning algorithms.

[W1(b): Optimization is in the proposed method's favor]:

We would like to clarify that, when the model owner chooses an optimal unlearning algorithm, the auditing tool is not used as a white-box evaluation. Thus, our optimization is not in the proposed method's favor.

[W2: Theoretical results supporting our design]:

  • Our method's design is fundamentally motivated by the theoretical analysis of contrastive learning presented in [1], which decomposes the loss into two key components:

  • As shown in Equation 3 of our paper, the contrastive learning objective consists of: (1) maximizing alignment between positive samples; and (2) ensuring uniformity of normalized features on the hypersphere.

  • Our method systematically addresses both components for unlearning:

    (1) Negative alignment calibration: Reduces uniformity within the unlearn set, directly impacting unlearn performance (UA) as visualized in AM

    (2) Positive alignment calibration: Weakens alignment between positive pairs, affecting UA as demonstrated in AGM

    (3) Performance preserving term: Maintains the overall distributional uniformity on the hypersphere, as validated in Section 5.5

[1] Wang and Isola, Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere. ICML 2020
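To make this decomposition concrete, the two components from [1] can be written as a minimal PyTorch-style sketch (the function names are ours, for illustration only):

```python
import torch

def alignment_loss(z1, z2, alpha=2):
    # z1, z2: L2-normalized features of the two augmented views of each sample.
    # Lower values mean positive pairs are better aligned.
    return (z1 - z2).norm(dim=1).pow(alpha).mean()

def uniformity_loss(z, t=2):
    # Log of the average pairwise Gaussian potential across all features.
    # Lower values mean features are spread more uniformly on the hypersphere.
    return torch.pdist(z, p=2).pow(2).mul(-t).exp().mean().log()
```

Our unlearning objective perturbs these two quantities selectively on the unlearn set while preserving them on the retained data.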

[W3: AM is a pair-wise version of FS]:

  • We would like to clarify that AM and AGM are not simply decomposed versions of the FS score, as they additionally measure similarity between negative pairs within the unlearn set. These metrics were motivated by the need to provide data owners with comprehensive visualization tools for understanding the unlearning effect.
  • AM and AGM correlate well with other unlearning metrics to help the data owner understand the unlearning effect. Specifically, the diagonal elements demonstrate reduced alignment of positive pairs, while the off-diagonal elements show reduced uniformity of the unlearn set. Together, these contribute to the reduced Unlearn Accuracy (UA), a white-box evaluation provided by the model owner.
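As a minimal sketch of how a data owner could compute such matrices from embeddings alone (our illustration; the exact definitions are in the paper, and the before/after subtraction for AGM is our reading):

```python
import torch
import torch.nn.functional as F

def alignment_matrix(encoder, views1, views2):
    # AM[i, j]: cosine similarity between view 1 of sample i and view 2 of sample j.
    # Diagonal entries correspond to positive pairs; off-diagonal entries to
    # negative pairs within the unlearn set.
    z1 = F.normalize(encoder(views1), dim=1)
    z2 = F.normalize(encoder(views2), dim=1)
    return z1 @ z2.T

# The gap matrix (AGM) would then compare AM before and after unlearning:
# agm = alignment_matrix(original, v1, v2) - alignment_matrix(unlearned, v1, v2)
```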

[Suggestions in writing:] Thank you for your constructive feedback. All suggested revisions have been incorporated into the modified draft (changes marked in blue).

Finally, thank you for your attention to our responses. We hope we have adequately addressed your concerns and welcome any further discussion.

Comment

Thanks for the clarification. I have raised my score correspondingly.

Comment

Thank you for your quick response. We very much appreciate that.

Official Review
Rating: 5

This paper tackles the problem of machine unlearning in self-supervised pre-training (for contrastive approaches). The authors identify two challenges: adapting methods from either supervised or unsupervised machine unlearning might not be sufficient, and black-box auditing methods are needed so that data owners can verify the removal of their data without access to the model.

The authors propose a novel unlearning method targeted specifically at unlearning data from pre-trained models trained through contrastive learning. This method is tested against proposed baselines adapted from the machine unlearning literature on three different pre-trained models across 2 datasets.

A black-box auditing tool in the form of alignment matrices that a data-owner can compute from the embeddings for the unlearning data is also proposed.

Strengths

The paper tackles an interesting problem that seems relevant (although I am not an expert in machine unlearning). The authors do a good job presenting and motivating the two distinct challenges they identify with contrastive unlearning.

The selection of baselines in the experimental setup also seems reasonable to me.

Weaknesses

1. Motivation of the proposed black-box metrics

  • The authors define the FS score by analogy to the memorization score in data attribution. This still lacks motivation as to why the difference in alignment between positive pairs before and after unlearning should be a reasonable and complete measure of unlearning.
  • Furthermore, the proposed AGM is essentially a decomposed version of this FS score and similarly lacks motivation. The authors should make an effort to show why a data owner should trust this metric, for example, by showing that it correlates well with other established metrics of unlearning (even if they are white-box). Simply showing that the alignment matrix changes with the unlearning process is not sufficient evidence for the data being unlearned.
  • The latter seems particularly problematic since the proposed AC method is aimed directly at improving this proposed metric. It is however not clear in the text how this is accomplished (see question 2 below).

2. Experimental Results

  • Standard errors are not reported for tables 1 through 3. The results from all methods seem too close and too metric dependent to really draw meaningful conclusions.

Furthermore, the aggregation of these metrics into an average gap seems problematic for multiple reasons:

  • It's not clear to me how the RA, TA and UA metrics should be accounted for and aggregated into a meaningful overall measure. The authors take the absolute difference to the retrain baseline and average across the three metrics. Perhaps the gap between TA and UA would be a better measure of failure to unlearn. At the same time, higher is generally better for these metrics so penalizing a higher RA than the retrain baseline does not seem correct.
  • Similarly, the sign of the difference is not taken into account for the other metrics. For example, higher EMIA than the retrain baseline seem to be counted as a positive gap.

I believe fixing these issues with the reporting of experimental results would improve the paper.

Questions

  1. Why should I trust AGM as a valid measure of unlearning as a data owner?
  2. $p_u^\times$ in the equation for $\mathcal{L}_\mathtt{unlearn}$ at the end of page 6 is never defined. What is the difference between the first two terms in that equation (other than the different sign) and how should this modification help improve AGM compared to Equation 4?
  3. Why are standard errors not reported?
Comment

We thank the reviewer for acknowledging our contribution and the insightful feedback. Here we address your concerns:

[W1(a): Motivation of FS]:

We emphasize that we do not claim FS is a reasonable or complete measure for unlearning auditing. As demonstrated in Section 3.4, we indeed find FS insufficient and unreliable for black-box auditing of a small subset (e.g., 8 images), matching the reviewer's intuition. However, we would like to point out that FS remains valuable as a white-box evaluation metric when analyzing the effectiveness of unlearning algorithms.

[W1(b): Motivation of AM and AGM]:

(1) We would like to clarify that AM and AGM are not simply decomposed versions of the FS score, as they additionally measure similarity between negative pairs within the unlearn set. These metrics were motivated by the need to provide data owners with comprehensive visualization tools for understanding the unlearning effect.

(2) We agree that AM and AGM correlate well with other unlearning metrics. Specifically, the diagonal elements demonstrate reduced alignment of positive pairs, while the off-diagonal elements show reduced uniformity of the unlearn set. Together, these contribute to the reduced Unlearn Accuracy (UA), a white-box evaluation provided by the model owner.

(3) Regarding the difference between negative and positive alignment calibration: $p_u^{\times}$ represents negative pairs within the negative set, which influences the off-diagonal elements of the AM and AGM matrices. We have added the relevant definitions in the modified paper (marked in blue).
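Schematically, the full objective combines three terms (a structural sketch under our own labels, with hypothetical weights $\alpha, \beta$; Equation 5 in the paper gives the exact form):

$$\mathcal{L}_\mathtt{unlearn} = \underbrace{\mathcal{L}_{\text{pos-cal}}(\mathcal{D}_u)}_{\text{weaken positive-pair alignment}} + \alpha\,\underbrace{\mathcal{L}_{\text{neg-cal}}(p_u^\times)}_{\text{reduce uniformity in the unlearn set}} + \beta\,\underbrace{\mathcal{L}_{\text{preserve}}(\mathcal{D}_r)}_{\text{maintain overall uniformity}}$$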

[W2: Experimental Results]:

(1) Standard Errors: We report standard deviations across 5 random trials for all experimental results (Tables 1-3) in Appendix B.6 and Table 11 of our modified paper (marked in blue). Our evaluation metrics align with established practices in the machine unlearning literature (e.g., [1][2][3]). The comprehensiveness of our evaluation in contrastive learning ensures unbiased assessment of our method's performance.

(2) RA, UA, TA: Our choice of retraining as the primary baseline is justified by:

  • Retraining serves as the gold standard for unlearning, representing the ideal outcome we aim to approximate.
  • Higher RA does not necessarily indicate better unlearning performance, as our objective is to match retraining behavior rather than maximize accuracy on the retained dataset.
  • The use of absolute differences in our evaluation is methodologically sound, as our goal is to measure deviation from retraining results rather than directional changes in EMIA.
  • Consequently, we employ absolute gaps to quantify how closely our method approximates retraining outcomes, without preferential treatment to either positive or negative deviations.
  • Same or similar evaluations have been commonly used as baseline measures of approximating unlearning in [1]-[5].

[1] Fan et al., SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation. ICLR 2024

[2] Jia et al., Model sparsity can simplify machine unlearning. NeurIPS 2023

[3] Golatkar et al., Eternal sunshine of the spotless net: Selective forgetting in deep networks. CVPR 2020

[4] Shen et al., Label-agnostic forgetting: A supervision-free unlearning in deep models. ICLR 2024

[5] Chen et al., Boundary unlearning: Rapid forgetting of deep networks via shifting the decision boundary. CVPR 2023

Finally, thank you for your attention to our responses. We hope we have adequately addressed your concerns and welcome any further discussion.

Comment

As the discussion phase is concluding soon, we would greatly appreciate your feedback on whether our rebuttal has adequately addressed your concerns. We welcome any additional discussion if needed.

Comment

Thank you for the detailed responses. After going through the other reviewers' reviews as well as your rebuttals and the additions to the paper I have decided to keep my score.

  • Auditing: I echo the concern of reviewer Ua6w in that I still don't find the suggested auditing metrics convincing. A data owner might see that there is a change in the proposed matrices from before to after unlearning but how does this translate into any assurance that the data was unlearned?
  • Reporting: I think standard errors should be reported in the tables in the main paper, as they cast doubt on the conclusions one can draw from these results. The authors report a gap w.r.t. the retrain baseline, which is redundant information, instead of standard errors.
  • Methodology: I understand that this is how results are reported in prior works, but it might still be inappropriate in this particular application to contrastive learning. I am particularly concerned that this approach doesn't lend itself well to scenarios where the results have high variance. Take a hypothetical scenario where a method is the same as the retrain baseline in expectation. Taking the average of a (signed) gap over 5 trials, the expected value is 0 and the variance is 5 times smaller. Taking the average absolute gap over 5 trials, the expected value is not zero and scales with the variance. The variance in the numbers reported seems too high for this kind of approach to be reasonable, and the results are overall not very convincing to me. It seems that slight changes in methodology or even just changing the random seed might change the conclusions.
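This point can be illustrated with a quick simulation (a sketch with hypothetical numbers):

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, sd = 5, 0.05
# A hypothetical method whose metrics equal the retrain baseline in expectation.
method = rng.normal(0.90, sd, size=(100_000, n_trials))
retrain = rng.normal(0.90, sd, size=(100_000, n_trials))

signed = method.mean(axis=1) - retrain.mean(axis=1)
print(signed.mean())          # ~0: the signed gap is unbiased
print(np.abs(signed).mean())  # > 0: the absolute gap grows with sd / sqrt(n_trials)
```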
Comment

Thank you for the further feedback! We address your comments further here:

Auditing:

  • While we agree with the reviewers that some assurance or certification for approximate unlearning is desired, this goal is extremely challenging to achieve, as we are approximating the retrain baseline.
  • In this paper, we aim to achieve efficient unlearning while providing individual data owners with evidence of unlearning. Specifically, (1) the model owner can provide white-box evaluation results (such as those in Tables 1-3), though such results cannot be verified by the data owner; (2) In addition to such evaluation, we also encourage data owners to calculate their own AM and AGM metrics. Combined with the white-box evaluation, data owners can observe the alignment between these two pieces of evidence and may be convinced of proper model tuning.

Reporting: We will report the standard errors in the main paper in the final draft.

Methodology:

  • Regarding the reviewer's comment that "this might still be inappropriate in this particular application to contrastive learning", we are a bit confused and wondering if the reviewer is referring to UA, RA, and TA. Could the reviewer clarify why these metrics are inappropriate?
  • Concerning the high variances in Table 11, we note that the retrain baseline also shows high variances for EMIA and CMIA. We believe this occurs because the membership inference attacks may be sensitive to the choice of unlearn sets (which vary across our 5 random trials).
  • The reviewer suggested that "slight changes in methodology or even just changing the random seed could easily change the conclusion." Regarding methodological changes, we performed an ablation study on our objective function in Table 6, where our conclusion remains consistent. As for random seeds, we emphasize that our choices were not cherry-picked, and the optimality of our method has been repeatedly confirmed across various datasets, choices of unlearn sets, and modalities. If requested, we would be happy to add more trials in the final version.
Comment

Dear Reviewer 8sny,

We note your remaining concern echoing Reviewer Ua6w regarding the auditing metrics. We encourage you to review our detailed response to Reviewer Ua6w, which has adequately addressed this point. Please let us know if you need any additional clarification as we approach the end of the discussion period.

Best regards,

Authors

Comment

Thank you for your response.

I had already read the reply in question but without any ground truth, I still don't know how to assess this contribution. As an example, I am not sure how to interpret Figures 6 and 7. The proposed method (AC) does not seem to produce gap matrices that are qualitatively closer to the retrain one which would seem desirable when the goal is to approximate retraining. It rather seems the opposite is true, but without a single numerical measure even this is hard to assess.

Regarding the metrics used to evaluate AC, I was merely observing that measuring absolute values of a gap effectively penalizes variance "both ways" and thus might not be the best choice in this scenario, where results seem rather random. Increasing the number of trials would indeed help with this but, as it stands, the results do not seem convincing to me in the face of the standard errors (and not just for EMIA and CMIA but also the other metrics).

Comment

Thank you for the further comment! We understand your concerns and here we further address them:

I still don't know how to assess this contribution.

To clarify, our unlearning procedure involves two stages:

  • (1) Algorithm selection by model owner: where the model owner chooses a candidate algorithm according to the gap with retraining (for example, Table 1-3). Here we assume the model owner already chooses AC as a candidate at the end of this stage.
  • (2) User verification: Figure 3, Figure 6, and Figure 7 confirm that:

If a model owner claims to use AC for unlearning, the data owner can verify the claim by plotting AM and AGM and observing significant positive and negative calibration.

We compare this with other methods in Figures 6 and 7 and confirm that our optimization design can be used to confirm unlearning with AC. Note that approximating retraining is only the goal for stage 1. For users, visual auditing is performed specifically to verify AC unlearning.

Regarding the metrics used to evaluate AC, I was merely observing that measuring absolute values of a gap effectively penalizes variance "both ways" and thus might not be the best choice in this scenario where results seem rather random

We believe there is some confusion here. To ensure our evaluation is valid and fair, we emphasize that across Table 1-3:

  • Step 1: For an unlearning algorithm $\mathcal{A}$, the gap for each column (representing one evaluation metric) is $|\text{avg}(\mathcal{A})-\text{avg}(\mathcal{R})|$, where $\mathcal{R}$ is retraining and $\text{avg}$ denotes the average across 5 random seeds.
  • Step 2: As for the average gap column, we take another average across the five evaluation metrics (columns).

If we calculated the average of the signed difference $\text{avg}(\mathcal{A})-\text{avg}(\mathcal{R})$, i.e., without taking the absolute value, the values of different metrics (columns) could cancel each other out and render the average gap in Step 2 hard to interpret.
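In code, the two steps amount to the following (a minimal NumPy sketch; the array shapes are our assumption):

```python
import numpy as np

def average_gap(method_runs, retrain_runs):
    # method_runs, retrain_runs: shape (n_seeds, n_metrics), e.g. (5, 5).
    # Step 1: per-metric absolute gap between seed-averaged results.
    per_metric = np.abs(method_runs.mean(axis=0) - retrain_runs.mean(axis=0))
    # Step 2: average the per-metric gaps into the reported "average gap".
    return per_metric.mean()
```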

Increasing the number of trials would indeed help with this.

We will increase the number of trials in the final version.

Official Review
Rating: 6

This paper introduces Machine Unlearning for Contrastive Learning (MUC), a framework designed to remove specific data influences from contrastive learning models efficiently. The authors show the shortcomings of existing unlearning methods when applied to models trained with contrastive learning, such as SimCLR, MoCo, and CLIP. They then propose a novel approach called Alignment Calibration (AC) to address these shortcomings. The method optimizes the unlearning process through a new retraining procedure that minimizes loss on retained data while maximizing an unlearning criterion on the data to be unlearned. It also gives data owners tools to visually audit unlearning. Experiments show that AC outperforms existing unlearning methods, providing near-exact unlearning while maintaining model utility.

Strengths

  • This paper tackles an important problem—how to unlearn data from self-supervised models, especially for contrastive learning. This is important as contrastive models trained on internet-sourced data are growing and powering multiple applications.

  • Provides a systematic study on adapting existing unlearning methods to contrastive models, revealing their inadequacy in unlearning with contrastive models. Then the authors provide a solution called Alignment Calibration along with auditing tools/metrics suited for contrastive models.

  • The authors conduct extensive experiments with AC and baselines across multiple contrastive learning models, showing that it performs well in removing unwanted data influence while retaining the model's core functionality. The results are backed by multiple metrics and support the claim that AC offers improved unlearning over baselines.

Weaknesses

  • The loss $\mathcal{L}_{retain}$ in Equation 3 is computed on the retained data. As this data could be huge, computing and minimizing this loss could take a lot of time. Often, unlearning may be requested even for a single data point but invoked by multiple users; in such cases, a simple editing technique that can remove specific data points without expensive training would be preferable.

  • I do not find the metrics and other tools to measure unlearning convincing. A user would want a guarantee that their data point(s) have been removed and a model provider would like to provide such a guarantee. How can these metrics be translated into such guarantees, if at all? In Figure 3, the matrices look nearly the same, the point of “visual auditing” is not clear to me.

  • The evaluation setup also does not seem realistic. Randomly removing x% of points from CIFAR-10 and CIFAR-100 seems artificial and far from realistic unlearning settings, where some specific data points need to be removed (even ones that might have a large influence on performance). Given that very good performance can be achieved on these datasets even with very few samples (see results in zero-shot/few-shot classification or semi-supervised learning), I am not convinced of the effectiveness of the proposed method in unlearning while retaining high performance. Relatedly, see research on data pruning strategies; the setup of randomly dropping x% of points is at best a simple data pruning strategy.

Typo: line 138, "retaining dataset"

Questions

See weaknesses above. Also,

  1. Aren't there any systematic benchmarks to evaluate unlearning, where the data to be removed is selected according to various criteria reflecting real-world scenarios?

  2. Is there any theoretical research on giving a guarantee/certificate that the requested data points have been removed? Can such results be adopted to contrastive models?

Comment

We thank the reviewer for the detailed and informative feedback. Here we address your concerns:

[W1: Time Efficiency]:

We emphasize two key points:

(1) Training on retain set is standard practice

  • All major unlearning algorithms (except gradient ascent) incorporate retain set training
  • This is a well-established approach in the field

(2) Our method is computationally efficient

  • Requires only 10 epochs for unlearning
  • Runtime: 1.87 minutes (ours) vs. 109.47 minutes (retraining) in Table 1
  • Tables 1-3 demonstrate efficient unlearning while maintaining unlearning performance comparable to retraining

[W2: Evaluation Metrics are not convincing]:

(1) Theoretical Guarantees: Only complete retraining (with randomly reinitialized parameters) can provide a strict certificate of removal, which, as noted, is computationally prohibitive. Therefore, we explore more practical unlearning methods to approximate retraining effects.

(2) Verification Challenge: Even when approximate unlearning achieves results comparable to retraining (as shown in Table 1), these claims by model owners cannot be independently verified by data owners, as discussed in Section 3.4. This necessitates dedicated auditing mechanisms.

(3) Auditing Need: Given data owners' limited model access, they require assurance that their unlearning requests are genuinely implemented rather than circumvented through inference-time tricks (e.g., conditional zeroing of outputs for unlearned data). Our AM and AGM metrics provide transparent visual evidence of how unlearning affects model behavior.

(4) Interpretation of Results: While unlearning a small dataset may not dramatically alter model weights, the consistent presence of non-zero elements across AGM matrices demonstrates genuine model recalibration rather than superficial modifications.

In summary, approximate unlearning presents a tradeoff between complete certification and computational efficiency. Our work provides both algorithms and validation tools to support model owners and data owners in implementing and verifying the unlearning process.

[W3: Unlearning Setup]:

(1) Unlearning success ≠ lower test accuracy:

  • The goal of machine unlearning is to approximate retraining after data removal; not necessarily to achieve specific model performance metrics
  • For data owners requesting unlearning, the primary concern is removing their data's influence for privacy protection, not the model's eventual accuracy
  • Therefore, unlearning success should be measured by comparing our method's outputs to full retraining results, rather than focusing on absolute model performance.

(2) Random selection as a valid evaluation strategy:

  • We employ multiple random trials (n=5) for unlearning set selection to ensure robust evaluation and prevent selection bias;
  • Random selection is a well-established evaluation protocol in the unlearning literature, as demonstrated in some example works [1][2][3].

(3) Few-shot learning is a downstream task while we consider pretraining:

  • We address contrastive learning pretraining with large datasets
  • Unlike few-shot learning scenarios, removing 10% of data during pretraining has minimal impact on downstream performance
  • The reviewer's concern about the few-shot learning impact therefore doesn't apply to our pretraining context

[1] Fan et al., SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation. ICLR 2024

[2] Jia et al., Model sparsity can simplify machine unlearning. NeurIPS 2023

[3] Golatkar et al., Eternal sunshine of the spotless net: Selective forgetting in deep networks. CVPR 2020

Q1: Systematic benchmark: While studying strategic data selection for unlearning is an interesting research direction, to the best of our knowledge, there is no standardized benchmark that systematically evaluates unlearning methods based on specific data selection strategies.

Q2: Theoretical research: While prior work such as [4] provides theoretical guarantees for exact unlearning through retraining, these analyses do not extend to approximate unlearning methods, particularly in the context of contrastive learning which is the focus of our work.

[4] Bourtoule et al., Machine Unlearning. IEEE S&P 2021.

Finally, thank you for your attention to our responses. We hope we have adequately addressed your concerns and welcome any further discussion.

Comment

In the absence of clear protocols and guarantees to verify/certify unlearning, I am not convinced of the value of this research.

Comment

Thank you for the prompt feedback!

While we appreciate the reviewer's interest in certified approximate unlearning, we want to clarify that certification is neither the focus nor the goal of our work, similar to other approximate unlearning algorithms in the literature. Approximate unlearning remains valuable as it offers a significantly more efficient and practical alternative to exact unlearning methods.

The importance of efficient removal approaches is evidenced by the NeurIPS 2023 machine unlearning challenge (https://unlearning-challenge.github.io/) and significant recent advances in approximate unlearning ([1-5], with [1] receiving an ICLR 2024 spotlight).

Our work makes several novel contributions to the field:

  • We present the first study of approximate unlearning in contrastive learning
  • We provide critical insights into how existing unlearning methods perform in contrastive learning settings
  • We introduce a novel approximate unlearning algorithm specifically designed for contrastive learning
  • We develop practical auditing tools to help data owners verify unlearning effectiveness

While we acknowledge the value of certification, we believe our work makes important contributions to the understanding and application of machine unlearning in contrastive learning settings.

[1] Fan et al., SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation. ICLR 2024

[2] Jia et al., Model sparsity can simplify machine unlearning. NeurIPS 2023

[3] Golatkar et al., Eternal sunshine of the spotless net: Selective forgetting in deep networks. CVPR 2020

[4] Shen et al., Label-agnostic forgetting: A supervision-free unlearning in deep models. ICLR 2024

[5] Chen et al., Boundary unlearning: Rapid forgetting of deep networks via shifting the decision boundary. CVPR 2023

Comment

Thank you for the clarifications. I have revised my scores.

Comment

We appreciate the positive feedback and increased scores. We're pleased that our responses have adequately addressed your previous concerns.

Comment

As the discussion phase is concluding soon, we would greatly appreciate your feedback on whether our rebuttal has adequately addressed your concerns. We welcome any additional discussion if needed.

Official Review
Rating: 5

This paper proposes a new method of machine unlearning for contrastive learning.

Strengths

The reported numerical results show that the proposed method outperforms other methods.

Weaknesses

  1. More intuition/insights about the proposed method would be helpful. For example, I don't get why AM and AGM are good metrics. Shall we consider loss evaluated on both positive and negative pairs?
  2. Any insight why it is beneficial to maintain the term for negative pairs?
  3. I also don't understand the formula $UA \approx TA < RA$
  4. The experiments are only performed on CIFAR-10, CIFAR-100, and MS-COCO datasets. This seems inadequate.
  5. How the tuning parameters for the other unlearning methods were chosen should be described

Questions

See above.

Comment

We thank Reviewer T8fZ for the insightful feedback. Here are our responses:

[W1: More intuitions about the proposed method]:

Our method provides two benefits:

(1) More effective unlearning for model owners

(2) More reliable auditing for data owners

We propose AM and AGM as stronger auditing metrics based on:

  • Motivation: Our analysis in Section 3.4 reveals that comparing only positive alignment scores (FS) before and after unlearning can be unreliable for individual data owners.
  • Technical Contribution: AM and AGM utilize comprehensive information obtained by data owners, visualizing element-wise similarity for both positive and negative pairs within the unlearning subset.
  • Key Findings: The AGM matrices reveal:
    • Diagonal elements (positive pairs) demonstrate calibrated lower alignment;
    • Off-diagonal elements (negative pairs) show reduced uniformity within the unlearn set;
    • Together, these patterns confirm diminished training effects on the unlearn set.

In summary, AM and AGM provide evidence to data owners that their requests are not circumvented by simple inference-time conditions, but have been properly implemented in the updated model.

[W2: Insights on maintaining the negative pairs]:

We assume the reviewer was referring to the performance preserving term in our unlearning objective. This term is used to preserve uniformity, which is a key property for achieving good performance in contrastive learning [1]. In Section 5.5, we demonstrate that this term indeed preserves uniformity of the learned representation after unlearning.

[1] Wang and Isola, Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere. ICML 2020

[W3: UA, TA, RA]:

The relationship between these metrics follows a clear pattern from retraining:

  • UA ≈ TA: Indicates successful unlearning, as unlearned data is treated like unseen test data
  • RA > UA/TA: Expected behavior, as training accuracy is naturally higher

This pattern (UA ≈ TA < RA) serves as a practical validation of successful unlearning, i.e., an approximate way of retraining.
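For intuition, with purely hypothetical numbers: a retrained model might reach RA = 98% while scoring UA ≈ TA ≈ 90%, since the unlearned data now behaves like held-out test data; an unlearned model whose UA remained close to RA would instead indicate residual memorization of the unlearn set.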

[W4: Datasets]:

According to your suggestion, we have added additional experiments on the SVHN dataset (Appendix B.5 and Table 10) to further validate our Alignment Calibration algorithm's superiority. Please refer to the modified draft for the complete experimental results (marked in blue).

[W5: hyperparameters]:

We have indeed included training hyperparameters in Appendix A.2.

Finally, thank you for your attention to our responses. We hope we have adequately addressed your concerns and welcome any further discussion.

Comment

As the discussion phase is concluding soon, we would greatly appreciate your feedback on whether our rebuttal has adequately addressed your concerns. We welcome any additional discussion if needed.

Comment

Dear Reviewer T8fZ,

We kindly invite you to review our rebuttal and revised paper as the discussion period concludes. Our new experiments and responses address the concerns raised. We welcome any additional questions you may have.

Best regards,

Authors

Comment

We again thank the reviewer for reviewing our paper and providing valuable suggestions. We hope our rebuttal and revision address your concerns and please consider these updates in your final rating.

Comment

In response to reviewer feedback, our revised draft (modifications in blue) includes:

  • New SVHN dataset experiments (Appendix B.5) validating our Alignment Calibration algorithm
  • Standard deviations for Tables 1-3 (Appendix B.6, Table 11)

We have addressed typos and clarified definitions. These additions strengthen our paper, and we welcome further discussion.

AC Meta-Review

The paper introduces a Machine Unlearning for Contrastive Learning (MUC) framework to address the gap in unlearning for CL models, proposing Alignment Calibration (AC) for effective unlearning and auditing.

Strengths: Novel unlearning method for CL, state-of-the-art performance, and well-presented paper.

Weaknesses: a change in the auditing metrics does not assure that the data was unlearned; the high variance makes the results not very convincing.

Reasons for rejection: The current evaluation is not sufficiently convincing.

Additional Comments from Reviewer Discussion

Reviewer 8sny and Reviewer Ua6w think that the suggested auditing metrics are not very convincing. Reviewer 8sny thinks that the variance is too high, so that the results are not very convincing. These problems are not fully solved in the current version.

Final Decision

Reject