Adaptive Localization of Knowledge Negation for Continual LLM Unlearning
We investigate the practical challenge of LLM continual unlearning and propose the ALKN algorithm, which mitigates both accumulative decline and cascading degradation in model utility by adaptively refining gradient updates.
Abstract
Reviews and Discussion
This paper focuses on the scenario of LLM continual unlearning, which is a more practical and challenging setting than one-time unlearning. Most existing unlearning methods achieve forgetting by fine-tuning the pretrained model, which unavoidably impairs the general performance of the LLM. In continual unlearning scenarios, a more severe decline in general utility is observed, primarily due to two factors: accumulative decline and cascading degradation. The former refers to a continuous decline in model utility as unlearning is repeatedly performed, while the latter refers to the exacerbated decline in model utility caused by inter-task dependencies. To mitigate these issues in LLM continual unlearning, the authors propose an integrated framework comprising an entropic optimization objective, dynamic gradient sparsity, and learning-rate modulation. The entropic optimization loss adaptively adjusts the unlearning objective based on the LLM's current level of memorization of the target content. The dynamic gradient sparsity method identifies and fine-tunes only the most critical parameters for each unlearning task, thereby preserving the model's general utility. These modules of the proposed framework collaboratively mitigate the utility decline in LLM continual unlearning. The proposed method outperforms various baseline methods across multiple benchmarks and a newly constructed dataset, TRAVIS.
Questions for Authors
- In this paper, the authors argue that there are two critical issues in continual unlearning: accumulative decline and cascading degradation. While the proposed modules appear to focus primarily on mitigating cascading degradation, it remains unclear how the method addresses the issue of accumulative decline.
- Gradient sparsity is dynamically computed during the fine-tuning process. Why not calculate the gradient sparsity only once prior to the fine-tuning process? This may save some computational overhead.
Claims and Evidence
Pros
- In this paper, the authors argue that continual unlearning poses greater challenges than one-time unlearning, primarily due to two issues: sustained utility decline and detrimental inter-task interference. To validate the latter phenomenon, the authors present theoretical proofs using a toy example and empirical evidence on the TOFU dataset. This in-depth analysis of the underlying causes of utility decline in continual unlearning strengthens the motivation of the proposed method.
- The authors claim that the task vector method reduces the risk of excessive unlearning compared to GA. To substantiate this claim, the authors conduct a theoretical comparison between the baseline methods task vector and GA, resulting in a convincing and inspiring argument.
Cons
- The paper lacks a thorough discussion regarding the issue of accumulative decline in model utility. Although it is intuitive that the utility of the LLM decreases during each unlearning task and that such declines accumulate over time, a more detailed discussion is necessary. For example, the rate of decline may diminish as the model is exposed to more tasks, potentially making the accumulative decline issue less severe.
- The task vector method is claimed to reduce the risk of excessive unlearning compared to GA. However, GA is typically used alongside regularization terms such as gradient descent on the retain dataset or a KL-divergence term, whereas the training of task vectors does not incorporate retain data. This may render the task vector method less effective in practice, which needs further discussion.
Methods and Evaluation Criteria
The proposed method makes sense overall. The utilization of the task vector is supported by theoretical validation. Furthermore, the three modules in the proposed method are well-motivated and thoughtfully designed to address the issues of utility decline in LLM continual unlearning. Nevertheless, there are some questions regarding the proposed method.
Theoretical Claims
The authors provide theoretical proofs on a toy example to validate the claim that cascading degradation results in larger changes to model parameters during continual unlearning, potentially exacerbating the decline in model utility. Additionally, the comparison between the task vector method and GA is also theoretically supported. Proposition 2.1, Theorem B.1, and Theorem B.2 have been verified and are correct.
Experimental Design and Analysis
I have checked the validity of the experiments in the main text. The experimental designs are robust, and the comparisons across various methods are comprehensive. However, an ablation study examining the impact of hyperparameters is not provided.
Supplementary Material
I reviewed Appendix A to Appendix D.
Relation to Existing Literature
The study falls within the scope of LLM unlearning, a field where research on continual unlearning remains scarce. The investigation of the cascading degradation issue, along with the proposed method, represents a novel contribution to the literature.
Essential References Not Discussed
The essential related works are cited in the paper as far as I know.
Other Strengths and Weaknesses
Strengths
- The authors investigate the intrinsic causes of utility decline in LLM continual unlearning, combining theoretical and empirical analysis. Their exploration of the cascading degradation issue is both insightful and thought-provoking.
- The proposed framework is comprehensive and novel. The derivation process of the dynamic gradient sparsity module is particularly ingenious, offering valuable insights not only for general LLM unlearning but also for other related fields.
- To validate the effectiveness of the proposed method, the authors conduct extensive experiments across three benchmarks. Additionally, they construct an evaluation dataset to enable a more comprehensive assessment of the LLM utility. The evaluation of the proposed and baseline methods is sufficient and comprehensive.
Weaknesses
- The authors argue that cascading degradation of model utility occurs when different unlearning tasks are related. However, it remains unclear whether task relevance is common in real-world scenarios. Specifically, are unlearning requests arriving at different times inherently related? If not, cascading degradation may be less prevalent in practice than suggested.
- The authors introduce a new evaluation corpus, TRAVIS. However, the differences and advantages of TRAVIS compared to existing benchmarks are not clearly explained. A more detailed explanation of the motivation behind TRAVIS's construction should be included in the main text.
- The paper lacks a detailed discussion on the computational complexity of the proposed method. Specifically, the dynamic gradient sparsity module of the proposed method involves calculating and storing gradient masks, which may incur extra computational overhead and memory consumption.
Other Comments or Suggestions
In line 280, "mu" should be typeset as the symbol μ.
We appreciate the reviewer for the constructive and positive comments.
Con 1: Lack of discussion on accumulative decline
- As for the accumulative decline in model utility, the rate of decline should not diminish, because each unlearning process can cause a comparable degree of parameter change and therefore a comparable degree of general-utility degradation.
- Given the widespread presence of cascading degradation, the damage to utility from each unlearning task generally increases progressively. Therefore, the rate of decline usually does not diminish but increases. Please refer to the experiments in Figure 2 for evidence.
Thanks for pointing out the issue. We will extend the above discussion in our revision of the paper.
Con 2: Task vector does not incorporate retain data
- Although GA training can incorporate retain data, the limited retain data struggles to represent the broad knowledge possessed by the LLM. As a result, methods like GA+RT or GA+KL still lead to a significant decline in model performance.
- Although vanilla task vector training cannot incorporate retain data, our proposed method utilizes retain data in updating the gradient mask. By incorporating it during mask updates rather than directly involving retain data in the optimization objective, this approach mitigates the bias introduced by retain data to some extent.
- The experimental results on several datasets demonstrate that the task vector method exhibits utility preservation comparable to GA+RT and GA+KL, even though it does not incorporate retain data.
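For concreteness, the two paradigms discussed in this response can be written schematically as follows; the notation ($\mathcal{D}_f$, $\mathcal{D}_r$ for the forget and retain sets, $\theta_0$ for the original model, $\lambda$, $\alpha$ for weighting coefficients) is an illustrative simplification rather than the exact objectives in the paper.

```latex
% GA with a retain-data regularizer (GA+RT): ascend on the forget loss, descend on the retain loss.
\theta_{\mathrm{GA+RT}} = \arg\min_{\theta} \; -\,\mathcal{L}(\mathcal{D}_f;\theta) + \lambda\,\mathcal{L}(\mathcal{D}_r;\theta)

% Task-vector negation: fine-tune on the forget set to obtain \theta_{ft}, then subtract the task vector.
\theta_{\mathrm{ft}} = \arg\min_{\theta} \; \mathcal{L}(\mathcal{D}_f;\theta), \qquad
\theta_{\mathrm{TV}} = \theta_0 - \alpha\,(\theta_{\mathrm{ft}} - \theta_0)
```

The schematic makes explicit that the retain set enters GA+RT's objective directly, whereas vanilla task-vector training touches only the forget set; as noted above, our method instead uses the retain data when updating the gradient mask.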
Experiment: Ablation study of hyperparameters
Thanks for pointing out the issue. We have supplemented ablation studies of the hyperparameters on the TOFU dataset. The results on the final task are displayed below.
|  | F-Rouge | FQ | MU |
|---|---|---|---|
| 1 | 0.2934 | 1.5e-2 | 0.4355 |
| 5 | 0.3257 | 1.4e-2 | 0.4487 |
| 10 | 0.3314 | 1.1e-2 | 0.4534 |
| 20 | 0.3536 | 9.5e-3 | 0.4596 |
|  | F-Rouge | FQ | MU |
|---|---|---|---|
| 0.5 | 0.3448 | 1.0e-2 | 0.4526 |
| 0.8 | 0.3314 | 1.1e-2 | 0.4534 |
| 1.0 | 0.3305 | 1.1e-2 | 0.4528 |
| 1.2 | 0.3195 | 1.4e-2 | 0.4203 |
Please refer to the response to Q4 of Reviewer ex1z for more ablation studies.
Weakness 1: Whether cascading degradation occurs commonly in real-world scenarios.
- In real-world applications, continuously arriving unlearning tasks are typically correlated in terms of data format and content. For instance, as shown in the example in Figure 1, data from different users on the same website tends to be quite similar. Therefore, such inter-task dependency commonly results in cascading degradation in real-world applications.
- Even if different tasks are unrelated to each other, the decline in general utility caused by preceding unlearning tasks can lead to partial forgetting of subsequent tasks, resulting in cascading degradation. We conducted an experiment to verify this: after completing the WHP unlearning task, the performance of the model (F-Rouge) on the TOFU and MUSE News forget sets also exhibited some decline, as shown in the table below.
| State | TOFU | MUSE News |
|---|---|---|
| Before unlearning WHP | 0.9824 | 0.5862 |
| After unlearning WHP | 0.7853 | 0.5134 |
Weakness 2: Advantages of TRAVIS
Please refer to the response to Q1 of Reviewer ex1z.
Weakness 3: Computational complexity and memory consumption
We propose a computationally efficient algorithm and a memory-efficient algorithm for the mentioned issue. The running time and memory consumption of our method are comparable to the baselines. Please refer to the Appendix and the response to Q4 of Reviewer ex1z.
Q1: How the method addresses the issue of accumulative decline
- Our method alleviates the decline in utility during conducting each unlearning task, thereby minimizing the overall utility degradation in the continual unlearning process.
- The components of our method also incorporate mechanisms to address accumulative decline. For instance, the Dynamic Gradient Sparsity module facilitates selective fine-tuning of different parameter sets for distinct tasks, preventing the cumulative drift of model parameters that leads to accumulative decline. Additionally, the Adaptive Parameter Modulation module applies smaller learning rates to model parameters already fine-tuned by preceding tasks, further avoiding cumulative parameter drift (see the sketch after this list).
- The mechanisms in our method to mitigate cascading degradation also serve to alleviate accumulative decline, as cascading degradation causes the utility decline of each unlearning task to be increasingly severe, exacerbating the issue of accumulative decline.
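To make the interplay of these two modules concrete, below is a minimal illustrative sketch of a sparsified and learning-rate-modulated update; the names (`threshold`, `prev_task_mask`, `lr_scale_prev`) and the specific update rule are simplified placeholders rather than the exact algorithm in the paper.

```python
# Illustrative sketch only: update just the "vital" parameter entries (by gradient
# magnitude) and shrink the learning rate on entries already edited by earlier tasks.
import torch

def masked_modulated_step(params, grads, prev_task_mask, base_lr=1e-4,
                          threshold=1e-3, lr_scale_prev=0.1):
    """Apply one sparse, modulated update in place; return the updated edit-history mask."""
    new_prev_mask = []
    for p, g, prev in zip(params, grads, prev_task_mask):
        mask = (g.abs() >= threshold).float()                  # keep only vital entries
        lr = base_lr * torch.where(prev.bool(),                # smaller lr where earlier
                                   torch.full_like(g, lr_scale_prev),  # tasks already edited
                                   torch.ones_like(g))
        p -= lr * mask * g                                     # sparse, modulated step
        new_prev_mask.append(((prev + mask) > 0).float())      # remember edited entries
    return new_prev_mask

# Toy usage with two parameter tensors and no previously edited entries.
params = [torch.randn(4, 4), torch.randn(4)]
grads = [torch.randn(4, 4), torch.randn(4)]
history = [torch.zeros(4, 4), torch.zeros(4)]
history = masked_modulated_step(params, grads, history)
```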
Q2: Why not calculate the gradient sparsity only once
Thanks for the insightful question. We apply dynamic gradient sparsity to strike a trade-off between fine-tuning all parameters and making drastic changes to a small, fixed subset of them. We also propose a computationally efficient algorithm for gradient sparsity. Please refer to Appendix E.4 and E.2 for more details.
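As a rough illustration of the dynamic behavior, the sketch below recomputes a top-k gradient mask at every step while shrinking the kept fraction over training; the linear schedule and the `start_keep`/`end_keep` values are simplified placeholders, not the procedure of Appendix E.2.

```python
# Illustrative sketch: recompute the gradient mask each step, keeping a fraction of
# entries (by |grad|) that shrinks linearly as training proceeds.
import torch

def gradient_mask(grads, step, total_steps, start_keep=0.5, end_keep=0.05):
    """Return per-tensor binary masks keeping the top fraction of entries by magnitude."""
    keep = start_keep + (end_keep - start_keep) * (step / max(total_steps - 1, 1))
    masks = []
    for g in grads:
        k = max(int(keep * g.numel()), 1)                      # number of entries to keep
        thresh = g.abs().flatten().kthvalue(g.numel() - k + 1).values
        masks.append((g.abs() >= thresh).float())
    return masks

# Early steps keep many entries; late steps keep only the most vital ones.
grads = [torch.randn(8, 8)]
print(gradient_mask(grads, step=0, total_steps=100)[0].mean())   # ~0.5
print(gradient_mask(grads, step=99, total_steps=100)[0].mean())  # ~0.05
```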
The authors have addressed my questions and concerns regarding the method and experiments. Although the GA baseline method can incorporate retain data, the retain data is biased and cannot fully reflect the overall utility of LLMs. The rebuttal explains the commonality of the cascading degradation problem and the severity of accumulative decline, highlighting that their combined effect can lead to a catastrophic decline in LLM utility. The rebuttal also explains the mechanism by which the proposed method addresses the accumulative decline problem and supplements ablation studies on hyperparameters. Overall, the method proposed in this paper effectively tackles the two major challenges encountered in continual unlearning, and the additional efficient algorithms reduce time and space complexity. I have raised my score to 4, and I request that the authors incorporate the hyperparameter-related experiments and the discussion on accumulative decline into the paper.
Thank you so much for raising the score and recognizing the value of our work! Your insightful questions and comments are crucial for enhancing the validity of the paper. The comparison of the GA baseline method and task vector indeed requires additional discussions for clarity. The issues of cascading degradation and accumulative decline also need more explanation regarding their commonality and harms. We will supplement these discussions and additional ablation studies in our revision to the paper. We are truly grateful for your valuable feedback and encouragement!
LLM unlearning seeks to eliminate sensitive knowledge from LLMs. Existing methods for such scenarios often lead to significant degradation of the model's general ability, with utility losses accumulating over time. Additionally, interactions between previous and current unlearning tasks can cause partial forgetting, leading to over-unlearning. To address these issues, the authors propose ALKN, which is built on the task vector paradigm. Three modules are employed in the fine-tuning phase of the task vector to adaptively regulate the gradients of LLM parameters, enabling the model to preserve the general utility of LLMs while sufficiently unlearning target content. For rigorously evaluating unlearning methods, the authors introduce TRAVIS, an evaluation corpus composed of synthetically generated pre-training data spanning diverse topics. Experimental results demonstrate the effectiveness of ALKN across multiple benchmarks.
Questions for Authors
Please refer to the questions in other parts of the review.
Claims and Evidence
Overall, the claims made in the paper are clear. However, certain aspects warrant evidence.
Question 1 While TRAVIS is positioned as a superior evaluation corpus, its efficacy relative to existing benchmarks like TOFU remains unproven. The authors omit empirical comparisons to validate its precision in assessing model utility post-unlearning.
Methods and Evaluation Criteria
The proposed method makes sense and is well designed. But certain concerns remain.
Question 2 The authors argue that the target contents of different unlearning tasks often exhibit homogeneity, leading to significant utility degradation during unlearning, termed cascading degradation. The proposed ALKN is designed to address this issue. However, it remains unclear how the method performs when the unlearning tasks are not correlated. Could the authors provide insights or experimental results to clarify the effectiveness of ALKN in such scenarios?
Question 3 The dynamic gradient sparsity module selectively updates vital parameters to preserve general performance. However, limiting adjustments to a subset of parameters risks residual retention of sensitive information in unaltered parameters, potentially leading to incomplete unlearning.
Theoretical Claims
Yes, no issues found
Experimental Design and Analysis
I have checked the soundness of the experimental designs, but there are some experiments missing that can further validate the proposed method.
Question 4 Appendix E.2 introduces efficiency-focused algorithms for gradient mask computation. Additionally, an algorithm to reduce memory usage is also introduced. However, the impact of these algorithms on performance remains unverified, as no supporting experiments have been conducted.
Supplementary Material
Yes. A. Related Work, E. Implementation Details, F. Experimental Setup, and H. More Experimental Results.
Relation to Existing Literature
The proposed dynamic gradient sparsity module is related to model sparsity methods across multiple domains [1, 2]. However, this paper introduces a distinctive mechanism to progressively sparsify gradients throughout the unlearning process, effectively balancing the objectives of forgetting and retaining.
[1] Jia, Jinghan, et al. "WAGLE: Strategic weight attribution for effective and modular unlearning in large language models." arXiv preprint arXiv:2410.17509 (2024).
[2] Von Oswald, Johannes, et al. "Learning where to learn: Gradient sparsity in meta and continual learning." Advances in Neural Information Processing Systems 34 (2021): 5250-5263.
Essential References Not Discussed
No
Other Strengths and Weaknesses
Strengths: This paper studies a practical scenario, with a clear motivation supported by both theoretical and empirical results. The proposed method is novel and inspiring, particularly the dynamic gradient sparsity module. The utilization of the task vector method and the proposed modules are well suited to addressing the issues of LLM continual unlearning. The experimental designs are comprehensive.
Weaknesses: The proposed dataset, TRAVIS, demands more empirical validation. Some ablation experiments that could further validate the effectiveness of the method are missing. Moreover, some aspects of the proposed method, as previously discussed, require clarification and empirical evidence.
Other Comments or Suggestions
No
We would like to thank the reviewer for the positive and insightful comments.
Q1: Empirical validation regarding TRAVIS dataset
Thanks for the constructive advice. We would like to explain it as follows:
- The TRAVIS dataset consists of a wide variety of topics, enabling a comprehensive evaluation of the model’s overall utility.
- TRAVIS is generated by the LLM prior to unlearning, thus allowing an accurate assessment of the LLM’s original utility.
- To evaluate the sensitivity of the TRAVIS dataset to changes in model performance, we conducted a synthetic experiment by adding a small amount of random Gaussian noise to the model parameters. We compared the Rouge scores on the TRAVIS, TOFU, and WHP datasets. As shown in the table below, the TRAVIS dataset is the most sensitive to changes in model utility, with the TRAVIS Rouge already declining when the noise standard deviation reaches 0.5% (a minimal sketch of this perturbation procedure follows the table).
| Noise Std (%) | TRAVIS Rouge | TOFU Rouge | WHP Rouge |
|---|---|---|---|
| 0% | 0.7643 | 0.8965 | 0.6940 |
| 0.1% | 0.7643 | 0.8965 | 0.6940 |
| 0.5% | 0.7576 | 0.8965 | 0.6940 |
| 1% | 0.7425 | 0.8965 | 0.6940 |
| 2% | 0.7320 | 0.8958 | 0.6940 |
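For reference, a minimal sketch of this perturbation check is shown below; scaling the noise to each tensor's own standard deviation and the helper `rouge_on_dataset` are simplifications for illustration rather than the exact protocol used for the table.

```python
# Illustrative sketch: copy the model, add Gaussian noise proportional to each
# parameter tensor's standard deviation, then re-score generations (e.g., with ROUGE).
import copy
import torch

def perturb_model(model, noise_std_ratio):
    """Return a deep copy of `model` with noise of std (ratio * std(w)) added per tensor."""
    noisy = copy.deepcopy(model)
    with torch.no_grad():
        for p in noisy.parameters():
            scale = noise_std_ratio * p.std()
            p.add_(torch.randn_like(p) * scale)
    return noisy

# Hypothetical evaluation loop over the noise levels reported in the table above:
# for ratio in [0.0, 0.001, 0.005, 0.01, 0.02]:
#     noisy = perturb_model(base_model, ratio)
#     print(ratio, rouge_on_dataset(noisy, travis_eval_set))
```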
Q2: When the unlearning tasks are not correlated
Thanks for the valuable question. We would like to explain it as follows:
- Since our method is adaptive, it can effectively handle cases where there is no correlation between tasks. When the unlearning of previous tasks does not affect the performance of subsequent tasks, the soft labels in the Entropic Soft Labels module reduce to one-hot labels; the Dynamic Gradient Sparsity module naturally selects different model parameters for different tasks, and consequently, the Adaptive Parameter Modulation module applies normal learning rates to these non-overlapping parameters. Overall, our method is directly applicable to uncorrelated tasks.
- In Figure 5, we conducted an experiment on continual unlearning with unrelated tasks, where tasks from different datasets were interleaved. The experimental results demonstrate that the performance of our method surpasses baseline methods in this scenario.
Q3: The dynamic gradient sparsity may lead to incomplete unlearning
Thanks for the insightful question. We would like to explain it as follows:
- Vital parameters are dynamically selected during the training process. At the beginning of training, the threshold is set low, allowing a larger number of model parameters to be chosen for fine-tuning. As training progresses, the threshold is gradually increased, resulting in fewer model parameters being selected. This approach enables us to achieve a better trade-off between the two objectives of effective unlearning and utility preservation, thus avoiding incomplete unlearning. Please refer to Appendix E.2 and E.3 for more details.
- To validate the effectiveness of forgetting, we attacked the unlearned model using a relearning method. Specifically, we fine-tuned the model with gradient descent on a small portion of the forget-set data and assessed the extent to which the model's performance on the full forget set (F-Rouge) recovered, thereby determining whether the unlearned knowledge could be easily 'reawakened'. This allowed us to evaluate whether the unlearning algorithm thoroughly forgets the target data. As shown in the table below, compared to other baseline methods, the knowledge unlearned by our method is less likely to be 'reawakened' by this attack (a minimal sketch of the attack follows the table).
| Method | Unlearned | 5% Data | 10% Data | 20% Data |
|---|---|---|---|---|
| GA+KL | 0.3108 | 0.3543 | 0.4208 | 0.5125 |
| DPO+RT | 0.3771 | 0.4531 | 0.5807 | 0.7923 |
| WAGLE | 0.3731 | 0.4837 | 0.6242 | 0.8627 |
| Ours | 0.3314 | 0.3583 | 0.4072 | 0.5037 |
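Below is a minimal sketch of such a relearning attack; the optimizer, learning rate, and the HuggingFace-style `model(**batch).loss` interface are simplified assumptions rather than the exact setup used for the table.

```python
# Illustrative sketch: briefly fine-tune the unlearned model on a small fraction of the
# forget set, then re-measure F-Rouge on the full forget set to see how much it recovers.
import torch
from torch.utils.data import DataLoader

def relearn_attack(model, forget_subset, epochs=1, lr=2e-5, batch_size=4):
    """Fine-tune `model` on `forget_subset` (items collate to a dict of tensors)."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in DataLoader(forget_subset, batch_size=batch_size, shuffle=True):
            loss = model(**batch).loss      # standard language-modeling loss
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

# A small F-Rouge recovery after the attack suggests the knowledge was removed rather
# than merely suppressed.
```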
Q4: Performance impact of the efficiency-focused algorithms
Thanks for pointing out the issue. We supplement the following ablation study on the TOFU dataset and will extend it in our revision of the paper. Specifically, we validate whether using the efficient threshold-calculation algorithm in Appendix E.2 (eff-threshold) or the memory-efficient algorithm in Appendix E.4 (eff-memory) results in performance degradation. As shown in the table, using these two efficient algorithms sacrifices a slight amount of precision but does not significantly impact model performance.
| Method | F-Rouge | FQ | MU |
|---|---|---|---|
| w/o eff-threshold | 0.3354 | 1.2e-2 | 0.4566 |
| w/o eff-memory | 0.3302 | 1.1e-2 | 0.4527 |
| Ours | 0.3314 | 1.1e-2 | 0.4534 |
Weakness: Empirical validation and ablation studies
Thanks for your valuable suggestions. Please refer to the responses above and ablation studies in the paper. Please also refer to ablation studies in the responses to Reviewer HgFq. We will supplement and extend the above experiments in our revision of the paper. If the reviewer has any further concerns, we are more than happy to address them.
The rebuttal has addressed most of my concerns on empirical validation.
We are glad to know that the rebuttal has addressed most of the concerns of the reviewer. If the reviewer has any additional questions, we are more than happy to address them. We express our gratitude to the reviewer for their insightful and positive comments, which have been greatly enlightening and are crucial for enhancing the quality of our paper. In revising the paper, we will incorporate these discussions and empirical results.
The paper proposes a combination of techniques for continual unlearning while maintaining model utility: (1) fine-tuning with soft labels, (2) dynamically sparsifying the gradients, and (3) adaptively setting learning rates depending on how significantly parameters have been adjusted for past tasks. In the experiments, the method trades off forgetting the relevant data against maintaining model utility better than baselines from the literature on TOFU, MUSE News, and Who's Harry Potter.
Questions for Authors
n/a
Claims and Evidence
Primarily, the paper develops a method and compares it favorably to baselines from the literature on relevant benchmarks.
Methods and Evaluation Criteria
Yes.
Theoretical Claims
No.
Experimental Design and Analysis
The setup of the experiments makes sense and the paper considers appropriate baselines, even augmenting some existing methods with auxiliary objectives to prevent forgetting in the continual setting.
Supplementary Material
Briefly checked Appendix D and E.1+2
Relation to Existing Literature
Generally, the paper cites the most important references. The discussion in Section 2.3 looks, from my perspective, like an instance of catastrophic forgetting as in continual supervised learning; perhaps this could be touched on (whether or not the authors agree with the analogy).
Essential References Not Discussed
n/a
Other Strengths and Weaknesses
Strengths:
- the method seems to work well
- it is described clearly
- the experiments are extensive
- appropriate ablations are included
Weaknesses:
- the method is overall rather ad-hoc and it is not clear to me that it will lead to much follow-up work
- the theoretical discussion in 2.3 seems rather disconnected from how the method is constructed
Other Comments or Suggestions
Table 1 is incredibly hard to read (one needs to keep track of trade-offs between metrics, track them over tasks and compare between methods) and should be moved to the appendix to replace with figures in the main text.
We would like to thank the reviewer for the positive and very valuable comments. Below are our responses to the comments.
Q1: The connection to catastrophic forgetting in continual learning
We agree with the reviewer that the connection can be further clarified.
The cascading degradation issue we studied in Section 2.3 is similar to, yet distinct from, the catastrophic forgetting problem in continual learning. The intrinsic causes and outcomes of the two are different.
- Catastrophic forgetting arises in continual learning primarily because model parameters are overwritten, and it results in utility declines on previously learned tasks.
- Cascading degradation arises in continual unlearning primarily because of inter-task relationships, and it results in drastic declines in general utility.
We will expand this discussion in the related work section in our revision.
Q2: The method is ad-hoc
Thanks for the insightful comment. We would like to explain it from two aspects.
The proposed method is an integrated system, not merely an aggregation of separate components:
- The core idea of our method is the dynamic modulation of model parameter changes during each unlearning process. Based on this idea, we propose three components regarding the training objective, model parameters, and optimization, with the core idea running throughout.
- The different components of our method are interconnected, as shown in Figure 3. For example, the Adaptive Parameter Modulation module leverages the learnable mask computed by the Dynamic Gradient Sparsity module to represent the relationship between model parameters and the features of each task. This enables adaptive learning rates that prevent the re-unlearning of already forgotten features.
We believe that our work introduces multiple new contributions that inspire further exploration. Here, we would like to list some factors that merit follow-up studies:
- We are the first to study the cascading degradation phenomenon in continual unlearning, which can easily lead to complete model failure when handling continuous unlearning requests. Although we propose a highly promising method to address this issue, the performance of the algorithm can still be further improved.
- Our proposed method approaches the problem from three perspectives, paving several potential paths for effective LLM unlearning.
- The task vector method has received limited attention in LLM unlearning. We validate the effectiveness of the task vector method in LLM unlearning from both theoretical and experimental perspectives, establishing it as a highly promising direction. Our research also aims to inspire interest and motivate further in-depth exploration of this important direction.
- The method we propose may also provide insights for other domains. The approach of adaptively identifying crucial model parameters based on data and dynamically adjusting the parameter mask could provide insights for the fields of model editing and model sparsity.
We sincerely appreciate your comment and we will add the related discussion in the introduction as well as the conclusion sections in our revision.
Q3: Theoretical discussion in 2.3 seems disconnected from the method
Thanks for the very constructive comments. We would like to explain it as follows:
- Proposition 2.1 in Section 2.3 is presented to formally verify the existence of the cascading degradation issue. Specifically, we study how the first unlearning task influences the second task.
- Our method is built on the task vector method. Theorem B.1 and Theorem B.2 mentioned in Section 2.2 theoretically compare the GA and task vector methods, demonstrating that task vector is less prone to over-unlearning and may cause less utility degradation in LLMs.
- Our method includes formulas obtained through theoretical derivation. For instance, the update rule for the underlying vector is derived from a utility-retaining objective in Equation 8. The derivation details are in Appendix D.
- We also introduce the following corollary that validates the effectiveness of our proposed method, which will be added in our revision:
Corollary 2.1. Consider the optimization scenario where the model successively unlearns on $\mathcal{D}_1$ and $\mathcal{D}_2$ with entropic soft labels, yielding intermediate and final parameters $\tilde{\theta}_1$ and $\tilde{\theta}_2$ (their counterparts under vanilla continual unlearning are $\theta_1$ and $\theta_2$). The parameter changes in such a scenario during unlearning on $\mathcal{D}_2$ are less than in vanilla continual unlearning:
$$\|\tilde{\theta}_2 - \tilde{\theta}_1\| \le \|\theta_2 - \theta_1\| - C,$$
where $C > 0$ is a constant depending on the datasets. The corollary is based on Proposition 2.1, and it demonstrates that using the proposed entropic soft labels yields smaller parameter changes in continual unlearning and may result in milder utility declines of LLMs.
Q4: Hard to read Table 1
Thanks for the valuable advice. We will replace it with figures in our revision of the paper.
Thank you for the additional discussion on the concerns I had raised. I trust these will be incorporated into the final version of the paper and will increase my score to facilitate a consensus.
Thank you so much for your positive response and raising the score. Your questions and comments are insightful and greatly helpful for improving the quality of the paper. We will incorporate discussions regarding the connection between our study and catastrophic forgetting in continual learning, the integrity of the proposed method, and the potential follow-up research our method may inspire into the revised paper. We will also supplement the corollary along with its details and proof to validate the effectiveness of our proposed method in our revision. Thank you again for your valuable comments and recognition of our paper!
This paper proposes a new method for continual unlearning, where unlearning happens in multiple stages across multiple tasks that may be related to each other; applying existing unlearning methods in such a scenario can result in severe degradation of model utility.
Their proposed method includes three main components and improves over existing baselines.
Questions for Authors
n/a
Claims and Evidence
Yes, the paper is well-written and experiments are well-designed.
Methods and Evaluation Criteria
The method and the task setup for evaluating continual unlearning are proposed very well.
Theoretical Claims
Yes, they look good.
Experimental Design and Analysis
Experiments and analysis look compelling.
Supplementary Material
Yes, the appendix looks comprehensive.
Relation to Existing Literature
The problem of continual unlearning is indeed very important, and is similar to the problem of sequential model editing that has been studied in the literature.
Essential References Not Discussed
Literature is covered pretty well.
Other Strengths and Weaknesses
This paper introduces continual unlearning, a novel problem within the field, and demonstrates the limitations of existing methods for this task. It proposes a thoughtfully designed approach that effectively addresses these limitations, achieving improved results over current techniques.
Other Comments or Suggestions
n/a
We sincerely appreciate the reviewer's positive and encouraging feedback, as well as their recognition of the value of our work. We are delighted that the reviewer finds our approach and theory compelling. In our future work, we plan to build upon this foundation by further refining our methods and exploring additional applications of our findings, such as sequential model editing. If the reviewer has any additional questions, we are more than happy to address them.
I see the following as the main strengths of this work:
- The paper highlights the problem of cascading degradation in model utility with continual unlearning, and proposes a well-motivated method to improve the unlearning-utility tradeoff. The empirical evaluation is quite thorough, and the proposed scheme significantly outperforms various baselines by being able to maintain high levels of utility with the (continually) unlearned models.
- The proposed scheme utilizes task vectors, and the paper provides a theoretical motivation for the use of task vectors over gradient-ascent based unlearning.
The following are some of the weaknesses:
- The newly presented TRAVIS unlearning benchmark requires further evaluation to highlight why this is a worthwhile benchmark to use when compared to existing benchmarks such as TOFU or WHP. The authors provide some motivation in the author response to reviewer ex1z where they show that the utility on the TRAVIS dataset is more sensitive to (random) changes to the model parameters when compared to TOFU and WHP, highlighting the need for more targeted model updates.
- The computational overhead of the proposed scheme requires further elaboration, and the need to store various masks would probably increase the memory overhead. A memory-efficient version is presented in the appendix, and the authors provide some preliminary results in the author response to reviewer ex1z showing that model utility is maintained, though unlearning efficacy drops slightly.
One minor comment is that the authors should clarify the difference (if any) between "cascading degradation" and "accumulative decline". To me both seem to be the same thing, referring to the cumulative decline in model utility as we continually modify the model parameters to address each unlearning task.
Overall, the paper presents solid contribution to the problem of continual unlearning. The main weaknesses appear to be appropriately addressed in the author-reviewer discussion.