PaperHub

Overall rating: 8.0/10 · Oral · 4 reviewers (min 8, max 8, std 0.0; individual ratings: 8, 8, 8, 8)
Confidence: 3.3 · Correctness: 3.5 · Contribution: 3.5 · Presentation: 3.3

ICLR 2025

AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models

Submitted: 2024-09-24 · Updated: 2025-05-01
TL;DR

We propose a novel model editing method named AlphaEdit to minimize the disruption to the preserved knowledge during editing.

Keywords

Model Editing · Null-Space · Large Language Model

Reviews and Discussion

Review (Rating: 8)

Post-training an LLM can often disrupt its originally preserved knowledge. To circumvent this, AlphaEdit utilizes null-space projection to preserve old knowledge while injecting new knowledge effectively. Results show that AlphaEdit significantly reduces the domain shift between the pre- and post-edit models compared to existing methods.

Strengths

  1. The paper is well-written and easy to follow. The authors explained the null space and how to leverage null space projection to optimize the model editing objective well.
  2. I think the choice of RQs is well thought out and thorough as well. The paper answered most of the questions I had about AlphaEdit.
  3. I think figure 6 is interesting to show how AlphaEdit can generalize to existing methods.

Weaknesses

  1. The paper did not mention the correlation between the accuracy and the dataset size. More concretely, how much data is needed for AlphaEdit to work well?

Questions

  1. In line 174, ‘B is in the null space of B’ -> change to ‘B is in the null space of A’
  2. For figure 7(a), what does the ylabel refer to?
  3. Minor presentation suggestion: In figure 5, the pre and post edited distributions are difficult to set apart due to color choice. Picking two contrasting colors similar to Memit would be great for the final version.
Comment

Dear Reviewer cdNJ:  

Thank you for your kind words and positive feedback on our novelty, presentation, and effectiveness! Your approval is a great encouragement to us and motivates us to continue advancing our work.  

Below, we meticulously provide responses to each of your comments and outline the modifications made to the manuscript. All revisions are highlighted in blue.


W1: The paper did not mention the correlation between the accuracy and the dataset size.

Thank you for raising this important concern. We acknowledge that the paper did not address the correlation between dataset size and AlphaEdit's performance.

Following your suggestion, we conducted additional experiments by reducing the size of the dataset used to compute $K_0$ to proportions [0.9, 0.8, 0.7, ..., 0.1] of its original size and observed the impact on AlphaEdit's performance. Detailed results and analyses are provided in Appendix C.8 (Pages 26–27, Lines 1403–1475).

For your convenience, we summarize the key results and observations here:

**LLaMA3**

| Ratio | Eff.↑ (Counterfact) | Gen.↑ (Counterfact) | Spe.↑ (Counterfact) | Eff.↑ (ZsRE) | Gen.↑ (ZsRE) | Spe.↑ (ZsRE) |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| 1.0 | 98.90±1.21 | 94.22±0.89 | 67.88±1.34 | 94.47±0.97 | 91.13±1.02 | 32.55±1.78 |
| 0.9 | 98.32±0.92 | 93.87±1.56 | 66.23±1.18 | 94.12±1.44 | 91.76±1.02 | 31.89±1.23 |
| 0.8 | 96.75±1.35 | 92.45±0.78 | 66.45±0.99 | 94.12±1.23 | 90.95±1.42 | 30.67±1.09 |
| 0.6 | 96.12±0.86 | 91.34±1.23 | 63.34±0.94 | 93.87±1.36 | 91.12±1.11 | 29.34±1.25 |
| 0.5 | 97.93±1.23 | 94.01±1.09 | 63.51±0.97 | 92.96±0.89 | 91.67±1.03 | 28.56±1.34 |
| 0.4 | 95.88±0.78 | 92.67±1.11 | 61.78±1.09 | 92.98±1.09 | 91.76±1.28 | 28.56±0.99 |
| 0.3 | 96.98±1.67 | 93.22±0.99 | 58.56±1.08 | 93.12±1.43 | 89.12±1.23 | 27.56±1.67 |
| 0.2 | 97.45±0.97 | 91.89±1.22 | 56.89±0.89 | 93.01±0.84 | 89.99±1.09 | 26.34±1.34 |
| 0.1 | 95.21±1.03 | 90.12±1.45 | 56.12±1.22 | 92.01±1.02 | 89.97±1.28 | 25.89±1.47 |

**GPT2-XL**

| Ratio | Eff.↑ (Counterfact) | Gen.↑ (Counterfact) | Spe.↑ (Counterfact) | Eff.↑ (ZsRE) | Gen.↑ (ZsRE) | Spe.↑ (ZsRE) |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| 1.0 | 99.50±0.98 | 93.95±1.13 | 66.39±0.89 | 94.81±1.56 | 86.11±1.24 | 25.88±1.42 |
| 0.9 | 97.82±1.43 | 92.78±0.87 | 65.24±0.92 | 93.67±1.05 | 85.73±1.12 | 25.05±1.32 |
| 0.8 | 98.47±1.12 | 92.54±1.32 | 64.89±0.76 | 93.21±0.99 | 85.48±0.78 | 23.98±1.24 |
| 0.6 | 99.12±0.87 | 91.33±1.07 | 64.45±0.93 | 94.31±0.99 | 84.85±1.45 | 23.45±0.98 |
| 0.5 | 95.68±0.92 | 90.89±1.34 | 61.76±0.76 | 94.74±0.87 | 85.92±1.23 | 22.78±0.76 |
| 0.4 | 97.54±1.45 | 91.76±1.23 | 60.23±1.09 | 93.52±0.98 | 84.45±0.88 | 22.34±1.01 |
| 0.3 | 99.01±1.09 | 90.32±1.11 | 58.92±0.92 | 93.01±1.34 | 83.78±0.99 | 22.67±1.21 |
| 0.2 | 95.89±0.78 | 91.03±1.03 | 58.14±1.23 | 93.04±1.09 | 85.07±0.95 | 22.45±1.15 |
| 0.1 | 96.35±1.02 | 91.03±1.32 | 58.14±1.14 | 93.04±1.22 | 85.07±1.43 | 21.11±1.12 |
Comment
**GPT-J**

| Ratio | Eff.↑ (Counterfact) | Gen.↑ (Counterfact) | Spe.↑ (Counterfact) | Eff.↑ (ZsRE) | Gen.↑ (ZsRE) | Spe.↑ (ZsRE) |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| 1.0 | 99.75±1.15 | 96.38±1.45 | 75.48±1.25 | 99.79±1.28 | 96.00±1.67 | 28.29±1.32 |
| 0.9 | 99.43±1.09 | 96.14±1.08 | 74.75±1.11 | 97.63±1.03 | 96.11±0.98 | 27.12±1.43 |
| 0.8 | 98.34±0.76 | 96.03±1.12 | 75.21±0.89 | 97.65±0.87 | 96.01±1.32 | 25.89±0.89 |
| 0.6 | 98.94±1.17 | 95.33±1.12 | 73.89±1.21 | 99.05±1.09 | 95.45±0.84 | 24.87±1.45 |
| 0.5 | 98.46±1.21 | 94.89±1.09 | 70.88±1.05 | 98.94±1.04 | 95.12±1.28 | 23.78±0.97 |
| 0.4 | 97.74±0.88 | 94.76±1.04 | 69.89±1.18 | 98.31±1.23 | 94.91±1.09 | 23.67±1.22 |
| 0.3 | 96.74±1.32 | 94.34±0.95 | 67.95±0.84 | 97.58±0.98 | 94.23±1.15 | 22.78±1.12 |
| 0.2 | 96.94±1.09 | 94.73±1.24 | 68.04±0.92 | 97.34±0.97 | 94.12±1.02 | 23.45±1.09 |
| 0.1 | 97.02±0.89 | 94.73±0.98 | 68.04±1.09 | 97.58±1.21 | 94.23±0.89 | 23.67±1.32 |

From the tables above, we can see that:

  • As the dataset size decreases, both Efficacy and Generalization remain notably stable. Even at only 10% of the original dataset size, the drop in these metrics is negligible (less than 5%), suggesting that AlphaEdit generalizes effectively to unseen data and remains effective even with reduced data availability.  
  • In contrast, the Specificity metric declines significantly as the dataset size is reduced. When the dataset is limited to just 10% of its original volume, Specificity drops by 11.76%, indicating that the model's ability to preserve neighborhood knowledge relies heavily on the availability of a sufficiently large dataset.

Due to time constraints during the discussion phase, we have completed these experiments with LLaMA3, GPT-J, and GPT2-XL as the base models on the Counterfact and ZsRE datasets. Additional tests with other base models (e.g., Gemma, Phi) and datasets (e.g., LongformEvaluation, MQUAKE, KnowEdit) are already underway. We will update the revision as soon as we complete these experiments.  

Hope our additional experiments could address your concerns!  


Q1 & Q2: In line 174, change B to A. For Figure 7(a), what does the y-label refer to?

  Thanks for your comments. Based on your suggestions, we have made the following revisions in the updated manuscript:  

  • Line 174: Replaced "B" with the correct symbol "A": $B$ is in the null space of $A$ if and only if $BA = 0$ (a small worked instance is given after this list).  

  • Line 464: Added clarification for the vertical and horizontal axes in Figure 7(a): the vertical and horizontal axes represent the categories of knowledge and the accuracy of LLM responses involving this knowledge, respectively.  
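As a tiny worked instance of the corrected definition (purely illustrative, not taken from the paper):

$$
B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
A = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
BA = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},
$$

so $B$ lies in the null space of $A$: every row of $B$ is orthogonal to every column of $A$.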


Q3: Minor presentation suggestion: Picking contrasting colors would be great in Figure 5.

Thank you for your valuable suggestion! We have provided an updated version of the figure with adjusted colors in Lines 1943–1511 of the revised manuscript (enlarged for your convenience to facilitate comparison).

If you feel this meets your expectations, we will apply the same adjustments to all scatter plots in the camera-ready version. Should you find further refinements necessary, please let us know—we would be happy to make additional adjustments.

Hope that these updates could meet your expectations, and we would be thrilled if you could let us know whether your concerns have been addressed or if you have any follow-up questions!


Once again, we deeply appreciate your thoughtful and encouraging feedback. Your suggestions have not only enhanced the current work but have also inspired us to keep moving forward and contributing to the community!  

Best,  

Authors of Paper 3792

Comment

Dear Reviewer cdNJ,

We would like to extend our heartfelt gratitude for your thoughtful and constructive suggestions on our manuscript. Your insightful feedback has significantly strengthened the overall quality of our paper.

We hope that our responses have effectively clarified your concerns and provided satisfactory explanations. If there are any remaining questions or additional points you would like to discuss, we would be more than happy to engage in further dialogue to address them.

Once again, we sincerely appreciate the time and effort you have devoted to reviewing our manuscript!

Best regards,

Authors

Comment

Dear Reviewer cdNJ,

Thank you once again for your thoughtful and constructive feedback, which has been instrumental in refining our work. Your initial comments were incredibly valuable, and we deeply appreciate the high evaluation you have already given our submission.

As the discussion phase comes to a close, we wanted to kindly ask if you have any additional suggestions or feedback that could further improve our work. Additionally, we would love to hear whether our responses and updates have satisfactorily addressed your concerns.

(To briefly reiterate our paper's contribution, we identified a common issue in existing model editing methods—the disruption of stored correct knowledge—caused by overfitting to new knowledge, leading to distributional shifts in hidden representations. Our proposed AlphaEdit addresses this issue comprehensively across current methods, achieving this with just a single line of code.)

We are sincerely grateful for the time and expertise you have shared, and we look forward to any final thoughts you might have.

Best regards,

The Authors

Comment

Thanks to the authors for the detailed response! The response addresses my concern with the correlation between dataset size and accuracy. I also like the updated figure 12 and it clearly illustrates the distribution.

For Figure 7a, I am still slightly unsure what 'the categories of knowledge' in the y-label means. More specifically, what do the words on the y-axis -- 'language', 'was born in', 'headquartered', 'produced by', etc. -- mean?

Comment

Dear Reviewer cdNJ,

Thank you so much for your prompt and thoughtful response! We are delighted that the additional experiments and updated figures align with your expectations and have addressed your concerns.

Regarding your question about the y-axis labeled "categories of knowledge" in Figure 7a:

  • This axis represents the editing success rate of AlphaEdit for knowledge belonging to different semantic categories. For instance, the bar labeled "language" with a height of 98 indicates that out of 1,000 knowledge instances related to "language" (e.g., "The primary language in the United States is English," or "The official language of France is French"), AlphaEdit successfully edited 98% of them.

This metric provides a fine-grained assessment of AlphaEdit's effectiveness across various knowledge domains, offering a clearer picture of its advantages.

In response to your comment, we have clarified the axes and experimental details for this figure in the revised manuscript. Specifically, the updates can be found in Section 4.4 (Lines 466–469) and Appendix C.4 (Lines 1299–1304). However, since we are unable to upload a new version at this stage, we will ensure these clarifications are reflected in the camera-ready version.

We sincerely hope this response could address your concern. Once again, thank you for your timely and insightful feedback—it has been invaluable in enhancing our work!

Best regards,

The Authors

Comment

Thank you, authors, for the clarification. Yes, please do add the clarification in the final camera-ready version, since the figure was difficult to read without the explanation you have now provided. I do not have any other critical concerns. Good luck :)

Comment

Dear Reviewer cdNJ,

Thank you for your kind response and for taking the time to review our clarification. We greatly appreciate your understanding and constructive feedback, which have been invaluable throughout the review process.

Your thoughtful comments have significantly contributed to improving our work, and your encouragement means a great deal to us. We sincerely thank you for your support and guidance.

Best regards,

The Authors

Review (Rating: 8)

This is a review of the paper entitled “AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models” submitted to ICLR 2025. The paper suggests a new approach to do targeted knowledge updates in LLMs; in particular, if an LLM tells some wrong factual information, the goal is to identify influential parameters and then introduce so-called perturbation to them that, on the one hand, repairs problematic outputs and, on the other, keeps the rest as intact as possible. The main experimental result is that the new suggested method, called AlphaEdit, performs comparably to the state of the art for single updates and short sequences of updates, but outperforms them dramatically for longer ones.

Strengths

I should start by admitting that I am not a specialist in the topic of the paper, and so it is difficult for me to judge the novelty and value of the results. However, I can say that the paper is well-written: I could understand nearly everything and agree with the arguments. Moreover, for an outsider, the results look interesting and promising. Thus, I lean towards acceptance; however, of course, the opinions of reviewers who are more in the topic should be more valuable for the decision.

Weaknesses

Questions

Concrete comments:

Question: why do we need to solve the sequential editing task really sequentially, as in L (line) 244—that is, why cannot we just start from scratch with the original model and K1 being all the edits together (i.e., the union of Kp and K1 in the current equation (12)), and use Equation (11)? This seems to promise a better performance even for the AlphaEdit, looking at Figure 4.

Minor:

  • L 11: I do not think “due” is the right word here.
  • L 76: “the coefficient” is unclear. Which coefficient?
  • L 81: besides viewing in colour, one needs a magnifier to read these figures.
  • L 136: “update” and “new” do not make much sense together; either one or the other.
  • L 173: B -> A.
  • L 177 (equation (7)): \Delta is not really \Delta here, it is the projected \Delta, which should be said or better denoted by another symbol.
  • L 193: it is not clear what it means to be “consistent” in this context.

Comment

Dear Reviewer ngVy:  

Thank you for your kind words and positive feedback! Your approval is a great encouragement to us and motivates us to continue advancing our work.  

Below, we meticulously provide responses to each of your comments and outline the modifications made to the manuscript. All revisions are highlighted in blue.


Q1. Why do we need to solve the sequential editing task sequentially; that is, why can't we just start from scratch with $K_1$ being all the edits together?

Great catch! Your question highlights a fundamental issue in model editing: why do we need sequential editing rather than starting from scratch and applying all the edits at once? Upon reviewing the current literature, we summarize two main reasons:  

  1. Time and Computational Efficiency: In real-world scenarios, new knowledge constantly emerges. Sequential editing allows us to incorporate only the new knowledge each time, resulting in a computational cost of 1 + 1 + 1 + ... + 1 (N times) for N updates. In contrast, starting from scratch requires re-editing all previous knowledge with each new addition, leading to a total cost of 1 + 2 + 3 + ... + N, which becomes unsustainable as N grows (the comparison is written out below this list).

  2. Privacy Concerns: In certain scenarios where privacy and security are critical, model users may not want previously injected knowledge to be visible to future users. Sequential editing can address this need perfectly. In contrast, starting from scratch would require accessing all previously edited knowledge with each new addition, potentially compromising privacy.
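Written out, the two cost schedules differ by an order in $N$:

$$
\underbrace{1 + 1 + \cdots + 1}_{N\ \text{edits}} = N
\qquad \text{vs.} \qquad
1 + 2 + \cdots + N = \frac{N(N+1)}{2} = O(N^2).
$$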

Beyond these points, sequential editing also requires less memory than starting from scratch, as it avoids the need to store all prior knowledge.

In a nutshell, sequential editing is far more efficient in terms of time, computation, and memory, and it can be applied to a wider range of real-world scenarios. This is why research into sequential editing has gained increasing prominence more recently.  

However, it's worth noting that while sequential editing has many advantages, earlier approaches faced a key challenge: as the number of edits grows, cumulative changes can degrade the model, potentially reducing its performance. Some prior methods, such as PRUNE, made attempts to mitigate this issue, but the problem persists with increasing edits. Our approach, AlphaEdit, addresses this issue with a minimal adjustment in code, which we believe is the main contribution of our paper to the community.

Hope our response could address your concerns!


Q2. Minor: (1) “due” in L 11 is not right. (2) “coefficient” in L 76 is unclear. (3) A magnifier is needed to read the figures in L 81. (4) “update” & “new” in L 136: either one or the other. (5) L 173: replace B with A. (6) L 177: the projected $\Delta$ should be denoted by another symbol. (7) L 193: What “consistent” means here is unclear.

Thank you for your suggestions! Based on your suggestions, we have made the following revisions in the updated manuscript:

  1. L11: Replaced "due to" with "producing": LLMs often exhibit hallucinations, producing incorrect or outdated knowledge.

  2. L76: Added clarification for "coefficient": $\lambda$ is the coefficient that balances $e_0$ and $e_1$ in the objective.

  3. L81: Added a reference to detailed numerical results: detailed settings and results are provided in Section 4.2 and Table 1, respectively.

  4. L136: Removed "new": each edit needs to update u pieces of knowledge in the form of (s, r, o).

  5. L173: Replaced "B" with the correct symbol "A": $B$ is in the null space of $A$ if and only if $BA = 0$.

  6. L177: Introduced the symbol $\Delta'$ to represent the projected $\Delta$: $\Delta' K_0 = 0$, where $\Delta'$ denotes the projected perturbation (written out in full below this list).

  7. L193: Replaced "consistent" with "equal to": this matrix's null space is equal to that of $K_0$.
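For completeness, one way to write the projected perturbation out in full (using the threshold-based construction of $P$ described in the public comments at the end of this page, and assuming the projection is applied on the right):

$$
K_0 K_0^{\top} = U \Lambda U^{\top}, \qquad
P = \hat{U} \hat{U}^{\top}, \qquad
\Delta' = \Delta P \;\;\Rightarrow\;\; \Delta' K_0 \approx 0,
$$

where $\hat{U}$ collects the eigenvectors whose eigenvalues fall below a small threshold.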

Hope that these updates could meet your expectations, and we would be thrilled if you could let us know whether your concerns have been addressed or if you have any follow-up questions!


Once again, we deeply appreciate your thoughtful and encouraging feedback. Your suggestions have not only enhanced the current work but have also inspired us to keep moving forward and contributing to the community!  

Best,  

Authors of Paper 3792

Comment

Dear Reviewer ngVy,

We greatly appreciate your positive feedback and constructive suggestions, which have been instrumental in improving the quality of our work.

If you have any additional questions or concerns that we can clarify or address, we would be happy to provide further information to ensure all aspects of our work are clear.

Thank you once again for your valuable time and effort in reviewing our submission!

Best regards,

Authors

Comment

Dear Reviewer ngVy,

Thank you once again for your thoughtful and constructive feedback, which has been instrumental in refining our work. Your initial comments were incredibly valuable, and we deeply appreciate the high evaluation you have already given our submission.

As the discussion phase comes to a close, we wanted to kindly ask if you have any additional suggestions or feedback that could further improve our work. Additionally, we would love to hear whether our responses and updates have satisfactorily addressed your concerns.

(To briefly reiterate our paper's contribution, we identified a common issue in existing model editing methods—the disruption of stored correct knowledge—caused by overfitting to new knowledge, leading to distributional shifts in hidden representations. Our proposed AlphaEdit addresses this issue comprehensively across current methods, achieving this with just a single line of code.)

We are sincerely grateful for the time and expertise you have shared, and we look forward to any final thoughts you might have.

Best regards,

The Authors

Review (Rating: 8)

The paper introduces AlphaEdit, a method to improve targeted knowledge editing in large language models by projecting updates onto the null space of preserved knowledge, thus reducing interference with existing information. AlphaEdit achieves this with a minimal adjustment in code, enabling it to maintain a model’s pre-existing knowledge while updating targeted information. Experimental results demonstrate AlphaEdit's effectiveness, showing a performance improvement over traditional editing methods across multiple language models.

Strengths

  1. The use of null-space projection in AlphaEdit minimizes disruption to preserved knowledge while updating new information, effectively addressing a common trade-off in model editing between knowledge update and retention.
  2. The paper provides comprehensive experimental evidence that AlphaEdit outperforms existing methods on critical editing metrics such as efficacy, generalization, specificity, fluency, and consistency.

Weaknesses

  1. Accurate null-space projection may rely on high-dimensional matrix computations, which could pose scalability issues as model sizes or knowledge bases grow.
  2. Limited empirical evaluation on diverse LLMs. The method is tested on models like GPT-2 XL, GPT-J, and LLaMA3 only. It would be good to see results for other models such as gemma, phi.

Questions

  1. The authors may have overlooked some related methods. There are other recent methods such as SERAC, GRACE, InstructEdit, and MELO. The authors should either provide comparison results or explain why these methods were not considered for comparison. https://sites.google.com/view/serac-editing https://arxiv.org/abs/2211.11031 https://arxiv.org/abs/2402.16123 https://arxiv.org/abs/2312.11795

  2. Can authors show the results on KnowEdit dataset as well?

Comment

Dear Reviewer JPxS:

Thank you for your positive feedback and valuable suggestions! We sincerely appreciate the time and effort you have dedicated to reviewing our work. Below, we meticulously provide responses to each of your comments and outline the modifications based on your suggestions. All revisions are highlighted in blue.


W1: null-space projection may rely on high-dimensional matrix computations, which could pose scalability issues as model sizes or knowledge bases grow.

Thank you for raising this important concern! In fact, the computational complexity of null-space projection in AlphaEdit is unaffected by the base LLM’s size and knowledge base. We provide our reasoning from both theoretical analysis and experimental validation below:

  1. Theoretical Analysis:

    • As stated in the original manuscript (Lines 191–192), implementing null-space projection in AlphaEdit only requires calculating the null-space projection matrix for $K_0 K_0^T \in \mathbb{R}^{d_0 \times d_0}$. The computational complexity of this calculation depends solely on the hidden dimension $d_0$ of the base LLM, and is independent of the model size and the knowledge base. Furthermore, the hidden dimensions $d_0$ of commonly used LLMs are typically a few thousand, so the time cost of null-space projection is negligible compared with the gradient-descent cost of baseline methods such as MEMIT.
  2. Experimental Validation:

    • To empirically verify our claims and analysis, we tested the average runtime for 100 edits performed by AlphaEdit on three different LLMs with varying model sizes and knowledge bases (LLaMA3, GPT-J, and GPT2-XL). The results are summarized below:
| Method | LLaMA3 (Counterfact) | GPT-J (Counterfact) | GPT2-XL (Counterfact) | LLaMA3 (ZsRE) | GPT-J (ZsRE) | GPT2-XL (ZsRE) |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| MEMIT | 222.51s | 334.74s | 474.14s | 231.32s | 344.21s | 488.37s |
| AlphaEdit | 223.24s | 336.93s | 476.79s | 231.40s | 345.52s | 490.25s |

From the table, we observe that across LLMs of varying sizes and knowledge bases, AlphaEdit does not incur additional time costs compared to MEMIT. This validates the scalability of our null-space projection approach.  

Additionally, in response to your helpful comment, we have added a new subsection to the revised manuscript (Appendix C.9, Lines 1404–1412), where we present both the theoretical analysis and experimental results on AlphaEdit’s runtime. We hope this addition will address similar concerns from other readers.  
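To make the computation concrete, here is a minimal sketch of the one-time projection-matrix construction (for illustration only: `d0`, `n`, and the threshold below are placeholder values, not the settings used in our experiments). The dominant SVD step operates on a $d_0 \times d_0$ matrix regardless of how many preserved-knowledge keys are collected:

```python
import numpy as np

d0, n = 1024, 10_000         # hidden dim and number of preserved keys (placeholders)
K0 = np.random.randn(d0, n)  # stand-in for the cached keys of preserved knowledge

# A single pass over the keys yields a d0 x d0 covariance;
# every step after this is independent of n.
C = K0 @ K0.T

# SVD of the d0 x d0 covariance: the dominant cost, O(d0^3), done once and cached.
U, S, _ = np.linalg.svd(C)

# Directions with near-zero singular values span the approximate null space.
# (Real LLM keys have a long-tailed spectrum; random data here only illustrates shapes.)
U_hat = U[:, S < 1e-2 * S.max()]
P = U_hat @ U_hat.T          # d0 x d0 projection matrix, reused for every subsequent edit
```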

Hope our response could address your concern!

Comment

W2: It would be good to see results for other models such as gemma and phi.

Thank you for your valuable suggestion! Following your feedback, we have expanded our experiments to include two additional base LLMs, Gemma and phi-1.5, as you recommended. These new results are now included in the revised manuscript (Appendix C.7, Lines 1347–1401).

For your convenience, we provide a summary of the representative results and analysis below:  

**Counterfact**

| Method | Model | Eff.↑ | Gen.↑ | Spe.↑ | Flu.↑ |
|:-:|:-:|:-:|:-:|:-:|:-:|
| MEMIT | Gemma | 64.68±0.21 | 60.36±0.30 | 46.73±0.62 | 373.94±1.12 |
| RECT | Gemma | 65.17±0.19 | 57.48±0.64 | 52.54±0.54 | 388.77±0.44 |
| AlphaEdit | Gemma | **75.21±0.09** | **67.83±0.63** | **52.63±0.49** | **398.96±0.39** |
| MEMIT | phi-1.5 | 55.71±1.63 | 56.58±0.78 | 35.41±0.99 | 368.57±1.26 |
| RECT | phi-1.5 | 58.19±0.73 | 58.92±0.76 | 38.46±0.92 | 362.94±1.44 |
| AlphaEdit | phi-1.5 | **70.79±0.56** | **65.12±0.88** | **48.96±0.96** | **399.47±0.67** |

**ZsRE**

| Method | Model | Eff.↑ | Gen.↑ | Spe.↑ |
|:-:|:-:|:-:|:-:|:-:|
| MEMIT | Gemma | 64.38±0.26 | 66.12±0.46 | **24.52±0.38** |
| RECT | Gemma | 67.18±0.50 | 64.12±0.47 | 20.02±0.47 |
| AlphaEdit | Gemma | **75.91±0.42** | **68.12±0.67** | 23.50±0.56 |
| MEMIT | phi-1.5 | 54.41±0.78 | 52.47±0.89 | **20.98±0.58** |
| RECT | phi-1.5 | 55.15±0.72 | 53.64±0.83 | 18.58±0.65 |
| AlphaEdit | phi-1.5 | **70.02±0.85** | **63.19±0.72** | 20.69±0.73 |

From the table above, we can see that:

  • AlphaEdit consistently outperforms MEMIT and RECT across key metrics on both the Counterfact and ZsRE datasets. Notably, on Gemma, AlphaEdit achieves the highest fluency (398.96) and consistency (32.91), reflecting its ability to maintain coherence and accuracy. Similarly, on phi-1.5, AlphaEdit excels in efficacy (70.79) and fluency (399.47), showcasing its adaptability to smaller, efficient models.

These findings demonstrate AlphaEdit’s generalizability across a wider range of LLM architectures, underscoring its ability to deliver high-quality edits while preserving model integrity.

Hope our additional experiments could resolve your concern!


 

Q1: Authors may have overlooked these methods: SERAC, GRACE, InstructEdit, and MELO when providing the comparison results.

 

Thank you for highlighting these important methods! We sincerely appreciate your effort to help us improve the manuscript.

In the original submission, we did not include comparisons with methods such as SERAC, GRACE, and MELO, as they primarily focus on parameter-preserving strategies that require additional memory modules, whereas AlphaEdit is specifically designed as a parameter-modifying editing method that directly alters model parameters.  

That said, your insightful comment has helped us recognize the potential value of including such comparisons for a more comprehensive evaluation. In response to your valuable input:  

  1. We have provided detailed descriptions of SERAC, GRACE, InstructEdit, and MELO in Related Work (Line 511–515) and Experimental Setup (Line 344-347, 810-815 and 834–847) of the revised manuscript;  
  2. We have employed SERAC, GRACE, InstructEdit, and MELO to conduct additional experiments, and presented the corresponding results and analysis in Section 4.2 (Line 280, 289 and 299) and Appendix C.5 (Line 1296–1319).
Comment

Q2: Can authors show the results on KnowEdit dataset as well?

Thanks for your valuable suggestion! Following your feedback:

  1. We have provided detailed descriptions of KnowEdit in Related Work (Line 516–518) and Experimental Setup (Line 354–355 & 721–727) of the revised manuscript;

  2. We have conducted additional experiments using two representative datasets from the KnowEdit database—wiki_recent and wikibio. The corresponding results and analyses have been included in Appendix C.7 (Line 1347–1395).

For your convenience, we summarize some of the results and analysis below:

| Method | Edit Succ.↑ (wiki-recent) | Portability↑ (wiki-recent) | Locality↑ (wiki-recent) | Fluency↑ (wiki-recent) | Edit Succ.↑ (wikibio) | Locality↑ (wikibio) | Fluency↑ (wikibio) |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| MEMIT | 56.25±0.28 | 42.73±0.27 | 41.02±0.20 | 513.35±3.47 | 63.73±0.40 | 64.27±0.41 | 582.38±3.34 |
| RECT | 82.47±0.53 | 51.28±0.25 | 48.84±0.24 | 568.62±3.71 | 91.48±0.48 | 72.83±0.44 | 612.04±4.29 |
| AlphaEdit | **96.10±0.47** | **57.30±0.38** | **54.76±0.30** | **594.52±3.91** | **95.34±0.46** | **75.34±0.50** | **618.35±4.22** |

From the table above, we can see that AlphaEdit demonstrates a remarkable ability to achieve high editing success rates on both wiki_recent and wikibio. For instance, on the wiki_recent dataset, AlphaEdit achieves an impressive 96.10% editing success, significantly higher than the second-best method, RECT (82.47%).

Additionally, to provide a more comprehensive evaluation of AlphaEdit’s performance, we have also introduced the following experiments in the revised manuscript:

  1. Added LEME [1] dataset to assess the performance of AlphaEdit across different output text lengths.

  2. Added MQuAKE [2] dataset to evaluate the ability of AlphaEdit to answer multi-hop knowledge-based questions.

If you are interested, we warmly encourage you to refer to Appendices C.7 for the complete results and detailed analysis.

Hope that these updates could meet your expectations, and we would be thrilled if you could let us know whether your concerns have been addressed or if you have any follow-up questions!


Once again, we deeply appreciate your thoughtful and encouraging feedback. Your suggestions have not only enhanced the current work but have also inspired us to keep moving forward and contributing to the community!

Best,

Authors of Paper 3792

[1] Long-form evaluation of model editing. 2024

[2] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions. 2023

Comment

To provide a quick overview, we summarize some of the results and analysis below:

| Method | Model | Eff.↑ (Counterfact) | Gen.↑ (Counterfact) | Spe.↑ (Counterfact) | Flu.↑ (Counterfact) | Consis.↑ (Counterfact) | Eff.↑ (ZsRE) | Gen.↑ (ZsRE) | Spe.↑ (ZsRE) |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| Pre-edited | LLaMA3 | 7.85±0.26 | 10.58±0.26 | 89.48±0.18 | 635.23±0.11 | 24.14±0.08 | 36.99±0.30 | 36.34±0.30 | 31.89±0.22 |
| InstructEdit | LLaMA3 | 66.58±0.24 | 64.18±0.35 | 47.14±0.37 | 443.85±0.78 | 7.28±0.04 | 1.58±0.04 | 1.36±0.08 | 1.01±0.05 |
| SERAC | LLaMA3 | 71.21±0.56 | 61.05±0.39 | 66.90±0.21 | 615.72±0.34 | 20.77±0.13 | 67.75±0.24 | 33.96±0.35 | 22.17±0.15 |
| GRACE | LLaMA3 | 96.72±0.13 | 50.14±0.01 | **72.23±0.21** | 620.43±0.63 | 23.79±0.23 | 93.58±0.31 | 1.03±0.06 | 31.86±0.12 |
| MELO | LLaMA3 | 65.29±0.13 | 58.58±0.32 | 63.36±0.37 | 608.98±0.82 | 22.18±0.04 | 25.18±0.14 | 24.14±0.23 | 30.36±0.75 |
| AlphaEdit | LLaMA3 | **98.90±0.10** | **94.22±0.19** | 67.88±0.29 | **622.49±0.16** | **32.40±0.11** | **94.47±0.13** | **91.13±0.19** | **32.55±0.22** |
| Pre-edited | GPT-J | 16.22±0.31 | 18.56±0.45 | 83.11±0.13 | 621.81±0.67 | 29.74±0.51 | 26.32±0.37 | 25.79±0.25 | 27.42±0.53 |
| InstructEdit | GPT-J | 50.62±0.58 | 51.73±0.42 | 56.28±0.50 | 245.89±0.44 | 4.21±0.04 | 0.92±0.07 | 0.88±0.03 | 0.65±0.06 |
| SERAC | GPT-J | 82.28±0.26 | 58.31±0.34 | 68.98±0.32 | 615.92±0.72 | 28.65±0.17 | 92.37±0.29 | 38.21±0.32 | 25.17±0.25 |
| GRACE | GPT-J | 96.50±0.24 | 50.10±0.01 | 74.42±0.43 | **620.56±0.79** | 31.55±0.25 | 96.54±0.21 | 0.40±0.02 | 24.78±0.21 |
| MELO | GPT-J | 78.29±0.24 | 60.52±0.52 | 66.80±0.52 | 610.82±0.44 | 24.31±0.24 | 82.24±0.07 | 32.88±0.03 | 26.65±0.24 |
| AlphaEdit | GPT-J | **99.75±0.08** | **96.38±0.23** | **75.48±0.21** | 618.50±0.17 | **42.08±0.15** | **99.79±0.14** | **96.00±0.22** | **28.29±0.25** |
| Pre-edited | GPT2-XL | 22.23±0.73 | 24.34±0.62 | 78.53±0.32 | 626.64±0.31 | 31.88±0.20 | 22.19±0.24 | 31.30±0.27 | 24.15±0.32 |
| InstructEdit | GPT2-XL | 55.32±0.58 | 53.63±0.42 | 53.25±0.62 | 412.57±0.15 | 1.08±0.03 | 3.54±0.03 | 4.25±0.02 | 3.23±0.04 |
| SERAC | GPT2-XL | 72.25±0.15 | 58.18±0.32 | 64.06±0.37 | 595.35±0.35 | 27.35±0.12 | 92.17±0.67 | 36.57±0.72 | 20.67±0.22 |
| GRACE | GPT2-XL | 98.88±0.13 | 50.05±0.01 | **72.07±0.24** | **620.21±0.49** | 28.53±0.15 | 94.33±0.37 | 1.59±0.03 | **27.63±0.43** |
| MELO | GPT2-XL | 72.62±0.58 | 53.63±0.42 | 63.25±0.36 | 588.57±0.65 | 23.58±0.33 | 93.54±0.03 | 45.25±0.02 | 23.45±0.24 |
| AlphaEdit | GPT2-XL | **99.50±0.04** | **93.95±0.34** | 66.39±0.31 | 597.88±0.18 | **39.38±0.15** | **94.81±0.30** | **86.11±0.29** | 25.88±0.21 |

From the above table, we observe the following key findings:

  1. Across all base LLMs and datasets, AlphaEdit consistently achieves the highest scores in efficacy (Eff.) and generalization (Gen.). This indicates that AlphaEdit is highly effective at correctly applying the desired edits while maintaining robust generalization to the relevant knowledge.

  2. While AlphaEdit generally achieves competitive scores in specificity (Spe.) and fluency (Flu.), it does not always surpass the memory-based methods. However, we believe this trade-off is reasonable and acceptable because memory-based methods inherently rely on consuming storage space to better preserve existing knowledge.

If there are any other models you would like us to discuss or include as baselines, we would be more than happy to conduct additional experiments to incorporate them!

Hope our response could address your concerns!

Comment

Thank you for incorporating the suggestions from the review. I have increased the rating to 8.

Comment

Dear Reviewer JPxS,

Thank you for your kind feedback and for taking the time to review our updated work. We are grateful for your recognition and for increasing the rating—it means a lot to us and inspires us to continue improving.

We are deeply committed to advancing the field of efficient knowledge updates for LLMs. Your valuable comments have helped us refine our work, and we are excited to keep contributing meaningful insights and solutions to this area.

Thank you again for your thoughtful comments and encouragement. We genuinely appreciate your support.

Best regards,

Authors

Review (Rating: 8)

This paper introduces AlphaEdit, a novel method for knowledge editing in large language models (LLMs). The primary goal of AlphaEdit is to enable targeted knowledge updates while minimizing the disruption of existing knowledge. The authors propose projecting perturbations onto the null space of the preserved knowledge before applying them to the model parameters. This approach theoretically ensures that the output of the edited LLM remains unchanged when queried about the preserved knowledge, thereby mitigating the issue of knowledge disruption. Extensive experiments on various LLMs, including LLaMA3, GPT2-XL, and GPT-J, demonstrate that AlphaEdit significantly boosts the performance of existing model editing methods by an average of 36.4% with minimal additional code.

Strengths

  • The concept of projecting perturbations onto the null space of preserved knowledge is innovative and addresses a significant challenge in the field of knowledge editing for LLMs. The theoretical foundation provided in the paper is robust and well-explained.
  • The authors conduct extensive experiments on multiple representative LLMs, demonstrating the effectiveness of AlphaEdit. The performance improvements are substantial and consistent across different models.

Weaknesses

  1. Well, actually I think the work is great and I do not see any weaknesses. The one thing is that I think the authors could run more benchmarks, like LongformEvaluation [1] and MQuAKE [2], which consider more knowledge-utilization ability for knowledge editing. But the current evaluation is good enough.

[1] Long-form evaluation of model editing

[2] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions

Questions

N/A

Comment

Dear Reviewer u8aL:

Thank you for your kind words and positive feedback regarding the novelty, presentation, and effectiveness of our work! Your approval is a great encouragement to us and motivates us to continue advancing our research.  

We also appreciate your insightful suggestion to include additional benchmarks: LongformEvaluation and MQUAKE. In response to your valuable input:  

  • In the revised manuscript, we have provided detailed descriptions of these datasets in the Related Work (Line 516–521) and the Experimental Setup (Line 352–355 & 727–736).  
  • We have conducted additional experiments on these two datasets, and presented the corresponding results and analysis in Appendix C.7 (Lines 1347–1401).

All changes are highlighted in blue. For your convenience, we summarize some of the results and analysis below:

| Model | Method | Multi-hop↑ (MQuAKE) | Multi-hop (CoT)↑ (MQuAKE) | Edit↑ (Longform) | Factual↑ (Longform) | Internal↑ (Longform) |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| GPT-J | MEMIT | 3.35±0.07 | 6.13±0.12 | 2.11±0.18 | 2.02±0.17 | 3.84±0.29 |
| GPT-J | RECT | 3.77±0.04 | 7.61±0.20 | 2.24±0.20 | 2.62±0.19 | 4.07±0.31 |
| GPT-J | AlphaEdit | **5.03±0.16** | **9.14±0.21** | **3.34±0.26** | **3.80±0.28** | **5.42±0.41** |
| GPT2-XL | MEMIT | 3.14±0.08 | 6.25±0.11 | 1.92±0.22 | 2.31±0.20 | 3.85±0.34 |
| GPT2-XL | RECT | 3.72±0.06 | 7.48±0.24 | 2.12±0.26 | 2.60±0.21 | 4.13±0.29 |
| GPT2-XL | AlphaEdit | **5.00±0.23** | **9.25±0.27** | **3.28±0.36** | **3.07±0.33** | **5.76±0.49** |

From the table, we observe the following key findings:  

  1. On the MQuAKE dataset, AlphaEdit demonstrates superior performance on both metrics, achieving scores of 9.14 and 9.75, respectively. These results significantly surpass competing methods, showcasing AlphaEdit's capability in handling complex reasoning tasks while maintaining logical consistency across interdependent facts.  
  2. On the LongformEvaluation dataset, AlphaEdit excels in all three metrics, reflecting its ability to generate accurate long-form outputs. Its consistently high performance across GPT-J and GPT2-XL highlights its reliability in executing precise edits while preserving the structural coherence and integrity of the generated text.

Additionally, to provide a more comprehensive evaluation of AlphaEdit’s performance, we have also introduced the following experiments in the revised manuscript:

  • New Datasets: Added wiki_recent and wikibio from KnowEdit database to evaluate performance across diverse knowledge content types.

  • New Baselines: Incorporated four new baselines—SERAC, GRACE, InstructEdit, and MELO, spanning two distinct categories (hypernetwork-based and memory-based approaches).

  • New Base LLMs: Integrated two additional base LLMs—Gemma and Phi.

  • Runtime Evaluation: Conducted new runtime assessments to measure the computational efficiency of AlphaEdit across various base LLMs.

If you are interested, we warmly encourage you to refer to Appendices C.5, C.6, C.7, C.8, and C.9 for the complete results and detailed analysis.  

Hope that these updates could meet your expectations, and we would be thrilled if you could let us know whether your concerns have been addressed or if you have any follow-up questions!


Once again, we deeply appreciate your thoughtful and encouraging feedback. Your suggestions have not only enhanced the current work but have also inspired us to keep moving forward and contributing to the community!  

Best,  

Authors of Paper 3792

Comment

Dear Reviewer u8aL,

Thank you for your positive feedback and thoughtful suggestion to include the Longform Evaluation and MQuAKE datasets. We hope our updates could align with your expectations.

If you have any further questions or suggestions for improving our paper, we would be truly grateful to hear them.

Once again, we deeply appreciate the time and expertise you have dedicated to reviewing our work!

Best regards,

Authors of Paper 3792

Comment

Dear Reviewer u8aL,

Thank you once again for your thoughtful and constructive feedback, which has been instrumental in refining our work. Your initial comments were incredibly valuable, and we deeply appreciate the high evaluation you have already given our submission.

As the discussion phase comes to a close, we wanted to kindly ask if you have any additional suggestions or feedback that could further improve our work. Additionally, we would love to hear whether our responses and updates have satisfactorily addressed your concerns.

(To briefly reiterate our paper's contribution, we identified a common issue in existing model editing methods—the disruption of stored correct knowledge—caused by overfitting to new knowledge, leading to distributional shifts in hidden representations. Our proposed AlphaEdit addresses this issue comprehensively across current methods, achieving this with just a single line of code.)

We are sincerely grateful for the time and expertise you have shared, and we look forward to any final thoughts you might have.

Best regards,

The Authors

Comment

Dear Reviewers,

We sincerely appreciate your time, efforts, and insightful feedback on our work! We are delighted that all reviewers recognized the motivation, novelty, presentation, and experimental effectiveness of our study.

In particular, we are grateful for your positive remarks about the significance of our work, such as Reviewer u8aL’s comment: "AlphaEdit addresses a significant challenge in the field of knowledge editing" and Reviewer JPxS’s note: "AlphaEdit effectively addresses a common trade-off in model editing."

Below, we provide point-by-point responses to your comments and outline the revisions made to the manuscript based on your suggestions. All revisions are highlighted in blue. Notably, most comments suggest conducting additional experiments. In response, we have conducted a comprehensive set of new experiments, which we summarize here:

  • Three new datasets: KnowEdit, LEME (Longform Evaluation), and MQuAKE.

  • Four new baselines: InstructEdit, SERAC, GRACE, and MELO.

  • Two new base LLMs: Gemma and phi-1.5.

  • Impact of dataset size: Relationship between the amount of data and AlphaEdit's performance.

  • Runtime experiments: Execution time of AlphaEdit and baselines.

We warmly encourage you to review the results in the revised manuscript. Hope our response and additional experiments could address your concerns!

Furthermore, please allow us to reiterate the key contribution of our work: AlphaEdit addresses the common trade-off in model editing—updating incorrect knowledge while preserving correct knowledge—with a single line of code. We believe this contribution is crucial for advancing the field of editing LLMs, and we are truly grateful for your recognition of its significance.
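For readers curious what that single line amounts to in practice, here is a schematic sketch (hypothetical variable names and toy shapes, not our released code; the real integration lives in our repository) of how it slots into an existing editor such as MEMIT:

```python
import numpy as np

d1, d0 = 16, 8                   # layer output/input dims (toy values)
W = np.zeros((d1, d0))           # stand-in for the layer weight being edited
delta = np.random.randn(d1, d0)  # update computed by the base editor (e.g., MEMIT)
P = np.eye(d0)                   # cached null-space projection of K_0 (placeholder)

delta = delta @ P                # the "one line": confine the edit to K_0's null space
W += delta                       # apply the projected update as usual
```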

Once again, we deeply appreciate the time and expertise you have shared with us. Your encouraging feedback motivates us to continue advancing this work for the broader community, and we are more than happy to add clarifications to address any additional recommendations and reviews from you!

Best regards,

Authors of Paper 3792

Comment

Dear Area Chair and Reviewers,

Thank you for your support throughout the discussion phase! We deeply appreciate the time and effort you have dedicated to reviewing and discussing our submission.

As the discussion phase comes to an end, we would like to take this opportunity to summarize the efforts we have made during this period. In response to your suggestions, we conducted extensive additional experiments, including three new datasets, two additional base LLMs, and four new baselines. We are pleased that these experiments have further validated our central claim: a common issue in existing model editing methods—the disruption of stored correct knowledge—is caused by overfitting to new knowledge, leading to distributional shifts in hidden representations. Our method, AlphaEdit, addresses this challenge comprehensively across various scenarios with just a single line of code, demonstrating its significance for advancing the model editing community.

Once again, we sincerely appreciate the constructive discussions and the opportunity to refine our work. Your contributions have been invaluable, and we look forward to continued engagement with the community!

Best regards,

Authors of Paper 3792

AC Meta-Review

This paper presents AlphaEdit, a novel method for knowledge editing in large language models (LLMs). The goal of AlphaEdit is to enable targeted updates to knowledge while minimizing disruption to existing information. The method involves projecting perturbations onto the null space of preserved knowledge before applying them to the model parameters, ensuring that the edited LLM’s output remains unchanged for queries related to preserved knowledge. Extensive experiments on various LLMs, including LLaMA3, GPT2-XL, and GPT-J, show that AlphaEdit improves the performance of existing editing methods by an average of 36.4% with minimal additional code.

This paper makes a significant contribution to the field of knowledge editing, presenting an elegant and well-designed method from a theoretical perspective. The proposed editing approach significantly reduces the impact on general capabilities, and the code is concise and elegant. All reviewers agree that the paper should be accepted for publication.

Additional Comments on Reviewer Discussion

The authors provided many additional experiments during the rebuttal phase, and all reviewers gave the paper very high praise.

Final Decision

Accept (Oral)

Public Comment

Dear authors,

Congrats and thanks for this great work.

I have a question about the null space of $\mathbf{K}_0$. As $\mathbf{K}_0$ is a $d_o \times 100{,}000$ matrix with $d_o < 100{,}000$ (a flat matrix), when $\mathbf{K}_0$ has full row rank (i.e., rank $= d_o$), its left null space contains only the zero vector $\mathbf{0}$. In this case, any update $\mathbf{\Delta}$ projected onto the left null space of $\mathbf{K}_0$ will become $\mathbf{0}$.

So do we need to assume $\mathbf{K}_0$ has rank smaller than $d_o$? If yes, why is this assumption reasonable?

Best,

Public Comment

Dear Weisen Jiang,

Thank you for your insightful question. Your observation regarding the null space of the matrix $\mathbf{K}_0$ is indeed highly pertinent!

In our work, we acknowledge that $\mathbf{K}_0$ is typically full-rank in most practical scenarios. As explicitly noted in the paper (see the footnote on page 4), the null space we obtain is an approximate one. This approximation stems from the pronounced long-tail characteristic of the singular-value distribution of $\mathbf{K}_0$. To address this, the footnote on page 4 presents a threshold-based approach to singular-value selection: the null-space projection matrix is constructed from the directions whose singular values fall below a predetermined threshold. This ensures that the projected delta has a relatively minor impact on $\mathbf{K}_0$. However, threshold selection requires careful consideration: while an excessively low threshold would unduly restrict the magnitude of parameter updates, too high a threshold might compromise the effectiveness of our approach.

For practical implementation, we've provided empirical values in our GitHub repository (https://github.com/jianghoucheng/AlphaEdit) for reference. Additionally, our method has been integrated into the EasyEdit platform (https://github.com/zjunlp/EasyEdit), which you might find useful for experimentation and application.
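To illustrate this trade-off, here is a small self-contained sketch (the synthetic long-tailed spectrum and the threshold values are placeholders for illustration, not numbers from the paper) showing how the threshold trades remaining update directions against the relative residual on $\mathbf{K}_0$:

```python
import numpy as np

rng = np.random.default_rng(0)
d0, n = 64, 4096

# Synthesize keys whose singular values decay quickly,
# mimicking the long-tailed spectrum observed for real K0.
Q = np.linalg.qr(rng.standard_normal((d0, d0)))[0]
spectrum = np.exp(-np.arange(d0) / 8.0)
K0 = Q @ np.diag(spectrum) @ rng.standard_normal((d0, n)) / np.sqrt(n)

U, S, _ = np.linalg.svd(K0 @ K0.T)
Delta = rng.standard_normal((d0, d0))

for thresh in (1e-2, 1e-4, 1e-6):
    U_hat = U[:, S < thresh]          # approximate null-space basis
    P = U_hat @ U_hat.T
    residual = np.linalg.norm(Delta @ P @ K0) / np.linalg.norm(Delta @ K0)
    print(f"thresh={thresh:.0e}  directions kept={U_hat.shape[1]:2d}/{d0}  "
          f"relative residual={residual:.2e}")
```

A lower threshold keeps fewer directions (restricting the update) but drives the residual on $\mathbf{K}_0$ further toward zero, which is exactly the balance discussed above.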

Please let me know if you have any further questions or need additional clarification.

Best regards,

Authors of Paper 3792

Public Comment

I read your article carefully and found the research content very profound and valuable. However, I have a question. You said that we need to project the update onto the null space of $K_0$. But will the projection matrix $P$ affect the update effectiveness for $K_1$?

Public Comment

Thank you for your thoughtful question! You raise a critical theoretical concern. While our current empirical observations suggest that the projection matrix P does not significantly hinder the update effectiveness of K₁ in practice, this is largely attributed to the high-dimensional over-parameterization of LLMs, which inherently provides redundant degrees of freedom for parameter adjustments. However, we fully acknowledge that this assumption may not hold universally — for instance, in scenarios with low-rank knowledge updates or highly constrained parameter budgets, the impact of P could become non-negligible.

This is indeed a vital direction we overlooked, and we strongly agree that it warrants deeper theoretical analysis (e.g., quantifying the rank reduction induced by P or exploring adaptive projection thresholds). Should you decide to delve into this direction, we would be happy to support your efforts.
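One way to make the suggested "rank reduction induced by $P$" concrete (a back-of-the-envelope observation rather than a result from the paper): writing $P = \hat{U}\hat{U}^{\top}$ with $\hat{U} \in \mathbb{R}^{d_0 \times r}$,

$$
\Delta' = \Delta P = (\Delta \hat{U})\, \hat{U}^{\top}
\quad\Longrightarrow\quad
\operatorname{rank}(\Delta') \le r,
$$

so the projected edit retains at most $d_1 \cdot r$ of the original $d_1 \cdot d_0$ degrees of freedom; when $r$ is small relative to $d_0$, updates for $K_1$ are correspondingly constrained.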

Thank you again for your sharp critique — it genuinely pushes the work forward.