PaperHub
Rating: 6.6/10 · Poster · 5 reviewers
Scores: 6, 6, 8, 8, 5 (min 5, max 8, std 1.2)
Confidence: 3.2 · Correctness: 2.8 · Contribution: 2.8 · Presentation: 2.8
ICLR 2025

Unsupervised Model Tree Heritage Recovery

OpenReview · PDF
Submitted: 2024-09-22 · Updated: 2025-02-24
TL;DR

We propose an unsupervised method to recover the hereditary structure of model populations

Abstract

Keywords
Weight Space Learning · Deep Weight Space · Model Tree · Neural Phylogeny · Phylogeny · Finetuning

Reviews & Discussion

Review
Rating: 6

In this paper, the authors introduce the task of Unsupervised Model Tree Heritage Recovery (Unsupervised MoTHer Recovery) for collections of neural networks.

Strengths

The paper is well written and clearly introduces the history of the model tree.

Weaknesses

The paper is good as an introductory paper. However, it seems to lack novelty in the methodology. The dense matrix construction (6) is not new.

Questions

  1. You seem to use an existing graph algorithm, and the model is also not new. Are there any novel points in the graph algorithm parts?
  2. Could you clarify the clustering method using (1) and (2)? Do you have any guarantees for this approach? Why use (1) and (2) rather than other criteria? In addition, is the clustering method reliable for this task? Could other alternatives be used?
Comment

We thank the reviewer for their review. Below, we provide detailed responses to the reviewer’s concerns.


The paper is good as an introductory paper. However, it seems to lack novelty in the methodology. The dense matrix construction (6) is not new. […] existing graph algorithm and the model is also not new. Are there any novel points in the graph algorithm parts?

We respectfully push back on this. The primary contributions of our paper are:

  1. The novel task of recovering the heritage tree of neural network models.
  2. The novel directional weight score, connecting kurtosis to weight directionality.
  3. Formulating our novel task as a minimum directional spanning tree search, incorporating a novel cost function that allows efficient optimization using an established discrete algorithm.

Our objective (6) is new and specifically tailored to our task, as it combines the weight distance with our directional score. To solve our objective, we use an existing minimum directed spanning tree solver, which proved adequate for our needs, so we did not need to develop a new one.
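
For illustration, a minimal sketch of this formulation — not the exact implementation; `D`, `k`, and `lam` are placeholder names for the pairwise weight-distance matrix, the per-model directional weight scores, and a trade-off weight — using the Edmonds arborescence solver shipped with networkx:

```python
# Illustrative sketch, not the paper's code: solve a minimum directed spanning
# tree over edge costs that combine weight distance with a directional penalty.
import itertools

import networkx as nx
import numpy as np

def recover_tree(D: np.ndarray, k: np.ndarray, lam: float = 1.0) -> nx.DiGraph:
    """D[i, j]: weight distance between models i and j; k[i]: directional
    weight score of model i (assumed to decrease from parent to child)."""
    G = nx.DiGraph()
    for i, j in itertools.permutations(range(len(k)), 2):
        # Edges pointing "uphill" in the directional score are penalized,
        # since a fine-tuned child should not score higher than its parent.
        penalty = max(0.0, float(k[j] - k[i]))
        G.add_edge(i, j, weight=float(D[i, j]) + lam * penalty)
    return nx.minimum_spanning_arborescence(G)  # Edmonds' algorithm
```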


Could you clarify the clustering method using (1) and (2)? Do you have any guarantees for this approach? Why use (1) and (2) rather than other criteria? In addition, is the clustering method reliable for this task? Could other alternatives be used?

We used standard Euclidean clustering on the model weights and validated its effectiveness in Sec. 6. We did not use (2) for clustering. Other clustering methods would probably also work, given the same distance matrix.
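
A minimal sketch of this step, assuming flattened weights of a single layer per model (`state_dicts`, `layer_name`, and `n_trees` are illustrative names):

```python
# Illustrative sketch: Euclidean clustering of models by their flattened weights.
import numpy as np
from sklearn.cluster import KMeans

def cluster_models(state_dicts, layer_name, n_trees):
    # One row per model; a single layer's weights can suffice.
    X = np.stack([sd[layer_name].numpy().ravel() for sd in state_dicts])
    # KMeans uses Euclidean distance; other clustering methods consuming
    # the same distances would likely behave similarly.
    return KMeans(n_clusters=n_trees, n_init=10).fit_predict(X)
```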


We believe our response addresses all the reviewer's concerns. If the reviewer has further questions or comments, we would be happy to address them during the discussion period. If we have successfully addressed the concerns, we kindly request the reviewer to consider increasing their rating.

Comment

We sincerely thank you again for the time and effort you dedicated to reviewing our work, and we greatly appreciate your decision to revise the score following our rebuttal.

Since no specific outstanding concerns were mentioned, we would be happy to discuss any remaining questions or issues if there are any. Please feel free to let us know, and we will do our best to address them.

Thank you again for your time and for reconsidering your initial rating,

The Authors

Comment

Thank you for updating the results and clarifying the problem. I will update my score accordingly.

Comment

Thank you for considering our rebuttal and adjusting your score. We truly appreciate your thoughtful feedback and recognition of our work.

Review
Rating: 6

This paper introduces MoTHer Recovery, a method to automatically trace relationships between shared neural network models by analyzing their weights. The approach uses weight distances and distributions to determine which models were derived from others, creating a tree-like structure of model relationships without requiring training data or documentation. The authors validate their method through experiments and provide a dataset for future research in model heritage recovery.

Strengths

  • The paper is well written, and the presentation is clear

  • The paper introduces a novel and timely problem formulation (model heritage recovery) that hasn't been systematically addressed before

  • It develops an unsupervised approach that doesn't require access to training data and leverages inherent neural network weights to infer relationships

  • It provides empirical validation across different fine-tuning scenarios and demonstrates effectiveness on the Stable Diffusion model family

Weaknesses

  • The paper doesn't address how to handle models with mixed heritage (e.g., models trained on merged weights from multiple parents) or partial weight sharing

  • The clustering approach might not scale well to web-scale model repositories - needs more analysis of computational requirements

  • It would be interesting to understand how different learning rates or optimization strategies during fine-tuning affect the reliability of weight-based relationships. For example, will aggressive optimization or pruning obscure these signals?

Questions

  • What is the computational complexity of applying this method to large model repositories? Could you provide runtime analysis for different scales (e.g., 100, 1000, 10000 models)?

  • How does the method handle cases where models have been fine-tuned with different learning rates or optimization strategies? Is there a threshold where the relationship becomes undetectable?

  • For models with mixed heritage (e.g., merged weights from multiple parents), how does the method determine the primary relationship? Can it detect multiple parent relationships?

Comment

We thank the reviewer for acknowledging our "novel and timely problem formulation." Below, we address the reviewer’s concerns in detail.


How does the method handle cases where models [...] fine-tuned with different learning rates [...]? Is there a threshold where the relationship becomes undetectable?

We appreciate the reviewer's suggestion. To investigate this, we fine-tuned a set of ViT models under varying learning rates and training steps. Specifically, we fine-tuned 5 models for each learning rate in the set [1e-2, 5e-3, 1e-3, 5e-4, 1e-4, 5e-5, 1e-5, 5e-6, 1e-6]. Each model was fine-tuned on CIFAR-100 with a unique seed for 10 epochs.

Our results show that for all models that successfully converged, the Directional Weight Score was monotonic. For the two non-convergent learning rates (5e-3 and 1e-2, which achieved an accuracy below 20%), the Directional Weight Score no longer monotonically decreases. Notably, we observed that at some point during training (which varies across learning rates), the Directional Weight Score becomes noisy. Upon inspection, this corresponds to the model's validation loss plateauing and becoming noisy, indicating convergence.

Below, we provide the results (averaged across 5 models per learning rate). However, it may be difficult to observe the trend from the numbers alone. We encourage the reviewer to refer to the corresponding graphs added to App. D.3, Fig. 16 (changes highlighted in yellow), which illustrate our findings more clearly. Each column in the table indicates the number of steps, and each row demonstrates monotonicity for a given learning rate.

| lr | 0 | 665 | 1330 | 1995 | 2660 | 3325 | 3990 | 4655 | 5320 | 5985 | 6650 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1e-06 | 121.925 | 121.917 | 121.904 | 121.889 | 121.875 | 121.863 | 121.853 | 121.845 | 121.839 | 121.836 | 121.836 |
| 5e-06 | 121.925 | 121.841 | 121.743 | 121.689 | 121.661 | 121.646 | 121.636 | 121.631 | 121.629 | 121.628 | 121.627 |
| 1e-05 | 121.925 | 121.726 | 121.644 | 121.622 | 121.614 | 121.611 | 121.613 | 121.612 | 121.611 | 121.612 | 121.612 |
| 5e-05 | 121.925 | 121.501 | 121.436 | 121.41 | 121.404 | 121.398 | 121.394 | 121.399 | 121.4 | 121.406 | 121.408 |
| 0.0001 | 121.925 | 121.288 | 121.194 | 121.081 | 121.053 | 121.031 | 121.016 | 121.024 | 121.038 | 121.045 | 121.047 |
| 0.0005 | 121.925 | 117.73 | 115.578 | 114.376 | 113.449 | 112.774 | 112.471 | 112.37 | 112.287 | 112.3 | 112.298 |
| 0.001 | 121.925 | 118.276 | 112.029 | 105.8 | 99.721 | 94.108 | 89.004 | 84.888 | 82.106 | 80.571 | 80.134 |
| 0.005 | 121.925 | 204.094 | 268.338 | 303.131 | 319.688 | 325.368 | 323.988 | 323.137 | 318.517 | 315.17 | 312.983 |
| 0.01 | 121.925 | 183.406 | 292.637 | 369.073 | 404.851 | 433.96 | 425.491 | 414.763 | 408.862 | 398.975 | 395.623 |
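
To make the monotonicity criterion concrete, a minimal check over one row of the table (the tolerance `tol` is an illustrative knob, not a value from the paper):

```python
import numpy as np

def is_monotone_decreasing(scores, tol=1e-3):
    # Allow tiny increases up to tol to absorb the noise near convergence.
    return bool(np.all(np.diff(np.asarray(scores)) <= tol))

# lr = 5e-06 row from the table above: a converged run decreases monotonically.
row = [121.925, 121.841, 121.743, 121.689, 121.661, 121.646,
       121.636, 121.631, 121.629, 121.628, 121.627]
assert is_monotone_decreasing(row)
```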

We've also conducted experiments on quantization and pruning of the weights.

For quantization, we fine-tuned a new ViT Model Graph similar to the FT graph in the paper (5 Model Trees, each containing 21 models). We then quantized each model with 50% probability, yielding Model Graphs in which roughly half of the models are quantized. This experiment was repeated 10 times to generate diverse quantized Model Graphs.

We tested 2 quantization methods: i) Simple quantization to fp16, ii) Int8 quantization using bitsandbytes.
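
For reference, a hedged sketch of such quantization round trips on a state dict; the fp16 variant mirrors the simple cast, while the int8 function below is a generic symmetric per-tensor scheme shown only for illustration (the experiment itself used bitsandbytes):

```python
import torch

def quantize_fp16(state_dict):
    # Cast float tensors to fp16 and back, as in the "simple" variant.
    return {k: v.half().float() if v.is_floating_point() else v
            for k, v in state_dict.items()}

def quantize_int8(state_dict):
    # Generic symmetric per-tensor int8 round trip (illustrative stand-in,
    # not the bitsandbytes implementation).
    out = {}
    for k, v in state_dict.items():
        if not v.is_floating_point():
            out[k] = v
            continue
        scale = v.abs().max().clamp_min(1e-12) / 127.0
        out[k] = torch.clamp((v / scale).round(), -128, 127) * scale
    return out
```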

The results show that our method is robust to quantization, as summarized below:

| Quantization | ImageNet | ImageNet-21k | MAE | DINO | MSN | Model Graph |
|---|---|---|---|---|---|---|
| Original | 0.9 | 0.9 | 0.8 | 0.85 | 1 | 0.89 |
| fp16 | 0.9 ± 0 | 0.9 ± 0 | 0.765 ± 0.022 | 0.85 ± 0 | 1 ± 0 | 0.883 ± 0.004 |
| Int8 | 0.9 ± 0 | 0.895 ± 0.015 | 0.77 ± 0.024 | 0.85 ± 0 | 1 ± 0 | 0.883 ± 0.007 |

For pruning, we used the same (non-quantized) Model Graph as above, incrementally pruned weights from the models using the l1_unstructured function in torch.nn.utils.prune, and evaluated our method on the pruned Model Graphs.
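
A minimal sketch of this pruning setup with the torch API named above (`model` is a hypothetical ViT as a torch.nn.Module):

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

def prune_model(model: nn.Module, amount: float) -> nn.Module:
    for module in model.modules():
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            # Zero out the smallest-magnitude weights by L1 norm.
            prune.l1_unstructured(module, name="weight", amount=amount)
            prune.remove(module, "weight")  # bake the mask into the weights
    return model
```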

The results show that our method is robust to significant pruning. For example, with 90% pruning, the accuracy decreases by only 4%, and even at 95% pruning, it drops by just 9%. Remarkably, when 99% of weights are pruned, our method still achieves 68% accuracy (random baseline is roughly 5%).

| Pruning % | ImageNet | ImageNet-21k | MAE | DINO | MSN | Model Graph | # Pruned Params | # Non-pruned Params |
|---|---|---|---|---|---|---|---|---|
| 0% (Original) | 0.9 | 0.9 | 0.8 | 0.85 | 1 | 0.89 | 0 | 85,524,480 |
| 10% | 0.9 | 0.9 | 0.8 | 0.85 | 1 | 0.89 | 8,552,438 | 76,972,042 |
| 30% | 0.9 | 0.9 | 0.8 | 0.8 | 1 | 0.88 | 25,657,339 | 59,867,141 |
| 50% | 0.9 | 0.85 | 0.75 | 0.8 | 1 | 0.86 | 42,762,240 | 42,762,240 |
| 70% | 0.9 | 0.85 | 0.8 | 0.8 | 1 | 0.87 | 59,867,141 | 25,657,339 |
| 90% | 0.9 | 0.85 | 0.8 | 0.7 | 1 | 0.85 | 76,972,042 | 8,552,438 |
| 91% | 0.95 | 0.85 | 0.8 | 0.6 | 1 | 0.84 | 77,827,276 | 7,697,204 |
| 92% | 0.95 | 0.85 | 0.8 | 0.6 | 1 | 0.84 | 78,682,510 | 6,841,970 |
| 93% | 0.95 | 0.95 | 0.8 | 0.5 | 1 | 0.84 | 79,537,744 | 5,986,736 |
| 94% | 0.95 | 0.95 | 0.8 | 0.5 | 1 | 0.84 | 80,393,027 | 5,131,453 |
| 95% | 0.9 | 0.95 | 0.8 | 0.45 | 0.9 | 0.8 | 81,248,261 | 4,276,219 |
| 96% | 0.9 | 0.9 | 0.8 | 0.4 | 0.9 | 0.78 | 82,103,495 | 3,420,985 |
| 97% | 0.9 | 0.85 | 0.8 | 0.4 | 0.9 | 0.77 | 82,958,729 | 2,565,751 |
| 98% | 0.9 | 0.9 | 0.6 | 0.4 | 0.85 | 0.73 | 83,814,012 | 1,710,468 |
| 99% | 0.8 | 0.9 | 0.45 | 0.45 | 0.8 | 0.68 | 84,669,246 | 855,234 |

These results have been added to Sec. 6.4 as an ablation study and elaborated upon in App. D.1 and D.2 (changes highlighted in yellow).

Comment

For models with mixed heritage (e.g., merged weights from multiple parents), how does the method determine the primary relationship? Can it detect multiple parent relationships?

Handling models with mixed heritage is beyond the primary scope of our paper. However, inspired by the reviewer’s suggestion, we conducted a preliminary experiment to explore this scenario.

We started with the ImageNet, MAE, and DINO pre-trained base models and merged each pair of models using standard uniform weight averaging, as described in Model Soups [1]. Subsequently, we fine-tuned 5 models from each of the original and merged models, yielding a total of 30 fine-tuned models.

We evaluate the ability of our method to handle merged models in 2 settings:

  1. Clustering: We first clustered the 30 fine-tuned models into 6 clusters, which resulted in perfect clustering accuracy.
  2. Parent Detection: To determine the parents of each merged model, we calculated the distance between each merged model and the centers of its potential parent clusters. Specifically, we computed the mean weights across all 5 fine-tuned models for each parent group (ImageNet, MAE, and DINO). For each merged fine-tuned model, we calculated the cosine similarity to each parent cluster's center.

The table below summarizes the averaged cosine similarity between the merged models (rows) and the pre-trained clusters (columns). As shown, the cosine similarity to the true parents is significantly higher than to the unrelated pre-trained model, which enabled us to correctly identify both parents of all merged models with perfect accuracy. The chosen parent models according to the cosine similarity are in bold.

| Merged model | ImageNet | DINO | MAE |
|---|---|---|---|
| ImageNet + DINO | **0.976** | **0.192** | -0.001 |
| ImageNet + MAE | **0.847** | -0.0002 | **0.524** |
| DINO + MAE | -0.001 | **0.303** | **0.940** |
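
A minimal sketch of the parent-detection step described above, assuming flattened weight vectors (`parent_groups`, mapping a parent name to its fine-tuned models' weight vectors, is an illustrative structure):

```python
import torch
import torch.nn.functional as F

def detect_parents(merged_w, parent_groups, top_k=2):
    # Cosine similarity between a merged model and each parent cluster center.
    sims = {name: F.cosine_similarity(merged_w,
                                      torch.stack(ws).mean(dim=0), dim=0).item()
            for name, ws in parent_groups.items()}
    # Declare the top-k most similar clusters as the parents.
    return sorted(sims, key=sims.get, reverse=True)[:top_k]
```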

These results have been added to Sec. 7 as a discussion point and elaborated upon in App. E (changes highlighted in yellow).

[1] Wortsman, Mitchell et al. "Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time", PMLR 2022


What is the computational complexity of applying this method to large model repositories? Could you provide runtime analysis for different scales (e.g., 100, 1000, 10000 models)?

Based on the reviewer’s suggestion, we report the runtime of the clustering phase. Specifically, we simulated ViT Model Graphs of varying sizes and observed the scalability of the method. Notably, our approach allows for clustering based on the weights of a single model layer without sacrificing performance, which provides significant speedups. As can be seen, our method scales even to larger Model Graphs.

| | 10 Samples | 100 Samples | 1k Samples | 10k Samples |
|---|---|---|---|---|
| Pairwise Distances (CPU) | 0.033 seconds | 0.504 seconds | 5.468 seconds | 5.01 minutes |
| Pairwise Distances (GPU) | 0.1 seconds | 0.697 seconds | 0.005 seconds | 34.742 seconds |
| Clustering (CPU) | 0.011 seconds | 0.001 seconds | 0.134 seconds | 4.43 minutes |
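
For reference, the pairwise-distance step being timed reduces to a single distance-matrix computation over one flattened layer per model; a minimal sketch (`W` is an illustrative (n_models, layer_dim) tensor):

```python
import torch

def pairwise_distances(W: torch.Tensor) -> torch.Tensor:
    # (n_models, n_models) Euclidean distance matrix; move W to GPU
    # (e.g., W.cuda()) to reproduce the GPU row above.
    return torch.cdist(W, W, p=2)
```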

The runtime of running the minimum directed spanning tree search is negligible compared to the pairwise distance calculation and clustering.

These results have been added to App. F (changes highlighted in yellow).

Comment

We thank you again for the efforts put into reviewing our work. Like you, we believe that our paper is timely and therefore worth studying.

We wanted to kindly follow up to ask whether you have had the opportunity to review our response from November 19.

If there are any remaining concerns or questions, we would be happy to discuss them further and do our best to address them. If our responses have satisfactorily addressed your concerns, we would greatly appreciate your reconsideration of the score.

Thank you,

The authors

Comment

Thanks for the detailed response, which addresses my comments and concerns. I would like to keep my original score.

Comment

Thank you for engaging with our rebuttal and for confirming that our response addressed your comments and concerns. We truly appreciate your thoughtful review and your contributions to the discussion.

If any additional questions or thoughts arise, we would be happy to continue the discussion further.

Thank you again for your time and consideration,

The Authors

Review
Rating: 8

This paper investigates the problem of inferring the relationships between models from their weights.

Strengths

Please see the "Questions" section.

Weaknesses

Please see the "Questions" section.

Questions

My review is as follows:

  • I think this paper is well-written and investigates an interesting problem.

  • The introduction mentions legal disputes over model authorship. Out of curiosity, are there any known examples of this kind of dispute?

  • Could you please elaborate on this point? "Moreover, it can help identify models that resulted from the wrongful use of proprietary training data." It is not clear to me how the proposed method for determining model relationships could help with wrongful use of data.

  • Could the method successfully find the relationship between a quantized version of a model and the full precision model? Were there quantized models in the dataset?

  • The observation that the Directional Weight Score is monotonic with respect to the training steps is interesting but perhaps not concrete enough. I would expect this to strongly depend on the specific learning rate, number of training steps, and perhaps other hyperparameters used in training. In my opinion, identifying when this observation tends to hold and when it does not would be important in order to solidify the findings of this paper.

  • Some follow-up questions on the monotonicity observation: Does this observation generalize across many different model types? On what kind of models has it been verified so far?

Comment

Could the method successfully find the relationship between a quantized version of a model and the full precision model? Were there quantized models in the dataset?

The original dataset did not include quantized models. Based on the reviewer’s suggestion, we conducted additional experiments to evaluate the method on quantized models.

We fine-tuned a new ViT Model Graph similar to the FT graph in the paper (5 Model Trees, each containing 21 models). We then quantized each model with 50% probability, yielding Model Graphs in which roughly half of the models are quantized. This experiment was repeated 10 times to generate diverse quantized Model Graphs.

We tested 2 quantization methods: i) Simple quantization to fp16, ii) Int8 quantization using bitsandbytes.

The results show that our method is robust to quantization, as summarized below:

| Quantization Method | ImageNet | ImageNet-21k | MAE | DINO | MSN | Model Graph |
|---|---|---|---|---|---|---|
| Original | 0.9 | 0.9 | 0.8 | 0.85 | 1 | 0.89 |
| fp16 | 0.9 ± 0 | 0.9 ± 0 | 0.765 ± 0.022 | 0.85 ± 0 | 1 ± 0 | 0.883 ± 0.0044 |
| Int8 | 0.9 ± 0 | 0.895 ± 0.015 | 0.77 ± 0.024 | 0.85 ± 0 | 1 ± 0 | 0.883 ± 0.0078 |

We've also conducted a similar experiment (requested by other reviewers) on pruned models. Specifically, we fine-tuned a new ViT Model Graph with a structure similar to the FT graph in the paper (5 Model Trees, each containing 21 models). We used this Model Graph, incrementally pruned weights from the models using the l1_unstructured function in torch.nn.utils.prune, and evaluated our method on the pruned Model Graphs.

The results show that our method is robust to significant pruning. For example, with 90% pruning, the accuracy decreases by only 4%, and even at 95% pruning, it drops by just 9%. Remarkably, when 99% of weights are pruned, our method still achieves 68% accuracy (random baseline is roughly 5%).

| Pruning % | ImageNet | ImageNet-21k | MAE | DINO | MSN | Model Graph | # Pruned Params | # Non-pruned Params |
|---|---|---|---|---|---|---|---|---|
| 0% (Original) | 0.9 | 0.9 | 0.8 | 0.85 | 1 | 0.89 | 0 | 85,524,480 |
| 10% | 0.9 | 0.9 | 0.8 | 0.85 | 1 | 0.89 | 8,552,438 | 76,972,042 |
| 30% | 0.9 | 0.9 | 0.8 | 0.8 | 1 | 0.88 | 25,657,339 | 59,867,141 |
| 50% | 0.9 | 0.85 | 0.75 | 0.8 | 1 | 0.86 | 42,762,240 | 42,762,240 |
| 70% | 0.9 | 0.85 | 0.8 | 0.8 | 1 | 0.87 | 59,867,141 | 25,657,339 |
| 90% | 0.9 | 0.85 | 0.8 | 0.7 | 1 | 0.85 | 76,972,042 | 8,552,438 |
| 91% | 0.95 | 0.85 | 0.8 | 0.6 | 1 | 0.84 | 77,827,276 | 7,697,204 |
| 92% | 0.95 | 0.85 | 0.8 | 0.6 | 1 | 0.84 | 78,682,510 | 6,841,970 |
| 93% | 0.95 | 0.95 | 0.8 | 0.5 | 1 | 0.84 | 79,537,744 | 5,986,736 |
| 94% | 0.95 | 0.95 | 0.8 | 0.5 | 1 | 0.84 | 80,393,027 | 5,131,453 |
| 95% | 0.9 | 0.95 | 0.8 | 0.45 | 0.9 | 0.8 | 81,248,261 | 4,276,219 |
| 96% | 0.9 | 0.9 | 0.8 | 0.4 | 0.9 | 0.78 | 82,103,495 | 3,420,985 |
| 97% | 0.9 | 0.85 | 0.8 | 0.4 | 0.9 | 0.77 | 82,958,729 | 2,565,751 |
| 98% | 0.9 | 0.9 | 0.6 | 0.4 | 0.85 | 0.73 | 83,814,012 | 1,710,468 |
| 99% | 0.8 | 0.9 | 0.45 | 0.45 | 0.8 | 0.68 | 84,669,246 | 855,234 |

These results have been added to Sec. 6.4 as an ablation study and elaborated upon in App. D.1 and D.2 (changes highlighted in yellow).

Comment

We thank the reviewer for highlighting the strengths of our paper. Below, we provide detailed responses to the reviewer’s concerns.


The observation that the Directional Weight Score is monotonic [...] is interesting but perhaps not concrete enough. I would expect this to strongly depend on the specific learning rate, number of training steps [...]. In my opinion, identifying when this observation tends to hold [...] important in order to solidify the findings of this paper.

We appreciate the reviewer's suggestion. To investigate this, we fine-tuned a set of ViT models under varying learning rates and training steps. Specifically, we fine-tuned 5 models for each learning rate in the set [1e-2, 5e-3, 1e-3, 5e-4, 1e-4, 5e-5, 1e-5, 5e-6, 1e-6]. Each model was fine-tuned on CIFAR-100 with a unique seed for 10 epochs.

Our results show that for all models that successfully converged, the Directional Weight Score was monotonic. For the two non-convergent learning rates (5e-3 and 1e-2, which achieved an accuracy below 20%), the Directional Weight Score no longer monotonically decreases. Notably, we observed that at some point during training (which varies across learning rates), the Directional Weight Score becomes noisy. Upon inspection, this corresponds to the model's validation loss plateauing and becoming noisy, indicating convergence.

Below, we provide the results (averaged across 5 models per learning rate). However, it may be difficult to observe the trend from the numbers alone. We encourage the reviewer to refer to the corresponding graphs added to App. D.3, Fig. 16 (changes highlighted in yellow), which illustrate our findings more clearly. Each column in the table indicates the number of steps, and each row demonstrates monotonicity for a given learning rate.

| lr | 0 | 665 | 1330 | 1995 | 2660 | 3325 | 3990 | 4655 | 5320 | 5985 | 6650 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1e-06 | 121.925 | 121.917 | 121.904 | 121.889 | 121.875 | 121.863 | 121.853 | 121.845 | 121.839 | 121.836 | 121.836 |
| 5e-06 | 121.925 | 121.841 | 121.743 | 121.689 | 121.661 | 121.646 | 121.636 | 121.631 | 121.629 | 121.628 | 121.627 |
| 1e-05 | 121.925 | 121.726 | 121.644 | 121.622 | 121.614 | 121.611 | 121.613 | 121.612 | 121.611 | 121.612 | 121.612 |
| 5e-05 | 121.925 | 121.501 | 121.436 | 121.41 | 121.404 | 121.398 | 121.394 | 121.399 | 121.4 | 121.406 | 121.408 |
| 0.0001 | 121.925 | 121.288 | 121.194 | 121.081 | 121.053 | 121.031 | 121.016 | 121.024 | 121.038 | 121.045 | 121.047 |
| 0.0005 | 121.925 | 117.73 | 115.578 | 114.376 | 113.449 | 112.774 | 112.471 | 112.37 | 112.287 | 112.3 | 112.298 |
| 0.001 | 121.925 | 118.276 | 112.029 | 105.8 | 99.721 | 94.108 | 89.004 | 84.888 | 82.106 | 80.571 | 80.134 |
| 0.005 | 121.925 | 204.094 | 268.338 | 303.131 | 319.688 | 325.368 | 323.988 | 323.137 | 318.517 | 315.17 | 312.983 |
| 0.01 | 121.925 | 183.406 | 292.637 | 369.073 | 404.851 | 433.96 | 425.491 | 414.763 | 408.862 | 398.975 | 395.623 |
Comment

Some follow-up questions on the monotonicity observation: Does this observation generalize across many different model types? On what kind of models has it been verified so far?

Yes, in the paper, we provide results for Vision Transformers (ViT), ResNet-50, and Stable Diffusion. We also tested both full fine-tuning and LoRA fine-tuning, demonstrating the robustness of the observation across diverse architectures and methods.


Could you please elaborate on this point? "Moreover, it can help identify models that resulted from the wrongful use of proprietary training data." It is not clear to me how the proposed method for determining model relationships could help with wrongful use of data.

Recovering the Model Tree implicitly reveals that any data used to train an ancestor model is also part of the training data for its descendants. This has significant implications for legal disputes. For instance, if a foundation model was fine-tuned from a dataset that was later deemed to have been used improperly (e.g., lawsuits against Stable Diffusion for training on private images), any model fine-tuned from that foundation model may also violate the original data restrictions. By recovering the Model Tree, we can identify these models and trace their lineage back to the original, improperly used data.


The introduction mentions legal disputes over model authorship. Out of curiosity, are there any known examples of this kind of dispute?

Widespread model sharing is quite a new phenomenon; e.g., the big jump in the number of models hosted on Hugging Face happened only over the past year. While we are not currently aware of specific legal disputes concerning model authorship, the aggressive licensing terms adopted by many model developers suggest that such litigation is likely to happen soon. For instance, the Stable Diffusion 3.5 community license restricts usage to commercial enterprises with revenues under $1 million.

Comment

We thank you again for the efforts put into reviewing our work. Like you, we believe that the task and our monotonicity observation are interesting.

We wanted to kindly follow up to ask whether you have had the opportunity to review our response from November 19.

If there are any remaining concerns or questions, we would be happy to discuss them further and do our best to address them. If our responses have satisfactorily addressed your concerns, we would greatly appreciate your reconsideration of the score.

Thank you,

The authors

Comment

Thanks for the additional experiments on the quantized models and the monotonicity. My view of this paper is positive. I'll update my score.

Comment

Thank you for considering our rebuttal and increasing your score. We truly appreciate your thoughtful feedback and recognition of our work.

Review
Rating: 8

This paper aims to analyze the relations between models, shedding light on which model was fine-tuned from which. This also has applications to copyright issues and, more generally, licensing concerns. The authors introduce a method coined "Model Tree Heritage Recovery", which unravels the "parent-child" relations in a set of models. The method is unsupervised. Numerical examples are provided.

Strengths

  • Shedding light on the relation of models, in particular, in the LLM regime is crucial.
  • The numerics are convincing.

Weaknesses

  • Due to the importance of such a method for legal aspects, some theoretical underpinning should be given, which is currently missing.
  • The running time of the method is not provided.

Questions

see the weaknesses

Comment

We thank the reviewer for acknowledging that the task can be “crucial” and for noting that the “numerics are convincing.” Below, we address the reviewer’s concerns in detail.


Due to the importance of such a method for legal aspects, some theoretical underpinning should be given, which is currently missing.

While the primary focus of this work is empirical, recent theoretical results provide support for the clustering of weights into trees. The literature on linear mode connectivity (LMC) has shown that models trained on the same data but with different random initializations converge to linearly related weights, up to a permutation of the neurons [1]. This implies that such models will be significantly distant from one another in the ℓ2 norm in weight space.

In contrast, [2] demonstrated that models fine-tuned from a shared starting point (e.g., a pre-trained foundation model) experience less neuron permutation and tend to have weights that remain close to the original model. This establishes a clear distinction: root models in our Model Trees (typically foundation models) are far apart in weight space, whereas their fine-tuned descendants remain relatively close to their parent model.

In summary, these theoretical findings predict that intra-tree distances (between models within a tree) will be smaller than inter-tree distances (between models from different trees), thereby justifying the effectiveness of clustering models into trees.

We added this discussion to App. G (changes highlighted in yellow).

[1] Ainsworth, Samuel K., Jonathan Hayase, and Siddhartha Srinivasa. "Git re-basin: Merging models modulo permutation symmetries." ICLR 2023.

[2] Frankle, Jonathan, et al. "Linear mode connectivity and the lottery ticket hypothesis." ICML 2020.


The running time of the method is not provided.

In the original manuscript, we mention that our experiments took seconds to minutes even on a CPU. Furthermore, we state that the running time for recovering the structure from a given distance matrix is O(EV). To provide a more detailed analysis, we have now conducted additional experiments to measure the runtime of the clustering phase. Specifically, we simulated ViT Model Graphs of varying sizes and observed the scalability of the method. Notably, our approach allows for clustering based on the weights of a single model layer without sacrificing performance, which provides significant speedups. As can be seen, our method scales even to larger Model Graphs.

| | 10 Samples | 100 Samples | 1k Samples | 10k Samples |
|---|---|---|---|---|
| Pairwise Distances (CPU) | 0.033 seconds | 0.504 seconds | 5.468 seconds | 5.01 minutes |
| Pairwise Distances (GPU) | 0.1 seconds | 0.697 seconds | 0.005 seconds | 34.742 seconds |
| Clustering (CPU) | 0.011 seconds | 0.001 seconds | 0.134 seconds | 4.43 minutes |

The runtime of running the minimum directed spanning tree search is negligible compared to the pairwise distance calculation and clustering.

These results have been added to App. F (changes highlighted in yellow).

Comment

We sincerely thank you again for the effort you dedicated to reviewing our work. Like you, we believe that shedding light on the relation of models is very important.

We wanted to kindly follow up to inquire if you have had the opportunity to review our response submitted on November 19.

If there are any remaining concerns or questions, we would be happy to discuss them further and address them to the best of our ability. If our responses have satisfactorily addressed your concerns, we would greatly appreciate your reconsideration of the score.

Thank you,

The authors

Comment

Dear authors, thank you for your detailed comments, which address my concerns. I will update my score accordingly.

Comment

Thank you for considering our rebuttal and adjusting your score. We truly appreciate your thoughtful feedback and recognition of our work.

Review
Rating: 5

Motivated by the fact that many models have been publicly released, this paper proposes a new problem: studying the relationships between these models. Specifically, the authors aim to build a tree data structure where directed edges connect a parent model to other models that have been directly fine-tuned from it (its children). For each pair of models, this task requires: (i) determining if they are directly related, and (ii) establishing the direction of the relationship. Assuming that all models within the model tree share the same architecture, the authors propose a method based on the distance between model weights. Experiments demonstrate the performance of the proposed method.

Strengths

Originality: This paper addresses a new problem: estimating the relationship between models and their fine-tuned versions. However, the significance of this problem for open models is debatable; see the weaknesses for detailed comments.

Simple approach: The proposed approach based on the distance of model weights is simple. But it rests on the well-known fact that fine-tuning makes small weight changes.

Writing: The clarity is mixed; some parts are easy to follow, but certain important sections, such as Section 4.2, are hard to understand.

Weaknesses

Limitation 1: The proposed approach can only handle open models, as it relies on model weights. For important open models that have been fine-tuned, information about the pretrained models is often available at the time of release. For models without such information, one can infer relationships based on weight distance. However, it is unclear why this information is needed for all released models.

Limitation 2: The proposed approach constructs the model tree based on the weight distances between each pair of models and is thus limited to the case where all the models within a model tree share the same architecture. It cannot be applied to other models that are obtained through distillation, etc.

Questions

What is μ in eq. (3)? What is the pretraining stage in Figure 2 (I thought it was all about fine-tuning)? Overall, I found Section 4.2 hard to comprehend.

Comment

We thank the reviewer for highlighting the strengths of our paper. Below, we provide detailed responses to the reviewer’s concerns.


Limitation 1: The proposed approach can only handle open models, as it relies on model weights. For important open models that have been fine-tuned, information about the pretrained models is often available at the time of release. For models without such information, one can infer relationships based on weight distance. However, it is unclear why this information is needed for all released models.

We break down this concern into multiple parts and address each one separately:

[...] can only handle open models [...]

While our approach relies on model weights, it is essential to differentiate between potential users of the method. Most users will indeed only have access to open models and can use our method to find fine-tuned versions of the foundation model of interest on model repositories such as Hugging Face. However, the issue of model attribution is particularly relevant in legal disputes. For instance, courts can order companies suspected of misusing a model to apply our method to their private weights, even if those weights remain undisclosed publicly. Consider, for example, the Stable Diffusion 3.5 community license, which restricts usage to commercial enterprises with revenue under $1 million. If Stability AI suspects a company of fine-tuning their model in violation of this license, our method could help trace the model's origins. In such cases, courts could mandate the company to run attribution algorithms to verify compliance while the weights remain private.

In summary, while the task relies on model weights, it has applications beyond open models, particularly in legal scenarios.

[...] information about the pretrained models is often available at the time of release [...]

Unfortunately, this is frequently not the case. As discussed in lines 40–45 and Appendix A, we analyzed over 800k model cards from Hugging Face and found that over 60% lack this information. This gap underscores the practical importance of our method.

[...] For models without such information, one can infer relationships based on weight distance [...]

As the reviewer noted in their summary, our task is much more complex than merely using weight distance. Our approach not only determines whether an edge exists between two models but also infers its direction. In addition to defining this task, our paper introduces a key insight: kurtosis plays a key role in determining edge directionality. We use this insight to develop our final method, MoTHer.

Comment

Limitation 2: The proposed approach [...] is thus limited to the case where all the models within a model tree share the same architecture. It cannot be applied to other models that are obtained through pruning, distillation, etc.

The claimed limitation is not entirely accurate. While it is true that our method cannot handle distilled models (as they do not retain the original weights), it can handle pruned models.

Based on the reviewer’s suggestion, we conducted an additional experiment on pruned models. Specifically, we fine-tuned a new ViT model graph with a structure similar to the FT graph in the paper (5 Model Trees, each containing 21 models). We used this Model Graph and incrementally pruned weights from the models using the l1_unstructured function in torch.nn.utils.prune and evaluated our method on the pruned Model Graphs.

The results show that our method is robust to significant pruning. For example, with 90% pruning, the accuracy decreases by only 4%, and even at 95% pruning, it drops by just 9%. Remarkably, when 99% of weights are pruned, our method still achieves 68% accuracy (random baseline is roughly 5%).

| Pruning % | ImageNet | ImageNet-21k | MAE | DINO | MSN | Model Graph | # Pruned Params | # Non-pruned Params |
|---|---|---|---|---|---|---|---|---|
| 0% (Original) | 0.9 | 0.9 | 0.8 | 0.85 | 1 | 0.89 | 0 | 85,524,480 |
| 10% | 0.9 | 0.9 | 0.8 | 0.85 | 1 | 0.89 | 8,552,438 | 76,972,042 |
| 30% | 0.9 | 0.9 | 0.8 | 0.8 | 1 | 0.88 | 25,657,339 | 59,867,141 |
| 50% | 0.9 | 0.85 | 0.75 | 0.8 | 1 | 0.86 | 42,762,240 | 42,762,240 |
| 70% | 0.9 | 0.85 | 0.8 | 0.8 | 1 | 0.87 | 59,867,141 | 25,657,339 |
| 90% | 0.9 | 0.85 | 0.8 | 0.7 | 1 | 0.85 | 76,972,042 | 8,552,438 |
| 91% | 0.95 | 0.85 | 0.8 | 0.6 | 1 | 0.84 | 77,827,276 | 7,697,204 |
| 92% | 0.95 | 0.85 | 0.8 | 0.6 | 1 | 0.84 | 78,682,510 | 6,841,970 |
| 93% | 0.95 | 0.95 | 0.8 | 0.5 | 1 | 0.84 | 79,537,744 | 5,986,736 |
| 94% | 0.95 | 0.95 | 0.8 | 0.5 | 1 | 0.84 | 80,393,027 | 5,131,453 |
| 95% | 0.9 | 0.95 | 0.8 | 0.45 | 0.9 | 0.8 | 81,248,261 | 4,276,219 |
| 96% | 0.9 | 0.9 | 0.8 | 0.4 | 0.9 | 0.78 | 82,103,495 | 3,420,985 |
| 97% | 0.9 | 0.85 | 0.8 | 0.4 | 0.9 | 0.77 | 82,958,729 | 2,565,751 |
| 98% | 0.9 | 0.9 | 0.6 | 0.4 | 0.85 | 0.73 | 83,814,012 | 1,710,468 |
| 99% | 0.8 | 0.9 | 0.45 | 0.45 | 0.8 | 0.68 | 84,669,246 | 855,234 |

These results have been added to Sec. 6.4 as an ablation study and elaborated upon in App. D.1 (changes highlighted in yellow).


What is μ in eq. (3)?

Eq. (3) defines the directional weight score for computing the direction of an edge between two models. This score is based on kurtosis (the fourth moment). In this equation, μ represents the mean of the weights of layer l. We have clarified this in the updated manuscript (changes highlighted in yellow).
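
For clarity, a minimal sketch of the per-layer quantity behind this score, with μ made explicit (how Eq. (3) aggregates it across layers follows the paper and is not reproduced here):

```python
import torch

def layer_kurtosis(w: torch.Tensor) -> torch.Tensor:
    # Standardized fourth moment of a layer's weights; mu is the layer mean.
    w = w.flatten()
    mu = w.mean()
    centered = w - mu
    return (centered ** 4).mean() / (centered ** 2).mean() ** 2
```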


What is the pretraining stage in Figure 2 (I thought it is all about fine-tuning)? Overall, I found section 4.2 is hard to comprehend.

We appreciate this feedback and have updated the manuscript. Please let us know if further refinements are needed. Fig. 2 illustrates trends in the directional weight score during different training phases. While our paper focuses on fine-tuning, the figure highlights how the score increases during pretraining and decreases during fine-tuning.


We believe our response addresses all the reviewer's concerns. If the reviewer has further questions or comments, we would be happy to address them during the discussion period. If we have successfully addressed the concerns, we kindly request the reviewer to consider increasing their rating.

Comment

We thank you again for the time and effort you dedicated to reviewing our work. We wanted to kindly follow up to ask whether you have had the opportunity to review our response from November 19.

If there are any remaining concerns or questions, we would be happy to discuss them further and do our best to address them. If our responses have satisfactorily addressed your concerns, we would greatly appreciate your reconsideration of the score.

Thanks,

The Authors

Comment

I thank the authors for their efforts in addressing my comments. The additional results on pruning will strengthen the paper, although pruning does not fundamentally change the problem, as one can compare the subset of the weights. I have adjusted my score but am still unsure about the problem, which I will discuss further with other reviewers and AC.

Comment

Thank you for considering our rebuttal and increasing your score, we truly appreciate your feedback.

If any additional questions or thoughts arise, we would be happy to continue the discussion further.

Thank you again for your time and consideration,

The Authors

AC Meta-Review

The paper proposes the task of Unsupervised Model Tree Heritage Recovery to study the relationship between different models. The idea is to build a tree where children are obtained by fine-tuning the parent, and then use weight distances and distributions to determine which models were derived from others, without requiring training data or documentation. The performance of the proposed method is assessed via numerical simulations.

The reviewers appreciated the originality of the idea, the simplicity of the approach and the convincing experiments. The main weaknesses are related to the motivation (the authors mention legal disputes but their justification during the rebuttal is not fully convincing, as they could not provide a concrete example) and to the restriction to cases where model weights are publicly available. Upon weighing strengths and limitations, I am inclined to accept the paper given the novelty of the approach that could be of interest to the ICLR community.

Additional Comments from the Reviewer Discussion

A few issues were raised in the reviews and most of these have been addressed -- one notable exception being the concerns of reviewer LoWV about motivation (that cannot really be addressed in the short period of the discussion).

Final Decision

Accept (Poster)