PaperHub
7.2/10
Poster · 4 reviewers
Ratings: 5, 3, 3, 4 (min 3, max 5, std 0.8)
ICML 2025

Balancing Model Efficiency and Performance: Adaptive Pruner for Long-tailed Data

Submitted: 2025-01-24 · Updated: 2025-07-24

Abstract

Keywords
Neural network pruning, Long-tail learning

Reviews and Discussion

Review
Rating: 5

This paper introduces Long-Tailed Adaptive Pruner (LTAP), a novel pruning strategy designed to enhance neural network efficiency while preserving performance on long-tailed datasets. LTAP addresses this challenge by incorporating multi-dimensional importance scoring and a dynamic weight adjustment mechanism, ensuring that essential parameters for tail classes are retained. The method employs progressive multi-stage pruning, gradually removing redundant parameters. Extensive experiments on multiple benchmark datasets demonstrate LTAP's effectiveness.
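For context, below is a minimal, self-contained PyTorch sketch of what such progressive multi-stage pruning looks like in practice. The toy model, the five-stage geometric schedule, and the plain L1-magnitude criterion are illustrative assumptions; LTAP itself replaces the magnitude criterion with its class-aware multi-criteria score and dynamic weight adjustment.

```python
# Illustrative only: iterative (multi-stage) pruning with fine-tuning between
# stages, so the keep-ratio shrinks gradually rather than in a single shot.
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

stages, final_keep = 5, 0.3                      # e.g. keep ~30% of weights overall
per_stage = 1.0 - final_keep ** (1.0 / stages)   # fraction of *remaining* weights pruned per stage

for stage in range(stages):
    for module in model.modules():
        if isinstance(module, nn.Linear):
            # LTAP would use its class-aware multi-criteria score here;
            # plain L1 magnitude is a stand-in for illustration.
            prune.l1_unstructured(module, name="weight", amount=per_stage)
    # ... fine-tune for a few epochs, evaluate per-class accuracy, and update
    # the pruning-criterion weights before the next stage ...

kept = sum(int(m.weight_mask.sum()) for m in model.modules() if isinstance(m, nn.Linear))
total = sum(m.weight_mask.numel() for m in model.modules() if isinstance(m, nn.Linear))
print(f"retained {kept / total:.1%} of prunable weights")
```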

Questions for Authors

Please see above.

Claims and Evidence

Yes.

Methods and Evaluation Criteria

  • In Section 2.3, the authors state that the updating rule strengthens the criteria that lead to improved class performance. However, in Eq. (6), it appears that if the accuracy of class c improves, the weights for all criteria of class c are increased by the same value β. This suggests that the method does not actually select the most effective criteria for class c. More explanation is needed to clarify this point.

  • The CIFAR-100-LT dataset with an imbalance ratio of 10 is widely used. Including it would strengthen the evaluation, though it is not strictly necessary.

Theoretical Claims

Yes.

Experimental Design and Analysis

The authors do not present the final criteria weight matrix D in the experiments. It would be helpful to include it to validate the effectiveness of the weight adjustment mechanism.

Supplementary Material

Yes.

Relation to Existing Literature

The proposed method contributes to the broader scientific literature by tackling the challenges of model pruning in the presence of long-tailed data distributions.

Missing Essential References

Several relevant works on long-tailed learning have not been cited or discussed, including:

  • Parametric Contrastive Learning, ICCV 2021
  • Label-Imbalanced and Group-Sensitive Classification under Overparameterization, NeurIPS 2021
  • Self-Supervised Aggregation of Diverse Experts for Test-Agnostic Long-Tailed Recognition, NeurIPS 2022
  • Long-Tailed Recognition via Weight Balancing, CVPR 2022
  • A Unified Generalization Analysis of Re-Weighting and Logit-Adjustment for Imbalanced Learning, NeurIPS 2023
  • Balanced Product of Calibrated Experts for Long-Tailed Recognition, CVPR 2023
  • Harnessing Hierarchical Label Distribution Variations in Test Agnostic Long-tail Recognition, ICML 2024

Other Strengths and Weaknesses

Strengths

  • Addressing model pruning in the presence of long-tailed data distributions is both meaningful and highly relevant to real-world scenarios.

  • The proposed method achieves a relatively significant performance improvement.

Weaknesses

  • The novelty of the method is somewhat limited. It primarily focuses on combining different criteria with adaptive weights. However, as mentioned in the Method and Evaluation Criteria part, the weight adjustment mechanism does not seem effective in selecting the most effective criteria. This aspect may require further explanation.
  • Some notations are not properly defined. For example, the definitions of n_g in Eq. (3) and β in Eq. (6) are missing.

Other Comments or Suggestions

  • There are some typos. In the Implementation Details section, there appears to be a formatting issue with the superscripts in the learning rate and weight decay values.

Author Response

Thank you for your recognition of our work. We have carefully considered the weaknesses you raised and have made the following efforts:

  • About broader references

We have incorporated the studies you mentioned into our discussion and references as follows:

Recent advances in long-tailed learning address class imbalance through diverse strategies:

  1. Theoretical & Unified Frameworks: [Wang et al., 2023] proposed a data-dependent contraction technique to unify re-weighting and logit adjustment.
  2. Hierarchical Label Variation: [Yang et al., 2024] introduced DirMixE, leveraging Dirichlet meta-distributions to model global-local test-time variations, with variance regularization for stable generalization.
  3. Ensemble Calibration: [Aimar et al., 2023] formulated BalPoE, a Fisher-consistent ensemble combining logit-adjusted experts calibrated via mixup.
  4. Weight Regularization: [Alshammari et al., 2022] demonstrated that simple weight decay and MaxNorm constraints outperform complex rebalancing methods.
  5. Contrastive Rebalancing: [Cui et al., 2021] proposed PaCo, integrating parametric class centers into contrastive learning to mitigate gradient bias toward head classes.
  6. Margin Adjustment Theory: [Kini et al., 2021] analyzed VS-loss, combining additive/multiplicative logit adjustments to optimize margins in overparameterized regimes.
  7. Test-Agnostic Aggregation: [Zhang et al., 2022] developed SADE, a self-supervised expert aggregation framework for unknown test distributions.

These works collectively advance the field by bridging theory-practice gaps [1,6], enhancing model flexibility [2,7], and simplifying regularization [4], while addressing both label and group imbalances [3,5,6].

  • About the supplementary experiment on CIFAR-100-LT (IR=10)

Following your suggestion, we added the comparison experiment with IR=10; the results are shown in the table below.

| Method | F(%)↓ | Head↑ | Medium↑ | Tail↑ | All↑ | C(%)↑ | C/F↑ |
|---|---|---|---|---|---|---|---|
| BS | 100.0 | 64.9 | 55.5 | 0.0 | 61.5 | 100.0 | 1.0 |
| BS + ATO | 84.7 | 45.1 | 37.2 | 0.0 | 41.1 | 66.8 | 0.8 |
| BS + RReg | 52.1 | 46.2 | 38.7 | 0.0 | 42.5 | 69.1 | 1.3 |
| BS + ours | 22.6 | 56.4 | 47.3 | 0.0 | 53.8 | 87.4 | 3.8 |
| LDAM-DRW | 100.0 | 60.8 | 43.3 | 0.0 | 55.4 | 100.0 | 1.0 |
| LDAM-DRW + ATO | 84.7 | 41.5 | 26.2 | 0.0 | 37.3 | 67.3 | 0.8 |
| LDAM-DRW + RReg | 52.1 | 39.8 | 25.1 | 0.0 | 35.2 | 63.5 | 1.2 |
| LDAM-DRW + ours | 22.6 | 51.9 | 35.7 | 0.0 | 47.1 | 85.0 | 3.7 |
| DBLP | 100.0 | 65.3 | 43.4 | 0.0 | 58.7 | 100.0 | 1.0 |
| DBLP + ATO | 84.7 | 49.5 | 28.3 | 0.0 | 42.3 | 72.0 | 0.9 |
| DBLP + RReg | 52.1 | 48.3 | 27.2 | 0.0 | 41.1 | 70.1 | 1.3 |
| DBLP + ours | 22.6 | 57.3 | 30.0 | 0.0 | 49.2 | 83.8 | 3.7 |

  • About criteria weight matrix

To facilitate your understanding, we have added an image of the weight matrix D from the final training stage, where the horizontal axis is the class index, different color blocks represent the different pruning-weight criteria, and the length of each color block gives the numerical value of the corresponding pruning weight in D (a small illustrative plotting sketch follows the link below).

https://anonymous.4open.science/r/AEFCDAISJ/D_matrix_pic.png
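For reference, here is a small matplotlib sketch of how such a stacked-bar view of the matrix D can be drawn: one bar per class, segmented by criterion, with segment heights giving the weight values. The class count, criterion names, and the randomly generated weights are placeholders, not the values from the linked figure.

```python
# Placeholder data only: visualize a per-class criterion-weight matrix D as
# stacked bars, one bar per class index.
import numpy as np
import matplotlib.pyplot as plt

num_classes = 100
criteria = ["magnitude", "gradient", "taylor"]               # assumed criterion names
rng = np.random.default_rng(0)
D = rng.dirichlet(np.ones(len(criteria)), size=num_classes)  # stand-in for the learned D

bottom = np.zeros(num_classes)
for j, name in enumerate(criteria):
    plt.bar(np.arange(num_classes), D[:, j], bottom=bottom, width=1.0, label=name)
    bottom += D[:, j]
plt.xlabel("class index")
plt.ylabel("criterion weight")
plt.legend()
plt.savefig("D_matrix_plot.png")
```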

  • About weight adjustment mechanism

Your observation is very sharp. In a single step of Eq. (6), for an improved class c, the weights of all relevant criteria indeed increase by the same value β. We understand that this may raise questions about how the mechanism actually chooses the 'most effective criteria', and we provide the following clarifications (a small illustrative sketch of this update follows the list below):

(i) Equation (6) is not designed to directly identify a single 'most effective criterion', but rather to implement a more nuanced, multi-dimensional dynamic balancing mechanism.

(ii) The reasons for adopting this design rather than directly identifying the 'best criterion' are:

  1. In practice, multiple criteria often work together to achieve the best results, rather than a single criterion being dominant.
  2. As reviewer LjiE noted, the core goal of this paper is to introduce distribution-aware capabilities into pruning strategies; the criteria above are the vehicle through which this distribution awareness is turned into effective pruning. In future work, we plan to explore mechanisms for more precise and interpretable distribution-aware parameter pruning.
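To make the step being discussed concrete, here is a minimal NumPy sketch of the additive update; the function and variable names are hypothetical, and any normalization or decay terms of the actual Eq. (6) are omitted. It encodes only what is described above: every criterion weight of a class whose validation accuracy improved is raised by the same constant β.

```python
import numpy as np

def update_criterion_weights(D, acc, prev_acc, beta=0.1):
    """Hypothetical sketch of the Eq. (6)-style update.

    D:        (num_classes, num_criteria) criterion-weight matrix.
    acc:      current per-class validation accuracy.
    prev_acc: per-class accuracy from the previous pruning round.
    """
    improved = acc > prev_acc            # classes whose accuracy went up
    return D + beta * improved[:, None]  # same bonus for every criterion of those classes

# toy usage: 3 classes, 3 criteria
D = np.full((3, 3), 1.0)
prev_acc = np.array([0.80, 0.55, 0.30])
acc = np.array([0.82, 0.54, 0.35])       # classes 0 and 2 improved
print(update_criterion_weights(D, acc, prev_acc))
```
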
  • About novelty

As stated in the previous point, our goal is to introduce distributional awareness into pruning. To this end, LTAP reframes the pruning problem in the long-tailed scenario from 'uniform compression' to 'differentiated parameter assignment', a conceptual shift that is itself a significant innovation. Directly correlating distribution properties with parameter importance opens a new path for applying neural network compression to imbalanced data.

  • About notation and spelling errors

We appreciate your comments and will fully correct these issues in a subsequent version.

Reviewer Comment

I have reviewed the author's response. I am satisfied with the author's efforts. The supplementary experiments fully demonstrate the necessity and rationality of the proposed method. Additionally, the author's additional explanations regarding novelty and motivation are clear to me. I also agree with the recognition of the theoretical and experimental contributions of this paper by other reviewers, as well as their views on its significance in the long-tail domain. I have decided to raise my score.

Review
Rating: 3

This paper introduces an adaptive pruning method called LTAP to address the challenge of handling long-tailed distribution data. The authors propose a multi-dimensional importance scoring criterion and design a dynamic weight adjustment mechanism to adaptively determine the pruning priority of parameters for different classes. Experimental results on various benchmark datasets, such as CIFAR-100-LT and ImageNet-LT, demonstrate improvements in both computational efficiency and classification accuracy for tail classes.

Update after rebuttal

Thank you for your rebuttal. I will keep my score unchanged and remain positive about this paper.

Questions for Authors

Please refer to Strengths And Weaknesses.

Claims and Evidence

Please refer to Strengths And Weaknesses.

Methods and Evaluation Criteria

Please refer to Strengths And Weaknesses.

Theoretical Claims

Please refer to Strengths And Weaknesses.

Experimental Design and Analysis

Please refer to Strengths And Weaknesses.

Supplementary Material

Yes

Relation to Existing Literature

Please refer to Strengths And Weaknesses.

Missing Essential References

Please refer to Strengths And Weaknesses.

Other Strengths and Weaknesses

Strengths

  1. The concept of adaptive pruning to balance model efficiency and performance on long-tailed data is intriguing and has the potential to attract interest from the research community.
  2. The author provides theoretical analysis to strengthen the persuasiveness of the proposed method.
  3. The overall organization of the paper is well-structured.

Weaknesses

  1. Personally, I believe that instead of presenting numerous theorems and textual explanations in Section 3, incorporating visual elements would be more effective. This could help readers gain deeper insights into the methodology more easily.
  2. A minor issue: some symbols used in the formulas are not well explained, and at times I felt confused while reviewing the paper. I suggest that the authors improve the writing and enhance the clarity of the paper's content.

Other Comments or Suggestions

Please refer to Strengths And Weaknesses.

Author Response

Thank you for your recognition of our work. We have carefully considered the weaknesses you raised and have made the following efforts:

  • About presentation form

Thank you for your valuable suggestions. We will supplement the theorems and textual explanations in Section 3 with visual illustrations to aid the reader's understanding. A schematic figure will illustrate (i) why pruning in the long-tailed scenario requires special protection and (ii) the core idea of distribution-aware pruning.

  • About symbolic expression

Thank you for your advice. We have carefully proofread the notation used in the paper and improved the unclear descriptions.

Thank you again for your comments and we will continue to work on improving the quality of the manuscript!

Review
Rating: 3

This paper introduces LTAP (Long-Tailed Adaptive Pruner), a pruning strategy tailored for long-tailed data distributions. LTAP addresses the challenge of class imbalance by dynamically adjusting pruning priorities through a multi-criteria importance evaluation framework.

Questions for Authors

NA.

Claims and Evidence

Formal proofs establish that tail classes inherently demand higher overparameterization, justifying LTAP’s tail-biased parameter retention strategy.

Methods and Evaluation Criteria

Extensive experiments on CIFAR-100-LT, ImageNet-LT, and iNaturalist 2018 demonstrate LTAP’s superiority over baseline methods.

Theoretical Claims

The paper provides theoretical analysis (e.g., Theorem 1–4) to justify why tail classes require higher parameter protection.

Experimental Design and Analysis

LTAP achieves state-of-the-art efficiency-accuracy trade-offs across multiple architectures (ResNet-32/50) and datasets.

Supplementary Material

Yes.

Relation to Existing Literature

NA.

Missing Essential References

NA.

Other Strengths and Weaknesses

Strengths:

LTAP is the first pruning framework explicitly designed for long-tailed data. The LT-Vote mechanism effectively mitigates pruning bias toward head classes by dynamically reweighting criteria (magnitude, gradient alignment, Taylor impact) based on per-class validation accuracy. This innovation directly addresses the core challenge of class imbalance in pruning.

Overall, this work makes a contribution to long-tailed learning by bridging pruning and class imbalance mitigation. The LTAP framework is both theoretically grounded and empirically robust, offering a practical solution for efficient model deployment.
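As a rough illustration of the kind of multi-criteria vote described above, the sketch below combines magnitude, gradient, and first-order Taylor scores for one layer under given criterion weights. The function name, the per-criterion normalization, and the example weights are assumptions made here for clarity, not the authors' implementation.

```python
import torch

def combined_importance(weight, grad, criterion_weights):
    """weight/grad: one layer's parameters and their gradient;
    criterion_weights: length-3 weights for [magnitude, gradient, taylor]."""
    magnitude = weight.abs()
    gradient = grad.abs()
    taylor = (weight * grad).abs()                 # first-order Taylor term |w * dL/dw|

    # normalize each criterion so the three scores are on a comparable scale
    criteria = [c / (c.mean() + 1e-12) for c in (magnitude, gradient, taylor)]
    return sum(w * c for w, c in zip(criterion_weights, criteria))

# toy usage: the lowest-scoring entries would be the first pruning candidates
w, g = torch.randn(64, 32), torch.randn(64, 32)
score = combined_importance(w, g, torch.tensor([0.5, 0.2, 0.3]))
```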

Other Comments or Suggestions

NA.

Author Response

Thank you for your time and effort. We are encouraged by your high appreciation of the novelty and contribution of our paper. We will continue to explore how to better solve long-tailed problems in real-world scenarios. If you have any further questions, we will be glad to address them at any time. Thanks again!

Review
Rating: 4

This paper introduces Long-Tailed Adaptive Pruner (LTAP), a model pruning framework designed for long-tailed class distributions. LTAP integrates multi-criteria importance scoring and a dynamic LT-Vote mechanism to prioritize preserving parameters crucial for tail classes. It employs multi-stage pruning, gradually refining the model while maintaining performance. Theoretical analysis supports the claim that tail classes require more capacity, and experiments on CIFAR-100-LT, ImageNet-LT, and iNaturalist 2018 demonstrate that LTAP outperforms traditional pruning and long-tail learning methods, improving tail-class accuracy while reducing model size by 70%. LTAP provides a new approach to balancing efficiency and fairness in imbalanced learning.

Questions for Authors

How does an LTAP-pruned model (30% parameters retained) compare to a manually designed smaller model (with ~30% of the original parameters) trained with long-tail techniques? Does LTAP find a more effective sub-network than simply training a smaller model from scratch?

Claims and Evidence

The paper makes three major claims: (a) Tail classes require more model capacity than head classes, (b) Dynamically adjusting pruning criteria improves tail-class retention, and (c) LTAP achieves better accuracy-efficiency trade-offs than existing methods. Overall, the claims are well-supported by theoretical proofs and extensive experiments across multiple datasets. The results consistently show LTAP’s advantage in balancing efficiency and tail-class performance.

Methods and Evaluation Criteria

The methodology is well-designed, effectively addressing pruning bias in imbalanced data while maintaining efficiency. The evaluation is comprehensive, though hyperparameter robustness (e.g., pruning schedule, weight updates) remains an open question.

Theoretical Claims

The paper presents a strong theoretical foundation supporting LTAP’s approach, with multiple theorems validating its core ideas. These claims are supported by rigorous proofs in the appendix, which follow established generalization theory and sample complexity principles. While some assumptions may not hold perfectly in practice, the results provide a strong theoretical justification for LTAP's adaptive pruning strategy. The presence of formal guarantees enhances the paper’s credibility, making LTAP a well-grounded contribution to long-tailed learning and pruning research.

Experimental Design and Analysis

The experimental design is rigorous and well-structured, validating LTAP across CIFAR-100-LT, ImageNet-LT, and iNaturalist 2018, covering both controlled and real-world imbalance scenarios. The method consistently improves tail-class accuracy while reducing FLOPs, with results showing higher accuracy-per-FLOP (C/F) than competing methods. The evaluation metrics include head/medium/tail accuracy breakdowns, highlighting tail-class improvements.

Supplementary Material

The supplementary material enhances the clarity and credibility of the paper by providing detailed proofs of all theoretical results, an extensive survey of related work, and additional empirical analyses, reinforcing LTAP’s claims. However, Section E, which includes the pseudocode for LTAP, appears to be incomplete. This is a minor issue, as the overall method is well-described in the main paper and supplementary text. Despite this, the supplementary document significantly strengthens the paper, providing both theoretical depth and practical insights.

Relation to Existing Literature

This paper bridges long-tailed learning and model pruning, two traditionally separate research areas, by introducing adaptive pruning tailored for class imbalance. Unlike prior long-tail learning methods that focus on reweighting, resampling, or architectural modifications (e.g., expert models, transfer learning, logit adjustments), LTAP optimizes model structure dynamically to preserve tail-class critical parameters, making it a novel contribution to long-tail research. Similarly, while pruning methods have primarily targeted efficiency and overall accuracy, LTAP introduces a distribution-aware pruning strategy that prioritizes fairness across head and tail classes.

Missing Essential References

While the paper includes strong baselines such as LDAM-DRW and DBLP, there exist many other long-tailed learning approaches that could have further strengthened the evaluation. Methods like Focal Loss, BBN, and Logit Adjustment are widely used in long-tailed classification but were not explicitly included in the comparisons. Additionally, some recent state-of-the-art long-tailed learning strategies, such as MiSLAS and RIDE, which utilize representation learning and multi-expert models, could have been relevant baselines. While this omission does not undermine the paper’s key contributions, including a broader range of baselines would have further solidified LTAP’s effectiveness by demonstrating its adaptability across different long-tail learning paradigms.

Other Strengths and Weaknesses

Please refer to the above response.

Other Comments or Suggestions

The text in Figure 2 is too small, making it difficult to read.

Author Response

Thank you for your time, effort, and recognition of our work. Your comments are very important for improving this work. Based on them, we have made the following further efforts:

  • About Pseudocode

This was an oversight on our part. We have completed and corrected the pseudocode in the paper.

  • About the broader baselines

Thank you for your approval of the experimental section. Based on your comments, we did our best to add several baselines within the limited time, including Focal Loss, Logit Adjustment, and RIDE. The table below reports the results for these supplemental baselines on the CIFAR-100-LT dataset; they show the superior performance of LTAP.

| Method | F(%)↓ | IR=10 Head↑ | IR=10 Medium↑ | IR=10 Tail↑ | IR=10 All↑ | IR=10 C(%)↑ | IR=10 C/F↑ | IR=50 Head↑ | IR=50 Medium↑ | IR=50 Tail↑ | IR=50 All↑ | IR=50 C(%)↑ | IR=50 C/F↑ | IR=100 Head↑ | IR=100 Medium↑ | IR=100 Tail↑ | IR=100 All↑ | IR=100 C(%)↑ | IR=100 C/F↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Focal Loss | 100.0 | 65.1 | 43.4 | 0.0 | 58.8 | 100.0 | 1.0 | 67.2 | 38.6 | 13.8 | 46.2 | 100.0 | 1.0 | 67.8 | 39.3 | 8.0 | 40.2 | 100.0 | 1.0 |
| Focal Loss + ours | 22.0 | 57.1 | 32.3 | 0.0 | 49.7 | 84.5 | 3.8 | 61.9 | 29.6 | 5.4 | 38.7 | 83.7 | 3.8 | 63.4 | 35.9 | 4.8 | 36.4 | 90.5 | 4.1 |
| logit adjust | 100.0 | 57.5 | 63.2 | 0.0 | 59.4 | 100.0 | 1.0 | 59.3 | 46.7 | 43.2 | 51.4 | 100.0 | 1.0 | 62.4 | 47.1 | 27.9 | 46.9 | 100.0 | 1.0 |
| logit adjust + ours | 22.0 | 45.2 | 58.3 | 0.0 | 49.4 | 83.1 | 3.7 | 50.7 | 43.7 | 36.4 | 45.5 | 88.5 | 4.0 | 55.1 | 45.5 | 23.0 | 42.5 | 90.6 | 4.1 |
| RIDE | 100.0 | 70.5 | 42.0 | 0.0 | 61.6 | 100.0 | 1.0 | 68.5 | 48.8 | 44.0 | 51.1 | 100.0 | 1.0 | 68.1 | 49.2 | 23.9 | 48.0 | 100.0 | 1.0 |
| RIDE + ours | 22.0 | 62.8 | 32.5 | 0.0 | 53.4 | 86.6 | 3.9 | 62.8 | 40.1 | 32.0 | 43.9 | 85.9 | 3.9 | 62.0 | 46.8 | 18.2 | 45.3 | 90.6 | 4.1 |

  • About Figure 2

Thanks for your suggestion; we have enlarged the font in Figure 2 to improve readability.

  • About Questions

Indeed, the comparison with smaller models is an interesting point, which is fundamental to pruning studies but has unique significance in long-tailed scenarios.

First, as the visualizations in Figures 2 to 6 demonstrate, LTAP prunes parameters in a non-uniform, class-aware manner, which is difficult to achieve with manual architecture design. By simultaneously considering multiple importance metrics and adjusting their weights based on accuracy feedback, LTAP may identify subtle parameter interactions that simple architectural reductions would miss. This is fundamentally different from uniformly shrinking the model.

In addition, we did our best to organize a comparison experiment between a manually designed small model and the original standard model, where the FLOPs of the small model are 30% of those of the original model. If needed, you may compare this with Table 1 in the original paper.

| Method | F(%)↓ | IR=10 Head↑ | IR=10 Medium↑ | IR=10 Tail↑ | IR=10 All↑ | IR=10 C(%)↑ | IR=10 C/F↑ | IR=50 Head↑ | IR=50 Medium↑ | IR=50 Tail↑ | IR=50 All↑ | IR=50 C(%)↑ | IR=50 C/F↑ | IR=100 Head↑ | IR=100 Medium↑ | IR=100 Tail↑ | IR=100 All↑ | IR=100 C(%)↑ | IR=100 C/F↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BS | 100.0 | 64.9 | 55.5 | 0.0 | 61.5 | 100.0 | 1.0 | 62.3 | 46.1 | 37.0 | 51.2 | 100.0 | 1.0 | 62.6 | 48.5 | 27.0 | 47.2 | 100.0 | 1.0 |
| BS | 30.0 | 54.8 | 45.1 | 0.0 | 52.1 | 84.7 | 2.8 | 53.3 | 40.8 | 30.0 | 44.1 | 86.1 | 2.8 | 55.1 | 41.6 | 21.0 | 40.4 | 85.5 | 2.8 |
| LDAM-DRW | 100.0 | 60.8 | 43.3 | 0.0 | 55.4 | 100.0 | 1.0 | 64.5 | 43.0 | 26.4 | 49.1 | 100.0 | 1.0 | 65.1 | 48.1 | 20.1 | 45.8 | 100.0 | 1.0 |
| LDAM-DRW | 30.0 | 51.0 | 30.1 | 0.0 | 45.0 | 81.2 | 2.7 | 54.8 | 34.1 | 20.4 | 40.2 | 81.8 | 2.7 | 56.0 | 36.0 | 14.6 | 36.9 | 80.3 | 2.6 |
| DBLP | 100.0 | 65.3 | 43.4 | 0.0 | 58.7 | 100.0 | 1.0 | 61.2 | 46.5 | 32.3 | 50.2 | 100.0 | 1.0 | 61.4 | 46.9 | 23.6 | 45.3 | 100.0 | 1.0 |
| DBLP | 30.0 | 57.9 | 30.0 | 0.0 | 49.5 | 84.3 | 2.8 | 62.8 | 29.3 | 5.5 | 38.9 | 77.4 | 2.5 | 63.7 | 32.1 | 3.3 | 34.6 | 76.3 | 2.5 |

Final Decision

After review, the paper received four positive evaluations. Following the authors' rebuttal, two reviewers increased their ratings, while the other two maintained their original scores. All reviewers acknowledged the paper's contributions and expressed satisfaction with its technical merits.

The AC concurs with the reviewers' assessments and recommends acceptance.