Federated Virtual Learning on Heterogeneous Data with Local-global Distillation
Abstract
Reviews and Discussion
- This paper proposes a federated virtual learning approach that leverages local and global dataset distillation techniques to simultaneously tackle data heterogeneity and training efficiency in federated learning. The authors claim that dataset distillation can exacerbate the heterogeneity among clients’ local data and propose to alleviate this issue with distribution matching.
Strengths
The problem addressed in this paper is novel and interesting. The adverse effect of dataset distillation in a federated learning setting is insightful. The proposed approach seems feasible and promising.
Weaknesses
1. In your paper, the model on clients is split into feature extractors and classification heads. This split learning-like paradigm has been widely adopted by a series of prior works [1,2,3]. Please explain the deployability of your approach on top of existing methods. More elaboration on how your proposed method relates to these works would be appreciated.
2. If I understand you correctly, FedProx is proposed by [4] rather than [5]. Do I misunderstand something?
3. Some of the benchmark algorithms, such as FedProx [4] and Scaffold [6], are somewhat outdated. In your experiments, you have used different open-source datasets as private data for clients, and this degree of data heterogeneity is apparently unfavorable for the regularization-based methods mentioned above. Would it be possible to compare your approach with some novel federated learning methods based on GANs [7], which seem more suitable for your scenario?
[1] "FedICT: Federated Multi-task Distillation for Multi-access Edge Computing." IEEE Transactions on Parallel and Distributed Systems (2023).
[2] "Group knowledge transfer: Federated learning of large cnns at the edge." Advances in Neural Information Processing Systems 33 (2020): 14068-14080.
[3] "Exploring the distributed knowledge congruence in proxy-data-free federated distillation." arXiv preprint arXiv:2204.07028 (2022).
[4] "Federated optimization in heterogeneous networks." Proceedings of Machine learning and systems 2 (2020): 429-450.
[5] "On the convergence of fedavg on non-iid data." arXiv preprint arXiv:1907.02189 (2019).
[6] "Scaffold: Stochastic controlled averaging for federated learning." International conference on machine learning. PMLR, 2020.
[7] "Data-free knowledge distillation for heterogeneous federated learning." International conference on machine learning. PMLR, 2021.
We thank the reviewer for pointing out the interesting FL work, FedGen [2]. The reason we did not include FedGen (and other knowledge distillation-based methods) in our comparisons is that we consider knowledge distillation and dataset distillation two orthogonal directions for solving data heterogeneity issues.
It is worth noting that the training strategy of federated virtual learning is quite novel and under-explored. Thus, most existing FL methods are not suitable for direct comparison. In fact, we included VHL [1], which shares the most similar idea with FedLGD by using virtual data as regularization in local training. We show that using the virtual data generated by federated gradient matching can better handle the heterogeneous data scenario.
Following the suggestion, we have added FedGen to the related work in our revision and justified the orthogonality between knowledge distillation-based FL methods and FedLGD. To address the reviewer’s question, we report results for FedGen [2] on DIGITS, CIFAR10C (client ratio = 1), and RETINA in the following table.
| Avg. Acc. | FedGen | VHL | FedLGD |
|---|---|---|---|
| DIGITS | 66.9 | 86.5 | 86.7 |
| CIFAR10C | 39.6 | 55.2 | 57.4 |
| RETINA | 82.1 | 78.6 | 85.1 |
Observe that FedLGD consistently outperforms the other methods. We conjecture that FedGen does not perform well for two reasons: 1. it requires a larger amount of training data to obtain a reasonable global GAN; 2. it requires a larger feature space to better regularize local training with knowledge distillation (as shown in the RETINA experiments, where it can use a larger feature space, thus resulting in better performance).
[1] Tang Z, Zhang Y, Shi S, He X, Han B, Chu X. Virtual homogeneity learning: Defending against data heterogeneity in federated learning. In International Conference on Machine Learning 2022 Jun 28 (pp. 21111-21132). PMLR.
[2] Zhu Z, Hong J, Zhou J. Data-free knowledge distillation for heterogeneous federated learning. In International Conference on Machine Learning 2021 Jul 1 (pp. 12878-12889). PMLR.
We thank the reviewer for the careful review and for pointing out this subtle error. We incorrectly attributed FedProx to [2] in Sections 4.1 and 5 due to the similarity of the citation keys used for [1] and [2] (i.e., li2020a and li2020b). It should be [1] instead. We really appreciate this comment, and we have corrected it in our revision.
[1] "Federated optimization in heterogeneous networks." Proceedings of Machine learning and systems 2 (2020): 429-450.
[2] "On the convergence of fedavg on non-iid data." arXiv preprint arXiv:1907.02189 (2019).
Thanks for the questions. We appreciate the opportunity to point out the differences between FedLGD and the papers you suggested, to highlight our novel approach, and to discuss the potential deployment of FedLGD together with those methods.
- We have different setups. Although both formulate a classification model as a feature extractor and a classification head, FedLGD always keeps the forward and backward passes for both parts on the local clients, whereas those methods require placing the classification head on the server for inference and updates.
- We are orthogonal approaches. FedLGD splits the model into a feature extractor and a classification head so that we can add regularization in feature space with different data (local and global distilled data) on the same local feature extractor. In contrast, those papers are knowledge distillation methods that split the model so that they can obtain and share local features with the server; they then perform regularization on the logits produced from the same data features by different classification heads.
- Those methods make assumptions about feature sharing, but FedLGD does NOT. To enable the comparison of logits for knowledge distillation, those methods require sharing features from the local feature extractor with the server. In contrast, FedLGD does not require any feature sharing; we only share gradients, following classical FL, during the FL model updating stage. We consider that sharing features and additional knowledge may conflict with the intention of improving privacy in federated virtual learning.
Given the fundamental differences in the approaches and assumptions on information sharing, we believe our method and those methods are not directly comparable. In fact, FedLGD and model knowledge distillation-based methods can be combined as orthogonal strategies to address heterogeneity issues in FL. For example, the distilled global virtual data can serve as an ideal candidate for knowledge distillation, as it roughly captures the global data distribution; this can eliminate the requirement of sharing real data features or using public datasets. We hope our approach can shed light on leveraging virtual data in FL and inspire future work on combining our strategy with others.
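To make the local training procedure concrete, below is a minimal sketch (illustrative PyTorch-style code; the function and argument names are placeholders, and the class-wise mean-matching regularizer stands in for the actual feature regularization rather than being the exact FedLGD implementation) of a local step in which both the feature extractor and the classification head stay on the client, and local and global virtual data pass through the same extractor:

```python
import torch
import torch.nn.functional as F

def local_update(extractor, head, local_virtual, global_virtual, optimizer, lam=1.0):
    """One illustrative local step: the feature extractor AND the classification
    head both stay on the client, unlike split-learning-style methods.

    local_virtual / global_virtual: (images, labels) tensors of distilled data.
    optimizer: holds the parameters of both extractor and head.
    The class-wise mean-matching regularizer is only an illustration.
    """
    x_loc, y_loc = local_virtual
    x_glb, y_glb = global_virtual

    f_loc = extractor(x_loc)      # features of local virtual data
    f_glb = extractor(x_glb)      # features of global virtual data (same extractor)
    logits = head(f_loc)          # classification head also runs locally

    cls_loss = F.cross_entropy(logits, y_loc)

    # Pull class-conditional feature means of local virtual data toward those
    # of the global virtual data -- regularization happens in feature space.
    reg = x_loc.new_zeros(())
    for c in torch.unique(y_glb):
        if (y_loc == c).any():
            m_loc = f_loc[y_loc == c].mean(dim=0)
            m_glb = f_glb[y_glb == c].mean(dim=0)
            reg = reg + (m_loc - m_glb).pow(2).sum()

    loss = cls_loss + lam * reg
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Only the resulting model gradients/weights are then communicated, as discussed above.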
We thank the reviewer for pointing us to these interesting papers, and we have added them to our related work.
[1] "FedICT: Federated Multi-task Distillation for Multi-access Edge Computing." IEEE Transactions on Parallel and Distributed Systems (2023).
[2] "Group knowledge transfer: Federated learning of large cnns at the edge." Advances in Neural Information Processing Systems 33 (2020): 14068-14080.
[3] "Exploring the distributed knowledge congruence in proxy-data-free federated distillation." arXiv preprint arXiv:2204.07028 (2022).
Thank you for your careful rebuttal. I am impressed by your idea of using distilled global virtual data as the candidate for knowledge distillation. Personally, I suggest clarifying the difference between knowledge distillation and dataset distillation in your paper, as the former technique combined with federated learning has accumulated rich literature and a large number of readers. Anyway, I have increased my rating in my revised official comments.
Dear Reviewer 8zES,
Thank you for the recognition of our work and rebuttal. Your comments are valuable for us to refine our manuscript. We will highlight the differences between knowledge distillation and dataset distillation in our related work session in our revision.
Best Regards,
FedLGD authors
This paper proposes a federated learning method called FedLGD that uses local and global dataset distillation to handle data heterogeneity. FedLGD uses an iterative distillation process to generate local and global virtual datasets that mitigate data heterogeneity and improve efficiency in federated learning. The local-global distillation and feature regularization are key components that help FedLGD achieve strong performance.
Strengths
- The discovery that dataset distillation can amplify statistical distances is interesting.
- Achieves state-of-the-art results on benchmark datasets with domain shifts, outperforming existing federated learning algorithms.
Weaknesses
- t-SNE figures are not represented as vectors.
- Sharing gradients from clients to the server for the global virtual data update may pose security risks. Some attacks could potentially reconstruct raw data from gradient information, similar to the risks associated with Deep Gradient Leakage. Why is sharing averaged gradients safe?
- What is the rationale behind clients requiring local virtual data instead of training directly on their local private data?
- Could you clarify why this method has not been compared to other FL methods utilizing dataset distillation?
Questions
See Weaknesses.
We thank the reviewer for raising this question. As we stated in the related work (Sec. 2.2), to the best of our knowledge, current FL + dataset distillation works typically share local distilled data with the server [1,2,3], which we consider an impractical and more privacy-sensitive operation. Furthermore, some methods (e.g., [1]) need to use real data as initialization to obtain good performance.
On the contrary, due to privacy concerns, we choose to train the model locally and share the gradients w.r.t. local virtual data, which is the default operation in classical FL such as FedAvg. Therefore, the other FL + dataset distillation settings differ significantly from ours in terms of the information shared and the potential data leakage. We discussed these studies in our related work but believe it would be unfair to make a direct comparison, given their special requirements on information sharing.
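For concreteness, here is a minimal sketch of such a federated virtual learning round (illustrative only; the helper names are placeholders, not code from FedLGD or the cited works), where only model updates leave the client:

```python
import copy

def fvl_round(global_model, clients, local_train, average_weights):
    """One illustrative communication round of federated virtual learning.

    clients:         each client object holds its own locally distilled
                     `virtual_data`, which never leaves the device.
    local_train:     trains a copy of the model on virtual data and returns
                     the updated weights (or accumulated gradients).
    average_weights: FedAvg-style aggregation on the server.
    All three are placeholders standing in for the usual FL machinery.
    """
    client_updates = []
    for client in clients:
        local_model = copy.deepcopy(global_model)
        # Training uses only the client's *virtual* data; neither the raw
        # data nor the distilled data is uploaded -- only the resulting
        # weights/gradients are communicated, as in classical FedAvg.
        client_updates.append(local_train(local_model, client.virtual_data))

    # The server aggregates model updates, never distilled images.
    return average_weights(client_updates)
```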
[1] Xiong Y, Wang R, Cheng M, Yu F, Hsieh CJ. Feddm: Iterative distribution matching for communication-efficient federated learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 16323-16332).
[2] Goetz J, Tewari A. Federated learning via synthetic data. arXiv preprint arXiv:2008.04489. 2020 Aug 11.
[3] Hu S, Goetz J, Malik K, Zhan H, Liu Z, Liu Y. FedSynth: Gradient Compression via Synthetic Data in Federated Learning. In Workshop on Federated Learning: Recent Advances and New Challenges (in Conjunction with NeurIPS 2022) 2022 Oct 21.
We thank the reviewer for the clarification question. As we stated in our abstract, training with distilled virtual data can provide higher training efficiency and alleviate the problem of inference attacks, as demonstrated in centralized settings [1].
Building on this, we explore its usage in FL (named federated virtual learning - FVL), because privacy and efficiency (computation cost) are common concerns in FL, together with its obvious assistance in synchronization.
Specifically, the benefits are:
- Synchronization: Training with a small number of representative virtual data can facilitate local training, which can improve FL synchronization.
- Computation Cost: As shown in Appendix D.3 and Table 6, we empirically show that training with virtual data requires a much lighter computation cost. (We’ve corrected the typo in Appendix D.3 in our revision.)
- Privacy: As reported in [1], training with virtual data can protect the model from privacy attacks such as Inversion Attack [2] and Membership Inference Attack (MIA) [3]. In addition, as stated in the last paragraph of Section 3, “we consider averaged gradients w.r.t. local virtual data and the method potentially defends inference attacks better (Appendix E.6).”
The core of this work aims to resolve the heterogeneity issue associated with distillation-based virtual data in FVL. This will make the use of virtual data more feasible in FL and lead to improved synchronization, efficiency, and privacy while approaching the utility of real data.
[1] Dong T, Zhao B, Lyu L. Privacy for free: How does dataset condensation help privacy? In International Conference on Machine Learning 2022 Jun 28 (pp. 5378-5396). PMLR.
[2] Geiping J, Bauermeister H, Dröge H, Moeller M. Inverting gradients-how easy is it to break privacy in federated learning?. Advances in Neural Information Processing Systems. 2020;33:16937-47.
[3] Shokri R, Stronati M, Song C, Shmatikov V. Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy (SP) 2017 May 22 (pp. 3-18). IEEE.
We thank the reviewer for the constructive comment. Yes, privacy is indeed an important concern in federated learning. Thus, we discuss the privacy guarantee of FedLGD in the last paragraph in Section 3 and we highlight the statements as follows:
- FedLGD does not share additional client information compared to baseline methods such as FedAvg [1].
- Notably, the gradients we share in FedLGD are w.r.t. local virtual data distilled by Distribution Matching [2] (a brief sketch of this distillation objective follows this list), which has been shown to defend against Membership Inference Attacks and Reconstruction Attacks [3]. We also show our empirical defense results against Membership Inference Attacks in Appendix E.6.
- Privacy preservation can be further improved by employing differential privacy [4] in dataset distillation, but this is beyond the main focus of our work.
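For reference, below is a minimal sketch of the Distribution Matching objective [2] that produces the local virtual data mentioned above (simplified; the randomly sampled embedding networks, augmentations, and schedules of the original method are omitted, and all names are placeholders):

```python
import torch

def distribution_matching_step(embed_net, real_x, real_y, syn_x, syn_y, syn_opt):
    """One illustrative distillation step in the spirit of Distribution Matching.

    embed_net: a feature embedding network (randomly initialized in the original
               method); its parameters are not updated here.
    syn_x:     synthetic (virtual) images, a leaf tensor with requires_grad=True,
               held by syn_opt. Assumes every class in syn_y appears in the real batch.
    Gradients flow only into the synthetic images; raw data is used solely to
    compute feature statistics and never leaves the client.
    """
    loss = syn_x.new_zeros(())
    for c in torch.unique(syn_y):
        with torch.no_grad():
            real_mean = embed_net(real_x[real_y == c]).mean(dim=0)
        syn_mean = embed_net(syn_x[syn_y == c]).mean(dim=0)
        # match class-wise mean features of real and synthetic data
        loss = loss + (real_mean - syn_mean).pow(2).sum()

    syn_opt.zero_grad()
    loss.backward()
    syn_opt.step()   # updates syn_x only
    return loss.item()
```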
[1] McMahan B, Moore E, Ramage D, Hampson S, y Arcas BA. Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics 2017 Apr 10 (pp. 1273-1282). PMLR.
[2] Zhao B, Bilen H. Dataset condensation with distribution matching. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision 2023 (pp. 6514-6523).
[3] Dong T, Zhao B, Lyu L. Privacy for free: How does dataset condensation help privacy? In International Conference on Machine Learning 2022 Jun 28 (pp. 5378-5396). PMLR.
[4] Abadi M, Chu A, Goodfellow I, McMahan HB, Mironov I, Talwar K, Zhang L. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security 2016 Oct 24 (pp. 308-318).
We thank the reviewer for the clarification comment. We chose to show the 2-D plot for more intuitive visualization. In our original submission, we also reported the statistical distances between the two example datasets from two clients to support the heterogeneity statement. Specifically, in the second paragraph of our Introduction we stated, “Quantitatively, we found that using dataset distillation can amplify the statistical distances between the two datasets, with Wasserstein Distance and Maximum Mean Discrepancy (MMD) (Gretton et al., 2012) both increasing by around 40%.” We believe that the statistical distances and t-SNE plots together provide reasonable evidence for our finding.
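For concreteness, a minimal sketch of such an MMD estimate between two clients’ data follows (illustrative only, using a Gaussian kernel with a fixed bandwidth; the kernel, bandwidth, and feature representation used in the paper may differ):

```python
import torch

def gaussian_mmd(x, y, bandwidth=1.0):
    """Biased MMD^2 estimate between samples x (n, d) and y (m, d) with an RBF kernel."""
    def rbf(a, b):
        # pairwise squared Euclidean distances -> Gaussian kernel matrix
        d2 = torch.cdist(a, b).pow(2)
        return torch.exp(-d2 / (2 * bandwidth ** 2))

    k_xx = rbf(x, x).mean()
    k_yy = rbf(y, y).mean()
    k_xy = rbf(x, y).mean()
    return (k_xx + k_yy - 2 * k_xy).item()

# Illustrative usage: compare heterogeneity before and after distillation,
# e.g. on flattened images or extracted features of two clients' data.
# mmd_real = gaussian_mmd(client_a_real, client_b_real)
# mmd_virtual = gaussian_mmd(client_a_virtual, client_b_virtual)
```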
Dear Reviewer PXsJ,
As the rebuttal deadline is approaching, we would like to know whether our rebuttals have addressed your concerns and questions. We have also tried our best to reflect your valuable feedback in our revision. We appreciate the opportunity to discuss during this stage and would be delighted to address any further questions. If you are satisfied with our response and revision, we would be grateful if you could kindly reconsider the rating for FedLGD.
Best Regards,
FedLGD authors
We thank the reviewers for their valuable time and effort in reviewing FedLGD. We summarize the reviews and our responses as follows:
Strengths
- Our discovery that dataset distillation can amplify statistical distances is interesting and insightful (PXsJ, 8zES).
- The experimental results on domain-shift benchmark datasets show that FedLGD outperforms existing federated learning algorithms (PXsJ), which makes FedLGD seem feasible and promising (8zES).
- The problem addressed in FedLGD is novel and interesting (8zES).
Motivation
We would like to take this chance to restate the motivation of FedLGD to clarify the confusion raised by reviewer PXsJ.
Explanation of using local virtual data instead of local raw data for FL
The overall motivation of FedLGD starts from the advantages of using distilled virtual data for (centralized) model training: efficiency and privacy protection against inversion and membership inference attacks. We further explore its usage in federated learning (named federated virtual learning - FVL), as it additionally improves synchronization in federated learning. However, we noticed that heterogeneity issues can be amplified, which motivates the design of FedLGD to facilitate FVL.
Clarification
Differences between FedLGD and other FL + dataset distillation methods
Reviewer PXsJ asks why we do not compare with existing FL + dataset distillation papers. We clarify that existing FL + dataset distillation papers usually share local distilled data with the server, which we consider a dangerous move due to privacy concerns. Instead, we use a classic FL pipeline with virtual data - training on local virtual data on the clients’ side and aggregating the gradients on the server’s side (named federated virtual learning - FVL).
Differences between FedLGD and other FL methods
- Reviewer 8zES suggests comparing with more recent methods, including FedGen [1], a knowledge distillation-based method. We clarify that FVL is a relatively under-explored idea and that we chose the most related work, including VHL [2]. We also explain that knowledge distillation-based methods and our dataset distillation strategy are two orthogonal directions for solving data heterogeneity issues. Still, we performed experiments following the reviewer’s suggestion, and the results show that FedLGD outperforms FedGen in our data-heterogeneous scenario.
- Reviewer 8zES asks about the deployability of FedLGD with [3,4,5], which also split the model into a feature extractor and a classification head. We state the fundamental differences between FedLGD and these methods based on their significantly different model updating strategies and shared local information - [3,4,5] share data features with the server and perform server-side model updates, but FedLGD does NOT. We then provide a plausible discussion on how to deploy the dataset distillation idea of FedLGD with the listed methods.
[1] Zhu Z, Hong J, Zhou J. Data-free knowledge distillation for heterogeneous federated learning. In International Conference on Machine Learning 2021 Jul 1 (pp. 12878-12889). PMLR.
[2] Tang Z, Zhang Y, Shi S, He X, Han B, Chu X. Virtual homogeneity learning: Defending against data heterogeneity in federated learning. In International Conference on Machine Learning 2022 Jun 28 (pp. 21111-21132). PMLR.
[3] "FedICT: Federated Multi-task Distillation for Multi-access Edge Computing." IEEE Transactions on Parallel and Distributed Systems (2023).
[4] "Group knowledge transfer: Federated learning of large cnns at the edge." Advances in Neural Information Processing Systems 33 (2020): 14068-14080.
[5] "Exploring the distributed knowledge congruence in proxy-data-free federated distillation." arXiv preprint arXiv:2204.07028 (2022).
Updates on the revision
- Added the papers discussed by the reviewers to the related work section and highlighted the orthogonality between knowledge distillation-based methods and ours.
- Carefully corrected minor typos.
We have carefully addressed the reviewers’ comments and have updated the manuscript accordingly. The comments helped us revise our paper into better shape. We kindly request the reviewers to evaluate our responses and the revised manuscript, and we are more than happy to answer any further questions that may arise.
Best Regards,
FedLGD authors
In this paper, the authors focus on the problem of federated learning and propose an interesting framework centered on data distillation -- on each local worker, it uses feature matching to generate virtual data (using the global feature extractor to improve the quality), and on the global worker, it uses gradient matching to generate virtual data.
Reviewers found this paper to be interesting; however, they have several concerns -- (1) the precise privacy guarantee that this method provides and a more comprehensive study of its benefits relative to previous related work; (2) comparisons with other FL methods, especially those utilizing dataset distillation.
With all this considered, the reviewers are leaning towards rejection and hope that the authors can integrate this feedback into future submissions.
Why not a higher score
It would be great if this paper could provide more studies on the privacy guarantee/benefit that the proposed method provides and a more comprehensive comparison with other methods.
Why not a lower score
N/A
Reject