Diffusion-based Graph Masked Autoencoders for Out-of-Distribution Generalization

Jiahao Liang, Zhiwen Yu, Yang Hu, Xiaoqing Liu, Tong Zhang, Kaixiang Yang

Submitted: 2024-09-28; Updated: 2024-10-09

Abstract

Graph Out-of-Distribution (GraphOOD) problems have become increasingly significant in the field of graph neural networks. Graph Neural Networks (GNNs) are particularly vulnerable to performance degradation under distribution shifts. This is due to the intricate interconnections between nodes in graph data and the lack of environmental labels, which make it difficult to ensure model reliability. Recent advances in computer vision have shown that Diffusion Models (DMs) have strong generalization capabilities, providing a natural advantage in mitigating the effects of distribution shifts. Specifically, DMs can effectively capture and generate details of data distributions through a stepwise denoising process, thereby enhancing model robustness. However, applying diffusion to GraphOOD problems presents challenges, such as learning invariant knowledge that remains unaffected by distribution shifts. To address this, we propose a diffusion-based pre-training model for GraphOOD, termed $D$iffusion-based $M$asked $A$uto$E$ncoders on Graph Out-of-Distribution Generalization (DiffGMAE). First, we propose a novel empirical risk minimization (ERM) approach, called the NoisedERM module, that augments the data by progressively adding noise; it aims to learn invariant features while avoiding corruption of the discrete information of the original graph. Then, we design a self-supervised learning module called DiGMAE, which replaces the traditional MAE decoder with a diffusion-based denoising process. The aim is to use the invariant features obtained by NoisedERM as the condition for diffusion and to improve the robustness of the model in a self-supervised way, so that it can cope with the distribution shift of the GraphOOD problem. DiffGMAE achieves significant improvements on OOD benchmarks. In addition, our ablation experiments show that the diffusion process is superior to traditional graph generation methods in solving OOD problems.
The implementation code is available in the supplementary material for reproducibility.
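The "progressively adding noise" idea mentioned in the abstract can be illustrated with the standard Gaussian forward diffusion process. The sketch below is not the authors' NoisedERM module; it assumes, for illustration only, that noise is applied to continuous node features (leaving the discrete graph structure untouched) under a linear variance schedule. The function names `linear_beta_schedule` and `noise_features`, the schedule endpoints, and the feature shape are all hypothetical choices.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced per-step noise variances (an assumed schedule)."""
    return np.linspace(beta_start, beta_end, num_steps)

def noise_features(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.
    Only continuous features are perturbed; edges are not modified."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)   # cumulative signal retention per step

x0 = rng.standard_normal((5, 8))      # hypothetical: 5 nodes, 8-dim features
x_early, eps_early = noise_features(x0, 10, alpha_bar, rng)   # mild noise
x_late, eps_late = noise_features(x0, 999, alpha_bar, rng)    # almost pure noise
```

At small `t` the noised features stay close to the originals, while at the final step almost no signal remains. A denoising model would be trained to recover `eps` (or `x0`) from `x_t`, optionally conditioned on other features.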

Keywords

Machine Learning; Deep learning; Graph learning; Self-supervised learning

Reviews and Discussion

Withdrawal Notice

2024-10-09

The algorithm in the Supplementary Material was not submitted.