Average rating: 3.0/10

Decision: Rejected (4 reviewers)

Lowest rating: 3, highest rating: 3, standard deviation: 0.0

Average confidence: 4.5

ICLR 2024

G-Local Attention Graph Pooling for Graph Classification

Waqar Ali, Sebastiano Vascon, Thilo Stadelmann, Marcello Pelillo


Submitted: 2023-09-24, updated: 2024-02-11

TL;DR

We propose a new GNN pooling layer considering both global and local structural properties of a graph.

Abstract

Graph pooling is an essential operation in Graph Neural Networks that reduces the size of an input graph while preserving its core structural properties. This compression operation improves the learned representation of the graph, yielding a performance boost on downstream tasks. Existing pooling methods find a compressed representation considering the Global Topological Structures (e.g., cliques, stars, clusters) or Local information at node level (e.g., top-$k$ informative nodes). However, there is a lack of an effective graph pooling method that integrates both Global and Local properties of the graph. To this end, we propose a two-channel Global-Local Attention Pooling (GLA-Pool) layer that exploits the aforementioned graph properties, generating more robust graph representations. The GLA-Pool can be integrated into any GNN-based architecture. Further, we propose a smart data augmentation technique to enrich small-scale datasets. Exhaustive experiments on eight publicly available graph classification benchmarks, under standard metrics, show that GLA-Pool significantly outperforms thirteen state-of-the-art models on six datasets while being on par for the remaining two. The code will be available at this link.

Keywords

Graph neural networks, graph pooling, pooling layer, data augmentation

Reviews and Discussion

Review

Rating: 3, confidence: 5, 2023-10-23

This paper investigates graph pooling techniques in graph neural networks for the graph classification task. Existing methods use either node clustering or node selection to reduce the size of the graph. In this paper, the authors propose GLA-Pool to incorporate both global and local information of the graph. Specifically, a clique algorithm is used to extract all maximal cliques, and each clique is then transformed into a single node to form a pooled graph, which captures the global property. To capture the local property, an attention mechanism is applied within each clique to select important nodes. Experimental results on several public datasets, compared against several methods, show that the proposed model achieves satisfactory performance.
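The global channel summarized above (enumerate maximal cliques, then collapse each one into a single node) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `maximal_cliques` and `clique_pool` are hypothetical, the enumeration uses the standard Bron-Kerbosch algorithm with pivoting, and the rule that two pooled nodes are linked when their cliques share a node is an assumption made only for illustration.

```python
from itertools import combinations


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (with pivoting).

    adj maps each node to the set of its neighbours (undirected, no self-loops).
    """
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        # Pivot on the vertex with the most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def clique_pool(adj):
    """Collapse every maximal clique into one super-node.

    Assumption for illustration: two super-nodes are linked when their
    cliques share at least one original node.
    """
    cliques = sorted(maximal_cliques(adj), key=sorted)
    pooled = {i: set() for i in range(len(cliques))}
    for i, j in combinations(range(len(cliques)), 2):
        if cliques[i] & cliques[j]:
            pooled[i].add(j)
            pooled[j].add(i)
    return cliques, pooled


# Toy graph: a triangle {0, 1, 2} with a pendant path 2-3-4.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
cliques, pooled = clique_pool(adj)
```

On the toy graph, the five original nodes become three pooled nodes, one per maximal clique. Note that single edges such as 2-3 count as maximal cliques here, which is relevant to the later remark about graphs with few cliques of three or more nodes.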

Strengths

This paper studies graph pooling for graph classification, which is an important topic in graph neural networks.
Different types of datasets are utilized to evaluate the model’s performance.
Ablation studies are given to show the effectiveness of the proposed components.
Visual figures are given to help the readers to understand the model.

Weaknesses

The novelty of the proposed model is limited since it simply combines CliquePool and SAGPool. There are almost no key modifications to the modules.
The datasets used are too small, and all of them are binary classification tasks. Larger-scale datasets such as ogbg-molpcba and ogbg-ppa are suggested.
The experimental settings are not consistent. In Table 1, the authors directly cite results reported by existing methods. However, those settings differ, so comparing against the reported numbers directly is not fair. For instance, MuchPool [2021] used 10-fold cross validation, and Wit-TopoPool [2023] used a 90/10 random training/test split, whereas this paper uses 10-fold cross validation with 80% training, 10% validation and 10% testing. Reproducing the baseline results under the same setting is therefore suggested.
It is not clear why GAT and GIN perform so poorly on some of the datasets. For instance, GAT scores 47.6 on Reddit-B and GIN scores 57.49 on NCI-1. More discussion of these cases is encouraged.
In Figure 3, the proposed GLAPool has a lower time complexity than CliquePool. Is the time for maximal clique extraction included?
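To make the protocol difference in the third weakness concrete, here is a minimal sketch of a 10-fold split in which every fold yields 80% training, 10% validation and 10% test data. The function name `ten_fold_splits`, the seed, and the choice of the next fold as the validation set are assumptions for illustration; the paper's exact splitting code is not shown in this review.

```python
import random


def ten_fold_splits(n_graphs, n_folds=10, seed=0):
    """Yield (train, val, test) index lists for each fold.

    Fold k is the test set, fold (k + 1) % n_folds is the validation set,
    and the remaining folds form the training set (an assumed rotation).
    """
    indices = list(range(n_graphs))
    random.Random(seed).shuffle(indices)
    # Strided slicing gives n_folds disjoint folds of near-equal size.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        val_k = (k + 1) % n_folds
        test = folds[k]
        val = folds[val_k]
        train = [i for j, fold in enumerate(folds)
                 if j not in (k, val_k) for i in fold]
        yield train, val, test


splits = list(ten_fold_splits(1000))
```

A baseline evaluated with a single 90/10 random split sees more training data per run and no held-out validation fold, so its reported accuracy is not directly comparable to numbers produced under this protocol.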

Questions

It is not clear whether the baselines also use node augmentation in the training procedure.
The motivation for using GCN and GAT as two views is not clear. What if only one of them is used?
In Eq. (6), $S(idx, :) \in \mathbb{R}^{N^{l+1} \times 1}$ cannot be multiplied element-wise with $X'(idx, :) \in \mathbb{R}^{N^{l+1} \times d}$. Some transformation of $S(idx, :)$ is needed.
It is not clear which GNN backbones are used in the experimental results. Although the authors claim that any backbone is applicable, there are no experimental results to support this.
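One possible way to resolve the shape mismatch raised for Eq. (6) is to replicate the score column across the $d$ feature columns before taking the element-wise product. The left-hand symbol $X^{l+1}$ and the all-ones vector $\mathbf{1}_d \in \mathbb{R}^{d \times 1}$ are assumptions introduced here; the paper's actual Eq. (6) is not reproduced in this review.

$$
X^{l+1} = X'(idx, :) \odot \left( S(idx, :)\, \mathbf{1}_d^{\top} \right),
\qquad
S(idx, :)\, \mathbf{1}_d^{\top} \in \mathbb{R}^{N^{l+1} \times d}
$$

Both operands of $\odot$ now have shape $N^{l+1} \times d$. Equivalently, the product can be written as $\operatorname{diag}\left(S(idx, :)\right) X'(idx, :)$, which scales each selected node's feature row by its score.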

Ethics concerns details

No ethics review needed.

Review

Rating: 3, confidence: 5, 2023-10-29

This paper introduces a method called GLA-Pool, which learns pooled graphs from both local and global perspectives. Extensive experiments have been conducted on the pooling operation to verify its effectiveness.

Strengths

The consideration of both local and global information when designing the pooling operations is a significant aspect of this study.

Weaknesses

The limited literature review results in a weak contribution to the field. The main challenge in this paper appears to be the design of the local and global structure learning components. For methods that incorporate global information, such as clique, cluster, and stars, there is a lack of comparison with these methods, leading to a weak justification for the first contribution. The same issue arises with methods for learning local information. Moreover, method [1] also focuses on capturing global structures, and [2] provides a detailed discussion on pooling operations. A comparison between existing methods and the two components designed in this study should be provided to justify their effectiveness.
The evaluation of data augmentations is overlooked. Although data augmentations are provided in this paper, their evaluations are ignored. It appears that GLA employs a data augmentation trick while the baselines do not, which creates an unfair advantage in the experiments.
The classification of LTS and GTS. This paper seems to categorize methods into two classes based on local and global topology extractions. It would be beneficial to explain how this differs from the selection and grouping-based methods mentioned in [2] and [3].

[1] Spectral clustering with graph neural networks for graph pooling. ICML 2020
[2] Understanding Pooling in Graph Neural Networks. TNNLS 2022
[3] Graph pooling for graph neural networks: Progress, challenges, and opportunities.

Questions

Please see the weaknesses above.

Ethics concerns details

None

Review

Rating: 3, confidence: 4, 2023-10-30

This paper introduces a two-channel attention-based graph pooling technique GLA-Pool that effectively incorporates both graph topology and node information into hierarchical graph pooling. The importance of graph pooling in GNNs is discussed. The authors conduct experiments on various datasets, demonstrating that GLA-Pool outperforms several existing GNNs and graph pooling methods.

Strengths

The concept of integrating global topology and node information in graph pooling is straightforward and well-motivated.
The proposed method exhibits good performance on most datasets when compared to other graph pooling baselines.

Weaknesses

The major concern with the paper is the lack of novelty. The paper appears to be an incremental amalgamation of existing works. In particular, the dual-strategy-based pooling resembles SAGPool in the way it generates attention with reference to clique information. The authors should better position their work within the existing literature.
The notation used in this paper lacks consistency and is confusing. Conventionally, bold capital letters are used to represent matrices, bold lowercase letters signify vectors, and lowercase letters denote scalars. However, the notation system in the paper mixes up these conventions: e.g., using "X," "M," and "C_r" to represent a matrix, vector, and scalar, respectively, making the equations hard to understand. Additionally, if "C" represents a set of total cliques, it should be denoted as "|C|" in Equation (2).
The paper lacks adequate discussions and comparisons with substructure-counting based methods, such as references [1], [2], and [3].
The experiments are conducted on small-scale datasets. It would be beneficial to include additional experimental results on large datasets, such as OGBG-MOLHIV and ZINC, to demonstrate the model's scalability and generalizability.
The ablation study is limited. Some aspects of the model design, such as node augmentation and the inclusion of GCN and GAT in the dual-channel, require further discussions and analyses.
The visualizations provided in the paper do not effectively support the motivation of using cliques in graph pooling. Given the limited presence of cliques with three or more nodes, it may be more informative to highlight the significance of capturing cycles in graph structures.
The presentation should be improved, especially for the methodology section. It would be helpful to polish the writing and incorporate illustrative figures or examples.

[1] "Uplifting any GNN with local structure awareness."
[2] "Improving graph neural network expressivity via subgraph isomorphism counting."
[3] "Boosting the cycle counting power of graph neural networks with I^2-GNNs."

Questions

Please refer to Weaknesses. Some additional questions are:

Why does the proposed method take the high-degree nodes as the core part of the graph in data augmentation? The authors treat low-degree nodes as unimportant and drop them. However, in applications such as molecular datasets with toxic/non-toxic compounds, the functional groups often contain low-degree nodes with benzene rings. The proposed method may not work in such applications.
How is M_e generated? Is it based on the selected node or by another network?
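The concern about degree-based augmentation can be seen in a small sketch. It assumes, following the reviewer's description only, that augmentation removes the lowest-degree nodes; the function name `drop_low_degree_nodes` and the parameter `n_drop` are hypothetical, and the paper's actual augmentation rule may differ.

```python
def drop_low_degree_nodes(adj, n_drop):
    """Return a copy of the graph with the n_drop lowest-degree nodes removed.

    adj maps each node to its neighbour set. Degrees are measured once on
    the input graph, and ties are broken by node id for determinism.
    """
    order = sorted(adj, key=lambda v: (len(adj[v]), v))
    dropped = set(order[:n_drop])
    # Keep surviving nodes and strip edges that pointed at dropped nodes.
    return {v: adj[v] - dropped for v in adj if v not in dropped}


# Toy graph: hub 0 joined to 1, 2, 3, plus the edge 1-2.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
augmented = drop_low_degree_nodes(adj, n_drop=1)
```

Here the pendant node 3 is removed while the triangle survives. If node 3 stood for an atom in a functional group, the augmented graph would lose exactly the substructure that determines the label, which is the scenario the question describes.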

Review

Rating: 3, confidence: 4, 2023-11-01

In this paper, the authors aim to propose a graph pooling method for enhancing the graph classification performance. In particular, the authors aim to capture both global and local properties of the graph for graph pooling. In addition, the authors also propose a data augmentation strategy to enrich small-scale datasets.

Strengths

Clarity: In general, the paper is well-organized and easy to follow.

Quality: The paper conducted a set of experiments to verify the effectiveness of the proposed pooling method. The authors also conducted an ablation study to understand how different components help the model.

Weaknesses

The novelty of this paper is somewhat limited. The proposed method seems to make only marginal contributions compared with existing solutions. For example, in Section 4.3, the local topology pooling operation is essentially SAGPool, but with a combination of GCN and GAT for learning the importance score.
Some of the arguments are not very rigorous. For example, the authors propose to augment the graphs by altering the low-degree nodes. They claim this is because the graph's core structural properties and patterns are often represented by high-degree nodes. It would be better if the authors could provide some references or investigations for such claims, especially for graph classification tasks.
Some of the model designs are not well-motivated. For example, it is argued that "SAGPool yields less robust node rankings due to a single strategy for calculating node importance". However, it is not very clear how combining GCN and GAT can help address this issue.

Minor Issues:

The combination of global and local information is not detailed in the main text of the paper. It is demonstrated in Figure 1. It might be better if the authors could provide some description for this part.

Questions

Please address the questions raised in the weaknesses.

Ethics concerns details

N/A

AC Meta-Review

2023-12-06

This paper proposes a new graph pooling method, GLA-Pool, in graph neural networks for the graph classification task. The GLA-Pool method transforms each clique into a single node and uses an attention mechanism within each clique to select important nodes. Experiments on several public datasets, compared against several methods, show satisfactory performance.

The studied topic is an important one for graph neural networks. The main idea is straightforward and well-motivated. Various types of datasets are used to evaluate the model's performance.

However, the novelty of this paper is very limited, since it is a simple combination of two existing methods. Secondly, the datasets are too small and cover only binary classification. In addition, the experimental settings are not fair to all the compared methods. Finally, the paper should be carefully polished.

Why not a higher score

The major concern is the limited novelty.

Why not a lower score

N/A

Final decision: Reject

2024-01-16

Reject