Fully Hyperbolic Representation Learning on Knowledge Hypergraph
摘要
评审与讨论
This paper proposes a method called Hyperbolic Hypergraph GNN (H2GNN) for representation learning on knowledge hypergraphs. The authors argue that existing methods for knowledge hypergraph representation learning either transform hyperedges into binary relations or ignore the adjacencies between hyperedges, resulting in information loss and sub-optimal models. To address these issues, H2GNN uses a hyper-star message passing scheme to capture the semantics of hyper-relations, and it works in the fully hyperbolic space to reduce distortion and improve efficiency.
优点
- The paper proposes a novel method, H2GNN, for representation learning on knowledge hypergraphs. It addresses the limitations of existing methods by considering the adjacencies between hyperedges and explicitly incorporating entity positions.
- H2GNN introduces the hyper-star message passing scheme, which is a novel approach to capture the semantics of hyper-relations.
- The paper demonstrates the effectiveness of H2GNN through extensive experiments on node classification and link prediction tasks, comparing it with state-of-the-art baselines on various datasets.
缺点
- For innovation, this paper lies mainly in the manipulation of star extensions and hyperbolic transformations of hypergraphs and is not very innovative.
- The article lacks logic and does not reflect the necessity of combining the star extension and hyperbolic operation of the hypergraph. Hypergraphs themselves have a hierarchical structure, and following the logic in the article it would be possible to get low distortion results by learning hypergraphs directly on hyperbolic space.
- Insufficient theoretical depth in experimental analyses. The good results of an experiment do not necessarily mean the quality of the article. It is important to start with the results and further compare and analyze them to explore the reasons why the method is SOTA.
问题
- please elaborate on the reasons that hyperbolic space outperforms Euclidean space, such as how the distortion is reduced. The argument of the article is that the graph structure after the star extension of the hypergraph structure has a hierarchical structure and therefore has less distortion in the hyperbolic space. But the hypergraph structure and the graph structure after the star extension are bijective equivalences, so why must one need to combine the hyperbolic operation with the star extension? The writer needs to sort out the logic of the article again.
- Is it possible to visualize star-expanded hypergraphs on hyperbolic space? This would enhance the persuasiveness of the article.
- Can you further analyze in your experiments why your method can perform better compared to other hypergraphic learning methods?
In this work the authors propose a hypergraph-specific message passing scheme intended to capture higher-order relations in knowledge hypergraphs. The main novel contribution of the authors is their Hyper-star Message Passing algorithm, which describes how to update hyperedge and node representations in the hyperboloid . Given existing node features and a hyperedge , the authors define the representation of this hyperedge as the centroid (in the hyperboloid) of .
To get an updated representation of node , they first consider , the set of all hyperedges containing . From each , they create which is a relation-type and position-specific embedding based on the position of in . They then update the node embedding for by calculating the centroid of and .
The authors evaluate this message passing approach on two tasks - node classification, and link prediction - where they observe improvements over a number of baselines. They also perform a comparative study evaluating different decoders for link prediction as well as different encoders, as well as an ablation of the model which suggests both the hyperbolic aspects of the model and position information of the model are beneficial for link prediction.
优点
The proposed hypergraph-specific message passing scheme is original, and intuitively seems to be a reasonable approach to capturing information in the hypergraph setting. The formulation of their model allows for a flexible encoder/decoder architecture which the authors highlight in one of their comparative evaluations.
缺点
Unfortunately, I was not able to fully understand the author's proposed model. For example, in section 3.1 the authors talk about implementing a "matrix function to perform linear transformations in hyperbolic space", but it is not clear to me when this was used. In looking at the diagram of the model in Figure 3, it looks like a hypergraph is passed as input to the linear transformation, which is then passed to the message passing component, but this does not make sense to me. I would love to see a clear step-by-step explanation of the model, taking a hypergraph as input and outputting node embeddings.
The empirical evidence is also somewhat mixed. The authors claim their model "significantly outperforms other methods on all datasets", but actually it is relatively close (and almost always within one standard deviation) to UniSAGE. The link prediction results are somewhat similar - while the author's proposed model performs well, HypE is a close contender (and outperforms in a few cases). Link prediction is also a task fraught with evaluation difficulties, with test set bias often explaining minor differences in performance.
问题
- Could you provide a clear step-by-step explanation of how your model starts from a hypergraph and creates node embeddings?
- I was surprised that the position-specific embedding modifies the representation of a hyperredge in the aggregation for creating a new node representation as opposed to modifying the representation of the node in the aggregation for creating a hyperedge representation. Is there some benefit to your proposed approach compared to the one I describe here?
- The notation for is confusing. First, it is stated that , but then it is subtracted from a vector in . Secondly, the dependencies are not clear from the notation. From the text, it is clear that is dependent on both a relation and a position, but in equation (4) this is left to determine from context. I assume that in equation (4), is different for each and depends on the relation in as well as the position of in , but the current notation really doesn't make this clear.
- In section 3.3 the loss functions are presented, but it is not explained how the encodings from the message passing algorithm are converted to conditional probabilities equation (6) or confidence scores in equation (7). I assume an MLP in the first case, and I see that there is a later discussion about decoders for link prediction, but details here are lacking.
- To my understanding, m-DistMult is not particularly well-aligned to extract information from H^2GNN, yet it is shown in Figure 4 that it performs far worse when paired with UniGNN or UniSAGE. Were these encoders trained with m-DistMult as the decoder, or are these the result of simply using m-DistMult on top of frozen encoders? What is the explanation for the incredibly poor performance of UniGNN and UniSAGE when using m-DistMult?
- In section 4.4 there is an ablation of Hyperbolic Operation (HO) and Position Information (PI) modules, which is the first time these terms are mentioned in the text. What are these modules? (This goes back to my primary question related to a clear step-by-step structure of the model.)
Typos / Suggestions
- Page 2: Additional criteria are needed to complete this statement: "The goal of representation learning is to obtain embedded representations for each node x ∈ V and relation type r ∈ R in the knowledge hypergraphs." As it stands, this could be satisfied by randomly assigning a vector to each node. The goal here should include some additional objective, eg. "... such that they can be used to reconstruct the original graph".
- Page 3: "In Figure 2, when we embed the tree structure into Euclidean space, the distance between the yellow and pink nodes on the tree is 8 nodes apart, but in reality, these tree-like structures are very close in real networks." I do not understand the latter part of this sentence. As I have understood it, the problem is typically the reverse - that embedding a tree in Euclidean space makes leaves from different paths closer than they should be, which is exactly as depicted in Figure 2 and also concordant with the sentence which follows this one in your paper.
- Page 4-5: The notation doesn't make sense to me, because the integer is not in . At best, I could see something like , or maybe just using as the argument to the centroid function and putting "for " to the right or in the text above the block equation.
- Page 5: The squared Lorentzian norm is mentioned but has not been defined, it would be good to include the definition for completeness.
- Page 6: I would advise against claiming the model "significantly outperforms" the baselines when, in fact, many of the results are within 1 standard deviation of the UniSAGE baseline. Also, the accuracy range of 89.75% to 74.89% quoted in the text is incorrect based on the numbers in Table 2.
This paper developed the Hyperbolic Hypergraph GNN (H2GNN) for modeling complex relationships in knowledge hypergraphs using hyperedges. This method utilizes hyper-star message passing for a non-lossy hierarchical representation of hyperedges, embedding them in hyperbolic space for reduced distortion and improved efficiency. H2GNN outperforms well in experiments, but it does not compare with the strong baselines.
优点
The paper is well-organized.
The paper provides a comprehensive literature review.
缺点
The paper is not organized clearly, which is not easy to understand. For instance, there is a lack of details for hyperbolic space explanation.
There lacks a comparison between the strong baselines, such as [1] [2], which makes the performance not comparable.
[1] Searching to sparsify tensor decomposition for n-ary relational data
[2] Role-aware modeling for n-ary relational knowledge bases.
问题
Please refer to the Weakness.
The paper proposes a new representation learning method and a message passing algorithm for hypergraphs.
Weakness: The method is close to the well-known star extension and lacks of clarity to explain the experimental results, in particular there is no theoretical insight and no fair comparison with competing methods.
The authors did not reply to questions from the reviewers. I consider that they adhere to the comments that have been given.
为何不给更高分
The authors did not reply to questions from the reviewers. I consider that they adhere to the comments that have been given.
为何不给更低分
N/A
Reject