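PaperHub
Average rating: 5.8/10
Decision: Poster · 4 reviewers
Ratings: 5, 5, 6, 7 (lowest 5, highest 7, standard deviation 0.8)
Confidence: 3.8
Correctness: 3.0 · Contribution: 2.5 · Presentation: 3.0
NeurIPS 2024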

DECRL: A Deep Evolutionary Clustering Jointed Temporal Knowledge Graph Representation Learning Approach

Submitted: 2024-05-13 · Updated: 2024-12-19

Keywords
Event Prediction, Temporal Knowledge Graphs, Representation Learning, Evolutionary Clustering, Graph Neural Networks

Reviews and Discussion

Review
Rating: 5

This paper proposed an interesting model to integrate high-order clusters into the TKG representation learning. A cluster-aware unsupervised alignment mechanism is introduced to ensure the alignment of soft overlapping clusters across timestamps. An implicit correlation encoder is also proposed to capture latent correlations between clusters. Experimental results show the effectiveness of the proposed model.

Strengths

  1. Integrating high-order structure in TKG representation learning is interesting.
  2. The proposed model seems technically reasonable.

Weaknesses

  1. Entity graph and cluster graph are two important concepts for understanding the proposed method. However, they are never mentioned in the introduction and their definitions in Section 3 are very brief, making the intuition behind the proposed method unclear to me.
  2. The authors use a relatively simple task, future relation prediction, to evaluate the quality of the learned TKG representations. However, future entity prediction (i.e., [s, r, ?, t]) is more challenging due to the large size of the entity set and the evolution of entity semantics. Intuitively, modeling high-order correlations among entities can also benefit entity prediction. It would be better to also evaluate the learned representations on this task.
  3. The authors cluster entities only based on their representations. However, in knowledge graphs, entities are connected via various relations and different relations may indicate different correlations. For example, the relation "leave from" means an athlete will not interact with a club, but "transfer to" means they will have more interactions. How can the proposed method handle such correlations brought by relation semantics?
  4. Only two benchmark datasets are used for evaluation, making the experimental results unconvincing. ICEWS14C and ICEWS18C datasets are actually subsets of ICEWS14 and ICEWS18, and they are both derived from the same resource (i.e., ICEWS). More TKGs especially from different resources such as GDELT, YAGO, and Wikidata should also be used for evaluation.

Questions

See weaknesses.

Limitations

Yes

Author Response

We sincerely appreciate the reviewer’s meaningful comments and insightful questions.

Q1: Lack of clarity in introducing and defining entity graph and cluster graph concepts

Thank you for raising this concern. We acknowledge that these crucial concepts were not sufficiently introduced or defined, potentially obscuring the intuition behind our approach. Therefore, in the camera-ready version, we will incorporate a concise yet informative discussion of entity graphs and cluster graphs in the introduction, providing readers with an early understanding of these key concepts and their role in our approach. In Section 3 (Preliminaries), we will expand our definitions of entity graphs and cluster graphs, including more detailed explanations of their structures, properties, and significance in our approach.

Q2: Lack of evaluation on future entity prediction task

In response to this valuable feedback, we have conducted additional experiments to evaluate our model’s performance on the future entity prediction task, as shown in Table 2 of the rebuttal PDF (attached to the global rebuttal). Although our approach does not achieve SOTA performance in terms of MRR and Hits@1, it achieves the best results for Hits@3 and Hits@10. These results demonstrate the effectiveness and robustness of our approach, particularly in capturing a broader range of relevant entities.
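
For readers less familiar with the ranking metrics mentioned here, the following is a minimal sketch of how MRR and Hits@k are usually computed from the rank of the correct entity for each query. The ranks are made-up illustrative values, not results from the paper or the rebuttal PDF, and the function names are hypothetical.

```python
def mrr(ranks):
    """Mean reciprocal rank for 1-based ranks of the correct answers."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of queries whose correct answer is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the true object entity for five (s, r, ?, t) queries.
example_ranks = [1, 3, 2, 10, 50]

print(f"MRR     = {mrr(example_ranks):.4f}")
for k in (1, 3, 10):
    print(f"Hits@{k:<2} = {hits_at_k(example_ranks, k):.2f}")
```

Because MRR and Hits@1 reward placing the correct entity at the very top while Hits@3 and Hits@10 only require it to appear in a short list, a model can lead on the latter pair and still trail on the former.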

Q3: Handling of different relation semantics in entity clustering

Firstly, it is important to note that TKG datasets do not provide explicit semantic descriptions of relations like “leave from” or “transfer to”. The training process typically uses only entity and relation IDs. However, we do model different relation types using a Relation-Aware Graph Convolutional Network, capturing distinct characteristics of various relation types even without explicit semantic descriptions.

Moreover, our DECRL approach is designed to capture the temporal evolution of high-order correlations, which indirectly addresses the issue of different relation semantics. For example, if entities consistently interact over a continuous period, they have a higher probability of being clustered together at each timestamp. By capturing the temporal evolution of high-order correlations, our approach reinforces the closeness of their relationship over time. Conversely, if entities do not consistently interact over time, they have a lower probability of being clustered together. The temporal evolution component of our approach allows for the gradual distancing of these entities in the representation space. Therefore, DECRL can effectively handle scenarios where different relations may indicate varying levels of future interaction, without relying on explicit semantic information.

To illustrate this capability, we would like to draw attention to the comparison between Figure 2d (Final DECRL) and Figure 2f (Final DECRL-w/o-fusion, which only models high-order correlations without capturing their temporal evolution) in the manuscript. This comparison clearly illustrates that capturing the temporal evolution of high-order correlations leads to superior entity representations, as evidenced by the larger inter-cluster distances and tighter intra-cluster entity groupings. Furthermore, by comparing the first and third columns of Figure 2 in the manuscript, we can observe the progression of training. This comparison demonstrates that capturing the temporal evolution of high-order correlations gradually increases the separation between clusters while simultaneously tightening the grouping of entities within clusters.

Q4: Omission of Wikidata, YAGO and GDELT datasets in the experiments.

Firstly, there is a fundamental difference in timestamp types between ICEWS and the Wikidata and YAGO datasets. ICEWS uses single-point timestamps, while Wikidata and YAGO use time intervals for events. Our research focuses on modeling the temporal evolution of high-order correlations between entities. Datasets with single-point timestamps, e.g., ICEWS, show more frequent temporal changes and higher temporal complexity, making them more suitable for capturing the temporal evolution of high-order correlations. This is why we initially focused on the ICEWS dataset. Moreover, the SOTA models have already achieved very high performance on Wikidata and YAGO datasets, with MRR scores exceeding 99% for relation prediction. Given the limited room for improvement, we initially excluded these datasets from our preliminary manuscript.
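
The difference between the two timestamp types can be illustrated with a small sketch. One common way to feed interval-stamped facts to a snapshot-based model is to expand each interval into one fact per time step; this is an assumption for illustration and not necessarily the preprocessing used in the paper. All entity names, relation names, and function names below are hypothetical.

```python
from collections import defaultdict


def expand_interval_fact(subject, relation, obj, start, end, step=1):
    """Turn one interval-stamped fact into point-stamped quadruples."""
    return [(subject, relation, obj, t) for t in range(start, end + 1, step)]


def build_snapshots(quadruples):
    """Group point-stamped quadruples into per-timestamp snapshots."""
    snapshots = defaultdict(list)
    for s, r, o, t in quadruples:
        snapshots[t].append((s, r, o))
    return dict(sorted(snapshots.items()))


# One interval-stamped fact (Wikidata/YAGO style) and two point-stamped
# facts (ICEWS style). All names are made up for illustration.
interval_quads = expand_interval_fact("A", "memberOf", "B", start=2001, end=2003)
point_quads = [("C", "consult", "D", 2002), ("C", "criticize", "D", 2003)]

snapshots = build_snapshots(interval_quads + point_quads)
for t, facts in snapshots.items():
    print(t, facts)
```

In this toy example the interval fact repeats unchanged in every snapshot it spans, whereas the point-stamped facts differ from one snapshot to the next, which matches the authors' remark that single-point timestamps produce more frequent temporal changes.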

In addition, we initially excluded the GDELT dataset due to its known issues with false positives [1] and its high proportion of abstract conceptual entities (e.g., POLICE and GOVERNMENT) [2], since we cannot predict a government’s activities without knowing which country it belongs to.

However, we acknowledge that including a wider range of datasets would provide a more comprehensive evaluation of our approach’s performance and robustness. In light of your valuable feedback, we have conducted additional experiments on the GDELT dataset, as well as on the WIKI and YAGO datasets. The results are presented in Tables 1 and 3 of the rebuttal PDF (attached to the global rebuttal). We are pleased to report that our approach achieves SOTA relation prediction performance across all these datasets, demonstrating its effectiveness and robustness.

[1] Ward, M. D., Beger, A., Cutler, J., Dickenson, M., Dorff, C., & Radford, B. Comparing GDELT and ICEWS event data. Analysis, 21(1): 267-297, 2013.

[2] Li, Z., Jin, X., Li, W., Guan, S., Guo, J., Shen, H., et al. Temporal knowledge graph reasoning based on evolutional representation learning. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 408-417, 2021.

Comment

I thank the authors for their efforts in addressing my concerns. My main concerns have been addressed, and I trust that the authors can address the others in the final version of the paper.

Comment

We would like to thank Reviewer fmj8 for providing a valuable and constructive review, which has inspired us to improve our paper substantially. We will be dedicated to updating our manuscript as suggested.

Thanks again for your response and for raising the score!

Review
Rating: 5

The paper addresses Temporal Knowledge Graph (TKG) representation learning, which aims to embed temporally evolving entities and relations into a continuous low-dimensional vector space. Existing methods struggle to capture the temporal evolution of high-order correlations in TKGs. The authors propose a novel approach called Deep Evolutionary Clustering jointed temporal knowledge graph Representation Learning (DECRL). DECRL is the first to integrate deep evolutionary clustering with TKG representation learning to capture the temporal evolution of high-order correlations.

Strengths

  1. The author clearly describes the motivation for the paper and the methods used.
  2. The experimental results demonstrate that DECRL achieves state-of-the-art (SOTA) performance.

Weaknesses

  1. Event prediction models are often tested on the ICEWS05-15 and GDELT datasets. The authors do not give experimental results on these two datasets.

Questions

  1. When clustering nodes using fuzzy clustering, have the authors considered incorporating domain knowledge for node classification? Additionally, have they compared their approach with other node classification methods?
  2. Some nodes may not have clear relationships with other nodes. Will clustering them together affect the behavior prediction of these nodes?

Limitations

Please refer to the weaknesses.

Author Response

We sincerely appreciate the reviewer’s meaningful comments and insightful questions.

Q1: Absence of experimental results on ICEWS05-15 and GDELT datasets

The ICEWS05-15 dataset comes from the same source (ICEWS) and has a similar scale to ICEWS18, which we included in our initial experiments, so we did not test our approach on ICEWS05-15. In addition, we initially excluded the GDELT dataset due to its known issues with false positives [1] and its high proportion of abstract conceptual entities (e.g., POLICE and GOVERNMENT) [2], as we cannot predict a government’s activities without knowing which country it belongs to.

However, we acknowledge that including a wider range of datasets would provide a more comprehensive evaluation of our approach’s performance and robustness. In light of your valuable feedback, we have conducted additional experiments on the GDELT dataset, as well as on the WIKI and YAGO datasets. The results are presented in Tables 1 and 3 of the rebuttal PDF (attached to the global rebuttal). We are pleased to report that our approach achieves SOTA relation prediction performance across all these datasets, demonstrating its effectiveness and robustness.

[1] Ward, M. D., Beger, A., Cutler, J., Dickenson, M., Dorff, C., & Radford, B. Comparing GDELT and ICEWS event data. Analysis, 21(1): 267-297, 2013.

[2] Li, Z., Jin, X., Li, W., Guan, S., Guo, J., Shen, H., et al. Temporal knowledge graph reasoning based on evolutional representation learning. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 408-417, 2021.

Q2: Consideration of domain knowledge in node classification and comparison with other node classification methods

We did not incorporate domain knowledge into our fuzzy clustering process for two primary reasons. First, the publicly available datasets we used do not provide corresponding domain knowledge. Second, since additional domain knowledge is typically not employed in other temporal knowledge graph research, using it could lead to unfair comparisons with existing methods.

In temporal knowledge graphs, there are no explicit ground truth entity categories. Given this lack of predefined classes, we chose an unsupervised clustering algorithm to model high-order correlations among entities, allowing us to discover latent structures without relying on pre-existing labels.

However, we acknowledge the importance of comparing our approach with alternative clustering techniques. To address this, we have conducted additional experiments using different clustering algorithms as variants of our approach. The results of these experiments are shown in Table 4 of the attached rebuttal PDF, providing a more comprehensive evaluation of our approach’s effectiveness compared to other potential clustering strategies.

Q3: Potential impact of clustering nodes with unclear relationships on behavior prediction

Firstly, in our temporal knowledge graph datasets, nodes cannot exist in complete isolation as the data is structured around events, ensuring each node participates in at least one event and thus has a relationship with at least one other node.

We acknowledge that some nodes may have fewer interactions than others. Our approach addresses this variation effectively through a fuzzy clustering algorithm, which allows nodes to belong to multiple clusters with varying degrees of membership. The fuzzy smoothing hyperparameter controls node membership distribution across clusters, preventing scenarios where clusters contain very few nodes. This method enables effective cluster construction even for less frequently interacting nodes, allowing them to have partial memberships in multiple clusters and reflecting their potentially ambiguous relationships.
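
To make the role of soft membership concrete, here is a minimal sketch of the membership rule from standard fuzzy c-means. It assumes that the "fuzzy smoothing hyperparameter" plays the role of the usual fuzzifier `m`; the function name, the toy coordinates, and the default values are hypothetical and do not come from the paper.

```python
import math


def fuzzy_memberships(point, centers, m=2.0, eps=1e-12):
    """Standard fuzzy c-means membership of one point in each cluster.

    m > 1 is the fuzzifier: values near 1 give almost hard assignments,
    larger values spread membership more evenly across clusters.
    """
    # Clamp distances so a point sitting exactly on a center does not divide by zero.
    dists = [max(math.dist(point, c), eps) for c in centers]
    power = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_j / d_k) ** power for d_k in dists)
        for d_j in dists
    ]


centers = [(0.0, 0.0), (4.0, 0.0)]   # two toy cluster centers
point = (1.0, 0.0)                   # closer to the first center

sharp = fuzzy_memberships(point, centers, m=2.0)
soft = fuzzy_memberships(point, centers, m=5.0)
print("m=2:", [round(u, 3) for u in sharp])
print("m=5:", [round(u, 3) for u in soft])
```

Memberships always sum to one, and raising `m` moves weight from the nearest cluster toward the others, which is one way a smoothing parameter can keep any single cluster from ending up with very few members.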

To further address this concern, in our camera-ready version, we will group nodes based on their interaction frequency in the training set and analyze relation prediction performance across these different node groups.
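
A grouping of this kind could be set up as sketched below. The cut-off values, group names, and toy quadruples are hypothetical choices for illustration; the response does not specify how the groups would be defined.

```python
from collections import Counter


def entity_frequencies(train_quads):
    """Count how often each entity appears as subject or object."""
    counts = Counter()
    for s, _r, o, _t in train_quads:
        counts[s] += 1
        counts[o] += 1
    return counts


def bucket_entities(counts, boundaries=(2, 4)):
    """Assign entities to 'low', 'medium', or 'high' frequency groups.

    boundaries are hypothetical cut-offs: count < boundaries[0] is low,
    count < boundaries[1] is medium, and anything else is high.
    """
    low_cut, high_cut = boundaries
    groups = {"low": [], "medium": [], "high": []}
    for entity, n in sorted(counts.items()):
        if n < low_cut:
            groups["low"].append(entity)
        elif n < high_cut:
            groups["medium"].append(entity)
        else:
            groups["high"].append(entity)
    return groups


# Toy training quadruples (subject, relation, object, timestamp).
train = [
    ("A", "r1", "B", 0),
    ("A", "r2", "C", 0),
    ("A", "r1", "B", 1),
    ("B", "r3", "D", 1),
    ("A", "r2", "E", 2),
]

groups = bucket_entities(entity_frequencies(train))
print(groups)
```

Evaluation metrics can then be reported separately for test queries whose entities fall into each group.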

Review
Rating: 6

This paper studies the temporal knowledge graph representation learning task and proposes a temporal evolution-aware framework DECRL.

By assigning different entities to distinct clusters at each timestamp and modeling the evolution and shifts of these clusters, cluster-aware information is explicitly incorporated into both entity and relation embeddings. This enables better temporal intelligence for making precise predictions.

Extensive experiments demonstrate promising results on several benchmarks.

Strengths

  • This paper is well-written and easy to follow.
  • Technical details are well presented and, to some extent, clearly explained.
  • The experiments are extensive, covering various benchmarks, comparing diverse baselines, and demonstrating effectiveness through both qualitative and quantitative analyses.

Weaknesses

  • The motivation for introducing a cluster graph at each timestamp to capture temporal shift information at the clustering scale remains unclear. The authors should further elaborate on this overall motivation. Additionally, the authors mentioned that “some researchers have leveraged derived structures, e.g., communities, entity groups, and hypergraphs, to model high-order correlations among …” (Lines 29-30). In my understanding, the modeling of communities and groups is similar to this paper’s clustering, and hypergraphs can be regarded as another approach for group modeling since a hyperedge connects different nodes. The differences in motivation and technique between DECRL and these methods should be briefly discussed.
  • The methodology section of DECRL incorporates many minor techniques without corresponding ablation studies to demonstrate their effectiveness or detailed explanations of the rationale. For example, the effectiveness of temporal attentive pooling (Section 4.5) lacks ablation study evidence. Besides, detailed operations for each variant in the ablation study should be provided.
  • For Figure 2 in the case study, additional textual explanations are needed, such as clarifying what the red dots represent. Additionally, the authors should consider including baseline methods in the figure, rather than only showing DECRL and its variants.

Questions

Please see weaknesses.

Limitations

NA

Author Response

We sincerely appreciate the reviewer’s meaningful comments and insightful questions.

Q1: Unclear motivation for introducing a cluster graph at each timestamp to capture temporal shift information at the clustering scale

We consider the complex dynamics of international alliances and conflicts, which exhibit high-order correlations that evolve over time. Countries rarely interact in isolation. For instance, the relationship between the USA and Russia affects not only these two countries but also their respective allies and trade partners. We use clustering to capture these complex high-order correlations.

These high-order correlations evolve smoothly over time. The Cold War, for example, did not end abruptly but gradually thawed through a series of events. When constructing our cluster graph, we consider how past high-order correlations (previous clusters) influence current ones, modeling this smooth temporal evolution.
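
For past clusters to inform current ones, clusters at consecutive timestamps first have to be matched to each other; the reviews refer to this as one-to-one alignment. The sketch below shows one generic way to do such a matching, by choosing the pairing of cluster centers with the smallest total Euclidean distance. It is an illustrative assumption, not the cluster-aware unsupervised alignment mechanism of DECRL, whose details are not given in this discussion. The function name and the toy centers are hypothetical.

```python
import math
from itertools import permutations


def align_clusters(prev_centers, curr_centers):
    """Return (perm, cost) where curr_centers[perm[i]] matches prev_centers[i].

    Brute-force search over all one-to-one matchings, minimising the total
    Euclidean distance. Assumes both lists have the same length and is only
    practical for a handful of clusters.
    """
    k = len(prev_centers)
    best_perm, best_cost = None, math.inf
    for perm in permutations(range(k)):
        cost = sum(
            math.dist(prev_centers[i], curr_centers[perm[i]]) for i in range(k)
        )
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return list(best_perm), best_cost


# Toy centers at two consecutive timestamps; the current ones are the
# previous ones slightly shifted and listed in a different order.
prev = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
curr = [(0.2, 4.9), (0.1, -0.1), (5.2, 4.8)]

perm, cost = align_clusters(prev, curr)
print(perm, round(cost, 3))
```

The search is factorial in the number of clusters, so for more than a few clusters an assignment solver such as `scipy.optimize.linear_sum_assignment` would be the usual replacement.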

Furthermore, within a single timestamp, different clusters of entities can affect each other. For example, tensions between NATO countries and Russia might influence relations between OPEC members and Western nations. We model these intra-timestamp influences through an implicit correlation encoder in our proposed approach.

By capturing these aspects, our approach can represent the nuanced, temporally evolving nature of high-order correlations more accurately.

Q2: Lack of discussion on the differences in motivation and technique between DECRL and other methods using derived structures

While entity groups, hypergraphs, and clustering can all model high-order correlations, our approach offers unique advantages:

Firstly, methods using entity groups and hypergraphs require learning entity assignment mappers at each timestamp, a process that is complex to update and maintain, resulting in significant computational overhead. Our clustering-based approach, however, adapts more flexibly to dynamic data changes and is lightweight, facilitating easier integration with other techniques.

Secondly, we use a fuzzy clustering algorithm with a fuzzy smoothing hyperparameter that controls node membership distribution, preventing clusters with very few nodes. This approach allows for effective cluster construction even for nodes with limited interactions. Using entity graphs or hypergraphs to achieve a similar advantage would be much more computationally expensive and resource-intensive.

In addition, our experiments demonstrate the effectiveness of our approach compared to methods using entity groups and hypergraphs. For example, the DECRL-w/o-fusion variant, which uses only clustering for representation learning, achieves MRR, Hits@1, Hits@3, and Hits@10 scores of 57.98, 41.90, 66.97, and 92.00, respectively. The hypergraph-based method DHyper scores 56.15, 43.76, 65.46, and 85.89 on the same metrics, so the variant is ahead on MRR, Hits@3, and Hits@10, while DHyper remains ahead on Hits@1. This advantage can be partly attributed to fuzzy clustering’s ability to prevent the formation of extremely small clusters.
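
A per-metric difference makes the comparison easier to read. The sketch below uses only the eight scores quoted in the paragraph above; the variable names are hypothetical.

```python
METRICS = ("MRR", "Hits@1", "Hits@3", "Hits@10")

# Scores quoted in the author response.
decrl_wo_fusion = dict(zip(METRICS, (57.98, 41.90, 66.97, 92.00)))
dhyper = dict(zip(METRICS, (56.15, 43.76, 65.46, 85.89)))

# Positive values mean DECRL-w/o-fusion is ahead of DHyper on that metric.
deltas = {m: round(decrl_wo_fusion[m] - dhyper[m], 2) for m in METRICS}
for m in METRICS:
    print(f"{m:<8} {deltas[m]:+.2f}")
```

The differences are positive for MRR, Hits@3, and Hits@10 and negative for Hits@1, with the largest gap on Hits@10.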

Q3: Lack of ablation studies for minor techniques and insufficient explanation of variant operations in existing ablation studies

Thank you for raising this concern. We recognize the oversight in not including an ablation study for the temporal attentive pooling component. To address this, we have conducted an additional experiment to demonstrate its effectiveness, as shown in Table 4 of the rebuttal PDF (attached to the global rebuttal). These results illustrate the impact of temporal attentive pooling on our approach’s performance.

In addition, in light of your valuable feedback, we acknowledge that our initial description of the ablation study variants may have been insufficient. To address this, we will prepare more detailed explanations for each variant in our camera-ready version.

Q4: Need for additional explanations in Figure 2 and inclusion of baseline methods in the case study

We will enhance the explanations of Figure 2 in the manuscript, clarifying that red dots represent individual entities in the temporal knowledge graph, with their groupings indicating entity clusters.

Furthermore, we would like to highlight key observations from Figure 2. The comparison between Figure 2d (Final DECRL) and Figure 2f (Final DECRL-w/o-fusion, which only models high-order correlations without capturing their temporal evolution) in the manuscript illustrates that capturing the temporal evolution of high-order correlations leads to superior entity representations, as evidenced by the larger inter-cluster distances and tighter intra-cluster entity groupings. In addition, by comparing the first and third columns of Figure 2 in the manuscript, we can observe the progression of training. This comparison demonstrates that capturing the temporal evolution of high-order correlations gradually increases the separation between clusters while simultaneously tightening the grouping of entities within clusters. This observation indicates that DECRL’s modeling of the temporal evolution of high-order correlations enhances its ability to capture more nuanced cluster representations.

We also acknowledge the importance of comparing our approach with baselines in the case study. We have conducted additional visualizations for DHyper, the second-best baseline model, in Figure 2 of the attached rebuttal PDF.

In light of your valuable feedback, we will refine all the content above and incorporate it into the camera-ready version.

Comment

I thank the authors for their rebuttal in response to the weaknesses I pointed out in the review, including the motivation, the extra ablation study, etc.

The authors sufficiently address these weaknesses in their rebuttal, and I trust that the authors are able to address them in the final version of the paper.

Comment

We would like to thank Reviewer S3KV for providing a detailed and valuable review, which has greatly assisted us in the paper revision. We will address all the weaknesses you pointed out and incorporate them in the final version of the paper.

We are profoundly grateful for the generous score increases from reviewers C7a5 and fmj8. In light of this, we humbly and respectfully ask if you might consider increasing your score. Any potential increase in your score would be received with the utmost gratitude and appreciation. We fully understand the time and effort involved in the review process and are sincerely thankful for your valuable suggestions.

Review
Rating: 7

The paper proposed a deep evolutionary clustering method for TKGE to capture the temporal evolution of high-order correlation in TKGs. A cluster-aware unsupervised alignment mechanism is introduced to ensure the precise one-to-one alignment of soft overlapping clusters across timestamps. Extensive experiments on four real-world datasets demonstrate the remarkable improvement of the proposed method compared to other baselines.

Strengths

  1. The experimental results are remarkable for the improvement of the ICEWS datasets.
  2. The paper is well-organized and clearly presented.

Weaknesses

  1. The paper proposes to capture the temporal evolution of high-order correlations, but no intuitive case study is provided.
  2. Though extensive experiments are conducted, two other main TKGE datasets are omitted, namely Wikidata and YAGO.

Questions

  1. Will the proposed method still outperform other baselines by large margins on Wikidata and YAGO datasets? The type of time information in these two datasets is different from ICEWS datasets.

Limitations

The proposed method mainly focuses on improving the accuracy of link prediction for TKGs, while other aspects such as efficiency and transparency are not discussed.

Author Response

We sincerely appreciate the reviewer’s meaningful comments and insightful questions.

Q1: Lack of intuitive case study demonstrating temporal evolution of high-order correlations

We would like to draw attention to the comparison between Figure 2d (Final DECRL) and Figure 2f (Final DECRL-w/o-fusion, which only models high-order correlations without capturing their temporal evolution) in the manuscript. This comparison clearly illustrates that capturing the temporal evolution of high-order correlations leads to superior entity representations, as evidenced by the larger inter-cluster distances and tighter intra-cluster entity groupings. Furthermore, by comparing the first and third columns of Figure 2 in the manuscript, we can observe the progression of training. This comparison demonstrates that capturing the temporal evolution of high-order correlations gradually increases the separation between clusters while simultaneously tightening the grouping of entities within clusters. This observation reveals that the capability of DECRL to model the temporal evolution of high-order correlation significantly enhances its ability to capture more nuanced cluster representations.

In addition, we incorporate the case study results from DHyper to further substantiate the effectiveness of our approach, as shown in Figure 2 of the rebuttal PDF (attached to the global rebuttal).

Q2: Omission of Wikidata and YAGO datasets in the experiments

Firstly, there is a fundamental difference in timestamp types between ICEWS and the Wikidata and YAGO datasets. ICEWS uses single-point timestamps, while Wikidata and YAGO use time intervals for events. Our research focuses on modeling the temporal evolution of high-order correlations between entities. Datasets with single-point timestamps, e.g., ICEWS, show more frequent temporal changes and higher temporal complexity, making them more suitable for capturing the temporal evolution of high-order correlations. This is why we initially focused on the ICEWS dataset.

Moreover, the SOTA models have already achieved very high performance on Wikidata and YAGO datasets, with MRR scores exceeding 99% for relation prediction. Given the limited room for improvement, we initially excluded these datasets.

However, we acknowledge the importance of comprehensive evaluation. In light of your valuable feedback, we have conducted additional experiments on the Wikidata, YAGO, and GDELT datasets, as shown in Tables 1 and 3 of the attached rebuttal PDF. We are pleased to report that our approach has achieved the SOTA relation prediction performance across all these datasets, demonstrating the effectiveness and robustness of our approach.

Q3: Lack of discussion on efficiency and transparency aspects of the proposed method

We have indeed considered the efficiency of our approach and have calculated its time complexity, which is detailed in Appendix A.1 of our manuscript. Our approach employs evolutionary clustering to capture the temporal evolution of high-order correlations, requiring fewer parameters and lower memory resources compared to approaches using learnable structures like hypergraphs. This design choice significantly contributes to the overall efficiency of our approach.

To further demonstrate the efficiency of our approach, we have conducted additional experiments comparing the training time of our approach with DHyper. The results of these experiments are presented in Figure 1 of the attached rebuttal PDF, clearly illustrating the computational advantages of our approach.

Regarding transparency, we acknowledge that this aspect is not thoroughly addressed. We appreciate the reviewer bringing this to our attention. We will include a comprehensive discussion on the transparency of our approach in the limitations section of the camera-ready version. This addition will highlight areas for future improvement.

Author Response

Summary of Revision:

We sincerely thank all the reviewers for their insightful reviews and valuable comments, which are instructive for us to improve our paper further.

The reviewers generally held positive opinions of our paper, in that the proposed approach is “technically reasonable”, “well presented”, and “clearly explained”, and we clearly describe “the motivation for the paper and the methods used”; this paper is “well-organized”, “clearly presented”, “well-written”, and “easy to follow”; we “demonstrate effectiveness through both qualitative and quantitative analyses”, and the experimental results are “remarkable”; and the proposed approach “achieves state-of-the-art (SOTA) performance”.

The reviewers also raised insightful and constructive concerns. We made every effort to address all of them by clarifying DECRL’s distinctions from related methods and its ability to handle diverse relation semantics. We also added new experiments on the Wikidata, YAGO, and GDELT datasets, along with additional ablation and case studies.

Q1: DECRL’s distinctions from related methods

While entity groups, hypergraphs, and clustering can all model high-order correlations, our approach offers unique advantages:

Firstly, methods using entity groups and hypergraphs require learning entity assignment mappers at each timestamp, a process that is complex to update and maintain, resulting in significant computational overhead. Our clustering-based approach, however, adapts more flexibly to dynamic data changes and is lightweight, facilitating easier integration with other techniques.

Secondly, we use a fuzzy clustering algorithm with a fuzzy smoothing hyperparameter that controls node membership distribution, preventing clusters with very few nodes. This approach allows for effective cluster construction even for nodes with limited interactions. Using entity graphs or hypergraphs to achieve a similar advantage would be much more computationally expensive and resource-intensive.

In addition, our experiments demonstrate the effectiveness of our approach compared to methods using entity groups and hypergraphs. For example, the DECRL-w/o-fusion variant, which uses only clustering for representation learning, achieves MRR, Hits@1, Hits@3, and Hits@10 scores of 57.98, 41.90, 66.97, and 92.00, respectively. The hypergraph-based method DHyper scores 56.15, 43.76, 65.46, and 85.89 on the same metrics, so the variant is ahead on MRR, Hits@3, and Hits@10, while DHyper remains ahead on Hits@1. This advantage can be partly attributed to fuzzy clustering’s ability to prevent the formation of extremely small clusters.

Q2: The ability to address diverse relation semantics

Firstly, it is important to note that TKG datasets do not provide explicit semantic descriptions of relations like “leave from” or “transfer to”. The training process typically uses only entity and relation IDs. However, we do model different relation types using a Relation-Aware Graph Convolutional Network, capturing distinct characteristics of various relation types even without explicit semantic descriptions.

Moreover, DECRL is designed to capture the temporal evolution of high-order correlations, which indirectly addresses the issue of diverse relation semantics. For example, if entities consistently interact over a continuous period, they have a higher probability of being clustered together at each timestamp. By capturing the temporal evolution of high-order correlations, our approach reinforces the closeness of their relationship over time. Conversely, if entities do not consistently interact over time, they have a lower probability of being clustered together. The temporal evolution component allows for the gradual distancing of these entities in the representation space. Therefore, DECRL can effectively handle scenarios where different relations may indicate varying levels of future interaction, without relying on explicit semantic information.

To illustrate this capability, we would like to draw attention to the comparison between Figure 2d (Final DECRL) and Figure 2f (Final DECRL-w/o-fusion, which only models high-order correlations without capturing their temporal evolution). This comparison clearly illustrates that capturing the temporal evolution of high-order correlations leads to superior entity representations, as evidenced by the larger inter-cluster distances and tighter intra-cluster entity groupings. Furthermore, by comparing the first and third columns of Figure 2 in the manuscript, we can observe the progression of training. This comparison demonstrates that capturing the temporal evolution of high-order correlations gradually increases the separation between clusters while simultaneously tightening the grouping of entities within clusters.

Q3: Performance on Wikidata, YAGO, and GDELT datasets.

We have conducted additional experiments on WIKI, YAGO, and GDELT datasets. The results of these experiments are presented in Tables 1 and 3 of the attached rebuttal PDF. We are pleased to report that our approach has achieved the SOTA relation prediction performance across all these datasets, demonstrating the effectiveness and robustness of our approach.

New experimental results (see the rebuttal PDF):

  1. Performance on different datasets: Tables 1 and 3 in the rebuttal PDF show the relation prediction performance of DECRL on WIKI, YAGO, and GDELT.

  2. Performance of entity prediction task: Table 2 in the rebuttal PDF shows the entity prediction performance of DECRL on GDELT.

  3. The contributions of attentive temporal encoder and the fuzzy c-means clustering method: Table 4 in the rebuttal PDF shows the performance comparison of DECRL and its variants on ICEWS14.

  4. Model efficiency: Figure 1 in the rebuttal PDF illustrates the training time comparison with DHyper on ICEWS14 (in seconds).

  5. Case study: Figure 2 in the rebuttal PDF illustrates the entity representations of DHyper on ICEWS14C.

Final Decision

This work proposes a Deep Evolutionary Clustering jointed temporal knowledge graph Representation Learning approach (DECRL) to learn representations of temporally evolving entities and relations. The main ideas include a cluster-aware alignment mechanism for one-to-one alignment of soft overlapping clusters across timestamps and a correlation encoder for capturing latent correlations between any pair of clusters. While reviewers acknowledged the empirical performance of the proposed method and commented that the idea seems reasonable, many critical points were addressed during the rebuttal, which should be added to the final version of the paper.

  1. The authors should include all additional experimental results shown in the rebuttal PDF, including the results on Wikidata, YAGO, and GDELT datasets and the entity prediction performance.
  2. The authors should clarify the definitions of 'entity graph' and 'cluster graph' and add more detailed explanations to Figure 2.
  3. Regarding the work's limitations, the authors wrote a single sentence: "This study overlooks the continuous temporal evolution of diverse high-order correlations, a limitation that future work will address.", which may not be informative. In the final version, the authors should elaborate on their work's limitations in much more detail.